Image segmentation for humid tropical forest classification in Landsat TM data

R. A. HILL
Section for Earth Observation, Institute of Terrestrial Ecology, Monks Wood, Abbots Ripton, Huntingdon, Cambs, PE17 2LS.

Abstract
Humid tropical forest types have low spectral separability in Landsat TM data due to highly textured reflectance patterns at the 30 m spatial resolution. Two methods of reducing local spectral variation, low-pass spatial filtering and image segmentation, were examined for supervised classification of 10 forest types in TM data of Peruvian Amazonia. The number of forest classes identified at over 90% accuracy increased from one in raw imagery to three in filtered imagery, and six in segmented imagery. The ability to derive less generalised tropical forest classes may allow greater use of classified imagery in ecology and conservation planning.

1. Introduction
Knowledge of the distribution and spatial extent of humid tropical forests is required, given their significance for global climate, biogeochemical cycling and biodiversity (Whitmore 1990). Remote sensing can provide data at synoptic scales offering the potential to discern large-scale ecosystem patterns (Roughgarden et al. 1991). In studies using optical sensor imagery, the greatest levels of distinction between mature humid tropical forest types have been achieved by visual image interpretation (Paradella et al. 1994, Tuomisto et al. 1994), or by semi-automated methods incorporating ancillary data held in geographical information systems (Gastellu-Etchegorry et al. 1993, Riaza et al. 1998). These approaches are demanding on resources, such as time, image processing facilities, and data availability. Where such resources are limited, alternative techniques such as statistical classifiers must be used. These have typically given high accuracy results only for very general tropical forest classes (e.g. Garcia and Alvarez 1994, Langford and Bell 1997).

Humid tropical forests have low statistical separability in the visible, near- and middle-infrared wavebands of Landsat imagery. This relates mainly to the highly textured nature of reflectance patterns, in which variance within forest classes can be greater than that between forest classes (Singh 1987, Foody and Hill 1996). The use of low-pass spatial filters can increase the spectral separability of tropical forest types by reducing within-class variance (Hill and Foody 1994). Low-pass filters de-emphasise the high spatial frequency component of an image by replacing individual pixel values with average values calculated over a moving pixel window. When the window includes two or more forest classes, however, the statistics generated may not be relevant to any one of the classes present. The problem of blurring at class boundaries limits the size of kernel that can be used for spatial filtering and thus the extent to which high spatial frequency spectral variation can be attenuated.
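The trade-off described above can be illustrated with a minimal sketch of a moving-window mean filter. It is not the processing chain used in this study: the synthetic band, the class means (40 and 60), the noise level and the function name `mean_filter` are all assumptions chosen for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def mean_filter(band, size=5):
    """Replace each pixel with the mean of a size x size window (edges reflected)."""
    pad = size // 2
    padded = np.pad(band.astype(float), pad, mode="reflect")
    out = np.zeros(band.shape, dtype=float)
    # Sum the shifted copies of the image, then divide by the window area.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + band.shape[0], dx:dx + band.shape[1]]
    return out / (size * size)

# Synthetic single band (hypothetical values): two forest classes side by side,
# with means 40 and 60 and strong pixel-level texture (std. dev. 8).
rng = np.random.default_rng(0)
band = np.empty((40, 40))
band[:, :20] = rng.normal(40.0, 8.0, size=(40, 20))
band[:, 20:] = rng.normal(60.0, 8.0, size=(40, 20))

smoothed = mean_filter(band, size=5)

# Within-class spread well away from the boundary, before and after filtering.
raw_sd = band[:, :15].std()
smooth_sd = smoothed[:, :15].std()

# Columns 18-21 straddle the boundary, so their windows mix both classes and
# the filtered values fall between the two class means.
boundary_mean = smoothed[:, 18:22].mean()
print(raw_sd, smooth_sd, boundary_mean)
```

In this sketch the filtered spread inside one class is much smaller than the raw spread, while pixels next to the boundary take values that belong to neither class. Enlarging `size` strengthens both effects.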

An alternative approach to reducing local spectral variation is the segmentation of imagery into spatially cohesive regions according to intrinsic image properties. This process creates regions which are more homogeneous within themselves than when compared with neighbouring regions (Haralick and Shapiro 1985). This has been shown to increase the spectral separability of forest types in imagery of Amazonia, enabling a more detailed distinction than would be possible in raw imagery (Brondizio et al. 1996, Lobo 1997).

This letter compares the results of supervised classification of tropical rain forest types in Landsat TM imagery of Peruvian Amazonia using: i) raw imagery; ii) low-pass filtered imagery; and iii) segmented imagery. The aim is to compare the effects of filtering and segmentation on classification accuracy.

2. Data and Methods
Nine lowland forest types were surveyed in the Tambopata-Candamo Reserved Zone (TCRZ) of south-east Peru. They were identified according to the forest classification scheme of Phillips (1993), which is based on species composition, location, and soil type. The forest types identified were: permanently and seasonally flooded swamp forests; lower, middle, upper and old floodplain forests; terra firme forests on clay and sand soils; and bamboo forest. In addition to these, areas of agricultural land were also surveyed along the Rio Tambopata. Such land frequently contained unfelled trees and patches of secondary regrowth, dominated by bamboo and pioneer tree species. As a result, agricultural land was considered as an additional (open-canopied) forest class in this study (Table 1).

The forest classes were examined in a Landsat TM image (WRS 002, 069) acquired in July 1991. Since Landsat TM reflectance data are generally three-dimensional in spectral character with the dimensions relating to visible, near- and middle-infrared wavebands (Townshend et al. 1988), only data in such wavebands were acquired: Band 2 (visible green), Band 4 (near-infrared), and Band 7 (middle-infrared). No radiometric corrections were performed on the image, but a geometric correction to a UTM projection was carried out to enable the accurate identification of field survey locations.

To create the two data sets for comparison with raw imagery in tropical forest classification, separate pre-processing pathways were followed. These consisted of (i) spatial filtering using a 5x5 low-pass (mean) filter; and (ii) segmentation using an edge detection and region growing algorithm. Due to software limitations, the segmentation procedure was performed separately for each TM band. Although not ideal, as segment boundaries could differ between wavebands, this performed the desired function of reducing local spectral variation within each waveband by averaging pixel values across identified regions. This was a low-level image segmentation process (Haralick and Shapiro 1985) as the regions created were not necessarily meaningful entities (i.e. forest classes), but merely parts of them.

Image classification was performed separately on the three data sets, but using the same training sites within each image. These covered 15,800 pixels and incorporated a representative example of each of the 10 forest types, plus savanna grassland, bare ground and water bodies. For methodological consistency, all three images (raw, filtered, and segmented) were classified using a per-pixel Maximum Likelihood classifier. The accuracy of the three classified products was assessed by comparison with 17 independent testing sites of known forest type identified during the field survey.
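As a rough illustration of the segmentation pathway followed by per-pixel Maximum Likelihood classification, the sketch below averages pixel values within segments one band at a time and then classifies both the raw and the segment-averaged data. Several things here are assumptions rather than details from the study: the synthetic three-band scene, the regular grid of 5x5 blocks standing in for the output of the edge detection and region growing algorithm (which the text does not specify), the Gaussian class model with equal prior probabilities, and the function names `segment_means`, `fit_class`, `ml_classify` and `accuracy`. NumPy is assumed to be available.

```python
import numpy as np

def segment_means(band, labels):
    """Replace each pixel with the mean of the segment (label) it falls in."""
    sums = np.bincount(labels.ravel(), weights=band.ravel())
    counts = np.bincount(labels.ravel())
    return (sums / np.maximum(counts, 1))[labels]

def fit_class(samples):
    """Gaussian statistics for one class: mean, inverse covariance, log-determinant."""
    cov = np.cov(samples, rowvar=False)
    return samples.mean(axis=0), np.linalg.inv(cov), np.linalg.slogdet(cov)[1]

def ml_classify(pixels, stats):
    """Per-pixel maximum likelihood labels, assuming equal prior probabilities."""
    scores = []
    for mean, inv_cov, logdet in stats:
        d = pixels - mean
        # Gaussian log-likelihood up to a constant: log-determinant plus
        # squared Mahalanobis distance.
        scores.append(-0.5 * (logdet + np.einsum("ij,jk,ik->i", d, inv_cov, d)))
    return np.argmax(np.stack(scores, axis=1), axis=1)

# Synthetic 3-band scene (hypothetical values): class 0 on the left, class 1 on
# the right, class means 6 units apart, pixel-level texture of std. dev. 8.
rng = np.random.default_rng(1)
rows, cols, bands = 40, 40, 3
truth = np.zeros((rows, cols), dtype=int)
truth[:, cols // 2:] = 1
image = 40.0 + 6.0 * truth[..., None] + rng.normal(0.0, 8.0, (rows, cols, bands))

# Stand-in segments: a regular grid of 5x5 blocks that never crosses the class
# boundary. A real label image would come from the segmentation algorithm.
yy, xx = np.indices((rows, cols))
labels = (yy // 5) * (cols // 5) + (xx // 5)

# Average within segments one band at a time, mirroring the per-band procedure.
segmented = np.stack(
    [segment_means(image[..., b], labels) for b in range(bands)], axis=-1
)

def accuracy(data):
    """Train on the top half of the scene, test on the bottom half (percent correct)."""
    train = np.zeros((rows, cols), dtype=bool)
    train[: rows // 2] = True
    stats = [fit_class(data[train & (truth == c)]) for c in (0, 1)]
    predicted = ml_classify(data[~train], stats)
    return 100.0 * np.mean(predicted == truth[~train])

print("raw:", accuracy(image), "segmented:", accuracy(segmented))
```

Because the stand-in segments respect the class boundary by construction, this sketch shows only the benefit of within-segment averaging. It says nothing about how well a real segmentation algorithm would place its boundaries.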
The test sites contained at least one representative example of each forest type and ranged in size from 166 to 1859 pixels, covering a total of 10,800 pixels. The testing data were extracted from the centre of the field-identified locations to ensure pure examples of each forest type were used in the analysis. Classification accuracy was expressed as the percentage of the total number of pixels within the test areas correctly allocated.

3. Results and Discussion
The classification accuracy for the 10 forest types was extremely low in all three images, necessitating the post-classification merging of forest classes. If all 10 forest types were aggregated into a single forest class, then a classification accuracy of 100% could be achieved for each image. This may be a useful level of aggregation for mapping forest cover, identifying forest boundaries, or monitoring deforestation. However, of greater potential use is the identification of forest classes characterised by structural or crown features. Three such broad structurally-determined forest classes can be identified in the TCRZ, relating to canopy closure and the presence of emergent trees (Table 1). At this three forest class level, the classification accuracy was very similar for the filtered and segmented images (96% and 98% respectively), but notably lower for the raw image (65%).

To the tropical ecologist or conservation planner, the ability to disaggregate these structural forest groups reliably into more ecologically defined forest classes would be of tremendous value. In the segmented image it was possible to disaggregate two of the structural classes into five sub-classes, giving a total of six forest classes at an accuracy of 91% (Table 2). By comparison, these six forest classes were identified at a classification accuracy of only 67% and 17% respectively in the filtered and raw images.
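The accuracy measure and the effect of post-classification merging can be sketched as follows. The class ids, the eight test pixels and the merge lookup are hypothetical values for illustration only and are not data from this study; the function names `percent_correct` and `merge_classes` are likewise assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def percent_correct(reference, predicted):
    """Percentage of test pixels whose predicted class matches the reference."""
    reference = np.asarray(reference)
    predicted = np.asarray(predicted)
    return 100.0 * np.mean(reference == predicted)

def merge_classes(labels, lookup):
    """Map detailed class ids to merged class ids using a lookup array."""
    return np.asarray(lookup)[np.asarray(labels)]

# Hypothetical test pixels with four detailed classes (ids 0-3).
reference = np.array([0, 0, 1, 1, 2, 2, 3, 3])
predicted = np.array([0, 1, 1, 0, 2, 3, 3, 3])

# Hypothetical merge: classes 0 and 1 form group 0; classes 2 and 3 form group 1.
lookup = [0, 0, 1, 1]

detailed = percent_correct(reference, predicted)
merged = percent_correct(merge_classes(reference, lookup),
                         merge_classes(predicted, lookup))
print(detailed, merged)  # 62.5 100.0
```

Confusion between classes that end up in the same merged group no longer counts as error, which is why accuracy rises as classes are aggregated.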

These results demonstrate that low-pass filtering or segmentation can reduce within-class variance for humid tropical forest types in Landsat TM imagery, thereby increasing forest spectral separability and classification accuracy. The number of forest classes identified at an accuracy of over 90% increased from one in the raw image, to three in the filtered image and to six in the segmented image. The latter compares favourably with results reported elsewhere, in which typically only 4 or 5 mature forest classes have been identified at a similar accuracy level. However, direct comparison of results with those of other studies is complicated by the differing criteria used to define forest classes. The focus of analysis has fallen variously on identifying topographically determined forest classes (Gastellu-Etchegorry et al. 1993); forest density classes (Paradella et al. 1994); pristine from degraded forests (Jusoff and D'Souza 1996); or mature from regenerating forests (Brondizio et al. 1996).

4. Conclusions
The segmentation of Landsat TM imagery provides a means of reducing the spectral overlap between humid tropical forest types, increasing the maximum number that can be classified at a given accuracy level compared with raw or filtered imagery. The detail in forest discrimination and accuracy achieved is comparable with contemporary studies, many of which have made use of more complex methods of image analysis or a greater supply of ancillary data. Further research is now required to establish whether the image segmentation process is genuinely working within and maintaining spatial patterns or merely creating artificial boundaries in a landscape of natural gradation.

Acknowledgements
I am grateful to the Tambopata Reserve Society for provision of the data sets and permission to publish the results, to Richard Lucas, Miroslav Honzák, and Giles Foody for assistance with image analysis, and to Peter North and Robin Fuller for valuable comments on the manuscript.

References
BRONDIZIO, E., MORAN, E., MAUSEL, P., and WU, Y., 1996, Land cover in the Amazon Estuary: linking of the Thematic Mapper with botanical and historical data. Photogrammetric Engineering and Remote Sensing, 62, 921-929.

FOODY, G. M., and HILL, R. A., 1996, Classification of tropical forest classes from Landsat TM data. International Journal of Remote Sensing, 17, 2353-2367.

GARCIA, M. C., and ALVAREZ, R., 1994, TM digital processing of a tropical region in southeastern Mexico. International Journal of Remote Sensing, 15, 1611-1632.

GASTELLU-ETCHEGORRY, J. P., ESTREGUIL, C., MOUGIN, E., and LAUMONIER, Y., 1993, A GIS based methodology for small scale monitoring of tropical trees - a case study in Sumatra. International Journal of Remote Sensing, 14, 2349-2368.

HARALICK, R. M., and SHAPIRO, L. G., 1985, Image segmentation techniques. Computer Vision, Graphics and Image Processing, 29, 100-132.

HILL, R. A., and FOODY, G. M., 1994, Separability of tropical rain-forest types in the Tambopata-Candamo Reserved Zone, Peru. International Journal of Remote Sensing, 15, 2687-2693.

JUSOFF, K., and D'SOUZA, G., 1996, Quantifying disturbed hill dipterocarp forest lands in Ulu Tembeling, Malaysia with HRV/SPOT images. Photogrammetry and Remote Sensing, 51, 39-48.

LANGFORD, M., and BELL, W., 1997, Land cover mapping in a tropical hillsides environment: a case study in the Cauca region of Colombia. International Journal of Remote Sensing, 18, 1289-1306.

LOBO, A., 1997, Image segmentation and discriminant analysis for the identification of land cover units in ecology. IEEE Transactions on Geoscience and Remote Sensing, 35, 1136-1145.

PARADELLA, W. R., DA SILVA, M. F. F., ROSA, N. DE. A., and KUSHIGBOR, C. A., 1994, A geobotanical approach to the tropical rainforest environment of the Carajas Mineral Province (Amazon Region, Brazil) based on digital TM-Landsat and DEM data. International Journal of Remote Sensing, 15, 1633-1648.

PHILLIPS, O., 1993, Comparative Valuation of Tropical Forests in Amazonian Peru. Ph.D. thesis, Missouri Botanic Gardens, St. Louis, Missouri.

RIAZA, A., MARTINEZ-TORRES, M. L., RAMON-LLUCH, R., ALONSO, J., and HERAS, P., 1998, Evolution of equatorial vegetation communities mapped using Thematic Mapper images through a geographical information system (Guinea, Equatorial Africa). International Journal of Remote Sensing, 19, 43-54.

ROUGHGARDEN, J., RUNNING, S. W., and MATSON, P. A., 1991, What does remote sensing do for ecology? Ecology, 72, 1918-1922.

SINGH, A., 1987, Spectral separability of tropical forest cover classes. International Journal of Remote Sensing, 8, 971-979.

TOWNSHEND, J. R. G., CUSHNIE, J., HARDY, J. R., and WILSON, A., 1988, Thematic Mapper Data: Characteristics and Use (Swindon: NERC), 55p.

TUOMISTO, H., LINNA, A., and KALLIOLA, R., 1994, Use of digitally processed satellite images in studies of tropical rain forest vegetation. International Journal of Remote Sensing, 15, 1595-1610.

WHITMORE, T. C., 1990, An Introduction to Tropical Rain Forests (Oxford: Clarendon Press).

Forest Type                         Structural Type
Agricultural land
Bamboo forest                       Open canopy
Lower floodplain forest
Middle floodplain forest
Upper floodplain forest
Upper/old floodplain forest
Terra firme (sand) forests          Closed rough canopy (with emergent trees)
Terra firme (clay) forest
Permanently flooded swamp forest
Seasonally flooded swamp forest     Closed smooth canopy (lacking emergent trees)

Table 1. The 10 lowland forest types of the TCRZ studied in Landsat TM imagery.

Forest class   Composition                         Colour in classified imagery
1              Agricultural land                   Yellow
               Bamboo forest
               Lower floodplain forest
2              Middle floodplain forest            Bright green
3              Upper-old floodplain forests        Dark green
               Terra firme (sand) forest
4              Terra firme (clay) forest           Brown
5              Permanently flooded swamp forest    Red
6              Seasonally flooded swamp forest     Burgundy

Table 2. The six forest classes identified in the segmented TM image (with a 91% accuracy).

Figure Caption
Figure 1. Two 25 x 25 km extracts of Landsat TM imagery covering the TCRZ, Peru: (top) false colour composites where RGB = Bands 7, 4, 2; (bottom) image classification into six forest classes following segmentation (see Table 2 for key). (In addition: grey = savanna, pink = bare ground, blue = water.)