Big Data in Water: Opportunities and Challenges for Machine Learning


2018 Water Resources Assembly and Research Symposium
Headwaters Lecture
Big Data in Water: Opportunities and Challenges for Machine Learning
Vipin Kumar
Department of Computer Science and Engineering, University of Minnesota

Water: A Grand Societal Challenge of the 21st Century
- Floods due to Hurricane Harvey
- Droughts in Southern California
- Harmful Algal Bloom in Lake Erie
- Shrinking Lake Mead

Big Data in Water
- Satellite Imagery
- Weather/Climate Models
- Hydrological Models
- IOT for Water

Golden Age of Data Science
- Hugely successful in commercial applications

Case Study: Monitoring Global Surface Water Dynamics
- Impact of Climate Change
- Impact of Human Actions
- Early Warning Systems
- Quantifying water stocks and flow
- Integrating with hydrological models
Figures: Cedo Caka Lake in Tibet, 1984 and 2011; Aral Sea in 1989 and 2014; Great Flood of Mississippi River; global projections of water risks (red)

Satellite Big Data
- MODIS covers ~5 billion locations globally at 250m resolution daily since Feb
- A vegetation index measures the surface greenness, a proxy for total biomass
- This vegetation time series captures temporal dynamics around the site of the China National Convention Center
Figure: data cube of grid cells with axes latitude, longitude, and time

Satellite data sources (table columns: data, type, coverage, spatial resolution, temporal resolution, spectral resolution, duration, availability):
- LANDSAT: Multispectral, Global, 30 m spatial, 16 days temporal, duration to present, Public
- Hyperion: Hyperspectral, Regional, 30 m spatial, 16 days temporal, duration to present, Public
- Sentinel-1: Radar, Global, 5 m spatial, 12 days temporal, duration to present, Public
- Sentinel-2: Multispectral, Global, 10 m spatial, 6 days temporal, duration to present, Public
- Quickbird: Multispectral, Global, 2.16 m spatial, 2 to 12 days temporal, Private

Water body products:
- SWBD (SRTM Water Body Dataset), Feb 2000
- Google-JRC water body product ( )

Challenges for Traditional Big Data Methods
Challenge 1: Heterogeneity in space and time
- Water and land bodies look different in different regions of the world
- Same water body can look different at different time instances
Challenge 2: Data Quality
- Clouds, shadows, atmospheric disturbances
- Incorrect labels
- Missing data, no labels
Figures: Great Bitter Lake, Egypt; Lake Tana, Ethiopia; Lake Abbe, Africa; Mar Chiquita Lake, Argentina in 2000 (left) and 2012 (right); Poyang Lake, China (pink color shows missing data)

Method Innovations for Monitoring Water
Ensemble Learning Methods for Handling Heterogeneity in Data [1,2]
- Learn an ensemble of classifiers to distinguish between different pairs of positive and negative modes
- Positive modes (water): P1, P2, P3
- Negative modes (land): N1, N2, N3
Using Physics Guided Labeling to Handle Poor Data Quality [3,4]
- Use elevation information to constrain labels so that they are physically consistent
- Figure: locations with elevation A > B > C > D
References: [1] Karpatne et al., SDM; [2] Karpatne et al., ICDM 2015; [3] Khandelwal et al., ICDM; [4] Mithal et al. (PhD Dissertation)

A Global Surface Water Monitoring System
- Maps the dynamics of all major surface water bodies (surface area > 2.5 km²), shown as blue dots
Key Highlights:
- Detects melting of glacial lakes
- Maps changes in river morphology
- Identifies reservoir constructions
- Finds relationships between surface water and precipitation/groundwater
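The slide on physics guided labeling only states that elevation information constrains the labels. A minimal sketch of one way such a constraint could work is given below, under the assumption (not stated in the slides) that within a single water body every location at or below some fill level is water and every location above it is land. The function name `elevation_consistent_labels` and the error-counting rule are hypothetical and are not the authors' published method.

```python
from typing import List, Optional, Sequence


def elevation_consistent_labels(
    elevations: Sequence[float],
    noisy_labels: Sequence[Optional[int]],
) -> List[int]:
    """Relabel locations so that water (1) always sits below land (0).

    ASSUMPTION (illustrative only): a single fill level separates water
    from land. We try every observed elevation as the fill level and keep
    the one that disagrees with the fewest observed labels. Labels that
    are None (for example, cloud-covered pixels) are ignored when
    counting disagreements and are filled in by the chosen level.
    """
    # Candidate fill levels: "no water at all" plus each distinct elevation.
    candidates = [float("-inf")] + sorted(set(elevations))
    best_level = candidates[0]
    best_errors = None
    for level in candidates:
        errors = 0
        for elev, label in zip(elevations, noisy_labels):
            if label is None:
                continue  # missing observation, nothing to disagree with
            predicted = 1 if elev <= level else 0
            if predicted != label:
                errors += 1
        if best_errors is None or errors < best_errors:
            best_level, best_errors = level, errors
    return [1 if elev <= best_level else 0 for elev in elevations]


# Four locations with elevation A > B > C > D; C is missing (cloud).
filled = elevation_consistent_labels([40, 30, 20, 10], [0, 1, None, 1])

# Five locations where the highest one was wrongly labeled as water.
corrected = elevation_consistent_labels([50, 40, 30, 20, 10], [1, 0, 0, 1, 1])
```

In the first call the missing label at C is filled as water because B, which is higher, is observed as water. In the second call the isolated water label at the highest location is overruled by the ordering.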

Showing Surface Water Dynamics
- Don Martin Dam, Mexico
- Annual Landsat time lapse of this region (Courtesy: Google Earth Engine)
- Surface area of water around Don Martin Dam across time

Regions of Change in South America
- Red dots (Water Gain): regions of size > 2.5 km² that have changed from land to water in the last 15 years
- Green dots (Water Loss): regions of size > 2.5 km² that have changed from water to land in the last 15 years
Figures: example time series of a Water Gain region; example time series of a Water Loss region

Examples of Change: Shrinking Water Bodies
- Green dots show regions changing from water to land in the last 15 years
Figures: annual time lapse of an example green dot; aggregate dynamics of all green dots shown on left

Examples of Change: Melting Glacial Lakes in Tibet
- Water Gain regions (red dots) show melting of lakes
- Red polygons show regions changing from land to water
Figures: images from September 2013 and November 2015; aggregate dynamics of all red regions in Tibet

1/23/2018
Examples of Change: River Meandering
- Adjacent occurrences of Water Gain (red) and Water Loss (green) regions all along the river indicate the displacement of water from the green dots to the red dots
Figures: zoomed-in view with regions 1 and 2; time lapses of regions 1 and 2; example time series of a Water Loss region and of a Water Gain region

Examples of Change: Shrinking Island

Examples of Change: Dam Construction
- Construction of Chubetsu Dam, Japan
- Construction of a dam is characterized by a sudden and persistent increase in surface area

Global Reservoir and Dam (GRanD) Database
- A data curation initiative by Global Water System Project (GWSP)
- Finds 61 dams constructed after 2001
- Dams reported by GRanD since 2001: 35
UMN Approach
- Finds 701 dams constructed after 2001
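The description of a dam as a sudden and persistent increase in surface area suggests a simple step detector on a surface-area time series. The sketch below is only an illustration of that idea, not the UMN detection method: the function name `find_persistent_increase` and the parameters `min_jump`, `min_persist`, and `min_baseline` are hypothetical choices.

```python
from statistics import mean
from typing import Optional, Sequence


def find_persistent_increase(
    areas: Sequence[float],
    min_jump: float,
    min_persist: int = 3,
    min_baseline: int = 3,
) -> Optional[int]:
    """Return the first index where surface area jumps up and stays up.

    ASSUMPTION (illustrative only): a change at index t counts as sudden
    and persistent when every value from t to the end of the series is at
    least `min_jump` above the mean of all values before t. At least
    `min_baseline` values must precede t and at least `min_persist`
    values must follow it. Returns None when no such index exists.
    """
    for t in range(min_baseline, len(areas) - min_persist + 1):
        baseline = mean(areas[:t])
        if all(a >= baseline + min_jump for a in areas[t:]):
            return t
    return None


# Surface area (km^2) that jumps after a reservoir fills and stays high.
reservoir = [1.0, 1.2, 0.9, 1.1, 1.0, 6.0, 6.5, 6.2, 6.4]
# A one-time flood spike that recedes should not be flagged.
flood = [1.0, 1.0, 1.0, 6.0, 1.0, 1.0, 1.0, 1.0]

reservoir_index = find_persistent_increase(reservoir, min_jump=3.0)
flood_index = find_persistent_increase(flood, min_jump=3.0)
```

Requiring the increase to hold until the end of the series is what separates reservoir filling from a transient flood in this sketch. Seasonal reservoirs would need a more tolerant rule.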

Comparison of Dam Detections with GRanD
- Dams only reported by GRanD: 5
- Dams reported by both UMN and GRanD: 30
- Dams only reported by UMN: 671

Relationship between Ground Water and Surface Water Area Dynamics
GRACE land data:
- Obtained from
- Available at 1° spatial resolution, monthly since 2002
- Preprocessing: average of GFZ, CSR, and JPL versions computed; prescribed grid scaling factors applied
Surface Water Area Dynamics:
- Number of MODIS water pixels counted for every 1° grid cell every month (to match resolutions with GRACE)
- Preprocessing: grid cells with fewer than 50 MODIS water pixels ignored; data spatially smoothed using a 3 x 3 window
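The surface-water preprocessing steps listed above (drop grid cells with fewer than 50 water pixels, then smooth with a 3 x 3 window) can be sketched as follows. The slides do not say which smoothing filter was used or how ignored cells and grid edges were treated, so this sketch assumes a plain mean filter that skips ignored cells and truncates the window at the edges. The function names are hypothetical.

```python
from typing import List, Optional

Grid = List[List[Optional[float]]]


def mask_sparse_cells(counts: List[List[int]], min_pixels: int = 50) -> Grid:
    """Replace cells with fewer than `min_pixels` water pixels by None."""
    return [[c if c >= min_pixels else None for c in row] for row in counts]


def smooth_3x3(grid: Grid) -> Grid:
    """Mean-smooth each valid cell over its 3 x 3 neighborhood.

    ASSUMPTION (illustrative only): ignored (None) cells stay None and do
    not contribute to their neighbors; windows are truncated at edges.
    """
    rows, cols = len(grid), len(grid[0])
    out: Grid = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is None:
                continue
            window = [
                grid[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
                if grid[i][j] is not None
            ]
            out[r][c] = sum(window) / len(window)
    return out


# Monthly MODIS water-pixel counts on a tiny 3 x 3 patch of 1-degree cells.
counts = [
    [100, 200, 10],
    [300, 400, 60],
    [20, 500, 600],
]
smoothed = smooth_3x3(mask_sparse_cells(counts))
```

Running the same two steps for every month yields one smoothed series per grid cell, which can then be compared with the GRACE series for that cell.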

Correlations with GRACE
- GRACE: Gravity Recovery and Climate Experiment
- Measures changes in total water mass (surface + groundwater) at ~100 km
- Most regions show strong positive correlations between surface water dynamics and GRACE measurements

Examples of Positive Correlations (1)
Figure legend: blue: surface area time series; red: GRACE data
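The slides do not name the correlation measure. Assuming it is the ordinary Pearson coefficient between the monthly surface-area series and the monthly GRACE series of a grid cell, a minimal sketch is shown below. The function name `pearson` and the sample values are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long series.

    ASSUMPTION (illustrative only): both series are already aligned month
    by month and contain no missing values.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly values for one grid cell.
surface_area = [1.0, 2.0, 3.0, 4.0]
grace_rising = [2.0, 4.0, 6.0, 8.0]   # moves with surface area
grace_falling = [8.0, 6.0, 4.0, 2.0]  # moves against surface area

positive = pearson(surface_area, grace_rising)
negative = pearson(surface_area, grace_falling)
```

A value near +1 corresponds to the positive cases in the slides, and a value near -1 to cases where surface area grows while total water mass declines.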

Negative Correlations in Indus Basin: Overconsumption of groundwater?
- Increase in area of surface water due to rice/paddy farming and widening of Indus river
- GRACE shows decrease due to depletion of groundwater for agriculture

Negative Correlations in Bangladesh and Thailand

Can we produce daily surface water extent maps at high spatial resolution?
Challenge:
- MODIS (500m resolution, daily)
- LANDSAT (30m, every 16 days), Sentinel 2 (10m, every 5-10 days)
Solution: ORBIT (Ordering Based Information Transfer) across space and time
- Kajakai Reservoir, Afghanistan: extent at coarse resolution (500m) and extent at high resolution (30m) created using our approach
Figures: quantifying water stocks and flow; global projections of water risks (red)
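The slides say only that ORBIT transfers information using an ordering and, in the later examples, that it uses DEM data to turn a 500m MODIS extent into a 30m or 10m extent. The sketch below illustrates the general idea of ordering-based downscaling under a strong assumption that is not taken from the slides: inside one coarse pixel, fine cells fill with water from the lowest elevation upward. It is not the ORBIT algorithm itself, and `downscale_by_elevation_order` is a hypothetical name.

```python
from typing import List, Sequence


def downscale_by_elevation_order(
    water_fraction: float,
    fine_elevations: Sequence[float],
) -> List[int]:
    """Distribute a coarse pixel's water fraction over its fine cells.

    ASSUMPTION (illustrative only): water occupies the lowest-elevation
    fine cells first, so the elevation ordering decides which cells are
    water (1) and which are land (0).
    """
    if not 0.0 <= water_fraction <= 1.0:
        raise ValueError("water_fraction must be between 0 and 1")
    n = len(fine_elevations)
    n_water = round(water_fraction * n)
    # Indices of fine cells sorted from lowest to highest elevation.
    order = sorted(range(n), key=lambda i: fine_elevations[i])
    mask = [0] * n
    for i in order[:n_water]:
        mask[i] = 1
    return mask


# One coarse pixel that is half water, covering four fine cells.
fine_mask = downscale_by_elevation_order(0.5, [12.0, 5.0, 9.0, 20.0])
```

Here the two lowest cells (elevations 5.0 and 9.0) are marked as water. A real 500m pixel contains many more 30m cells, and the coarse fraction would come from the daily MODIS classification.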

Daily surface water mapping at 30m: Lake Mead, USA
- Background: LANDSAT 7 image of Dec 13, 2000
- Surface extent at 500m created from MODIS data on Dec 13
- Surface extent at 30m created from MODIS 500m data on Dec 13, 2000 by ORBIT approach using USGS 30m DEM data


Daily surface water mapping at 30m: Lake Mead, USA (MODIS 500m pixel grid in cyan color)
- Background: LANDSAT 7 image of Dec 13, 2000
- Surface extent at 30m created from MODIS 500m data on Dec 13, 2000 by ORBIT approach using USGS 30m DEM data

Daily surface water mapping at 10m: Richland Chambers Reservoir, USA
- Background image: Sentinel-2 image of Apr 02, 2016
- Surface extent at 500m created from MODIS data on Apr 02

- Richland Chambers Reservoir, USA: surface extent at 10m created from MODIS 500m data on Apr 02, 2016 by ORBIT approach using USGS 10m DEM data


Other Applications of Big Data in Water
Land-Water Interaction
- Digital Twin of Anacostia Watershed (Collaboration: D.C. Water)
- Cover Crop Mapping (Collaboration: University of Minnesota)
Hybrid Physics-Data Models
- Modeling Lake Water Quality (Collaboration: USGS)
- Hydrological Models for Streamflow (Collaboration: Northeastern University)
IOT for Water
- Leakage Detection using smart meters

Team Members
- University of Minnesota: Arindam Banerjee, Snigdhansu Chatterjee, Michael Steinbach, Jeff Peterson, David Mulla
- Northeastern University: Auroop Ganguly, Ed Beighley
- University of Wisconsin: Paul Hansen, Hilary Dugan
- USGS: Jordan Read
- UCLA: Dennis Lettenmaier
- University of Maryland: Charon Birkett
- Anuj Karpatne, Ankush Khandelwal, Xiaowei Jia