47th Liege Colloquium
Luigi Ceccaroni, Meinte Blaas, Marcel R. Wernand, Filip Velickovski, Anouk Blauw and Laia Subirats
Liège, May 8th 2015
A decision support system for water quality in the Wadden Sea
Index
1.- Optical monitoring in a citizen-science context
2.- Optical properties and marine-environment status
3.- Decision support system
4.- Target users and applications
5.- Conclusions
Optical monitoring in a citizen-science context
Combination of data collected by:
- a distributed group of people (Citclops crowdsourcing)
- publicly available data:
  - government satellites
  - research-institute campaigns
Interpretation by context-aware artificial-intelligence techniques
Optical properties and marine-environment status
Optical properties as proxies of:
- sewage impact
- dissolved organic matter
- sediment load
- natural pigmentation
And more generally:
- marine-environment status
- anthropogenic pressures on resources
- natural causes of abnormal conditions
Focus on color (FU)
- Dependent mostly on the optically active substances: SPM, algae, CDOM (and POC, TOC)
- Related to light extinction (Kd) and Secchi depth (SD), also dependent on the same substances
- Presence of the optically active substances steered by: currents, waves, wind, insolation, temperature, nutrients (TotN, TotP, PO4, NO3, NO3NO2, NH4, SiO2)
- Salinity as a proxy for river water, which carries most (not all) nutrients into the sea
Study area in the Wadden Sea over a fixed study period (2013-2015)
General objectives
- Assess and use available marine data: FU color index, suspended particulate matter, TSM, dissolved organic carbon, light extinction, Secchi disk depth, chlorophyll-a, wave height, river input, weather
- Learn a model of the target variable FU color at future points (2 days, 4 days, 7 days)
- Evaluate the model's prediction of FU using 10-fold cross-validation
- Integrate the model into Citclops Data Explorer Marine Data Analyser
Available data sources
- In-situ: SPM, chlorophyll-a, DOC, Kd (2003-2013); average time resolution: 2 data points per month (Deltares)
- In-situ: river input (1 data point per day), wave height (1 data point per hour) (Deltares)
- Satellite: MERIS instrument: FU, chlorophyll-a (2002-2011); time resolution: daily (missing data on cloudy days)
- In-situ: TSM, FU, chlorophyll-a (2013-2015); time resolution: every 2 min during daylight (some missing periods) (NIOZ)
Jetty data: daily means (figure; time-axis labels: Mar 2013, Sep 2013, Oct 2014, Feb 2015)
Machine-learning pipeline
- Parsing: extract data from source; convert to data array
- Feature preparation: convert data into feature vectors + target value
- Feature selection: select feature variables and the number of past time points
- Model selection: train and test a set of models (decision trees, SVM, Bayesian)
- Final evaluation: evaluate the best model on an unseen test set
Converting to a three-class classification problem
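The conversion can be illustrated with a minimal sketch. The class names (decrease, stable, increase) are the ones used on the results slides; the function name and the `tolerance` dead band are assumptions made here for illustration, since the slides do not state how "stable" is delimited.

```python
def fu_trend_class(fu_now, fu_future, tolerance=0.5):
    """Label the change in FU between today and a future day.

    `tolerance` is a hypothetical dead band (in FU units): changes smaller
    than it, in either direction, are treated as "stable".
    """
    delta = fu_future - fu_now
    if delta > tolerance:
        return "increase"
    if delta < -tolerance:
        return "decrease"
    return "stable"
```

With such a labeling, forecasting FU at t+2 becomes a question of predicting one of three classes instead of a continuous value.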
Using mean daily values (example: TSM) (figure; time axis from 30th Apr to 24th May 2014)
Using mean daily values, sometimes few data points (example: TSM) (figure; time axis from 14th Dec to 26th Dec 2013)
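A minimal sketch of the daily averaging, assuming the high-frequency measurements arrive as (timestamp, value) pairs with `None` for gaps. The function name and the `min_count` parameter are hypothetical; `min_count` is one possible way to handle days with few data points, not something the slides prescribe.

```python
from collections import defaultdict
from datetime import datetime

def daily_means(samples, min_count=1):
    """Average (timestamp, value) samples per calendar day.

    Missing values (None) are skipped. Days with fewer than `min_count`
    valid samples are dropped (hypothetical quality filter).
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        if value is not None:
            buckets[timestamp.date()].append(value)
    return {day: sum(vals) / len(vals)
            for day, vals in sorted(buckets.items())
            if len(vals) >= min_count}

# Illustrative values only, not measured data.
samples = [
    (datetime(2013, 12, 14, 10, 0), 4.0),
    (datetime(2013, 12, 14, 10, 2), 6.0),
    (datetime(2013, 12, 14, 10, 4), None),
    (datetime(2013, 12, 15, 9, 0), 3.0),
]
means = daily_means(samples)                 # both days kept
strict = daily_means(samples, min_count=2)   # only the day with 2 valid samples
```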
Short-term correlation to tide (dominant semi-diurnal lunar tidal component (M2) with a period of 12:25 h) (figure; time axis from 28th Sep to 4th Oct 2013)
Knowledge coming from pre-processing
- Clear relation among color, tides, and other drivers: wind (which steers both waves and currents), river/sluice discharge and biology
- Short-term forecasting (within a tidal period): not possible without de-tiding the signals
- Longer-term forecasting: possibly improved by de-tiding the signals
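The slides do not say how de-tiding would be done. One simple option, shown here only as a sketch, is a centered running mean over about one M2 period (12 h 25 min), which averages out an oscillation of that period. The function name, the `step_min` parameter and the synthetic signal are assumptions for illustration; a harmonic fit would be an alternative.

```python
import math

M2_PERIOD_MIN = 12 * 60 + 25  # 745 minutes, the M2 period quoted on the slide

def detide(values, step_min=2.0, period_min=M2_PERIOD_MIN):
    """Centered running mean spanning roughly one tidal period.

    `values` is an evenly sampled series (one value every `step_min`
    minutes). Positions where the window does not fit return None.
    """
    half = max(1, round(period_min / step_min / 2))
    out = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            out.append(None)  # incomplete window at the edges
        else:
            window = values[i - half:i + half + 1]
            out.append(sum(window) / len(window))
    return out

# Synthetic check: constant level 5 plus a pure M2 oscillation, sampled every 5 min.
signal = [5 + math.sin(2 * math.pi * i * 5 / M2_PERIOD_MIN) for i in range(600)]
smoothed = detide(signal, step_min=5)
```

In the synthetic case the oscillation averages out and the smoothed series stays at the constant level wherever the window fits.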
Feature and target-variable preparation
Training data: time series (days) of TSM, Chl-a and FU
Feature vector at time t: FU t, FU t-1, Chl t, Chl t-1, TSM t
Target: FU t+2
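The preparation step can be sketched as a sliding window over daily series. This is a minimal illustration under assumptions: the function name, the dictionary layout and the parameters `n_lags` and `horizon` are hypothetical, and every variable is given the same number of past time points, which may differ from the configuration actually used.

```python
def make_examples(series, n_lags=2, horizon=2, target_name="FU"):
    """Build (feature_vector, target) pairs from daily series.

    `series` maps a variable name to a list of daily values (same length
    for all variables, None for missing days). For each day t, the
    features are the values at t, t-1, ... for every variable, and the
    target is `target_name` at t + horizon. Examples with gaps are skipped.
    """
    names = sorted(series)
    n_days = len(series[target_name])
    examples = []
    for t in range(n_lags - 1, n_days - horizon):
        features = [series[name][t - lag]
                    for name in names for lag in range(n_lags)]
        target = series[target_name][t + horizon]
        if target is None or any(v is None for v in features):
            continue
        examples.append((features, target))
    return examples

# Illustrative values only, not measured data.
demo = {
    "FU": [10, 11, 12, 13, 14],
    "TSM": [1, 2, 3, 4, 5],
    "Chl": [0.1, 0.2, 0.3, 0.4, 0.5],
}
demo_examples = make_examples(demo)
```

Changing `horizon` to 4 or 7 gives the other prediction ranges listed in the objectives.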
Highly flexible system with real-time processing
- Data set
- Feature configuration
- Machine-learning technique
- Temporal range of prediction
Experiments and examples of results
Feature configuration: wave height, TSM, Chl-a, FU
ML technique: Support Vector Machine (C (soft-margin): 1.0; kernel: radial basis)
Number of examples for training: 415
Blind predictor benchmark: 48%
Distribution of target FU classes: decrease 27%, stable 48%, increase 25%
Accuracy (10-fold cross-validation): 53%
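The blind predictor benchmark matches the share of the most frequent class (48% stable), which suggests a baseline that always predicts the majority class. The sketch below works under that assumption; the function name is hypothetical, and the label list simply reuses the reported percentages as counts for illustration.

```python
from collections import Counter

def blind_predictor_accuracy(labels):
    """Return the majority class and the accuracy of always predicting it."""
    counts = Counter(labels)
    majority_class, majority_count = counts.most_common(1)[0]
    return majority_class, majority_count / len(labels)

# Class shares from the slide, used as counts out of 100 (illustrative).
labels = ["decrease"] * 27 + ["stable"] * 48 + ["increase"] * 25
baseline_class, baseline_accuracy = blind_predictor_accuracy(labels)
```

A model is only informative if its cross-validated accuracy exceeds this baseline.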
Feature configuration: wave height, TSM, Chl-a, FU
ML technique: Random Forest (n_trees: 10)
Number of examples for training: 359
Blind predictor benchmark: 35%
Distribution of target FU classes: decrease 33%, stable 32%, increase 35%
Accuracy (10-fold cross-validation): 52%
Feature configuration: wave height, TSM, Chl-a, FU
ML technique: Decision Tree (max depth: 10)
Number of examples for training: 345
Blind predictor benchmark: 35%
Distribution of target FU classes: decrease 34%, stable 31%, increase 35%
Accuracy (10-fold cross-validation): 45%
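The 10-fold cross-validation used for the accuracy figures can be sketched in plain Python. All names here are hypothetical, and the trivial majority-class learner only stands in for the SVM, random forest or decision tree; the sketch also pools predictions over folds, which may differ slightly from averaging per-fold accuracies.

```python
import random
from collections import Counter

def k_fold_indices(n_examples, k=10, seed=0):
    """Shuffle example indices and deal them into k disjoint folds."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_val_accuracy(examples, train_fn, k=10, seed=0):
    """Pooled accuracy over k folds.

    `examples` is a list of (features, label) pairs; `train_fn(train)`
    must return a function mapping features to a predicted label.
    """
    correct = 0
    for fold in k_fold_indices(len(examples), k, seed):
        held_out = set(fold)
        train = [ex for i, ex in enumerate(examples) if i not in held_out]
        predict = train_fn(train)
        correct += sum(predict(examples[i][0]) == examples[i][1] for i in fold)
    return correct / len(examples)

def train_majority(train):
    """Stand-in learner: always predicts the most frequent training label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: majority

# Illustrative data only: 60 "stable" and 40 "increase" examples.
toy = [([i], "stable") for i in range(60)] + [([i], "increase") for i in range(40)]
toy_accuracy = cross_val_accuracy(toy, train_majority)
```

Any learner that follows the same `train_fn` convention can be dropped in and compared against the blind predictor on identical folds.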
Target users and applications
Data from citizens, marine scientists and coastal planners to citizens and policy makers
Plenty of potential applications:
- to simulate environmental crises
- to chart emergency-management plans of the coastal zone
- to provide sea farmers with bulletins about algal blooms
- to maximize citizens' experience in activities in which water quality has a role
- to provide citizens with powerful, user-friendly tools of environment interpretation
Conclusions and future work
- First steps in real-time prediction of water color using easily available data sources as context
- Transition towards a citizen-science scenario
- Use of citizen data to complement remote-sensing and official in-situ data
- Use of machine-learning techniques in a very flexible framework
- Short-term impact on policy decision via decision support system
First stage (handcraft); second stage, unit I (Prototype I, 3D print, internal LED); third stage, unit II (Prototype II, 3D print, external LEDs)
47th Liege Colloquium
Luigi Ceccaroni, Barcelona Digital Technology Centre, Spain; 1000001 Labs, Spain
Meinte Blaas, Deltares, The Netherlands
Marcel R. Wernand, Royal Netherlands Institute for Sea Research, The Netherlands
Filip Velickovski, Barcelona Digital Technology Centre, Spain
Anouk Blauw, Deltares, The Netherlands
Laia Subirats, Barcelona Digital Technology Centre, Spain
Liège, May 8th 2015