Developing Big-Data Infrastructure for Analyzing AIS Vessel Tracking Data on a Global Scale

Similar documents
Anomaly Detection in Vessel Tracking Using Support Vector Machines (SVMs)

ERDC and e-nav Data Use

AIS Antenna Network Partnerships. Become part of our global expansion programme

Coastal and Marine Third Party

Asset Monitoring Intelligence Center (AMIC)

Alaska Deep-Draft Arctic Ports Navigation Feasibility Study. U.S. Army Corps of Engineers Bruce Sexauer P.E. Chief of Planning, Alaska District

Marine Mobile Service Identifiers

Marine Board Spring Meeting April 27, 2011 David M. Kennedy

Data supply chain certification - quality monitoring and indication for e-navigation solution reliability

MTS Resilience: enavigation Best Practices

Priority Setting Process For Hydrographic Surveys Page 1 of Parameter 1 - Quality of Survey Data Currently Available... 9

DETAILED COURSE AGENDA

e-navigation workshop

STAD SHIP TUNNEL. Fire Protection & Safety in Tunnels conference Terje Andreassen, Project Manager. Vi tar ansvar for sjøvegen

Maritime Intelligence Risk Suite

Project Overview. Northwest Innovation Works LLC and the Port of Kalama propose to develop and operate

Agriculture Information System. Crop Information Portal of Pakistan: version 2 Database Administration

Trade & Transport Data Lake (a proof of concept)

Application Note. NGS Analysis of B-Cell Receptors & Antibodies by AptaAnalyzer -BCR

MANDATORY SHIP REPORTING SYSTEM

Transport Canada Marine Transportation in the Canadian Arctic Presentation to the International Maritime Statistics Forum

Optimize your Service Cloud Data Best Practices, Data Extraction and NextGen Reporting

SYMPOSIUM March 22-23, 2018

PRODUCT DESCRIPTIONS AND METRICS

Electronic Chart Display and Information Systems for navigational safety in maritime transportation

Klikk for å redigere tittelstil

Risk Management User Guide

Waterways 1 Water Transportation History

Marine Protection Rules Part 100 Port Reception Facilities Oil, Noxious Liquid Substances and Garbage

CONVERSION OF SPECIFIC GRAVITY TO SALINITY FOR BALLAST WATER REGULATORY MANAGEMENT

HPC for Maritime SMEs

Marine Pathways Project - Monitoring

Panama Canal Authority Toll Tariffs approved by Cabinet Council and published on the Official Gazzette Implementation: April 1, 2016

ANNEXES. to the Commission Implementing Regulation

User Needs and High-Level Requirements for the Next Generation of Observing Systems for the Polar Regions

Smart Ship Monitoring. Smart Ship Monitoring. Smart Ship Monitoring

TrajAnalytics: An open-source, web-based software system for visual analysis of urban trajectory data

CREATE INSTANT VISIBILITY INTO KEY MANUFACTURING METRICS

BIGDATAOCEAN PROJECT OVERVIEW

Channel Utilization in South Louisiana Using AIS Data,

MRV Verification Process

Build Digital Democracy

DanelecConnect. Ship-2-shore data solution. Danelec systems Solid Safe Simple

MUNIN -Maritime Unmanned Navigation through Intelligence in Networks

Kx for Telecommunications

Workflow Mining: Identification of frequent patterns in a large collection of KNIME workflows

Managed Services. Service Description West Swamp Road, Suite 301 Doylestown, Pa P

Worldwide and Route-specific Coverage of Electronic Navigational Charts

What Does the MTS Mean to Me?

Enhancing Data Resolution to Improve Maritime Safety

SECTION III NM 48/16 MARINE INFORMATION

SAP Predictive Analytics Suite

KnowledgeENTERPRISE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK. Advanced Analytics on Spark BROCHURE

Port Performance Statistics Fixing America's Surface Transportation ("FAST ) Act was signed into law on December 4, 2015.

Knowledge-Guided Analysis with KnowEnG Lab

A big data driven approach to extracting global trade patterns

TERMPOL Review Process Report on the Enbridge Northern Gateway Project

Human Factors in Control, Trondheim 17/

Statistics Netherlands. AIS ESSnet Projectplan. Version 1.0.p3

HOW DISTRIBUTION COMPANIES CAN IMPROVE EFFICIENCY WITH GP. John DiLeo Senior Implementation Specialist

ANNEX 1. RESOLUTION MSC.325(90) (adopted on 24 May 2012)

AMAZON WATERWAY ENVIRONMENTAL AND TECHNICAL ASPECTS: Marañón and Amazon Rivers, Saramiriza Iquitos Santa Rosa section ; Huallaga river, Yurimaguas

Decision Statement Answer Key

Energy Efficiency EEOI

"The preparation and execution of the National Plan for the protection of the marine environment in the State of Kuwait"

Emerging Patterns for Implementing the ArcGIS Platform. Marten Hogeweg Peter Bottenberg

USACE NAVIGATION R&D UPDATE

The Arctic Ocean, highlighting the area of NOAA s Hidden Ocean expedition

Increasing of Effectiveness and Reliability of Data from Drifting buoys

RESOLUTION MSC.325(90) (adopted on 24 May 2012) AMENDMENTS TO THE INTERNATIONAL CONVENTION FOR THE SAFETY OF LIFE AT SEA, 1974, AS AMENDED

Bulgarian Vessel Traffic Management and Information System and Education and Training of VTS Personnel in Bulgaria

Maritime Services Ship Energy Optimisation Solutions. EMEA USERS CONFERENCE 2017 LONDON #OSISOFTUC 2017 OSIsoft, LLC

3 Ways to Improve ROI from Salesforce.com & Marketing Automation Data

The Concept of the Development of Turkmenbashi International Seaport and the Marine Merchant Fleet till Year 2020 was approved with Resolution of the

NEW YORK CITY DEPARTMENT OF TRANSPORTATION

What about streaming data?

WEATHERING AND OIL SPILL SIMULATIONS IN THE AFTERMATH OF TANKER ACCIDENTS AT THE JUNCTION POINTS IN THE MARMARA SEA

Managing Your Work with ArcGIS Workflow Manager. Tope Bello

A Landscape Conservation Design for the Aleutian and Bering Sea Islands Landscape Conservation Cooperative

HR Business Partner Guide

Current state of our seas, future trends in marine sectors, and the blue growth agenda

INFORMATION BULLETIN No. 51

Wonderware edna. Real-time enterprise data historian

Using Archived Stop-Level Transit Geo-Location Data for Improved Operations and Performance Monitoring

e-navigation Frequently Asked Questions

Coastal Response Research Center October 5, United States Arctic Research Commission s 96 th Meeting. Coastal Response Research Center 1

TEN-T POLICY REVIEW. UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE Geneva, 7-8 September 2010

Administration Action Plan

Experiences in the Use of Big Data for Official Statistics

R Computing services as SaaS in the Cloud. Fernando Aguilar Instituto de Física de Cantabria (IFCA) Santander - Spain

DNV GL Digitalization Drive from Vision to Solution

Maritime Spatial Planning in the German EEZ

Cook Inlet Pipeline Infrastructure Assessment: STATUS UPDATE. Tim Robertson Nuka Research and Planning Group

ARC System Upgrade. What s New with 9.2. Presentations: January & February 2017

HELSINKI COMMISSION HELCOM MARITIME 13/2013 Maritime Group 13 th Meeting Szczecin, Poland, November 2013

Konica Minolta Business Innovation Center

Complex Event Processing: Power your middleware with StreamInsight. Mahesh Patel (Microsoft) Amit Bansal (PeoplewareIndia.com)

Zappos Vendor Scorecard Guide Version

Transcription:

Ocean Science Meeting 2018 Big Data for a Big Ocean: Progress on Tools, Technology, and Services III Developing Big-Data Infrastructure for Analyzing AIS Vessel Tracking Data on a Global Scale Jessica Austin Axiom Data Science 02/15/2018

The Automatic Identification System (AIS) " On-board tracking system, transmitting over VHF AIS equipment integrates with vessel navigation systems, such as GPS Other AIS-equipped ships can see traffic in the area Land-based (and satellite) receiver networks can collect data for an entire region " Increasing interest in analyzing historic datasets Prioritizing hydrographic surveys Predicting probability and impact of oil spills Quantifying interaction with marine wildlife etc https://sat-trak.com/ 2

Project Overview Problem: Due to immense size of these datasets typically 10s of billions of raw messages per year and limitations on infrastructure and computing power, AIS data must currently be processed in small temporal or spatial subsets. Solution: Develop cluster computing infrastructure and applications to analyze the data across many machines in parallel Results: Previous workflow: single machine, small subset days to weeks weeks to process Our workflow: compute cluster, all US waters for one year 48 hours 3

What is Cluster Computing? Technologies 16 computers 368 total cores 1500 GB available memory 4

The Data Processing Pipeline Raw AIS Messages Vessel Pings Daily Vessel Filtered Vessel Vessel Traffic Heatmaps 5

Real-world example: Hydrographic Health Model " Hydrographic Health Model establishes survey priorities across all US waters Managed by NOAA Office of Coast Survey (OCS) Includes metrics like: When was this area last surveyed? What is the seafloor complexity? What is the typical vessel traffic in this area? etc " Vessel Traffic Heatmap requirements We only care about ships that are moving and that are in US waters Need to be split out by ship type, metric, and region (30 total files) Tanker, Cargo, Passenger, and Other, and All ships Total Vessel Count and Unique Vessel Count Continental US, Alaska, Hawaii 500m resolution, Albers Equal Area projection " Current state-of-the-art (an ArcMap plugin) takes days to weeks to process small spatio-temporal subset of data 6

Pipeline: Raw AIS messages " AIS Messages are collected by land-based and satellite receivers Agencies aggregate messages, and provide them as a database Each message includes: timestamp, ship MMSI, latitude, longitude, speed over ground, etc " Vessel Catalog Maps MMSI to vessel attributes Ship type, max draft, length, etc " Possible issues Duplicate data Limited range for land-based Self-reported data https://sat-trak.com/ Raw AIS Messages Vessel Pings Daily Vessel Filtered Vessel Vessel Traffic Heatmaps 7

Pipeline: Raw AIS messages " AIS Messages are collected by land-based and satellite receivers Agencies aggregate messages, and provide them as a database Each message includes: timestamp, ship MMSI, latitude, longitude, speed over ground, etc " Vessel Catalog Maps MMSI to vessel attributes Ship type, max draft, length, etc " Possible issues Duplicate data Limited range for land-based Self-reported data Real-world Example: Hydrographic Health Model 2015 Terrestrial data from Coast Guard network (via NOAA OCS) 74,212,891,806 messages (7.5TB uncompressed) Vessel Catalog: From US Coast Guard, 80,855 vessels Raw AIS Messages Vessel Pings Daily Vessel Filtered Vessel Vessel Traffic Heatmaps 8

Pipeline: Raw messages to Vessel Pings " Parse and clean the messages Discard invalid messages Discard any messages that are not Position Reports Remove duplicate messages " Result is called a Vessel Ping Real-world Example: Hydrographic Health Model Input: 74,212,891,806 raw messages Output: 2,017,535,550 pings (2.72% of raw messages) Total time: 40.7 hrs (~6.7 min/day) Raw AIS Messages Vessel Pings Daily Vessel Filtered Vessel Vessel Traffic Heatmaps 9

Pipeline: Vessel Pings to Vessel " Vessel describe the probable path of a ship " are At least two pings long Broken up if the points are too sparse Broken up if the ship stopped in one place for a long time Raw AIS Messages Vessel Pings Daily Vessel Filtered Vessel Vessel Traffic Heatmaps 10

Pipeline: Vessel Pings to Vessel " Vessel describe the probable path of a ship " are At least two pings long Broken up if the points are too sparse Broken up if the ship stopped in one place Real-world Example: Hydrographic Health Model Input: 2,017,535,550 pings Output: 12,297,024 voyages Avg length: 51 pings (~31% of pings used) Total time: 3.2 hrs (~31 sec/day) Raw AIS Messages Vessel Pings Daily Vessel Filtered Vessel Vessel Traffic Heatmaps 11

Pipeline: Filtered Vessel " Join to Vessel Catalog by ship identifier (MMSI) " Segment data by Region Ship Type Time Frame etc Real-world Example: Hydrographic Health Model 30 total files: Regions: Continental US, Alaska, Hawaii Ship Types: Tanker, Cargo, Passenger, and Other, and All ships Metrics: Total Vessel Count and Unique Vessel Count Raw AIS Messages Vessel Pings Daily Vessel Filtered Vessel Vessel Traffic Heatmaps 12

Pipeline: Vessel Traffic Heatmaps " Divide region into a grid, then count crossings " Heatmap format is configurable GeoTIFF or NetCDF Arbitrary grid size; default is 500 meters Choice of Albers Equal Area projection " Possible metrics Total traffic volume Unique vessel count Maximum vessel draft etc... Raw AIS Messages Vessel Pings Daily Vessel Filtered Vessel Vessel Traffic Heatmaps 13

Pipeline: Vessel Traffic Heatmaps " Divide region into a grid, then count crossings " Heatmap format is configurable GeoTIFF or NetCDF Arbitrary grid size; default is 500 meters Choice of Albers Equal Area projection " Possible metrics Total traffic volume Unique vessel count Maximum vessel draft etc... Real-world Example: Hydrographic Health Model Input: 12,297,024 voyages Output: 30 heatmap files Total time: 90 minutes (Includes filtering and creating heatmaps) OVERALL TIME: 45.4 hrs Raw AIS Messages Vessel Pings Daily Vessel Filtered Vessel Vessel Traffic Heatmaps 14

Real-world example: Hydrographic Health Model 15

Real-world example: Hydrographic Health Model Passenge r Anchorage Valdez Dutch Harbor Cargo Valdez Juneau Valdez Juneau Tanker Dutch Harbor Anchorage Dutch Harbor Anchorage Juneau Valdez Anchorage Juneau Dutch Harbor Other 16

What's Next? " Publicly available website with static downloads: http://ais.axds.co " More datasets (currently in analysis): Marine Exchange of Alaska, 2008-2017 Satellite AIS Data (global) Danish Maritime Authority (Europe) " Integration into IOOS and AOOS Data Portals View alongside sea ice models, environmental sensors, wildlife habitat datasets, etc Dataset added to IOOS Data Catalog, archived with NCEI " Enable ad-hoc querying with GeoTrellis, Jupyter Notebooks 17

Real-world example: Hydrographic Health Model Tanker Ships Passenger (blue) and Other (green) Total vs Unique Traffic: 18