The Internet of Everything and the Research on Big Data
Angelo E. M. Ciarlini
Research Head, Brazil R&D Center
A New Industrial Revolution
  Sensors everywhere: 50 billion connected devices by 2020
    Industrial machines
    Cars, planes, all means of transportation, drones
    Cities
    Buildings and homes
    Wearables and our bodies
  Radically changing processes and people's lives
  Data deluge: tens of zettabytes of data
  Opportunities and challenges
  Internet of Things (of Everything), Big Data and Cloud
Agenda
  Dell EMC and Brazil R&D Center
  Internet of Things (of Everything)
  Applications and business opportunities: Smart X
  Key components
  Technical challenges for IoT and relation with Big Data
  Main research topics in IoT + Big Data
  Our experience at our Research Center
  Concluding remarks
The world's most value-focused global supply chain, providing technology infrastructure for organizations of all sizes
The world's leading data center innovation engine, with cutting-edge enterprise infrastructure for the most demanding environments
The world's largest privately held technology company, with world-class enterprise sales and support
Award-winning customized solutions offering innovative devices and services designed for the way people work
The most comprehensive portfolio of technology solutions from the edge to the core to the cloud
Leading the intersection of Big Data, PaaS and agile development, leveraging data on one cloud-independent platform
The foundation to transform your data center with industry-leading servers, storage and converged infrastructure
The premier provider of security, risk and compliance solutions, solving your most complex challenges
Elite and trusted intelligence that strengthens security and reduces risk in a dynamic landscape
The leading enterprise-class cloud software and solution provider
The most trusted virtualization for desktop, data center and applications
Brazil R&D Center
  Corporate investment of US$ 100 million
  Located at UFRJ Tech Park
  Initially applied research to the Petroleum Industry and Public Sector
  Now extended to Telco, Health Care, Finance, Transportation, Electric Utility, Construction
  Special focus on Big Data and IoT
  Create revolutionary technologies to solve relevant problems for the industry
  Around 20 researchers/data scientists + academic partners
The Internet of Things (IoT)
  Technology to connect physical devices, exposing them to applications
  Internet-based
  Connectivity layer -> multiple points
  Evolution from the Machine-to-Machine (M2M) approach
  More than a connection from physical device to backend
  Search for general, standardized and efficient solutions
  Interoperability
  But the things that we might want to connect are not only physical...
Internet of Everything (IoE)
  Everything connected
    Physical things
    Digital things
    People
      Direct and indirect interaction
      Sensors providing information from our bodies
    Processes
  Distinction between digital and physical things tends to be blurred
  Sometimes IoT is used in the sense of IoE; sometimes IoE is used to emphasize the difference from M2M
Opportunities: Smart X
  Smart Industry
  Smart Cities
  Smart Health
  Smart Energy
  Smart Buildings
  Smart Agriculture, Homes, Transportation, Culture, Tourism
  Increasing level of automation and optimization
Smart Industry
  Sensors all over the industrial plants
  Monitoring all processes
  Predictive asset maintenance
  Production optimization in different operating states
  Detection of anomalies that hinder production
  Higher level of automation
Smart Industry - Examples
  Petrochemical: optimization for production of different products, reduction of non-conformant intermediate products
  Oil & gas: upstream & downstream
  Logistics: locations based on RFID, stock calculation
  Aviation: safety, fuel consumption
Smart Health
  Patient surveillance, chronic disease
  Monitoring of elderly people, fall detection
  Teleoperations and automation
  Science: control tests of new drugs
  Data for Medical Science:
    Monitor people's habits
    Monitor people's vital signs
    Genome
    Information about whole populations
  New treatments
  Best treatment for each individual
Smart Energy
  Smart Grid: consumption monitoring and management
  Consume when it is cheaper
  Local production
  More reliability of the electrical system: isolation of failures
  Monitor and control production at plants: best performance of the whole system
  Better integration with renewable energy: solar and wind
  Reduction in carbon emissions
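The "consume when it is cheaper" idea can be illustrated with a minimal Python sketch. It assumes a flexible load (for example, a water heater) that needs a fixed number of running hours per day and a known hourly price forecast. The function name, the prices and the load are hypothetical and are not taken from the talk.

```python
def cheapest_hours(prices, hours_needed):
    """Return the indices of the `hours_needed` cheapest hours, in chronological order."""
    # Rank hours by price; ties are broken by the earlier hour.
    ranked = sorted(range(len(prices)), key=lambda hour: (prices[hour], hour))
    return sorted(ranked[:hours_needed])


# Hypothetical price forecast (currency units per kWh) for eight consecutive hours.
prices = [0.30, 0.28, 0.12, 0.10, 0.11, 0.25, 0.40, 0.45]

# A hypothetical 2 kW load that must run for three hours in this window.
schedule = cheapest_hours(prices, hours_needed=3)
cost = sum(prices[hour] * 2.0 for hour in schedule)
print(schedule, round(cost, 2))
```

A real smart-grid scheduler would also have to respect constraints such as contiguous running time or comfort limits; this sketch only shows the price-driven selection.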
Smart Cities
  Dealing with the huge problems of huge cities
  Traffic congestion
  Public transportation planning and management
  Driverless cars: 50% of US cars by 2031
  Waste management
  Lighting control
  Security: video monitoring, fire control
IoT Schematic Components
  Sensors
  Local processing
  Local storage
  Network
  Internet
  Cloud processing
  Cloud storage
  Note: local processing and storage are still limited
Technical Challenges
  IoT architecture
  Network technology and discovery
  Hardware technology: miniaturization
  Software and algorithms: monitoring and real-time analytics
  Data and signal processing: huge amount
  Increase automation: add intelligence to the things
  Power and energy storage
  Security and privacy
  Interoperability and standardization
How Big Data can leverage the IoE
  Consider data integration at a global scale
  Need to deal very efficiently with huge amounts of data
  Contextualization
  Consider all data available from all sensors (and all related data)
  Real-time analytics: create, apply and update models
  Consider streaming and history
  Make best decisions for each situation
  Probabilistic approach: quantify and reduce uncertainty
  Knowledge representation and reasoning: understand current state and reason about future states -> Semantics and Automated Planning
  Perform global optimizations (in real-time)
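As an illustration of "create, apply and update models" over both history and streaming data, the following Python sketch keeps a running mean and standard deviation (Welford's online algorithm) and flags readings that deviate strongly from it. This is only one simple choice of model, picked here for brevity; the class and function names, the threshold and the sample readings are assumptions, not something described in the talk.

```python
import math


class RunningStats:
    """Online mean/variance via Welford's algorithm (one pass, constant memory)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Sample standard deviation; 0.0 until at least two readings exist.
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def score_stream(readings, threshold=3.0, warmup=5):
    """Yield (reading, is_anomaly): apply the current model, then update it."""
    stats = RunningStats()
    for x in readings:
        is_anomaly = False
        if stats.n >= warmup and stats.std > 0:
            is_anomaly = abs(x - stats.mean) / stats.std > threshold
        if not is_anomaly:
            stats.update(x)  # anomalous readings are not folded into the model
        yield x, is_anomaly


history = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2]  # hypothetical stored readings
live = [20.1, 35.0, 19.7]                       # hypothetical incoming readings
flags = [flag for _, flag in score_stream(history + live)]
print(flags)
```

Replaying the stored history before the live readings is what lets the same model serve both "streaming and history": the model is created from past data and keeps being updated as new data arrives.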
Key Big Data Challenges
  Scalability: how to efficiently store and manage an ever-growing amount of data?
  How to speed up the processing of a huge amount of data?
  Fast data that arrives continuously: be able to obtain results while they are still useful!
  Increase the number of complex simulations to reduce risk
  Prediction and decision making
    Discover correlations that are not obvious but can make the difference
    Optimize actions based on the predictions
Massively Parallel Processing
  IoE is distributed: centralizing all processing creates bottlenecks
  Complex analytics, simulation and reasoning demand compute power
    Essentially the cloud
    Compute at the edge (fog computing) can help a lot
  Logistics problem: what should be processed, and where
  Parallelism is essential
  Need to break problems down wisely and re-think them: the more embarrassingly parallel, the better
  Preprocessing tends to be useful
    Filtering data at the edge
    Nondeterministic reasoning in the cloud
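"Filtering data at the edge" could look like the following minimal Python sketch. It assumes a simple deadband rule: a sample is forwarded to the cloud only when it differs from the last forwarded value by more than a tolerance. The rule, the function name and the sample data are hypothetical; the talk does not say which filtering strategy is used.

```python
def deadband_filter(samples, deadband):
    """Keep a (timestamp, value) sample only if it moved more than `deadband`
    away from the last sample that was forwarded."""
    forwarded = []
    last_value = None
    for timestamp, value in samples:
        if last_value is None or abs(value - last_value) > deadband:
            forwarded.append((timestamp, value))
            last_value = value
    return forwarded


# Hypothetical temperature samples collected by an edge device.
samples = [(0, 71.0), (1, 71.1), (2, 71.05), (3, 73.4), (4, 73.5), (5, 70.9)]
to_cloud = deadband_filter(samples, deadband=0.5)
print(f"forwarded {len(to_cloud)} of {len(samples)} samples: {to_cloud}")
```

The design trade-off is the usual one for edge preprocessing: less network traffic and cloud load in exchange for losing small fluctuations, so the tolerance has to be chosen per signal.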
Infrastructure for Big Data (supporting IoE)
  Frameworks for large-scale parallelism
    MapReduce
    Massively Parallel Processing Databases
    Spark and Hadoop
  Solid-state storage: DSSD
    Shared storage directly connected to the BUs
    Bandwidth of 100 GB/s, latency of 100 µs
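The MapReduce pattern named on this slide can be sketched in a few lines of plain Python. The example below is not Hadoop or Spark code; it only mimics the two phases on in-memory data, using a thread pool to stand in for independent workers. The sensor names, the readings and the helper functions are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_partition(partition):
    """Map phase: reduce one partition to per-sensor partials (max value, count)."""
    partial = {}
    for sensor_id, value in partition:
        best, count = partial.get(sensor_id, (float("-inf"), 0))
        partial[sensor_id] = (max(best, value), count + 1)
    return list(partial.items())


def reduce_partials(mapped):
    """Shuffle and reduce phase: group partials by sensor and merge them."""
    merged = defaultdict(lambda: (float("-inf"), 0))
    for partials in mapped:
        for sensor_id, (best, count) in partials:
            current_best, current_count = merged[sensor_id]
            merged[sensor_id] = (max(current_best, best), current_count + count)
    return dict(merged)


# Hypothetical readings, already split into partitions (e.g. one per storage node).
partitions = [
    [("pump-1", 3.2), ("pump-2", 1.1), ("pump-1", 4.0)],
    [("pump-2", 2.7), ("pump-1", 3.9)],
]

# Each partition is processed independently, so the map phase parallelizes trivially.
with ThreadPoolExecutor() as pool:
    mapped = list(pool.map(map_partition, partitions))

summary = reduce_partials(mapped)
print(summary)
```

The merge works because maximum and count are associative and commutative, which is the property that lets a framework combine partial results in any order.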
Infrastructure for Big Data (supporting IoE), cont.
  Scale-out storage
    Data grows all the time
    High performance in terms of capacity and speed is necessary, but provisioning for the peak is not the best option
    Scale-out for compute and storage allows growth with simplicity and controlled costs
    Isilon NAS: up to 50 PB within the same file system, multiprotocol including HDFS
  Resorting to some kind of cloud (private, public or hybrid) might be necessary
    Flexibility and elasticity to provision resources
    Hybrid cloud provides flexibility to manage costs and keep sensitive data in-house
Some Projects Executed at Brazil R&D Center
  Production Optimization (Massive Time Series Prediction)
  Predictive Asset Maintenance
  Logistics
  Fleet Management
  Workflow Management for Big Data
  Pattern Matching over Large Multidimensional Datasets
  Content-aware Data Compression
  Smart Cities (Mobility)
  Smart Cities - PaaS (Mobility)
Production Optimization
  Find a model with the most relevant variables among thousands of lagged time series that explain/predict a selected target time series
    7000 input time series
    Billions of combinations
    Target: offshore oil rig production
  Algorithm that reduces time from 10 days to 10 seconds
  Detection of anomalies
  Exploratory prediction of outcome when changing operation
  Machine Learning, Time-Series Analysis, Parallel Processing
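To make the lagged time-series selection problem concrete, here is a small Python sketch of a much simpler technique than the one the slide refers to: each input series is screened on its own by finding the lag at which it correlates best with the target, and the inputs are then ranked. The center's actual algorithm is not described in the talk, so this univariate screening, together with all names and data below, is an assumption for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_lag(series, target, max_lag):
    """Return (lag, r) such that series[t - lag] correlates most strongly with target[t]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        r = pearson(series[: len(series) - lag], target[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


def rank_inputs(inputs, target, max_lag, top_k):
    """Rank input series by the strength of their best lagged correlation."""
    scored = [(name, *best_lag(values, target, max_lag)) for name, values in inputs.items()]
    scored.sort(key=lambda item: abs(item[2]), reverse=True)
    return scored[:top_k]


# Hypothetical data: the target follows "pressure" with a delay of two steps.
inputs = {
    "pressure": [1, 3, 2, 5, 4, 6, 8, 7, 9, 10],
    "vibration": [5, 1, 4, 2, 5, 1, 4, 2, 5, 1],
}
target = [0, 0, 2, 6, 4, 10, 8, 12, 16, 14]
ranked = rank_inputs(inputs, target, max_lag=3, top_k=1)
print(ranked)
```

Each input is scored independently, so the screening is embarrassingly parallel across series. It does not, however, search over combinations of variables, which is the part the slide identifies as producing billions of candidates.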
Predictive Asset Maintenance
  Avoid downtime of critical equipment using predictive maintenance
  900 different time series. Time span: 6 years. Unstructured event data.
  Multi-classifier approach
  Web-based portal for training models and real-time dashboard
  High accuracy for predicting ignition failures of turbogenerators
  Machine Learning, Time-Series Analysis, Real-Time
Logistics
  Answer complex or expensive queries about real-world processes through simulation, analytics & prediction
  Model for an Oil & Gas company
    80,000 material types
    >300 destinations
    2,000 orders/day
    Combinatorial problem: TBs of traces
  Parallel data mining over real-world data to create the simulation model
  Parallel analytics over simulation traces to build prediction models
  Answer complex queries in a few seconds
  Simulation, Time-Series Analysis, Parallel Processing
Fleet Management
  Need for fuel efficiency
    To maintain profitability of the business, a large transportation company needs to find efficiencies in fuel consumption
    Consider driver behavior, vehicle maintenance, road conditions, traffic and weather conditions
  Ingest all data (including telemetry, videos & routes) into a data lake
  Data discovery and modeling of fuel consumption based on conservation of energy
  Decrease fuel consumption, understand driver behavior, identify the need for training, improve customer satisfaction and profitability per route
  Improved capacity for logistics, security and preventive maintenance rather than remediation, reducing expense and liability
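A fuel-consumption model "based on conservation of energy" can be sketched as follows. The Python example below uses the standard longitudinal vehicle energy balance: the engine must supply the power needed for acceleration, climbing, rolling resistance and aerodynamic drag, and fuel use is that energy divided by drivetrain efficiency and the fuel's energy content. The talk does not give the project's actual model, so the formula choice, function names, vehicle parameters, efficiency and fuel energy density (roughly that of diesel) are all illustrative assumptions.

```python
import math

G = 9.81           # gravitational acceleration, m/s^2
AIR_DENSITY = 1.2  # kg/m^3, approximate value near sea level


def traction_power(vehicle, speed_ms, accel_ms2, grade_rad):
    """Power (W) at the wheels from the longitudinal force balance."""
    mass = vehicle["mass_kg"]
    inertia = mass * accel_ms2
    climbing = mass * G * math.sin(grade_rad)
    rolling = vehicle["c_rr"] * mass * G * math.cos(grade_rad)
    drag = 0.5 * AIR_DENSITY * vehicle["c_d"] * vehicle["frontal_area_m2"] * speed_ms ** 2
    return (inertia + climbing + rolling + drag) * speed_ms


def fuel_litres(telemetry, dt_s, vehicle, efficiency=0.35, fuel_energy_j_per_l=36e6):
    """Estimate fuel use from (speed, acceleration, grade) samples taken every dt_s seconds."""
    energy_j = 0.0
    for speed_ms, accel_ms2, grade_rad in telemetry:
        power = traction_power(vehicle, speed_ms, accel_ms2, grade_rad)
        energy_j += max(power, 0.0) * dt_s  # assumes no energy recovery when braking
    return energy_j / (efficiency * fuel_energy_j_per_l)


# Hypothetical heavy truck and one hour of 1 Hz telemetry at a steady 72 km/h.
truck = {"mass_kg": 20000.0, "c_rr": 0.007, "c_d": 0.6, "frontal_area_m2": 8.0}
flat_cruise = [(20.0, 0.0, 0.0)] * 3600
uphill_cruise = [(20.0, 0.0, 0.02)] * 3600

flat_l = fuel_litres(flat_cruise, 1.0, truck)
uphill_l = fuel_litres(uphill_cruise, 1.0, truck)
print(f"flat: {flat_l:.1f} L, uphill: {uphill_l:.1f} L")
```

Comparing such a physics-based estimate with measured consumption is one way to separate the effect of route and load from the effect of driver behavior, which is the kind of analysis the slide describes.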
Concluding Remarks
  A new world with huge challenges and opportunities
  Rapid evolution of technology
    IoE: connectivity and interoperability
    Data-driven Smart X contexts: data, data and more data
    Cloud and edge computing
  Interdisciplinary approach
  Automation: unprecedented levels
    Concerns about jobs
    Limitless opportunities
    Improve people's lives
  Research is essential to face challenges and take advantage of opportunities
Opportunities
  For data scientists/researchers at different levels
    Dell EMC R&D Center in Rio de Janeiro
    brdcjobs@emc.com
  Industry partners
  Academia
Questions?
Contact: angelo.ciarlini@dell.com