DarNet: A Deep Learning Solution for Distracted Driving Detection Christopher Streiffer*, Ramya Raghavendra, Theophilus Benson, Mudhakar Srivatsa


1 DarNet: A Deep Learning Solution for Distracted Driving Detection
Christopher Streiffer*, Ramya Raghavendra, Theophilus Benson, Mudhakar Srivatsa
*Duke University, IBM Research, Brown University

2 Distracted Driving: Causes and Consequences
- 3,179 drivers were killed and 431,000 drivers were injured in accidents involving distracted driving
- Operating a smartphone while driving led to a 2.93x increase in the risk of getting into an accident
- Drivers take their eyes off the road for an average of 23 seconds when texting/talking

3 Abundance of Data
- Data collection and IoT devices are becoming ubiquitous
- In-vehicle sensors:
  - Dashboard camera: eye gaze tracking, driving behavior
  - OBD-II: speed, braking, etc.
  - IMU sensors: turns, movements, etc.
  - GPS: location, velocity
- We have access to an abundance of varying data feeds and recording devices

4 Goal: Combine Data Modalities to Detect Distracted Driving
[Diagram: data streams -> classification inference -> "Texting While Driving"]
- Show how deep learning can be applied to the task of detecting distracted driving
- Define driving categories and train a model to perform classification based on the data streams
- Combine different modalities to strengthen the overall classification

5 Challenges of Work
- Data collection design
  - How to align data collected from various sources
  - Create an open framework that allows new devices to be incorporated
- Data analytics component
  - Which ML models should be used for classification (e.g., LSTM vs. SVM)
  - How to combine the data streams to produce a single classification
- Privacy and network considerations
  - How to privatize image data in a secure manner
  - Where to run the deep learning classification: on device, at the edge, or in the cloud

6 Roadmap
- Motivation
- System design: system components; implementation details
- Privacy design: distortion model; training methodology
- Evaluation: data collection; ensemble results; privacy results
- Summary and future work

7 DarNet Framework
- Data Collection Agents
- Centralized Controller
- Analytics + Privacy Engine

8 Challenges of Work (repeat of slide 5)

9 Data Collection Agents
[Diagram: multiple collection agents stream data to per-agent receivers on the centralized controller]
- Designed as an abstraction to support a broad range of IoT devices
- Component responsibilities:
  - Polling the device's sensors
  - Maintaining an internal clock for timestamping data
  - Streaming data to the centralized controller
- Periodicity of collection and transmission frequency are determined by sensor specifications and network latency (see the agent sketch below)
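A minimal sketch of a DarNet-style collection agent, assuming the design described on this slide: poll a sensor, timestamp with a controller-synchronized clock, and stream JSON records upstream. The sensor and transport are stubbed out here (real agents would read Android sensors and stream over Bluetooth); the class and parameter names are illustrative, not from the paper.

```python
import json
import time

class CollectionAgent:
    def __init__(self, read_sensor, send, period_s=0.25):
        self.read_sensor = read_sensor  # callable returning a dict of sensor values
        self.send = send                # callable that ships one serialized record upstream
        self.period_s = period_s        # 0.25 s = 4 Hz, matching the sampling rate used later
        self.clock_offset = 0.0         # corrected when the controller pushes its global clock

    def sync_clock(self, controller_time):
        # Align the agent's internal clock with the controller's global clock.
        self.clock_offset = controller_time - time.time()

    def run(self, steps):
        for _ in range(steps):
            record = {"ts": time.time() + self.clock_offset, "data": self.read_sensor()}
            self.send(json.dumps(record))
            time.sleep(self.period_s)

# Example with a stubbed sensor and transport:
agent = CollectionAgent(read_sensor=lambda: {"accel": [0.0, 0.0, 9.8]}, send=print)
agent.run(steps=2)
```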

10 Centralized Controller
[Diagram: collection agents stream data to per-agent receivers on the centralized controller]
- Component responsibilities: clock synchronization; data alignment and normalization
- Maintains a global clock and periodically pushes its timestamp to each collection agent
- Exposes an interface for each registered component and polls it periodically for that agent's sensor data (an alignment sketch follows below)
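The alignment step is not spelled out on the slide; a minimal sketch, assuming the controller matches each video-frame timestamp to the nearest IMU sample so both modalities refer to the same instant, might look like this (the function name and data layout are assumptions):

```python
import bisect

def align(frame_ts, imu_samples):
    """frame_ts: sorted list of video-frame timestamps.
    imu_samples: list of (timestamp, reading) tuples, sorted by timestamp."""
    imu_ts = [t for t, _ in imu_samples]
    aligned = []
    for ts in frame_ts:
        i = bisect.bisect_left(imu_ts, ts)
        # Pick whichever neighbouring IMU sample is closest in time.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(imu_samples)]
        best = min(candidates, key=lambda j: abs(imu_ts[j] - ts))
        aligned.append((ts, imu_samples[best][1]))
    return aligned
```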

11 Challenges of Work (repeat of slide 5)

12 Data Inference Methodology
- Image data: convolutional neural networks, which have made tremendous strides in image classification
- Time-series data: support vector machines (SVMs); recurrent neural networks with long short-term memory (LSTM) cells
- How to combine the data from different modalities:
  - Train a single model that takes all data as input, or
  - Train independent models and combine their results into an overall classification

13 Image Data: Convolutional Neural Network
- Convolution filters learn spatial features of an image and activate for certain patterns
- Google's Inception-V3 network
  - Takes advantage of the Hebbian principle
  - Trained on the publicly available ImageNet dataset
  - Top-1 error rate of 17.2% and top-5 error rate of 6.67% on the ILSVRC 2012 dataset
- A transfer-learning sketch follows below
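The slides do not show how Inception-V3 is adapted to the six driving classes; a minimal transfer-learning sketch, assuming a Keras/TensorFlow setup with a new softmax head (the head, input size, and optimizer are assumptions, not the authors' recipe), could look like this:

```python
import tensorflow as tf

# Pre-trained Inception-V3 backbone, as named on the slide; weights from ImageNet.
base = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False  # start by training only the new classification head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(6, activation="softmax"),  # 6 driving classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, validation_split=0.2, epochs=10)
```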

14 IMU Sequence Data: Recurrent Neural Network
[Diagram: basic RNN design vs. LSTM design]
- Time-series data typically contains dependencies that span the input sequence (e.g., speech, language translation, video)
- State from time step t is propagated forward to future inferences
- LSTM cells mitigate the vanishing/exploding gradient problems and allow for stronger classifications
- A minimal LSTM classifier sketch follows below
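For concreteness, a minimal LSTM classifier over the 20-step IMU windows described later (5 s sampled at 4 Hz); the feature width, layer size, and the restriction to the three IMU-covered classes are assumptions:

```python
import tensorflow as tf

TIMESTEPS, FEATURES, CLASSES = 20, 12, 3  # 20 steps; 12 IMU features (assumed); classes 1-3

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(TIMESTEPS, FEATURES)),
    tf.keras.layers.LSTM(64),                          # LSTM cell, per the slide
    tf.keras.layers.Dense(CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```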

15 Ensemble Methodology
[Diagram: per-model posteriors P(M_0), P(M_1), ..., P(M_n) feed a Bayesian network (BN), which emits the final label, e.g. "Normal Driving"]
- Combines the output of each model using a Bayesian network
- Each model can be trained individually
- New models and data can be incorporated without having to retrain any of the original models
- A combination sketch follows below
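The slides do not give the Bayesian network's structure or parameters; one common way to combine per-model class posteriors, shown here only as an illustration, is a naive product rule that assumes the models are conditionally independent given the class:

```python
import numpy as np

def combine(posteriors, prior):
    """posteriors: list of length-K arrays, one per model (each sums to 1).
    prior: length-K array of class priors P(c)."""
    log_score = np.log(prior)
    for p in posteriors:
        # P(c | x_m) / P(c) acts as that model's likelihood contribution.
        log_score += np.log(p) - np.log(prior)
    score = np.exp(log_score - log_score.max())
    return score / score.sum()

# Example: the two models disagree; the combination favours shared evidence.
prior = np.array([1 / 3, 1 / 3, 1 / 3])
cnn = np.array([0.5, 0.3, 0.2])
rnn = np.array([0.2, 0.7, 0.1])
print(combine([cnn, rnn], prior))
```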

16 Challenges of Work (repeat of slide 5)

17 Image Distortion
[Example frames at low, medium, and high distortion]
- Distort the image to hide user features before it leaves the device
- Three levels of distortion:
  - Low: downsample to 100x100 pixels
  - Medium: 50x50 pixels
  - High: 25x25 pixels
- The user can select their level of privacy
- A downsampling sketch follows below
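A minimal sketch of the on-device distortion step, assuming it is plain downsampling; the use of Pillow and the choice of resampling filter are assumptions:

```python
from PIL import Image

LEVELS = {"low": 100, "medium": 50, "high": 25}  # target side length in pixels

def distort(path, level="low"):
    size = LEVELS[level]
    img = Image.open(path).convert("RGB")
    # Downsampling to size x size removes fine detail such as facial features.
    return img.resize((size, size), Image.BILINEAR)
```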

18 Training Methodology
[Diagram: the client side applies the distortion before images leave the device; the server side trains a distorted-image CNN (dCNN) against the pre-trained CNN using an L2 loss]
A training sketch follows below.
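Reading the diagram together with the summary's "unsupervised training methodology using pre-trained models", one plausible setup is to train the dCNN on distorted images to match, under an L2 loss, the pre-trained CNN's features on the corresponding undistorted images. The sketch below assumes that reading; the architectures, the matched layer, and the optimizer are not taken from the slides:

```python
import tensorflow as tf

# Frozen pre-trained CNN ("teacher") producing a 2048-d feature vector per image.
teacher = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False, pooling="avg")
teacher.trainable = False

# Small student dCNN operating on distorted 100x100 inputs (architecture is illustrative).
dcnn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(100, 100, 3)),
    tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu"),
    tf.keras.layers.Conv2D(64, 3, strides=2, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2048),  # match the teacher's feature width
])
opt = tf.keras.optimizers.Adam()

@tf.function
def train_step(original, distorted):
    target = teacher(original, training=False)  # features of the undistorted image
    with tf.GradientTape() as tape:
        pred = dcnn(distorted, training=True)
        loss = tf.reduce_mean(tf.square(pred - target))  # L2 loss, as labeled on the slide
    grads = tape.gradient(loss, dcnn.trainable_variables)
    opt.apply_gradients(zip(grads, dcnn.trainable_variables))
    return loss
```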

19 Roadmap (repeat of slide 6)

20 Driving Behaviors
- Good: forward/backward, turns (L/R), merges (L/R); looking over the left/right shoulder
- Bad: talking (left/right), texting/ED application (left/right), eating, drinking, hair and makeup, reaching for an item (floor, passenger seat, rear), adjusting the radio/navigation
[Example images: Drinking; Texting - Left; Talking - Right]

21 Data Collection - Setup

Class | Name            | Tasks
1     | Normal Driving  | Driving with 1 or 2 hands on the wheel; checking mirrors/blind spots before turning/merging
2     | Talking         | Talking with the phone in the left/right hand held to the ear; talking with the phone in speaker mode
3     | Texting         | Texting with the phone in the left/right hand
4     | Eating/Drinking | Eating/drinking varying food/drink items while keeping one hand on the wheel
5     | Hair and Makeup | Checking hair/makeup in the visor mirror and rear mirror
6     | Reaching        | Reaching for an item in the passenger seat, rear seat, or center console; changing the radio

- Driving behaviors divided into 6 classification categories
- Data collected from 5 drivers, using the same vehicle and the same driving route
- The driver performed each distraction task as instructed by the passenger
- Video statistics: each video was recorded for 15 seconds; each driving task was repeated 10 times per driver; videos were verified at a later point for accuracy of task execution

22 Data Collection - Device Configuration
- Video data: captured by the embedded camera of a Nexus 7 tablet running Android (API v23)
- IMU sequence data: gyroscope, accelerometer, gravity, and rotation sensor data collected from a Nexus S running Android (API v16)
- Controller runs on the Nexus 7 tablet; the video data collection agent is embedded within the controller application
- Communication channel maintained over Bluetooth

23 Data Collection - Dataset

Class | Name            | Data Types          | Data Points
1     | Normal Driving  | Video, IMU sequence | 5,286
2     | Talking         | Video, IMU sequence | 10,352
3     | Texting         | Video, IMU sequence | 9,422
4     | Eating/Drinking | Video               | 9,463
5     | Hair and Makeup | Video               | 4,848
6     | Reaching        | Video               | 17,709

- Dataset divided into an 80/20 partition for training and evaluation
- The discrepancy in data points per class is due to the different number of orientations of each class
- Each video sampled at 4 Hz across a 5-second time window, resulting in input vectors of length 20 (a windowing sketch follows below)
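For illustration, a sketch of turning a resampled IMU stream into those length-20 windows (4 Hz x 5 s = 20 steps); the feature layout and the non-overlapping stride are assumptions:

```python
import numpy as np

RATE_HZ, WINDOW_S = 4, 5
STEPS = RATE_HZ * WINDOW_S  # 20 time steps per input vector

def make_windows(samples, stride=STEPS):
    """samples: array of shape (T, F) of IMU readings already resampled to 4 Hz.
    Returns an array of shape (num_windows, 20, F)."""
    windows = [samples[i:i + STEPS]
               for i in range(0, len(samples) - STEPS + 1, stride)]
    return np.stack(windows) if windows else np.empty((0, STEPS, samples.shape[1]))
```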

24 Time Series Analysis
- The RNN (using LSTM cells) achieves a top-1 classification rate of 97.44%
- It outperforms the SVM, which achieves a top-1 classification rate of 95.37%
- The RNN can account for non-linear relationships between the accelerometer and gyroscope recordings, producing stronger classification results in all categories

25 Ensemble Analysis
- The CNN alone produces a top-1 classification result of 73.88%
- CNN+RNN and CNN+SVM produce top-1 classification results of 87.02% and 86.23%, respectively
- The ensemble approach shows that combining models improves the overall classification rate
- Adding the IMU component helps resolve confusion between categories 1, 2, and 3

26 Video Privacy Analysis
[Example frames at low, medium, and high distortion]

Model  | Top-1 Classification
CNN    | 78.87%
dCNN-L | 80.00%
dCNN-M | 77.78%
dCNN-H | 63.13%

- Evaluation run on a previously collected dataset consisting of videos only: data from 10 drivers, recorded with a GoPro at 30 fps, spanning 18 distinct driving classifications
- Downsampling the data to 100x100 pixels increased the classification percentage, as it removes effects of over-fitting incurred during training

27 Summary
- Designed a data collection framework for multiple IoT devices and modalities
  - Implemented a methodology for maintaining synchronization and alignment between sensors
  - Presented an open system that allows additional data streams to be incorporated
- Demonstrated how to use deep learning to classify driving behavior
  - Showed how to combine these models to improve classification results
  - Developed a Bayesian network capable of producing strong results
- Took the first steps toward an unsupervised framework for protecting sensitive data
  - Developed an unsupervised training methodology using pre-trained models
  - Showed that this produces adequate results even with significant degradation
- Future work: explore methods for making smarter offloading decisions; further investigate the privacy-preserving methodology; incorporate other data sources to further improve results