Big Data in Test and Evaluation

1 Big Data in Test and Evaluation
Prepared for ITEA Annual Symposium, 4 October 2016
David Jimenez, Director, Army Test and Evaluation

2 Big Picture
The tests we conducted over the past 20 years are not representative of the next 20 years.
Future T&E: Virtual Environments, Artificial Intelligence, Hypersonics, Big Data, Autonomy, Cognitive Workload, Cryptology, Technology Parity, Contested Information Environments
Requires Continued Investment in Infrastructure & People

3 A Big Data Perspective
The Good Old Days: Small Data Sets, Manual Observation; Small Data, Small Analysis
The Challenge: Big Data, Little Insights
How do I Determine??

4 Force 2025 and T&E
Service OTAs must determine whether or not systems are effective, suitable, and survivable in support of unified land operations in an operational environment dominated by:
- Increased momentum of human interaction
- Potential overmatch
- Importance of cyber and space
- Dense urban areas (megacities)
- Ubiquitous media
- WMD proliferation
CEMA! Big Data!
Increasing complexity on the battlefield increases complexity in T&E. Demand for data and the means to use it effectively is also increasing.
See TRADOC PAM The U.S. Army Operating Concept (AOC): Win in a Complex World found at:
CEMA: Cyber Electromagnetic Activities

5 Big Data Causing An Evolution in T&E
Yesterday:
- Discrete data sets (usually associated with a single test); small overall file size
- Meaning derived from expert observations
- Workforce has expertise in the system under test
- Evaluation products consumed by small, specialized audience
- Central evaluation question: Did it meet requirements?
Today:
- Large data sets collected over a test program (may include data from contractor tests, simulators, hardware/software-in-the-loop laboratories, M&S, fielded system, and similar systems)
- Meaning derived from continuous observation
- Workforce has expertise in analytics
- Evaluation products consumed by broad audience with diverse interests
- Central evaluation question: What are the system's strengths and limitations over the range of conditions found on a complex, interoperating battlefield?
Focusing on the Why and How of a system's operational effectiveness, operational suitability, and survivability increases the demand for deep analytics.

6 T&E Big Data Challenges
T&E Big Data:
- Free and shared among responsible practitioners
- Support model validations
- Amounts of data straining analytical resources
- Leverage Advances in Instrumentation Capabilities
- More Reliance on Supercomputing
- Need tools to make short work of analysis: visualization, sage, and frame capture
T&E Cadre of the Future Requires Data Scientists and Data Analysts

7 Big Data Changed Everything
We expect to be able to access analytics instantly and on demand, to measure and understand our complex world:
- Our wealth
- Our health
- Our neighbors
- Our surroundings
- Our interests
Implications for Army Operating Concept and Force 2025: The Force 2025 Soldier will not have known a world without analytics.

8 2025 T&E and Big Data Goals
Goals:
- Utilize knowledge, information, and data to achieve core mission and business objectives
- Faster, more accurate decision-making
- Cost optimization
- Quicker responses to requests for information
- More holistic test and evaluation
- Automated tracking of items or status
- Make useful big data capabilities available to everyone, but tailored to specific needs
Common Core Requirements:
- Faster, more sophisticated analytical tools
- Leveraging historical data
- Modeling & Simulation
- 2025 T&E
- Design of Experiments
- Cloud computing
- Sustainment of data for long-term use (archival)
- Discoverability of and access to data
- Analytics of historical and current information
- Derive context to inform decision making

9 Data Driven Deep Dive Analysis
Incident Overview:
- Middle East Course near mile marker 9.4
- 1L and 1R (Front, Left & Right) half-shafts broken
- During that test week, vehicle completed 4 passes of this section of Middle East: 1 pass on June 5 (date incident occurred), 3 passes on June 6
- Large spike in left front spindle, frame, and driver acceleration occurred approximately 10 s prior to vehicle stopping on course due to incident
Sheared Right Side Half Shaft
Sheared spline inside Left Hub Cartridge
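The kind of spike described above (a large acceleration excursion shortly before the vehicle stopped) could be located with a simple threshold search. The following is only a minimal sketch, not the analysis method used in the test: the function name `find_spikes`, the 30-second look-back window, and the rule "magnitude greater than 5x the record's median magnitude" are all assumptions chosen for illustration.

```python
from statistics import median

def find_spikes(times, accel, stop_time, window_s=30.0, factor=5.0):
    """Return (time, value) pairs in the window before stop_time whose
    magnitude exceeds `factor` times the median magnitude of the record.

    times, accel: equal-length sequences of timestamps (s) and samples.
    window_s and factor are illustrative defaults, not values from the source.
    """
    baseline = median(abs(a) for a in accel)   # robust "typical" level
    threshold = factor * baseline
    return [
        (t, a)
        for t, a in zip(times, accel)
        if stop_time - window_s <= t <= stop_time and abs(a) > threshold
    ]
```

Running the same search over each instrumented channel (spindle, frame, driver) and comparing the returned timestamps would show whether the excursions coincide.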

10 Considerations for Big Data Analytics
Pros: I paid for all this data. What can I do with it?
Cons:
Awareness: Available data may be underutilized due to awareness gaps.
- What capabilities already exist?
- What lessons have already been learned?
- What opportunities exist?
Planning: Utilizing big data requires careful planning:
- Information system and data management design
- Data Collection, Reduction, Analysis (DCRA)
- Archiving and sustainment
Tools: Utilizing big data requires appropriate tools.
- Even small data sets are unmanageable without the right tools
- Tool development requires planning, time, and resources

11 The Big Data Community
Big Data: Field User Needs, T&E, Materiel Development, S&T (6.1/6.2/6.3)
Big Data is a common resource of the Services' analytical community. Diverse analytical organizations contribute to and draw from it:
- data acquisition methods
- computational resources
- models, simulations, laboratories, tools
- historical data
- expertise
Important questions going forward:
- Who manages it for stakeholders?
- Who sustains it?
- How do we establish business rules for increased collaboration?
- Can we obtain synergies through collaboration?

12 Performance Test Data Integrated Concept Study *
PURPOSE. Address Army's need for timely access to T&E data while aligning Army's storage infrastructure and protocols with DODI
SCOPE.
- Conduct a cost-benefit analysis to determine breadth & depth of data to be stored & resources
- Evaluate sensitivity of results to assumption changes and identify risks associated with changes
RESPONSIBILITIES.
- AMSAA will appoint a Study Director
- Study Advisory Group (SAG) will oversee the planning & conduct of the study
- SAG Composition: Senior Executive / General Officer from: ASA(ALT), DUSA-TE, CIO/G-6, DCS, G-3/5/7, AMC (RDECOM, ARL, & AMSAA), TRADOC, CAA, & DTIC
* HQDA (DCS, G-3/5/7) Memo, subject: Performance Test Data Integrated Concept Guidance and Directive Study, 26 Feb 16

13 Value of Deep Knowledge
EXAMPLE
Analysis by Service OTAs & Others
Bad Event
Increased Survivability

14 Big Data Analysis Approach
1) Download vehicle data files
2) Process data for each week of test
   - Run Course Identification Scripts
     - GPS coordinates used to ID course
     - Generates summary file containing metadata for each file (Vendor, Vehicle ID, Course, Date, Miles, & Hours)
   - Run Data Collector Scripts
     - Combine files from similar vehicle, course and date
     - Generates files with concatenated channel data and flags the files containing incomplete data
   - Run Report Generator
     1) Displays summary of mileage and hours
     2) Compares accelerations, temperatures, and speeds across multiple vehicles
     3) Displays plots of major channels for each unique vehicle, course, and date combination
     - Generates .pdf report
3) Review report for reliability highlights
Week's worth of test data (~100 GB) processed within 2-3 days
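The course-identification and data-collector steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the scripts used in the test program: the `DataFile` record, the `COURSES` bounding boxes (with made-up coordinates and course names), and the functions `identify_course` and `collect` are all hypothetical, and a real course lookup would likely use course geometry rather than rectangles.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical course bounding boxes: (min_lat, max_lat, min_lon, max_lon).
COURSES = {
    "Course A": (10.00, 10.05, 20.00, 20.10),
    "Course B": (10.06, 10.10, 20.00, 20.10),
}

@dataclass
class DataFile:
    name: str        # file identifier
    vehicle_id: str
    date: str        # date label for the run
    gps: list        # list of (lat, lon) samples
    channels: dict   # channel name -> list of samples

def identify_course(gps):
    """Return the course whose box contains the most GPS samples, or None."""
    counts = {name: 0 for name in COURSES}
    for lat, lon in gps:
        for name, (lat0, lat1, lon0, lon1) in COURSES.items():
            if lat0 <= lat <= lat1 and lon0 <= lon <= lon1:
                counts[name] += 1
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None

def collect(files, required_channels):
    """Group files by (vehicle, course, date), concatenate channel data,
    and record the names of files missing any required channel."""
    groups = defaultdict(
        lambda: {"channels": defaultdict(list), "incomplete_files": []}
    )
    for f in files:
        key = (f.vehicle_id, identify_course(f.gps), f.date)
        group = groups[key]
        missing = [c for c in required_channels if not f.channels.get(c)]
        if missing:
            group["incomplete_files"].append(f.name)
        for channel, samples in f.channels.items():
            group["channels"][channel].extend(samples)
    return dict(groups)
```

Each key of the returned dictionary corresponds to one unique vehicle, course, and date combination, which is the unit the report generator plots.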

15 Analysis in Depth
Data Dependent:
- Multiple views of instrumentation data channel values.
- Creates context for analysis.
- Links discrete, continuous, hierarchical, and geospatial data types to scenario timeline.
SASC Bill for FY17 NDAA, Section 853: Enhanced use of data to improve acquisition program outcomes.
- Army has been investing on the SASC's position from the analysis and test community.
- By FY18 Army T&E and Analytical communities should be up and running with a coherent data analytics and POA&M that gets at the proposed FY17 NDAA.
- Unlocking & providing ATEC / RDECOM data to the community is necessary, as well as continuing RDT&E into tools to analyze, HPC investments, and visualization aids.

16 Leveraging the Big Data Space: Use Historical Data to Right-size Future Test
Big Data: Field User Needs, Field Test, T&E, S&T (6.1/6.2/6.3) + Materiel Development
ATEC and AMSAA analyzed 18 million miles of Stryker field and T&E data to develop reliability risk areas. Insights will be used to shape test scope on future versions of the systems.
Subsystem X Assembly Y Component AB Sub-assembly Z Block interface W Widget subsystem Assembly Case Sub-component ABC Nuts and bolts Assembly Main element Subsystem Box K Superstructure Link Block assembly Crankstick Beta Shaft Structure Drive Component Widget XYZ Interface
Risk Areas = Priority Test Areas
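One simple way historical records could be turned into a ranked list of risk areas is to compare failure rates per distance driven. The sketch below is an assumption for illustration only and is not necessarily how ATEC and AMSAA derived their risk areas; the function `rank_risk_areas`, the record layout, and the "failures per 1,000 miles" metric are all hypothetical.

```python
def rank_risk_areas(records):
    """Rank subsystems by failures per 1,000 miles, highest rate first.

    records: iterable of (subsystem, failures, miles) tuples, where the
    same subsystem may appear in several records (for example, once for
    field data and once for test data).
    """
    totals = {}
    for subsystem, failures, miles in records:
        f, m = totals.get(subsystem, (0, 0.0))
        totals[subsystem] = (f + failures, m + miles)
    rates = {
        name: 1000.0 * f / m
        for name, (f, m) in totals.items()
        if m > 0   # skip subsystems with no recorded mileage
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

Subsystems at the top of the returned list would be candidates for priority test areas, subject to engineering judgment about severity and data quality.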

17 Leveraging the Big Data Space: Developing Cybersecurity Metrics
Big Data: Field User Needs, T&E, S&T (6.1/6.2/6.3), Materiel Development
Threat Model for Untrustworthy Insiders: 0.1% Person is untrustworthy; Resource worth $1000
ATEC is leveraging Network Integration Evaluation (NIE) events to develop models, methodologies, and metrics for cybersecurity T&E. Insights will be used to enable earlier-in-life-cycle assessments and requirements development.

18 Leveraging the Big Data Space: Improving System Survivability
Big Data: Field User Needs, T&E, S&T (6.1/6.2/6.3), Materiel Development
ATEC combined insights about ballistic events on vehicles in theater from:
- Intelligence community's trend analyses
- On-board vehicle instrumentation
- Ballistic response data from live fire testing
- Modeling and simulation
Insights used to improve:
- Current and future system survivability designs
- Test Scope
- Test and evaluation methodology
- Instrumentation and simulation designs

19 WANTED: Cadre of Data Scientists and Data Analysts
Disciplines: Computer Science, Mathematical Statistics, IT Management, Operations Research, Computer Engineering, Visual Information
Expertise sought:
- Expertise in statistical tools and techniques; expertise in applied mathematics.
- Expertise in high-speed computing systems, data acquisition systems, algorithm analysis and development, and information processing display, control and transfer.
- Expertise in scientific inquiry into complex relationships and processes using multidisciplinary analysis tools and techniques, particularly modeling and simulation.
- Expertise in engineering; expertise in data systems, data structures, data mining and programming languages.
- Expertise in data architectures, information systems, and data management.
- Expertise in applying visual design principles to communicate complex information to diverse audiences.
New Data Scientists & Data Analysts

20 Conclusions
Big Data analysis: Terabytes of Data, Greater Insights?
- High potential to leverage it to learn and understand behaviors of complex systems
- High potential of over-analysis for the sake of over-analysis
- New generation of Data Scientists needed
- Real data-driven evidence to investigate anomalies - attribution
Investments required:
- New methods and tools to quickly process and analyze Big Data
- Support the enterprise decision processes
- Develop a sharing culture
- DOD data policy evolutions
Big Data will change our T&E enterprise in ways we don't completely grasp yet.