JMP Discovery Summit 2012

1 JMP Discovery Summit 2012
New Methods for Developing Limited Data Sets for Predicting US Marine Corps Combat Losses

2 Presented to JMP Discovery Summit 2012
SAS World Headquarters, Cary, NC
Topic: Predictive Modeling
Captain Aaron Burciaga, USMC
John Stocker, Booz Allen Hamilton
Andrea Ferris, Booz Allen Hamilton
Distribution Statement A: This presentation/paper is unclassified, approved for public release, distribution unlimited, and is exempt from U.S. export licensing and other export approvals under the International Traffic in Arms Regulations (22 CFR 120 et seq.).

3 Agenda
1. Background
2. Methodology
3. Analysis
4. Results

4 Background
- The United States Marine Corps War Reserve Materiel Program sustains operating forces throughout the spectrum of combat.
- Combat Active Replacement Factors (CARFs) estimate the rate at which materiel lost or destroyed in combat (attrition) must be replaced.
- CARFs are used to calculate the total number of replacements, per type of materiel, that the War Reserve maintains to sustain operating forces.
- CARF-STAT provides an automated statistical analysis environment, using SAS JMP Pro, that assigns CARF values to predict loss rates from observed combat losses and special studies.

5 Impact
- Updated a 15-year-old methodology and its calculations.
- The assignment algorithm assigned values to 100% of the most critical equipment using the preferred Recursive Partitioning method.
- Roughly five times as many values were calculated as existed previously.
- New values are, on average, about 60% lower than previous values.
- The general decline in values leads to an estimated savings in requirement costs of $16.4 billion, or 16.5% of the portfolio.
- Minimizing excess inventory curtails the costs of acquisition, maintenance, and obsolescence, so those funds can go toward enhancing other aspects of warfighting effectiveness.
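As a rough cross-check of the figures above, and assuming that the 16.5% is measured against the total requirement cost of the portfolio (the slide does not state the base), the implied portfolio size $P$ and the relationship between an old and a new CARF value can be worked out as follows. The symbols $P$, $\text{CARF}_{\text{old}}$, and $\text{CARF}_{\text{new}}$ are introduced here for illustration only.

```latex
\[
  0.165 \, P \approx \$16.4\ \text{billion}
  \quad\Longrightarrow\quad
  P \approx \frac{16.4}{0.165} \approx \$99.4\ \text{billion}
\]
\[
  \text{CARF}_{\text{new}} \approx (1 - 0.60)\,\text{CARF}_{\text{old}}
  = 0.40\,\text{CARF}_{\text{old}}
  \quad\text{(on average)}
\]
```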

6 War Reserve Withdrawal Process
CARF calculations focus on two classes of combat-essential equipment:
- Class II: Clothing & Individual Items
- Class VII: Major End Items

7 Definition
- CARF is a 30-day forecast of the replacement rates for Principal End Items (important combat equipment) that are Combat Essentiality Code 1 (important/critical), derived from historical data.
- CARFs are calculated for three levels of conflict (High Intensity Conflict, Medium Intensity Conflict, Low Intensity Conflict) and two phases of operations (Assault, Sustainment).

Type of Operation   Low Intensity   Medium Intensity   High Intensity
Assault             LA              MA                 HA
Sustained           LS              MS                 HS
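The six CARF categories are simply the cross product of intensity and phase. A minimal sketch of that enumeration, assuming the two-letter codes are formed as intensity letter plus phase letter exactly as in the table (the function name `carf_codes` is hypothetical):

```python
from itertools import product

# Labels and one-letter abbreviations as they appear in the slide's table.
INTENSITIES = {"L": "Low Intensity", "M": "Medium Intensity", "H": "High Intensity"}
PHASES = {"A": "Assault", "S": "Sustained"}


def carf_codes():
    """Map each two-letter CARF code to its (intensity, phase) pair."""
    return {
        intensity + phase: (INTENSITIES[intensity], PHASES[phase])
        for intensity, phase in product(INTENSITIES, PHASES)
    }


codes = carf_codes()
for code, (intensity, phase) in codes.items():
    print(code, intensity, phase)
```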

8 CARF Assignment Methodologies

9 Recursive Partitioning CARF (RPC)
- Calculated using an ensemble of bootstrap forest (a.k.a. random forest) recursive partitioning models
- The results of 25 decision trees are averaged to obtain the result of each forest
- The results of 100 forests (derived from varying training/test/validation sets) are averaged to obtain the final RPC result
- Low- and medium-intensity CARFs only
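The two levels of averaging described above (trees within a forest, then forests across different data splits) can be sketched in plain Python. This is an illustrative toy, not the CARF-STAT or JMP Pro implementation: the tree-growing rules, the depth and minimum-size defaults, the 60% training fraction, and all function names are assumptions, and unlike the production method this sketch does not handle missing predictor values.

```python
import random


def avg(values):
    values = list(values)
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean (split quality measure)."""
    m = avg(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(rows, targets, rng, depth=3, min_size=3):
    """Fit a small regression tree and return a predict(row) function."""
    leaf = avg(targets)
    if depth == 0 or len(rows) < 2 * min_size or len(set(targets)) == 1:
        return lambda row: leaf
    n_features = len(rows[0])
    # Forest-style randomness: only a random subset of predictors is tried.
    candidates = rng.sample(range(n_features), max(1, n_features // 2))
    best = None
    for j in candidates:
        for threshold in sorted({r[j] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[j] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[j] > threshold]
            if len(left) < min_size or len(right) < min_size:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, threshold)
    if best is None:
        return lambda row: leaf
    _, j, threshold = best
    go_left = [r[j] <= threshold for r in rows]
    left_tree = fit_tree([r for r, g in zip(rows, go_left) if g],
                         [t for t, g in zip(targets, go_left) if g],
                         rng, depth - 1, min_size)
    right_tree = fit_tree([r for r, g in zip(rows, go_left) if not g],
                          [t for t, g in zip(targets, go_left) if not g],
                          rng, depth - 1, min_size)
    return lambda row: left_tree(row) if row[j] <= threshold else right_tree(row)


def fit_forest(rows, targets, rng, n_trees=25):
    """Average n_trees trees, each grown on a bootstrap resample."""
    n = len(rows)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        trees.append(fit_tree([rows[i] for i in idx], [targets[i] for i in idx], rng))
    return lambda row: avg(tree(row) for tree in trees)


def fit_rpc(rows, targets, n_forests=100, n_trees=25, train_fraction=0.6, seed=0):
    """Average n_forests forests, each trained on a different random training set."""
    rng = random.Random(seed)
    n = len(rows)
    forests = []
    for _ in range(n_forests):
        idx = rng.sample(range(n), int(train_fraction * n))
        forests.append(fit_forest([rows[i] for i in idx],
                                  [targets[i] for i in idx], rng, n_trees))
    return lambda row: avg(forest(row) for forest in forests)


# Hypothetical toy data: the loss rate depends only on the first predictor.
rows = [(i / 39, (i * 7 % 40) / 39) for i in range(40)]
targets = [0.1 if r[0] <= 0.5 else 0.3 for r in rows]
predict = fit_rpc(rows, targets, n_forests=10)  # the slide describes 100 forests
low, high = predict((0.0, 0.5)), predict((1.0, 0.5))
print(low, high)
```

Averaging over forests trained on different splits is what keeps the final value from depending on any single partition of a small data set.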

10 Why Bootstrap Forests?
Methods considered
- Generalized linear models
- Neural networks
- Boosted trees
- Bootstrap forests
Benefits of bootstrap forests
- Calculates CARFs even when some predictor data is missing
- Easier to explain and understand (the model contains many trees, but individual trees and overall column contributions are easy to interpret)
- Produces results that remain stable with minor data changes
- Resistant to overfitting when many predictors are used

11 RPC Analysis: Goodness of Fit
Test & Validation Sets
- ECC data was split into training, test, and validation sets
- The RPC model was built using the training data and tested/validated using the remaining data
- More focus was placed on the test/validation goodness-of-fit measures
Goodness of Fit Measures
- Maximize R², the proportion of variation in the data captured by the model
- Minimize Mean Square Error, the average squared difference between an observed, Explicitly Calculated CARF and the CARF predicted by the model
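The two measures can be written out directly. The sketch below assumes a 60/20/20 split purely for illustration (the slide does not give the proportions), and the observed and predicted values are made-up numbers, not CARF data.

```python
import random


def split_indices(n, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle row indices and cut them into training, test, and validation sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(fractions[0] * n)
    n_test = int(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """1 - SS_residual / SS_total: share of variation captured by the model."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up held-out values, for illustration only.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]
mse = mean_squared_error(observed, predicted)  # about 0.00065
r2 = r_squared(observed, predicted)            # about 0.948
train, test, val = split_indices(10)
```

Computing both measures on the test and validation rows, rather than on the training rows, is what the slide means by focusing on test/validation goodness of fit.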

12 RPC Analysis: Model Selection
Predictor Selection
- Create an RPC model using all possible predictors
- Remove the least significant predictors one at a time until the best combination of R-squared and cross-validation MSE is obtained
Parameter Selection
- Create RPC models for all reasonable combinations of parameter values
- Compare the R-squared results of the various combinations to determine the optimal parameters
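Both selection steps follow a generic pattern: backward elimination over predictors and an exhaustive grid over parameter values. The sketch below assumes a single "higher is better" score stands in for the R-squared / cross-validation MSE combination; the scoring functions, predictor names, and parameter names are hypothetical placeholders, not the ones used in CARF-STAT.

```python
from itertools import product


def backward_eliminate(predictors, score):
    """Drop one predictor at a time for as long as the score does not get worse."""
    current = list(predictors)
    best = score(current)
    while len(current) > 1:
        # Score the model with each remaining predictor removed in turn.
        trials = [(score([p for p in current if p != drop]), drop) for drop in current]
        trial_score, drop = max(trials)
        if trial_score < best:
            break  # every removal hurts, so stop
        best = trial_score
        current.remove(drop)
    return current, best


def grid_search(grid, score):
    """Score every combination of parameter values and keep the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        s = score(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score


# Hypothetical stand-ins for "fit the RPC model and measure validation fit".
USEFUL = {"a", "b"}


def toy_predictor_score(predictors):
    return len(USEFUL & set(predictors)) - 0.1 * len(predictors)


def toy_parameter_score(params):
    return -(abs(params["n_trees"] - 25) + abs(params["min_split_size"] - 5))


kept, kept_score = backward_eliminate(["a", "b", "noise1", "noise2"], toy_predictor_score)
best_params, best_score = grid_search(
    {"n_trees": [10, 25, 50], "min_split_size": [5, 10]}, toy_parameter_score
)
print(kept, best_params)
```

In practice each call to the score function would refit the model and evaluate it on held-out data, so the grid should stay small enough to be affordable.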

13 Predictor Selection

14 Parameter Selection (1 of 3)

15 Parameter Selection (2 of 3)

16 Parameter Selection (3 of 3)

17 CARF-STAT Analysis Control Panel

18 CARF-STAT Advanced Modeling Features

19 Results: Distributions of CARFs

20 Results: Column Contributions

21 Future CARF Assignment Methodologies

23 Questions?