
1 Utilizing Predictive Modeling to Improve Policy through Improved Targeting of Agency Resources: A Case Study on Placement Instability among Foster Children. Dallas J. Elgin, Ph.D., IMPAQ International. Randi Walters, Ph.D., Casey Family Programs. 2016 APPAM Fall Research Conference.

2 The Utility of Predictive Modeling for Government Agencies. Challenge: Government agencies operate in an environment that increasingly requires using limited resources to meet nearly limitless demands. Opportunity: Advances in computing technology & administrative data can be leveraged via predictive modeling to estimate the likelihood of future events. Goal: To provide an improved understanding of the methodology & to identify associated best practices.

3 What is Predictive Modeling? The process of selecting a model that best predicts the probability of an outcome (Geisser, 1993), or of generating an accurate prediction (Kuhn & Johnson, 2013). Over the past several decades, predictive modeling has been utilized in a variety of fields to predict diverse outcomes. Within child welfare, predictive models have been used to inform decision-making: risk assessment instruments, and predictions of maltreatment recurrence, future system involvement, and child fatalities.

4 Case: Placement Instability. Data: 2013 Adoption and Foster Care Analysis and Reporting System (AFCARS), a publicly-available dataset resembling administrative data. Sample: 15,000 foster care children that were in care throughout 2013. Operationalization: 3 or more moves, or a total of 4 placements (Hartnett, Falconnier, Leathers & Testa, 1999; Webster, Barth & Needell, 2000). 11,649 children with 3 or fewer placements; 3,351 children with 4 or more placements.

5 Methodological Approach: Data Partition Strategy. The entire dataset of 15,000 children was split into 2 groups: a training set used to train the models (75% of the dataset = 11,250 children) and a test set used to validate the models (25% of the dataset = 3,750 children).
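A minimal sketch of such a split using R's caret package, assuming a hypothetical AFCARS-style data frame `afcars` with the outcome coded as a two-level factor `instability` (both names are illustrative, not the authors'):

```r
library(caret)

set.seed(2016)                                    # make the random split reproducible
# Stratified split that preserves the outcome distribution in both sets
train_idx <- createDataPartition(afcars$instability,
                                 p = 0.75,        # 75% training set (11,250 children)
                                 list = FALSE)
training <- afcars[train_idx, ]                   # used to train the models
testing  <- afcars[-train_idx, ]                  # held out for validation (3,750 children)
```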

6 Methodological Approach: Data Training Strategy. Train a collection of 10 models using the training set (a sketch of the shared training setup follows the list):

Linear Discriminant Analysis Models:
- Logistic Regression (high interpretability, low computation time)
- Partial Least Squares Discriminant Analysis (high interpretability, low computation time)
- Elastic Net/Lasso (high interpretability, low computation time)

Non-Linear Classification Models:
- K-Nearest Neighbors (low interpretability, high computation time)
- Neural Networks (low interpretability, high computation time)
- Support Vector Machines (low interpretability, high computation time)
- Multivariate Adaptive Regression Splines (moderate interpretability, moderate computation time)

Classification Trees & Rule-Based Models:
- Classification Tree (high interpretability, high computation time)
- Boosted Trees (low interpretability, high computation time)
- Random Forest (low interpretability, high computation time)

Utilize ROC curves to evaluate how well the models balance: 1. the true-positive rate (sensitivity) and 2. the true-negative rate (specificity).
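As a sketch of how any one of the ten candidates could be trained this way in caret, cross-validating on the training set and selecting by the area under the ROC curve (object names carried over from the partition sketch above; the logistic regression is used as the example):

```r
# Resampling scheme: 10-fold cross-validation reporting ROC, sensitivity, specificity
ctrl <- trainControl(method = "cv", number = 10,
                     classProbs = TRUE,                 # class probabilities are required for ROC
                     summaryFunction = twoClassSummary)

logit_fit <- train(instability ~ ., data = training,
                   method = "glm", family = "binomial",
                   metric = "ROC",                      # evaluate candidates by ROC curve area
                   trControl = ctrl)
logit_fit$results                                       # cross-validated ROC, Sens, Spec
```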

7 Model Performance on the Test Set. The 3 models with the highest ROC scores were applied to the test set (3,750 observations). Overall accuracy = 87.8%; predictions of 3 or fewer placements were correct 90.1% of the time; predictions of 4 or more placements were correct 77.4% of the time. [The slide shows confusion matrices for the three models; the recoverable cells give false negatives and true negatives of 302 and 2,759 (neural network), 300 and 2,755 (random forest), and 297 and 2,754 (boosted trees).]
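A sketch of scoring the held-out test set and tabulating a confusion matrix with caret, reusing the objects from the earlier sketches ("four_plus" is a hypothetical label for the 4-or-more-placements class):

```r
test_pred <- predict(logit_fit, newdata = testing)   # predicted class for each child
confusionMatrix(test_pred, testing$instability,
                positive = "four_plus")              # overall accuracy, sensitivity, specificity
```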

8 Improving Model Accuracy. An iterative process involving transforming variables, fine-tuning model parameters, or a combination of both. Fine-tuning the parameters of the neural network model improved overall accuracy from 87.8% to 88.2%. [The slide shows confusion matrices for the un-tuned and tuned neural network models; the recoverable cells show false negatives falling from 302 to 268 and true negatives from 2,759 to 2,736.]
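Parameter fine-tuning of this kind can be expressed in caret through an explicit tuning grid; a sketch for the neural network, with grid values that are illustrative rather than the authors':

```r
# nnet's two tuning parameters: hidden-layer size and weight decay
nnet_grid <- expand.grid(size  = c(3, 5, 7),
                         decay = c(0.01, 0.1, 0.5))

nnet_tuned <- train(instability ~ ., data = training,
                    method = "nnet", metric = "ROC",
                    tuneGrid = nnet_grid, trControl = ctrl,
                    trace = FALSE)                     # suppress per-iteration output
```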

9 Improving Model Accuracy: Cost-Sensitive Tuning. Classification trees were fit with increasing cost penalties on false negatives: no penalty, then penalties of 2, 5, 10, and 20. [The slide shows a confusion matrix with sensitivity and specificity for each penalty; the recoverable cells show false negatives falling from 322 (no penalty) to 217, 81, 47, and 34 (penalty of 20), while true negatives fall from 2,731 to 2,558, 2,154, 1,942, and 1,751.] Considerable improvements in reducing false negatives, but at the expense of notable increases in the number of false positives.

10 Best Practices for Designing & Implementing Predictive Models:
1. Predictive Models Can Improve Upon, but Not Replace, Traditional Decision-Making Processes within Government Agencies.
2. Government Agencies Should Clearly Articulate the Methodological Approach and the Predictive Accuracy of their Models.
3. Consider Opportunities for Incorporating Community Engagement into the Predictive Modeling Process.

11 ~Questions & Feedback~ Dallas Elgin, Research Associate, IMPAQ International. Randi Walters, Senior Director of Knowledge Management, Casey Family Programs.

12 Placement Instability. What is it? It occurs when a child in the care of a child welfare system experiences multiple moves to different settings. Why does it matter? Placement instability can have significant consequences for children: greater risk of impaired development & psychosocial well-being; greater uncertainty surrounding a child's future; greater likelihood of re-entry and/or emancipation. Is it a big issue? 25% of foster care children experience three or more moves while in care (Doyle, 2007).

13 Improving Model Accuracy: Cost-Sensitive Tuning. False-negative predictions may be unacceptable, as a failure to correctly identify placement instability could result in unnecessary exposure to adverse events. Cost-sensitive models impose cost penalties to minimize the likelihood of these costly false predictions.
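One common way to impose such penalties on a classification tree is a loss matrix in R's rpart package; a minimal sketch, assuming the outcome factor's first level is the 4-or-more-placements class and charging a false negative 5 times the cost of a false positive (the weight and variable names are illustrative):

```r
library(rpart)

# Loss matrix: rows are the actual class, columns the predicted class;
# row/column order follows the factor levels of the outcome.
penalty <- matrix(c(0, 5,    # actual 4+ placements: correct = 0, false negative = 5
                    1, 0),   # actual < 4 placements: false positive = 1, correct = 0
                  nrow = 2, byrow = TRUE)

cost_tree <- rpart(instability ~ ., data = training,
                   method = "class", parms = list(loss = penalty))
```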

14 Data. 2013 Adoption and Foster Care Analysis and Reporting System (AFCARS): federal data provided by the states on all children in foster care. Sample: 15,000 foster care children that were in care throughout 2013. 77.66% of children in the sample have 3 or fewer moves; 22.34% of children in the sample have 4 or more moves.
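For reference, a one-line check of this class balance, reusing the hypothetical `afcars` data frame and `instability` outcome from the sketches above:

```r
round(prop.table(table(afcars$instability)) * 100, 2)   # percentage in each outcome class
```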

15 3 Highest-Performing Models on the Training Set. Boosted Trees: build upon traditional classification tree models. Fit a sequence of decision trees, each correcting the errors of the previous trees, and then aggregate the trees to form a single predictive model. Random Forests: build upon traditional classification tree models by utilizing bootstrapping methods to build a collection of independent decision trees. Consideration of a smaller subset of predictors at each split minimizes the likelihood of a high degree of correlation among the trees. Neural Networks: resemble the physiological structure of the human brain or nervous system. Use multiple layers of interconnected nodes for processing pieces of information.
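A sketch of fitting and comparing these three models in caret (objects `training` and `ctrl` carried over from the earlier sketches); for `resamples()` to give a fair comparison, each `train()` call should use the same cross-validation folds, e.g. by resetting the seed before each call:

```r
set.seed(2016)
gbm_fit <- train(instability ~ ., data = training, method = "gbm",
                 metric = "ROC", trControl = ctrl, verbose = FALSE)
set.seed(2016)
rf_fit  <- train(instability ~ ., data = training, method = "rf",
                 metric = "ROC", trControl = ctrl)
set.seed(2016)
nn_fit  <- train(instability ~ ., data = training, method = "nnet",
                 metric = "ROC", trControl = ctrl, trace = FALSE)

# Compare cross-validated ROC, sensitivity, and specificity side by side
summary(resamples(list(boosted = gbm_fit, forest = rf_fit, neural = nn_fit)))
```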

16 Linear Discriminant Analysis Models. Utilize linear functions to categorize observations into groups based on predictor characteristics. Examples: logistic regressions, partial least squares discriminant analysis, and Elastic Net/Lasso models. These models commonly have a high degree of interpretability and a low amount of computational time.

17 Non-Linear Classification Models. Utilize non-linear functions to categorize observations. Examples: k-nearest neighbors, neural networks, support vector machines, and multivariate adaptive regression splines. These models commonly have low to moderate interpretability and moderate to high computational time.

18 Classification Trees and Rule-Based Models. Utilize rules to partition observations into smaller, homogeneous groups. Examples: classification trees, boosted trees, and random forests. These models commonly have low to high interpretability and a high degree of computational time.

19 Model Performance on the Test Set. The 3 models with the highest ROC values were applied to the test set of 3,750 children.

20 Identifying Prominent Predictors. The ten most prominent predictors, by average ranking across the neural network, random forest, and boosted trees models:
1. Date of Latest Removal
2. Beginning Date for Current Placement Setting
3. Date of First Removal
4. Child's Date of Birth
5. Emotionally Disturbed Diagnosis
6. Discharge Date of Child's Previous Removal
7. Currently Placed in Non-Relative Foster Home
8. Currently Placed in an Institution
9. Number of Days in Current Placement Setting
10. Female Child
The caret package's variable importance feature provides one option for characterizing the general effects of predictors within predictive models. The feature was run on the neural network, random forest, and boosted tree models to identify the most important variables.
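A sketch of that step, reusing the fitted objects from the earlier sketches:

```r
varImp(nn_fit)                   # model-specific importance for the neural network
varImp(rf_fit)                   # importance for the random forest
plot(varImp(gbm_fit), top = 10)  # ten most influential predictors in the boosted trees
```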