Ben Teusch HR Analytics Consultant

1 HUMAN RESOURCES ANALYTICS IN R: EXPLORING EMPLOYEE DATA Analyzing employee engagement Ben Teusch HR Analytics Consultant

2 What is employee engagement?
Engaged employees: those who are involved in, enthusiastic about and committed to their work and workplace. (Gallup) [1]

4 The survey data
> head(survey)
# A tibble: 6 x 5
  employee_id department  engagement salary vacation_days_taken
        <int> <chr>            <int>  <dbl>               <int>
1           1 Sales
2             Engineering
3             Engineering
4             Engineering
5             Engineering
6             Engineering

5 Review of mutate()
> survey %>%
+   mutate(max_salary = max(salary))
# A tibble: 1,470 x 6
  employee_id department  engagement salary vacation_days_taken max_salary
        <int> <chr>            <int>  <dbl>               <int>      <dbl>
1           1 Sales
2             Engineering
3             Engineering
4             Engineering
5             Engineering
# ... with 1,465 more rows

6 The ifelse() function
> x <- 5
> if(x < 10){ "True" } else { "False" }
[1] "True"
> z <- c(5, 8, 11, 14)
> if(z < 10){ "True" } else { "False" }
[1] "True"
Warning message:
In if (z < 10) { :
  the condition has length > 1 and only the first element will be used
> ifelse(z < 10, "Yes", "No")
[1] "Yes" "Yes" "No"  "No"
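The contrast above can be tried in isolation with base R only. This is a minimal sketch: the vector `days` and its values are invented for illustration and are not taken from the survey. Note that in R 4.2.0 and later, passing a condition of length greater than one to if() is an error rather than a warning, which makes the vectorized ifelse() the only one of the two that runs on a whole column.

```r
# Hypothetical vacation-day counts (illustrative values, not survey data)
days <- c(5, 8, 11, 14, NA)

# ifelse() tests every element and returns a vector of the same length;
# a missing input stays missing in the result
takes_vacation <- ifelse(days > 10, "Yes", "No")
print(takes_vacation)
```

Because the result has one element per input, it can be assigned directly to a new column of a data frame.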

7 ifelse() + mutate()
> survey %>%
+   mutate(takes_vacation = ifelse(vacation_days_taken > 10, "Yes", "No"))
# A tibble: 1,470 x 6
  employee_id engagement salary vacation_days_taken takes_vacation
        <int>      <int>  <dbl>               <int> <chr>
1                                                   No
2                                                   Yes
3                                                   Yes
4                                                   No
5                                                   Yes
# ... with 1,465 more rows
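The same pattern can be reproduced on a small stand-in table. This is a sketch under stated assumptions: `toy_survey` and its values are hypothetical, chosen only so the comparison hits both branches, including the boundary case of exactly 10 days.

```r
library(dplyr)

# Toy stand-in for the survey data; values are invented for illustration
toy_survey <- tibble(
  employee_id = 1:5,
  vacation_days_taken = c(7L, 12L, 15L, 10L, 21L)
)

# mutate() adds a column; ifelse() fills it row by row.
# The comparison is strict, so exactly 10 days is labelled "No".
toy_flagged <- toy_survey %>%
  mutate(takes_vacation = ifelse(vacation_days_taken > 10, "Yes", "No"))

print(toy_flagged)
```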

8 Multiple summarizes
> survey %>%
+   group_by(department) %>%
+   summarize(max_salary = max(salary))
# A tibble: 3 x 2
  department  max_salary
  <chr>            <dbl>
1 Engineering
2 Finance
3 Sales

9 Multiple summarizes
> survey %>%
+   group_by(department) %>%
+   summarize(max_salary = max(salary),
+             min_salary = min(salary),
+             avg_salary = mean(salary))
# A tibble: 3 x 4
  department  max_salary min_salary avg_salary
  <chr>            <dbl>      <dbl>      <dbl>
1 Engineering
2 Finance
3 Sales
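To see what several summaries in one summarize() call return, the sketch below runs the same pipeline on a hypothetical six-row table. The `toy_survey` name and the salary figures are assumptions made up for this example, and the extra `n = n()` column is an addition not shown on the slide.

```r
library(dplyr)

# Toy stand-in for the survey data; salaries are invented for illustration
toy_survey <- tibble(
  department = c("Engineering", "Engineering", "Finance",
                 "Finance", "Sales", "Sales"),
  salary = c(90000, 110000, 70000, 80000, 60000, 100000)
)

# One output row per department, one output column per summary
salary_summary <- toy_survey %>%
  group_by(department) %>%
  summarize(
    max_salary = max(salary),
    min_salary = min(salary),
    avg_salary = mean(salary),
    n = n()  # number of employees in each group
  )

print(salary_summary)
```

Unlike mutate(), which keeps every row and repeats the computed value, summarize() collapses each group to a single row.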

10 Let's practice!

11 Visualizing engagement data Ben Teusch HR Analytics Consultant

12 Visualizing several variables at once

13 The tidyr package

14 Using tidyr::gather()
library(tidyr)
data %>%
  gather(columns, key = "key", value = "value")

15 Using tidyr::gather()
> survey_summary
# A tibble: 3 x 3
  department  average_engagement average_promotions
  <chr>                    <dbl>              <dbl>
1 Engineering
2 Finance
3 Sales
> survey_summary %>%
+   gather(average_engagement, average_promotions, key = "key", value = "value")
# A tibble: 6 x 3
  department  key                value
  <chr>       <chr>              <dbl>
1 Engineering average_engagement
2 Finance     average_engagement
3 Sales       average_engagement
4 Engineering average_promotions
5 Finance     average_promotions
6 Sales       average_promotions
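The reshaping step can be checked on a small table. In this sketch `toy_summary` and its numbers are hypothetical. It also shows pivot_longer(), which tidyr (version 1.0.0 and later) documents as the successor to gather(); that comparison is an addition here and not part of the slides.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for survey_summary; values are invented for illustration
toy_summary <- tibble(
  department = c("Engineering", "Finance", "Sales"),
  average_engagement = c(3.1, 3.2, 2.8),
  average_promotions = c(0.3, 0.5, 0.4)
)

# gather(): stack the two measure columns into key/value pairs
gathered <- toy_summary %>%
  gather(average_engagement, average_promotions, key = "key", value = "value")

# pivot_longer(): same long shape, with different argument names
pivoted <- toy_summary %>%
  pivot_longer(
    cols = c(average_engagement, average_promotions),
    names_to = "key",
    values_to = "value"
  )

print(gathered)
print(pivoted)
```

Both results have six rows and the same three columns. The row order differs: gather() stacks one source column after the other, while pivot_longer() works through the input row by row.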

16 Adding color to bar charts
> survey_gathered <- survey_summary %>%
+   gather(average_engagement, average_promotions, key = "key", value = "value")
> ggplot(survey_gathered, aes(key, value, fill = department)) +
+   geom_col()

18 Side-by-side bar charts
> ggplot(survey_gathered, aes(key, value, fill = department)) +
+   geom_col(position = "dodge")

20 Adding facets
> ggplot(survey_gathered, aes(x = key, y = value, fill = department)) +
+   geom_col(position = "dodge") +
+   facet_wrap(~ key, scales = "free")
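The faceted chart can be rebuilt end to end from a small long-format table. This is a sketch: `toy_gathered` and its values are hypothetical stand-ins for survey_gathered. The two measures sit on very different scales in this toy data, which is the situation where scales = "free" helps, since each panel then gets its own axes.

```r
library(ggplot2)

# Toy long-format data; values are invented for illustration
toy_gathered <- data.frame(
  department = rep(c("Engineering", "Finance", "Sales"), times = 2),
  key = rep(c("average_engagement", "average_promotions"), each = 3),
  value = c(3.1, 3.2, 2.8, 0.3, 0.5, 0.4)
)

# fill colors the bars by department, position = "dodge" places them
# side by side, and facet_wrap() draws one panel per measure
p <- ggplot(toy_gathered, aes(x = key, y = value, fill = department)) +
  geom_col(position = "dodge") +
  facet_wrap(~ key, scales = "free")

print(p)
```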

22 Let's practice!

23 Testing differences between groups Ben Teusch HR Analytics Consultant

24 Comparing two groups

25 Quantifying the likelihood

26 The t-test
Use when the variable to compare is continuous
> t.test(tenure ~ is_manager, data = survey)

        Welch Two Sample t-test

data:  tenure by is_manager
t = , df = , p-value =
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
sample estimates:
mean in group Non-manager     mean in group Manager
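The call above can be tried on simulated data. This is a sketch under stated assumptions: `toy_survey` is hypothetical, the tenure values are random draws rather than survey data, and the group means used in the simulation are arbitrary.

```r
# Simulated stand-in for the survey; all values are invented for illustration
set.seed(42)
toy_survey <- data.frame(
  is_manager = rep(c("Non-manager", "Manager"), each = 30),
  tenure = c(rnorm(30, mean = 5, sd = 2), rnorm(30, mean = 7, sd = 2))
)

# Formula interface: continuous outcome on the left, two-level group on the right.
# By default t.test() runs the Welch version, which does not assume equal variances.
result <- t.test(tenure ~ is_manager, data = toy_survey)

print(result$method)    # name of the test that was run
print(result$estimate)  # mean tenure in each group
print(result$p.value)   # probability of a difference at least this large under the null
```

The returned object is a list, so individual pieces such as the p-value can be pulled out and compared against a chosen significance threshold instead of being read off the printed report.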

27 The chi-squared test
Use when the variable to compare is categorical
> chisq.test(survey$left_company, survey$is_manager)

        Pearson's Chi-squared test with Yates' continuity correction

data:  survey$left_company and survey$is_manager
X-squared = , df = 1, p-value = 1.97e-06
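A comparable call on made-up data shows where the pieces of that output come from. This is a sketch: `toy_survey` and its counts are hypothetical and chosen only so that every cell of the 2 x 2 table is reasonably filled.

```r
# Toy stand-in for the survey; counts are invented for illustration
toy_survey <- data.frame(
  left_company = c(rep("Yes", 30), rep("No", 120), rep("Yes", 5), rep("No", 45)),
  is_manager = c(rep("Non-manager", 150), rep("Manager", 50))
)

# Cross-tabulate the two categorical variables
print(table(toy_survey$left_company, toy_survey$is_manager))

# For a 2 x 2 table, chisq.test() applies Yates' continuity correction
# by default (argument correct = TRUE)
result <- chisq.test(toy_survey$left_company, toy_survey$is_manager)

print(result$parameter)  # degrees of freedom
print(result$p.value)
```

With two levels in each variable, the degrees of freedom are (2 - 1) * (2 - 1) = 1, which matches the `df = 1` in the slide output.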

28 Where are the formulas?

29 Let's practice!