Predictive analytics applied to HR data to uncover the key drivers of employee turnover.
A capstone project from the Google Advanced Data Analytics Certificate, exploring how data-driven insights can help businesses improve employee retention and reduce turnover costs.

1.
Business Problem — What Was the Challenge?
Salifort Motors, a mid-sized automotive company, faced high employee turnover that was affecting productivity, recruitment costs, and overall morale. The business wanted to understand why employees were leaving and identify measurable factors contributing to attrition.
This challenge matters because retaining skilled employees directly impacts a company’s efficiency, cost structure, and ability to innovate. Reducing turnover by even a small percentage can translate into significant savings in hiring and training expenses.
2.
Approach & Method — How It Was Solved
The analysis used a simulated HR dataset provided as part of the Google Advanced Data Analytics Certificate capstone.
Tools & Libraries: Python (pandas, NumPy, matplotlib, seaborn, scikit-learn, XGBoost).
Techniques:
- Data cleaning and feature engineering to prepare HR and performance data.
- Exploratory data analysis (EDA) to uncover patterns in satisfaction, evaluation scores, tenure, and salary.
- Predictive modeling using logistic regression, random forest, and XGBoost to estimate the likelihood of employee turnover.
- Model evaluation using accuracy, precision, recall, and ROC-AUC metrics.
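The modeling and evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the capstone notebook itself: the data is randomly generated, the column names (`satisfaction`, `monthly_hours`, `left`, and so on) and the rule that produces the `left` label are assumptions made up for the example, and XGBoost is left out to keep the dependencies to scikit-learn only.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_hr_data(n_rows=2000, seed=0):
    """Generate fake HR records; every column and coefficient is invented."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "satisfaction": rng.uniform(0, 1, n_rows),
        "evaluation": rng.uniform(0.3, 1, n_rows),
        "monthly_hours": rng.normal(200, 40, n_rows),
        "tenure_years": rng.integers(1, 11, n_rows),
        "promoted": rng.binomial(1, 0.05, n_rows),
    })
    # Hypothetical rule: low satisfaction and long hours raise the odds of leaving.
    logit = (-1.0
             - 4.0 * (df["satisfaction"] - 0.5)
             + 0.03 * (df["monthly_hours"] - 200)
             - 1.0 * df["promoted"])
    prob = (1 / (1 + np.exp(-logit))).to_numpy()
    df["left"] = rng.binomial(1, prob)
    return df


def evaluate_models(df, seed=0):
    """Fit two classifiers and report the four metrics on a held-out split."""
    X = df.drop(columns="left")
    y = df["left"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)

    models = {
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)),
        "random_forest": RandomForestClassifier(
            n_estimators=200, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        proba = model.predict_proba(X_test)[:, 1]  # probability of leaving
        results[name] = {
            "accuracy": accuracy_score(y_test, pred),
            "precision": precision_score(y_test, pred, zero_division=0),
            "recall": recall_score(y_test, pred),
            "roc_auc": roc_auc_score(y_test, proba),
        }
    return results


results = evaluate_models(make_synthetic_hr_data())
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Reporting precision, recall, and ROC-AUC alongside accuracy matters here because leavers are usually the minority class, so a model can score high accuracy while still missing many of the employees who actually leave.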
3.
Results & Business Insights
The final random forest model achieved an accuracy of ~92% and identified key drivers of employee churn such as low job satisfaction, limited promotions, and excessive overtime.
Employees with low satisfaction scores and high average monthly hours were significantly more likely to leave.
Translating this into business terms: focusing on work-life balance initiatives, recognition programs, and career development opportunities could substantially reduce attrition and improve productivity.
Key takeaway: Predictive analytics can help HR teams proactively identify at-risk employees and design targeted retention strategies. Because the dataset is simulated, the size of the effect is not measured here, but similar HR analytics applications suggest such interventions could meaningfully reduce turnover and associated costs.

Figure: Feature importance analysis showing key drivers of employee attrition.
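A ranking like the one in the figure can be read from a fitted random forest. The sketch below is illustrative only: it uses generated data with made-up column names and a made-up rule for who leaves, and it relies on scikit-learn's impurity-based `feature_importances_`, which may differ from the method behind the original chart.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n = 1500

# Hypothetical features; "noise" is deliberately unrelated to the outcome.
features = pd.DataFrame({
    "satisfaction": rng.uniform(0, 1, n),
    "monthly_hours": rng.normal(200, 40, n),
    "noise": rng.normal(0, 1, n),
})
# Invented rule: an employee leaves if unhappy or heavily overworked.
left = ((features["satisfaction"] < 0.3)
        | (features["monthly_hours"] > 250)).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(features, left)

# Impurity-based importances sum to 1 across all features.
importances = pd.Series(
    forest.feature_importances_, index=features.columns
).sort_values(ascending=False)
print(importances.round(3))
```

Impurity-based importances are quick to compute but can favor features with many distinct values, so it is worth cross-checking the ranking with `sklearn.inspection.permutation_importance` on held-out data before drawing business conclusions.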
4.
Practical Applications
This project demonstrates how HR and People Analytics teams can apply machine learning to workforce management.
Use Case: Predict which employees are at risk of leaving and intervene early.
Business Impact: Save recruitment costs, maintain operational continuity, and boost engagement.
Future Expansion: Integrate the model into a Power BI dashboard or HR platform for ongoing monitoring.
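As a rough sketch of the use case, a trained classifier can score current employees and flag those whose predicted probability of leaving crosses a cutoff. Everything below is assumed for illustration: the training history is generated, the employee IDs and feature names are invented, and the 0.5 threshold is a placeholder that a real HR team would tune against the cost of missed leavers versus unnecessary interventions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["satisfaction", "monthly_hours"]  # hypothetical feature names
RISK_THRESHOLD = 0.5                          # placeholder cutoff

# Generated "historical" records with an invented leave/stay rule.
rng = np.random.default_rng(7)
n = 1000
history = pd.DataFrame({
    "satisfaction": rng.uniform(0, 1, n),
    "monthly_hours": rng.normal(200, 40, n),
})
logit = (-4.0 * (history["satisfaction"] - 0.5)
         + 0.03 * (history["monthly_hours"] - 200))
history["left"] = rng.binomial(1, (1 / (1 + np.exp(-logit))).to_numpy())

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(history[FEATURES], history["left"])

# Score a few made-up current employees.
current = pd.DataFrame({
    "employee_id": ["E1", "E2", "E3"],
    "satisfaction": [0.15, 0.85, 0.50],
    "monthly_hours": [270, 170, 200],
})
current["risk"] = model.predict_proba(current[FEATURES])[:, 1]

# Keep only employees above the cutoff, highest risk first.
at_risk = (current[current["risk"] >= RISK_THRESHOLD]
           .sort_values("risk", ascending=False))
print(at_risk[["employee_id", "risk"]])
```

A ranked list like `at_risk` is the kind of output that could feed a dashboard, with the threshold exposed as a setting rather than hard-coded.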
5.
Reflection / Learnings
Technical Learnings:
- Improved understanding of the end-to-end data science workflow — from EDA and feature engineering to model tuning and interpretation.
- Strengthened knowledge of classification algorithms and evaluation metrics.
Practical Learnings:
- Translating technical results into actionable business recommendations is just as important as model accuracy.
- The experience reinforced how essential it is to connect data insights to business strategy — exactly the philosophy behind AlexDoesData.
If I were to extend this project:
With access to real employee data, I would incorporate qualitative variables (e.g., satisfaction survey text) and build a real-time dashboard to track retention trends over time.
I would also explore further feature engineering to find more predictive variables, which may improve model performance and deepen understanding of what drives employee turnover.
6.
Resources
Note:
This project was completed as part of the Google Advanced Data Analytics Certificate.
Some of the visualizations and code structure follow the official exemplar provided by Google.
However, all analysis, commentary, and interpretation reflect my own understanding and learning outcomes.