For my final capstone project (M.S. Advanced Data Analytics), I analyzed Fall 2023 U.S. college enrollment data (115K records, 5,900+ institutions) to uncover demographic patterns and predict graduate enrollment.

Data source: IPEDS

  • Key Findings:
    • Women consistently outnumber men in enrollment, with the gap widening at the graduate level (60.6% vs. 39.4%).
    • Hispanic student representation drops sharply from undergrad (25.6%) to graduate (15.2%).
    • Institutional size distribution is highly right-skewed (median 588 vs. mean 3,332): a small number of very large institutions pulls the mean far above the median.
  • Models Used: Linear Regression, Decision Tree, and Random Forest.
    • Best performer: Random Forest (R² ≈ 0.78, MAE ≈ 631).
    • Strongest predictors of graduate enrollment: female enrollment and Asian student representation.
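
For readers less familiar with the two metrics quoted above, here is a minimal sketch of how R² and MAE are computed. The helper functions and the sample numbers are illustrative assumptions, not the project's code or data; the project itself most likely relied on a library implementation.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between actual and predicted values."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


def r2_score(y_true, y_pred):
    """Share of the variance in y_true that the predictions explain."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)              # total
    return 1 - ss_res / ss_tot


# Hypothetical graduate enrollment counts, NOT taken from the project data.
actual = [100, 200, 300, 400]
predicted = [110, 190, 320, 380]

mae = mean_absolute_error(actual, predicted)
r2 = r2_score(actual, predicted)
print(f"MAE = {mae:.1f}, R2 = {r2:.2f}")
```

Read this way, an MAE of about 631 means predictions miss actual graduate enrollment by roughly 631 on average, and an R² of about 0.78 means the model explains roughly 78% of the variance in graduate enrollment.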

🔗 View Full Repository on GitHub