ai/ml

Analyze Trends and Predict College Enrollment

For my final capstone project (M.S. Advanced Data Analytics), I analyzed Fall 2023 U.S. college enrollment data (115K records, 5,900+ institutions) to uncover demographic patterns and predict graduate enrollment.

IPEDS

  • Key Findings:
    • Women consistently outnumber men in enrollment, with the gap widening at the graduate level (60.6% vs. 39.4%).
    • Hispanic student representation drops sharply from undergrad (25.6%) to graduate (15.2%).
    • Institutional size distribution is highly skewed (median 588 vs. mean 3,332).
  • Models Utilized: Linear Regression, Decision Trees, and Random Forest.
    • Best performer: Random Forest (R² ≈ 0.78, MAE ~631).
    • Strongest predictors of graduate enrollment: female enrollment and Asian student representation.
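
For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how R² and MAE are computed. It uses only the Python standard library, and the enrollment numbers are made up for illustration; they are not from the IPEDS data, and the function names are my own rather than anything from the project notebook.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true that the predictions explain."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)              # total sum of squares
    return 1 - ss_res / ss_tot


# Hypothetical graduate enrollment counts for four institutions.
actual = [100, 400, 900, 2500]
predicted = [150, 350, 1000, 2300]

print("MAE:", mean_absolute_error(actual, predicted))
print("R^2:", round(r_squared(actual, predicted), 3))
```

Read this way, an MAE of about 631 means predictions were off by roughly 631 students on average, which is worth weighing against the skewed institution sizes noted above.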

🔗 View Full Repository on GitHub

Analyze Demographic Factors and Predict College Completion

Demographic Factors and College Completion
Tools: Python (pandas, scikit-learn, matplotlib), Jupyter Notebook

This project analyzed 2022–2023 college degree completions across 16,000+ U.S. institutions to uncover the strongest demographic predictors of success.

Key Findings:

  • Female completions were the most influential factor.
  • Non-traditional students (ages 25–39) play a critical role in completions.
  • Random Forest achieved ~99% accuracy, outperforming logistic regression and decision trees.
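
As a rough illustration of what an accuracy figure measures, the sketch below computes accuracy alongside a majority-class baseline, which is a common reference point when reading a high score. The labels and predictions are hypothetical; the post does not say how the completion target was defined, so the "high"/"low" classes here are an assumption made only for the example.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most common label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Hypothetical labels for eight institutions (not the project's data).
labels = ["high", "high", "high", "low", "high", "low", "high", "high"]
preds = ["high", "high", "high", "low", "high", "high", "high", "high"]

print("model accuracy:", accuracy(labels, preds))
print("baseline accuracy:", majority_baseline(labels))
```

Comparing the two numbers shows how much of a model's accuracy comes from the features rather than from class imbalance alone.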

View Full Repository on GitHub

Python for Data Science and AI

Completed the IBM Python for Data Science and AI course!

IBM Python for DS and AI

Explored and worked with: Python programming basics (syntax, data types, expressions, variables, string operations, data structures, conditions, loops, functions, objects, classes) and the pandas and NumPy libraries in Jupyter Notebooks. Accessed and extracted web-based data by calling REST APIs with requests and scraping web pages with BeautifulSoup.
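
The two data-access ideas from the course can be sketched without any network access. This example parses a JSON payload (the kind of body a REST API returns) and pulls links out of an HTML snippet. To keep it runnable with no extra packages, it swaps in the standard library's `json` and `html.parser` modules in place of requests and BeautifulSoup; the sample payload, the HTML, and the `LinkCollector` class are all made up for illustration.

```python
import json
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical API response body and web page fragment.
SAMPLE_JSON = '{"courses": [{"title": "Python for Data Science and AI", "provider": "IBM"}]}'
SAMPLE_HTML = '<ul><li><a href="/posts/1">First</a></li><li><a href="/posts/2">Second</a></li></ul>'

# REST-style step: decode the JSON body and pick out one field per record.
payload = json.loads(SAMPLE_JSON)
titles = [course["title"] for course in payload["courses"]]

# Scraping-style step: walk the HTML and gather link targets.
collector = LinkCollector()
collector.feed(SAMPLE_HTML)

print(titles)
print(collector.links)
```

With the libraries named in the course, the equivalent steps would typically be `requests.get(url).json()` for the API call and `BeautifulSoup(html, "html.parser").find_all("a")` for the links.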

coursework

Analyze Trends and Predict College Enrollment

Analyze Demographic Factors and Predict College Completion

Data Visualization and Communication

I recently completed the Data Visualization and Communication course. For my final project, I used spreadsheets, Python, and Plotly to explore and analyze the 2019 Austin ISD TEA accountability ratings!

An analysis of the 2019 Texas Education Agency (TEA) statewide accountability ratings for Austin ISD.

Data Visualization Project

Course Description: Introduces principles and techniques for data visualization for creating meaningful displays of quantitative and qualitative data to facilitate decision-making. Emphasis is placed on the identification of patterns, trends and differences among data sets.

TOOLS: Excel, Power BI, Tableau, Python (Matplotlib, Seaborn, Plotly)

SKILLS: Graphic design principles (color, text, interaction, perception), exploratory data analysis, data visualization techniques from charts to dashboards

Data Analysis and Knowledge Discovery

As my final project for the Data Analysis and Knowledge Discovery course, I created a side-by-side comparison tool to explore the 2019 ratings of any two schools or districts in Texas.

Originally built in Excel with advanced lookup functions and formulas, it has since been moved to Google Sheets. Make a copy to interact with it. I plan to build an improved, web-based version of this tool.
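
As a starting point for that web-based version, the lookup logic behind a side-by-side comparison could look something like the sketch below. It is only an assumption about how the tool might be ported, not how the spreadsheet works: the school names, metric names, and scores are placeholders rather than TEA data, and `compare` is a hypothetical helper.

```python
# Placeholder ratings table keyed by school name (values are invented).
RATINGS = {
    "School A": {"overall": "B", "student_achievement": 84, "school_progress": 88},
    "School B": {"overall": "A", "student_achievement": 91, "school_progress": 90},
    "School C": {"overall": "C", "student_achievement": 72, "school_progress": 75},
}


def compare(name_a, name_b, table=RATINGS):
    """Return (metric, value_a, value_b) rows for two schools.

    Plays the role of a lookup formula: each name selects a row,
    and the metrics are lined up side by side.
    """
    row_a, row_b = table[name_a], table[name_b]
    return [(metric, row_a[metric], row_b[metric]) for metric in row_a]


for metric, a, b in compare("School A", "School B"):
    print(f"{metric:<22}{a!s:<6}{b!s:<6}")
```

A missing name raises `KeyError` here, so a real version would need friendlier handling for schools that are not in the table.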

Course Description: Introduction to data analysis, data mining, text mining and knowledge discovery principles, concepts, theories and practices. Designed for the aspiring or practicing information professional and covers the basics of working with data from a hands-on and practical perspective.

TOOLS: Excel, RapidMiner

SKILLS:
Spreadsheet Modeling Basics - Lookup, Index, and Match Functions, Pivot Tables, Array Formulas, Charts and Dashboards
Data Mining Basics - Data Prep, Correlation Methods, Association Rules, K-Means Clustering, Discriminant Analysis, k-Nearest Neighbors, Naive Bayes, Text Mining, Decision Trees, Neural Networks

Information Science to Data Science?!

After completing my first two Information Science courses along with a Crash Course in Data Science from Johns Hopkins University, I decided to take two data analytics-related courses. Now, after completing these two courses (Data Analysis and Knowledge Discovery, and Data Visualization and Communication), I have decided to end my brief journey in the Information Science graduate program to pursue a Master of Science in Advanced Data Analytics. A new journey begins!

Python for Data Science and AI

datavis

Information Science to Data Science?!

dataviz

Data Visualization and Communication

education

Analyze Trends and Predict College Enrollment

Analyze Demographic Factors and Predict College Completion

Data Visualization and Communication

Data Analysis and Knowledge Discovery

plotly

Data Visualization and Communication

python

Analyze Trends and Predict College Enrollment

Analyze Demographic Factors and Predict College Completion

Data Visualization and Communication

Python for Data Science and AI

spreadsheets

Analyze Trends and Predict College Enrollment

Analyze Demographic Factors and Predict College Completion

Data Visualization and Communication

Data Analysis and Knowledge Discovery
