According to the syllabus for 5410 Applications and Deployment of Advanced Analytics, the course “…focuses on using advanced analytics in practical case studies to help students develop the skills needed to address complex challenges in industry and business.” Its prerequisites included most, if not all, of the courses I had taken so far in the program.

The course consisted of eight modules. It began with a crash course in Python and then moved quickly to ‘Data Exploration and Visualization’, where I explored US College Data using Pandas. Next, we explored a tool called PandasAI with the Titanic dataset and worked through imputation while doing an exploratory analysis of a west Texas oil fields dataset. All of these assignments required a Jupyter notebook with our Python code and a separate written PDF describing the step-by-step analysis we undertook.

For the third assignment, we learned about linear regression and ran one on a dataset of diamond prices. By this time, I had chosen the dataset for my final project and had to submit a research proposal consisting of the research question, the dataset, the target variable, and the methodology for the analysis.

From here, we used logistic regression on customer acquisition data, followed by a separate assignment that used logistic regression with regularization techniques to predict loan default. Next, we explored stepwise regression and decision tree models on a diabetes dataset, with the goal of building two models to predict whether a patient is diabetic based on various health attributes. We then used the same dataset with a random forest and hyperparameter tuning to make the same prediction. The course content concluded with an introduction to neural networks, leaving us to finalize and submit our final projects.

Final Project: Demographic Factors and College Completion

PDF Report of Final Project

GitHub Repo of Final Project

Slide Deck Presentation of Final Project

Final Project Video Presentation:

Tools Utilized: Python, Google Colab, Jupyter Notebooks, GitHub

Skills Acquired: Apply experimental design, sampling methodologies, parametric and non-parametric tests, and linear regression models (analyze, test, improve). Integrate various data analysis techniques and use statistical software tools and programming applications to perform advanced data analysis on a real-world project and effectively display the results.