If 5130 Data Analytics I seemed like a crash course in basic R and Statistics 101, then 5230 Data Analytics II seemed like a crash course in advanced R and machine learning!

It was intense, and the most difficult course I’ve taken in this program. The pace of the class, and the professor, assumed a strong background in R that I obviously did not have. Unfortunately, even after attending office hours, I had to hire an R tutor to explain what the code I was copying and pasting was actually doing. I was also even more grateful for Posit.cloud than I had been in my previous course. I used it extensively for this course and really enjoyed the overall experience: the resources it provides, and not having to deal with installation rituals, downloads, or the command line, since it is all web-based and ready for you to start coding.

The course required Data Mining for Business Analytics: Concepts, Techniques, and Applications in R as the main textbook; however, we barely used it, if at all. Most, if not all, of the content came from lectures, slides, and supplementary reading.


The course was meant to extend the concepts introduced in Data Analytics I, including multivariate analysis, classification methods, association rules, dimension reduction, performance evaluation, multiple and logistic regression, k-Nearest Neighbors (k-NN), the Naive Bayes classifier, decision trees, neural nets, and discriminant analysis. However, the pace was much more rushed and the content much denser, which makes sense since it is DA II. Still, the professor and the structure of the course were not very helpful, and I had to rely on a lot of outside support and supplementary materials to get through the class, grasp many of the concepts, and successfully complete the assignments.

Tools Utilized: Excel, R statistical programming language, Posit.cloud (web-based RStudio application)
Skills Acquired/Developed: Multivariate and unstructured data analysis, classification methods, association rules, dimension reduction, performance evaluation, multiple and logistic regression, k-Nearest Neighbors (k-NN), Naive Bayes classifier, decision trees, neural nets, and discriminant analysis.