Wearable Fitness Tracker Predictive Modeling


This report was created for a Canadian startup that builds wearable fitness trackers used in gyms, along with an accompanying mobile application. Of all the reports submitted by a select group of highly qualified individuals, my solution yielded the best results in practice. Some of the code has intentionally been removed.

Predictive Text Application – Milestones


We are developing an application that predicts the next word from the words that precede it, using text mining and natural language processing (NLP). This is similar to mobile keyboard software such as SwiftKey. The end product will be a web application that takes an incomplete phrase from the user and predicts the next word. To build the application, we need a suitable text corpus. Here we use the English-language sets from HC Corpora. This milestone report describes our initial exploratory analysis of the data and our future goals in a concise and understandable manner.
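As a rough illustration of next-word prediction, here is a minimal bigram sketch in Python. It is not the application's actual model: the function names and the toy corpus are made up for this example, and a real system would likely use longer n-grams, smoothing and a far larger corpus.

```python
from collections import Counter, defaultdict


def build_bigram_model(sentences):
    """Count, for each word, how often each following word occurs."""
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev][nxt] += 1
    return model


def predict_next(model, phrase):
    """Return the most frequent follower of the phrase's last word, or None."""
    tokens = phrase.lower().split()
    if not tokens or tokens[-1] not in model:
        return None
    return model[tokens[-1]].most_common(1)[0][0]


# Toy corpus, invented for illustration (not taken from HC Corpora).
toy_corpus = [
    "thanks for the follow",
    "thanks for the help",
    "thanks for the help today",
    "see you at the gym",
]
model = build_bigram_model(toy_corpus)
print(predict_next(model, "thanks for the"))
```

In this toy corpus "help" follows "the" more often than any other word, so it is the suggestion for a phrase ending in "the". A phrase ending in a word the model has never seen returns None.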

Human Activity Recognition and Machine Learning


Human Activity Recognition is emerging as a new field in which wearable devices are commonly used to quantify how much time is spent on an activity. In our analysis, we instead look at how well weight lifting exercises were performed in a study. Each participant in the experiment wore accelerometers on different parts of the body while performing barbell exercises in five different ways. We developed machine learning algorithms that predict the way each exercise was performed from the accelerometer data. Our final model, a random forest trained with 10-fold cross-validation repeated 5 times, achieved 100% in-sample accuracy and 99.0% out-of-sample accuracy.
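The resampling scheme named above, 10-fold cross-validation repeated 5 times, can be sketched with the Python standard library alone. This shows only how the train/test splits are generated, not the random forest or the report's own code. The function name, the seed and the sample count of 100 are assumptions made for illustration.

```python
import random


def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=5, seed=0):
    """Yield (train_indices, test_indices) for each fold of each repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh shuffle for every repeat
        # Strided slicing deals the shuffled indices into n_folds groups.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test


# Hypothetical sample count, chosen only to make the sketch concrete.
splits = list(repeated_kfold_indices(n_samples=100))
print(len(splits))  # 10 folds x 5 repeats
```

Each of the 50 splits holds out a different tenth of the data. A model would be fitted on the training indices and scored on the held-out indices, and the scores averaged to estimate out-of-sample accuracy.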

Regression Models of MPG in Automobiles


In this report we ask whether automatic or manual transmission gives better MPG in the mtcars dataset, and we quantify the difference. The complication is that other variables also affect MPG. In our best linear regression model, weight and \(\frac{1}{4}\)-mile time also influence MPG, so transmission type alone cannot determine which gives the better MPG.
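One way to write a model of this kind is shown below. The exact form of the report's best model is not stated here, so this is an assumed specification. It uses the mtcars column names wt (weight), qsec (quarter-mile time) and am (transmission, 0 = automatic, 1 = manual).

```latex
\[
\mathrm{mpg}_i = \beta_0 + \beta_1\,\mathrm{am}_i + \beta_2\,\mathrm{wt}_i + \beta_3\,\mathrm{qsec}_i + \varepsilon_i
\]
```

Under this form, \(\beta_1\) is the expected difference in MPG between manual and automatic cars with the same weight and quarter-mile time, which is why it can differ from the raw difference in group means.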

Effect of Vitamin C Dose and Supplement Type on Tooth Growth


This is an analysis of the ToothGrowth dataset on guinea pigs, which ships with the standard R installation. We first summarize and explore the data to see what it contains. We then use confidence intervals and hypothesis tests to determine which dose and supplement type of vitamin C is more effective for tooth growth. Our conclusions rest on stated assumptions. Orange juice is the better supplement for tooth growth at two of the three doses. At the highest dose, however, we see no advantage of orange juice over ascorbic acid. In general, tooth growth increases with dose.
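A two-group comparison of this kind can be sketched with Welch's t statistic. The report does not say which test it used, so Welch's unequal-variance form is an assumption here. The function name is hypothetical, and the two samples are made-up numbers, not values from ToothGrowth.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Made-up measurements for two groups, for illustration only.
group_a = [10, 12, 14, 16, 18]
group_b = [8, 9, 10, 11, 12]
t_stat, dof = welch_t(group_a, group_b)
print(round(t_stat, 3), round(dof, 3))
```

The statistic is then compared with a t distribution with `dof` degrees of freedom to get a p-value or confidence interval. That step needs a t quantile function, which the Python standard library does not provide.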

Consequences of Severe Weather Events on the U.S. Population Health and Economy


This report describes the harmful impact of severe weather events on population health and the economy in the United States. To study the top weather events, we obtained data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. NOAA keeps records of fatalities and injuries, as well as estimates of property and crop damage. We look specifically at the years 1995 to 2011 because data from earlier years is largely incomplete. From the storm database, we extract the top 10 severe weather events affecting population health and the economy. Our results show that severe events occurring during warm months and storm seasons have the greatest impact on population health and the economy. Examples include flooding, excessive heat and tornadoes. We also found that the economic consequences are significantly higher for property than for crops, as expected.
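The ranking step can be sketched as a group-and-sum followed by a sort. The record layout, field names and numbers below are invented for illustration. They are not the storm database's actual schema or data, and the report's own code is not shown.

```python
from collections import defaultdict


def top_events(records, field, n=10):
    """Sum `field` per event type and return the n largest totals, descending."""
    totals = defaultdict(float)
    for record in records:
        totals[record["event_type"]] += record[field]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up records with hypothetical field names.
records = [
    {"event_type": "TORNADO", "fatalities": 3, "injuries": 20},
    {"event_type": "FLOOD", "fatalities": 1, "injuries": 2},
    {"event_type": "TORNADO", "fatalities": 2, "injuries": 15},
    {"event_type": "EXCESSIVE HEAT", "fatalities": 4, "injuries": 0},
]
print(top_events(records, "fatalities", n=2))
```

The same function can rank any numeric field, so separate top-10 lists for fatalities, injuries, property damage and crop damage come from calling it once per field.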
