In my AI ‘track’ (I have multiple ‘tracks’ on my blog list…so much technology, so little time ;P ), I last explored what is called ‘simple linear regression’, which uses a single input variable. Put simply, if an ‘x’ value is known and the model is set up correctly, this algorithm can predict the ‘y’ for the supplied ‘x’.
The more general version, with more than one input variable, is usually called multiple linear regression. As I am learning, there are multiple types of linear regression (reference #1). Today, I will be exploring an algorithm called gradient descent. If you have read my earlier posts, you have seen me make references to it. It will allow me to have multiple parameters in my machine learning model. Since Dr. Brownlee’s simple linear regression example is the best I have found (so far), I thought about starting with his gradient descent example (reference #2). However, it doesn’t provide the step-by-step approach I had hoped for. He does describe the high-level process, but the nitty-gritty detail was not present.
So, I did find (earlier) what I hoped would be a step-by-step example (reference #4). However, after getting excited about video 1, I learned in video 2 that Mr. Foltz uses Excel and other tools to do the detailed analysis. While this is good for high-level examples, the nitty-gritty step-by-step approach I cherish was (again) not present. But it dawned on me that perhaps part of the answer is here. More specifically, the target of a successful example. Mr. Foltz provides a detailed data set and the ‘answer’. So, perhaps I can stitch together an example that will use his data set and spit out the ‘correct’ answer. The coefficients (which are the parts I need to calculate from the 3 parameter, 10 element data set Mr. Foltz uses) are calculated in references #5 and #6.
However, after going through the videos, it turns out that a simple linear regression model seems best for Mr. Foltz’s data set 😦 Sooooo, I went back to Dr. Brownlee’s gradient descent example and, even though it is simple linear regression, it does look like I might be able to use it for 3 parameters. So, using Mr. Foltz’s data set, I set out to use Dr. Brownlee’s simple linear algorithm to figure out what that set of coefficients looks like (even though they would not ultimately be used together, for reasons Mr. Foltz points out). What I am really hoping to come across is a linear regression example that involves at least 3 parameters and has a clear, established answer so I can pick it apart to understand what is going on.
Sooo, I reviewed Dr. Brownlee’s gradient descent algorithm for the 2 parameter, 5 element dataset to see if I could achieve his results. The results were similar to his. Specifically:
- To run the application, run the command ‘npm run start’ and put in this url
- Review the results
NOTE: 1-5 were part of the test data set (6 and 7 were not part of the test set).
- The dataset (1-5) was part of Dr. Brownlee’s example. I added them to the prediction array along with 6 and 7.
- NOTE: The min/max and cost function were not used in this specific endpoint.
- The code consists of four methods. The ones of interest are
- calculateSimpleLinearRegressionCoefficients() – This method takes each test data set row of x and y and tweaks the coefficients for both B0 and B1 using a learning rate (aka alpha) and error cumulative value.
- makePrediction() – This method takes the calculated coefficients (i.e. B0 and B1 in the y = B0 + B1*x formula) and performs the prediction arithmetic with the ‘x’ for each prediction.
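To make the two methods above a bit more concrete, here is a minimal sketch of what they might look like. It is not the exact code from the app. It assumes the per-row update rule B0 = B0 - alpha * error and B1 = B1 - alpha * error * x, a learning rate of 0.01, 4 passes over 5 rows (20 iterations), and five (x, y) pairs that I believe match the tutorial's data set. The function signatures and the shape of the returned object are hypothetical.

```javascript
// Sketch only: per-row gradient descent for the model y = B0 + B1 * x.
// The signature and return shape are assumptions, not the app's real code.
function calculateSimpleLinearRegressionCoefficients(rows, alpha, epochs) {
  let b0 = 0.0;
  let b1 = 0.0;
  for (let epoch = 0; epoch < epochs; epoch++) {
    for (const [x, y] of rows) {
      // Error of the current prediction for this row.
      const error = (b0 + b1 * x) - y;
      // Nudge both coefficients against the error, scaled by alpha.
      b0 = b0 - alpha * error;
      b1 = b1 - alpha * error * x;
    }
  }
  return { b0, b1 };
}

// Applies y = B0 + B1 * x to a single x value.
function makePrediction(coefficients, x) {
  return coefficients.b0 + coefficients.b1 * x;
}

// Assumed data set: [x, y] pairs, in the order they are visited.
const rows = [[1, 1], [2, 3], [4, 3], [3, 2], [5, 5]];

// 4 epochs over 5 rows = 20 iterations.
const coefficients = calculateSimpleLinearRegressionCoefficients(rows, 0.01, 4);

// Predict for the x values in the data set (1-5) plus two unseen ones (6 and 7).
for (const x of [1, 2, 3, 4, 5, 6, 7]) {
  console.log(`x=${x} predicted y=${makePrediction(coefficients, x).toFixed(4)}`);
}
```

Raising the `epochs` argument is the simplest way to run the loop "a little longer" and watch how the coefficients drift.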
As stated, my results were similar to his. The difference is that Dr. Brownlee stops after 20 iterations and uses those B0 and B1 coefficients to make his predictions. By running my iterations a little longer, my coefficients were different (but close). The predicted values are in line (I think) with what they should be. That said, it is interesting to not have an absolute exit point. In past perceptron code I have worked through, you usually exit after your error is below a specific value. I wonder if something similar should apply here. Seems not.
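For what it's worth, an error-based exit point can be bolted onto the same loop. The sketch below is one possible variation, not something taken from Dr. Brownlee's example: it stops when the mean squared error changes by less than a tolerance from one pass to the next, with a cap on the number of passes as a safety net. The names `fitUntilConverged`, `meanSquaredError`, `tolerance`, and `maxEpochs` are all hypothetical, and the small data set is made up for illustration (it lies exactly on y = 1 + 2x).

```javascript
// Sketch only: mean squared error of y = b0 + b1 * x over all rows.
function meanSquaredError(rows, b0, b1) {
  let sum = 0;
  for (const [x, y] of rows) {
    const error = (b0 + b1 * x) - y;
    sum += error * error;
  }
  return sum / rows.length;
}

// Same per-row updates as before, but with an error-based exit:
// stop once the MSE changes by less than `tolerance` between passes,
// or after `maxEpochs` passes, whichever comes first.
function fitUntilConverged(rows, alpha, tolerance, maxEpochs) {
  let b0 = 0.0;
  let b1 = 0.0;
  let previousMse = Infinity;
  for (let epoch = 1; epoch <= maxEpochs; epoch++) {
    for (const [x, y] of rows) {
      const error = (b0 + b1 * x) - y;
      b0 -= alpha * error;
      b1 -= alpha * error * x;
    }
    const mse = meanSquaredError(rows, b0, b1);
    if (Math.abs(previousMse - mse) < tolerance) {
      return { b0, b1, epochs: epoch, mse };
    }
    previousMse = mse;
  }
  return { b0, b1, epochs: maxEpochs, mse: previousMse };
}

// Made-up rows that sit exactly on the line y = 1 + 2x.
const lineRows = [[0, 1], [1, 3], [2, 5]];
const fit = fitUntilConverged(lineRows, 0.05, 1e-12, 100000);
console.log(`stopped after ${fit.epochs} passes: B0=${fit.b0.toFixed(4)}, B1=${fit.b1.toFixed(4)}`);
```

Comparing the change in error rather than the error itself matters for noisy data, where the error never reaches zero and a fixed threshold might never be hit.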
I had planned on doing all of this in one post. But it is already wwwwwaaaaayyyyy too long. So, Part 2 will take my implementation of Dr. Brownlee’s simple linear algorithm and see if it can be applied to the gradient descent algorithm using Mr. Foltz’s data set.