Welcome back! This is part two of my implementation of Dr. Brownlee’s gradient descent algorithm for multiple parameters, built on the simple linear (aka single-parameter) algorithm detailed in reference #1. In that same post, Dr. Brownlee said the approach could be extended to multiple parameters. Let’s see if that is the case.
Rewinding to part 1, I implemented what I thought was a modifiable version of the single-parameter simple linear regression algorithm and got results similar to Dr. Brownlee’s blog post. Here in part 2, I am going to update the algorithm to take multiple parameters, so if successful, it will work on any number of parameters. However, I am sad to report that it didn’t work like I had hoped (I am sure it is something I am doing wrong). Suffice it to say that, a lot of math later, the prediction was nowhere close to what I expected. Using Brandon Foltz’s data set and a modification of Dr. Brownlee’s single-parameter linear regression example, I got these results.
- Data Set
- Result (I may be missing something, but I was not expecting to get a -1.63 result for the prediction)
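Concretely, the change I attempted: where the single-parameter version updates only b0 and b1 from the error, the multi-parameter version loops over every coefficient. Here is a minimal sketch of just one update step (hypothetical names, not my actual listing):

```python
def sgd_update(row, coef, l_rate):
    """One stochastic gradient descent step for a row [x1, ..., xn, y].

    coef[0] is the intercept; coef[i + 1] pairs with input row[i].
    """
    # prediction with the current coefficients
    yhat = coef[0] + sum(c * x for c, x in zip(coef[1:], row[:-1]))
    error = yhat - row[-1]
    coef[0] -= l_rate * error                 # intercept step
    for i, x in enumerate(row[:-1]):
        coef[i + 1] -= l_rate * error * x     # one step per input
    return coef

# One update on a two-input row, starting from all-zero coefficients.
coef = sgd_update([1.0, 2.0, 6.0], [0.0, 0.0, 0.0], 0.01)
print(coef)
```

With two inputs this collapses to exactly the b0/b1/b2 updates from the single-parameter post, which is why I expected the generalization to be mechanical.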
People learn in many different ways, and for me, I learn by finding examples and implementing them in code. If I want to learn technology X or algorithm Y, I try to find an example that accomplishes what I want and then implement it in code without reviewing other code samples. If successful, I start playing with it and trying variations until (hopefully) I can run with technology X or algorithm Y on my own.
Essentially, Dr. Brannick’s example is a two-parameter calculation, and I was able to obtain what I think are reasonable predictions using the test data set he used, as well as a couple of predictions of my own that are not in his test set.
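For clarity, the two-parameter form is just an intercept plus one term per input. The coefficients below are made-up placeholders, not Dr. Brannick’s actual values:

```python
def predict_two(x1, x2, b0, b1, b2):
    # Two-parameter linear model: yhat = b0 + b1*x1 + b2*x2
    return b0 + b1 * x1 + b2 * x2

# Hypothetical coefficients, purely to show the shape of the call.
print(predict_two(2.0, 3.0, b0=0.5, b1=1.0, b2=2.0))  # 0.5 + 2.0 + 6.0 = 8.5
```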
Running it, we get:
- Endpoint call
NOTE: If you compare my two coefficients (i.e. b1 and b2), they are pretty much the same as Dr. Brannick’s.
Digging in, I opted to shorten the result set to the first two entries from Dr. Brannick’s test set plus two of my own. Reviewing them, we see the test set and predictions used to generate the result above.
The main method and accompanying methods are as follows:
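In outline, the flow is: scale the data, fit the coefficients with stochastic gradient descent, then predict. Below is a self-contained sketch under those assumptions (stand-in data and names, not my actual listing). If I recall correctly, Dr. Brownlee’s multivariate examples normalize the inputs first, and skipping that step is one easy way to get wild predictions from a fixed learning rate:

```python
# Sketch of the overall flow, with stand-in data (rows are
# [x1, x2, y], generated from y = -5 + 2*x1 + 10*x2) rather than
# the real test set. Min-max scaling keeps a fixed learning rate
# stable when the inputs live on different scales.

def normalize(dataset):
    """Scale every column into [0, 1] using that column's min/max."""
    stats = [(min(col), max(col)) for col in zip(*dataset)]
    scaled = [[(v - lo) / (hi - lo) for v, (lo, hi) in zip(row, stats)]
              for row in dataset]
    return scaled, stats

def predict(row, coef):
    """coef[0] is the intercept; coef[i + 1] pairs with input row[i]."""
    yhat = coef[0]
    for i in range(len(row) - 1):
        yhat += coef[i + 1] * row[i]
    return yhat

def coefficients_sgd(train, l_rate, n_epoch):
    """Fit one coefficient per input column, plus an intercept."""
    coef = [0.0] * len(train[0])
    for _ in range(n_epoch):
        for row in train:
            error = predict(row, coef) - row[-1]
            coef[0] -= l_rate * error
            for i in range(len(row) - 1):
                coef[i + 1] -= l_rate * error * row[i]
    return coef

def main():
    data = [[10, 1, 25], [20, 1, 45], [10, 2, 35], [30, 2, 75], [20, 3, 65]]
    scaled, _ = normalize(data)
    coef = coefficients_sgd(scaled, 0.1, 2000)
    for row in scaled:
        print(round(predict(row, coef), 3), "vs", round(row[-1], 3))

main()
```

The same three helpers run unchanged whether a row has two inputs or twenty, which is the property I was after in the first place.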
- A lot of the examples I see out there revolve around using libraries to do the heavy computational lifting. I personally think it is a mistake for software engineers like me who want to move into the Machine Learning (ML) universe to rely on them (at least while trying to understand how things work under the hood).
- I personally don’t want to become a mathematician, but I do want to up my programming game enough that I can navigate the math behind machine learning and be productive. My goal for the next dozen blog posts in this space is to explore a number of different algorithms so that I can understand and effectively use them.
The next blog post will revolve around taking this algorithm, trying other datasets, and seeing if it measures up. For example, can this algorithm handle Brandon Foltz’s 3-parameter dataset?