So many topics, so little time. I had intended to focus only on setting up AWS code pipelines for each component of TGIMBA. However, as with many things in life, you need to take advantage of opportunities when they present themselves. One such opportunity came in the form of a Machine Learning course: Professor Andrew Ng’s Machine Learning course on Coursera (see reference #1). I am taking the course at my own pace (it is tough to keep up while working full time). The first algorithm covered is Linear Regression.
Conceptually, it is pretty straightforward. Given a list of x,y coordinates, the algorithm (once computed) provides the formula for a line, f(x), defined by an intercept and a slope, that runs as close as possible to the available x,y coordinates. If you want to predict the value of y for a given x, you plug that x into f(x).
However, after I started working through the example, the math didn’t add up. I worked through what I thought was the correct implementation of the algorithm, but it wasn’t correct. I posted some questions, but I was not able to get a concrete example of the algorithm. So, I looked around online and found an excellent example. In the article “Simple Linear Regression Tutorial for Machine Learning”, Jason Brownlee illustrates a complete math example of a single-variable linear regression.
- Make a get request
- View Results
The ‘structure’ of the program follows the article’s math flow.
The process is:
- Get the data sets
- Calculate the mean X and Y values
- Calculate the numerator
- Calculate Numerator – Part One
- Calculate Numerator – Part Two
- Calculate the Denominator
- Make a prediction
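The steps above can be sketched in a few lines of Python. This is a minimal illustration of the standard single-variable formula (slope = numerator / denominator, intercept = mean of y minus slope times mean of x), not the actual TGIMBA code. The function names `fit_line` and `predict` are hypothetical, the five data points are a small illustrative set, and the split of the numerator into "Part One" (the deviations from the means) and "Part Two" (multiplying and summing them) is my assumption about what those steps cover.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (intercept, slope) for a single-variable linear regression."""
    # Calculate the mean X and Y values
    mean_x = mean(xs)
    mean_y = mean(ys)

    # Numerator, part one (assumed): how far each point is from the means
    x_deviations = [x - mean_x for x in xs]
    y_deviations = [y - mean_y for y in ys]

    # Numerator, part two (assumed): multiply the deviations pairwise and sum
    numerator = sum(dx * dy for dx, dy in zip(x_deviations, y_deviations))

    # Denominator: sum of the squared x deviations
    # (this is zero, and the division fails, if every x is identical)
    denominator = sum(dx ** 2 for dx in x_deviations)

    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(x, intercept, slope):
    """Make a prediction: evaluate f(x) = intercept + slope * x."""
    return intercept + slope * x


# Get the data sets (illustrative values only)
xs = [1, 2, 4, 3, 5]
ys = [1, 3, 3, 2, 5]

intercept, slope = fit_line(xs, ys)
print("intercept:", intercept, "slope:", slope)
print("prediction for x = 3:", predict(3, intercept, slope))
```

For this data set the slope works out to roughly 0.8 and the intercept to roughly 0.4, so the prediction for x = 3 lands at about 2.8, which is also the mean of the y values.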
What was really cool about this experience was having a complete example. My next blog post will deal with taking a data set from Professor Ng’s class and seeing if Mr. Brownlee’s math holds up. If so, I will expand it to multiple variables.