Cloud Adventures Part 1 – The Data Set

This entry is the first in a multi-part series of blog posts dealing with the cloud.  More specifically, by the time I am done, I hope to have explored a significant portion of the Azure, Amazon and Google cloud services pertaining to how data is loaded, parsed, analyzed and ultimately used, taking all of my readers along for the ride 🙂  The first step in this journey is generating a significant data set to load.

The question of what type of data to generate was actually a lot harder than I thought it would be.  To be useful, the data set had to have interesting data points as well as sheer volume.  Ideally, it would show patterns that could be measured, analyzed and reported on.  I work for an energy company, so I didn’t want to use anything that could be remotely related to work.  So, I chose to simulate an artificial satellite system.  The data points seem interesting, and the amount you can generate is endless: to scale, you just add more satellites with more complex functions and orbits.  Then I started reading up on satellites 🙂  After numerous articles, I realized my initial thoughts were correct, but there were many complex possibilities.  To keep this a small enough project, I had to reduce my initial scope.

I chose to create a two-dimensional (i.e., X/Y point system) eight-satellite system orbiting a planet.


Each satellite can move up, down, left and right.  Each has a power level and a fuel level to make any needed motion happen.  When in place (aka ‘onStation’), each satellite can deploy its solar panels to generate power and fuel.  I know generating fuel from solar panels is not very feasible, but I chose to link them for simplicity 🙂  If the satellite moves, the solar panels are retracted; they can’t be deployed while the satellite is moving.

At launch, each satellite travels from its source X/Y position to its destination X/Y position.  It starts out with both the power and fuel levels at 100.  Moving the satellite one X/Y position costs .25 power to run the navigation systems and .20 fuel.  Both the starting levels and the action costs are arbitrary.
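As a rough sketch of those rules, the per-step bookkeeping might look like the following.  This is illustrative Python, not the project’s actual C# source; the class and attribute names are my own.

```python
# Illustrative sketch of the movement rules described above.
# Class/attribute names are hypothetical, not from the project's source.

POWER_COST_PER_STEP = 0.25  # power drained by the navigation systems per X/Y step
FUEL_COST_PER_STEP = 0.20   # fuel burned per X/Y step

class Satellite:
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.power = 100.0          # starting power level
        self.fuel = 100.0           # starting fuel level
        self.panels_deployed = False

    def step_toward(self, dest_x, dest_y):
        """Move one X/Y position toward the destination, paying power and fuel."""
        if (self.x, self.y) == (dest_x, dest_y):
            return  # already on station
        self.panels_deployed = False  # panels must be retracted while moving
        self.x += (dest_x > self.x) - (dest_x < self.x)
        self.y += (dest_y > self.y) - (dest_y < self.y)
        self.power -= POWER_COST_PER_STEP
        self.fuel -= FUEL_COST_PER_STEP

sat = Satellite(0, 0)
while (sat.x, sat.y) != (10, 5):
    sat.step_toward(10, 5)
print(round(sat.power, 2), round(sat.fuel, 2))  # 10 steps later: 97.5 98.0
```

Traveling from (0, 0) to (10, 5) takes ten diagonal/straight steps, so the satellite arrives with 97.5 power and 98.0 fuel under these costs.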

Once on station, each satellite will periodically be pushed off position by a small factor and must move itself back to its expected position.  Additionally, the planet will initiate a shift that moves the entire satellite system to a new position and then back.  Both activities are my attempt to quickly add new data points/patterns to analyze later.
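A minimal sketch of that drift-and-correct loop is below.  The drift probability, nudge size, and tick structure are my own assumptions for illustration; the planet-shift behavior would be analogous but applied to every satellite at once.

```python
import random

# Illustrative sketch of the station-keeping behavior described above.
# Drift chance and magnitude are assumed values, not the project's.

def tick(position, station, drift_chance=0.2):
    """One simulation tick: maybe drift off station, else step back toward it."""
    x, y = position
    sx, sy = station
    if (x, y) == (sx, sy) and random.random() < drift_chance:
        # Pushed off position by a small random factor.
        x += random.choice([-1, 1])
        y += random.choice([-1, 1])
    else:
        # Move one X/Y step back toward the expected station position.
        x += (sx > x) - (sx < x)
        y += (sy > y) - (sy < y)
    return (x, y)

position = (5, 5)
for _ in range(100):
    position = tick(position, (5, 5))
```

Because each drift is at most one step per axis and each correction moves one step per axis back, the satellite never wanders more than one position from its station.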

Things you will need to run this are:
-SQL Server (I used 2014) installed on the local box and/or accessible in the cloud
-SQL Server Management Studio
-SignalR (web-sockets technology)
-Visual Studio 2013/2015 (only one is needed…I used both for a variety of reasons)


First, create a database table to hold the updates.  In SQL Server Management Studio, open a query window and run this table create script:

CREATE TABLE [dbo].[Updates](
	[id] [bigint] IDENTITY(1,1) NOT NULL,
	[type] [varchar](250) NULL,
	[data] [varchar](max) NULL,
	[created] [datetime] NULL CONSTRAINT [DF_Updates_created] DEFAULT (getdate()),
	CONSTRAINT [PK_Updates] PRIMARY KEY CLUSTERED ([id] ASC)
)
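To make the table shape concrete, here is a sketch (in Python, with made-up field names — the project serializes its own update objects, so the actual JSON layout will differ) of the kind of row the simulator might write:

```python
import json
from datetime import datetime

# Hypothetical example of one row destined for dbo.Updates.
# The JSON field names are illustrative, not the project's actual format.

update_type = "positionUpdate"        # goes in the [type] column
payload = json.dumps({                # goes in the [data] column
    "satelliteId": 3,
    "x": 42,
    "y": 17,
    "power": 97.5,
    "fuel": 98.0,
    "onStation": False,
})
created = datetime.now()              # [created] defaults to getdate() server-side

print(update_type, payload)
```

Keeping [data] as free-form varchar(max) JSON means new update types can be added later without schema changes, which is handy when the goal is simply to pile up records to load into the cloud.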

Next, download the source code and follow the process below:

  • Go to the link and download as a zip.
  • Extract the zipped solution and open in Visual studio.
  • Right-click on the solution and re-build (it may take a couple of re-builds for the NuGet packages to download/link).
  • Right-click on the solution and select ‘Set Startup Projects…’.
  • When the prompt is open, select ‘Multiple startup projects’, select ‘Start’ for the ‘Driver’ and ‘Web’ projects and click ‘OK’.
  • Update the database connection string in the ‘Constants.cs’ class in the ‘Shared’ project with the name of your local machine.
  • Run the project.
  • The ‘Driver’ console application will launch.
  • The ‘Web’ project will launch in a web browser (I used IE 11).  The site is HTML 5 compliant, so it should work in any of the major browsers.

Activity includes the visual satellite display as well as the dumping of updates to the database table you created earlier.  I ran the project for 5-7 days and created a data set of roughly 23 million records.  This data set will be the focus of the next several blog posts as we load it into the Azure, Amazon and Google clouds.  Next stop is Azure Blob storage.
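For a sense of scale, 23 million rows over 5-7 days works out to roughly 38-53 inserts per second, a quick back-of-the-envelope check:

```python
# Rough insert rate implied by ~23 million records over 5-7 days of runtime.
records = 23_000_000
rates = {}
for days in (5, 7):
    seconds = days * 24 * 60 * 60
    rates[days] = round(records / seconds)
print(rates)  # {5: 53, 7: 38}
```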
Stay tuned!


