SQL on Hadoop: Getting Started

This is my presentation from SoCal Code Camp – San Diego. Hopefully the slides are helpful. The commands probably won’t copy and paste perfectly into the terminal, so please reach out with any questions.

Here is plain text of the commands I used:



  1. Pretty cool, Dustin.

    I might be missing some context, so pardon the following question…

    Were you attempting to define a correlation between “crime” and “batting averages?”

    Any chance they recorded your presentation, and you could post it or supply a link?



    1. Thanks! I’d like to create a video version of it; if I get a better mic, that will probably happen. Crime and baseball were two different data sets I found interesting. I actually didn’t use the crime data in my presentation because it is more challenging to get working in Hadoop, but I left it in the slides since in the real world you might encounter that type of delimited file very quickly, and it’s good to see an example of how to make it work.

