SQL on Hadoop: Getting Started

This is my presentation from SoCal Code Camp – San Diego. Hopefully the slides are helpful. The commands probably won’t copy and paste perfectly into the terminal, so please reach out with any questions.

Here is plain text of the commands I used:



  1. Pretty cool, Dustin.

    I might be missing some context, so pardon the following question…

    Were you attempting to define a correlation between “crime” and “batting averages”?

    Any chance they video’d your presentation, and you could post that, or supply a link?



    1. Thanks! I’d like to create a video version of it; if I get a better mic, that will probably happen. Crime and baseball were two different data sets I found interesting. I actually didn’t use the crime data in my presentation because it is more challenging to get working in Hadoop, but I left it in the slides since in the real world you might encounter that type of delimited file very quickly, and it’s good to see an example of how to make it work.


