Like many people, I have become interested in Big Data recently. At the moment it is just an interest; it doesn’t really come into my day job, so it’s something I’ve been investigating and researching in my spare time, and it’s very early days for me. These are mainly notes for myself, to remind me what I’m planning to look at and read, but they may well help someone starting out on the same journey.
Books
The first thing I decided to do was look up some good old-fashioned books. The ones I have looked into so far are the following:
Big Data Now: Current Perspectives from O’Reilly Radar – This book is simply a collection of articles, but so far that has been a strength for me, as it’s allowed me to dip into all sorts of areas. It seems a nice starting point. At least I now know what Hadoop, MapReduce, Pig and various other terms mean.
Big Data: A Revolution That Will Transform How We Live, Work and Think – I confess that I haven’t read this yet, but it at least sounds interesting in principle and I hope to get some good ideas from it.
Data Science Starter Kit – This was recommended to me but I haven’t picked it up yet. I will probably move on to it once I understand some of the basic concepts and have decided how I plan to experiment.
Big Data from Manning – This isn’t in print until later this year but I’m quite keen to get hold of it as it sounds like it has some decent information, and I tend to like Manning books.
Environment
One thing I’ve found so far is that very little is based on Windows; nearly everything I have read assumes Unix as the OS, so it looks like I might have to get my hands dirty with that. I may also have to look into Ruby and Python as languages to start out with, or even return to command-line scripts or Perl to get going.
The one Microsoft offering I have seen is HDInsight, which essentially appears to be Hadoop on Azure, Microsoft’s cloud service. I’ve done a little work with Azure in the past, so I will certainly look into this to see how usable it is for me and what kind of samples I can create.
Hopefully I’ll update my blog regularly when I find interesting articles, or if I roll up some sort of test or code.
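As a first small experiment of that kind, here is a minimal sketch of the MapReduce idea as a word count in plain Python. It runs locally in a single process rather than on Hadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are my own illustrative choices, not part of any library.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group emitted values by key (a framework would do this between map and reduce)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the list of counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Chain the three steps: map, shuffle, reduce."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, just to show the shape of the result.
    sample = ["big data is big", "data about data"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps would run in parallel across many machines, with the framework handling the grouping in between. This toy version only shows how the three steps fit together.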