Data Smart: Using Data Science to Transform Information into Insight by John W. Foreman is one of those fantastic books that, as I read it, kept me asking myself why I hadn’t come across it sooner. The author makes data science appear less mysterious and much clearer.
On the one hand, the book introduces real-life case studies of data science problems that can be solved using techniques such as k-means clustering, regression, network clustering, optimization, ensemble models, prediction and the like. On the other hand, each of these case studies is implemented in Excel. Yes, that’s correct: data science can be done in Excel if we really want to. It’s a perfect tool for case studies because most readers already know Excel, so the algorithms can be explained without the added complexity of having to learn a data mining technology such as R.
We probably wouldn’t use an Excel spreadsheet to process the huge volumes of data we may come across in real life. But once we understand the process and the algorithm behind each data science scenario, applying it in a more robust technology is just one further step.
My takeaway from this book is this: if you really want to understand data science, then this is a must-read.