Top (max 10) reviews: Python Data Science Handbook: Essential Tools for Working with Data.

4.4 out of 5.0    30 total reviews.

3.0 out of 5.0 -

by gmkiv on March 10, 2017

The figures were generated in color, but printed black and white, so they are often unintelligible. It's hard to tell the red dots from the blue when they are both grey.
Apart from that major oversight, the book is OK. If you want to learn data science, this is not for you; it doesn't get into the fundamentals much at all. If you are an experienced R user looking for how to translate your work into Python, this will get you started. The rest of my review comes from this perspective.
The book spends far too much time on low-level IPython, NumPy, and Matplotlib functionality (chapters 1, 2, and 4). You are rarely going to use this stuff.
The pandas section (chapter 3) is fine, but I was a little disappointed in the treatment of the grouping/aggregation functions. The book mentions the split-apply-combine paradigm of Hadley Wickham, but doesn't cover the topic in nearly as much detail as the paper of the same name. I was hoping to learn how to translate the dplyr verbs (group_by, filter, select, mutate, summarize, arrange) into pandas, but this book doesn't provide that. You will learn the basics of grouping and aggregation, but your code is going to be a lot more verbose than it was in R.
The machine learning case studies in chapter 5 are pretty nice - probably the only reason I would recommend this book. The chapter provides a good overview of the scikit-learn API and effective patterns for machine learning problems.

4.0 out of 5.0 -

by Dmitry on June 4, 2017

I am currently taking a Machine Learning course from Udacity, and this book has proven to be a great reference guide for several projects and quizzes. It does not go into depth on machine learning, even though almost half of the book is dedicated to it, but it does give an understanding of the essential concepts. For those interested in machine learning, I would recommend buying "Hands-On Machine Learning with Scikit-Learn and TensorFlow" by Geron as well as this book.
There is no one book for data science, and this one is no exception. Just keep that in mind before buying it.
Other than that, I am really happy with my purchase.
P.S. For those complaining about black and white graphs and diagrams - check the author's GitHub.

5.0 out of 5.0 -

by aloctavodia on Dec. 15, 2016

I have been reading and recommending this book since the early-release stage. The first half of the book is dedicated to introducing the basic Python libraries for data analysis (and scientific computing in general). The second half deals with machine learning from a practical point of view using the Python library scikit-learn. The book is ideally suited to those who already know basic Python (or know how to program in a language like R or Julia) and want to learn how to use Python for data analysis. Even if you already know Python and how to use it for data analysis, you could still find some gems here and there in the form of very clear examples or comments. In summary, if you want to do data analysis and you already know Python, read this book. If you do not know Python, read Think Python by Allen B. Downey first and then read this book.

3.0 out of 5.0 -

by Antonio on March 7, 2017

This book is not bad, but not great either, in my opinion. The author obviously knows his stuff and covers the material in an accurate and competent manner. However, there are a few flaws. Firstly, this book is quite dull; I've read other data science books that are more engaging. For example, he might discuss matplotlib and go exhaustively through many options. That is fine, but boring. In my opinion it might be better to cover things in not quite as much depth but more in the context of a realistic analysis.
Secondly, this book can't decide if it is a reference or a tutorial. The author gets a bit carried away showing too many features, and I often found myself nodding off or losing my concentration. With so many online references available, it might be better to concentrate on being a tutorial and not try to show so many features. Or perhaps each chapter could be separated into a tutorial and then a reference. On the other hand, I realize that some readers might want this extra depth, so I'm just saying what I personally would have preferred. A related problem is that the material can quickly go out of date; I already found some options to be deprecated when running the code.
Thirdly, I question some of the organization of the material. He often introduces some aspect, doesn't explain it properly, and then returns to it later on to explain it in more depth. An example is the scikit-learn pipeline object: he starts using it, leaving me puzzled, and only later returns to explain it. This kind of issue was relatively common.
In conclusion, this is a decent book and certainly not a bad book, but it is more suited to particular audiences. This book would be good for those looking for a reference and relatively detailed information on a particular topic.
It is not so well suited to beginners, who I think would be confused or overwhelmed. People with some experience who are looking for more of a tutorial could be bored.
It is worth noting as well that there are so many Python data science books, but nearly all of them are not very good. Relative to the other books, this is probably one of the best. In contrast, there are a number of excellent books that use R.

5.0 out of 5.0 -

by L. Wixson on Aug. 5, 2017

When I first received this book, I was surprised that it didn't get to scikit-learn until the last third of the book. The first third is about numpy and pandas, and the middle third is about matplotlib. Now that I've been applying it at work, however, I've found that the items covered in the first two thirds were really essential. I wouldn't be nearly as productive if I had just jumped straight to the sections on scikit-learn. The author does an excellent job covering broad terrain with enough detail that you are able to apply it to your problems. You will find yourself going back to use this book as a reference.

5.0 out of 5.0 -

by Micheal Nguyen on June 9, 2017

This is an excellent reference book for people working in data science. Remember, 80% of the effort in machine learning, data analysis, or data science in general is about processing and understanding data. This book is for that purpose, and I think it's the best book out there about data processing, analysis, and visualization using Python. If you are looking for hardcore machine learning, go for other books. Highly recommended!

5.0 out of 5.0 -

by Gary W. Garrison on June 8, 2017

I have used R for a few years, and this was my first book that covered Python for data science. Even though it does not go into great depth in any area, it is definitely a super book. It covers everything from pandas and Matplotlib to scikit-learn. I would highly recommend it for anyone who is new to Python and/or data science. The book is written with Jupyter Notebooks, so it is easy to follow along and try code from the book in your own notebook.

5.0 out of 5.0 -

by Fontana Federico on Feb. 7, 2017

I'm an experienced R user in data analytics and I wanted to learn Python from scratch. I've just finished studying this book from cover to cover and I'm extremely happy with it. However, as mentioned in the book, the reader is assumed to be familiar with the basics of the language. For this reason, I spent a few hours getting familiar with the language before going through the book. Even though the author works in IPython (i.e. Jupyter), I haven't had any problems working in PyCharm.
Pros:
- This is the book for the R user.
- The book adopts Python 3.
- The writing style of the author makes the book enjoyable to read.
- Amazing chapters on NumPy, pandas and matplotlib/seaborn.
- The section on scikit-learn assumes no experience in machine learning. If you are new to machine learning, this section is great for you, since the author provides an excellent high-level description.
Cons:
- There are no explicit suggestions about how to structure a data science project in order to go beyond a simple script.
- The section on scikit-learn assumes no experience in machine learning. If you are familiar with learning theory, trees, SVM, PCA, etc., you will not get much out of this section.

3.0 out of 5.0 -

by Peace and Love on March 11, 2017

I received the book today and was very disappointed. The reason I gave two stars is not the content of the book, but rather the print quality. Everything is black and white. There are plots that rely on different colors, and since this book is printed in grayscale, the plots look like nonsense to me. I am going to return the book.

5.0 out of 5.0 -

by Brijesh on Jan. 29, 2017

This is by far the best book on the market to get you started with using Python for data science. You will need some basic understanding of Python and machine learning to understand the concepts here, but this book will definitely take your skills to the next level. This is a no-nonsense book that goes deep into the material that is relevant and important for doing data science in Python. Every page is rich in information, provides practical use cases and optimization tricks, and adds new dimensions to your understanding of the topic.