Top (max 10) reviews: Introduction to Machine Learning with Python: A Guide for Data Scientists.

4.3 out of 5.0    31 total reviews.

5.0 out of 5.0 -

by Noemi Derzsy on Aug. 8, 2017

My current work revolves around using machine learning for the study of criminal behavior, so I read Introduction to Machine Learning with Python by Andreas Muller and Sarah Guido with great interest. The book comprises a complete documentation of the scikit-learn library, and provides a comprehensive overview of the machine learning models and the fundamental theory needed to get started in applying ML tools in practice. Each chapter contains Python source code that covers a wide range of interesting and practical data science problems. In addition to the basic theory, scikit-learn tools and code samples, the book also includes many useful hints, tricks and words of wisdom that can save you a lot of time by avoiding issues that invariably arise in your learning process. This is an excellent book that I highly recommend both to machine learning experts who want to be proficient in scikit-learn and also to beginners who want to learn machine learning basics and how to apply them on data.
The concepts are clearly described and their implementation is presented through useful and exciting data science problems, giving the reader a clear understanding of how to apply the ML tools on real problems. The code is very well organized and structured, and ready to be used as reference and as a starting point for future projects.
The flow of the book is constructed such that it can serve two purposes: it can be read to familiarize one with the machine learning techniques and how they are being applied on data without actually having to get into coding with Python, or it can be read as a ML course for those who want to learn ML with scikit-learn by studying the theory and applying it on real data problems throughout the reading process. Therefore, I recommend this book not only for Python users, but for anyone who wants to learn the basics of ML, and to see their applications without delving too deeply into ML theory and math. The book is a comprehensive self-contained literature for those who want to learn the basics of ML and to try them out on data.

5.0 out of 5.0 -

by gcgutier on Nov. 1, 2016

Fantastic introduction to machine learning in Python. The examples are well written, and do a very nice job of introducing both the implementation and the concept for each model. I'm halfway thru the book, and am really enjoying it.
I have a background in math and wrote software professionally for a number of years, but haven't spent much time doing either for the past 5-10 years. This book is technical enough to keep me interested, and accessible enough to allow me to ramp up on the language and the scikit framework.
An added bonus - the instructions actually allowed me to set up my development environment, and the code in the book actually runs!
100% recommend for someone looking to get started in ML with Python.

2.0 out of 5.0 -

by William P Ross on Nov. 20, 2016

I started determined to go through all the code samples in this book.
The book starts with only four sentences about the Jupyter notebook, although it is the main environment for the whole book. The first code sample shown starts on line two of a cell, and it was very strange that there was no line one. I was wondering if there was some type of misprint.
The code as printed is broken on page 10 where there is a line with 'display(data_pandas)'. This line gave me an error that display was unrecognized. I thought maybe this was a built-in Jupyter function so I went online to search. Eventually, I had to go to the author's GitHub and ask about this problem where I was told that he simply forgot to include 'from IPython.display import display'. It was a surprising admission because he did not say there was a misprint or mistake, but simply that he forgot to do that. It is very obvious there were zero technical reviewers for this book, because they would have also noticed the broken code right away.
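For readers who hit the same error, the fix described above can be sketched roughly as follows. This is only an illustration: the fallback to print and the small data dictionary are assumptions added here, not code from the book.

```python
# Sketch: make `display` available outside a Jupyter notebook as well.
try:
    # The import the review says was missing from the printed code.
    from IPython.display import display
except ImportError:
    # Assumption: when IPython is not installed, plain print is good enough.
    display = print

# Hypothetical stand-in for the book's `data_pandas` object.
data = {"Name": ["John", "Anna"], "Age": [24, 13]}
display(data)
```

Inside a notebook the import gives rich rendering of tables; in a plain script the same call simply prints the object.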
On page 11 we are introduced to a library called 'mglearn', which is a set of utility functions that the authors say they wrote for the book. Strangely, this repository has 733 stars on GitHub, so it is obvious the library is not just for the book. Then in chapter two the author has tons of calls to mglearn which take in multiple parameters. The parameters are never explained and you have to go to the author's GitHub to see what the code actually does. In the second chapter several of these mglearn calls broke for me. One seemed to be a conflict with numpy, and another I never figured out. I went to look at discussions on mglearn to discover it is still a work in progress, and there were sections where somebody was notifying the author that something was broken, and the author replying that he would look at it soon.
The second chapter has 120 cell entries for supervised learning techniques. Each cell has roughly 5-10 lines of code, so there are nearly 1000 lines of code for the second chapter and they are all tossed into one gigantic Jupyter notebook. Explanations are very weak often defaulting to a brief description followed by code and then more code. Function calls and parameters are rarely explained at all.
The last chapter is about natural language processing, which is the machine learning subject I am most familiar with. Terms are often introduced with zero effort to define them, and it is assumed you already know many of the concepts. TF-IDF barely had any explanation at all, except to show the formula for it. You can find much better explanations online.
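As a rough illustration of what the TF-IDF weighting mentioned here computes, the sketch below uses only the Python standard library. It assumes the smoothed variant that scikit-learn documents as its default (idf = ln((1 + N) / (1 + df)) + 1) and skips the final length normalization; the function name and the toy documents are hypothetical, and this is not necessarily the exact form printed in the book.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts within this document
        weights.append({
            # Assumed smoothed idf: ln((1 + N) / (1 + df)) + 1
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return weights

# Hypothetical toy corpus, already split into tokens.
docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
scores = tfidf(docs)
print(scores[1])
```

A word that appears in every document ("the") gets the lowest weight, while a word that appears in only one document ("dog") gets the highest, which is the intuition behind the formula.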
For a book which is so heavy on code and light on explanations, it is unacceptable that the code is broken.

5.0 out of 5.0 -

by Julien Julien on Jan. 6, 2017

This is a great book, and I'd say it is even great for those that are not familiar with python (you just obviously won't be able to run the code). For anyone with some basic understanding of linear algebra/statistics, the authors are able to present to you all the important (and sometimes subtle but significant) details, without the usage of equations, and more importantly, how they all relate to one another.
All the concepts mentioned here are heavily backed with well thought out and well presented figures, in such a way that again I'd suggest you don't even need python to understand. If you do know python, loading the data sets and reproducing the figures is just a few lines of easy-to-understand code away (with the exception of the mglearn library includes, which do some "plotting magic" for you). However, I believe each of them was appropriate. You can ignore them and make the plots in your own way, or just print the variables; the result just may not look as publication friendly.
Normally, I hesitate to purchase books that claim to explain algorithms without the need for equations, and I expect them rather to be cookbooks of lightly and disjointly explained techniques (like an encyclopedia). However, I do not think that is true of this book. The power of scikit-learn is demonstrated, and the algorithms behind it are explained intuitively, with attention to how they fit together and complement each other.
As with any introductory read, a supplement is needed from time to time, and the authors' reference to Elements of Statistical Learning is a useful one (equation heavy). There are points in the book where the authors defer to Elements of Statistical Learning. I found these points suitable since further explanation would be out of scope.
I read this book in my free time while on vacation, and much of the time I didn't have access to a computer. The concepts were so well presented that it was just a nice leisurely read. When I finally had time to access a computer, I was able to try the techniques on my data sets with some browsing back and forth through the book again, but otherwise with little effort.
Finally, since I myself am a researcher, I would recommend this book to any other researcher willing to start delving into the world of machine learning. Further reading will always be necessary, but this book will give you such a good intuitive understanding and overview of the subject matter that you'll know what to do to proceed next, and how to do it without running in circles. Even better, you'll likely already have applied it to your research!

5.0 out of 5.0 -

by Entrope on Jan. 23, 2018

This is an impressive treatment of machine learning without being math heavy like ESL. While it is entitled "Introduction to ...", it provides very good insights that go beyond what you might think an introduction would offer, with the benefit of seeing how to code with scikit-learn.
Picking this book up with no prior background and trying to learn everything about ML might not be a great idea. To me it's far better to start with any number of free or low cost online classes and then use this afterwards to reinforce and broaden your knowledge, because you can only learn so much from any single class or textbook. Statistical Learning offered on lagunita dot stanford dot edu is a good option, with some HW in R.
It looks like a few of the negative reviews here pertain to broken code. While I don't promote sloppiness, it is inevitable that code will break with ever-evolving library updates. And in my view, while I haven't tried to replicate the code line by line, to me that doesn't detract significantly from the usefulness and clarity of the explanations and insights in the same way it would in a paid class with hard deadlines. Be sure to check if you have the latest release if you have an ecopy. As of today, the latest is the 3rd release dated June 2017.
I was interested in more information about dimension reduction for image classification problems and was pleasantly surprised. Unless you have deep experience with ML, you will probably find something useful here. I got my copy from O'Reilly, which is why it is not a verified purchase here.

5.0 out of 5.0 -

by Amazon Customer on Nov. 29, 2016

I bought this book to help me get up and running quick for a project in an "Introduction to Machine Learning" independent study course. Of the books I bought for the same task, this was by far the most helpful for building practical machine learning applications.
The book is a great introduction to the scikit-learn framework which, in my opinion, is an extremely elegant machine learning tool kit.
Reading this book helped me improve the quality of the code I was developing for the project which dramatically improved the speed I could produce new results for the project.
If you are looking for an extremely theoretical text on machine learning, then you might want to look elsewhere.
If you are looking for a guided introduction to the "bread-and-butter tools" of a great machine learning framework in Python, buy this.

4.0 out of 5.0 -

by A quiet reader on Feb. 12, 2017

From O'Reilly and others, there's been a profusion of data science books in the past few years. Given that many of these books are intended to introduce readers to data science methods and tools, it's perhaps unsurprising that many of these books overlap at various points: you've got to introduce the reader to NumPy, pandas, matplotlib and the rest somehow, after all.
Müller & Guido's Introduction to Machine Learning with Python is distinct from many of these other works in both its stated aims and in its execution. In contrast to many of the more introductory books on data science, Müller & Guido give readers with a serious interest in the practice of machine learning a thorough introduction to scikit-learn. That is to say, their Introduction largely eschews coverage of the data science tools often treated in introductory data science texts (though they briefly note the other tools they draw upon in Chapter 1). At the same time, because their book focuses on practice and scikit-learn, they neither discuss the mathematical underpinnings of machine learning, nor do they cover writing algorithms from scratch.
What is here is a comprehensive overview of things already implemented in scikit-learn (which is a considerable amount, as they show). More precisely, they focus on classification and regression in supervised learning, and clustering and signal decomposition in unsupervised learning. If your interest falls in those areas (particularly the former), their coverage is quite good. Chapters 2 and 3 discuss the algorithms for supervised and unsupervised learning respectively, and in considerable detail. That said-- and though it's somewhat less thorough-- I might turn to the discussion of some of the same algorithms in Chapter 5 of VanderPlas' Python Data Science Handbook before Müller & Guido's; VanderPlas' treatment is more conversational and less dry. (Note, however, that Müller & Guido do cover more territory.) Similarly, I was left wanting more from Chapter 7's coverage of working with text.
Müller & Guido's book really shines, though, when it discusses all of the other things that go into machine learning, beyond their march through the algorithms themselves. Chapter 4 discusses ways to numerically model categorical variables, also (briefly) covering ANOVA and other techniques of feature selection; Chapter 5 covers cross-validation and techniques for carefully tuning model parameters; Chapter 6 compellingly explains the importance of using the Pipeline class to prevent data leakage (during preprocessing, for example); and Chapter 8 discusses where scikit-learn and Python fit within the wider horizons of machine learning. The strongest parts of the book, then-- and the parts where it's the most fun to read-- are where Müller & Guido discuss the practical details of machine learning. (One wonders if they felt a bit hamstrung by avoiding the mathematics of the algorithms they discuss.) There are points where the book is less engaging than other introductory data science books, but then it's not really in the same category; rather than an introductory overview of the entire landscape, Müller & Guido provide a clear, comprehensive, detailed guidebook to one particular part of the map.
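The data leakage that the review credits Chapter 6 with explaining can be illustrated with a small standard-library sketch. It does not use scikit-learn's Pipeline class itself; it only shows, under assumed toy data and hypothetical helper names, why preprocessing statistics should be learned from the training split alone, which is the step a pipeline takes care of inside cross-validation.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Learn the mean and standard deviation from the given values only."""
    return mean(values), pstdev(values)

def transform(values, params):
    """Standardize values with previously learned (mean, std) parameters."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]

# Hypothetical one-feature dataset; the test split holds an extreme value.
train, test = [1.0, 2.0, 3.0, 4.0], [100.0]

leaky = fit_scaler(train + test)  # statistics see the test data (leak)
clean = fit_scaler(train)         # statistics come from training data only

print(transform(train, leaky))
print(transform(train, clean))
```

With the leaky scaler, the training features already carry information about the held-out value, so any score measured on that value is no longer an honest estimate.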

5.0 out of 5.0 -

by Amazon Customer on Dec. 30, 2017

This book covers all of the basics. It goes into ample detail about the assumptions that are made with different models. It also talks extensively about various algorithms fit for different, real life problems. I have read 5 different Machine Learning books, and this is the best. It is even better than "Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems" (although that book is very good too). I think Muller and Guido do a stellar job of incorporating algorithmic examples and providing enough information for consumers to grasp fundamental concepts. Although the book is an introduction to Machine Learning with python, one should have basic (and I mean very basic) knowledge of python. Awesome book! Would recommend to anyone.

5.0 out of 5.0 -

by Jonquille on July 12, 2017

I've attended Andreas's ODSC sessions, where he works thru the examples and adds color commentary.
A clear writer/speaker. Very good, look forward to his next book(s).

5.0 out of 5.0 -

by NYIsalnders fan on Dec. 3, 2017

Great book for an intermediate data scientist. Talks about numpy and pandas a lot, so it's great!