Top (max 10) reviews: Web Scraping with Python (Community Experience Distilled).

3.8 out of 5.0    14 total reviews.

4.0 out of 5.0 -

by KIRAN RAMA on Jan. 31, 2018

Poor quality writing - Not worth the money. It is better to use the BeautifulSoup package in Python.

1.0 out of 5.0 -

by TheLyingThief on May 14, 2017

Another Packt-poor exposition. Script elements are presented without intersecting preceding or subsequent elements; there is no code-base one can download to check the scripts for errors; the subsidiary bitbucket repository (code that supports offline website access) is not explained, nor is it functional - initial scripts require a 'sitemap.xml' to get links, and this document cannot be accessed (returns error)! I don't know about these other guys, but as I have tried to work my way through the text, I only encounter issue on issue, poor explanation after poor explanation, dangling instruction upon instruction.
Packt offers the most uneven of technical material, and this is one that really requires editing. Hilariously, Packt's reviewers/editors don't speak English natively, or if they do, it's hard to believe they've finished grade school.
Go with the O'Reilly book on web scraping; it's no gem, but the code works.
Why is software so poorly designed? Because software/technical writers like Lawson lack an orderly approach to their work.
Ultimately, a disappointment made more bitter by the promise.
tlt

4.0 out of 5.0 -

by Julian Cook on Dec. 29, 2015

This is a good up-to-date book on grabbing data from web sources, especially for Python users who are not professional developers. I have found that any time you collect data, you normally end up either needing to auto-refresh the data, or you end up needing meta-data about the raw data. He has a good example, which is country statistics - something that you may need to 'normalize' other statistical data you are already working with.
Typically Python libs dealing with URL fetching have very simple examples of the parser.parse("hello world") type, so you have no idea what to do with a real web page, like a 'needs login' page.
The author, to his credit, does not tell you to download a 'magic' Python library. In these cases he gives a thorough walk-through on how to research the structure or scripting of the page and then go about fetching it via Python. Most of the book is in fact devoted to analysis, rather than action.
The only parts that didn't seem particularly useful to me were the chapters on creating crawlers and spiders. I don't see myself doing any of that and I don't see why an amateur or professional would use them. A professional would probably use Elasticsearch, for instance.
Other than that the book will probably be useful for a long time.

3.0 out of 5.0 -

by Amazon Customer on Nov. 9, 2017

As reviewed, some examples in the first chapter (e.g. 'sitemap_crawler') don't work, and the subsidiary Bitbucket repository doesn't function. It's a shame.
But almost all the other examples work well (I'm starting chapter 4 now). I think this book has very good resources for anyone interested in web scraping. At least, you can practice with a real example website.
The O'Reilly web scraping book and this one can be complementary. No book guides perfectly.
Notice: The example webpage addresses changed a little. Therefore you should change the addresses in your code, too.
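
For readers stuck on the sitemap step that the reviews mention, here is a minimal sketch of pulling links out of a sitemap document. It is not the book's 'sitemap_crawler' code: the function name `links_from_sitemap` and the example.com URLs are assumptions made for illustration, and the sample sitemap is an inline string, so nothing is downloaded.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset>, <url> and <loc>.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def links_from_sitemap(xml_text):
    """Return the URLs found in the <loc> elements of a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content; a real one would be fetched from the site.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/view/1</loc></url>
  <url><loc>http://example.com/view/2</loc></url>
</urlset>"""

links = links_from_sitemap(SAMPLE_SITEMAP)
print(links)
```

If the example site's addresses have changed, as the notice above says, only the URL used to fetch the sitemap would need updating; the parsing step stays the same.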

5.0 out of 5.0 -

by Scott Prime on Dec. 17, 2015

Having no idea where to start, I've found this to be a perfect introduction to getting information from the web. The code examples are for Python 2.7, though I was able to convert them and muddle through using the 2to3.py utility.
I would definitely recommend this book.
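
As a rough illustration of the kind of change this conversion involves, the sketch below shows a Python 3 download helper. In Python 2.7 the same calls lived in the `urllib2` module, which Python 3 moved into `urllib.request`. The helper names `build_request` and `download`, the `example-scraper` user agent, and the example.com URL are assumptions for illustration, not the book's code.

```python
import urllib.request  # Python 3 home of what Python 2.7 called urllib2


def build_request(url, user_agent="example-scraper"):
    """Build a request with a custom User-Agent header without sending it."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, user_agent="example-scraper", timeout=10):
    """Fetch a page and return its body as text (performs a network call)."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Only the request is built here, so the example runs offline.
request = build_request("http://example.com/")
print(request.full_url)
```

Calling `download("http://example.com/")` would perform the actual fetch. Other common differences from Python 2.7 code are that `print` is a function and that downloaded bytes must be decoded explicitly.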

5.0 out of 5.0 -

by bbeny on March 23, 2017

I am a python guy and wanted to get into web scraping. This book was perfect for me. Very well written and great examples. Hope to see more from this author in the future.

5.0 out of 5.0 -

by Tim Crothers on Dec. 4, 2015

Hands down the best resource I've found for practical examples of how to write web scrapers in Python. The author's style is very easy to read and very practically focused. He also clearly knows the subject inside and out and does a great job of not only showing you actual working code to do everything but also covering multiple approaches for different situations as well as key pitfalls to avoid.

2.0 out of 5.0 -

by ps on April 28, 2016

This would be a good book if only it was updated to Python 3. As it is, it's dated; most examples won't work at all (unless you fall back to Python 2). Some code can be tweaked to work with modern Python, but at times, like in chapter 6, the damage renders it useless. Overall a good book, but for a previous age.

5.0 out of 5.0 -

by John Osborne on Dec. 4, 2015

If you are looking for ways to automatically gather and curate data off of webpages, this book is probably for you. There is a chapter on Scrapy, a chapter on dealing with CAPTCHA, a chapter on handling dynamic (i.e. JavaScript-based) pages, and a chapter on concurrent downloads, plus a few others covering housekeeping details like parsing scraped pages and caching. All in all, a book that provides a broad foundation on web scraping.