Real-World Machine Learning

Summary

Real-World Machine Learning is a practical guide designed to teach working developers the art of ML project execution. Without overdosing you on academic theory and complex mathematics, it introduces the day-to-day practice of machine learning, preparing you to successfully build and deploy powerful ML systems.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Technology

Machine learning systems help you find valuable insights and patterns in data, which you'd never recognize with traditional methods. In the real world, ML techniques give you a way to identify trends, forecast behavior, and make fact-based recommendations. It's a hot and growing field, and up-to-speed ML developers are in demand.

About the Book

Real-World Machine Learning will teach you the concepts and techniques you need to be a successful machine learning practitioner without overdosing you on abstract theory and complex mathematics. By working through immediately relevant examples in Python, you'll build skills in data acquisition and modeling, classification, and regression. You'll also explore the most important tasks like model validation, optimization, scalability, and real-time streaming. When you're done, you'll be ready to successfully build, deploy, and maintain your own powerful ML systems.

What's Inside




Predicting future behavior
Performance evaluation and optimization
Analyzing sentiment and making recommendations

About the Reader

No prior machine learning experience assumed. Readers should know Python.

About the Authors

Henrik Brink, Joseph Richards and Mark Fetherolf are experienced data scientists engaged in the daily practice of machine learning.

Table of Contents



THE MACHINE-LEARNING WORKFLOW
What is machine learning?
Real-world data
Modeling and prediction
Model evaluation and optimization
Basic feature engineering

PRACTICAL APPLICATION
Example: NYC taxi data
Advanced feature engineering
Advanced NLP example: movie review sentiment
Scaling machine-learning workflows
Example: digital display advertising

264 pages, Paperback

First published March 2, 2016


About the author

Henrik Brink

6 books, 3 followers

Ratings & Reviews

Community Reviews

5 stars: 26 (31%)
4 stars: 41 (50%)
3 stars: 14 (17%)
2 stars: 1 (1%)
1 star: 0 (0%)
Displaying 1 - 14 of 14 reviews
dmacjam
3 reviews
December 18, 2016
The book is divided into two parts. The first part deals with the machine learning workflow and the second part consists of five practical examples. The book focuses only on supervised machine learning, specifically regression, classification, recommendation and imputation. True to the "real-world" in its title, it contains examples in Python with the scikit-learn and pandas libraries, which are among the most popular libraries for machine learning in Python.

The first chapter explains what machine learning is and when it is useful. The next chapters cover data preprocessing, data visualization techniques and feature engineering. The reader can try classifying a linear problem with logistic regression and a non-linear problem with an SVM. For regression problems, linear regression and random forest are applied. Later chapters explain evaluation metrics, cross-validation and model optimization by brute-force search over the hyper-parameters.
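To make the last point concrete, here is a minimal sketch of cross-validation combined with a brute-force hyper-parameter search. It is not the book's code (the book uses scikit-learn); it is a standard-library illustration that assumes a synthetic two-cluster dataset and a hand-rolled k-nearest-neighbors classifier, and every function name in it is hypothetical.

```python
import random

def make_data(n=60, seed=0):
    """Two noisy 2-D clusters labelled 0 and 1 (synthetic, for illustration only)."""
    rng = random.Random(seed)
    points = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        points.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return points

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(
        train,
        key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2,
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def cross_val_accuracy(data, k, folds=5):
    """Mean held-out accuracy over `folds` disjoint validation splits."""
    scores = []
    for f in range(folds):
        valid = data[f::folds]
        train = [p for i, p in enumerate(data) if i % folds != f]
        correct = sum(knn_predict(train, x, k) == y for x, y in valid)
        scores.append(correct / len(valid))
    return sum(scores) / folds

def grid_search(data, candidate_ks):
    """Brute force: score every candidate value and keep the best one."""
    results = {k: cross_val_accuracy(data, k) for k in candidate_ks}
    best_k = max(results, key=results.get)
    return best_k, results

data = make_data()
best_k, results = grid_search(data, [1, 3, 5, 7, 9])
print("best k:", best_k)
print("cross-validated accuracy per k:", results)
```

The search is "brute force" in the sense that every candidate is scored on held-out folds and the best mean score wins. In scikit-learn the same idea is usually expressed with `GridSearchCV`.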

The second part of the book, with its practical examples, is more interesting. Each chapter starts with an easy solution that is iteratively improved. The first chapter covers the whole machine learning pipeline, from exploring the data to predicting passenger tipping habits on the NYC taxi data. The next two chapters are about advanced feature engineering, with examples of extracting features from text (modelling review sentiment), images (modelling edges, shapes) and time-series data.

The algorithms are described at a high-level with no math at all. Moreover, there are best practices and references to dig deeper into each algorithm covered. Each chapter contains a vocabulary of terms and summary, which makes it clear what topics are covered in a chapter and it is possible to read each section separately.

To sum up, this book is a brief introduction to machine learning. It gives you a great overview of the machine learning lifecycle with practical code examples for each part. I would recommend the book to anyone proficient in Python who is interested in learning machine learning. When you read the book, you gain practical skills in preparing the data, extracting features and building a prediction model for several types of problems. It is a good starting point for exploring the machine learning world, because it covers most of the machine learning algorithms used in the industry. However, to know which algorithm to choose for a particular problem and how to choose the right hyper-parameters for a model, in my opinion you definitely need theoretical knowledge of the algorithms. For me the first part of the book was easy and obvious, but the second, practical part was really great, especially the chapter about scalability.

Covered algorithms: logistic regression, linear regression, k-nearest neighbors, SVM, decision tree, random forest.
Michal Paszkiewicz
Author of 2 books, 8 followers
July 19, 2017
A great read about the theory of Machine learning with practical examples of real world applications that gave clear explanations of how the processes and various algorithms work. However, I must say I preferred Manning's "Machine Learning in Action" which I found considerably more readable, partly due to the fact that the structure of "Real-World Machine Learning" was just not quite as clear. But this is still a good read that I would recommend. Get cracking!
Chris Esposo
680 reviews, 56 followers
August 30, 2020
This book is a wonderful technical text, suitable as a supplement to a more formal text in the subject area. It could easily be paired with a book like “Introduction to Statistical Learning” by Tibshirani et al., and that would be a good combo, as that book is written in R and this one is focused on Python, so it would give the student exposure to both of the current (circa 2020) main languages of data science/machine learning. Real World Machine Learning is very much a “project”-centric text, where roughly the first half is a grounded, coding-centric introduction to machine learning, and the second half provides nice mini-projects and, importantly, a walk-through of those projects within the currently popular PyData API/stacks.

But again, the real jewels of the book are in the second half, which goes through step-by-step analysis of real-world data in a handful of projects. In each step, the student sees visualizations, and the author discusses the motivations for the visualizations in the EDA, why certain features are selected, why certain models are selected, how to execute them, and how to test their use. As someone who learned this material before the explosion in data science material of the past few years, I can genuinely say I wish I had something like this when I was first learning it, especially as validation that the steps I was making in my own professional projects made sense, and that I wasn't wandering in the woods randomly (something data scientists sometimes do).

Despite this practical view on the material, I really appreciated that the book did not dumb down the apparatus, and went into sufficient mathematical detail when appropriate. The book does not only discuss classifiers, logistic regressions, and trees, but also goes into a high-level mathematical derivation of logistic regression via a discussion of the log-odds ratio. Further, it goes into fairly decent length on the nature of classification and how one may adjudicate the ‘quality’ via the FPR/FNR, as well as the challenge of classification in non-linearly separable data, with some basic suggestions on how to deal with these via kernel methods.
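As a rough illustration of the quantities mentioned here (again, not the book's own code), the log-odds form of logistic regression and the two error rates can be sketched in plain Python. Logistic regression models the log-odds log(p / (1 - p)) as a linear function of the features; the false positive rate is FP / (FP + TN) and the false negative rate is FN / (FN + TP). The labels and predictions below are made-up values, and the function names are hypothetical.

```python
import math

def log_odds(p):
    """Logit: the log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of the logit: maps a linear score back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Logistic regression: the log-odds are a linear function of x."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)

def error_rates(y_true, y_pred):
    """Return (FPR, FNR) for binary labels, where 1 is the positive class.

    Assumes both classes occur in y_true, otherwise a denominator is zero.
    """
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return fp / (fp + tn), fn / (fn + tp)

# Hypothetical labels and predictions, for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]
fpr, fnr = error_rates(y_true, y_pred)
print(fpr, fnr)  # 0.25 0.5
```

One of four negatives is flagged positive (FPR 0.25) and two of four positives are missed (FNR 0.5). Which of the two errors matters more depends on the application.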

Going back to the practical view, this book also equips the reader with a very clear and organized template for both the data-quality analysis in pre-processing and the standard machine learning pipeline for supervised machine learning with prediction use-cases as the goal. Whether conceptually, in code, or diagrammatically, the author hammers in the process flow, and introduces all of the appropriate graphs (and their accompanying code snippets) to the reader so they can understand sequentially/logically what they are doing and why they are doing it. Further, these code snippets are often complete functions, with syntactic hygiene and some design with respect to exception handling, which incorporates elements of basic programming that will be key to a data scientist’s success.

Overall, I liked this book, another good installment from the Manning series. I think the audiobook is a good asset to consume as a refresher for more senior scientists, who may be able to pick up one or two tricks or just solidify their understanding of some elements, and it is definitely great for introductory students in the subject area. It’s definitely not a total substitute for a deeper education on the subject, but those are very specific needs that are moving more towards the ML/AI engineering domains nowadays anyways. Recommended.
Evan Oman
31 reviews, 2 followers
August 6, 2019
A really good, practical overview of the ML development process. The book is light on ML theory but this allows the authors to explore more general themes. There are several practical case-studies throughout the book covering the full process: data collection, pre-processing, feature engineering, model training, and model evaluation.

The only disappointment was there was very little discussion of "deep learning" techniques -- the book focuses on feature engineering approaches.

All things considered, I thought this was an excellent book, and it would be a great introduction to ML for engineers who are interested in the process first, before diving into the details of learning algorithms.
Arthur Saveliev
8 reviews, 1 follower
December 27, 2017
If you’re a practical learner like me then this is your gateway to ML.

It covers a variety of topics around ML, with enough theory to make you understand but not so much that your head spins. It’s up to you to dig deeper on the topics. The author also provides references and tool recommendations.

The final chapters include full-fledged examples to practice your newly obtained superpowers.
Raj Sadasivam
20 reviews
September 19, 2018
It’s a good book for beginners, or shall I say semi-beginners. The authors don’t explain each technique in detail; rather, the process of ML is explained in detail with a few live examples. Further reading would be required. Some portions of the book need advanced mathematical knowledge, such as matrix operations and algebra. Overall a good read.
Denis Romanovsky
215 reviews
December 23, 2017
This book is very good because it is focused on the complexity of real-life machine learning projects. So algorithms are not the main problem - it’s the data and the many iterations of trial and error.
William Cruise
18 reviews, 8 followers
April 9, 2023
A little bit dated, but any ML book is as soon as it's printed. For such a short book (< 250 pages) the author covers a wide range of topics, with good explanations of the fundamentals.
Psyckers
243 reviews, 3 followers
December 30, 2023
A great book for those who want a complete understanding of machine learning for automation and AI models. It provides extensive notes on what to look for when constructing such models, and on how errors and bias can creep in. The second half of the book provides 5 case studies that you can adapt to your own model requirements.
Though technically this is now an old book on the subject, the fundamentals prescribed in it are still relevant and are worth considering in any ML plans and executions.
Donn Lee
382 reviews, 6 followers
August 3, 2019
I see this book as a great primer for machine learning, and I highly recommend it for beginner-intermediate practitioners of data science. True to their title, the concepts that the authors go through cover most of the most common real-world use-cases for machine learning without going into too much detail on the theory. It is the kind of book I wish I had three years ago when I was starting out on my machine learning journey.

For those like me, who have had formal data science training or who already have had non-insignificant exposure to machine learning, this book will probably be a little less "dramatically useful".

But though I cannot say that I learned anything new, the book itself reminded me of many of the core activities that one should consider when tackling a machine learning project, some of which I must admit I had forgotten about.

UPDATE (3 Aug 2019): Re-read this book, and my thoughts above still hold. It was a great book to pick up again as it helped refresh my memory on a lot of data science concepts.
Kanstantin Tsiarokhin
24 reviews
November 19, 2017
This is the book you should read if you are a beginner in the world of machine learning, because it is full of useful information with lots of code examples (you need to know basic Python syntax to understand them). The only bad feature of the book is that there are about 20-30 pages that are really hard to read because of difficult math formulas, which are not understandable for everyone.