
Introduction to Machine Learning, Part 1

Over the holidays I read Machine Learning for Dummies: IBM Limited Edition, by Judith Hurwitz and Daniel Kirsch (please see full citation at the bottom of this post). Although the "Dummies" series tends to be written for a more business-oriented audience, I found that it was a good introduction for a layman like me. I wanted to summarize some of the things I learned from the book so that I could remember them later. (And for you too, but honestly the book is less than 100 pages so just get out there and read it!) Please note that all the information presented in this post is from Hurwitz and Kirsch, except where otherwise cited.

(Image source: https://blog.algorithmia.com/introduction-machine-learning-developers/)

First of all, a definition: machine learning applies mathematical algorithms to large data sets to develop accurate models that can predict future outcomes. Or, according to Hurwitz and Kirsch: "Machine learning is a form of AI that enables a system to learn from data rather than through explicit programming." (Hurwitz and Kirsch, 2018, page 4.) Some areas where machine learning is already being implemented include online advertising, retail, and medical imaging [1].
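To make that definition concrete, here is a minimal sketch of my own (not from the book) contrasting explicit programming with learning from data: rather than hard-coding a pricing rule, we fit a line to example data and let the data determine the rule. The house-price numbers are invented for illustration.

```python
# Instead of hard-coding a rule like "price = 3 * size", we let a simple
# model learn the relationship from example data (ordinary least squares).

def fit_line(xs, ys):
    """Least-squares fit for one feature: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy training data: house sizes (square meters) and prices (thousands).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(sizes, prices)
print(round(slope * 90 + intercept))  # predicted price for a 90 m^2 house: 270
```

The point is that `fit_line` never contains the pricing rule itself; the relationship is recovered from the data, which is exactly the distinction Hurwitz and Kirsch draw.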

The mathematics and foundations for machine learning have existed for a long time; for example, the mathematician Thomas Bayes, whose theorem made a huge contribution to the field, was working back in the 1700s (Source: https://en.wikipedia.org/wiki/Thomas_Bayes). However, machine learning has only relatively recently become a reality, thanks to technology developments such as [1]:

  1. The development of modern processors with a high performance-to-density ratio (more computing power packed into less silicon)

  2. Reduced cost of data storage and associated improvement in storage performance

  3. Clustered computing (where large data sets can be processed by a group of computers simultaneously)

  4. Increased repositories of commercial data sets with information on everything from weather to social media. Often readily available as cloud services and Application Programming Interfaces (APIs)

  5. Increased development of machine learning algorithms in multiple programming languages, with large developer communities that are often open source

  6. The development of visualization tools that do not require programming, allowing mathematically-inclined folks without programming experience to still analyze data sets
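Since Bayes came up above, here is a quick worked example of his theorem, P(A|B) = P(B|A)·P(A) / P(B), applied to a classic machine learning use case. The scenario and all the numbers are my own illustration, not from the book.

```python
# Bayes' theorem for spam filtering:
#   P(spam | word) = P(word | spam) * P(spam) / P(word)
# Toy numbers: 20% of email is spam; the word "free" appears in 60% of
# spam messages but only 5% of legitimate ones.

p_spam = 0.20
p_word_given_spam = 0.60
p_word_given_ham = 0.05

# Law of total probability: chance of seeing the word in any message.
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

p_spam_given_word = p_word_given_spam * p_spam / p_word
print(round(p_spam_given_word, 2))  # -> 0.75
```

So a single word can raise the probability that a message is spam from 20% to 75%, which is why Bayes-style reasoning remains a workhorse of the field centuries later.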

As I mentioned earlier, machine learning uses large data sets to refine mathematical models so that they are more accurate for the particular application. When I say "large data sets," I really mean "big data," a big buzzword today. According to Hurwitz and Kirsch, big data has 4 key attributes [1]:

  1. Volume: there are large amounts of data

  2. Velocity: the data can be moved at high speed

  3. Variety: the data sources are varied

  4. Veracity: the data is an accurate representation of the truth

Hurwitz and Kirsch spend a lot of time emphasizing the need for accurate and clean data in machine learning. They explain that machine learning developers need to understand where the data comes from, how it is collected, and the context in which it exists. Often the data also has to be cleaned up after collection before a model can use it. They argue that a machine learning algorithm is only as good as the data it is based on [1].
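What "cleaning up" data looks like in practice can be sketched in a few lines. This is my own toy illustration (not from the book): dropping records with missing values and normalizing inconsistent labels, two of the most common chores before any model training.

```python
# Minimal data-cleaning sketch: discard incomplete records and normalize
# inconsistent country labels before handing the data to a model.

raw_records = [
    {"age": 34, "country": "USA"},
    {"age": None, "country": "Canada"},        # missing value -> dropped
    {"age": 28, "country": "usa"},             # inconsistent capitalization
    {"age": 45, "country": " United States"},  # whitespace and a synonym
]

COUNTRY_ALIASES = {"usa": "USA", "united states": "USA", "canada": "Canada"}

def clean(records):
    cleaned = []
    for rec in records:
        if rec["age"] is None:  # incomplete row: no safe way to use it
            continue
        key = rec["country"].strip().lower()
        cleaned.append({"age": rec["age"],
                        "country": COUNTRY_ALIASES.get(key, rec["country"].strip())})
    return cleaned

print(clean(raw_records))
```

Even in this tiny example, a quarter of the records are unusable and half need repair, which backs up the authors' point that data quality work dominates real machine learning projects.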

The book also spends some time explaining that one way companies can affordably handle the huge computational workloads machine learning requires is by using a "hybrid cloud." A hybrid cloud combines public cloud services with private ones (i.e. owned by the company) that work together to perform computations. Graphics processing units (GPUs) are the powerful computer chips typically used for these computations. (Any obsessive gamer who has built their own computer will know what a GPU is, because it is arguably the critical component in building a gaming computer capable of processing and rendering graphics quickly enough for the most computationally expensive games on the market today.) It can be expensive for a company to build its own bank of GPUs for processing data, so it can be more cost-effective to contract public cloud services instead [1].

Okay, that's all I want to cover in this post, but next up we'll discuss machine learning in some more detail.

[1] Hurwitz, J. and Kirsch, D. 2018. Machine Learning for Dummies: IBM Limited Edition. Hoboken, NJ: John Wiley & Sons, Inc.

