Machine Learning for Beginners

Daniel Albertini

Engineering Fellow and Co-Founder at Anyline

Jan 21, 2017

In recent years, interest in Machine Learning and Deep Learning has surged. The term “Machine Learning” (ML) was first defined in 1959 as “the ability to learn without being explicitly programmed”. This basically means that a machine can learn from past experience and adjust its own behavior in order to work better. Computers are therefore trained to learn from data.

Artificial Intelligence (AI), machine learning, and deep learning are often mixed up, even though they are distinct fields of research and technology. Unlike AI, which is, in a nutshell, about machines mimicking cognitive functions associated with humans, machine learning and deep learning are closely related to statistics.

If you are interested in machine learning, we want to give you a brief introduction and hope to show you some interesting ways to start your ventures into this exciting area of technology.

Basics of Machine Learning

The difference between machine learning and classic approaches to software programming is that machines learn from experience and don’t have to stick to hardcoded rules. Among many other things, machine learning technology has already brought us to the brink of (semi-)autonomous, self-driving cars, search engines like Google delivering better results, Facebook’s algorithms proposing more relevant posts and ads, as well as general speech and face recognition capabilities. Google’s machine learning framework TensorFlow can even help protect endangered sea cows. Yes, you heard that right!

The first thing you have to ask yourself when starting with machine learning is whether your problem is solvable with this technology. If that’s the case, the next step is one of the biggest and most important tasks: collecting data and preparing it properly for training. This can be any data, from Excel sheets to text files, audio files, images, and so on.

The better the variety, density, and volume of relevant data, the better the machine’s learning prospects. When you prepare the data for training, you therefore have to check its quality and whether any values are missing. Having the best data set(s) possible should be one of your main goals.
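A quick check for missing values could look like the following minimal sketch. The animal records, the field names, and the helper functions `count_missing` and `drop_incomplete` are made up for illustration, and dropping incomplete rows is only one of several possible ways to handle gaps.

```python
# Hypothetical records for illustration; None marks a missing value.
records = [
    {"texture": "smooth", "weight_kg": 3600.0, "label": "Whale"},
    {"texture": "hairy", "weight_kg": None, "label": "Dog"},
    {"texture": None, "weight_kg": 2700.0, "label": "Whale"},
    {"texture": "hairy", "weight_kg": 30.0, "label": "Dog"},
]


def count_missing(rows):
    """Count missing (None) values per field across all rows."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts.setdefault(field, 0)
            if value is None:
                counts[field] += 1
    return counts


def drop_incomplete(rows):
    """Keep only rows in which every field has a value."""
    return [row for row in rows if all(v is not None for v in row.values())]


missing = count_missing(records)
clean = drop_incomplete(records)
print(missing)
print(len(clean), "of", len(records), "rows are complete")
```

Whether you drop incomplete rows or fill the gaps depends on how much data you can afford to lose.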

Another important step before you can start training your machine learning model is pre-processing the data for the classifier. Depending on the domain (images, audio files, measurements, …), different pre-processing techniques are applied. But the main idea is to make the data “pretty” for the classifier so that it can learn its parameters much faster. A common example is subtracting the mean value of the training samples and dividing by their standard deviation.
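As a rough sketch of that last pre-processing step, the snippet below subtracts the mean and divides by the standard deviation using only Python’s standard library. The function name `standardize` and the sample weights are assumptions made for this example.

```python
from statistics import mean, pstdev


def standardize(train, values):
    """Scale values with the mean and standard deviation of the training samples."""
    mu = mean(train)
    sigma = pstdev(train)  # population standard deviation of the training set
    if sigma == 0:
        raise ValueError("training samples have zero variance")
    return [(v - mu) / sigma for v in values]


# Hypothetical training feature: animal weights in kg.
train_weights = [30.0, 2700.0, 3600.0]
scaled = standardize(train_weights, train_weights)
print([round(s, 3) for s in scaled])
```

After scaling, the training values have a mean of 0 and a standard deviation of 1. New samples should be scaled with the same training statistics, not with their own.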

Different Learning Methods in Machine Learning

In general, there are three different learning methods:

  • In Unsupervised Learning, the machine learns from an unlabelled data set. Simply put, this means that you don’t tell the machine what is what. It therefore tries to figure this out by itself, clustering the data into different groups and learning from that.
  • Reinforcement Learning means that the model tries to figure out the parameters that maximize the outcome using only a simple reward feedback loop. Imagine you want to train a dog. He doesn’t know which tricks you want him to learn, but you reward him with positive feedback such as treats or praise so he gets an idea. And every time you repeat this, the dog will know the trick better. This may sound funny, but it’s actually quite the same with machine learning!
  • In Supervised Learning, labeled data is provided, and the model learns its parameters so that the number of misclassified samples (based on an error function) is minimized.
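The reward feedback loop from the dog example can be sketched in a few lines. This is a toy illustration under our own assumptions, not a full reinforcement learning algorithm: the trick names, the function `train_dog`, and the fixed reward of 1 for the wanted trick are all invented for this example.

```python
import random


def train_dog(tricks, rewarded_trick, episodes=200, epsilon=0.1, seed=0):
    """Learn which trick earns a treat from reward feedback alone."""
    rng = random.Random(seed)
    value = {t: 0.0 for t in tricks}  # estimated reward per trick
    tries = {t: 0 for t in tricks}    # how often each trick was tried

    def attempt(trick):
        reward = 1.0 if trick == rewarded_trick else 0.0  # treat or no treat
        tries[trick] += 1
        # Running average of the rewards seen for this trick.
        value[trick] += (reward - value[trick]) / tries[trick]

    for trick in tricks:  # try every trick once first
        attempt(trick)
    for _ in range(episodes):
        if rng.random() < epsilon:
            trick = rng.choice(tricks)         # explore: try something random
        else:
            trick = max(value, key=value.get)  # exploit: repeat the best guess
        attempt(trick)
    return value, tries


value, tries = train_dog(["bark", "sit", "roll"], rewarded_trick="sit")
best = max(value, key=value.get)
print(best, tries)
```

The dog is never told which trick is wanted. It only sees the treats, and the running averages end up favoring the rewarded trick.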

You can also combine these learning methods. For example, you could start with unsupervised learning and then move on to supervised learning at a later stage to improve the training.
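To make the unsupervised case more concrete, here is a minimal sketch that groups unlabelled weights into two clusters with a basic k-means loop. The data, the function name `two_means`, and the choice of two clusters are assumptions for illustration. Real data usually has more than one feature.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two clusters with a basic k-means loop (k = 2)."""
    centers = [min(values), max(values)]  # simple deterministic starting points
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearest center.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical unlabelled weights in kg: nobody told the machine what is what.
weights_kg = [28.0, 30.0, 35.0, 2700.0, 3100.0, 3600.0]
centers, groups = two_means(weights_kg)
print(groups)
```

The clusters found this way could later be given labels by hand and used as a starting point for supervised training.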

Training a Classifier in Machine Learning

As we’ve just learned, supervised training requires a well-labeled data set. The goal is then to train a classifier that will be able to label the data by itself. So, if I wanted to distinguish between a whale and a dog, I would have to define features that describe the characteristics of the data, in this case animals. Some features would be, for example, the shape, the length, the weight, and the texture. To begin with, a set of training data could look something like this:

Texture   Weight     Label
smooth    3,600 kg   Whale
hairy     30 kg      Dog
smooth    2,700 kg   Whale

With this data, the classifier can be trained. The input to the classifier is the features, and the output should be the correct label.
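As a minimal sketch of such a training step, the code below learns a single weight threshold from the three rows in the table above by picking the value that misclassifies the fewest training samples. The function names `train_threshold` and `predict` are hypothetical. For simplicity the sketch ignores the texture feature, and a real classifier would need far more data.

```python
# Training data from the table above: (texture, weight in kg, label).
training_data = [
    ("smooth", 3600.0, "Whale"),
    ("hairy", 30.0, "Dog"),
    ("smooth", 2700.0, "Whale"),
]


def predict(threshold, weight_kg):
    """Label anything heavier than the threshold a whale, otherwise a dog."""
    return "Whale" if weight_kg > threshold else "Dog"


def train_threshold(samples):
    """Pick the weight threshold that misclassifies the fewest training samples."""
    weights = sorted(w for _, w, _ in samples)
    # Candidate thresholds: midpoints between neighbouring weights.
    candidates = [(a + b) / 2 for a, b in zip(weights, weights[1:])]

    def errors(threshold):
        return sum(predict(threshold, w) != label for _, w, label in samples)

    return min(candidates, key=errors)


threshold = train_threshold(training_data)
print(threshold)                   # 1365.0
print(predict(threshold, 45.0))    # Dog
print(predict(threshold, 3000.0))  # Whale
```

The learned threshold is the classifier’s only parameter here. More realistic models learn many parameters in the same spirit.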

With a black-and-white picture, for example text on a sheet of paper, it basically works the same way. If you wanted to tell a 1 from an 8, you would simply use different features to train the classifier. The features could be, for example, whether the digit has round strokes or whether it has holes.
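The “holes” feature can be computed with a simple flood fill. The sketch below counts background regions that are completely enclosed by ink. The tiny hand-drawn bitmaps and the function name `count_holes` are assumptions for illustration, and real scanned digits would need cleanup first.

```python
def count_holes(bitmap):
    """Count background regions that are fully enclosed by ink ('#')."""
    # Pad with a background border so the outside forms one connected region.
    width = len(bitmap[0]) + 2
    grid = [" " * width] + [" " + row + " " for row in bitmap] + [" " * width]
    height = len(grid)
    seen = [[False] * width for _ in range(height)]

    def flood(start_r, start_c):
        """Mark every background cell reachable from the start cell."""
        stack = [(start_r, start_c)]
        seen[start_r][start_c] = True
        while stack:
            r, c = stack.pop()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                inside = 0 <= nr < height and 0 <= nc < width
                if inside and not seen[nr][nc] and grid[nr][nc] != "#":
                    seen[nr][nc] = True
                    stack.append((nr, nc))

    flood(0, 0)  # mark the outside background
    holes = 0
    for r in range(height):
        for c in range(width):
            if grid[r][c] != "#" and not seen[r][c]:
                holes += 1  # found a new enclosed region
                flood(r, c)
    return holes


ONE = [
    "  #  ",
    " ##  ",
    "  #  ",
    "  #  ",
    " ### ",
]
EIGHT = [
    " ### ",
    " # # ",
    " ### ",
    " # # ",
    " ### ",
]
print(count_holes(ONE), count_holes(EIGHT))  # 0 2
```

A feature like this turns a picture into a number that a classifier can use as input.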

Machine Learning Tutorials & Resources

So, enough with the terms and the theoretical stuff. When it comes to tutorials, there are many great websites with videos and papers on how to start working with machine learning:

  • Springboard offers a free course on machine learning in Python. You’ll learn computer science techniques and how to build basic deep neural networks. It’s a great introduction to data science and can be completed in 2-4 weeks.
  • Udacity is always a good tip when it comes to programming courses. This intro to machine learning course is free and takes about 10 weeks. The required programming skills are listed as “intermediate”.
  • On this CS231N website, you can find parts of the Stanford Computer Science class. It is a great written tutorial on Image Classification, Python, and Neural Networks!
  • Another course with a lot of very good reviews is on Coursera, also from Stanford University. The course description says that “this course provides a broad introduction to machine learning, data mining, and statistical pattern recognition.”
  • Neural Networks and Deep Learning is a free online book that “helps you master the core concepts of neural networks, including modern techniques for deep learning.”

Do you have your own company and are interested in machine learning? Check out this article about how to make your company machine learning ready. And if you want to get to know people in Vienna, Austria, who have a Computer Vision and/or Machine Learning background, feel free to join our awesome community on Meetup!

 
