A BRIEF INTRODUCTION TO MACHINE LEARNING

Machine learning is a subfield of artificial intelligence. Machine learning algorithms help computers understand the structure of data and fit that data into models that people can understand and use to solve problems.

With this technology, computers can learn from data and, on some tasks, match or even exceed what humans can do. Many algorithms exist for different forms and structures of data; machine learning builds models from that data to support decision-making in a wide range of situations.

In 1959, Arthur Samuel, a pioneer in the field of machine learning (ML), defined machine learning as the “field of study that gives computers the ability to learn without being explicitly programmed”.

In 1998, Tom Mitchell gave a more formal definition: a computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E.

From these definitions, a computer uses machine learning to learn from experience, that is, past information, when performing a task. We normally say the computer is actually learning only when its performance keeps improving as its experience increases.

An example:

  1. Spam Detection. Suppose your email program watches which emails you do or do not mark as spam and, based on your actions, learns to filter spam better. Here the task T is classifying emails as spam or not spam, the experience E is watching you label emails as spam or not spam, and the performance P is the number (or fraction) of emails correctly classified as spam or not spam.
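To make T, E and P concrete, here is a minimal sketch of the spam example in Python. It is not how a real mail filter works: the word-counting rule, the function names (`train`, `classify`, `accuracy`) and the sample emails are all assumptions made up for illustration. It only shows that P can be measured before and after the program gains experience.

```python
from collections import Counter

def train(labelled_emails):
    """Experience E: count how often each word appears in spam and in non-spam."""
    spam_words, ham_words = Counter(), Counter()
    for text, is_spam in labelled_emails:
        target = spam_words if is_spam else ham_words
        target.update(text.lower().split())
    return spam_words, ham_words

def classify(text, model):
    """Task T: flag an email as spam if its words were seen more often in spam."""
    spam_words, ham_words = model
    words = text.lower().split()
    spam_score = sum(spam_words[w] for w in words)
    ham_score = sum(ham_words[w] for w in words)
    return spam_score > ham_score

def accuracy(model, test_emails):
    """Performance P: fraction of emails classified correctly."""
    correct = sum(classify(text, model) == is_spam for text, is_spam in test_emails)
    return correct / len(test_emails)

# Hypothetical (text, is_spam) pairs, invented for this sketch.
experience = [
    ("win free money now", True),
    ("meeting agenda for monday", False),
    ("free prize claim now", True),
    ("lunch on monday", False),
]
test_emails = [
    ("claim your free prize", True),
    ("monday meeting notes", False),
    ("win money", True),
    ("agenda for lunch", False),
]

p_before = accuracy(train([]), test_emails)          # P with no experience
p_after = accuracy(train(experience), test_emails)   # P after seeing the labels
print(p_before, p_after)
```

In Mitchell's terms, the program counts as learning here only because P is higher after E than before it.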

Machine learning algorithms can be broadly grouped into three categories.

1. Supervised Learning

The computer is trained on data that is already labelled with the correct outcomes. The labelling is done by the programmer or the data collector. For example, to classify images as dog or non-dog, we need to collect images of both kinds, each with its correct label. From these the computer learns to classify unseen inputs, and how well it does so depends on how much experience it has gained. During data collection it helps to gather a wide variety of dog and non-dog images. The data should be varied rather than skewed toward one form, so that the computer trains on many different forms and its performance improves. Another example may make this clearer.

[Figure: the number 1 and the letter A written in different shapes]

In the image examples above, the same character appears in several different shapes. Feeding all of these to the computer teaches it more of the ways a particular character, here the letter A or the number 1, can appear. A larger dataset often, but not always, gives a more accurate model. We will explain this scenario later.

Categories of Supervised Learning

a. Classification: the output is one of a set of discrete categories.

b. Regression: the output is a continuous value.

Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.

We could turn this example into a classification problem by instead making our output about whether the house “sells for more or less than the asking price.” Here we are classifying the houses based on price into two discrete categories.
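The house example can be sketched in a few lines of Python. The figures below are hypothetical, and the least-squares line and the `sells_above_asking` rule are just one simple way to illustrate the two problem types, not a recommended pricing model.

```python
def fit_line(sizes, prices):
    """Least-squares fit of price = slope * size + intercept."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(prices) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
    var = sum((x - mean_x) ** 2 for x in sizes)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: size in square metres, price in thousands.
sizes = [50, 80, 100, 120, 150]
prices = [150, 240, 300, 360, 450]

slope, intercept = fit_line(sizes, prices)

def predict_price(size):
    """Regression: the output is a continuous value (a price)."""
    return slope * size + intercept

def sells_above_asking(size, asking_price):
    """Classification: the output is one of two discrete categories."""
    return predict_price(size) > asking_price

print(predict_price(110))            # a number
print(sells_above_asking(110, 300))  # True or False
```

The same labelled data supports both views; what changes is whether the answer is a number or a category.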

2. Unsupervised Learning

Here the computer is fed training data that is not labelled and is left to identify repeated patterns, relationships and correlations in the raw dataset, using estimation methods based on inferential statistics. We typically approach such problems with little or no idea of what the results should look like. When patterns are found in the raw data, the examples that share a pattern, relationship or correlation are grouped together, with statistical algorithms identifying the boundaries between the groups. One example of unsupervised learning is clustering: take a collection of 1,000,000 different genes and find a way to automatically group them into sets that are similar or related by variables such as lifespan, location, roles, and so on.
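As a rough illustration of clustering, the sketch below runs a very small k-means on one-dimensional values. The choice of k-means, the function name `k_means_1d` and the six measurements are assumptions for illustration only; real gene data would have many more dimensions and examples.

```python
def k_means_1d(values, k, iterations=10):
    """Group unlabelled numbers into k clusters around moving centres."""
    # Deterministic start: pick k evenly spaced values from the sorted data.
    ordered = sorted(values)
    step = (len(ordered) - 1) / (k - 1) if k > 1 else 0
    centres = [ordered[round(i * step)] for i in range(k)]

    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Update step: each centre moves to the mean of its cluster.
        centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return centres, clusters

# Hypothetical unlabelled measurements; no group labels are supplied.
measurements = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centres, clusters = k_means_1d(measurements, k=2)
print(centres)
print(clusters)
```

No label ever tells the algorithm which values belong together; the groups come only from how close the values are to one another.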

3. Reinforcement Learning

Reinforcement learning (RL) is an area of machine learning concerned with how a software agent ought to take actions in order to maximize a notion of cumulative reward. It resembles unsupervised learning in that the data is unlabelled; however, each outcome the agent produces is graded. Positive and negative grades serve as a feedback loop that tells the agent whether it is performing well.
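The feedback loop can be sketched with a toy "two-armed bandit" in Python. This is only one simple RL setting, chosen here as an assumption. The agent repeatedly picks one of two actions, receives a grade of 1 or 0, and keeps a running estimate of how good each action is. The epsilon-greedy rule, the reward probabilities and the name `run_bandit` are all hypothetical.

```python
import random

def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-looking action, sometimes explore."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    estimates = [0.0] * n_actions   # current estimate of each action's reward
    counts = [0] * n_actions        # how often each action was tried
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)                         # explore
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit
        # The environment grades the action: 1 is positive, 0 is negative.
        reward = 1 if rng.random() < reward_probs[action] else 0
        counts[action] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, counts, total_reward

# Hypothetical environment: action 1 is rewarded far more often than action 0.
estimates, counts, total_reward = run_bandit([0.2, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, counts, total_reward)
```

The agent is never told which action is correct; it only sees the grades and shifts its choices toward the action that earns more reward.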

References.

Machine Learning course on Coursera by Prof. Andrew Ng.
