Current developments in areas such as Computer Vision and Natural Language Processing continue to push us towards technologies that are expected to have a profound and long-lasting impact on our daily lives (for example, autonomous driving and speech-to-speech translation systems, respectively).

However, due to the novelty and complexity of many underlying concepts, the media and a few ‘tech experts’ tend to mix them up for the sake of simplicity or click-baiting, leading to sensationalist and/or misleading claims, such as “Twitter taught Microsoft’s AI chatbot to be a racist asshole in less than a day”[1] and “Worker killed by robot at Volkswagen car factory”[2]. In fact, in neither case was there any will or thought on the part of the machines: they were merely reproducing what they had learned (from the available data) or had been taught to do (through a sequence of instructions), despite what these headlines suggest.


One such example of concept confusion is between Artificial Intelligence (AI) and Machine Learning (ML), as the two are often used interchangeably when in fact each has its own meaning. In short, Artificial Intelligence refers to the intelligence displayed by machines in an attempt to simulate human (or animal) behaviour and actions. It comprises a number of sub-areas, including Knowledge-Based Systems and ML. Machine Learning is the sub-area of AI that focuses on computer algorithms that improve automatically through experience and the use of data. A more informal way to clarify the difference between the two is to focus on the key words in their names: intelligence and learning. Intelligence has to do with self-improvement, dealing appropriately with previously unseen contexts and, ultimately, adaptation. Learning has to do with memorizing, replicating, inferring and deciding, as long as these stay within what has already been experienced; at its core, it is about repetition.


Nowadays, ML is undisputedly the most important part of AI. There is no single correct way to categorize the different types or approaches of ML. If we take a task-oriented approach to ML, it can be divided into supervised learning, unsupervised learning and reinforcement learning:


Supervised learning: When data are labelled according to some criteria, these labelled data can be used to train a model that learns how to map inputs (the data) to the desired outputs (the labels). After training, the model can receive new (previously unseen) inputs and produce the corresponding outputs. Supervised learning is divided into two categories: classification (i.e., predict a category) and regression (i.e., predict a real number).
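As an illustration, here is a minimal sketch of supervised learning: a 1-nearest-neighbour classifier on hypothetical toy data (the points and labels below are invented for the example). The model "trains" by storing labelled examples and predicts the label of the closest one for any new input.

```python
def nearest_neighbour_predict(train_X, train_y, x):
    """Predict the label of x as the label of its closest training point."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    best = min(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    return train_y[best]

# Labelled training data: points near (0, 0) are class 'A', near (5, 5) are 'B'.
train_X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
train_y = ['A', 'A', 'A', 'B', 'B', 'B']

# A new, previously unseen input: the model produces the expected label.
print(nearest_neighbour_predict(train_X, train_y, (0.5, 0.5)))  # -> 'A'
```

The same scheme generalizes directly: any supervised algorithm is ultimately a learned mapping from inputs to labels (classification) or to real numbers (regression).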


Unsupervised learning: When no labels exist for the data, the best that can be achieved is to detect useful patterns of some sort within the data. Unsupervised learning is divided into three categories: clustering (i.e., group by similarity), association rules (i.e., identify items that frequently occur together) and dimensionality reduction (i.e., find hidden dependencies / ‘compression’).


Reinforcement learning: Sometimes, no data are directly available for model training. Instead, there is an agent that can perform actions in an interactive environment. As the agent performs different actions, it is rewarded or penalized, depending on whether it is approaching or diverging from a predefined goal. Given sufficient time to explore different actions in different contexts, the agent becomes increasingly better at achieving the desired goal, eventually finding an efficient sequence of actions for reaching it.
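The reward-driven loop described above can be sketched with tabular Q-learning on a hypothetical toy environment invented for the example: a five-cell corridor where the agent starts on the left and is rewarded only upon reaching the rightmost cell.

```python
import random

random.seed(0)                   # fixed seed so the sketch is reproducible
n_states = 5                     # cells 0..4; cell 4 is the goal
actions = [-1, +1]               # move left or move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate

for episode in range(300):
    s = 0
    while s != 4:
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), n_states - 1)   # environment transition
        r = 1.0 if s2 == 4 else 0.0             # reward only at the goal
        # Q-learning update: nudge Q towards reward plus discounted future value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
        s = s2

policy = [max(actions, key=lambda act: Q[(st, act)]) for st in range(4)]
print(policy)  # -> [1, 1, 1, 1]: the agent learns to always move right
```

Early episodes are essentially random wandering; the reward signal then propagates backwards through the Q-table until the efficient sequence of actions (always move right) dominates.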


If we take a complexity-oriented approach to ML, then its algorithms can be split into classical algorithms, ensemble algorithms, artificial neural networks / deep learning and reinforcement learning:


Classical learning: Classical ML algorithms include the first generation of algorithms to produce interesting results on real-life problems. Most of them were designed at a time when computational power was limited and little data was available, and for solving relatively simple tasks with well-defined data features, although many of them are still relevant today. Examples of such algorithms include Support Vector Machines (SVM) and Decision Trees.
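As a hedged illustration of the decision-tree family, here is a depth-1 decision tree (a "decision stump") learned by exhaustive search over thresholds on hypothetical toy data; real implementations grow deeper trees guided by impurity measures, which this sketch omits.

```python
def fit_stump(X, y):
    """Find the (feature, threshold) split that minimises training errors."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            left = [lab for x, lab in zip(X, y) if x[f] <= t]
            right = [lab for x, lab in zip(X, y) if x[f] > t]
            # Each side predicts its majority label.
            maj = lambda labs: max(set(labs), key=labs.count) if labs else y[0]
            l_lab, r_lab = maj(left), maj(right)
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]

def stump_predict(stump, x):
    f, t, l_lab, r_lab = stump
    return l_lab if x[f] <= t else r_lab

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [(1.0, 3.0), (2.0, 1.0), (1.5, 2.0), (6.0, 1.0), (7.0, 3.0), (6.5, 2.5)]
y = ['small', 'small', 'small', 'big', 'big', 'big']
stump = fit_stump(X, y)
print(stump_predict(stump, (6.8, 2.0)))  # -> 'big'
```

Note how the stump discovers on its own which feature is informative: a simple, interpretable model that works well precisely because the data features are well defined.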


Ensemble learning: Ensemble methods combine multiple algorithms in order to achieve better performance than any single algorithm on its own. They are particularly useful for complex tasks with well-defined data features. Examples of such techniques include Random Forests and Gradient Boosting.
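One classic way to build such an ensemble is bagging: training each member on a bootstrap resample of the data and combining their predictions by majority vote. The sketch below uses hypothetical toy data and deliberately weak one-threshold classifiers, but it follows the same idea that underlies Random Forests.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def fit_threshold(X, y):
    """Pick the threshold on a 1-D feature that minimises training errors
    for the rule: predict 'low' iff x <= threshold."""
    best = None
    for t in sorted(set(X)):
        errs = sum((x <= t) != (lab == 'low') for x, lab in zip(X, y))
        if best is None or errs < best[0]:
            best = (errs, t)
    return best[1]

# Hypothetical toy data: low values vs high values.
X = [1.0, 2.0, 3.0, 4.0, 8.0, 9.0, 10.0, 11.0]
y = ['low'] * 4 + ['high'] * 4

# Train each ensemble member on its own bootstrap resample of the data.
models = []
for _ in range(5):
    idx = [random.randrange(len(X)) for _ in range(len(X))]
    models.append(fit_threshold([X[i] for i in idx], [y[i] for i in idx]))

def ensemble_predict(x):
    votes = ['low' if x <= t else 'high' for t in models]
    return max(set(votes), key=votes.count)  # majority vote

print(ensemble_predict(10.5))  # -> 'high'
```

The resampling makes the members slightly different from one another, and the majority vote averages out their individual mistakes, which is exactly why the ensemble tends to outperform any single member.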


Artificial neural networks / Deep learning: An artificial neural network is a computer system that attempts to mimic the workings of the human brain and nervous system by combining a set of simple units in layers. Deep learning refers to neural networks with many (sometimes hundreds of) hidden layers. This introduces a high degree of flexibility, making such networks suitable for tackling complex tasks with poorly-defined data features, provided that large amounts of computational power and data are available. Different architectures exist depending on the problem to be solved, such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN).
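A minimal sketch of what "combining simple units in layers" means: a two-layer network of threshold units that computes XOR, a function no single linear unit can represent. The weights below are hand-picked for illustration rather than learned; real networks use differentiable activations and learn their weights from data.

```python
def step(z):
    """Threshold activation: fire (1) iff the weighted input is positive."""
    return 1 if z > 0 else 0

def neuron(inputs, weights, bias):
    """One simple unit: weighted sum of inputs plus bias, then activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def forward(x1, x2):
    # Hidden layer: two units computing intermediate features.
    h1 = neuron([x1, x2], [1, 1], -0.5)     # fires on OR(x1, x2)
    h2 = neuron([x1, x2], [1, 1], -1.5)     # fires on AND(x1, x2)
    # Output layer: combines the hidden features -> OR and not AND = XOR.
    return neuron([h1, h2], [1, -1], -0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, '->', forward(a, b))  # prints the XOR truth table
```

Stacking layers is what buys the flexibility mentioned above: each layer builds slightly more abstract features from the previous one, and with enough layers and data the features themselves are learned rather than hand-designed.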


Reinforcement learning: As presented above, reinforcement learning is used when data in the traditional sense are not available. Examples of reinforcement learning techniques include Q-learning and Asynchronous Advantage Actor-Critic (A3C).


All the concepts mentioned above were presented in a very condensed way. Each of them conceals a whole world of its own.


José Portêlo

Lead Machine Learning Engineer