Data and AI: The Reason Why They Can’t Live Apart

In this article, we will discuss the importance of data and their analysis, in order to design efficient and effective artificial intelligence (AI). But first of all, let’s set up an example to help us understand.

A typical problem for a company is anticipating the needs of its customers. For example, an insurer wants to know which coverage would best suit a client. Several factors can consciously or unconsciously influence the client’s choice of insurance (and of insurer!): age, education, health, current and future financial situation, short- and long-term objectives, and so on. With such a wide range of clients to serve, it becomes difficult for the insurer to keep every possible customer scenario in mind and recommend the appropriate product. An AI could be designed to handle this kind of situation. Assisting the insurer, it could quickly identify the type of client and suggest which insurance and options would suit them best.

Artificial intelligence

Artificial intelligence is the art of developing a system capable of performing tasks normally performed by humans. How is it done? Just like a child who learns from experience through mistakes and successes, the machine “learns” from the experiences that enable it to accomplish the required task. And just as a child may need a person’s help, so does the AI. In recent years, machine learning methods have proved their effectiveness in “teaching” a system (here, an AI) how to perform tasks previously reserved for humans. (To learn more about machine learning, see our blog article: A Non-Technical Guide to Understanding Machine Learning)

And how does one give experience to a machine? Through data. Data is information used to characterize a situation, a phenomenon, an element, etc. It is the data, in all its forms, that provides experience to the machine, allowing it to make the correlations it needs to accomplish a task.

If an AI uses the characteristics of the clients (age, goals, and so on), it can classify them into different groups. Subsequently, the AI can suggest one or more types of coverage depending on what the other members of the same group have previously chosen. A person could handle this type of task if the client variables and related products are fairly simple. However, at large scale (think of Amazon selling online), people need machines: the number of variables, clients, and products requires an AI to handle the task efficiently.
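The grouping idea above can be sketched in a few lines of Python. This is only a minimal illustration: the client records, the age band cut-off of 35, and the function names are all invented here, and a real system would learn its groups from data rather than hard-code them.

```python
from collections import Counter, defaultdict

# Hypothetical toy records, invented for illustration:
# (age, stated goal, coverage the client chose).
PAST_CLIENTS = [
    (24, "savings", "term life"),
    (27, "savings", "term life"),
    (29, "savings", "whole life"),
    (45, "family", "whole life"),
    (52, "family", "whole life"),
    (48, "family", "term life"),
]


def group_key(age, goal):
    """Assign a client to a group: an age band plus a stated goal."""
    band = "under 35" if age < 35 else "35 and over"
    return (band, goal)


def build_group_choices(clients):
    """Count which coverages each group has chosen in the past."""
    choices = defaultdict(Counter)
    for age, goal, coverage in clients:
        choices[group_key(age, goal)][coverage] += 1
    return choices


def recommend(choices, age, goal, top_n=1):
    """Suggest the coverages most often chosen by the client's group."""
    counts = choices.get(group_key(age, goal))
    if not counts:
        return []  # no past clients in this group, so nothing to suggest
    return [coverage for coverage, _ in counts.most_common(top_n)]


group_choices = build_group_choices(PAST_CLIENTS)
suggestion = recommend(group_choices, age=26, goal="savings")
print(suggestion)
```

With two variables and six clients a person could do this by hand; the point of the sketch is that the same logic keeps working when the number of variables, clients, and products grows.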

The amount and reliability of the data determine how accurately the AI will respond to the business problem. Generally, the more reliable data the AI has, the more accurate its predictions. The collection, distribution, and validation of data are therefore important issues in the creation of solutions involving AI.

But how do we process the data properly so that it can be useful in our AI?

Two words: Data Science

Let’s take a second to define the concept before going further.

Data Science is an interdisciplinary field in which scientific methods, mathematics, statistics, and information overlap in order to extract knowledge and ideas from data sets. Source: Wikipedia and simplystats

To put it simply, Data Science is the art of finding and choosing hidden information. Different associations and causes can exist between the characteristics of the client and their choice of insurance coverage. Using mathematical and statistical tools, together with the knowledge already acquired about the context being studied, relationships and correlations can be found between seemingly independent variables. For example, a potential relationship exists between the brand, year, and model of the vehicle and the decision to take replacement cost coverage, but where do we draw the line? A 1998 Civic? A 2010 BMW? Mathematical models can help us predict this. The challenge that many companies are experiencing today is deciding whether this task should be given to a person or an AI.
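As one hedged illustration of “drawing the line”, the sketch below searches for the model year that best separates clients who took replacement cost coverage from those who did not. The records are invented for the example, and a single-variable cutoff is only one very simple kind of model; nothing here is the method an actual insurer uses.

```python
# Invented records for illustration:
# (vehicle model year, did the client take replacement cost coverage?)
RECORDS = [
    (1998, False), (2002, False), (2005, False), (2008, True),
    (2010, True), (2012, False), (2014, True), (2016, True),
]


def accuracy_for_cutoff(records, cutoff):
    """Share of records correctly predicted by the rule 'year >= cutoff takes coverage'."""
    correct = sum((year >= cutoff) == took for year, took in records)
    return correct / len(records)


def best_cutoff(records):
    """Try every observed year as a cutoff and keep the most accurate one."""
    candidates = sorted({year for year, _ in records})
    # max() keeps the first best candidate, so ties go to the earliest year.
    cutoff = max(candidates, key=lambda c: accuracy_for_cutoff(records, c))
    return cutoff, accuracy_for_cutoff(records, cutoff)


cutoff, score = best_cutoff(RECORDS)
print(cutoff, score)
```

On this toy data the rule is not perfect (one 2012 vehicle breaks it), which is typical: the model gives a defensible place to draw the line, not a guarantee.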

Choosing the right data for AI

Some might ask why it is necessary to understand the context in which AI is applied. If we already have a lot of data, would it not be simpler to use all of it and let, for example, a neural network train itself by finding the various links needed to accomplish the desired task?

Nope. Here is why.

In more technical terms, too much information means increased variability and the model becomes unstable. For example, when predicting whether a client will choose a life insurance policy, age and income could be strongly correlated with whether the client will purchase or not. We can then question the need for both variables or how many of them are required. If predicting purchase using income only is equally accurate, then why have more when you can do the same with less? Identifying the strictly necessary data needed for machine learning is simpler and saves time and money.
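A quick way to spot candidate redundancies like age and income is to measure how strongly pairs of variables are correlated. The sketch below computes the Pearson correlation by hand; the feature values and the 0.9 threshold are assumptions made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def redundant_pairs(features, threshold=0.9):
    """Return feature pairs whose absolute correlation reaches the threshold."""
    names = sorted(features)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if abs(pearson(features[first], features[second])) >= threshold:
                pairs.append((first, second))
    return pairs


# Invented values for five hypothetical clients (income in thousands).
FEATURES = {
    "age": [25, 35, 45, 55, 65],
    "income": [30, 42, 50, 63, 70],
    "children": [2, 0, 3, 1, 2],
}

print(redundant_pairs(FEATURES))
```

A high correlation only flags a pair worth questioning. Whether one variable can really be dropped is settled by checking that the model predicts equally well without it.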

A short guide to Data Science

Here is a short guide that summarizes where you should start when identifying what data should be used.

Identify and understand the business problem: Knowing what specific need is being answered is at the core of a good analysis in data science. With a firm understanding of the problem at hand, you’ll know where to look for data sources.

Understand your data: Having data is good, but you need to know what information it contains to know whether it can answer the problem at hand. Understanding each characteristic within the model and how it impacts the system as a whole is important. The same applies to the links between the different variables. Understanding them makes it possible to better target what you are trying to accomplish.

Prepare your data: This involves cleaning, processing, and filtering. This step is essential before any information is retrieved because it impacts accuracy. Without a clean data set, results will be poor.

Modeling and evaluation: Using statistical analysis, we want to know if the pre-established hypotheses are correct and if the information is sufficiently relevant to explain (or predict) the variable of interest. This stage is often perceived as the “black box” in data science.

Deployment: When the hypotheses have been validated and the final model with the correct inputs has been found, it can be implemented for use on real-time data. At this point, the model can complete its designated task as mandated by the business problem.
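To make the preparation and evaluation steps above a little more concrete, here is a minimal sketch. The raw rows, the field names, and the “income of at least 50,000 means the client buys” rule are all invented for illustration; real cleaning rules depend entirely on the data set and the business problem.

```python
# Invented raw rows, as they might arrive from a form or an export.
RAW_ROWS = [
    {"age": "34", "income": "52000", "bought": "yes"},
    {"age": " 41 ", "income": "61000", "bought": "No"},
    {"age": "", "income": "48000", "bought": "yes"},    # missing age
    {"age": "34", "income": "52000", "bought": "yes"},  # exact duplicate
    {"age": "29", "income": "n/a", "bought": "no"},     # unparseable income
    {"age": "57", "income": "75000", "bought": "YES"},
]


def clean(rows):
    """Parse each row, dropping rows that cannot be parsed and exact duplicates."""
    cleaned, seen = [], set()
    for row in rows:
        try:
            age = int(row["age"].strip())
            income = float(row["income"].strip())
        except ValueError:
            continue  # missing or malformed number: filter the row out
        # Simplification: any label other than "yes" is treated as "no".
        bought = row["bought"].strip().lower() == "yes"
        record = (age, income, bought)
        if record not in seen:
            seen.add(record)
            cleaned.append(record)
    return cleaned


def accuracy(predictions, labels):
    """Fraction of predictions that match the observed labels."""
    matches = sum(p == actual for p, actual in zip(predictions, labels))
    return matches / len(labels)


records = clean(RAW_ROWS)
# Hypothetical hypothesis to evaluate: clients earning 50,000 or more buy.
predictions = [income >= 50000 for _, income, _ in records]
labels = [bought for _, _, bought in records]
score = accuracy(predictions, labels)
print(records, score)
```

In practice the hypothesis would be evaluated on data that was held back and never used to build the model, and on far more than a handful of rows.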

Before you go.

Always remember the principles of good data science when designing artificial intelligence. The famous phrase “You are what you eat” is befitting when speaking of AI and its data.

Hopefully, this article will have provided you with a better understanding of the principles behind data science and how to properly use it in the AI and machine learning field. In my next article, we will go through the process of identifying a problem, selecting the data, and cleaning it in preparation for ML/AI, using a real data set! Subscribe to our blog to know when it goes live.

As a young data scientist, I am very interested in scientific research and especially in the statistical and mathematical methods surrounding this field. I’d be happy to hear your comments and questions and discuss them further. You can follow me or contact me on LinkedIn, Twitter or directly by email.

A Guide to Working Remotely, as a Team.

How half our team traveled abroad for two weeks and still stayed productive.

We’re no strangers to working remotely.

Back in October, I worked from Banff, Alberta because I am passionate about climbing mountains. (Summits are awesome!)

Our Branding Goddess and Design Adventures, Manon, spent 4 months working remotely while traveling the world last year. She wrote all about it in this great guide.


This year, we pushed the concept further. Eight of us moved to Puerto Rico for two weeks and still managed to produce great results. Here is how we did it and what we learned.


An Introduction to SVG – Part 3

Hey guys, welcome to another post about SVG. So far we’ve seen how to create, stylize, and load SVG shapes and we took a look at the SVG coordinate systems, the viewport and the viewbox. By now, you have acquired some basic knowledge on SVG and you might have ideas or little tricks of your own.


Introduction to Recommender Systems

“Many receive advice, only the wise profit from it.” – Harper Lee

Many of us see recommender systems as mysterious entities that seem to know our thoughts. Just think of Netflix’s recommendation engine, which suggests movies to us, or Amazon, which suggests what products we should buy. Since their inception, these tools have been improved and refined to continuously enhance the user experience. Although many of them are very complex systems, the fundamental idea behind them remains very simple.


2016: When life gives you lemons…

We have a confession to make: At Arcbees, we liked 2016.

Yes, you read correctly. Unlike many others who want 2016 to quickly fade away in the meanders of oblivion, for us, it was an inspiring year.

Here are our 3 reasons why we loved it and why we will surely love 2017.


A Non-Technical Guide To Understanding Machine Learning

In last week’s post, we discussed if machine learning was right for your business. As part of that effort, I recently went through the process of learning the ins-and-outs of machine learning and realized most information out there is technical and aimed at developers or data scientists.

I thought an explanation from a non-technical person might be of interest.

Let’s begin.

What exactly is machine learning?

The simplest definition I came across:

Machine learning is “[…] the branch of AI that explores ways to get computers to improve their performance based on experience”. Source: Berkeley

Let’s break that down to set some foundations on which to build our machine learning knowledge.


Is Machine Learning Right For Your Business?

Most businesses recognize that machine learning can generate exceptional value but many still wonder how, in what specific areas, and if the time is right to integrate it within their data strategy. Today, we explore what questions you should be asking to know if machine learning is right for your business.

Do you really know Arcbees?

Although the Québec-based company celebrated its sixth anniversary in October 2016, it still remains, for a large majority of people, a somewhat mystical or even unrecognized enterprise. Despite the fact that there are many players in the IT and web sector, Arcbees is doing very well thanks to its niche expertise and strong corporate culture.


An Introduction to SVG – Part 2

At the end of my first post about SVG, I told you that today, I would talk about the use of CSS with SVG, but guess what… I lied. Yep!

Do I feel sorry? Not even a bit. Why? Because it’s for your own good.

As I was writing the CSS and SVG article, I realized it was a better idea to tell you about other SVG properties first, so you can see the whole SVG world and use it to its full potential. Being able to understand, draw and use basic SVG shapes is cool, but like I said in the first article, it’s only the tip of the iceberg. Knowing and using only the basic shapes is like having a Tesla car, and only driving it in your driveway.


Web Developer

Want to join our team? We’re looking for a Web Developer! Send your application to

At Arcbees, you will have to develop web and mobile solutions by utilizing several languages (Java, TypeScript and friends) in an environment where amazing coding skills meet an unmatched sense of humour. To be part of the Arcbees team, you will have to demonstrate your extraordinary potential and your motivation to become a coding god (just like Gohan looking for Old Kai).