An Introduction to Data Mining for Marketing and Business Intelligence

I spent the last few weeks digging deeper into time-series methods and data mining methods. For this post, I have decided to write a broad introduction to the latter (data mining), since it may look more practical and also trendier than time series (but watch out for the emergence of Particle Filtering and other Sequential Monte Carlo (SMC) methods).

Introduction to Data Mining and Criss Angel

So what is data mining? Data mining can be defined as the process, and also the “art”, of discovering patterns, meaning and insights in large datasets by using statistical and computational methods. In other words, a data miner is like Criss Angel (you can pick any other magician here!): someone who conjures, out of your messy ocean of data, insights that will be valuable to your company and may give you an advantage over your competitors. Simply read Tom Davenport’s bestselling book “Competing on Analytics: The New Science of Winning” if you’re not yet convinced of the power of analytics and, by extension, of data mining. Furthermore, data mining tasks are considered part of a more general process called Knowledge Discovery in Databases (KDD), which also includes the “art” of collecting the right data as well as organizing and cleaning it, all extremely important tasks prior to analyzing the data.

Some Brief History and a Link to Business Intelligence

Data mining mainly takes its roots from the fields of Statistics and Computer Science (some might say Artificial Intelligence) and may also be referred to as “Statistical Learning”. From a statistical perspective, most of the early and recent advances have come from the Stanford Statistics department school of thought (Bradley Efron, Jerome H. Friedman, Trevor Hastie and Robert Tibshirani, along with Leo Breiman, who was at UC Berkeley). By the way, don’t forget that Stanford University is only 7 miles away from Google. Furthermore, the emerging field of Business Intelligence has blossomed as a combination of: (1) data mining tasks, (2) information systems technology and (3) crisp marketing insights.

Professor Leo Breiman - Considered As The Founder of the field of Data Mining

Types of Data Mining Methods and Marketing

Data mining methods can be divided in multiple ways. However, most books on the topic, and especially those related to marketing and business intelligence, will generally divide data mining methods into two types, the ones related to supervised learning and the ones related to unsupervised learning.

Supervised Learning

Supervised learning is often more associated with scientific research, as it includes tasks where the data miner needs to describe or predict the relationship between a set of independent variables (also referred to as inputs or features) and a dependent variable (also referred to as the outcome, output or target variable). The dependent variable can be categorical (e.g. whether a customer churns, or classes of customers) or continuous (e.g. money earned from that customer), while the independent variables may be of any type but need to be coded properly (e.g. by splitting each categorical variable into separate binary variables). From a marketing and business intelligence perspective, I will divide supervised learning into two interrelated tasks: (1) supervised classification tasks and (2) Predictive Analysis.
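To make the binary coding of categorical variables concrete, here is a minimal sketch in Python (chosen only for illustration). The function name one_hot and the customer segments are made up for this example, and real projects would typically rely on a statistics package rather than hand-written code.

```python
def one_hot(values):
    """Turn a list of category labels into separate 0/1 indicator columns."""
    # One column per distinct category, in alphabetical order.
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows

# Hypothetical customer segments, invented for illustration.
segments = ["gold", "silver", "gold", "bronze"]
columns, rows = one_hot(segments)

for segment, row in zip(segments, rows):
    print(segment, dict(zip(columns, row)))
```

Each customer ends up with exactly one indicator set to 1. For regression-type models, one of the columns is usually dropped to avoid redundancy with the intercept.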

Supervised Classification tasks: Supervised classification tasks occur when you want to correctly predict the class/category (this is the dependent variable) to which new observations (e.g. customers) belong, based on results from an already known training dataset. Generally, you will achieve this task by using: (1) a training dataset to fit the models, (2) a validation dataset to tune and compare them and (3) a test dataset to estimate the performance of the final model. The best-known methods I’m using for these tasks are the following:

1. Multinomial Logit (MNL)
2. Linear Discriminant Analysis (LDA)
3. Quadratic Discriminant Analysis (QDA)
4. Flexible Discriminant Analysis with Multivariate Adaptive Regression Splines (FDA – MARS)
5. Penalized Discriminant Analysis (PDA)
6. Mixture Discriminant Analysis (MDA)
7. Naïve Bayes Classifier (NBC)
8. K-Nearest Neighbor (KNN)
9. Support Vector Machines with multiple Kernels (SVM)
10. Classification and Regression Trees (CART)
11. Bagging
12. Boosting
13. Random Forests
14. Neural Networks
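Whatever method you pick from the list above, the three-dataset setup can be sketched as follows. This is only an illustration in Python: the function name three_way_split, the 60/20/20 proportions and the fake customer IDs are assumptions of mine, not a standard.

```python
import random

def three_way_split(records, train=0.6, validation=0.2, seed=0):
    """Shuffle the records and cut them into training, validation and test sets."""
    shuffled = records[:]                      # copy, so the input is left untouched
    random.Random(seed).shuffle(shuffled)      # fixed seed makes the split reproducible
    n_train = round(len(shuffled) * train)
    n_validation = round(len(shuffled) * validation)
    training_set = shuffled[:n_train]
    validation_set = shuffled[n_train:n_train + n_validation]
    test_set = shuffled[n_train + n_validation:]   # whatever remains
    return training_set, validation_set, test_set

# Hypothetical customer IDs, invented for illustration.
customers = list(range(100))
training_set, validation_set, test_set = three_way_split(customers)
print(len(training_set), len(validation_set), len(test_set))
```

The important point is that the three sets do not overlap: the test set is only looked at once, at the very end.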

Predictive Analysis: I’ve decided to include the expression “Predictive Analysis” here, since it’s a buzzword in the web community nowadays. Any task related to supervised classification involves a so-called “Predictive Analysis”. However, “Predictive Analysis” is a broader expression that also includes tasks related to the prediction of a continuous dependent variable rather than a categorical one. Some methods that can’t be used for classification may be used for predictive analyses with continuous variables, and vice versa.
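As a small illustration of predicting a continuous dependent variable, here is a sketch of a simple least-squares line in Python. The method (one-variable linear regression) is my choice for the example, and the visit and spending figures are invented so that the relationship is exactly linear.

```python
def fit_line(x, y):
    """Ordinary least squares for a single independent variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: monthly store visits and money spent per customer.
visits = [1, 2, 3, 4, 5]
spend = [12.0, 14.0, 16.0, 18.0, 20.0]   # built as 10 + 2 * visits

slope, intercept = fit_line(visits, spend)
predicted_spend = intercept + slope * 6   # prediction for a customer with 6 visits
print(slope, intercept, predicted_spend)
```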

Unsupervised learning

Unsupervised learning is when the data miner’s task is to detect patterns based only on independent variables, with no dependent variable to predict. It is generally presented from an algorithmic perspective rather than a purely statistical one. Well-known methods applied to marketing include: (1) Market Basket Analysis and (2) Clustering.

Market Basket Analysis: Market basket analysis (also abbreviated as MBA to confuse you even more) is certainly one of the best-known and easiest tasks relating data mining and marketing. It is considered more a typical marketing application than a data mining method. It can be simplified as a simple Amazon recommendation algorithm showing an association rule such as “the probability that customers who bought item A also bought item B is 56%” (this conditional probability is called the confidence of the rule). The classic urban legend about Market Basket Analysis is the “beer” and “diapers” association, where a large supermarket chain, most people will say Walmart, did a Market Basket Analysis of customers’ buying habits and found an association between beer purchases and diaper purchases. It was theorized that fathers were stopping off at Walmart to buy diapers for their babies and, since they could no longer go to bars and pubs as often as before, they would buy beer as well. As a result of this finding, the supermarket chain managers placed the diapers next to the beer in the aisles, resulting in increased sales for both products.
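The association rule “customers who bought A also bought B” boils down to counting baskets. Here is a minimal Python sketch; the function name rule_metrics and the five baskets are invented for illustration and have nothing to do with any real retailer’s data.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Support and confidence of the rule 'antecedent -> consequent'."""
    with_antecedent = [b for b in baskets if antecedent in b]
    with_both = [b for b in with_antecedent if consequent in b]
    # Support: share of all baskets containing both items.
    support = len(with_both) / len(baskets)
    # Confidence: share of baskets with the antecedent that also contain the consequent.
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

support, confidence = rule_metrics(baskets, "diapers", "beer")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Here two of the three baskets containing diapers also contain beer, so the confidence of “diapers -> beer” is 2/3, while the support is 2/5.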

Clustering: Clustering is defined as the assignment of a set of observations (customers) to subsets (clusters) such that customers within a cluster are similar to each other and different from customers in other clusters. Clustering is often used in marketing for segmentation tasks. However, even though segmentation may be achieved through “clustering”, more modern model-based methods such as Bayesian Mixture Models, which I must say are not really part of the data mining field, are used by the few practitioners who actually understand how to program them (this is one method I am programming these days). For more about segmentation, I would refer anyone to the book “Market Segmentation: Conceptual and Methodological Foundations” by Michel Wedel and Wagner A. Kamakura, both professors and well-known authorities on the topic.
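To give a feel for what a clustering algorithm does, here is a bare-bones k-means sketch in Python. K-means is only one of many clustering methods and is my choice for the example; the customers (monthly visits, average basket in dollars) and the starting centroids are invented, and a real analysis would also standardize the variables and try several starting points.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then update centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            # Index of the closest centroid (squared Euclidean distance).
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
            )
            clusters[nearest].append(point)
        # New centroid = mean of the cluster; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (visits per month, average basket in dollars).
customers = [(1, 20), (2, 22), (1, 25), (9, 80), (10, 85), (8, 90)]
centroids, clusters = kmeans(customers, centroids=[(1, 20), (9, 80)])
print(centroids)
print(clusters)
```

On this toy data the algorithm separates the occasional small-basket shoppers from the frequent big-basket shoppers, which is exactly the kind of segment a marketer would then profile and target.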

Some Top References

I must say without a doubt that the best book I know about data mining is “The Elements of Statistical Learning” by Stanford Professors Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman, which broadly covers nearly every type of method you can use to conduct data mining tasks. However, I must admit that this book focuses on the statistics behind the methods (but it’s extremely clear) rather than on the software tools you could use to conduct these analyses (no, it’s not a cookbook), and a marketer may also find it short on marketing applications. Furthermore, to get updates about the data mining world, KDnuggets, administered by Gregory Piatetsky-Shapiro, is THE reference.

Some Top Software

Here is a description of some software I would recommend for data mining tasks. Feel free to propose your own in the comments section:

1. R: R is actually my favorite software. I have been using it for nearly all of my statistics-related tasks for the last 2 years. It’s free and open source, it has an extensive and very knowledgeable community, it’s extremely intuitive and it can be learned more easily if you have knowledge of software such as C++, Python and/or GAUSS. Furthermore, there are a lot of useful packages available to facilitate the coding. However, I must say that compared to C++ or SAS, R can sometimes be slow for data mining tasks involving a heavy load of data.

2. rattle: rattle, which stands for the R Analytical Tool To Learn Easily, is a “point and click” data mining interface related to R and developed by Graham Williams of Togaware. Frankly, I must admit that this software rocks even though I generally don’t like “point and click” software. It’s extremely complete and quite easy to use.

3. SAS Enterprise Miner: SAS Enterprise Miner, a module in SAS, was the first software I used for performing data mining tasks. It is extremely fast and user-friendly. However, I must admit that it reminds me of software like Amos, now included in PASW (formerly SPSS) for Structural Equation Modeling (SEM) tasks, where you move the “little truck” to build your model and don’t really understand what you’re doing at the end of the day. Furthermore, it costs a lot, but to my knowledge SAS is the only software platform integrating data mining tasks with web analytics and social media analytics.

4. RapidMiner: RapidMiner, formerly known as YALE (Yet Another Learning Environment), is considered by many data miners as THE software to use to conduct data mining tasks. Like R, the software is open source as well as free of charge for the “Community” version. I haven’t made the switch from R to RapidMiner yet and I am currently testing the software in depth.

5. Salford Systems: I must confess that I have never used Salford Systems software and know it only by reputation, so I can’t offer a clear personal opinion on it. However, statisticians working at Salford Systems are presenting workshops on data mining at the next Joint Statistical Meetings (JSM) in Miami at the end of July 2011, which I might attend.

rattle - Software for Data Mining and Business Intelligence

Waiting time and Conclusion

Whatever software you’re using, data mining tasks will always be demanding in terms of your computer’s memory. Data mining in marketing and business intelligence, and more broadly KDD, is an art that requires strong statistical skills but also a great understanding of marketing problems. So when you’re waiting for your data mining computations, feel free to come by and read my other cool posts on your other computer! In any case, enjoy data mining and, as one of my friends would say, “show some respect to the machine”, but even more to the data miner!


Jean-Francois Belisle



The World of Marketing is Changing – Three Types of Interconnected Marketing Revolutions

The right wing is growing in popularity in the United States; Ben Ali has finally been kicked out of Tunisia; Greece, Ireland, Portugal and Iceland are leaning toward bankruptcy; Fidel Castro will die soon; Duvalier is back in Haiti; and India and China are continuing to embrace capitalism. It smells like change! It smells like revolution! But what about business and, more precisely, marketing? What does it smell like? I would say it smells like a revolution roasting too! But what kind of revolution? I would say three types of interconnected marketing revolutions: (1) the retailing experience revolution, (2) the automated personalization revolution and (3) the social media revolution.

Revolution #1 – The Retailing Experience Revolution

Let’s start with the retailing experience revolution, since it’s the oldest and the most marketing-related revolution. In the ’90s, companies started to become aware of the importance of store design and atmospherics and how they could enhance a consumer’s experience and thereby boost sales. Companies started to become aware of the power of (1) music style, (2) music tempo, (3) decor, (4) store color arrangement and, more importantly, (5) building the right brand-related ambiance. All these techniques culminated in the bestselling book “Why We Buy: The Science of Shopping” by Paco Underhill, which brought “ambiance research” to another level. An interesting update and complement, published in 2010, is “Buyology” by the Danish author Martin Lindstrom, with a foreword by Paco Underhill.

Thus, going to the mall is no longer a boring utilitarian task for many of us (though it still is for some guys!). More than ever, it’s an “Experience” (with a capital “E”), and the cheapest experience you can buy: the only thing it costs is gas or other transportation fees, and that … until you buy. Some companies have even brought the experience one step further: simply taking a look at the Charmin restrooms in New York City or the Apple Genius Bar will make you understand that it’s all about “Experience”.

The Apple Genius Bar - A great example of The Retailing Experience Revolution

Pushing things one step further, the importance of a fit between a website’s design and its associated brand is another example of this retailing experience revolution. Thus, the creative part of ergonomics is fully embedded in this revolution.

Revolution #2 – The Automated Personalization Revolution

If petroleum was the “Black Gold” and water the “Blue Gold”, then the web can be defined as a gold mine of information. But more and more, it is a gold mine of information for companies to find you and then target you. More and more companies know who you are, where you are, and what you want. It looks more and more like a “Big Brother” issue, to quote one of my favorite authors, George Orwell. In web terminology, it could be referred to as Web 3.0: the information finds you and meets your needs before you find it.

From Online to Offline to On-line

This revolution began with Amazon in 1994; the objective was to target the consumer/user based on collaborative filtering, that is, recommending items that people similar to you have liked or bought. Most importantly, automated personalization is a real-time updating (on-line) process that collects information about you and adapts its offering as you evolve in society, always relying on people similar to you. Amazon CEO Jeff Bezos is a Princeton graduate in Computer Science and Electrical Engineering. Since then, tons of computer scientists and statisticians with expertise in computer programming and data mining/statistical learning have emerged in start-ups or changed the business culture of existing companies.
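The idea of “relying on people similar to you” can be sketched in a few lines of Python. This is a toy user-based collaborative filter using Jaccard similarity, which is my assumption for the illustration and certainly not a description of Amazon’s actual system; the shoppers and their purchases are invented.

```python
def jaccard(a, b):
    """Similarity between two sets of purchases: shared items over all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, purchases):
    """Rank items the target has not bought by the similarity of the users who bought them."""
    mine = purchases[target]
    scores = {}
    for user, items in purchases.items():
        if user == target:
            continue
        similarity = jaccard(mine, items)
        if similarity == 0:
            continue                      # ignore users with nothing in common
        for item in items - mine:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical purchase histories, invented for illustration.
purchases = {
    "ann": {"book", "lamp"},
    "bob": {"book", "lamp", "desk"},
    "cat": {"book", "pen"},
    "dan": {"sofa"},
}

suggestions = recommend("ann", purchases)
print(suggestions)
```

Because Bob’s purchases overlap the most with Ann’s, his extra item (the desk) is suggested first, and the list is recomputed as soon as new purchases come in.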

More and more, this personalization revolution will appear offline. The advent of location-based services like Foursquare and Gowalla is simply the icing on the cake. Companies can now bring your offline information online and then target you. This is marvelous for the consumer, but in the long run it will be even more interesting for the company. Offline examples include the possibility of learning about consumers using intelligent in-store displays (see my post entitled “The In-Store Displays are Watching You” on the topic), thereby merging the retailing experience revolution with the automated personalization revolution.

Revolution #3 – The Social Media Revolution

Let’s first make something clear. As in ancient times, people are still communicating; they are simply communicating using different platforms. In this way, there was social life before social media, as there were statistics and KPIs before web analytics, and as there was information available before the Internet. Traditional ways of communicating with others should always be taken into consideration by any company and ad agency. However, what should also be taken into consideration is that what is said on the web stays on the web. In the 1950s at Tupperware parties, people could say what they wanted, and what was said could be forgotten the next day or burned in the fireplace, but now it stays and sticks on the web like an unwanted Facebook picture. Thus, the traditional ways of communication stay, but the media change and the information is carved in stone. Here are my top 9 changes related to social media (feel free to suggest some others):

1. People are less patient than before, they want it right now;
2. People are more connected, but they know less about their connections than before;
3. People communicate more with strangers online, but less with strangers offline;
4. People trust traditional advertising less than before, and trust their weak ties more than their strong ties;
5. People have less privacy than before;
6. People who have nothing important to say are gaining supporters compared to experts;
7. People are spending less time on television and more time on the Internet;
8. The information is spreading faster;
9. The information is more trackable.

The Social Media Landscape by Fred Cavazza - A "Must" at Every Update


But why is social media a marketing or, more globally, a business revolution? Since information is easily trackable on the web, what is most useful from a business perspective is the possibility to broadly answer the 5 W’s for almost any brand:

1. Who is writing about your brand?
2. What is written about your brand?
3. Where are consumers writing about your brand?
4. When are consumers writing about your brand?
5. Why are consumers writing about your brand?

Nowadays, using social media monitoring platforms like Radian6, Sysomos, Lithium (Scout Labs), etc… to answer these 5 questions is more and more common for firms or agencies. Consulting agencies using their own powerful solutions such as Nexalogy have also emerged in the business world. More and more, the social media revolution is tending toward the automated personalization revolution.

The power to the morons? – From Plato to Ashton Kutcher

In the 4th century BC, Plato dreamed about what he referred to as “The State”, where experts could lead and share their decisions. If Plato were alive today, he would certainly kill himself. Social media platforms surely give people more power; the problem is that they sometimes give power to people who simply have nothing intelligent to say. Following Ashton Kutcher on Twitter may be funny, but most of what he says is useless. However, he is certainly one of the most influential individuals on Twitter. Overall, there is surely more information exchanged than before on social media platforms, but there is also an extremely high amount of “babble”. Getting rid of “babble” is one of the main challenges of any social media monitoring platform.


In conclusion, whichever revolution is affecting your job the most, what is most important to keep in mind is that there was life before these revolutions. What you learned in your “Introductory” marketing courses 10 years ago does not belong in the trash: it is still relevant, but it simply needs to be updated and integrated with concepts related to these revolutions. However, the main difference from political revolutions is that we can’t escape these revolutions; we are all, voluntarily or involuntarily, part of them. Any other ideas of revolutions? Want to play Risk© against me and start your own revolution?

Enjoy these revolutions you little “Che”,

Jean-Francois Belisle




Is Clotaire Rapaille Feeding or Failing Marketing?

Last week, brand psychoanalyst (sometimes referred to as a cultural anthropologist) Clotaire Rapaille was fired by Quebec City mayor Regis Labeaume from his role as a brand management (image) consultant for the city, mainly because of falsifications in his curriculum vitae. The news came more than a month after he was hired on a $300,000, 3-month contract to propose a branding plan for Quebec City. By pushing his own marketing too far, Clotaire Rapaille completely violated the fundamental principles of personal branding and lost a portion of his credibility. When reading all these stories about Rapaille, one question came to mind: Is Clotaire Rapaille feeding or failing marketing?

Clotaire Rapaille

Why is Clotaire Rapaille feeding marketing?

By calling himself an anthropologist, Clotaire Rapaille first reminds me of cultural anthropologist Grant McCracken who, to my limited knowledge of this field, was one of the first high-end anthropologist marketing consultants to sign lucrative consulting contracts with multinationals (Coca-Cola Company, Diageo, IBM, IKEA, Chrysler, Kraft, and Kimberly Clark). Perhaps the hiring of Clotaire Rapaille signals the beginning of an era of lucrative consulting contracts for anthropologist marketers. On the academic side, this hiring could reinforce the appeal of cultural anthropology in marketing at the undergraduate and MBA levels, a field led by the York University crew (Russell W. Belk, Eileen Fischer, Robert Kozinets & Detlev Zwick) and growing in importance in the Montreal area, especially at HEC Montréal (Zeynep Arsel and Annamma Joy at Concordia University; Jonathan Deschênes, Jean-Sébastien Marcoux, Marie-Agnès Parmentier and Yannik St-James at HEC Montréal).

Why is Clotaire Rapaille failing marketing?

By being fired from his consulting contract with Quebec City, Clotaire Rapaille makes marketing sound like magic in the eyes of the public, which is completely false. Surely, marketing is not a hard science at the same level as pure mathematics. However, marketers are not magicians and should not claim to be; leave the magic to mindfreaks like Criss Angel. The discipline takes its roots in psychology, anthropology, statistics, economics and computer science, which creates a sexy melting pot. The “science” of marketing is based on empirical generalizations, strong conceptual frameworks and learning-by-doing case studies that lead to best practices.

Magician Criss Angel


Briefly, one sure thing is that Clotaire Rapaille is a good example of a personal branding failure. However, the Clotaire Rapaille personal branding failure has had negative and positive spillover effects for all those working in the field of marketing in the province of Quebec and perhaps even in North America. What do you think? Any other comments?

Jean-Francois Belisle



Launching a Successful Online Marketing Campaign for Buzzing the Buzz Using the Buzz/Viral/Buzz Sequence

Saying that you need to build a successful online marketing campaign is easy; detailing how is much harder. What can be considered a successful campaign? The answer is straightforward: a campaign that reaches its fixed objectives whilst minimizing costs and maximizing gains. In other words, we can define a successful campaign as one that generates a positive long-term return on investment (ROI). At a minimum, if your objective is to gain subscribers to your newsletter, all you need to implement in your online strategy is one designed ad and a single video that redirect viewers to a subscription form. I personally think that the simpler, the better, and that the buzz/viral/buzz sequence is a good recipe for success. So what is the buzz/viral/buzz sequence?

The buzz/viral/buzz sequence is a 3-step hierarchical procedure which includes:

1. Buzz in the creation process;
2. Viral to propagate in the targeted population;
3. Buzz in the targeted population.

Each of these steps is described in more depth in the following paragraphs.

1. Buzz in the creation process

If there is no buzz around the video in the creation process, the probability that there will be a buzz once it is launched to the targeted population is minimal. “Why would I share this video?” is the question to ask every member of the marketing team before the campaign is launched. Great campaigns often come from innovative and simple ideas. You are much better off dumping a bad video than showing the whole world how much the video sucks!

2. Viral to propagate in the targeted population

Viral marketing is about the techniques employed to propagate a message/video in the online environment. If the message/video sucks, you will spend plenty of time trying to spread rotten material. For a more detailed view of viral marketing myths, feel free to read my post entitled “Demystifying Viral Marketing – 7 Myths of Viral Marketing Campaigns”.

3. Buzz in the targeted population

If there is a buzz in the creation process, then the chances that viral techniques will work are multiplied and the chances that a buzz will occur in the targeted population grow exponentially. In other words, buzz in the targeted population is a function of the two previous elements of the sequence. A well-executed example of these three steps is the successful online campaign for Häagen-Dazs featuring the Bee Boys Dance Crew video (see picture below), launched a year and a half ago.

The Bee Boys Dance Crew for Häagen-Dazs - A Successful Online Marketing Campaign


What do you think of the proposed sequence? Do you have any example of organizations skipping the first part of the sequence and then whining about the fact that users didn’t buzz on their campaigns? Users are not dumb, so proponents of the online marketing intelligentsia, please stand up!

Jean-Francois Belisle
