
1 Customer Lifetime Value

1.1 What is Customer Lifetime Value

Customer Lifetime Value (CLV), put in simple terms, is the worth of a customer to a business over the complete period of their relationship, or the total amount of money a business can expect from the customer during their lifetime. The revenue generated varies for each customer-business pair: for one business a customer might be worth a million while a second customer is worth nothing, and the first customer might be worth nothing to another business. CLV can be used as a metric of the profit associated with a customer-business relationship. A fast way to gauge a customer’s profitability, and the company’s potential to grow over the long run, is to compare the customer lifetime value against the cost of acquiring a new client.

Customer profitability, sometimes referred to as CP (the difference between the revenues and the expenses connected with the customer relationship over a specific period), is distinguished from customer lifetime value, or CLV, by the fact that CP evaluates the past while CLV anticipates the future. It is a significant measure, and the way you approach it may not only help define your company but also change considerably based on what you want your company to do for you in the long run. For further detail, have a look at this article.

1.2 Strategic Importance of CLV

The cost of acquiring a new customer is higher than the cost of retaining an existing one, which means that existing customers are a great asset to a business. With customer lifetime value in action, a business can decide how much money to spend on acquiring new customers and how much on retaining existing ones. Beyond that, with an in-depth analysis of Customer Lifetime Value, a business can divide its customers into segments and then decide upon strategies, expenditure and an action plan for each group separately.

CLV can also be used to predict or catch early signs of attrition; for a telecom company, for example, fewer and fewer subscriptions over the months or years is a sign of attrition. All businesses, irrespective of size or capital, need strategies to retain their customers and acquire new ones in a manner that maximises profit. High CLV is an indicator of product-market fit, brand loyalty and recurring revenue from existing customers. Hence, it becomes important for every business to analyse its customer base and profitability. For more details, refer to this article.

2 Calculating Customer Lifetime Value

2.1 Basic Understanding

As described in the section above, CLV is the total worth generated by a customer over the complete lifetime of the relationship. So, for a customer, CLV can be calculated as

Customer Lifetime Value = Customer Value × Customer Lifetime

where Customer Value is what truly differentiates one customer from another and is given by

Customer Value = Average Purchase Value × Number of Transactions

If we talk in terms of profit, the Customer Value term changes to

Customer Value = Average Purchase Value × Number of Transactions × Profit Margin

and the term Customer Lifetime represents the retention period of a customer with the business. Let’s take a small example. For a hypothetical clothing store, the average purchase value of an item is ₹500, and a customer, on average, shops 5 times a year. In this case the Customer Value would be calculated as

Customer Value = 500 × 5 = 2500

and let’s say the average profit margin is 20%; then the Customer Value would be:

Customer Value = 2500 × 0.2 = 500

If an average customer shops at the store for 10 years, then the Customer Lifetime will be 10 years. So the Customer Lifetime Value will be calculated as:

CLV = Customer Value × Customer Lifetime = 500 × 10 = 5000
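The same arithmetic can be reproduced in a few lines of Python as a quick sanity check; the figures are the hypothetical clothing-store numbers used above.

average_purchase_value = 500        # ₹ per transaction (hypothetical)
purchases_per_year = 5              # transactions per year
profit_margin = 0.20                # 20% net margin
customer_lifetime_years = 10

customer_value_revenue = average_purchase_value * purchases_per_year   # 2500
customer_value_profit = customer_value_revenue * profit_margin         # 500
clv = customer_value_profit * customer_lifetime_years                  # 5000
print(clv)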

This is the basic idea behind the calculation of CLV. In practice, however, the calculation is not this simple; there are several factors that come into play. We will be talking about these factors in the upcoming sections. It is important to note that our calculation used a few factors, such as Purchase Value, Customer Lifetime and Profit Margin, but these factors are not as static as we have assumed. They change with time and will affect the Customer Lifetime Value.

2.2 Factors Affecting CLV

2.2.1 Customer Lifespan

As stated earlier, this is the duration for which a customer was actively associated with the business, performing transactions before going dormant or dropping off. The cost of customer retention is generally much lower than the cost of customer acquisition, so the longer a customer remains loyal to a brand, the more profitable that customer will be for the firm. This statistic matters when you examine the expenses connected with recruiting new consumers and when deciding whether you should spend more or less to find new customers. The calculation is one component of the much broader one that determines Customer Lifetime Value (CLV).

2.2.2 Retention Cost and Rate

Retention cost is the amount spent on a customer in various forms, such as discounts, ad campaigns, emailing, customer service, etc. Retention rate is the percentage of customers who continue with the business. A higher retention cost means that the customer is costing the firm more, and the customer may not prove profitable at all if they do not give the firm purchases (revenue) in return. Every company wants a high retention rate. Companies like to keep retention cost low, up to the point where cutting it further would start losing customers, while keeping the retention rate high. Retention cost may vary from customer to customer; companies generally have different policies for different customer segments, resulting in a different retention cost for each segment.

2.2.3 Customer Churn Rate

Customer churn rate (also known as attrition rate) is, in its simplest terms, a measure of the number of customers ceasing their relationship with their current service provider, typically out of dissatisfaction. Customer churn is when an existing customer, user, player, subscriber or any kind of returning client stops doing business or ends the relationship with a company. This could, for example, mean cancellation of a subscription, non-renewal of a contract or service agreement, ending of a membership, or a consumer’s decision to shop at another store or use another service provider. It is the opposite of the retention rate. A high churn rate means a company needs to look for new customers or win past customers back if it intends to remain profitable. As stated earlier, acquisition costs are much higher than retention costs, so churn means more cost to the company, which is not desired.

2.2.4 Acquisition Cost

Acquisition cost is the money spent on acquiring a new customer, which could take the form of media and publicity, advertisement, direct contact, etc. A higher acquisition cost implies a higher input and a further reduction in profit. This quantity, just like retention cost, must be subtracted from the pre-calculated Customer Lifetime Value to get a better estimate of CLV. Acquisition cost varies from customer to customer: some components, such as advertising, may be the same for all customers, but other components differ from one customer to another. It is the business that decides how much to spend on acquiring a particular type of customer.

2.2.5 Profit Margin

Profit margin is one of the most popular profitability ratios for measuring the income of a business relative to the money expended. It is expressed as a percentage and shows how much money the business made per ₹100 of revenue. There are several types of profit margin, but the most commonly used one is the net profit margin, which is what the company keeps after all other expenses:

Net Profit Margin = (Net Profit / Total Revenue) × 100

Profit margin is an important factor in the calculation of CLV, as we have already seen. A higher profit margin means more profit for the business, which is desired. However, the profit margin may not be the same for every customer; for example, a customer purchasing a large bundle of a product might get an extra discount, thus reducing the margin. Similarly, telecom companies offer different plans for different types of users, each with a different profit margin for the business.

So far we have built a basic understanding of the calculation of CLV, but we have also seen that the calculation is not that simple and involves several factors which may be dynamic in nature. There are specialised models for calculating CLV. These models use different approaches, make different assumptions, consider different factors and analyse them in their own way. We will look into these models in the subsequent sections.

3 Models for Customer Lifetime Value

Many models are present in the literature for calculating CLV, each dealing with a different set of conditions and factors and having a different target outcome. On the basis of approach, they can be summarised in the following three categories:

A) Deterministic Models

Customers in deterministic models are given scores that are based on the characteristics of their previous purchases. These criteria include purchase frequency, recency, purchase amount, and so on. On the basis of these scores, customer behaviour in the next purchase period is projected, assuming that behaviour is going to remain the same, and CLV is derived from it. The RFM model, the retention model and the migration model are some of the most common models in this category.

B) Probability Models

In probability models, the behaviour of customers is analysed in terms of the stochastic processes operating in the background. These processes are defined by the observable and latent aspects of customer purchase behaviour. The fact that these qualities differ from person to person makes the approach more applicable in the real world. These models are often used when calculating CLV at an aggregate level, such as a customer cohort or the whole customer base, rather than at the individual customer level. Pareto/NBD, EP/NBD and Gamma-Gamma are the most popular models in this category. For predicting future transactions, purchase frequency and churn we use the Pareto/NBD, EP/NBD or BG/NBD model, but for predicting monetary values such as average order value, we use the Gamma-Gamma model (see here).

C) Econometric Models

In the third category, known as econometric models, the behaviour of customers is observed through variables such as customer acquisition, retention and growth (cross-selling or margin), and these components (all or a few of them) are then combined to estimate customer lifetime value. These models operate on a basis fairly similar to that of probability models. Some of the most popular models include those for customer acquisition, customer retention, customer margin and customer expansion. We will look into econometric models in more detail later.

Persistence models are an improved version of the econometric models that share the same underlying concept. In a manner analogous to the econometric models, they model behaviour on the basis of purchase components such as acquisition, retention and so on. Within the framework of persistence models, these components are regarded as dynamic systems and subjected to time-series analysis. The method examines how changes in one variable influence the others and takes those relationships into account.

On the basis of desired outcome, we can also classify the models as follows:

A) Models for calculation of CLV

In the first category are models developed specifically for the goal of calculating Customer Lifetime Value, or models that use the results of CLV calculations to develop a strategy for the most effective use of available resources in order to maximise CLV. These are a collection of applicable models used in formulating CLV-based plans and decisions. Some of the popular models in this category are the basic structural model, the customer migration model, the optimal resource allocation model and the customer relationship model; we will look at some of these in more detail.

B) Models of customer base analysis

The second category takes into account the previous purchase patterns of an organisation's entire customer base in order to forecast the probabilities of customer behaviour during the subsequent purchase period. This can be done in terms of the likelihood of a purchase being made or the predicted value of that purchase. When assessing the chance of a customer making a purchase in the subsequent time period, these models take into consideration the stochastic behaviour of consumer purchases, which implies that they account for the purchase probability of each and every client. The results of these models serve as the foundation for the computation of CLV or as its underlying theory. Some of the most common models in this category are Pareto/NBD, EP/NBD, etc.

C) Normative models of CLV

Normative models focus primarily on the problems that have an effect on CLV and, as a result, help maximise it. They include research into the effects of a variety of variables on CLV and the elucidation of guiding principles that may optimise these aspects to produce the greatest possible CLV. When calculating CLV, these models build in some of the most fundamental assumptions and beliefs, such as the notion that customers with longer lifetimes generate higher profits. The concerns with CLV are studied using normative models, which, in contrast to empirical models, do not allow for the interference of noise. The majority of these models have overlooked competition, mostly because data on the influence of competitors was lacking. There are separate models for the many components that make up CLV, such as acquisition, retention, profit margin and purchase frequency, and occasionally these aspects are merged into a single model. Some of the most popular models in this category are the customer equity model, the dynamic pricing model, etc.

We looked at the broad categorization of CLV Models, let’s have a look at some of the most popular CLV models.

3.1 Basic Structural Model of CLV

This model specifies a class of CLV models based on the Net Present Value (NPV) of the future cash flows from customers. This fundamental concept of NPV is what defines such models at their core. One of the most noticeable aspects of these models is the assumption that there is a cash flow at a fixed moment within each and every time period. The model only takes into account the customers who are currently doing business with the company; it considers neither past customers nor customers who may do business with the company in the future. In addition, it disregards the cost of acquisition as well as the probabilistic aspects of cash flow and shopping behaviour. The model is given as

CLV = Σ_{i=1..n} (R_i − C_i) / (1 + d)^i

where i is the period in which cash is generated from customer transactions; R_i is the revenue generated from customers in period i; C_i is the total cost of earning the revenue R_i in period i; d is the discount rate used to bring future cash flows to their present value; and n is the total number of periods that make up the customer's expected lifespan.
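A minimal Python sketch of this discounted cash-flow calculation is given below; the revenue and cost figures and the 10% discount rate are illustrative assumptions, not values taken from the text.

def basic_structural_clv(revenues, costs, discount_rate):
    # CLV = sum over periods i of (R_i - C_i) / (1 + d)^i
    return sum((r - c) / (1 + discount_rate) ** i
               for i, (r, c) in enumerate(zip(revenues, costs), start=1))

# Illustrative numbers only: five periods of revenue and cost, 10% discount rate
print(basic_structural_clv([600, 650, 700, 700, 750],
                           [200, 210, 220, 220, 230], 0.10))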

3.2 Customer Migration Model

According to Dwyer (1997), who introduced this concept, customers may be separated into two subgroups: always-a-share customers and lost-for-good customers. Always-a-share denotes that a consumer may maintain relationships with several companies at the same time, doing business with each company in turn in a back-and-forth cycle, or sharing his purchases among all the companies with which he is affiliated. Under lost-for-good, a client who has switched companies or ended their connection with the business is deemed lost, even if the customer had been doing business with the same company for a significant amount of time. The fundamental premise underpinning lost-for-good is that switching suppliers is expensive and difficult from a management standpoint, exactly the reverse of the perspective taken in the always-a-share category. The model we looked at before, the basic structural model for CLV, may either be used on its own or be supplemented with some changes for customers who fall into the lost-for-good category. For the second category, always-a-share, we have to assume that a client is never truly lost. For this category we have a customer migration model that uses a customer's most recent purchase history to forecast that customer's behaviour towards future purchases. It only indicates whether or not the buyer will make a purchase during the next period, but does not talk about the purchase amount. This model may be combined with a few other models, covered in more detail later on, that estimate the anticipated purchase amount in order to arrive at an accurate estimate of CLV.

The fact that this model takes into consideration the stochastic character of buying behaviour (in the form of recency), which many simpler models do not, is a noteworthy benefit. Dwyer's model has other limitations as well, some of which are similar to those of the basic structural model, and it also assumes that the purchase occurs at the same instant in each time period, which presupposes that all time periods are identical.

3.3 Customer Retention Model

Customer retention models are deceptively simple in that they account for the basic features of CLV and represent them very concisely. We use the term Discounted Expected Transactions (DET), first proposed by Calciu and Salerno, which represents the CLV of transactions in which the monetary gain (profit margin) is one hundred percent. We can therefore define CLV as the product of DET and the monetary gain, which in the typical situation does not equal 1. This model is one of the deterministic models.

CLV = g × DET

where g is the monetary gain. For a just-acquired customer, CLV is given by

CLV = Σ_{t=1..∞} g · r^t / (1 + d)^t

where g represents the net gain or margin, d represents the discount rate, and r represents the retention rate discussed before. After some streamlining, the model for CLV illustrated above may be written as:

CLV = g · r / (1 + d − r)

The fraction obtained by dividing r by (1 + d − r), i.e. r / (1 + d − r), is referred to as the margin multiple. It has been assumed that the gain would stay unchanged with time, but if the gain has to rise at a constant rate of q, then the margin multiple will be r / (1 + d − r(1 + q)). When estimating CLV for groups of customers, deterministic models neglect the discrete probabilistic character of consumers, which is one key shortcoming of such models, despite the fact that they perform quite well for management purposes.

Earlier on, we made the assumption that the retention rate would stay the same, but that is not the case in practice. Businesses can modify retention by altering either their retention budget or their retention policy. So now r, instead of remaining constant, becomes a function of these variables, given as f(R). Our model would then change to:

CLV = (m − R / f(R)) · f(R) / (1 + d − f(R))

where (m − R / f(R)) is the gain g, which is the margin minus the average retention cost, and R is the retention spending per customer.
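Under the simplified margin-multiple form, the calculation is a one-liner in Python; the margin, retention rate and discount rate used below are purely illustrative.

def retention_model_clv(g, r, d, q=0.0):
    # CLV = gain * margin multiple, with multiple = r / (1 + d - r(1 + q))
    return g * r / (1 + d - r * (1 + q))

print(retention_model_clv(g=500, r=0.80, d=0.10))          # constant gain
print(retention_model_clv(g=500, r=0.80, d=0.10, q=0.02))  # gain growing 2% per period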

3.4 Pareto/Negative Binomial Distribution Model

When calculating CLV, it is of the utmost importance to have a comprehensive understanding of the client base, which includes both current customers and potential new consumers. The Pareto/Negative Binomial Distribution (Pareto/NBD) model is one of the most well-known probabilistic models for addressing CLV. It was first proposed by Schmittlein, Morrison and Colombo. It gives us an estimate of the likelihood that a consumer is still active. This approach has been used by commercial enterprises to size their customer base, i.e. the company's active clientele.

The forecast is made on the basis of the purchase history of the consumer, taking into consideration both the recency and the frequency of their purchases. This model may not only be used to analyse the established client base, but can also be used to forecast the expansion of the customer base. Unlike the models mentioned previously, this model takes into consideration the stochastic character of the client: a customer may make a purchase at any moment in time and can become inactive or active at any time. The Pareto/NBD model is expressed in terms of the following quantities:

r, s, α and β are the model parameters; t is the time since trial at which the most recent transaction occurred; T is the time since trial; x is the number of purchases made in the time period T; and F(·) is the Gauss hypergeometric function.

Reinartz and Kumar made several modifications to the Pareto/NBD model in order to provide a framework that can be used in non-contractual contexts for determining the customer lifetime profitability pattern; the resulting model is called the Extended Pareto/NBD (EP/NBD) model. A non-contractual situation is one in which the client has complete power over purchasing decisions and the company has no say in the matter. They chose a discrete output, in the form of an Alive/Dead classification, rather than the continuous probability stream produced by the Pareto/NBD model.

This model uses information about when the relationship with the customer first began, together with the customer's purchasing patterns, to generate a probability threshold. Using this threshold, the researchers determined when the client would no longer be profitable to the company. The model also provides information on the lifespan of the client, expressed as the difference between the expected time of "death" and the time of "birth". These models may be used in scenarios where the analyst is unaware of the client's period of inactivity and the customer is free to make any number of purchases at any time, as well as to become inactive at any time. In several experiments on finding the best-suited models for online retail stores, the EP/NBD model has outperformed other probabilistic models on a majority of evaluation metrics and can be considered good and stable for non-contractual relationships in online shopping. It should be emphasised, however, that for models such as Pareto/NBD or its modified versions to perform, a lengthy stream of data is necessary; transaction statistics for at least three years are believed to be essential. This should be kept in mind while using these models. The two models discussed above are quite complex, which makes them challenging to use in real-world settings. Additionally, when the number of clients in the customer base grows and the average amount each customer spends declines, these models become less effective, since they consider each customer individually to determine individual purchase probabilities.
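As an illustration of how such probabilistic models are applied in practice, the sketch below uses the open-source Python lifetimes package, pairing a BG/NBD model for transaction frequency with a Gamma-Gamma model for monetary value; the file name, column names and the 12-month horizon are assumptions made for the example.

import pandas as pd
from lifetimes import BetaGeoFitter, GammaGammaFitter
from lifetimes.utils import summary_data_from_transaction_data

transactions = pd.read_csv("transactions.csv")      # hypothetical transaction log
summary = summary_data_from_transaction_data(
    transactions, customer_id_col="CustomerID",
    datetime_col="InvoiceDate", monetary_value_col="Amount")

# BG/NBD model for purchase frequency and the probability of still being "alive"
bgf = BetaGeoFitter(penalizer_coef=0.001)
bgf.fit(summary["frequency"], summary["recency"], summary["T"])

# Gamma-Gamma model for the average order value of returning customers
returning = summary[summary["frequency"] > 0]
ggf = GammaGammaFitter(penalizer_coef=0.001)
ggf.fit(returning["frequency"], returning["monetary_value"])

# 12-month CLV per customer, discounted at 1% per month (illustrative settings)
clv = ggf.customer_lifetime_value(
    bgf, returning["frequency"], returning["recency"], returning["T"],
    returning["monetary_value"], time=12, discount_rate=0.01)
print(clv.head())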

3.5 Customer Equity Model

In the previous section we looked at a model that helps us better understand a company's customer base; the Customer Equity Model, established by Blattberg and Thomas, provides the customer equity value for newly acquired consumers. With the assistance of this model, a firm is able to do an in-depth analysis of the influence that marketing has, element by element, on the value of the company's client base over the course of time. The authors claim that the model provides a way to assess the accuracy of various qualitative claims that are often made about customer relationship marketing. According to this approach, a company's customer equity is determined as described below.

The model states that the customer equity is equivalent to the profit made by acquiring new customers or the profit made from first-time customers minus the cost of acquisition, plus the profit made from future sales to these customers divided by the discount rate, with the total for all of the segments being added up.

3.6 Econometric Models

The term “Econometric Models” refers to a class of models or a collection of models that adhere to the same fundamental principles as Probabilistic Models like Pareto/NBD. In these models, the components that make up CLV, such as customer acquisition, customer retention, profit margin, and customer growth, are modelled and investigated individually or in combination with one or more of the aforementioned components.
Some of these models are:

3.6.1 Customer Acquisition

Customer acquisition is the process of gaining a new customer or regaining a former client. Logit and probit models are often used in the estimation of customer acquisition. The model is presented as

Z_jt = α_j X_jt + ε_jt

where Z_jt refers to the acquisition outcome for customer j at time t, X_jt are the covariates, α_j are the customer-specific response parameters, and ε_jt is the error term, whose assumed distribution determines whether the model is a logit or a probit.

3.6.2 Customer Retention

The term customer retention refers to the likelihood of an existing client making more purchases from the same company in the future. In contractual relationships, such as those between a customer and a telecom company or a magazine publisher, the customer notifies the company before ending the relationship, so the company already observes the retention outcome. In non-contractual relationships, such as those between a customer and an online retail store, the company must instead predict whether or not a customer has churned. Studies and models have shown that customer acquisition and customer retention are not independent processes; rather, there is a strong correlation between the two, and modelling it has proven profitable for businesses. Surveys have also found that substantial price cuts, which are quite advantageous for attracting new consumers, do not seem particularly effective for pre-existing clients. This is connected to customer satisfaction: during acquisition the customer received the product at a lower price, and once the acquisition discount was removed the price went up, so the customer experienced a price hike and may not wish to continue with the service. We have already seen a customer retention model in one of the previous sections, so we will skip it here and move to the next component.

3.6.3 Customer Margin and Expansion

The margin created by each client during period t is another component of customer lifetime value. The profit that is earned depends not only on the purchasing behaviour of the firm's customers but also on how effectively the company up-sells and cross-sells its products, that is, on the selling behaviour of the company. Customer margin may be modelled in two general ways: the first strategy models the margin directly, while the second models cross-selling in order to understand the margin. One of the most frequent assumptions made when modelling customer margin is that the margin remains constant over time, something we have already seen in several of the models presented above. Venkatesan and Kumar, on the other hand, proposed a model for the margin based on linear regression which does not make this assumption and instead estimates how the margin changes over time. The model is presented as follows:

ΔCM_jt = α_j X_jt + ε_jt

where ΔCM_jt is the change in margin for customer j at time t, α_j are the customer-specific response parameters, ε_jt is the error term and X_jt are the covariates.

4 Limitation of CLV Models

All of the models that have been suggested for determining CLV have some shortcomings. These constraints come in the form of limits on the quantity of cash flow from a client, the timing of cash flow, the sort of company to which the model is relevant, the type of data that is required, and so on and so forth. The scope of research in CLV modeling has been narrowed down to a few distinct variants of the fundamental model, the Pareto/NBD model, and the Markov chain model. Their application is severely restricted due to the restrictive assumptions made, the complexities involved, and the limited applicability. When estimating CLV, practitioners often rely on models that are quite fundamental. It is difficult to find empirical validation for many CLV models. There is a need for research on estimating systems that may produce estimates that are reliable, consistent, and objective.

Most CLV models do not contain demographics and product consumption factors. There is a need for research to enhance such models so that they can include such factors. There is a need for more research to build new models for various product categories or to alter current models so that they may be used for other product categories. The currently available CLV models do not take into account the purchasing decision made by the customer. It is possible to incorporate into CLV models factors such as those that drive consumers to make purchases, the effect of marketing activities on consumers, and factors that influence consumers to purchase products from the same company on multiple occasions, such as switching costs and the influence of marketing.

There are hardly any studies that combine the concepts of client acquisition and customer retention into a single model. Because the cost of client acquisition plays a significant role in calculating the net profit that may be made from a single customer, it is preferable to use models that take into account both customer acquisition and customer retention. There is a need for more study on such models.

In order to correctly anticipate CLV based on consumption patterns in the past and estimations of CLV
from the past, further study is required.

5 Using Machine Learning to predict Customer Lifetime Value

5.1 Approaches for modeling CLV

There are two broad approaches for modelling Customer Lifetime Value:

A) Historical Approach: This framework makes use of data from the past to calculate Customer Lifetime Value. Its primary objective is to determine the customer value for the period during which the customer has already been in a relationship with the organisation. These models are all based on the same fundamental idea, which is

Historical CLV = (Sum of all transactions) × (Average gross margin)

Models which use the historical approach can be divided into the following two categories:

1. Aggregate Model: Computing the CLV by making use of the typical revenue generated by each client based on previous purchases. We only get one value back for the CLV when we use this approach.

2. Cohort Model: Dividing the consumers into distinct cohorts, for example according to the date of their most recent transaction, and determining the average revenue generated by each cohort. With this procedure we obtain a CLV value for each cohort. A small pandas sketch of both historical variants is shown below.
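A rough pandas sketch of the aggregate and cohort variants might look like the following; the file name, the column names (CustomerID, InvoiceDate, Amount) and the 20% gross margin are assumptions made only for illustration.

import pandas as pd

df = pd.read_csv("transactions.csv", parse_dates=["InvoiceDate"])   # hypothetical file
gross_margin = 0.20                                                 # assumed margin

# Aggregate model: a single historical CLV figure for the whole customer base
revenue_per_customer = df.groupby("CustomerID")["Amount"].sum()
aggregate_clv = revenue_per_customer.mean() * gross_margin

# Cohort model: group customers by the month of their most recent purchase
last_purchase_month = df.groupby("CustomerID")["InvoiceDate"].max().dt.to_period("M")
cohort_clv = revenue_per_customer.groupby(last_purchase_month).mean() * gross_margin

print(aggregate_clv)
print(cohort_clv.head())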

B) Predictive Approach: This is an algorithmic method that analyses a customer's transaction history and behavioural patterns to calculate the customer's present value and anticipate how that value will change over time. The estimate grows more accurate as the customer makes more purchases and interacts with the business, and it is a superior technique for calculating customer lifetime value (CLV). There are five steps in the calculation of predictive CLV:

Step 1: Calculating Average Order Value

This is the revenue generated by an average order. To calculate the average order value, first pick a time period that is a good representation of customer purchase behaviour, such as one or two years.

Average Order Value = Total Revenue / Total Number of Orders

Step 2: Calculating Average Purchase Frequency

This represents how frequently the customer makes purchases from the business. A customer with a high purchase frequency is termed a loyal customer.

Average Purchase Frequency = Total Number of Orders / Total Number of Customers

Step 3: Calculating Customer Value

Customer value represents how much revenue(or profit) is generated by an average customer in the
given time period.

Customer value (CV) = Average order value × Average purchase frequency

Step 4: Calculating Average Customer Lifespan

Customer’s Lifespan is the duration between the customer’s first purchase and the last purchase, before
ending their relationship with the firm.

Average Customer Lifespan = Sum of Customer Lifespans / Total Number of Customers

Average Customer Lifespan can also be given as 1/(Churn Rate)

Step 5: Calculating Customer Lifetime Value

As we have already seen, customer lifetime value is given as

Customer Lifetime Value (CLV) = Customer Value × Average Customer Lifespan × Margin

all of which we have calculated earlier and Margin is given as

Margin = Total Profit / Total Revenue
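Putting the five steps together, a simple pandas version could look like this; the file name, the column names and the one-year observation window are assumptions, and the 20% margin stands in for total profit divided by total revenue.

import pandas as pd

df = pd.read_csv("transactions.csv", parse_dates=["InvoiceDate"])   # hypothetical file
df = df[df["InvoiceDate"] >= df["InvoiceDate"].max() - pd.DateOffset(years=1)]

total_revenue = df["Amount"].sum()
n_orders = df["InvoiceID"].nunique()
n_customers = df["CustomerID"].nunique()

avg_order_value = total_revenue / n_orders                 # Step 1
avg_purchase_frequency = n_orders / n_customers            # Step 2
customer_value = avg_order_value * avg_purchase_frequency  # Step 3

# Step 4: mean span between each customer's first and last purchase, in years
span = df.groupby("CustomerID")["InvoiceDate"].agg(["min", "max"])
avg_lifespan_years = ((span["max"] - span["min"]).dt.days / 365).mean()

margin = 0.20                                              # assumed profit margin
clv = customer_value * avg_lifespan_years * margin         # Step 5
print(clv)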

Predictive models can also be categorised into the following two sets:

1. Probabilistic Model: Makes an estimate of the future count of transactions as well as the monetary value associated with each transaction by attempting to fit the data into a probability distribution.

2. Machine Learning Model: Using Machine Learning techniques such as regression, clustering, etc on past data to predict CLV for upcoming future.

5.2 Modeling Machine Learning Algorithm

5.2.1 Dataset

To use any sort of machine learning or analysis, we first need a proper dataset. Some of the essential fields in the dataset would be InvoiceID, CustomerID, ProductID, Quantity, Amount and InvoiceDate. Beyond that, columns such as the order address or buyer descriptions can also be helpful when performing demographic analysis.

For example, an open-source retail dataset containing all the transactions occurring during one year for a UK-based, registered, non-store online retailer looks like this:

(Table showing the dataset's columns and a few sample rows.)

5.2.2 Data-Preprocessing

The next essential step is data preprocessing, where the dataset is cleaned and transformed to make it useful for the purpose at hand. It involves reducing noise and redundancy. Some of the important tasks in this step are: 1. adding checks to verify the data points and removing records with inconsistent information; 2. removing noise and duplicates; and 3. converting datatypes.
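A minimal preprocessing pass over such a transactions table might look like this (assuming the column names listed earlier):

import pandas as pd

df = pd.read_csv("transactions.csv")   # hypothetical file

# Consistency checks: drop rows with missing IDs/dates or non-positive values
df = df.dropna(subset=["CustomerID", "InvoiceDate", "Amount"])
df = df[(df["Quantity"] > 0) & (df["Amount"] > 0)]

# Remove exact duplicate records (redundancy)
df = df.drop_duplicates()

# Convert datatypes
df["InvoiceDate"] = pd.to_datetime(df["InvoiceDate"])
df["CustomerID"] = df["CustomerID"].astype(str)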

5.2.3 Feature Selection and Feature Engineering

Feature selection is very important for reducing computation and time and improving model efficiency. Our dataset may contain features that are of no use to us; including them in the model unnecessarily increases its load and slows it down. Similarly, there may be latent features that are not directly visible, being a combination of more than one feature or a transformation of a feature; such features can be incorporated into the dataset using feature engineering. Both feature selection and feature engineering involve understanding how the features correlate with one another and with the target.

5.2.4 Customer Segmentation

To efficiently and responsibly advertise to each group, customer segmentation separates consumers into segments or groups based on their shared characteristics. Additionally, it helps in raising the retention rate, which in turn aids in raising the Customer Lifetime Value. Calculating RFM, or recency, frequency, and monetary value, is one of the most helpful segmentation approaches. The customer base can be divided into several groups or segments based on the revenue generated as well. This will allow us to understand each group separately and the effect of different parameters on each group separately.
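For example, RFM scores and a simple quartile-based segmentation can be computed as follows; the column names are the ones assumed earlier and the scoring scheme is just one of many possible choices.

import pandas as pd

df = pd.read_csv("transactions.csv", parse_dates=["InvoiceDate"])
snapshot = df["InvoiceDate"].max() + pd.Timedelta(days=1)

rfm = df.groupby("CustomerID").agg(
    recency=("InvoiceDate", lambda x: (snapshot - x.max()).days),
    frequency=("InvoiceID", "nunique"),
    monetary=("Amount", "sum"))

# Quartile scores: 4 is best (most recent, most frequent, highest spend)
rfm["R"] = pd.qcut(rfm["recency"], 4, labels=[4, 3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["frequency"].rank(method="first"), 4, labels=[1, 2, 3, 4]).astype(int)
rfm["M"] = pd.qcut(rfm["monetary"], 4, labels=[1, 2, 3, 4]).astype(int)
rfm["segment_score"] = rfm["R"] + rfm["F"] + rfm["M"]
print(rfm.head())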

5.2.5 Modeling CLV

This is the most crucial step, which involves finding and training a suitable model for Customer Lifetime Value. Since all the quantities involved in the calculation of CLV, such as average order value, average purchase frequency and average customer lifespan, are numerical, a regression model might suit our needs. Given the dataset for the past year or two, we can apply a regression model to predict each of these quantities independently and then combine them to calculate CLV, or combine them in a single suitable model and apply regression to that. One important thing to keep in mind when using a predictive model is whether a particular customer is expected to make any purchase in the next period at all; only customers who are going to remain involved with the business should contribute to the predicted CLV.

Regression was just an example; several other models are possible, such as XGBoost, LightGBM, gradient boosting, etc. It is always good practice to split the dataset into a training set and a test set, train the model only on the training set, and evaluate it on the test set. Sometimes a cross-validation set may also be used to further improve the model statistics. Also, if you are not sure which model would fit better, it is good practice to test various models and compare their results. The train-test split sizes can also be altered to improve the model's performance.
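As one concrete (and deliberately simple) sketch, the snippet below assumes a feature table X, with one row per customer holding engineered features such as the RFM scores above, and a target y holding each customer's revenue in a later holdout period; it fits a gradient-boosting regressor from scikit-learn.

from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor

# X: per-customer feature table (e.g. recency, frequency, monetary, tenure)
# y: revenue or profit generated by each customer in the following period
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)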

5.2.6 Model Performance and Evaluation

After modelling CLV, it is good to check how well the model(s) have performed. This involves testing the model on the test set and looking at various evaluation metrics, such as accuracy, F1-score and recall for classification-style outputs, or R² and mean absolute error for regression. It is obviously not possible to check whether the CLV calculated for a future time period is correct, but we can train the model on an earlier window, hold out a recent period, such as the last 6 months, and then see how well the model predicts the CLV for those 6 months; this also serves as a measure of model performance.
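Continuing the modelling sketch above, evaluation on the held-out customers (or a held-out time window) could be as simple as:

from sklearn.metrics import mean_absolute_error, r2_score

print("R2 :", r2_score(y_test, y_pred))
print("MAE:", mean_absolute_error(y_test, y_pred))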
For a detailed paper on machine learning modelling of CLV, please check this paper.
References

[1] D. Jain and S. S. Singh, “Customer lifetime value research in marketing: A review and future directions,” Journal of Interactive Marketing, vol. 16, no. 2, pp. 34–46, 2002.
[2] S. Sharma, “Customer lifetime value modelling,” 2021.
[3] M. Calciu, “Deterministic and stochastic customer lifetime value models. Evaluating the impact of ignored heterogeneity in non-contractual contexts,” Journal of Targeting, Measurement and Analysis for Marketing, vol. 17, no. 4, pp. 257–271, 2009.
[4] S. Gupta, D. Hanssens, B. Hardie, W. Kahn, V. Kumar, N. Lin, and N. Ravishanker, “Modeling customer lifetime value,” Journal of Service Research, vol. 9, pp. 139–155, 2006.
[5] P. D. Berger and N. I. Nasr, “Customer lifetime value: Marketing models and applications,” Journal of Interactive Marketing, vol. 12, no. 1, pp. 17–30, 1998.
[6] W. J. Reinartz and V. Kumar, “On the profitability of long-life customers in a noncontractual setting: An empirical investigation and implications for marketing,” Journal of Marketing, vol. 64, no. 4, pp. 17–35, 2000.
[7] R. C. Blattberg and J. S. Thomas, “Dynamic pricing strategies to maximize customer equity,” Unpublished manuscript, Northwestern University, Evanston, IL, 1997.
[8] D. C. Schmittlein, D. G. Morrison, and R. Colombo, “Counting your customers: Who are they and what will they do next?,” Management Science, vol. 33, no. 1, pp. 1–24, 1987.
[9] J. S. Thomas, R. C. Blattberg, and E. J. Fox, “Recapturing lost customers,” in Perspectives on Promotion and Database Marketing: The Collected Works of Robert C. Blattberg, pp. 229–243, World Scientific, 2010.
[10] P. Jasek, L. Vrana, L. Sperkova, Z. Smutny, and M. Kobulsky, “Modeling and application of customer lifetime value in online retail,” Informatics, vol. 5, p. 2, MDPI, 2018.
[11] S. Chen, “Estimating customer lifetime value using machine learning techniques,” in Data Mining, vol. 17, 2018.
How We Reduced 99.8% Load Time For A Tableau Workbook With A Multiple Sheet Filter?

In this article, we will show you how to speed up the loading of your Tableau workbook when you are dealing with a huge dataset.

The Workbook Loaded in 5 Minutes and 19 Seconds

We were dealing with a dataset that has 27 million rows. This dataset has information about movies, their genres and ratings.

We made three visualizations-

count of movie

Image 1

count of rating

Image 2

avg rating

Image 3

These vizs have the Titles filter in common.

genres by movies

The last two vizs are made using fields from the Ratings dataset, which is a huge dataset with 27 million rows. The first viz is made using the Movies dataset, which is relatively small.

Now, when we apply a filter on any of the sheets, the workbook takes 5 min 19 sec to load. The reason is that the filter, being common, is applied to all the selected sheets, and since the dataset is huge, the vizs take time to load.

Solution

We changed the filter setting and applied the Titles filter individually to the worksheets. Now, if we apply a filter to any of the sheets, the other vizs stay unaffected.

The same workbook now takes 0.65 sec to load, that is, 99.8% less time.

time comparison

To Conclude

When you are dealing with a huge dataset and multiple visualizations that share common filters, applying the filters individually to each worksheet works better than applying them to all the selected worksheets at once. It reduces the load time and improves performance significantly.

For more details visit our website.

How we reduced 99.6% in load time of a Tableau Workbook with 112 million string Calcs?

String calculations take up a lot of processing power. When you have lots of string calcs, it can slow down a dashboard.

In this article, we will show you how we reduced the time by 99%.

You can also watch this video to see how we did it!

The Dataset had 28 million Rows and 112 million Calculations

It was a movie review dataset.

We extracted the year in which the movie was released, which was embedded in the movie title, e.g. ‘The Shawshank Redemption (1994)’, with even multiple parentheses in some cases.

We wrote 4 new calculated fields, using string functions, to extract the movie year. An example calculation is shown below:

million cal

So, it has 112 million calcs (4 row level string calcs * 28 million rows).

The Viz Loaded in 7.9 Minutes!

A simple viz that shows #movies and #reviews by movie year, took 7.9 mins to load!

dashboard

We feel dashboards should be fast enough that they don’t delay the speed of our thought process and our ability to derive insights from data.

So, we needed to do way better.

First Solution: We Moved the String Calcs to Data Source

To start with, we wanted to try taking the load off Tableau and do the calculations in the database itself.

So, we computed the ‘Movie Year’ field in MS SQL server and saved it to the data source.

Now, the same viz took 1.44 min to load in Tableau with a Live connection – that is 82% lesser time!

calc in workbook

But we wanted to improve it further.

Second Solution: We ‘Materialized’ the String Calcs in Hyper Extract

As we did not need real-time data, we took the initial workbook (5.5 mins) and created an Extract for it. This materialized the calcs on it.

What does materializing a calc mean, in Tableau? It means two things:

  1. Pre-calculating the calculation results:

You’re shifting the processing in Tableau to a time when a user isn’t waiting.

2. Storing those results on the Extract:

Tableau can use those results to be faster, instead of computing every time a query is made with that respective calculation.

It can be done by right clicking on the datasource > Extract > Compute Calculations Now.

With that done, the viz just took 4.65 seconds to load – a 99% reduction in time!

calc in hyper extract

4.65 sec still feels a bit slow, right?

Third Solution: Leverage the Power of Tableau Hyper Extracts

We wanted to see how much would be the viz load time, with an extract of the data source that already has ‘Movie Year’ field on it.

In this case, the viz took just 1.8 seconds to load – a 99.6% reduction in time!

calc

This shows that Hyper Extract is better optimized to handle Tableau’s queries than other databases.

To Conclude

Tableau dashboards can perform better when you:

  1. Move your string calcs to data source if you need real-time data.
  2. Materialize your string calcs on Hyper Extract if you don’t need a live connection.
  3. Move your string calcs to data source and use Hyper Extract simultaneously.
How to Create Groups Efficiently in Tableau?

A great business dashboard combines high performance and ease of use.

In this article, we will show you how to efficiently create groups in Tableau.

You can also watch this video to see how we did it!

We Created Groups with Native Feature (28 Mn Rows)

It was a movie review dataset.

We wanted to see the average rating of a selected few movies against the rest of them.

We created groups using Tableau’s native feature as shown below:

grouped titles

The Viz Loaded in 2 Minutes 51 Seconds!

A simple viz that shows the avg reviews of grouped movies, took 2 min 51 secs to load! 

grouped bar chart

The built-in group feature loads the entire domain of the dimension. Thus, it takes time.

As this is too long, we wanted to improve it.

We used CASE Statement to Create Groups

We created a Calculated Field using CASE Statement to do the grouping.

It only loads the named members of the domain. Thus, it would be faster.

else condition

Now, the same viz took 1 min 40 sec to load in Tableau with a Live connection – that is 42% less time!

native group

The load time can be still reduced further if we leverage the power of Tableau Extracts as explained here.

To Conclude

Groups created with CASE Statements would perform better compared to the native ‘Create > Group’ feature in Tableau.

How we Reduced 98.9% Load Time for a Tableau Viz that used Multiple OR Conditions?

If you don’t write efficient calcs, it can slow down your dashboard. 

In this article, we will show you how to speed up OR statements with multiple conditions in Tableau.

You can also watch this video to see how we did it!

The Workbook Loaded in 41 Seconds

The dataset had information about flights.

We wrote a simple calculation to compare a value against a list of values:

OR Calc

The workbook took 41 sec to load.

The above calculation has a series of conditions which get evaluated one by one.

The more conditions you have, the more time it takes.

The more rows (data) you have, the more time it takes.

We Modified the Calculation to use ‘IN’ Function

IN statement directly compares a dimension against a list – much simpler & faster.

So, we modified the calc to use IN statement instead of OR statement.

IN Calc

Now, the same workbook took 29 sec to load in Tableau – that is 29% less time.

OR statement

To improve further, we switched to an Extract connection.

IN statement

Then, it took just 0.43 sec – 98.9% less time!

To Conclude

When you want to compare a value against a list of values, the IN function performs better than a chain of OR conditions.

Recreating John Snow’s Viz in Tableau


London cholera outbreak slides

London cholera outbreak viz Tableau workbook

Data prep python code

Tableau Web Data Connector for FACTSET

The project was to create Tableau Dashboards for visualizing financial information of different stocks/tickers that would be used by Financial Analysts. The data had to be obtained through an Application Programming Interface (API) of a third-party service provider called FactSet. The data had to be obtained on the fly as and when the dashboard user inputs a specific ticker.
What is a Web Data Connector?
A Web Data Connector (WDC) is a data connection option in Tableau that can be used to fetch information from web. A WDC is a web page with HTML, CSS & JavaScript. Whenever a user inputs a ticker, the WDC would take that ticker and make multiple AJAX calls to the third-party API asking for the required data. The third-party API’s servers would check authorization & authentication. If successful, they return the requested data in JSON / XML formats. The WDC parses the received JSON / XML information and transforms them into a tabular structure which is submitted to Tableau. Tableau then receives this information and populates the dashboard with updated data.
Project Experience & Learning

Part I: To build the dashboards in Tableau, we built a WDC that can fetch the required data from FactSet’s servers. While developing the WDC, we used JavaScript extensively to get the ticker input by the user, frame the appropriate URL, make the AJAX calls, receive the information, and parse the response and transform it into the required shape so that it can be given to Tableau. HTML and CSS were used to build the user interface of the WDC where the Tableau user can input the ticker. We dealt with Cross Origin Resource Sharing (CORS) limitations while fetching the required data; to overcome CORS, we used a proxy server to route our requests.

Part II: We built multiple dashboards to display the fetched data in a way useful for the analyst to understand the trading history of the stock/ticker, look into its financials, look into the broker estimates for future & make buy/hold/sell decision.

How to Get in touch
You can reach out to us by emailing cs AT perceptive-analytics.com
Basic Statistics in Tableau: Correlation

Statistics in Tableau

Data in the right hands can be extremely powerful and can be a key element in decision making. American statistician W. Edwards Deming is quoted as saying, “In God we trust. Everyone else, bring data.” We can employ statistical measures to analyze data and make informed decisions. Tableau enables us to calculate multiple types of statistical measures like residuals, correlation, regression, covariance, trend lines and many more.

Today, let's discuss how correlation and causation are often confused, and how to explore correlation using Tableau.

Correlation and Causation

Correlation is a statistical measure that describes the magnitude and direction of a relationship between two or more variables.

Causation shows that one event is the result of the occurrence of another event, which demonstrates a causal relationship between the two events. This is also known as cause and effect.

Types of correlation:

  1. Positive correlation → coefficient close to +1.
  2. Negative correlation → coefficient close to -1.
  3. No correlation → coefficient close to 0.

Why are correlation and causation important?

The objective of analysing data is to identify the extent to which one variable relates to another.

Examples of Correlation and Causation

  1. Vending machines and obesity in schools: people gain weight from junk food, and one important source of junk food in schools is vending machines. So if we remove vending machines from schools, obesity should fall, right? Actually, it doesn't. Research shows that children who move from schools without vending machines to schools with vending machines don't gain weight. There is a correlation between children being overweight and eating junk food from vending machines, but the supposed "causal" lever (removing vending machines from schools) has a negligible effect on obesity.
  2. Ice cream sales and temperature: if we observe ice cream sales and temperature over the summer, we find a strong correlation between them, and in this case the relationship is also causal: as temperature rises, ice cream consumption rises. Understanding the difference between correlation and causation allows people to interpret data more carefully.

Now let's explore correlation using Tableau. We are going to use the Orders table from the Superstore dataset that ships with Tableau by default.

Before going further, let's understand how the correlation coefficient r is calculated. For n paired observations it is

r = 1/(n - 1) × Σ [ (xi - x̄) / sx ] × [ (yi - ȳ) / sy ]

where x̄ and ȳ are the means of the two variables and sx and sy are their standard deviations.

We can easily understand this formula by breaking it into pieces.

In Tableau, we can represent the 1/(n - 1) term as 1/(SIZE()-1), where SIZE() is a table calculation function that returns the number of rows in the partition.

We can use the WINDOW_SUM function for the summation (Σ) in Tableau.

Here xi is the sum of profit for each mark and x-bar is the mean of profit, which is the window average of the sum of profit, while sx is the standard deviation of profit. That means we need to subtract the mean from the sum of profit and divide the result by the standard deviation:

(SUM([Profit]) - WINDOW_AVG(SUM([Profit]))) / WINDOW_STDEV(SUM([Profit]))

The corresponding piece for sales is the same formula with profit swapped for sales:

(SUM([Sales]) - WINDOW_AVG(SUM([Sales]))) / WINDOW_STDEV(SUM([Sales]))

Now we combine all these pieces to get the value of the correlation coefficient r. Be careful with parentheses or you may face errors. Here is our final formula to calculate r:

1/(SIZE()-1) *
WINDOW_SUM(
    ((SUM([Profit]) - WINDOW_AVG(SUM([Profit]))) / WINDOW_STDEV(SUM([Profit]))) *
    ((SUM([Sales]) - WINDOW_AVG(SUM([Sales]))) / WINDOW_STDEV(SUM([Sales])))
)
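For readers who want to sanity-check the calculated field outside of Tableau, here is a minimal sketch of the same sample correlation computation written in TypeScript; the numbers passed to it are hypothetical and would come from your own data.

// Sample Pearson correlation coefficient, mirroring the Tableau formula above.
function pearsonR(x: number[], y: number[]): number {
  const n = x.length;
  const mean = (a: number[]) => a.reduce((s, v) => s + v, 0) / a.length;
  const stdev = (a: number[], m: number) =>
    Math.sqrt(a.reduce((s, v) => s + (v - m) ** 2, 0) / (a.length - 1)); // sample standard deviation
  const mx = mean(x), my = mean(y);
  const sx = stdev(x, mx), sy = stdev(y, my);
  // 1/(n - 1) times the sum of products of standardized values.
  let sum = 0;
  for (let i = 0; i < n; i++) sum += ((x[i] - mx) / sx) * ((y[i] - my) / sy);
  return sum / (n - 1);
}

// Example: correlation between hypothetical per-category profit and sales figures.
console.log(pearsonR([120, 80, 200, 150], [1000, 700, 1600, 1200]));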

Let’s implement this in Tableau to see how it works. Load superstore data into Tableau before getting started.

After loading the superstore excel file into Tableau, examine the data in the orders sheet. You can see that it contains store order details complete with sales and profits. We will use this data to find correlation between profit and sales.

Let’s get our hands dirty by making a visualization. Go to sheet1 to get started. I made a plot between profit and sales per category.

Now in order to find the correlation between profit and sales, we need to use our formula to make a calculated field which serves our purpose.

Now drag and drop our calculated field onto the Color shelf, and make sure the table calculation is computed using Customer Name, since that is the field we are using on Detail.

Here we can see the strength of the relationship between profit and sales per category; the darker the color, the stronger the correlation.

Next we’ll add trend lines to determine the direction of forecasted sales.

These trend lines help demonstrate which type of correlation (positive, negative or zero correlation) there is in our data. You can explore some more and gain additional insights if you add different variables like region.

From this analysis we can understand how two or more variables are correlated with each other. We begin to understand how each region’s sales and profits are related.

Let’s see how a correlation matrix helps us represent the relationship between multiple variables.

A correlation matrix is used to understand the dependence between multiple variables at the same time. Correlation matrices are helpful for spotting relationships across many variables or commodities at once, and they are widely used in market basket analysis.

Let’s see how it works in Tableau. Download the “mtcars” dataset from this link. After downloading it, connect it to Tableau and explore the dataset.

The dataset has 32 rows and 11 variables, where each row represents one model of car and each column represents an attribute of that car.

Variables present in dataset:

Mpg = Miles/gallon.

Cyl = Number of Cylinders.

Disp = Displacement (cubic inches)

Hp = Gross Horsepower

Drat = Rear axle ratio

Wt = Weight (1000 lbs)

Qsec = ¼ mile time

Vs = Engine shape (0 = V-shaped, 1 = straight)

Am = Transmission (0 = automatic, 1 = manual)

Gear = Number of forward gears

Carb = Number of Carburetors

Let's use these variables to make our visualization. I built a visualization showing the correlations among these variables by following Bora Beran's blog article, in which he explains how to build this kind of correlation view; it is a good way to learn more about exploring correlation in Tableau.

Conclusion

We must keep in mind that when we want to measure the linear dependence between two variables, correlation is a simple and effective way to do it. A correlation value always lies between -1 and 1, and the closer its absolute value is to 1, the stronger the relationship. We must also remember that correlation is not causation, something many people misunderstand. There are many more relationships and insights that can be unlocked from this dataset, so explore further by experimenting with it in Tableau. Practice makes perfect.

Tableau for Marketing: Become a Segmentation Sniper https://www.perceptive-analytics.com/tableau-marketing-become-segmentation-sniper/ https://www.perceptive-analytics.com/tableau-marketing-become-segmentation-sniper/#respond Wed, 29 Aug 2018 09:00:07 +0000 https://www.perceptive-analytics.com/?p=3070 Did you know that Netflix has over 76,000 genres to categorize its movie and tv show database? I am sure this must be as shocking to you as this was to me when I read about it first. Genres, rather micro-genres, could be as granular as “Asian_English_Mother-Son-Love_1980.” This is the level of granularity to which […]

Did you know that Netflix has over 76,000 genres to categorize its movie and TV show database? I am sure this is as shocking to you as it was to me when I first read about it. Genres, or rather micro-genres, can be as granular as "Asian_English_Mother-Son-Love_1980." This is the level of granularity to which Netflix has segmented its product offerings, which are movies and shows.

But is it really necessary to go to this level of granularity when segmenting offerings?

I think the success of Netflix answers this question on its own. Netflix is considered to have one of the best recommendation engines. The company even ran the Netflix Prize competition, offering USD 1 million to the first team that could significantly beat its recommendation algorithm. This shows the sophistication and advanced capabilities the company has developed on its platform. This recommendation tool is, at its core, a segmentation exercise that maps movies to users. Sounds easy, right?

Gone are the days when marketers identified their target customers based on intuition and gut feeling. With the advent of big data tools and technologies, marketers rely more and more on analytics software to identify the right customers with minimal spend. This is where segmentation comes into play and makes our lives easier. So, let's first understand what segmentation is and why we need it.

Segmentation, in very simple terms, is the grouping of customers in such a way that customers falling into one segment have similar traits and attributes. The attributes could relate to their likes, preferences, demographic features or socio-economic behavior. Segmentation is mostly discussed with respect to customers, but it can refer to products as well. We will explore a few examples as we move ahead in the article.

With tighter marketing budgets, increasing consumer awareness, rising competition and the easy availability of alternatives and substitutes, it is imperative to use marketing budgets prudently to target the right customers, through the right channel, at the right time, and offer them the right set of products. Let's look at an example to understand why segmentation is important for marketers.

Consider an e-commerce company that is launching a new service for a specific segment of customers who shop frequently and whose ticket size is also high. The company wants to decide which customers to target for the service. Let's first look at the data at an aggregate level and then drill down to understand it in detail. There are 5 customers for whom we want to evaluate the spend. The overall scenario is as follows:

[Chart 1]

Should the e-commerce company offer the service to all the five customers?

Who is the right customer to target for this service? Or which is the right customer segment to target?

We will see the details of each of the customers and see the distribution of data.

[Chart 2]

Looking at the data above, it looks like Customer 1 and Customer 2 would be the right targets for the company's offering. If we were to segment these 5 customers into two segments, Customer 1 and Customer 2 would fall into one segment because they have higher total spend and a higher number of purchases than the other three customers. We can use Tableau to create clusters and verify our hypothesis. Using Tableau to create customer segments, the output looks like the chart below.

[Chart 3]

Customer 1 and customer 2 are part of cluster 1; while customer 3, customer 4 and customer 5 are part of cluster 2. So, the ecommerce company should focus on all the customers falling into cluster 1 for its service offering.

Let’s take another example and understand the concept further.

We will try to segment the countries in the world by their inbound tourism industry (using the sample dataset available in Tableau). Creating four segments we get the following output:

[Chart 4]

There are a few countries which do not fall into any of the clusters because data for them is not available. Looking at the clusters closely, we see that the United States of America falls in cluster 4, while India, Russia, Canada and Australia, among others, fall in cluster 2. Countries in Africa and South America fall in cluster 1, while the remaining countries fall in cluster 3. This makes it easy to segment countries based on certain macro-economic (or other) parameters and develop a similar strategy for countries in the same cluster.

Now, let’s go a step further and understand how Tableau can help us in segmentation.

Segmentation and Clustering in Tableau

Tableau is one of the most advanced visualization and business intelligence tools available in the market today. It provides a lot of interactive and user-friendly visualizations and can handle large amounts of data. It can handle millions of rows at once and provides connection support for almost all the major databases in the market.

With the launch of Tableau 10 in 2016, the company introduced a built-in clustering feature. Clustering was once considered a technique to be used only by statisticians and advanced data scientists, but with this feature in Tableau it becomes as easy as drag and drop. It can be a big help to marketers in segmenting their customers and products and getting better insights.
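To make the idea concrete, Tableau's clustering feature is based on the k-means algorithm. Here is a minimal sketch of that algorithm in TypeScript, run on hypothetical (total spend, number of purchases) pairs for the five customers from the earlier example; the figures are made up for illustration, and real data should be standardized first so that one feature does not dominate the distance.

type Point = number[];

// A bare-bones k-means: assign points to the nearest centroid, then move each
// centroid to the mean of its members, and repeat.
function kMeans(points: Point[], k: number, iterations = 50): number[] {
  const dist = (a: Point, b: Point) =>
    Math.sqrt(a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0));
  let centroids = points.slice(0, k).map((p) => [...p]); // naive seeding with the first k points
  let labels: number[] = new Array(points.length).fill(0);
  for (let it = 0; it < iterations; it++) {
    // Assignment step: attach each point to its nearest centroid.
    labels = points.map((p) => {
      let best = 0;
      for (let c = 1; c < k; c++) {
        if (dist(p, centroids[c]) < dist(p, centroids[best])) best = c;
      }
      return best;
    });
    // Update step: move each centroid to the mean of its assigned points.
    centroids = centroids.map((c, ci) => {
      const members = points.filter((_, i) => labels[i] === ci);
      return members.length === 0
        ? c
        : c.map((_, d) => members.reduce((s, m) => s + m[d], 0) / members.length);
    });
  }
  return labels;
}

// Hypothetical (total spend, number of purchases) pairs for Customers 1-5.
const customers: Point[] = [[5200, 48], [4800, 41], [900, 7], [650, 5], [400, 3]];
console.log(kMeans(customers, 2)); // e.g. [0, 0, 1, 1, 1]: Customers 1 and 2 form one cluster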

Steps to Becoming a Segmentation Sniper

The large number of sales channels, the increase in product options and rising advertising costs have made it essential not only for marketers but for almost all departments to analyze customer data and understand customer behavior in order to maintain market position. We will now take a small example and analyze the data using Tableau to understand our customer base and zero in on the target customer segment.

Consider a market survey done by a publishing company that mainly sells business books. The company wants to expand its product offerings to philosophy, marketing, fiction and biography titles. Its objective is to use customer responses to find out which age group likes which category of books the most.

For an effective segmentation exercise, one should follow the below four steps.

  1. Understand the objective
  2. Identify the right data sources
  3. Creating segments and micro-segments
  4. Reiterate and refine

We will now go through each of these steps, using Tableau along the way to see the findings at every stage.

  1. Understand the objective

Understanding the objective is the first thing you should do before starting a segmentation exercise. Having a clear objective is imperative because it helps you channel your efforts toward that objective and prevents you from spending endless hours in aimless slicing and dicing. In our publishing company example, the objective is to find the target age group the company should focus on in each of the new genres, namely philosophy, marketing, fiction and biography. This will help the publishing company aim its marketing campaigns at a specific set of customers for each genre. It will also help the company identify the age groups that like both business and philosophy, or business and marketing, or similar combinations.

  2. Identify the right data sources

In this digital age, data is spread across multiple platforms. Not using the right data sources could prove to be as disastrous as not using analytics at all. Customer data residing in CRM systems, operational data in SAP systems, demographic data, macro-economic data, financial data, social media footprint – there could be endless list of data sources which could prove to be useful in achieving our objective. Identifying right variables from each of the sources and then integrating them to form a data lake forms the basis of further analysis.

In our example, the dataset is not as complex as it might be in real-life scenarios. We are using market survey data gathered by the publishing company. The data captures the age of each customer and their liking/disliking of different genres of books, namely philosophy, marketing, fiction, business and biography.

  3. Creating segments and micro-segments

At this stage, we have our base data ready in an analyzable format. We will start analyzing the data and try to form segments. Generally, you should start by exploring relationships in the data that you are already aware of. Once you establish a few relationships among different variables, keep adding layers to make the segments more granular and specific.

We will start by doing some exploratory analysis and then move on to add further layers. Let’s first see the results of the market survey at an aggregate level.

[Chart 5]

From the above analysis, it looks like fiction is the most preferred genre of books among the respondents. But before making any conclusions, let’s explore a little further and move closer to our objective.

If we split the results by age group and then analyze them, the results look something like the graph below.

[Chart 6]

In the above graph, we get further clarity on the genre preferences of respondents. It gives us a good idea of which age group prefers which genre. Fiction is most preferred by people under the age of 20, while for other age groups fiction is not among the top preferences. If we had only taken the average score and gone ahead with that, we would have gotten skewed results. Philosophy is preferred by people above the age of 40, while the other groups prefer business books.

Now moving a step ahead, for each of the genres we want to find the target age group.

[Chart 7]

The above graph gives us the target group for each of the genres. For biography and philosophy genres, people above the age of 40 are the right customers; while for business and marketing, age group 20-30 years should be the target segment. For fiction, customers under the age of 20 are the right target group.

  4. Reiterate and refine

In the previous section, we created different customer segments and identified the target segment for the publishing company. Now, let's go one more step and identify only those age groups and genres which overlap with the business genre. To put it another way, if the publishing company were to target only one new genre (remember, it already has a customer base for business books) and one age group, which should it be?

Using Tableau to develop a relation amongst the different variables, our chart should look like the one below.

[Chart 8]

Starting with the biography genre, the age group 30-40 years comes closest to our objective, i.e., people in this age group like both the biography and business genres (biography score – 0.22, business score – 0.31). Since we have to pick only one genre, we will explore the other relationships as well.

For fiction, there is no clear overlap with any of the age groups. For marketing, the 20-30 years age group looks to be the clear winner; the scores for this group are marketing – 0.32 and business – 0.34. The relationship between philosophy and business is not as strong as the one between marketing and business.

To sum it up, if the publishing company were to launch one more genre of books, it should be marketing, and the target customer group should be 20-30 years.

Such analysis can be refined further depending on the data we have. We can add gender, location, educational degree, etc. to the analysis and further refine our target segment to make our marketing efforts more focused.

I think that after going through the examples in this article, you can truly appreciate the level of segmentation that Netflix has achieved, and see why it is one of the reasons behind the company's success.

Tableau Sales Dashboard Performance https://www.perceptive-analytics.com/tableau-sales-dashboard-performance/ https://www.perceptive-analytics.com/tableau-sales-dashboard-performance/#respond Tue, 28 Aug 2018 09:00:24 +0000 https://www.perceptive-analytics.com/?p=3063 Business heads often use KPI tracking dashboards that provide a quick overview of their company’s performance and well-being. A KPI tracking dashboard collects, groups, organizes and visualizes the company’s important metrics either in a horizontal or vertical manner. The dashboard provides a quick overview of business performance and expected growth. An effective and visually engaging way […]


Business heads often use KPI tracking dashboards that provide a quick overview of their company’s performance and well-being. A KPI tracking dashboard collects, groups, organizes and visualizes the company’s important metrics either in a horizontal or vertical manner. The dashboard provides a quick overview of business performance and expected growth.

An effective and visually engaging way of presenting the main figures in a dashboard is to build a KPI belt by combining text, visual cues and icons. By using KPI dashboards, organizations can access their success indicators in real time and make better informed decisions that support long-term goals.

What is a KPI?

KPIs  (i.e. Key Performance Indicators) are also known as performance metrics, performance ratios or business indicators. A Key Performance Indicator is a measurable value that demonstrates how effectively a company is achieving key business objectives.

A sales tracking dashboard provides a complete visual overview of the company’s sales performance by year, quarter or month. Additional information such as the number of new leads and the value of deals can also be incorporated.

Example of KPIs on a Sales Dashboard:

  • Number of New Customers and Leads
  • Churn Rate (i.e. how many people stop using the product or service)
  • Revenue Growth Rate
  • Comparison to Previous Periods
  • Most Recent Transactions
  • QTD (quarter to date) Sales
  • Profit Rate
  • State Wise Performance
  • Average Revenue for Each Customer
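Several of these KPIs are simple ratios that can be computed before they are ever visualized. Here is a minimal sketch of three of them in TypeScript, using made-up figures purely for illustration:

// Churn rate: share of customers who stopped using the product during the period.
function churnRate(customersAtStart: number, customersLost: number): number {
  return customersLost / customersAtStart;
}

// Revenue growth rate versus the previous period.
function revenueGrowthRate(previousRevenue: number, currentRevenue: number): number {
  return (currentRevenue - previousRevenue) / previousRevenue;
}

// Average revenue for each customer.
function averageRevenuePerCustomer(totalRevenue: number, customerCount: number): number {
  return totalRevenue / customerCount;
}

// Hypothetical quarter: 2,000 customers at the start, 120 lost, revenue up from 1.10M to 1.25M.
console.log(churnRate(2000, 120));                       // 0.06, i.e. 6% churn
console.log(revenueGrowthRate(1_100_000, 1_250_000));    // ~0.136, i.e. 13.6% growth
console.log(averageRevenuePerCustomer(1_250_000, 2000)); // 625 per customer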

Bringing It All Together with Dashboards and Stories

An essential element of Tableau’s value is delivered via dashboards. Well-designed dashboards are visually engaging and draw in the user to play with the information. Dashboards can facilitate details-on-demand that enable the information consumer to understand what, who, when, where, how and perhaps even why something has changed.

Best Practices to Create a Simple and Effective Dashboard to Observe Sales Performance KPIs

A well-framed KPI dashboard instantly highlights problem areas. The greatest value of a modern business dashboard lies in its ability to provide real-time information about a company’s sales performance. As a result, business leaders, as well as project teams, are able to make informed and goal-oriented decisions, acting on actual data instead of gut feelings. The choice of chart types on a dashboard should highlight KPIs effectively.

Bad Practices Examples in a Sales Dashboard:

  • A sales report displaying 12 months of history for twenty products; 12 × 20 = 240 data points.
    • A cross-tab with that many data points does not enable the information consumer to discern trends and outliers as easily as a time-series chart built from the same information
  • The quality of the data won’t matter if the dashboard takes five minutes to load
  • The dashboard fails to convey important information quickly
  • The pie chart has too many slices, and performing precise comparisons of each product sub-category is difficult
  • The cross-tab at the bottom requires that the user scroll to see all the data

Now, we will focus on the best practices to create an effective dashboard to convey the most important sales information. Tableau is designed to supply the appropriate graphics and chart types by default via the “Show me” option.

I. Choose the Right Chart Types 

With respect to sales performance, we can use the following charts to show the avg. sales, profits, losses and other measures.

  • Bar charts to compare numerical data across categories to show sales quantity, sales expense, sales revenue, top products and sales channel etc. This chart represents sales by region.

[Chart 1]

  • Line charts to illustrate sales or revenue trends in data over a period of time:

[Chart 2]

  • A Highlight table allows us to apply conditional formatting (a color scheme in either a continuous or stepped array of colors from highest to lowest) to a view.

[Chart 3]

  • Use Scatter plots or scatter graphs to investigate the relationship between different variables or to observe outliers in data. Example: sales vs profit:

[Chart 4]

  • Use Histograms to see the data distribution across groups or to display the shape of the sales distribution:

[Chart 5]

Advanced Chart Types:

  • Use Bullet graphs to track progress against a goal, a historical sales performance or other pre-assigned thresholds:

[Chart 6]

  • The Dual-line chart (or dual-axis chart) is an extension of the line chart that allows more than one measure to be represented with two different axis ranges. Example: revenue vs. expense
  • The Pareto chart is one of the most important charts in sales analysis. The Pareto principle is also known as the 80-20 rule: roughly 80% of the effects come from 20% of the causes.

[Chart 7]

When performing a sales analysis, this rule is used to find the roughly 20% of products that drive about 80% of total sales (see the short calculation sketch after these chart examples).

  • Use Box plots to display the distribution of data through their quartiles and to observe the major data outliers

[Chart 8]
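As a rough sketch of the arithmetic behind a Pareto chart, the following TypeScript sorts products by sales, accumulates their share of the total and stops once roughly 80% is covered; the product names and sales figures are hypothetical.

// Return the top products that together account for roughly `share` of total sales.
function paretoProducts(salesByProduct: Record<string, number>, share = 0.8): string[] {
  const entries = Object.entries(salesByProduct).sort((a, b) => b[1] - a[1]); // descending by sales
  const total = entries.reduce((sum, [, sales]) => sum + sales, 0);
  const top: string[] = [];
  let running = 0;
  for (const [product, sales] of entries) {
    top.push(product);
    running += sales;
    if (running / total >= share) break; // these products drive ~80% of sales
  }
  return top;
}

// Hypothetical sales by product sub-category.
console.log(paretoProducts({ Phones: 900, Chairs: 700, Storage: 150, Tables: 120, Binders: 80, Paper: 50 }));
// Prints ["Phones", "Chairs"]: two of six products account for 80% of sales here.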

Tableau Sales Dashboard

Here is a Tableau dashboard comprised of the aforementioned charts. This interactive dashboard enables the consumer to understand sales information by trend, region, profit and top products.

[Chart 9]

II. Use Actions to filter instead of Quick Filters

Using actions in place of Quick Filters provides a number of benefits. First, the dashboard will load more quickly. Using too many Quick Filters or trying to filter a very large dimension set can slow the load time because Tableau must scan the data to build the filters. The more quick filters enabled on the dashboard, the longer it will take the dashboard to load.

 III. Build Cascading Dashboard Designs to Improve Load Speed

By creating a series of four cascading four-panel dashboards, the load speed was improved dramatically and the understandability of the information presented was greatly enhanced. The top-level dashboard provided a summary view but included filter actions in each of its visualizations, allowing the executive to see data for different regions, products and sales teams.

IV. Remove All Non-Data-Ink

Remove any text, lines, or shading that doesn’t provide actionable information. Remove redundant facts. Eliminate anything that doesn’t help the audience understand the story contained in the data.

V. Create More Descriptive Titles for Each Data Pane

Adding more descriptive data object titles will make it easier for the audience to interpret the dashboard. For example:

  • Bullet Graph—Sales vs. Budget by Product
  • Sparkline—Sales Trend
  • Cross-tab—Summary by Product Type
  • Scatter Plot—Sales vs. Marketing Expense

VI. Ensure That Each Worksheet Object Fits Its Entire View

When possible, change a graph's fit from "Normal" to "Entire View" so that all of its data is displayed at once.

VII. Adding Dynamic Title Content

There is an option to use dynamic content and titles within Tableau. Titles can be customized in a dynamic way so that when a filter option is selected, the title and content will change to reflect the selected value. A dynamic title expresses the current content. For example: if the dashboard title is “Sales 2013” and the user has selected year 2014 from the filter, the title will update to “Sales 2014”.

VIII. Trend Lines and Reference Lines

Visualizing granular data sometimes results in random-looking plots. Trend lines help users interpret data by fitting a straight or curved line that best represents the pattern contained within detailed data plots. Reference lines help to compare the actual plot against targets or to create statistical analyses of the deviation contained in the plot; or the range of values based on fixed or calculated numbers.

 IX. Using Maps to Improve Insight

Seeing the data displayed on a map can provide new insights. If an internet connection is not available, Tableau allows a change to locally-rendered offline maps. If the data includes geographic information, we can very easily create a map visualization.

[Chart 10]

This map represents sales by state. The red color represents negative numbers and the green color represents positive numbers.  

X. Developing an Ad Hoc Analysis Environment

Tableau facilitates ad hoc analysis in three ways:

  1. Generating new data with forecasts
  2. Designing flexible views using parameters
  3. Changing or creating designs in Tableau Server

 XI. Using Filters Wisely

Filters generally improve performance in Tableau. For example, when using a dimension filter to view only the West region, a query is passed to the underlying data source, resulting in information returned for only that region. We can see the sales performance of the particular region in the dashboard. By reducing the amount of data returned, performance improves.

Enhance Visualizations Using Colors, Labels etc.

I. Using colors:

Color is a vital way of understanding and categorizing what we see. We can use color to tell a story about the data, to categorize, to order and to display quantity. Color helps with distinguishing the dimensions. Bright colors pop at us, and light colors recede into the background. We can use color to focus attention on the most relevant parts of the data visualization. We choose color to highlight some elements over others, and use it to convey a message.

Red is typically used to denote smaller or negative values, while blue or green is used to denote higher values. Red is often read as a warning color signalling a loss or another negative number, whereas blue or green is read as a positive result showing profit and other positive values.

Without colors:

[Chart 11]

With colors:

[Chart 12]

II. Using Labels:

Enable labels to call out marks of interest and to make the view more understandable. Data labels enable comprehension of exact data point values. In Tableau, we can turn on mark labels for marks, selected marks, highlighted marks, minimum and maximum values, or only the line ends.

Without labels:

[Chart 13]

With labels:

[Chart 14]

Using Tableau to enhance KPI values

Tableau's user-friendly interface allows non-technical users to quickly and easily create customized dashboards, and Tableau can connect to nearly any data repository, from MS Excel to Hadoop clusters. As mentioned above, colors and labels can enhance a visualization and the KPI values it presents. Here are some additional ways to enhance those values using Tableau features.

I. Allow for Interactivity

Playing, exploring, and experimenting with the charts is what keeps users engaged. Interactive dashboards enable the audiences to perform basic analytical tasks such as filtering views, drilling down and examining underlying data – all with little training.

II. Custom Shapes to Show KPIs

Tableau shapes and controls can be found in the marks card to the right of the visualization window. There are plenty of options built into Tableau that can be found in the shape palette.

[Chart 15]

Custom shapes are very powerful when telling a story with visualizations in dashboards and reports. We can create unlimited shape combinations to show mark points and create custom formatting. Below is an example that illustrates how we can represent the sales or profit values with a symbolic presentation.

[Chart 16]

Here, green arrows indicate good sales progress and red arrows indicate a fall in year-over-year sales by category.

III. Creating Calculated Fields

Calculated fields can be used to create new dimensions such as segments, or new measures such as ratios. There are many reasons to create calculated fields in Tableau. Here are just a few:

  1. Segmentation of data in new ways on the fly
  2. Adding a new dimension or a new measure before making it a permanent field in the underlying data
  3. Filtering out unwanted results for better analyses
  4. Using the power of parameters, putting the choice in the hands of end users
  5. Calculating ratios across many different variables in Tableau, saving valuable database processing and storage resources

IV. Data-Driven Alerts

With version 10.3, Tableau introduced a very useful feature: Data-Driven Alerts. We may want to use alerts to notify users when performance is higher or lower than expected, or to remind them that a certain filter is applied. Adding alerts to dashboards can help elicit necessary action from the information consumer. Below is an example of a data-driven alert that can be set while displaying a dashboard or worksheet.

[Chart 17]

In a Tableau Server dashboard, we can set up automatic mail notifications to a set of recipients when a certain value reaches a specific threshold.

Summary

For an enterprise, a dashboard is a visual tool to help track, monitor and analyze information about the organization. The aim is to enable better decision making.

A key feature of sales dashboards in Tableau is interactivity. Dashboards are not simply a set of reports on a page; they should tell a story about the business. In order to facilitate the decision-making process, interactivity is an important part of assisting the decision-maker to get to the heart of the analysis as quickly as possible.
