Part III: Key Analytics Models – Cohort analysis, Cluster analysis, Time-series analysis, and Sentiment analysis
This is the third article in a series on where data analytics stands today and what to look forward to in the coming year. Read Part I here.
In Part II of this series, we explained the basics of linear regression, Monte Carlo simulations and Factor analysis. Here we discuss a few more key models.
Cohort Analysis: Instead of inspecting the data as a whole, cohort analysis breaks it down into groups (cohorts) that are analysed over time. Each cohort shares a common characteristic within a defined timespan, such as customers who signed up in the same month.
Cohort analysis is a subset of behavioural analytics: customers or users are divided into groups and their behaviour is analysed over a specified time period. Instead of an isolated snapshot of consumer behaviour, you can examine customers’ behaviour relative to their position in the overall life cycle.
As a result, CareerFoundry writes, “you can start to identify patterns of behaviour at various points in the customer journey—say, from their first ever visit to your website, through to email newsletter sign-up, to their first purchase, and so on. As such, cohort analysis is dynamic, allowing you to uncover valuable insights about the customer lifecycle.”
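To make the idea concrete, here is a minimal sketch of a retention table built by cohort. The event log, the user IDs and the function name cohort_retention are all hypothetical, and a user's cohort is assumed to be the month of their first recorded activity; real analyses may define cohorts differently.

```python
from collections import defaultdict

# Hypothetical event log: (user_id, month_index) pairs, where month_index
# counts months from an arbitrary start. All values are made up.
events = [
    ("u1", 0), ("u1", 1), ("u1", 2),
    ("u2", 0), ("u2", 2),
    ("u3", 1), ("u3", 2),
    ("u4", 1),
]

def cohort_retention(events):
    """Return {cohort_month: {months_since_first: share_of_cohort_active}}."""
    # Assumption: a user's cohort is the month of their first recorded activity.
    first_seen = {}
    for user, month in events:
        first_seen[user] = min(month, first_seen.get(user, month))

    # Collect the distinct users active at each offset from their cohort month.
    active = defaultdict(lambda: defaultdict(set))
    for user, month in events:
        cohort = first_seen[user]
        active[cohort][month - cohort].add(user)

    retention = {}
    for cohort, by_offset in active.items():
        size = len(by_offset[0])  # every cohort member is active at offset 0
        retention[cohort] = {
            offset: len(users) / size
            for offset, users in sorted(by_offset.items())
        }
    return retention

retention = cohort_retention(events)
for cohort, shares in sorted(retention.items()):
    print(cohort, shares)
```

Each row of the result follows one cohort through its life cycle, so cohorts that joined at different times can be compared at the same stage rather than at the same calendar date.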
Cluster Analysis: This involves grouping data into ‘clusters’ so that items within one cluster are similar to each other and dissimilar to those in other clusters. It provides insight into how the data is distributed and can help reveal the patterns behind anomalies.
For instance, an insurance company can use the technique to determine why more claims are associated with certain specific locations.
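One common clustering technique is k-means, sketched below in plain Python. The article does not name a specific algorithm, so this is an illustrative choice. The points (imagined as coordinates of claim locations), the starting centroids and the function name kmeans are assumptions made for the example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its assigned points, and repeat."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels

# Hypothetical 2-D data with two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = kmeans(points, centroids=[(0, 0), (10, 10)])
print(labels)
print(centroids)
```

The starting centroids are fixed here so the run is repeatable; in practice they are usually chosen at random, and the number of clusters has to be decided by the analyst.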
Time-series Analysis: This studies the behaviour of a variable over time, identifying trends that may help predict future behaviour. For example, a firm might use past sales data to anticipate future sales.
When considering time-series analysis, the major patterns to look out for include trends, seasonality and cyclical patterns. Trends are sustained increases or decreases over an extended time period, and need not be linear. Seasonality refers to predictable fluctuations that repeat over a fixed, known period owing to seasonal factors (such as sales of woollens during winter and swimwear during summer). Cyclical patterns are less predictable and have no fixed period; they arise from non-seasonal factors such as economic and industry conditions (for example, the number of new loans issued before and after a series of interest rate hikes).
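Trend and seasonality can be separated with a simple additive decomposition, sketched below. The quarterly series is synthetic (a straight-line trend plus a repeating four-quarter pattern, with no noise), and the function names are hypothetical. Cyclical patterns are not modelled here, since they have no fixed period to average over.

```python
def centered_moving_average(series, period):
    """Estimate the trend by averaging over one full seasonal period.
    For an even period the two end points get half weight, so the
    window stays centred on the current observation."""
    half = period // 2
    trend = [None] * len(series)  # undefined near the edges of the series
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend

def seasonal_effects(series, trend, period):
    """Average the detrended values at each position within the period."""
    buckets = [[] for _ in range(period)]
    for t, (value, level) in enumerate(zip(series, trend)):
        if level is not None:
            buckets[t % period].append(value - level)
    return [sum(b) / len(b) for b in buckets]

# Synthetic quarterly data: trend 10 + 2t plus a fixed seasonal pattern.
pattern = [3, -1, -4, 2]
series = [10 + 2 * t + pattern[t % 4] for t in range(12)]

trend = centered_moving_average(series, period=4)
seasonal = seasonal_effects(series, trend, period=4)
print(trend)
print(seasonal)
```

Because the synthetic data contains no noise, the decomposition recovers the trend and the seasonal pattern exactly; with real sales data both would only be estimates.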
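Three building blocks are common in time-series models: autoregressive (AR) terms, which relate a value to its own past values; moving average (MA) terms, which relate it to past forecast errors; and integrated (I) terms, which difference the series to remove trends. Together they form the widely used ARIMA model.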
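As a rough illustration of two of these ideas, the sketch below shows first differencing (the step behind the "integrated" part) and a least-squares fit of a first-order autoregressive model. The series is generated without noise from an assumed rule, and the function names are hypothetical. The moving average part is left out because its error terms are not directly observed, so it cannot be fitted with a simple regression like this.

```python
def difference(series):
    """First difference: subtract each value from the next one.
    This is the 'integrated' step that removes a steady trend."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(series):
    """Least-squares estimate of c and phi in x[t] = c + phi * x[t-1] + error."""
    x, y = series[:-1], series[1:]  # previous values and current values
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    phi = cov / var
    c = mean_y - phi * mean_x
    return c, phi

# Hypothetical noise-free series following x[t] = 2 + 0.5 * x[t-1].
series = [10.0]
for _ in range(6):
    series.append(2 + 0.5 * series[-1])

c, phi = fit_ar1(series)
forecast = c + phi * series[-1]  # one-step-ahead forecast
print(c, phi, forecast)
print(difference([1, 3, 6, 10]))
```

In practice these models are usually fitted with a statistics library rather than by hand, and the order of each component is chosen by inspecting the data.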
Sentiment Analysis: This identifies the emotional tone behind a body of text, helping organisations understand opinions about a service, product or idea. There are three main types:
Fine-grained sentiment analysis, which measures opinion polarity in more depth (for example, interpreting the star ratings given by customers at a finer granularity); emotion detection, which often uses machine learning algorithms to infer emotions from textual data (such as associating certain words in a product review with specific emotions); and aspect-based sentiment analysis, which ties opinions to specific aspects of a product or service, such as customer feedback on a new ad campaign or a new product feature.
Sentiment analysis uses Natural Language Processing (NLP) algorithms trained to associate certain inputs with certain outputs. The word ‘annoying’, for example, will be put into the ‘negative’ basket. This helps organisations understand how customers feel about a particular product, identify areas for improvement and avoid PR disasters.
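The sketch below shows, in miniature, how a model can learn to put ‘annoying’ in the ‘negative’ basket. It uses a naive Bayes word-count classifier, which is one simple technique among many and not necessarily what production systems use. The four training reviews and the function names are made up for illustration, and equal class priors are assumed because each label has the same number of examples.

```python
import math
from collections import Counter

# Hypothetical labelled reviews; a real system would train on far more data.
training = [
    ("great product love it", "positive"),
    ("love the fast delivery", "positive"),
    ("annoying app keeps crashing", "negative"),
    ("slow and annoying support", "negative"),
]

def train(examples):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def classify(text, counts):
    """Return the label whose word counts make the text most likely."""
    vocab = {word for word_counts in counts.values() for word in word_counts}
    best_label, best_score = None, -math.inf
    for label, word_counts in counts.items():
        total = sum(word_counts.values())
        # Add-one (Laplace) smoothing so unseen words do not zero out a label.
        score = sum(
            math.log((word_counts[word] + 1) / (total + len(vocab)))
            for word in text.lower().split()
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(training)
print(classify("this is annoying", model))
print(classify("love it", model))
```

The classifier was never told that ‘annoying’ is negative; it inferred the association from the labelled examples, which is the sense in which such algorithms are trained.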
[Read further parts for more of the key models of data analytics and the maturity stages of automation and AI]