While big data is grabbing regular headlines, small data is steadily gaining a foothold. Do you know what it is?
It is next to impossible to spring a surprise in today’s conventional warfare. Social media chatter from anxious Russian girlfriends, spouses, and relatives has revealed everything from large-scale troop deployments to military and naval hardware mobilisations. While the posts give no specific indication of whether President Vladimir Putin has decided to launch a new military offensive targeting Ukraine, they serve as evidence that troops and equipment are being moved en masse from Russia’s Far East and offer a rare glimpse into the fears voiced by the soldiers’ relatives. It seems nothing is insignificant, and small talk equates to small data.
Social chatter analysis
Back in December 2019, analysis of social media chatter among doctors in Wuhan led the Canadian outbreak analysis company BlueDot to identify a mysterious illness that was spreading rapidly. That was almost two months before the World Health Organization sounded the coronavirus pandemic alarm. These are just two instances of small data analytics.
‘Small data’ is a term that describes datasets with fewer than 1,000 rows or columns. The term was coined in 2011 by researchers at IBM to describe datasets too small for traditional statistical methods to be applied directly. In contrast to big data, such datasets are analysed through statistical estimation. Examples of small datasets include customer transactions, social media posts, and individual genome sequences.
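As a rough illustration of what “analysis by estimation” can look like on a dataset of a few hundred rows, the sketch below estimates an average customer spend with a confidence interval. The transaction values are synthetic placeholders, not data from the article.

```python
# Minimal sketch: estimating a population mean from a small sample
# (a few hundred rows) with a 95% confidence interval.
# The transaction values below are made up for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
transactions = rng.gamma(shape=2.0, scale=25.0, size=300)  # 300 "customer spends"

mean = transactions.mean()
sem = stats.sem(transactions)  # standard error of the mean
low, high = stats.t.interval(0.95, df=len(transactions) - 1, loc=mean, scale=sem)

print(f"Estimated average spend: {mean:.2f} (95% CI: {low:.2f} to {high:.2f})")
```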
Customer data such as booking information, meals bought, turnover per seat, and seasonal variations in customer flow is easy to collect and analyse. A Copenhagen restaurant grew its turnover from $1.1 million to $6.1 million within two years by relying on such small data insights.
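A hedged sketch of the kind of analysis a restaurant might run on such a table follows; the column names, figures, and season buckets are hypothetical, chosen only to show how seasonal variation in customer flow and turnover per cover can be read off a tiny dataset.

```python
# Hypothetical sketch: summarising a small bookings table by season
# to spot variations in customer flow. All values are invented.
import pandas as pd

bookings = pd.DataFrame({
    "month":    [1, 2, 3, 6, 7, 8, 11, 12],
    "covers":   [310, 290, 340, 520, 560, 540, 400, 470],
    "turnover": [9300, 8700, 10200, 16100, 17400, 16700, 12400, 14800],
})
bookings["season"] = bookings["month"].map(
    lambda m: "summer" if m in (6, 7, 8)
    else "winter" if m in (11, 12, 1, 2)
    else "spring"
)

summary = bookings.groupby("season")[["covers", "turnover"]].mean()
summary["turnover_per_cover"] = summary["turnover"] / summary["covers"]
print(summary)
```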
Pandemic spurs small data growth
It was the once-in-a-century event of the pandemic that drove the growth of small data analytics. Past data simply wasn’t of any use in a situation that had no precedent. Retail algorithms failed to forecast that chain stores would run critically short of toilet paper in the early days of the lockdowns. According to a Gartner press release, 70% of organisations will shift their focus from big to small and wide data by 2025.
The analyst firm reported that the extreme business changes brought on by the COVID-19 pandemic made Machine Learning (ML) and Artificial Intelligence (AI) models built on large amounts of historical data less relevant. In parallel, decision-making by humans and AI has become broader, requiring data from a wider range of sources to answer queries accurately.
As a result, organisations adopted technologies that can work with whatever data is available. ‘Wide data’ enables the combined analysis of a variety of small and large, structured and unstructured data sources, while ‘small data’ refers to analytical techniques that need less data yet still offer useful insights.
Big innovation & small data
Small data will play a prominent role in analytics as the future remains fluid and bears little correlation with the past. Organisations will have to forecast more near-term scenarios to create agile strategies that keep pace with highly unpredictable times. Most scientists of the 19th and 20th centuries relied on small data for their discoveries: they made their pathbreaking calculations by hand and distilled the fundamental laws of nature into simple rules. It has been estimated that 65% of big innovations are based on small data. Small data can also support important conclusions, especially when it comes to training an AI. Huge data volumes can confuse machine learning methods. AI is ultimately about mastering the right knowledge, not processing ever more data: it is about providing machines with the knowledge they need to perform a task.
Transfer learning
Small data approaches such as transfer learning are now widely used. Transfer learning is the reuse of a pre-trained model on a new problem. It is currently very popular in deep learning because it makes it possible to train deep neural networks with comparatively little data. Scientists use transfer learning to train machines for work in a range of fields. For example, researchers in India have successfully used transfer learning to train a machine to locate kidneys in ultrasound images using only 45 training examples. Transfer learning is expected to grow further in the near future.
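A minimal sketch of the transfer-learning idea is shown below, using Keras with an ImageNet-pretrained backbone. This is not the Indian researchers’ actual pipeline; the backbone, layer sizes, and binary task are assumptions made for illustration.

```python
# Minimal transfer-learning sketch: reuse a pretrained backbone and train
# only a small classification head on a tiny labelled dataset.
import tensorflow as tf

# Load a backbone pretrained on ImageNet and freeze its weights.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False

# Add a small, trainable head for the new task (binary here, e.g. organ visible or not).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# With only a few dozen labelled images, the frozen backbone does most of the work
# and only the head's weights are learned:
# model.fit(small_train_ds, validation_data=small_val_ds, epochs=10)
```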
One of the major challenges of AI is generalisation: machines must give correct answers not only to the questions they were trained on but also to new ones. Because transfer learning carries knowledge over from one task to another, such generalisation is possible even with limited data. Transfer learning is already being used for diagnosing cancer, playing video games, filtering spam, and much more. Advanced AI tools and techniques are opening new possibilities to train AI with small data and to change processes. To train entire AI systems, large organisations are combining thousands of small datasets.
Small data calls for causal AI
Small data calls for more tailored AI systems, too. Causal AI represents the next frontier of artificial intelligence: technology developed to reason about the world in much the same way humans do. Just as people can learn from extremely small datasets, causal AI is designed to do the same. It uses causality to go beyond narrow machine learning predictions and can be integrated directly into human decision-making, making it an AI that organisations can trust with their biggest challenges: a revolution in enterprise AI.
Technically speaking, causal AI models can learn from a minuscule number of data points thanks to causal discovery algorithms, a novel class of algorithms designed to identify important information from very limited observations, much as humans do. Causal AI also enables humans to share their own insights and pre-existing knowledge with the algorithms, which can be an innovative way of generating circumstantial data where it doesn’t formally exist.
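The sketch below illustrates the underlying idea in miniature, without any causal AI library: a human-supplied causal assumption (that a known variable confounds both treatment and outcome) lets a small sample yield a sensible effect estimate where a naive correlation would mislead. The variables, coefficients, and sample are synthetic assumptions for illustration only.

```python
# Hedged sketch: encoding prior causal knowledge (z confounds x and y)
# lets us estimate x's effect on y from a small, synthetic sample.
import numpy as np

rng = np.random.default_rng(0)
n = 80                                              # deliberately small sample
z = rng.normal(size=n)                              # known confounder
x = 0.8 * z + rng.normal(scale=0.5, size=n)         # treatment influenced by z
y = 2.0 * x + 1.5 * z + rng.normal(scale=0.5, size=n)  # outcome; true effect of x is 2.0

# Naive estimate ignores the confounder and is biased upwards.
naive = np.polyfit(x, y, 1)[0]

# Adjusting for z (regression with both x and z) recovers roughly 2.0.
design = np.column_stack([x, z, np.ones(n)])
adjusted = np.linalg.lstsq(design, y, rcond=None)[0][0]

print(f"naive slope: {naive:.2f}, adjusted effect of x on y: {adjusted:.2f}")
```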
In business terms, this means that causal AI algorithms can be fed small data from a range of different sources to identify recurring themes that conventional analytics would be unable to address. As the technology continues to mature, we are likely to see causal AI surface more consumer insights for marketers from the wealth of information businesses generate across a range of touchpoints. This can breathe new life into small data models and give businesses a more manageable approach to organising their data in the future.