What is big data, how is big data used and why is big data essential for digital transformation and today’s data-driven business where actionable data and analytics matter most amidst rapidly growing volumes of mainly unstructured data across ample use cases, business processes, business functions and industries?
Big Data in a way just means “all data” (in the context of your organization and its ecosystem). And there is quite some data nowadays. The sheer volume of data we can tap into is dazzling and, looking at the growth rates of the digital data universe, it just makes you dizzy.
With IoT (the Internet of Things) and digital transformation having an impact across all verticals it goes even faster. More importantly: data has become a business asset beyond belief. So, better treat it well.
Originally, Big Data mainly was used as a term to refer to the size and complexity of data sets, as well as to the different forms of processing, analyzing and so forth that were needed to deal with those larger and more complex data sets and unlock their value. Most people used to look at the pure volume and variety perspective: more data, more types of data, more sources of data and more diverse forms of data. But data as such is meaningless, as is volume. What really matters is meaning, actionable data, actionable information, actionable intelligence, a goal and…the action to get there and move from data to decisions and…actions, thanks to Big Data analytics (BDA) and, how else could it be, artificial intelligence.
A guide to the essence of Big Data.
Table of Contents
- Big data: from volume to more volume but mainly to value
- The information opportunity of Big Data
- From big data to big insights and big decisions
- Big Data: a consequence and a catalyst
- The Vs of Big Data: Adding Value
- Moving to high-value data and use cases: the place of Big Data in transformation
- List of big data examples, use cases and applications across industries and business functions
- Data never sleeps: data on and from the Internet, email and mobile (apps)
- The global datasphere
- Where do organizations focus their Big Data efforts on?
- More about Big Data and its evolutions and applications
- Smart data: beyond the volume and towards the reality
- Fast data: speed and agility for responsiveness
- Big data analytics: making smart decisions and predictions
- Unstructured data: adding meaning and value
- What makes (Big) data actionable?
- Big data in customer service
- Solving the Big Data challenge with artificial intelligence
- Data lakes for Big Data Analytics
- Big Data: order from chaos
Big data: from volume to more volume but mainly to value
It’s easy to see why we are fascinated with volume and variety if you realize how much data there really is (the numbers change all the time, it truly is exponential) and in how many ways, formats and shapes it comes, from a variety of sources.
Consider the data on the Web, transaction logs, social data and the data which gets extracted from gazillions of digitized documents. Consider several other types of unstructured data such as email and text messages, data generated across numerous applications (ERP, CRM, supply chain management systems, anything in the broadest scope of suppliers and business process systems, vertical applications such as building management systems, etc.), geolocation data and, increasingly, data from sensors and other data-generating devices and components in the realm of IoT and mainly its industrial variant, Industrial IoT (and for Europeans Industry 4.0, a very data-intensive framework).
Regardless of when you read this: if you think the volumes of data out there and in your organization’s ecosystem are about to slow down, think again. You can imagine how Big Data and the Internet of Things, along with artificial intelligence, which is needed to make sense of all that data, have only started to show a glimpse of their tremendous impact. In reality, it is still relatively early days for most technologies and applications, whether it concerns digital twins, predictive maintenance or even IoT as such (and related technologies enabling some of these applications; think AR and VR).
The information opportunity of Big Data
So, the term Big Data has a technology and processing background in an increasingly digital and unstructured information age where ever larger data sets became available and ever more data sources were added, leading to a real data chaos.
However, just as information chaos is about information opportunity, Big Data chaos is also about opportunity and purpose. On top of that, the beauty of Big Data is that it doesn’t strictly follow the classic rules of data and information processes and even perfectly dumb data can lead to great results as Greg Satell explains on Forbes.
The mentioned increase of large and complex data sets also required a different approach in the ‘fast’ context of a real-time economy where rapid access to complex data and information matters more than ever. Just think about information-sensing devices that steer real-time actions, for instance. Or the increasing expectations of people in terms of fast and accurate information and feedback when seeking it for one purpose or another. Indeed, customer experience optimization, customer service and so on are also key goals of many big data projects.
From big data to big insights and big decisions
Amid all these evolutions, the definition of the term Big Data, really an umbrella term, has been evolving, moving away from its original definition in the sense of controlling data volume, velocity and variety, as described in this 2001 META Group / Gartner document.
The renewed attention for Big Data in recent years was caused by a combination of open source technologies to store and manipulate data and the increasing volume of data, as Timo Elliott writes. Add to that the various other 3rd platform technologies of which Big Data (in fact, Big Data Analytics) is part, such as cloud computing and mobile, and additional ‘accelerators’ such as IoT, and it becomes clear why Big Data gained far more than just some renewed attention and led to a broadening Big Data ecosystem as depicted below.
Today, and certainly here, we look at the business, intelligence, decision and value/opportunity perspective. From volume to value (what data do we need to create which benefit) and from chaos to mining and meaning, putting the emphasis on data analytics, insights and action.
A key question in that (predominantly unstructured) data chaos is which data we need to achieve one or more possible actions. The creation of value from Big Data, and from data and information overall, is a holistic process, driven by desired outcomes.
With the Internet of Things happening and the ongoing digitization in many areas of society, science and business, the collection, processing and analysis of data sets and the RIGHT data is a challenge and opportunity for many years to come.
As such, Big Data is pretty meaningless or, better, as mentioned, it’s used as an umbrella term. And as is the case with most “trending” umbrella terms, there is quite some confusion. Analyzing data sets and turning data into intelligence and relevant action is key.
Big Data: a consequence and a catalyst
While Big Data is often misunderstood from a business perspective (again, it’s about using the ‘right data’ at the right time for the right reasons) and there are debates regarding the use of specific data by organizations, it’s clear that Big Data is a logical consequence of a digital age.
At the same time it’s a catalyst in several areas of digital business and society. Just one example: Big Data is one of the key drivers in information management evolutions and of course it plays a role in many digital transformation projects and opportunities.
The importance of Big Data and more importantly, the intelligence, analytics, interpretation, combination and value smart organizations derive from a ‘right data’ and ‘relevance’ perspective will be driving the ways organizations work and impact recruitment and skills priorities. The winners will understand the Value instead of just the technology and that requires data analysts but also executives and practitioners in many functions that need to acquire an analytical, let alone digital, mindset. A huge challenge, certainly in domains such as marketing and management.
The Vs of Big Data: Adding Value
On top of the traditional three big data ‘V’s’ IBM decided to add a fourth one as you can see in the illustration above.
Why not? In the end value is what we seek. And, sure, there is also value in data and information. It’s perhaps not as obvious as volume and so forth. Others added even more ‘V’s’. We can think of one too but let’s not go there.
Volume is about the sheer amount of data and information that gets created, whereby we mainly talk about infrastructure, processing and management of big data, be it in a selective way.
Velocity is about where analysis, action and also fast capture, processing and understanding happen and where we also look at the speed and mechanisms at which large amounts of data can be processed for increasingly near-time or real-time outcomes, often leading to the need of fast data.
On top of the data produced in a broad digital context, regardless of business function, societal area or systems, there is a huge increase in data created on more specific levels. Variety is about the many types, being structured, unstructured and everything in between.
Veracity has everything to do with accuracy, which from a decision and intelligence viewpoint becomes certainty and the degree to which we can trust the data to do what we need or want to do.
As said we add value to that as it’s about the goal, the outcome, the prioritization and the overall value and relevance created in Big Data applications, whereby the value lies in the eye of the beholder and the stakeholder and never or rarely in the volume dimension. Welcome to Big Data in Action.
Moving to high-value data and use cases: the place of Big Data in transformation
As mentioned a few times, organizations have been focusing (far too) long on the volume dimension of ever more – big – data. This isn’t too much of a surprise of course.
Volumes were and are staggering and getting all that data into data lakes hasn’t been easy and still isn’t (more about data lakes below, for now see it as an environment where lots of data are gathered and can be analyzed). At a certain point in time we even started talking about data swamps instead of data lakes. You can imagine what that means: plenty of data coming in from plenty of (ever more) sources and systems, leading to muddy waters (not the artist).
Having lots of data is one thing, having high-quality data is another and leveraging high-value data for high-value goals (what comes out of the water so to speak) is again another ballgame.
Fortunately, organizations started leveraging Big Data in smarter and more meaningful ways. Although data lakes continue to grow (to be sure, do note that Big Data and data science aren’t just about lakes; data warehouses and so on matter too), there is a shift in Big Data processing towards cloud and high-value data use cases.
This is happening in many areas. According to Qubole’s 2018 Big Data Trends and Challenges Report, Big Data is being used across a wide and growing spectrum of departments and functions. The business processes receiving most value from big data (in descending order of importance, based upon the percentage of respondents in the survey for the report) include customer service, IT planning, sales, finance, resource planning, IT issue response, marketing, HR and workplace, and supply chain.
In other words: pretty much all business processes. As mentioned in an article on some takeaways from the report, the shift to the cloud leads to an expansion of machine learning programs (machine learning or ML is a field of artificial intelligence) in which enhancing cybersecurity, customer experience optimization and predictive maintenance, a top Industry 4.0 use case, stick out.
More departments, more functions, more use cases, more goals and hopefully/especially more focus on creating value and smart actions and decisions: in the end it’s what Big Data (analytics) and, let’s face it, most digital transformation projects and enabling technologies such as artificial intelligence, IoT and so on are all about.
By now it should be a no-brainer that all these and other technologies are umbrella terms and enablers that overlap, converge, need and enrich each other (also leading to new applications such as digital twins, to name just one), depending on the goal. It’s not just about IoT and AI or about any other combination, it’s about the right combinations for the right purposes, which are actions and decisions based upon the right data at the right time and so on. You know the mantra by now.
List of big data examples, use cases and applications across industries and business functions
This introduction to big data is a starting point and is regularly updated with important topics such as big data and artificial intelligence, examples of big data in action, the evolutions of the market, updated and new articles on BDA (Big Data Analytics) and data science in a broader sense.
To get you started, below is a list of some websites and online resources offering lists of real-life big data examples and application areas.
Data never sleeps: data on and from the Internet, email and mobile (apps)
For several years, DOMO, a company that offers a platform to bring together data systems and people for a digitally connected business, has been putting out a fun infographic, aggregating some data on, well, big data, albeit mainly Internet-related (so no IoT and whatnot).
In the 2019 edition below, you can see some of the data that’s being generated every minute according to the various sources the company consulted for the latest edition of its “Data Never Sleeps” infographic.
- 390,030 apps are downloaded every minute (admittedly, apps get uninstalled frequently as well).
- 511,200 tweets are sent each minute (which doesn’t necessarily mean they’re all valuable to say the least).
- 188,000,000 emails are sent every minute (there goes the statement that email is dead).
- 4,500,000 YouTube videos are watched, and 18,100,000 texts are sent.
- Google conducts 4,497,420 search queries per minute.
- Americans use 4,416,720 GB of Internet data every minute.
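To get a feel for the scale, the per-minute figures above can be extrapolated to a full day. The snippet below is only a rough sketch: it assumes the rates stay constant around the clock (which the infographic does not claim), and the names `PER_MINUTE` and `per_day` are made up for this illustration.

```python
# Per-minute figures as quoted from the 2019 "Data Never Sleeps" infographic.
PER_MINUTE = {
    "app downloads": 390_030,
    "tweets": 511_200,
    "emails": 188_000_000,
    "YouTube videos watched": 4_500_000,
    "texts": 18_100_000,
    "Google searches": 4_497_420,
    "GB of Internet data used by Americans": 4_416_720,
}

MINUTES_PER_DAY = 24 * 60  # 1,440 minutes


def per_day(per_minute: int) -> int:
    """Scale a per-minute count to a day, assuming a constant rate (a simplification)."""
    return per_minute * MINUTES_PER_DAY


if __name__ == "__main__":
    for label, count in PER_MINUTE.items():
        print(f"{label}: {per_day(count):,} per day")
```

Under that constant-rate assumption, 188,000,000 emails per minute works out to roughly 270 billion emails per day.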
Well, more in the infographic. We know the global datasphere is growing fast indeed. Making sense from noise and filtering all the data from and on the Internet, email, text message and the many apps and services mentioned in the infographic is just part of the big data equation.
The global datasphere
A more comprehensive overview of the growth of the global datasphere is offered each year by research firm IDC.
In Data Age 2025, the company forecasts that by 2025 the global datasphere will have grown to 175 zettabytes of data created, captured, replicated etc. per year. Here the data generated by ever more IoT devices are included. They are expected to create over 90 zettabytes in 2025.
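As a quick back-of-the-envelope check on those two numbers, the sketch below works out which share of the forecast datasphere would come from IoT devices. It treats "over 90 zettabytes" as exactly 90 (so the result is a lower bound), uses decimal units (1 ZB = 10^21 bytes), and all names in it are illustrative.

```python
# Forecast figures for 2025 as quoted above from IDC's Data Age 2025.
TOTAL_ZB_2025 = 175  # global datasphere, in zettabytes
IOT_ZB_2025 = 90     # "over 90 ZB" from IoT devices; 90 is used as a lower bound

TB_PER_ZB = 10**9    # decimal units: 1 ZB = 10**21 bytes, 1 TB = 10**12 bytes


def share(part: float, total: float) -> float:
    """Return part as a fraction of total."""
    return part / total


def zettabytes_to_terabytes(zb: int) -> int:
    """Convert zettabytes to terabytes using decimal (SI) prefixes."""
    return zb * TB_PER_ZB


if __name__ == "__main__":
    print(f"IoT share of the 2025 datasphere: at least {share(IOT_ZB_2025, TOTAL_ZB_2025):.1%}")
    print(f"175 ZB equals {zettabytes_to_terabytes(TOTAL_ZB_2025):,} TB")
```

In other words, on these forecast figures IoT devices would account for a bit more than half of all data in the 2025 datasphere.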
The continuous growth of the datasphere and big data has an important impact on how data gets analyzed whereby the edge (edge computing) plays an increasing role and public cloud becomes the core.
Where do organizations focus their Big Data efforts on?
Obviously analytics are key. However, which Big Data sources are used to analyze and derive insights?
In 2012, IBM and the Saïd Business School at the University of Oxford found that most Big Data projects at that time were focusing on the analysis of internal data to extract insights. Among the internal data sources, the majority (88 percent) concerned analysis of transactional data, 73 percent log data and 57 percent emails.
Fewer businesses were busy looking at external big data, from outside their firewalls, which are mainly unstructured (as are most internal sources) and offer ample opportunities to gain insights too (e.g. sentiment analysis).
By now this picture probably has changed and of course it also depends on the goal and type of industry or application. With the network perimeters fading, the ongoing development of initiatives in areas such as the Internet of Things and increasing Big Data analysis maturity, we would like to see a detailed update indeed.
More about Big Data and its evolutions and applications
Smart data: beyond the volume and towards the reality
Big data is…big. With increasing volumes of mainly unstructured data comes a challenge of noise within the sheer volume aspect.
In order to achieve business outcomes and practical outcomes to improve business, serve customers better, enhance marketing optimization or respond to any kind of business challenge that can be improved using data, we need smart data whereby the focus shifts from volume to value.
Fast data: speed and agility for responsiveness
In order to react and pro-act, speed is of the utmost importance.
However, how do you move from the – mainly unstructured – data avalanche that big data really is to the speed you need in a real-time economy? Fast data is one of the answers in times when customer-adaptiveness is key to maintain relevance.
Big data analytics: making smart decisions and predictions
As anyone who has ever worked with data knows, even before we started talking about big data, analytics are what matter.
Without analytics there is no action or outcome. While smart data are all about value, they go hand in hand with big data analytics. In fact, big data analytics, and more specifically predictive analytics, was the first technology to reach the plateau of productivity in Gartner’s Big Data hype cycle.
Unstructured data: adding meaning and value
The largest and fastest growing form of information in the Big Data landscape is what we call unstructured data or unstructured information. Coming from a variety of sources it adds to the vast and increasingly diverse data and information universe.
To turn the vast opportunities in unstructured data and information (ranging from text files and social data to the body text of an email) into value, meaning and context need to be derived. This is what cognitive computing enables: seeing patterns, extracting meaning and adding a “why” to the “how” of Big Data.
What makes (Big) data actionable?
Without intelligence, meaning and purpose data can’t be made actionable in the context of Big Data with ever more data/information sources, formats and types.
Moreover, there are several aspects of data which are needed in order to make it actionable at all. Whether it concerns Big Data or any other type of data, actionable data for starters is accurate: the data elements are correct, legible and valid. A second aspect is accessibility, which comes with several modalities as well. Other dimensions include liquidity, quality and organization.
Big data in customer service
Today’s customers expect good customer experience and data management plays a big role in it.
In the increasing Big Data reality regarding customer service and the contact center, making sense of data from a customer service and customer experience perspective requires an integrated and omni-channel approach whereby the sheer volume of information and data sources regarding customers, interactions and transactions needs to be turned into sense for the customer who expects consistent and seamless experiences, among others from a service perspective.
Solving the Big Data challenge with artificial intelligence
Roland Simonis explains how artificial intelligence is used for Intelligent Document Recognition and the unstructured information and big data challenges.
Among the AI methods he covers are semantic understanding and statistical clustering, along with the application of the AI model to incoming information for classification, recognition, routing and, last but not least, the self-learning mechanism.
Data lakes for Big Data Analytics
Traditional methods of dealing with ever growing volumes and variety of data in the Big Data context didn’t do anymore. That’s where data lakes came in.
Data lakes are repositories where organizations strategically gather and store all the data they need to analyze in order to reach a specific goal. Neither the nature and format of the data nor the data source matters in this regard: semi-structured, structured, unstructured, anything goes. The data lake is what organizations need for Big Data Analytics in a mixed environment of data. However, there are challenges to this model as well, where Hadoop is a well-known solutions player, and data lakes as we know them are not a universal answer for all analytics needs.
Other useful links:
- Big data 2020: the future, growth and challenges of the big data industry
- The DIKW model for knowledge management and data value extraction
- Big data and customer service: turning volume into sense
- The gap between information management maturity perceptions and reality
- Information management and strategy – an executive guide
Big Data: order from chaos
There is a bunch of infographics, charts and data on Big Data.
While, as mentioned, the predictions have often changed by the time they are published, below is a rather nice infographic from the people at Visual Capitalist which, on top of data, also shows some cases of how Big Data gets used in real life and nicely illustrates a few other aspects of Big Data such as some sources of Big Data and some challenges.
Check out the ‘creating order from chaos’ infographic below or see it on Visual Capitalist for a wider version.
Top image: Shutterstock – Copyright: Melpomene – All other images are the property of their respective mentioned owners.