Data and information: definitions, types and meaning – from unstructured data to actionable information and tangible value.
Without a doubt you use one or more of the following terms constantly: data, information, content and knowledge.
Moreover, chances are high that you have read about or work with other terms describing various ‘types’ and/or characteristics of data, information and content, such as unstructured content and unstructured data, big data and semi-structured information.
Welcome to the terminology chaos of the information age where data, content, information and knowledge are sources of value and business as such, where big data analytics are key and the right data for the right outcomes matter more than ever. Time for an exploration of what all these terms stand for, what they mean for your business, today’s economy, transformation, innovation and obviously what matters most: how it can all lead to value for people, stakeholders, workers, consumers, you and us.
Definitions and terminology in a growing digital universe of data
Data and information are crucial assets to create value on various possible levels. They need to be protected as we would do with all important assets, they need to be treated with care, they derive their meaning from their purpose and they are becoming economic goods as such.
They enable businesses to tap into new revenue streams and even transform their very business models (information is key in digital transformation, as is data excellence).
In other words: understanding the potential value of data and information and the ability to capitalize upon it all, often even by detecting untapped opportunities in data sources you have but might overlook, is important. It’s one of the reasons we started using data lakes.
With an exponentially growing digital universe, mainly one of unstructured data, and the understanding that generating value from data in several possible ways is key, you hear all the mentioned terms constantly, and several of them are often used interchangeably, as people try to grasp the data chaos and information complexity and turn it into an opportunity. Moreover, all these terms are used by people with different functions and backgrounds, from IT to marketing, from data experts and information managers to legal. As you can imagine, they don’t always mean the same thing when using one of these terms.
Avoiding Babylonian confusions means avoiding misunderstandings and myths
As such the terms don’t really matter. We’ve always been strong advocates of looking at things in a holistic way with the end goals, value, customers and outcomes in mind.
However, at the same time it’s important to know what we are really talking about when we’re talking about it. This is especially true in these days, where several departments and functions are expected to work closer together in the overall context of digital transformation and, as a consequence, where speaking the same language does matter.
Time and time again we have noticed that Babylonian confusions reign at one point or another when something becomes more important, business-critical and occasionally hyped. This often leads to misunderstandings and sometimes even myths and false assumptions. That’s not what we want and it’s not what you want.
From data to wisdom: the DIKW hierarchy
For this journey across the marvelous and jargon-heavy world of information-related terms we’re going to refer – among others – to something which is known as the DIKW model (for knowledge management) or the DIKW Pyramid, whereby the D stands for Data, the I for Information, the K for Knowledge and the W for Wisdom.
If you never heard about it you’ve certainly seen some kind of similar model whereby a gradual process from data to actions and outcomes is depicted. The DIKW Pyramid is a hierarchical pyramid. In other words: Data sits at the bottom, Wisdom is at the top. While this seems logical it is also one of the main criticisms regarding the DIKW model.
We’ll mention it several times but if you really want to know all about it, check out this page. There are several reasons we use DIKW. In the scope of this guide the main one is that, as mentioned, views derived from it are used in information management, business intelligence, content management, business process management and so on.
Diving deeper beyond the models
As you know, the data and information space has changed considerably: from document capture systems to systems of meaning and insight (shall we say cognitive, data/content analytics and artificial intelligence?), and from the sheer volume of data and information sources to the diversity and complexity in which they are used by businesses and by people like you and us. In our capacities as consumers, workers and users of a myriad of digital, mobile and social platforms, these changes even lead to changing expectations.
Models remain models and, knowing all these evolutions, we need to dive deeper to understand what data, information, knowledge and content (indeed, content as a term is absent in DIKW) are, why they matter and, most of all, how to derive value from them (and protect and respect them) by creating systems of insight (another term lacking in DIKW but stay tuned as it matters). Moreover, with big data and analytics, the classic approach of information/knowledge management in some form of hierarchic pyramid is a bit hard to sustain and was always a point of debate, as mentioned.
However, for now we have a starting point for looking at data, information, knowledge and wisdom, the latter being the wisdom to not just leverage the right data and information but also to use it in a smart way for effective actions.
Defining data: the crucial building block at the bottom of the pyramid and the foundation of your business
So, at the bottom of our good old DIKW model sits data. Valid question: what is data? When looking at Wikipedia’s page on that famous pyramid we read that “In the context of DIKW, data is conceived of as symbols or signs, representing stimuli or signals, that are ‘of no use until…in a usable (that is, relevant) form’”.
There are literally thousands of other definitions out there but let’s say, as we did often before, that data as such is meaningless. It’s out there, we can set up systems to capture it, we can build systems to create it, there is loads of it but in the end it’s just simple raw and as such meaningless data about a myriad of possible things, objects, people and phenomena.
If a sensor says it’s 28 degrees Celsius, that’s data. If someone on your website enters their email address to subscribe to your newsletter, that’s data. With all this data as such you know nothing. It’s merely a raw fact such as a number or a text string.
Obviously there is a reason why we use a sensor to know the temperature or to gather any kind of number, text string and so forth. For a smart thermostat the fact that it’s 28 degrees can be information, as it has context and can even drive action: time to – automatically – turn the heating down (or the air conditioning on). An email address derives its meaning from the fact that it is used in the context of some form of action, for instance sending a newsletter.
However, most data is not as straightforward. There is a lot of data: structured, semi-structured and, increasingly, unstructured. So, most data needs to be acquired (captured, entered via a data pipeline) and processed with a goal and context in mind, making it information, which essentially is processed data. With Big Data, certainly in combination with the Internet of Things, the picture is even more complex.
Back to processing data. If you receive a paper invoice and scan it to feed some database and system that helps you deal with invoices, you capture the data from the paper invoice that matters (e.g. PO number, vendor number, bank account, etc.), and the act of processing that data for that specific goal makes it information that can be used by your accounts payable people.
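To make the thermostat example concrete, here is a minimal sketch in Python. It is only an illustration: the names Reading, ThermostatContext and interpret, as well as the target temperature and tolerance, are assumptions made up for this example and do not come from any real thermostat product.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    value: float  # the raw number from the sensor: data, nothing more


@dataclass
class ThermostatContext:
    unit: str         # what the number measures (assumed: degrees Celsius)
    target: float     # the temperature the household wants (hypothetical)
    tolerance: float  # how far off the target is still acceptable


def interpret(reading: Reading, ctx: ThermostatContext) -> dict:
    """Combine a raw reading with context so it becomes information."""
    delta = reading.value - ctx.target
    if delta > ctx.tolerance:
        action = "cool"   # too warm: heating down or air conditioning on
    elif delta < -ctx.tolerance:
        action = "heat"   # too cold: heating up
    else:
        action = "hold"   # within tolerance: nothing to do
    return {
        "temperature": f"{reading.value} {ctx.unit}",
        "delta": delta,
        "action": action,
    }


info = interpret(Reading(28.0), ThermostatContext("°C", 21.0, 0.5))
print(info["action"])
```

The number 28.0 on its own says nothing. Only once the unit, the target and the tolerance are attached can the system decide what to do with it.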
In other words: data gets its meaning after processing with a purpose, an end goal and a business process in mind.
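The invoice example can be sketched in a few lines of Python. This is a simplified illustration, not how any particular capture product works: the sample text, the field labels and the function name extract_invoice_fields are all invented for the example, and real scanned invoices are far messier than this.

```python
import re

# Hypothetical text as it might come out of a scan/OCR step: raw data.
RAW_SCAN = """ACME Supplies
PO Number: PO-10482
Vendor No: V-2291
Total Due: 1,250.00 EUR
"""

# The fields that matter for the accounts payable process (assumed labels).
FIELD_PATTERNS = {
    "po_number": re.compile(r"PO Number:\s*(\S+)"),
    "vendor_number": re.compile(r"Vendor No:\s*(\S+)"),
    "total_due": re.compile(r"Total Due:\s*([\d,]+\.\d{2})"),
}


def extract_invoice_fields(raw_text: str) -> dict:
    """Keep only the fields the business process needs; None if not found."""
    record = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(raw_text)
        record[name] = match.group(1) if match else None
    return record


print(extract_invoice_fields(RAW_SCAN))
```

The purpose decides which fields are extracted: a different process, say supplier risk analysis, would pull different fields from the very same scan.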
Data matters a lot: data quality and accuracy
The fact that data is at the bottom of the pyramid doesn’t mean it is less important. Well, on the contrary! When you have a business process or a goal that is served by data, information, content and insights, it’s quite obvious that data is the foundation.
You have probably heard of the acronym GIGO (Garbage In, Garbage Out). With most data capture it’s a very important given. Data needs to be of high quality, accurate, complete, timely and fit for purpose. As said, it also needs to be managed, protected and respected.
Data accuracy and data quality do matter a lot. And in the end, we capture data with the outcome in mind. It’s just like in information management or document capture: businesses need a holistic approach with the end goal driving the data (processing) and information needs.
If the foundations of a house consist of too much garbage and not enough steady ground and strong materials to support it, the house will fall down or need lots of repairs, which as you know can be so expensive that sometimes it’s even cheaper to rebuild the house altogether. It’s the same in data and IT. Just think about how often it’s better to replace legacy IT, rather than keeping it.
There are several reasons why data can be garbage or not fit for purpose. They include human errors, for instance in manual data entry or capturing data, flaws in the systems used to capture the data, corrupted data, the list goes on. We’ll talk more about GIGO, data quality and data accuracy later.
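A few of these problems can be caught with simple checks at capture time. The Python sketch below is an assumption-laden illustration, using the newsletter sign-up as its example: the field names, the one-year freshness limit and the function quality_issues are hypothetical, and the email check is deliberately crude rather than a full validation.

```python
from datetime import datetime, timedelta

REQUIRED_FIELDS = ("email", "captured_at")  # assumed minimum for a sign-up
MAX_AGE = timedelta(days=365)               # hypothetical freshness limit


def quality_issues(record: dict, now: datetime) -> list:
    """Return a list of reasons why a record may be 'garbage in'."""
    issues = []

    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            issues.append(f"missing {field}")

    # Accuracy (very rough): an email needs text on both sides of an '@'.
    email = record.get("email")
    if email and ("@" not in email or email.startswith("@") or email.endswith("@")):
        issues.append("malformed email")

    # Freshness: data captured too long ago may no longer be usable.
    captured_at = record.get("captured_at")
    if captured_at and now - captured_at > MAX_AGE:
        issues.append("stale record")

    return issues


now = datetime(2024, 1, 1)
print(quality_issues({"email": "jane.example.com", "captured_at": datetime(2020, 1, 1)}, now))
```

Checks like these do not make data good, but they stop the most obvious garbage before it reaches the processes that depend on it.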
Turning data into action: how are organizations doing?
Probably by now it’s clear that there is what we could call a data processing and ‘information transformation and activation’ process (often really a series of consecutive and/or highly connected processes) before we get from data to action and outcome/value. However, we can already give you some overview of how organizations are deriving value from their data – or not – and thus how successful – or not – they are in turning data into action.
According to Forrester’s Global Business Technographics® Data And Analytics Survey 2015, 29 percent of firms are good at turning data into action (all possible actions), even if 73 percent aim to be data-driven, as you can see in the SlideShare presentation below by Forrester’s Brian Hopkins, made on the occasion of a webinar by Cloudera on “the need to rethink big data and move from data to insights to business decisions”. More about these terms and the realities behind them, such as systems of insight and big data analytics, is coming soon, but you can indeed see that there is quite a big gap between aiming to be data-driven and effectively being able to derive value from data, turning it into action.
We see the exact same phenomenon in the gap between the actual ability of businesses to unlock the business value that sits in information and their perception of that ability, as mentioned in a post on 2015 research by PwC and Iron Mountain.
A range of other surveys come with similar findings and it shouldn’t come as a surprise, as there is a big difference between what we do in practice with data, content and information and what we could do. This is not just about big data and data analytics by the way. The gaps between the ‘feasible/desired/ideal’ state and the ‘actual’ state and even the gaps between the ‘necessary/urgent’ state and the actual state, all the way from very simple data capture processes to complex Big Data analytics, are still huge for many organizations.
Further information about the various forms and applications of data and information
Below is a list of articles with more content regarding the different types of data and information which we mentioned in this overview, as well as some sources on information management and data capture. Below that is the promised presentation by Forrester to look at.
- Big data in action: definition, value and context
- Unstructured data: turning data into actionable intelligence
- Information management and strategy
- Information capture: from document capture to omnichannel capture
- Smart data: from big data to smart data, processes and outcomes
- Fast data for the real-time customer
- Big data analytics: from big data to smart decisions
- From document capture to business process optimization and value
- What makes data actionable?
- Big data and customer service: turning volume into sense