Data Age 2025: the datasphere and data-readiness from edge to core

The Digitization of the World study predicts that the installed bytes across the enterprise is to represent over 80 percent of total installed bytes worldwide in 2025.

The volumes of data created, captured and replicated keep growing fast: instant data, small data, big data, real-time data, more data everywhere. Ongoing digitization, the impact of new technologies and the data-driven economy behind digital transformation are among the main contributors to this growth.

The global datasphere will grow from 33 zettabytes in 2018 to 175 zettabytes by 2025. IoT devices are expected to create over 90 zettabytes of data in 2025. By 2025, 49% of all data worldwide will reside in public cloud environments as cloud becomes the new core. Nearly 30% of the world’s data will need real-time processing as the role of the edge continues to grow.

At the same time, the challenges of extracting value from data and managing it remain considerable. Moreover, demand for real-time analytics and data streams is on the rise, driven by the velocity aspect of big data and the impact of IoT.

Research firm IDC is one of several companies that track how data evolves in all its aspects. Its Global DataSphere research quantifies and analyzes the amount of data created, captured, and replicated across the globe. This growth has consequences for data management, information management, analytics and data strategy, with DataOps as one of the latest evolutions.

The digitization of the world – the datasphere continues to move to the edge

At the end of 2018, IDC released a white paper entitled ‘The Digitization of the World – From Edge to Core’ with the support of Seagate, the well-known data storage company that has been around for four decades.

In the announcement of the paper (also see the video below) IDC stated that the global datasphere had reached 33 zettabytes (ZB) of digital data in 2018 and could grow to 175 zettabytes (of data created, captured and replicated on a yearly basis) by 2025.
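To get a feel for what going from 33 to 175 zettabytes in seven years means, the implied yearly growth rate can be worked out from the two figures above. The sketch below is only an illustration: it assumes smooth compound growth between 2018 and 2025, which the forecast itself does not claim, and the function name is ours.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that turns `start` into `end` over `years` years."""
    return (end / start) ** (1 / years) - 1


START_ZB = 33           # global datasphere in 2018, per IDC
END_ZB = 175            # forecast for 2025, per IDC
YEARS = 2025 - 2018     # seven yearly steps

# Assumption: growth compounds evenly every year (a simplification).
cagr = compound_annual_growth_rate(START_ZB, END_ZB, YEARS)
print(f"Implied growth: {cagr:.1%} per year")
```

Under that assumption the datasphere would have to grow by roughly 27 percent each year, which amounts to more than a fivefold increase over the period.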

While endpoints continue to be the primary location for data creation in the short term, the fastest growth is forecasted to happen at the core and the edge – with more data stored in the core than in the world’s endpoints by 2025 (David Reinsel, senior vice president at IDC)

At the same time, Seagate and IDC announced a data-readiness index (called DATCON, which stands for DATa readiness CONdition). The goal: evaluate how well data is managed, used and monetized (typically through analytics, artificial intelligence and machine learning) in four industries that together account for almost half of the global enterprise datasphere: manufacturing, financial services, healthcare (digital health) and media & entertainment.

As the title of the white paper indicates, the importance of edge computing (managing, processing and analyzing data at the edge of the network for real-time purposes or whenever it makes business sense) is one of the topics tackled and an inevitable trend. The edge, with all the software, hardware, infrastructure and new services that have popped up in recent years (IoT platforms for the IoT edge, edge gateways, edge servers, micro data centers, the whole fog computing environment; the list is long), brought back a distributed computing paradigm so fast that nowadays it’s hard to find vendors without some edge computing offering. And the battles will be fierce, as the edge is where the margins and growth are for many.

Cloud to the core, edge for more – cloud data centers the new enterprise data repository

Contrary to what we sometimes read, this doesn’t mean that centralized data centers, let alone cloud computing, are done for. Quite the opposite.

Annual size of the global datasphere (source: IDC Datasphere whitepaper)

On the other hand, with the evolution from connected cars to autonomous cars and ever more edge use cases (not just in heavy industry, manufacturing & Industry 4.0, critical facilities and other areas where industrial IoT is big but also in financial services and even gaming) we haven’t seen the last use case for the edge yet. 5G and what it will enable is another factor.

Nevertheless, cloud is poised to become more prominent in the core of the network and, as virtually all market data and evolutions show, there is still ample room for public cloud to grow. In the words of Seagate and IDC: cloud is the new core. The ‘Digitization of the World’ paper found that 49 percent of the world’s stored data could reside in public cloud environments by 2025, with “cloud data centers becoming the new enterprise data repository as companies continue to pursue the cloud for increasing data processing and storage needs”.

Sensors and other endpoints in the global datasphere

One of the main drivers of the shift to the public cloud, on top of the traditional benefits, is IoT (the Internet of Things) with its smart sensors that ‘are constantly capturing, recording, and analyzing data in business environments’.

It’s worth remembering, by the way, that many IoT applications and use cases won’t need edge capabilities at all. Think of low-data applications where small volumes of data are sent now and then for less mission-critical purposes, as we see quite often in the space of LPWAN connectivity (certainly with the highest-latency standards among the various LPWA network protocols).

Yet here too change is afoot: plenty of applications are still to come, on top of the existing ones, which include anything related to video monitoring and security. The connected car is again an example. Constant connection and data capture will be even more important if autonomous vehicles take off.

IoT devices are expected to create over 90 zettabytes of data in 2025. Endpoints continue to be the primary location for data creation in the short term, says IDC SVP David Reinsel. However, “the fastest growth is forecasted to happen at the core and the edge – with more data stored in the core than in the world’s endpoints by 2025”, he adds. And this will be especially the case for major industries “as edge computing continues to be a key driver of business-critical factors and digital transformation”.

Data-readiness with the DATa readiness CONdition index (DATCON)

This brings us to those four industries Seagate looked at with its DATa readiness CONdition index, a.k.a. DATCON. It probably won’t come as a surprise that global data readiness differs per sector (and per region and more).

In 2025, each connected person will have at least one data interaction every 18 seconds. Many of these interactions are because of the billions of IoT devices connected across the globe, which are expected to create over 90 ZB of data in 2025.
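The 18-second figure is easier to picture as a daily count. The short calculation below is a back-of-the-envelope sketch, assuming the interactions are spread evenly around the clock; the forecast only gives the average interval, so the constant names and the even spread are our own simplification.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
INTERVAL_SECONDS = 18            # one data interaction every 18 seconds, per the forecast

# Assumption: interactions happen at a steady pace, day and night.
interactions_per_day = SECONDS_PER_DAY / INTERVAL_SECONDS
print(f"{interactions_per_day:,.0f} data interactions per connected person per day")
```

Since the forecast speaks of at least one interaction every 18 seconds, the result of 4,800 per day is best read as a lower bound.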

It’s no secret that data readiness, let alone the capacity to deal with data and data management overall, could be, well, a bit better. Silos, skills, culture, understanding analytics, monetization strategies, security, the ability to aggregate data with all our monolithic systems, you name it. In manufacturing, for example, loads of data are captured but a lot doesn’t make it to some central data lake or other system/repository (where it could be leveraged using machine learning, for example).

However, we evolve, and organizations do as well. Of the four industries in the DATCON index, manufacturing and financial services scored best overall at 3.3 each. Both make the greatest use of edge computing among the four industries, with opportunity for blockchain, analytics and AI.

A score of 3.3 is definitely ‘better than medium’ since the DATCON scores range from 1 (critical, as in ‘alert’) to 5 (optimized). The metrics used for the DATa readiness CONdition index include data growth, criticality, security, investment, management, skills and C-level involvement.

As Dave Reinsel puts it in the video made on the occasion of Data Age 2025 (embedded below), the DATCON index “is a calculated score that is synthesized across six weighted vectors and numerous metrics that emerge from surveys, research, industry experts, and other sophisticated modeling techniques”. That doesn’t sound like something you can try at home.

With a score of 2.4, healthcare has room for improvement in data readiness, Seagate emphasizes, adding that survey results indicate blockchain will be important for the industry, but nearly 60 percent of those surveyed lack a strategy or have yet to implement any initiative.

As consumers cede more control of their data to the enterprise, IDC expects that more data will be stored in the enterprise core than in all existing endpoints (David ‘Dave’ Reinsel comments on the Data Age 2025 research project and datasphere forecasts; picture of David Reinsel: source and courtesy IDC)

Media and entertainment, finally, has the lowest DATCON score at 2.0, which suggests that the sector is ripe for advanced data technologies, especially in data security and data management.

Healthcare data: becoming the number one datasphere segment

While media & entertainment scores worst, we’d like to highlight the healthcare industry, which has been digitizing and digitalizing for years.

Not just because of its 2.4 DATCON score but, most of all, because 1) healthcare is becoming the fastest-growing sector on a data level (and it is already huge today, even if you just look at EHR, healthcare analytics, imaging data, what’s happening in healthcare facilities or what’s ahead with IoT in healthcare) and 2) healthcare data are obviously among the most private and sensitive data you can imagine (one of the reasons they get extra protection under the EU’s GDPR, with its stringent data processing principles and obligations for data processors and data controllers alike, and its very broad definition of healthcare data as highly sensitive).

Seagate and IDC also single out the healthcare industry: while healthcare currently has the smallest share of the global enterprise datasphere among the key industries examined in the study, it is primed not just to grow the fastest but also to surpass the media and entertainment sector and match the financial services sector by 2025.

According to the announcement, the growth reflects advancements in healthcare analytics and imaging technology, as well as the increasing amount of real-time data created in medical care.

The enterprise as the data steward – responsibilities ahead on the road to Data Age 2025

Data management and data security (should) go hand in hand. Add personal data protection and compliance to the mix, and security really should be a top priority for organizations even if, admittedly, it now and then seems that people care less about breaches than one would expect.

The place and interplay of the endpoints, the edge and the core according to IDC (source: IDC Datasphere whitepaper)

However, it’s not just about the protection of personal data. Cybersecurity is simply a must to make digital business work. Or, as TÜV Rheinland put it when announcing an industrial cybersecurity report: “the existential question for many companies will be whether they can manage the security challenges in the digital economy. It may simply amount to a question of success or failure, without the opportunity to compromise”.

Why do we focus on this so much? On top of the obvious, here’s another interesting data point from Seagate’s communication: “The Digitization of the World study predicts that the installed bytes across the enterprise is to represent over 80 percent of total installed bytes worldwide in 2025”.

That brings quite some responsibilities and questions regarding data ownership, protection and stewardship, doesn’t it? Seagate recognizes that of course. We quote again: “The enterprise is fast becoming the world’s primary data steward in today’s connected world…this trend will only continue to amplify the data protection responsibilities of companies around the world”.

Strategy, storage, zettabytes and more datasphere 2025 resources and findings

So there is work ahead on all levels as we move to that 175-zettabyte datasphere: data management, security, deciding what fits best at the edge and what goes in the core, and so on.

Seagate’s CEO Dave Mosley: “We are at the beginning of an era where both data creation and data utilization are forecasted to grow rapidly over the next decade. While some industries are more prepared for digital transformation than others, all businesses need to be ready to act on a solid digital strategy in order to be successful in the data age”.

Strategy indeed. Seagate obviously offers plenty of the solutions needed to deploy these strategies. Chances are you’ve had a Seagate hard drive in some computer or device you owned. And hard drives (and the hard drive industry) aren’t going anywhere soon, although the face of storage obviously evolves too.

Data technologies are becoming central for productivity expansion, data monetization and value creation (Seagate CEO Dave Mosley comments on Data Age 2025; picture: source and courtesy Dave Mosley, Seagate)

That brings us back to the video below with IDC’s Dave Reinsel. To keep up with the storage demand stemming from increasing data creation, IDC forecasts that over 22 zettabytes of storage capacity must ship across all media types from 2018 to 2025, with nearly 59% of that capacity supplied by the hard drive industry.

In case you’re not familiar with a zettabyte yet and wonder what 175 zettabytes really means, Dave Reinsel knows how to make it a bit more tangible: “a zettabyte is a trillion gigabytes, and if one was able to store the 175 zettabytes we could reach by 2025 onto Blu-ray discs, then you would have a stack of discs that could get you to the Moon 23 times”. And another one: “Even if you could download 175 zettabytes on today’s largest hard drive, it would take 12 and a half billion drives”.
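The storage figures quoted above can be sanity-checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: it uses decimal (SI) units, treats “over 22 zettabytes” and “nearly 59%” as exact values, and all variable names are ours rather than IDC’s.

```python
GB = 10**9    # bytes in a gigabyte (decimal, SI)
TB = 10**12   # bytes in a terabyte
ZB = 10**21   # bytes in a zettabyte

DATASPHERE_2025_ZB = 175         # forecast datasphere size for 2025
DRIVES_QUOTED = 12_500_000_000   # "12 and a half billion drives"
SHIPPED_CAPACITY_ZB = 22         # capacity to ship from 2018 to 2025 (quoted as "over 22")
HDD_SHARE = 0.59                 # hard drive share of that capacity (quoted as "nearly 59%")

# "A zettabyte is a trillion gigabytes"
gigabytes_per_zettabyte = ZB // GB

# Drive size implied by spreading 175 ZB over 12.5 billion drives
implied_drive_tb = DATASPHERE_2025_ZB * ZB / DRIVES_QUOTED / TB

# Capacity the hard drive industry would supply, in zettabytes
hdd_capacity_zb = SHIPPED_CAPACITY_ZB * HDD_SHARE

print(f"1 ZB = {gigabytes_per_zettabyte:,} GB")
print(f"Implied drive size: {implied_drive_tb:.0f} TB")
print(f"HDD share of shipped capacity: about {hdd_capacity_zb:.1f} ZB")
```

The numbers hang together: the quote implies a largest drive of 14 TB at the time, and the hard drive industry would account for roughly 13 of the 22 zettabytes to be shipped.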

More can be found in the announcement of the DATCON index and ‘The Digitization of the World – From Edge to Core’ paper, and on the home page of the Data Age 2025 section of Seagate’s website, where you can download the IDC report (PDF), check out reports per region, download PDFs with a deeper dive into each of the four sectors mentioned, watch the video below and see another one with a discussion on edge computing, and far more.

In case you just can’t get enough: also check out the blog post from Seagate’s Editorial Chief, Global Content Marketing, John Paulsen, who zooms in on additional findings and adds some thoughts. The last word also goes to John Paulsen. “By 2025, almost 90 percent of all data created in the global datasphere requires some level of security, but less than half will be secured”, he writes.

So, time to manage it well because, as Paulsen states, “beyond the societal impact, if not managed well, this growing flood of data could result in businesses experiencing operational inefficiencies, delivering poor customer experience, and losing revenue”. We couldn’t have said it better.

For the DATCON analysis over 2,400 companies were surveyed. IDC has been calculating the size of the Global DataSphere for over a decade.