Document imaging and document capture: to business process, outcome and value

Data and information are the lifeblood of the digital economy. Data is the new oil and the new currency. You’ve probably heard one of these statements. They express what indeed cannot be ignored: information is key for business processes, transactions, workers, customers and the very essence of business, as you can read in our information management guide.

Even if information and data are the great connectors that make businesses perform better, enable people to fulfil any given task and deliver numerous benefits, virtually no company can optimally leverage the information at its disposal. That’s OK. An ideal state is always a goal, never a given. However, most organizations are still very far away from some very essential ways to deal with the information they create and capture.

Four reasons why looking at document capture matters

In this overview we look at the evolutions in document capture, also simply called capture, first from a paper document and form perspective and next from a multi-channel perspective as capture is moving towards an integrated approach whereby paper and hard copy documents are joined by (the capture of) email and other sources and formats.

Moreover, capture, and Enterprise Content Management (ECM) as a process in which it plays a role, are shifting their focus towards business process management and optimization and, along with it, towards the outcome, the customer, the case and the specific process that serves a purpose and a user.

Why does understanding the essence of document capture and the changes in the space matter?

  • Loads of organizations still miss out on important opportunities when it boils down to digitization and automation, even at those very essential levels where urgent capture action is needed for this digital age.
  • Many organizations do realize the benefits of digitization but hesitate, as it’s a rapidly changing landscape with significant shifts that impact the way document capture strategies are designed. Indeed, strategies: document capture and scanning for business reasons is not just a matter of turning paper documents into digital images. It’s crucial, even in a customer-facing context. And without a clear strategy it doesn’t work.
  • Without digitization there is no digital transformation or innovation. You can have all the structured and unstructured content you want, you can pile paper documents up to the roof, you can scan and digitize for storage reasons: if you don’t digitize essential documents, leverage the data they contain and turn it into information (remember DIKW?) that drives better and transformative processes, that content can’t be used for digital transformation.
  • There isn’t a business, large or very small, that doesn’t have opportunities to gain rapid benefits and ROI from document capture. It doesn’t even need to be about digital transformation or business process automation/improvement or customer service. Although these are all important as we’ll see, there are also still plenty of opportunities to drive out paper to reduce costs, to enable your knowledge workers or to simply avoid the frustration that is called paperwork. Obviously, many of these opportunities and potential benefits are often highly connected in practice.

There are more reasons why document capture matters. We’ll cover them later, along with evolutions and tips for a better end-to-end capture approach, with a clear focus on business outcomes and on customer and worker value on the one hand, and on our simple mantras on the other: 1) without people and processes nothing works and 2) there is never a one-size-fits-all solution.

What is document capture anyway?

Document capture is the process of scanning paper documents (such as invoices or proof of delivery documents) and hard copy documents (such as books, ID cards, driver’s licenses etc.) and capturing electronic forms, documents and, increasingly, other forms of unstructured content in order to extract the relevant data and transform it into actionable information that serves a specific business goal or purpose.

The capture of paper documents includes the creation of digital images (with scanning or document imaging), whereby the important information you need to capture/scan is extracted, classified, stored, managed and fed into databases, often to trigger a myriad of potential processes, workflows and outcomes, whereby each process is connected with another.

The lines between document capture, information capture and data capture (information does not equal data) are blurring, which is why we simply talk about capture.

The sources of data and content to “capture”

Capture and input - source AIIM

Traditionally, capture was predominantly looked upon from an Enterprise Content Management (ECM) and document management perspective. This is still a big part of the overall document capture equation and is subject to evolutions as such too. One example: the use of mobile devices for (professional) document capture has led to the rise of mobile capture.

In the ECM approach as depicted in AIIM’s ECM framework, capture – or the input side of the information management reality – was divided into two “forms” of information entering the organization:

  1. Human-created information, which includes, among others, paper documents (scanning) and an increasing volume of mainly unstructured information in multiple formats and from multiple sources (email, social media, SMS, rich media, forms etc.).
  2. Application-created information, including information in and from ERP systems and a range of information-intensive applications with their various structures and ‘languages’ (for instance, XML).

Nowadays there are all forms and sources of data and information, ranging from emails and email attachments, forms and digital documents to paper documents, structured data and an ever increasing volume of unstructured data (which includes among others emails).

It’s clear that application-created information is strongly on the rise and will continue to grow in the context of the IoT and APIs. However, in the scope of this overview of document capture we limit ourselves to the forms and sources of information for capture displayed in the graphic, with paper documents taking center stage.

To summarize, this includes, among others, paper forms, electronic forms, rich media, office and paper documents, email, text, fax and, older, microfilm (human-created information) and information from applications coming from, among others, back-end systems such as ERP, front-office applications, e-Forms and XML (structured data).

A variation on the AIIM ECM model by BPO DATAMARK with a closer look at the capture part – source infographic

Document capture moves closer to – and gets embedded in – the business processes

Document capture has moved beyond simply scanning documents in rather separated environments such as mailrooms, and certainly beyond archiving. It has become more than only the automation of manual processes, as is often done by specialized business process outsourcers. It is even more than only a trigger for workflows and processes: it is starting to become embedded in processes, as the outcome matters more and the opportunities to capture closer to the source of the information, closer to the process and closer to the customer/worker have grown significantly. That’s the context in which case management and the focus on content-as-a-service (‘content services’) also need to be seen.

Today, capture is all about this end result, the process (with Business Process Automation and Business Process Management taking center stage) and the need to get information and data in processes and systems in order to serve a bigger business and value outcome. It’s about the customer, the user, the experience.

Strictly speaking, the term document capture doesn’t really cover what it is. In the end we don’t want to capture documents. We want to capture the relevant data and information from those documents, using several possible methods, and turn it into data in a database, a back-office system or a line-of-business application in the fastest, cheapest, most accurate, safest and most efficient way possible. Speed of processes and rapid outcomes are differentiators in today’s increasingly real-time and information-intensive age.

Document capture and the paper dilemma

We mentioned a few reasons why document capture matters before. Let’s – in general – say that document capture is a key activity for many organizations because there is still so much paper involved in business processes.

Making processes paper-free results in a fourfold speed-up in response times (Doug Miles).

Paper-based processes are slow, don’t enable the use of data for digital business processes and need to be digitized and thus captured so the resulting data can be leveraged.

Despite the strong attention for digitalization, in many industries and businesses the volume of paper even keeps growing. In others there is a decline, even if few have really achieved the full benefits of a more digital and less paper-intensive environment.

There are literally thousands of articles, reports and initiatives to drive paper out of business processes. However, the reality is that paper-based processes are still the norm rather than the exception, as information management expert Doug Miles put it at the AIIM Forum 2015, adding that paper-based processes kill productivity.

Process optimization and business at the speed of paper do not go hand in hand - Doug Miles at the AIIM Forum 2015 - picture J-P De Clerck

Obviously removing paper from processes through document capture doesn’t happen just like that. You need to start somewhere and in some cases paper is still required. The question where to start digitizing and automating is one of productivity, efficiency, ROI, compliance, business process needs and ultimately key business goals such as enhancing customer service, improving customer experience and/or saving costs.

Why do you need a document capture strategy?

The capture process consists of several stages, from the reception and preparation of documents that need to be digitized, through the actual capture and classification, to the storage, routing and delivery of captured information to various systems and destinations, where it triggers processes, feeds workflows or leads to a desired outcome.
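
The stages described above can be sketched as a simple pipeline. The Python outline below is purely illustrative: the stage order follows the paragraph, but the `CapturedDocument` class, the keyword-based classifier and the routing table are hypothetical stand-ins and do not reflect how any particular capture product works.

```python
from dataclasses import dataclass, field


@dataclass
class CapturedDocument:
    """Hypothetical record that travels through the capture stages."""
    raw_text: str                        # stands in for the scanned image or recognized text
    doc_type: str = "unknown"            # set by the classification stage
    destination: str = "manual review"   # set by the routing stage
    history: list = field(default_factory=list)  # stages the document has passed


# Assumed keyword rules; real solutions rely on recognition and indexing technologies.
KEYWORDS = {"invoice": "invoice", "delivery": "proof of delivery"}

# Assumed routing table: document type -> target system.
ROUTES = {"invoice": "ERP", "proof of delivery": "ECM"}


def receive_and_prepare(text: str) -> CapturedDocument:
    """Reception and preparation: here just whitespace clean-up."""
    doc = CapturedDocument(raw_text=text.strip())
    doc.history.append("received and prepared")
    return doc


def classify(doc: CapturedDocument) -> CapturedDocument:
    """Capture and classification: pick the first matching keyword rule."""
    lowered = doc.raw_text.lower()
    for keyword, doc_type in KEYWORDS.items():
        if keyword in lowered:
            doc.doc_type = doc_type
            break
    doc.history.append("captured and classified")
    return doc


def route(doc: CapturedDocument) -> CapturedDocument:
    """Storage, routing and delivery: unknown types go to manual review."""
    doc.destination = ROUTES.get(doc.doc_type, "manual review")
    doc.history.append("stored and routed")
    return doc


def run_capture(text: str) -> CapturedDocument:
    return route(classify(receive_and_prepare(text)))


if __name__ == "__main__":
    result = run_capture("  INVOICE 2015-001 from a supplier  ")
    print(result.doc_type, "->", result.destination)
```

Keeping each stage as its own function mirrors the point that the efficiency of the whole depends on every step: any stage can be measured or replaced without touching the others.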

That whole capture process needs to be looked at as the overall efficiency of any capture solution depends on these various steps. If your solution, typically involving hardware (document scanners) and software doesn’t have the right indexing and recognition technologies on board for your classification and capture needs (more about these technologies later) or you spend a lot of time preparing documents for scanning (often the de facto largest cost) and fail to include that time, your expected ROI will be wrong.
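
To make the point about preparation time concrete, here is a minimal sketch of how leaving it out distorts the expected ROI. All figures are invented for illustration, and the helper names `simple_roi` and `preparation_cost` are assumptions; this is not a benchmark or a formula taken from any study.

```python
def simple_roi(annual_savings: float, annual_costs: float) -> float:
    """Return ROI as a fraction: (savings - costs) / costs."""
    if annual_costs <= 0:
        raise ValueError("annual_costs must be positive")
    return (annual_savings - annual_costs) / annual_costs


def preparation_cost(docs_per_year: int,
                     prep_seconds_per_doc: float,
                     hourly_wage: float) -> float:
    """Labour cost of preparing documents for scanning (unstapling, sorting, ...)."""
    hours = docs_per_year * prep_seconds_per_doc / 3600
    return hours * hourly_wage


# Hypothetical yearly figures, for illustration only.
savings = 60_000.0
hardware_and_software = 20_000.0
prep = preparation_cost(docs_per_year=200_000,
                        prep_seconds_per_doc=18,
                        hourly_wage=20.0)

roi_without_prep = simple_roi(savings, hardware_and_software)
roi_with_prep = simple_roi(savings, hardware_and_software + prep)

print(f"ROI ignoring preparation:  {roi_without_prep:.0%}")   # 200%
print(f"ROI including preparation: {roi_with_prep:.0%}")      # 50%
```

With these made-up numbers, preparation labour alone doubles the yearly cost, so the ROI drops from 200% to 50% once it is included.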

As there is no one-size-fits-all solution, it’s key to look at your document capture needs, taking all those aspects into account, and to map out the various related processes and considerations. Obviously, the capture process isn’t a stand-alone one. Moreover, as mentioned, capture nowadays takes place within several business processes instead of being an almost separate one.

On top of the typical questions to ponder when seeking a professional document capture solution, which we’ll tackle later (what types of documents, how many, for which process, where does the digitized information go, what about compliance, ...), there are the overall processes and business goals in which the capture operations fit.

Automating more manual processes with document classification is an immediate priority for the improvement of current capture systems - source AIIM

By taking a holistic approach to the capture strategy, you can make sure you don’t overlook important cost factors, required resources and, most of all, opportunities. In the end, that’s why we need a plan and a strategy in the first place. It’s a way of analyzing needs with the end in mind, mapping the various elements and arriving at the best solution for individual needs. Along the way we can also see whether the current ways we work can be improved, even apart from the digitization as such, which is already a change in how we work.

Another reason to put a clear document capture approach in place has to do with mandates and reaching goals. The move towards a more paperless environment (which clearly requires a mandate) and measuring the efficiency of your overall document capture project are important in that regard.

In the end, document capture is essentially about business process management, automation and optimization with a clear scope and purpose in mind. But then again, isn’t that true of everything related to information management and beyond?

Capture beyond paper: the multiple ways to acquire information – towards multichannel capture

In line with the overall multichannel reality, the emergence of ever more information sources/formats and the evolutions in the market of vendors starting to look at capture beyond paper (which remains a key challenge though), we are increasingly talking about multichannel capture. It is one of the key evolutions in capture and even in ECM and information management.

The term is maybe not the best possible one (some say omnichannel capture) as it risks focusing too much on the channels via which information and data are captured. At least as important, however, are the sources (a channel is not a source), the formats and the types (variety), with unstructured data being the predominant one (and an umbrella term).

One of the many companies seeing the multichannel capture evolutions was Forrester. The company analyzed the market of multichannel capture vendors in 2012 in a “new” so-called Forrester Wave report. Authors Alan Weintraub and Craig Le Clair described the evolutions well in the introduction (the report is available via multiple vendors who are cited in it: some uploaded the PDF, others, such as EMC, uploaded it to SlideShare).

“Capture has extended beyond the single dimension of paper scanning in one or two primary locations to become the multichannel, distributed onramp for acquiring information”, the authors wrote. We couldn’t say it any better.

Next, they summarized some of the evolutions for the coming years, such as the incorporation of advanced analytics in capture, obviously mobile, a stronger integration with enterprise production platforms and the incorporation of BPM and case management. These evolutions are indeed part of what we’ve seen happening over the last few years and what will continue for several years to come, along with a few new phenomena which we’ll describe later.

Beyond document capture: from cloud to mobile and Internet of Things

As the change of thinking and working (in practice and in the vendor space) regarding capture is among the key topics in the ECM industry today, Forrester obviously isn’t the only one to cover it.

Harvey Spencer, who runs Harvey Spencer Associates and is a capture industry veteran, for instance, calls it Capture 2.0 (omni-channel capture).

AIIM, the association of information professionals, refers to content analytics, mobile, the cloud and collaborative technologies (the so-called MACC stack) as game changers in Enterprise Content Management. Cloud is also, on top of the already mentioned mobile and analytics, an important evolution we see in the capture space (and takes center stage in Harvey Spencer’s Capture 2.0 story).

And then there is the Internet of Things that will not only (and in fact, already does) have a significant impact on Enterprise Information Management overall but also opens up new opportunities for capture, even for document scanning.

Segmenting (document) capture: three core use cases

Before looking deeper at these various evolutions, let’s go back to the ways capture is split into various types and/or segments. The mentioned “The Forrester Wave™: Multichannel Capture, Q3 2012” report divided the capture market into three segments, which are related to three key use cases. There are really many use cases (vertical, horizontal, combined, ...) for capture, as each business, division, process, application, task and ecosystem is different, but Forrester’s three segments describe the overarching categories pretty well.

Batch capture

The production capture or batch capture segment remains an important one, even if it is changing as well (we see more hybrid approaches, also in the mailroom, which typically is an important part of this batch capture segment and where there are evolutions towards digital mailrooms). Similar evolutions towards more hybrid approaches can be found in other high-volume and high-speed capture environments, typically specialized BPOs and document service bureaus.

So, although batch or production capture certainly still has its role and place and is often chosen in very demanding and centralized scanning environments, it is changing. Production capture is mainly about document capture and is more used in high-volume capture environments in industries such as healthcare and insurance and by the mentioned business process outsourcers who are specialized in document processing, capture, document management and/or in Managed Content Services (MCS), although here it depends on the scope. Across all verticals and aside from the BPO niche, digital mailrooms are where you’ll see most high-volume batch capture.

Decentralization: the rise of distributed capture

The on-demand and point-of-service capture segment is where we see a lot of evolution and growth happening, for multiple reasons, certainly in document capture.

This on-demand and point-of-service segment is where we find the so-called transactional and distributed document capture approach, as opposed to the centralized capture model that plays a bigger role in production capture. The increasing use of transactional or distributed models goes hand in hand with changing business priorities, decentralization of organizations and processes as such and the technological evolutions, to name a few.

Highly specific capture

Application-specific capture with a focus on specialized processes is, as the name indicates, very specific.

We’ll soon cover all three of them through numerous typical use cases and processes and at the same time dive deeper into production capture, transactional capture and hybrid approaches with a look at the role of the cloud, mobile, IoT and other technological evolutions.

Document capture and document management

Document capture occurs through several technologies of which document imaging (scanning) refers to paper documents. Of course the information obtained through imaging and techniques to identify and digitize the needed information and classify it, is connected to a system (CRM, ECM, SharePoint etc.) or process/person to do something with it.

This is where, for instance, document management comes in. Although it’s often seen as an integral part of ECM (as are web content management, records management, BPM and even image processing applications as such), document management differs somewhat from content management in the strict sense.

Document management essentially refers to the activities of capture, storage and retrieval of documents, regardless of whether these documents are electronic or come in the shape of paper. And of course, in most cases neither stands on its own. If you get a paper document, it is probably linked to electronic documents and records, depending on the overall context within which the paper document fits (e.g. a claim, a legal case, an account opening, customer onboarding overall, an invoice etc.).

Document management - Shutterstock

Document management is one of those information management activities businesses are constantly dealing with. We get mails, letters and forms all the time, just as we receive other information carriers and information as such. While it’s probably not the most sexy aspect of what we do, it’s crucial and we do it constantly, even if we have no (electronic) document management system (think about how we file documents manually in our private lives). With an (electronic) document management system, it simply becomes easier to manage it all, in many respects and for many purposes, ranging from legal obligations to finding and sharing the information it holds. And of course these systems in turn can be connected with other systems, processes and workflows.

To wrap up for now, note that the input of information is not just about capture but also – increasingly – about automatically generated data and information through a myriad of new technologies and information processes with a multiplication of touchpoints (e.g. Internet of Things), APIs, analytics and algorithmic/cognitive computing processes (artificial intelligence).

The key takeaway for these various use cases and segments (and for which approach to choose), however, is and always will be that it depends most of all on the individual business context in the broadest sense.

Top image: Shutterstock – Copyright: jijomathaidesigners