The Future of Data Integrity

Drivers for Information Archiving

September 17, 2019

The New Age of Archiving

In our last blog post we described the background of compliant archiving. In this article you will learn about the different drivers for information archiving.

With the rise of digitization and the ever-increasing variety and amount of data created, processed and inevitably archived, organizations are faced with governing and managing massive amounts of data. Without governance or management, a large data store can easily become a “data swamp”: there is little to no curation, no ordering, and no active management, and hence little to no value in the data itself. It would also fail to comply with many regulatory requirements.

A data lake, in contrast, has some semblance of structure and order, because active management is applied throughout the data life cycle together with contextual metadata and data governance. Introducing structure and policies, and thereby creating a data lake, allows vast amounts of data of various types and structures to be ingested, stored, assessed, and analyzed. Data lakes enable data scientists to mine and analyze data with minimal transformation, if any, and facilitate automated pattern identification.

Lastly, a data warehouse is the most well-structured and rigorously governed type of repository. Its data is cleansed, transformed, catalogued, and made available to managers and other business professionals for data mining, online analytical processing, market research and decision support. This in turn means that to retrieve and analyze data, and to extract, transform, and load it, there needs to be a managed, centralized repository of information about the data, such as its meaning, relationships to other data, origin, usage, and format (a data dictionary). This essential component may make it difficult to comply with regulations such as GDPR (General Data Protection Regulation) while still creating value from the information. Of these three, digital archiving trends towards data lakes rather than data swamps or data warehouses. To make such a concept compliant and effective, organizations need a second generation of archiving tools to manage their data lakes: to separate important information from irrelevant information, make it available to the right people across the organization, and be sure that the information in question has not been tampered with.
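One common way to check that archived information has not been tampered with is to record a cryptographic digest when a document is ingested and to recompute it later. The following is a minimal sketch of that idea, not a description of any particular archiving product; the function names and the sample invoice are assumptions made for illustration.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def is_unmodified(content: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at ingest time."""
    return fingerprint(content) == recorded_digest


# At ingest: store the digest alongside the archived document.
# (Hypothetical sample content, for illustration only.)
invoice = b"Invoice 2019-0917: 1,250.00 EUR"
digest_at_ingest = fingerprint(invoice)

# Later: any change to the bytes produces a different digest.
print(is_unmodified(invoice, digest_at_ingest))
print(is_unmodified(invoice + b" (edited)", digest_at_ingest))
```

A check like this only helps if the recorded digest is itself kept somewhere that cannot be changed together with the document.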

For modern information archiving we can identify three general pain points that drive companies to invest in improving their information archiving practices and processes: 

Retention policies put in place to meet compliance standards are of immense importance. The regulation with the greatest impact so far is GDPR, which requires that individual consumers give their explicit consent before companies commercialize the data generated from them. Further, the individual has the right to be forgotten, meaning that every organization is responsible for deleting all information tied to that individual. Each separate violation of this regulation may cost an organization up to 20 million euros or 4% of its annual revenue, whichever is greater.
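To make the "whichever is greater" rule concrete, the upper bound of a fine can be sketched as a small calculation. This only illustrates the ceiling described above; the function name is an assumption, and actual fines are set case by case by the supervisory authorities.

```python
FIXED_CAP_EUR = 20_000_000   # the 20 million euro figure
REVENUE_PERCENT = 4          # the 4% of annual revenue figure


def max_fine_eur(annual_revenue_eur: float) -> float:
    """Upper bound of a fine: the greater of the fixed cap and 4% of revenue."""
    return max(FIXED_CAP_EUR, annual_revenue_eur * REVENUE_PERCENT / 100)


# The fixed cap applies below 500 million euros of annual revenue,
# the percentage above it.
for revenue in (100_000_000, 500_000_000, 1_000_000_000):
    print(revenue, max_fine_eur(revenue))
```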

This and other regulations create the necessity for compliant information archiving solutions to provide interactive, secure long-term storage of electronic business content, such as: receipts, invoices, email, file systems, Microsoft SharePoint content, social media, instant messages, and a broad range of other structured and unstructured information. 

In addition to archiving capabilities, solutions need to include fast and easy search and retrieval of information, and allow organizations to set granular retention policies, which provide the foundation for Data Loss Prevention (DLP), Legal Hold, eDiscovery, and Information Governance. Information archiving solutions currently share a global revenue of 5.58 billion USD, which is expected to increase to almost 8.86 billion USD by 2022.
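How a granular retention policy interacts with a legal hold can be sketched as follows. The categories, retention periods, and names below are hypothetical and chosen only for illustration; real retention periods depend on jurisdiction and industry.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention periods in years per content category.
RETENTION_YEARS = {"invoice": 10, "email": 6, "chat": 2}


@dataclass
class ArchivedItem:
    category: str
    archived_on: date
    legal_hold: bool = False


def retention_end(item: ArchivedItem) -> date:
    """Date on which the retention period of the item ends."""
    target_year = item.archived_on.year + RETENTION_YEARS[item.category]
    try:
        return item.archived_on.replace(year=target_year)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return item.archived_on.replace(year=target_year, day=28)


def may_dispose(item: ArchivedItem, today: date) -> bool:
    """An item may be disposed of only after retention ends and without a hold."""
    if item.legal_hold:
        return False
    return today >= retention_end(item)


today = date(2019, 9, 17)
chat = ArchivedItem("chat", date(2017, 9, 17))
held_chat = ArchivedItem("chat", date(2017, 9, 17), legal_hold=True)
print(may_dispose(chat, today), may_dispose(held_chat, today))
```

In this sketch the legal hold always overrides the retention period, so content relevant to a proceeding is preserved even after its retention period has ended.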

Business organizations will typically deploy an information archiving solution to meet one or more of the following use cases: 

  • Compliance with Regulatory Requirements – some industries are heavily regulated so that organizations are required to retain and preserve electronic information to meet government and/or industry regulatory requirements. 
  • Litigation – during internal and external legal proceedings, organizations will need to efficiently search, discover, and retrieve all pertinent information. 
  • Internal Corporate Policies – organizations must manage and dispose of increasingly large amounts of electronic content according to internal corporate policies.
  • Leveraging Information through Content Analytics – organizations are increasingly using information archiving solutions to provide valuable insight into their stored data.  
  • Data and Information Security – information archiving solutions help secure information in a long-term repository, from which content can be easily restored in the event of a disaster or during any planned or unplanned downtime.

Learn more details about each use case in the white paper “Compliant Information Archiving – Digitalization and Regulation” here.