Photo by Jade Masri on Unsplash

Legally compliant blockchain archiving

All Together Now

Article from ADMIN 54/2019
Unite multiple data warehouses and data silos set up for ERP, tax compliance, and EAI/ESB processes with blockchain-archived data.

A tax-compliant archive, a data warehouse, and an enterprise resource planning (ERP) system largely process the same data, but unfortunately each at its own persistence level. Companies can no longer afford this kind of redundancy. Deepshore provides an alternative concept that draws on blockchain as an important component in its implementation and thus adds often-missed value to the overall system.


As if the sheer volume of electronic information were not already challenging enough, virtually all large companies in this country allow themselves the luxury of processing data multiple times and then storing it redundantly. Even today, the areas of archiving, data warehouses, and ERP are still organizationally and technically isolated. IT departments do not like to show their hand, and an application isolation that is set in stone makes it no easier to design solutions across these system boundaries.

Different approaches for merging all data on a single platform include the many data lake projects of recent years. Such undertakings, however, were rarely driven by the desire to store data centrally to make the data usable across the board; rather, they were set up primarily to evaluate its content (e.g., as Google did) and use the knowledge gained in a meaningful way to optimize business processes.

Typically, companies in the past stored quite large amounts of data in a Hadoop cluster and only then began to think about how to extract added value from the dormant treasure trove. In many cases, such projects have not been able to realize their promised benefits, because mastering the diversity of platform components, each with its own complexity, is not trivial. To avoid this complexity, it is still common today, as it was 15 years ago in the ERP environment, to react to every technical application with a systemic standard response. Now it is essential to reflect on which use case the respective system or specific component (technology) is actually suitable for.

Complex Data Structures as a Challenge

The partly complex and heterogeneous data structures within a data lake prove to be a massive challenge: Information that was already indexed in the source system is usually stored in its natural or raw format without any business context, which does not make its subsequent use any easier. Current systems are unlikely to be able to extract the right information automatically from existing raw data and perform intelligent calculations. Experts consider the attempt to generate significant added value with a classic data lake approach a failure. However, giving up on the digitization of data would certainly be the worst alternative.

The search for the largest common data denominator was at the forefront of considerations on how to build a meaningful and uniform data persistence in an enterprise without running into the typical problems of data lake projects. In the case of electronic information, the specific question was: What are the technical requirements for data persistence? ERP should, for example, execute and control processes and, if necessary, provide operational insights into current process steps. A data warehouse (DWH) is expected to deliver aggregated reports, evaluations, and forecasts for tactical and strategic business management, and the archive should satisfy the compliance requirements of the tax authority or other supervisory authorities. Everyone knows that the same data is often processed redundantly in each of these systems. This considerable cost factor, which is reflected in the IT infrastructure, does not yet seem to have become a focus of optimization efforts because of technical and organizational separation. Indirectly, it is not only about persistence (i.e., storage), but also about the entire data logistics.

Highest Data Quality in the Archive

To come closer to the goal of a uniform data platform, the question now arises as to which business environment defines the highest data requirements in terms of quality, completeness, and integrity. For example, in the typical aggregations of a data warehouse scenario, completeness at the individual document level is not important. In the area of ERP systems, raw data is frequently converted and stored again when changes are made to the original data. In a correctly implemented archive environment, on the other hand, all relevant raw data is stored completely and unchangeably.
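
The contrast between these storage styles can be made concrete with a small sketch. The Python below is purely illustrative and is neither Deepshore's implementation nor a real blockchain: It assumes a hypothetical in-memory AppendOnlyArchive class that chains each raw document to its predecessor with a SHA-256 hash, so that later changes become detectable, and a hypothetical revenue_by_customer function that derives a warehouse-style aggregate from the same records. The field names (invoice, customer, amount_cents) are invented for the example.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveEntry:
    """One raw document plus the hash linking it to its predecessor."""
    index: int
    payload: str      # canonical JSON of the raw document
    prev_hash: str
    entry_hash: str


def _digest(index: int, payload: str, prev_hash: str) -> str:
    # Hash position, predecessor hash, and content together.
    data = f"{index}|{prev_hash}|{payload}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


class AppendOnlyArchive:
    """Hypothetical archive: documents can be added, never updated."""
    GENESIS = "0" * 64  # assumed placeholder hash for the first entry

    def __init__(self):
        self._entries = []

    def append(self, document: dict) -> ArchiveEntry:
        payload = json.dumps(document, sort_keys=True)
        prev_hash = self._entries[-1].entry_hash if self._entries else self.GENESIS
        index = len(self._entries)
        entry = ArchiveEntry(index, payload, prev_hash,
                             _digest(index, payload, prev_hash))
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any altered entry breaks it."""
        prev_hash = self.GENESIS
        for i, entry in enumerate(self._entries):
            if entry.index != i or entry.prev_hash != prev_hash:
                return False
            if entry.entry_hash != _digest(entry.index, entry.payload, entry.prev_hash):
                return False
            prev_hash = entry.entry_hash
        return True

    def documents(self) -> list:
        return [json.loads(entry.payload) for entry in self._entries]


def revenue_by_customer(archive: AppendOnlyArchive) -> dict:
    """Warehouse-style aggregate derived from the complete raw data."""
    totals = {}
    for doc in archive.documents():
        totals[doc["customer"]] = totals.get(doc["customer"], 0) + doc["amount_cents"]
    return totals
```

In this sketch, the aggregate is computed from the archived raw documents instead of from a second copy of the data, and verify() returns False as soon as any stored entry no longer matches its recorded hash. A production system would additionally need durable storage and a way to anchor the hashes outside the archive itself.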

Strangely, an archive, the ugly duckling of the IT infrastructure, normally has the highest data quality and integrity. The requirement to capture, store, and manage a lot of data about a company's processes in a revision-proof manner in accordance with legal deletion deadlines is a matter close to the hearts of very few CEOs and CIOs. Instead, legislation is the cause of an unprecedented flood of data and the need for intelligent storage: tax codes and directives for the proper keeping and storage of books, records, and documents in electronic form, as well as for data access, along with a multitude of industry-specific regulations.
