
The Future of Data Integrity

Challenge Disinformation – Proving Integrity

September 14, 2020

Mathematical Proof of Integrity

Last week we explored why data integrity is necessary in the social media environment, both to safeguard creators and to enable recipients to challenge disinformation. This week we delve deeper into the act of proving data integrity in search of a solution.

As mentioned before, the trustworthiness of information should be ensured through an objective, mathematical source of truth. If this mathematical truth can be reproduced and is safeguarded by a consensus, the possibility of interfering with the mathematics becomes incredibly small.
A technology that can deliver such a consensus-supported mathematical mechanism is distributed ledger technology such as blockchain. Here, a large network of independent computers, called nodes, computes a consensus-supported, immutable record to which original data can be anchored. After anchoring, a shareable digital proof needs to be created so that anyone can verify the data independently and easily. This is incredibly important for digital communication.
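
To make the anchor-then-prove idea more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular product: `ToyLedger` is a hypothetical, single-process stand-in for a real distributed ledger (there are no nodes and no consensus here), and the names `fingerprint`, `Proof`, `create_proof` and `verify` are invented for this example. SHA-256 is assumed as the fingerprint function.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


class ToyLedger:
    """Hypothetical stand-in for a distributed ledger.

    A real ledger is replicated across many nodes that agree by consensus;
    this toy version is only an append-only, hash-chained list in memory.
    """

    def __init__(self):
        self.entries = []

    def anchor(self, digest: str) -> int:
        """Append a digest and return its position in the ledger."""
        prev = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        # Each entry commits to the previous one, so history cannot be
        # rewritten without changing every later entry hash.
        entry_hash = hashlib.sha256((prev + digest).encode()).hexdigest()
        self.entries.append(
            {"digest": digest, "prev": prev, "entry_hash": entry_hash}
        )
        return len(self.entries) - 1

    def digest_at(self, index: int) -> str:
        return self.entries[index]["digest"]


@dataclass(frozen=True)
class Proof:
    """Shareable proof: the fingerprint and where it was anchored."""
    digest: str
    ledger_index: int

    def to_json(self) -> str:
        return json.dumps(asdict(self))


def create_proof(ledger: ToyLedger, data: bytes) -> Proof:
    """Anchor the fingerprint of the data and return a shareable proof."""
    digest = fingerprint(data)
    return Proof(digest=digest, ledger_index=ledger.anchor(digest))


def verify(ledger: ToyLedger, data: bytes, proof: Proof) -> bool:
    """True only if the data matches both the proof and the ledger entry."""
    return fingerprint(data) == proof.digest == ledger.digest_at(proof.ledger_index)
```

A recipient who holds the data and the proof can call `verify` without trusting the sender: changing even a single byte of the data yields a different fingerprint, so the check fails. Note that only the fingerprint is anchored in this sketch, not the data itself.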

New technologies that rely on ingesting digital data, such as machine learning, big data analytics, and AI, would benefit tremendously. These technologies can exclude human error from their processes by reducing those processes to machine-to-machine (M2M) interactions. They are, however, vulnerable to ingesting data that has already been tampered with or falsified, or that has been corrupted by error. Machines can be programmed to verify all data automatically and to detect and exclude data that has verifiably been tampered with. Data that hasn’t been anchored in a source of truth and doesn’t have the verification mechanism embedded can then be vetted more extensively and ingested with appropriate precaution.
Such M2M processes would also benefit from a mathematics-based source of truth that does not rely on authority. This makes trusted data exchange across systems possible and easy, without extensive and complicated data vetting and forensics.
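
The three-way handling described above (accept verified data, exclude tampered data, treat unanchored data with caution) could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the `Status` values, the `classify` and `ingest` functions, and the idea that the set of anchored fingerprints is available as a plain Python set are all hypothetical. In practice that set would be looked up on the ledger.

```python
import hashlib
from enum import Enum


class Status(Enum):
    VERIFIED = "verified"      # fingerprint matches an anchored record
    TAMPERED = "tampered"      # claims an anchored record but does not match
    UNANCHORED = "unanchored"  # no anchored record to check against


def classify(payload: bytes, claimed_digest, anchored_digests) -> Status:
    """Compare a payload against the anchored fingerprint it claims."""
    if claimed_digest is None or claimed_digest not in anchored_digests:
        return Status.UNANCHORED
    actual = hashlib.sha256(payload).hexdigest()
    return Status.VERIFIED if actual == claimed_digest else Status.TAMPERED


def ingest(records, anchored_digests):
    """Split (payload, claimed_digest) pairs into three buckets.

    accepted:    safe to ingest directly
    quarantined: unanchored, needs more extensive vetting first
    rejected:    verifiably altered, excluded from ingestion
    """
    accepted, quarantined, rejected = [], [], []
    for payload, claimed_digest in records:
        status = classify(payload, claimed_digest, anchored_digests)
        if status is Status.VERIFIED:
            accepted.append(payload)
        elif status is Status.UNANCHORED:
            quarantined.append(payload)
        else:
            rejected.append(payload)
    return accepted, quarantined, rejected
```

The design choice worth noting is that unanchored data is not discarded. It is set aside for closer vetting, which mirrors the distinction between data that is verifiably altered and data that simply cannot be verified.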

Data integrity and data verification are only useful for machine learning and data analytics if the data is actually verified. The same holds for society in general: even if we have the verification mechanism, it will only have an impact if we use it. This reinforces that responsibility lies not only with creators but also with recipients.

### Caveat
During this examination, verification of data integrity was introduced as a way to counter the spread of disinformation. By establishing ‘original data’ and making it easily verifiable and shareable, society could create a new standard for data integrity. This should encourage the spreading of content that hasn’t been altered and discourage manipulation by making it easily detectable. To further encourage honest editing, editors could use the same mechanism to defend themselves against misrepresentation of their additional work.
This measure is unlikely to fully counteract the spread of disinformation, as the topic is multi-faceted. However, it may help with the most aggressive type of ‘fake news’: content manipulation. It should also further discourage less direct forms such as misrepresentation and framing.

You can download the entire paper here.