Working on Authenticated Data with Authenticated Attributes

Aug 22, 2024
By Cole Capilongo

Over the coming year, we are excited to focus on our new initiative, Authenticated Attributes, a design and prototype dedicated to enhancing the authenticity and trustworthiness of digital media. This effort addresses the growing challenges posed by AI-generated content, such as deepfakes, which make it increasingly difficult to distinguish between genuine and manipulated media.

What is Authenticated Attributes?

Authenticated Attributes aims to establish the integrity and authenticity of metadata for digital content, with a focus on digital media. To do so, the project uses cryptographic tools such as hashing, timestamping, and digital signatures. By leveraging these technologies, the Starling Framework provides a robust process that shifts the burden of proof from detecting falsifications to verifying and supporting authenticity. Authenticated Attributes enables investigators and journalists to understand and produce authenticated data, making their work more accessible and trustworthy.

The Challenge of Deepfake Detection

The current landscape of AI-generated content presents significant challenges. As AI technology advances, so does its ability to create realistic fake images and videos, making traditional detection methods less and less reliable.

Authenticated Attributes offers an alternative by ensuring that digital media can be verified independently, fostering trust in the content we consume.

The Technical Foundation

Our approach builds on three core cryptographic tools:

Cryptographic Hashes: These unique identifiers act like digital fingerprints for files. Any alteration to the file results in a different hash, enabling the detection of tampering.
Timestamping Services: By recording the hash of a file on a third-party service that establishes a trustworthy time (more on that in this dispatch), we can prove that the content existed at a specific time, which is crucial for verifying the authenticity of media.
Digital Signatures: These provide a way to authenticate the source of information, no matter who provides it. They also ensure that the content has not been altered since it was signed.
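To make the "digital fingerprint" idea concrete, here is a minimal sketch using Python's standard hashlib module. SHA-256 and the sample bytes are assumptions chosen for illustration; the text above does not say which hash function Authenticated Attributes uses.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for the bytes of a media file (hypothetical content)
original = b"bytes of a hypothetical media file"
# The same content with a single extra byte appended
tampered = original + b"."

original_hash = fingerprint(original)
tampered_hash = fingerprint(tampered)

# Hashing the same bytes again always gives the same fingerprint
unchanged = fingerprint(original) == original_hash
# Even a one-byte change produces a different fingerprint
tampering_detected = original_hash != tampered_hash
```

Anyone holding a copy of the file can recompute the fingerprint and compare it with a published one, without having to trust whoever supplied the file.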

So far, we have represented data as bundles composed of a main asset hash (often that of a piece of media), metadata, and integrity proofs. Authenticated Attributes does away with these monoliths of conjoined meaning and promotes a more atomic way to represent data: as atoms of facts related to a principal hash or CID. These atoms are simpler and more portable, and the relationships between them are much more explicit.
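As a rough illustration of the atomic model, the sketch below represents each fact as its own small record keyed by the asset's identifier. The class name, field names, helper functions, and identifiers are all hypothetical and are not taken from the Authenticated Attributes codebase. Signatures and timestamp proofs are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """One atomic fact about one asset (field names are hypothetical)."""
    cid: str        # hash or CID of the principal asset
    attribute: str  # name of the fact, e.g. "description"
    value: str      # the fact itself


def facts_about(cid, attestations):
    """Collect every attestation that refers to the given asset."""
    return [a for a in attestations if a.cid == cid]


def disclose(cid, attestations, allowed_attributes):
    """Share only selected facts about an asset, leaving the rest out."""
    return [a for a in facts_about(cid, attestations)
            if a.attribute in allowed_attributes]


# Placeholder identifiers, not real CIDs
photo = "example-cid-photo"
video = "example-cid-video"

store = [
    Attestation(photo, "description", "Crowd in a city square"),
    Attestation(photo, "captured_by", "Field reporter A"),
    Attestation(video, "description", "Interview footage"),
]

# Only the description is shared; the author's identity stays private
public_view = disclose(photo, store, {"description"})
```

Because each fact stands alone, one fact can be exported or withheld without unpacking a whole bundle, which is one plausible reason atomic records are easier to move between systems.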

Verification and claims

Metadata about a piece of media, and its provenance, can be verified by following these steps:

Acquiring or coming into contact with metadata: Metadata is stored in Authenticated Attributes with explicit reference to a unique media file. It can be retrieved from an Authenticated Attributes instance (e.g. as one would work inside a content-management system) or encountered separately, as an exported file (e.g. as a sidecar to a media file presented online).
Verifying reference to the original media: The attestation (our term for a single piece of metadata) contains the CID of the specific asset, usually a media file. A potential verifier can hash their own media file to derive its CID, then compare it with the CID in the attestation to ensure that the attestation refers to the same file.
Verifying provenance and integrity: The verifier can check that the signature is valid, and was made by the entity they expect, such as a specific individual or news organization. This proves the attestation has not been tampered with, and was authored by an entity they trust.
Verifying the time origin: They can also check the timestamp proof. This operation combines information in the attestation's timestamp proof with a calendar server and the Bitcoin blockchain. If the proof is valid, the verifier knows that the attestation (including the signature) was made before a specific time and has not been modified since. This “locks in” metadata to prevent organizations from secretly altering it in the future.
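The three checks above can be sketched with the standard library alone, under several loudly stated assumptions. A SHA-256 hex digest stands in for a real CID. An HMAC with a shared secret stands in for a real digital signature, which would use a public/private key pair. The timestamp check replays a tiny Merkle path against a locally computed root, where a real verifier would consult a calendar server and the Bitcoin blockchain. All function and field names are hypothetical.

```python
import hashlib
import hmac
import json


def content_id(data: bytes) -> str:
    """Stand-in for a CID: the SHA-256 hex digest of the file bytes."""
    return hashlib.sha256(data).hexdigest()


def canonical_bytes(attestation: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(attestation, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(attestation: dict, key: bytes) -> str:
    """Stand-in signature: HMAC-SHA256 (a real system would use public-key signatures)."""
    return hmac.new(key, canonical_bytes(attestation), hashlib.sha256).hexdigest()


def refers_to(attestation: dict, media: bytes) -> bool:
    """Check that the attestation points at this exact file."""
    return attestation["cid"] == content_id(media)


def signature_valid(attestation: dict, signature: str, key: bytes) -> bool:
    """Check that the attestation is unmodified and was signed with the expected key."""
    return hmac.compare_digest(sign(attestation, key), signature)


def replay_proof(start: bytes, steps) -> bytes:
    """Walk a Merkle path: combine with each sibling and hash, ending at the root."""
    digest = start
    for side, sibling in steps:
        pair = sibling + digest if side == "left" else digest + sibling
        digest = hashlib.sha256(pair).digest()
    return digest


# Hypothetical inputs
media = b"bytes of a hypothetical media file"
key = b"hypothetical-shared-secret"
attestation = {"cid": content_id(media), "attribute": "description",
               "value": "Crowd in a city square"}
signature = sign(attestation, key)

# The timestamped value covers the attestation and its signature
leaf = hashlib.sha256(canonical_bytes(attestation) + signature.encode("utf-8")).digest()
other_leaf = hashlib.sha256(b"someone else's submission").digest()
# In practice this root would be read from the blockchain, not computed locally
anchored_root = hashlib.sha256(leaf + other_leaf).digest()
proof = [("right", other_leaf)]

checks = {
    "same_media": refers_to(attestation, media),
    "signature": signature_valid(attestation, signature, key),
    "timestamp": replay_proof(leaf, proof) == anchored_root,
}

# Changing the metadata after the fact breaks the signature check
altered = dict(attestation, value="A different description")
altered_passes = signature_valid(altered, signature, key)
```

Each check is independent, so a verifier who only has the sidecar file and their own copy of the media can still run the first two without contacting any outside service.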

Integration with Asset-Management Systems

Authenticated Attributes supports verification workflows and the consumption of authenticated, verifiable data. It will be integrated into our research and verification processes, allowing for the creation of verifiable records of digital media. This system not only helps in identifying tampered content but also provides a way to preserve the original media’s integrity over time. Researchers, journalists, and the public can rely on this system to verify the authenticity of images, videos, and other digital content.

The original Authenticated Attributes prototype formed the backend of our latest integration with an asset-management system, allowing investigators and journalists to work with and produce authenticated data. We have integrated with Uwazi, a leading human rights product created and maintained by HURIDOCS. This integration makes it easier to work with and consume authenticated data, enhancing the accessibility and reliability of digital media.

Our reference implementation of the Starling Framework is presently undergoing a re-architecture to conform to the Authenticated Attributes design. This is motivated by learnings from the past two years operating in the field, so we can better handle interoperability and selective disclosure of potentially sensitive metadata to different parties.

Technical Details

The Authenticated Attributes project utilizes several sophisticated tools to ensure data integrity and provenance:

IPFS (InterPlanetary File System): A distributed file system that allows secure and efficient storage and sharing of media.

Hyperbee: A key-value database built on the Hypercore protocol, enabling peer-to-peer interactions. Hyperbee supports a decentralized network of creators and consumers of authenticated data, enhancing the robustness and accessibility of our platform. Its peer-to-peer nature is particularly beneficial for applications like the Index for Accountability, a use case for the United Nations, where a decentralized network can enable more secure and reliable verification and documentation of human rights abuses.

Verification systems: Asset-management systems that are integrated, or being integrated, with Authenticated Attributes to manage and verify digital assets, with integrity proofs packaged in industry standards such as the C2PA Specification and Verifiable Credentials.

To learn more about the project and to get involved, visit our GitHub repository and read our detailed documentation.