Publication of our whitepaper on Best Practices for Admissibility of Web Archives


Nov 1, 2022

Law

In the autumn of 2022, we convened two pivotal workshops to focus on the evolving role of archiving web pages and social media in the context of international justice, particularly concerning Russia’s war against Ukraine. These workshops, one technical and the other legal, aimed to explore how recent advances in web archiving could support the collection, storage, authentication, and utilization of digital evidence in accountability proceedings for victims of the conflict.

The discussions formed the basis of a whitepaper authored by Scott Martin (Global Justice Advisors) and Basile Simon (Starling Lab), and of a set of best practices that outline the ideal characteristics of a web archive for use in court, drawing on the requirements of the Berkeley Protocol on Digital Open Source Investigations.

Best Practices for Web Archiving

According to this whitepaper, the ideal web archive demonstrates the following properties:

  • It can be produced by anyone, notably by individual actors with tools they can grasp and control (as opposed to using a commercial service or being granted access to a platform). This is correlated, to an extent, with the use of open-source and local software.
  • It is of high fidelity, meaning it was carried out by a tool that preserved most, if not all, of the original material.
  • It includes the content itself, its surrounding metadata, and the metadata of the web scraping software. This includes cryptographic hashes of all website assets and a signature over these hashes that ties them to the author.
    • Furthermore, cryptographic hashes and signatures must be preserved, that is to say, stored securely and made available for the long term, as would the content itself.
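The hashing property above can be illustrated with a minimal sketch. It is not the whitepaper's tooling: the function names (`build_manifest`, `verify_manifest`) and the manifest layout are assumptions made for illustration. It hashes each captured asset with SHA-256 and derives one digest over the whole set. Signing that digest with the archivist's key is deliberately left out, since the Python standard library has no public-key signature support.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(assets: dict[str, bytes]) -> dict:
    """Hash every captured asset and summarize the set with one digest.

    `assets` maps an asset URL to the bytes captured for it
    (a hypothetical in-memory stand-in for a real web archive).
    """
    entries = {url: sha256_hex(body) for url, body in sorted(assets.items())}
    # Serialize deterministically so identical assets always give the same digest.
    canonical = json.dumps(entries, sort_keys=True, separators=(",", ":"))
    return {
        "assets": entries,
        "manifest_sha256": sha256_hex(canonical.encode("utf-8")),
    }


def verify_manifest(assets: dict[str, bytes], manifest: dict) -> bool:
    """Recompute the manifest from the assets and compare with the stored one."""
    return build_manifest(assets) == manifest
```

In a sketch like this, the single `manifest_sha256` value is what an archivist would sign and preserve alongside the archive, so that a later change to any one asset becomes detectable.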

 

Establishing Clear Methodologies

To maximize the admissibility of a web archive as evidence, archivists and legal professionals must establish clear, detailed methodologies. These methodologies should document the provenance of the digital evidence — detailing where it comes from, how it was procured, who procured it, when it was procured, and the process followed. This includes documenting the chain of custody and demonstrating that the webpage has not been altered during archiving.

Key points include:

  • Detailed Record-Keeping: Identify the person conducting the archiving, their qualifications, and the web collection protocols observed. Describe the hardware and software used, and explain the process for selecting and assessing websites and articles for credibility and resistance to manipulation.
  • Storage Protocols: Describe measures against corruption, hacking, and other risks to ensure the integrity of the archives over time. This should be recorded in a chain of custody that tracks who has handled the document.
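One way to picture a chain of custody record is as a log in which each entry commits to the one before it. The sketch below is an illustration under stated assumptions, not a format prescribed by the whitepaper: the `CustodyEvent` fields and the helper names are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CustodyEvent:
    actor: str           # who handled the archive
    action: str          # e.g. "captured", "transferred", "stored"
    timestamp: str       # ISO 8601 string supplied by the caller
    archive_sha256: str  # hash of the archive file at this step
    prev_hash: str       # hash of the previous event, "" for the first one


def event_hash(event: CustodyEvent) -> str:
    """Hash an event's fields in a deterministic serialization."""
    payload = json.dumps(asdict(event), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_event(log: list[CustodyEvent], actor: str, action: str,
                 timestamp: str, archive_sha256: str) -> list[CustodyEvent]:
    """Return a new log with one more event linked to the current last event."""
    prev = event_hash(log[-1]) if log else ""
    return log + [CustodyEvent(actor, action, timestamp, archive_sha256, prev)]


def verify_log(log: list[CustodyEvent]) -> bool:
    """Check that every link is intact and the archive hash never changes."""
    for i, event in enumerate(log):
        expected_prev = event_hash(log[i - 1]) if i > 0 else ""
        if event.prev_hash != expected_prev:
            return False
        if event.archive_sha256 != log[0].archive_sha256:
            return False
    return True
```

Editing an earlier entry breaks the link stored in the next one. The final entry has no successor, so in this sketch its hash would need to be recorded somewhere independent for the whole log to be tamper-evident.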

Background on Workshops

Technical Workshop: Enhancing the Integrity of Web Archives

On August 25, Starling brought together experts in web archiving to discuss methods to preserve information for accountability purposes in Ukraine. The workshop delved into various collection, authentication, and preservation strategies, emphasizing the technical aspects that ensure the integrity of recorded web pages and other digital materials.

Participants first examined existing web archiving practices and their operation on a technical level, then discussed the potential risks to these archives that could threaten their integrity. A significant focus was on the vulnerabilities of storing web archives using traditional archival models. The discussion highlighted how a shift towards more distributed and decentralized models could offer improved long-term resilience and availability, essential for maintaining the integrity of the archives in unpredictable environments.

We thank the following participants for their contributions:

  • Mark Graham, from the Internet Archive;
  • Ilya Kreymer, from WebRecorder;
  • Michael Nelson, from Old Dominion University;
  • Nicholas Taylor, expert witness on the Internet Archive Wayback Machine;
  • Ed Summers, from the Stanford Libraries;
  • And Cade Diehm, from the New Design Congress.

 

Legal Workshop: Web Archives in the Courtroom

Following the technical discussions, a roundtable of legal experts convened on September 27 to explore the legal dimensions of web archiving practices. This group included lawyers specializing in war crimes and legal professionals experienced with digital evidence. The goal was to identify potential legal vulnerabilities in current archiving practices and determine how such materials could be admitted into evidence in courtrooms, particularly in war crimes and other international criminal proceedings.

The legal experts articulated best practices to ensure that web archive data are preserved, produced, and authenticated in ways that maintain their integrity. This enhances their reliability, utility, and probative value as evidence in a judicial context. The roundtable discussed the characteristics and challenges of various web archiving practices and presented a framework to assess these methods.

We thank the following participants for their contributions:

  • Scott Martin, from Global Justice Advisors;
  • Melissa Bender, from Ropes and Gray LLC;
  • Tim Parker, from Blackstone Barristers;
  • Cari Spivack, from the Internet Archive;
  • Karolina Aklamitowska, from Tallinn University;
  • Clare Stanton, from Harvard Law School;
  • Bastiaan van der Laaken, from the UN IIIM Syria.

Next Steps: Call for Contributions on Witness Servers

Finally, to improve the process of entering web archives into evidence, Starling is formalizing the concept of “Witness Servers” as an additional layer of self-corroboration for web archives. A Witness Server is a service, hosted and run by an institution, which carries out web crawls on demand on behalf of individuals conducting web archiving activities.

Participating institutions, e.g. the Stanford Libraries, WebRecorder, or the Harvard Library Innovation Lab, lend the individuals or teams on whose behalf they agree to witness the trust that might be placed in the institutions themselves. The roundtable findings identified this reliance on the social trust placed in institutions as particularly helpful in strengthening the work of potentially vulnerable investigators and archivists.

Several Witness Servers act in concert on the instruction of a web archivist and simultaneously capture the same web page. Such an approach addresses the possibility of a webpage having slight variations depending on locale (and many other potential anomalies) and otherwise corroborates the contents of a website through a replication process that validates the contents of a web archive from several different locations and actors. To learn more, or to participate as an institution or a researcher, read the Call for Contributions.
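The corroboration step performed by several Witness Servers could be sketched as follows. This is an assumption-laden illustration rather than the Witness Server design: the `corroborate` function, the `quorum` parameter, and the witness names in the usage note are hypothetical, and comparing whole captures byte for byte is a simplification, since real pages may differ legitimately by locale or time.

```python
import hashlib
from collections import Counter


def corroborate(captures: dict[str, bytes], quorum: int = 2) -> dict:
    """Compare captures of one URL made by several witnesses.

    `captures` maps a witness name to the bytes it captured.
    The capture is treated as corroborated when at least `quorum`
    witnesses produced byte-identical content.
    """
    if not captures:
        raise ValueError("at least one capture is required")
    digests = {w: hashlib.sha256(body).hexdigest() for w, body in captures.items()}
    top_digest, top_count = Counter(digests.values()).most_common(1)[0]
    agreed = top_count >= quorum
    return {
        "corroborated": agreed,
        "digest": top_digest if agreed else None,
        # Witnesses whose capture matches the most common digest.
        "agreeing": sorted(w for w, d in digests.items() if agreed and d == top_digest),
        # Witnesses to review by hand, e.g. for locale-specific variations.
        "divergent": sorted(w for w, d in digests.items() if not agreed or d != top_digest),
    }
```

For example, if `witness-a` and `witness-b` return identical bytes and `witness-c` returns a localized variant, the result marks the capture as corroborated and lists `witness-c` as divergent, which surfaces the variation for review instead of hiding it.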


Supervisory Testimony as a Novel Tool for Accountability


Jul 1, 2021

Law

The testimony of a primary observer’s supervisor could help bridge evidentiary gaps. Supervisory testimony would consolidate the institutional knowledge of the chain of custody of specific pieces of digital evidence in a single person.

Criminal investigations depend on evidence gathered in the field. Unfortunately, these scenes can be difficult to access (e.g. war zones, disaster areas) and individuals on the ground may not be accessible to courts (e.g. threats to safety, loss of contact). Emerging technologies could help overcome such challenges, providing novel solutions to weigh probative value and establish authenticity.

In the summer of 2021, and as part of the Human Rights and International Justice Policy Lab at Stanford Law School, the Starling Lab and Hala Systems convened a group of international experts on digital evidence and war crimes prosecutions to solicit feedback on their proposed model of Supervisory Testimony.

The gathering was chaired by Beth Van Schaack (then visiting professor at Stanford), assisted by Mackenzie D Austin (then Lab associate). 

Problem: Authentication and chain of custody of digital evidence registered on a distributed ledger

Data (including multimedia assets like photo and video) that is collected by field observers can present a number of evidentiary and admissibility issues when that data is submitted in accountability settings. Because of the ad hoc nature of evidence collection in conflict zones, the chain of custody of individual pieces of digital evidence can be especially hard to trace. Furthermore, a variety of custodians might have handled evidence before it reaches a host organization or tribunal, which can be difficult to track as well. The extended period of time between evidence collection, storage, and admission into evidence also presents the possibility that many primary observers and custodians may no longer be reachable by the time that accountability processes take place. Thus, those individuals would not be able to provide critical affidavits or testimony as to the veracity and authenticity of the digital evidence at the time of trial. All of these factors threaten the admission of digital evidence in a court of law.

Proposed Solution: Supervisory Testimony

Supervisory testimony, or the testimony of a primary observer’s supervisor (who may participate remotely), could help bridge the evidentiary gap. Rather than requiring a different witness to account for each individual link in the chain of custody, a single supervisor could account for the entire chain of custody. Supervisory testimony would consolidate the institutional knowledge of the chain of custody of specific pieces of digital evidence in a single person.

For example, a supervisor would be tasked with training and overseeing a set of field observers. During a years-long conflict, the supervisor would keep track of the provenance of the swaths of digital evidence submitted by their cohort of observers. Once an accountability mechanism (e.g. a tribunal) is initiated, that single supervisor would present a consolidated bundle of data (e.g. photographs) obtained by their field observers. As a witness, the supervisor would testify as to the chain of custody of specific pieces of evidence, having been present as a supervisor during the collection and storage process. That supervisor would also explain the general process of capture and transfer of all recorded data functions, including the training of field observers. Most importantly, that supervisor would provide a blanket certification of authenticity of the digital evidence. Ultimately, a single supervisory witness could stand in for the dozens of field observers who collected the data over an extended conflict and account for the entire lifecycle of digital evidence from the moment of capture to its presentation in court. This strikes an important balance, streamlining several obstacles to evidence admission while still allowing a defendant an appropriate party to cross-examine.

Technological tools like cryptographic hashes and distributed ledger entries have been pitched as solutions for chain of custody concerns, serving a bit like a notary system. However, those solutions may be insufficient to adequately account for the human observers whose testimony may still be necessary to authenticate digital evidence. Instead, the combination of technological authenticity markers and supervisory testimony would help shore up any gaps in the chain of custody and enhance authentication for accountability purposes. While technological solutions were discussed, the workshop primarily explored how both human and technological protocols are essential. Together, these robust protocols could be a new frontier in authentication.

Possible Implementation Methods

  • Asynchronous supervision: A supervisor would train new cohorts of field observers with an established protocol to ensure the authenticity of the collected data and reliability of the collection methods. This could include corroborating data, like photos or videos of the field process itself. A supervisor could also conduct debriefs with their observers to affirm that the collection protocol was followed at the time that specific evidence was collected. 
  • Direct supervision synchronous with the moment of capture: For example, a supervisor may text back and forth with a field observer at the moment they collect evidence. Alternatively, the field observer may livestream the evidence collection with their supervisor.
  • Development of uniform collection protocols: Hosting organizations that employ supervisors must develop uniform collection protocols to be used in the field. As part of these protocols, organizations should also contemplate ways to maintain impartiality and neutrality in the collection of their data, and consider common or likely rules of evidence.

