New Evidence Techniques Document Bombed Ukrainian Schools For ICC


Empowering justice through evidence: Starling Lab leverages cryptographic tools to document attacks on Ukrainian schools for criminal accountability at the International Criminal Court.

Basile Simon


Background

The large-scale invasion of Ukraine by Russia in February 2022 triggered widespread international outrage. This was accompanied by a surge of interest in the conflict from many in the West: not only did the conflict feel closer geographically than what many of us had experienced in our lifetimes, the information coming out of Ukraine was also of extraordinary quality (and impact). Smartphones seemed to be in every pocket, and strong 3G and 4G networks around the country added to the sense of immediacy and rawness of this new open conflict on the eastern edge of Europe, which marked a dramatic escalation of a conflict that had been simmering since Russia’s initial intervention in Ukraine in 2014.

The ongoing conflict has severely impacted Ukraine’s education system, with over 3,790 educational facilities damaged or destroyed since 2022. This destruction has left more than five million children struggling to access education, compounding previous disruptions caused by the COVID-19 pandemic and ongoing conflict in Eastern Ukraine since 2014. Many schools have allegedly been used for military purposes, turning them into legitimate targets and further endangering students’ lives. As a result, Ukrainian children are facing significant learning losses, particularly in foundational subjects like language and mathematics. International efforts are focused on supporting Ukraine in rebuilding its education infrastructure and ensuring that students have access to safe learning environments, whether through physical reconstruction or online alternatives. Furthermore, the disruption emphasizes the need for transitional justice, as documenting these attacks is essential to empower the affected generation with the evidence needed to pursue future accountability and justice for war crimes against educational facilities.

Concerned with allegations of harm in Ukraine, the Lab’s law program made international criminal law a key focus of our research. We started Project Dokaz (“Доказ”, Ukrainian for proof) in partnership with Hala Systems, a social enterprise with which we have long-standing ties through consultations on their early-warning systems projects in Syria.

This coalition was led by the Lab’s newly-appointed accountability lead, Basile Simon, and Hala’s director of accountability, Ashley Jordana. Basile joined from past work on OSINT methodologies and investigations, including developing an open source suite of tools aiming to meet evidentiary standards. Ashley, a Canadian-qualified barrister, had joined Hala Systems with a wealth of international trials experience, including strategic litigation covering notably Ukraine.

We were motivated by the words of the International Criminal Court Prosecutor, Karim A.A. Khan KC, who four days after Russia invaded Ukraine and escalated the conflict simmering since 2014, said he had tasked his team with “explor[ing] all evidence preservation opportunities.” We set out to demonstrate how our methodology could apply to an ICC set of standards.

The first research partner of Project Dokaz was the Digital Forensic Research Lab at the Atlantic Council, a Washington D.C. think-tank. The DFRLab team focuses on implementing OSINT methodologies and monitoring for the purpose of studying disinformation campaigns and documenting human rights abuses.

Michael J. Sheldon, a researcher at the DFRLab at the time, compiled a list of social messaging material allegedly posted by Russian entities about the invasion of Kharkiv. Sheldon’s systematic monitoring focused on March 2-16, 2022, the period during which Russian armed forces approached the city. The dataset was shared with the Lab and Project Dokaz, and served as the basis of the investigation.

Another early-day partner of the project was Scott Martin from Global Justice Advisors, a Washington DC-licensed attorney based in Europe. Scott has spent decades investigating international crimes and working in the international criminal tribunal system. His experience putting together these investigations, as well as preparing submissions to the ICC, was instrumental in our ability to translate what this evidence base could mean in an ICC context.

The idea that such a coalition of organizations and individuals could contribute to accountability is not new. In fact, the role of “Track II” practices has been studied for several decades. This refers to non-state mechanisms that use civil society, NGOs, and technology to document human rights abuses and ensure accountability when formal legal avenues are constrained or delayed. Starling Lab and Hala Systems embody this approach through their use of decentralized technologies and cryptographic tools to capture, secure, and preserve evidence of war crimes in Ukraine.

Context

The monitoring compiled by the DFRLab included messages from several Telegram “Channels,” which the investigative team systematically searched using English and Ukrainian keywords. In a busy environment with considerable noise in the beginning of the conflict, this methodology permitted the monitors to narrow down their focus on 144 Telegram posts containing instances of artillery fire which could be geolocated with some certainty. Geographical location was established using details included in the media and cross-referenced with publicly available information such as satellite imagery.

From this original dataset, a team of legal experts at Hala Systems picked incidents they thought would be the most relevant and reliable for further investigation. Drawing on the qualifications and experience of Ashley (rostered Justice Rapid Response expert and advisor to the Crimean Prosecutor’s office, both on crimes against children), and inspired by Starling’s roots in higher education, the investigative team decided to focus on allegations of attacks against schools – protected objects under international law. Article 8(2)(b)(ix) of the Rome Statute prohibits intentionally attacking buildings dedicated to religion, education, art, science or charitable purposes, historic monuments, hospitals and places where the sick and wounded are collected, which are not military objectives, as a serious violation of the laws of war in the context of an international armed conflict.

The legal team put forward an online investigation plan going beyond the Telegram messages and looking into the prior and subsequent attacks on the same schools, nearby military installations, and the context of the armed conflict at the time. 

The Starling Lab identified a methodology and tools for the forensic capture of the Telegram messages, as well as their verification, along with corroborating online information that required archiving. 

To illustrate the complexities and need for corroboration, one incident involved public counter-allegations aiming to justify Russia’s targeting of the school by claiming that it hosted a group of soldiers from Ukraine’s Azov Battalion. Indeed, protected objects can lose their status and become legitimate military objectives if taken over, even partially, by combatants and other active participants in hostilities.

Remnants from an air-delivered bomb which hit the running track around a Kharkiv school playground.

Over the course of the investigation, the need arose to document consequences of the attacks beyond what could be observed through the Telegram posts and online verifications (e.g., closer shots of a wall pockmarked by shrapnel, and of the damage from what appeared to be an air-delivered bomb). Individuals local to Kharkiv were hired to take photographs and videos of two impacted schools: School 17 and School 35.

Starling Lab had two objectives when assembling tools to support the field verifications. First, photographs and video captured by the team should be demonstrably attributable to them in the future. Particularly, should the photographers not be available to testify in the future, the attribution and metadata of the photos must have a mechanism to independently establish their authenticity and credibility. Without direct photographer testimony, documented provenance and digital verification methods (such as tamper-evident technology or cryptographic signatures) are essential to uphold the legal validity of the evidence. Second, for safety reasons, producing strongly-authenticated data could not hamper or slow down the field work.

To make the most of the field photographs and their ability to better contextualize the physical space surrounding the two schools, the Lab produced an interactive map. Photographs were laid out to show where the camera was aiming, and each included strongly-authenticated metadata pulling directly from image registration records. The interactive map itself was hosted on and served from distributed systems.

The resulting evidence base included original Telegram posts documenting the allegations, online OSINT verifications by investigators, and field photographs confirming specific points of detail. This supported the team’s drafting of two communications to the ICC: one detailing the original allegations and a follow-up focusing on the production of the field photographs. Both were shared with the ICC Office of the Prosecutor under the terms of Article 15 of the Rome Statute, which stipulates:

“The Prosecutor shall analyze the seriousness of the information received. For this purpose, he or she may seek additional information from States, organs of the United Nations, intergovernmental or nongovernmental organizations, or other reliable sources that he or she deems appropriate, and may receive written or oral testimony at the seat of the Court.”

Following these submissions on June 10, 2022, and January 17, 2023, a briefing took place with investigators at the Office of the Prosecutor, as well as subsequent conversations with some of the Court’s IT staff. The object of these meetings was to socialize the concepts, as well as their potential benefits and costs should they be considered beyond individual cases and applied more broadly to how the Court manages digital information.

Framework

The Challenge

Starling Lab wanted to maximize the admissibility chances and probative weight of the material: web archives of social messages, supporting online documentation, and photographs and videos from the field.

All data in these investigations are subject to risks of loss, tampering, and damage. Content hosted on the web, including social media and messaging, is particularly vulnerable and must be preserved separately and independent from the platform through web archiving. In all cases, the material must be protected against deliberate attacks (tampering) and accidental loss or damage.

Moreover, we believe the digital evidence collected today will be assessed and examined in a near future when cheap and accessible generative AI can upend our expectations of “trustworthiness.” We aim to collect, verify, and put forward stronger, better contextualized data by preserving digital content alongside provenance and integrity records.

 
A single Telegram message, which can be captured as a web archive using specialized tools (left). Many such messages can be aggregated and presented to provide context and corroborating evidence for an event (right).

The Prototype

These investigations were underpinned by the Lab’s research into the creation of authenticated data and the claims that can be made about it, notably as it relates to digital evidence. The methodology, jointly developed by Starling and Hala, is best summarized as three phases in the lifecycle of a piece of evidence: capture to adequate evidentiary standards; preservation for the long-term; and availability for analysis, scrutiny and investigation. This “Capture, Store, Verify” Framework is the subject of a Technical Whitepaper.

The capture of web pages and social messages
We previously identified notable threats in the current practice of producing web archives and backups of web pages, and summarized them in a November 2022 whitepaper, “Best Practices for Web Archiving.” In short: Content on the web, and social media / messages in particular, is subject to rapid rot and deletion, and it is crucial to preserve it should it be required as evidence. The whitepaper also outlines criteria of strong web archives, as well as preservation techniques for their use as evidence: 

For maximal probative value, “the ideal web archive (…):

  • Could have been produced by anyone, meaning that the material was publicly accessible (i.e. not privileged or obtained through deception) and the tools to produce it are available to everyone (i.e. not proprietary). This could be either an institutional actor or an individual (see characteristic No. 1 above); and
  • Is of high fidelity, meaning it was carried out by a tool that preserved much if not all of the original material (see characteristic No. 2 above); and
  • Includes the content itself, its surrounding metadata, the metadata of the web scraping software, its hash value, and the signature of these hashes authenticating it to the author (see characteristic No. 3 above).

The aforementioned hashes and signatures must be preserved, that is to say stored securely and made available for the long term, as would the content itself.”

At this point in time, the whitepaper’s criteria for authentication are satisfied by the use of modern web archiving tools such as Browsertrix, the Webrecorder suite, and the cryptographically-signed WACZ standard. Another important benefit of the use of the aforementioned tools is their unique offering in terms of contextualization and renderability (see Berkeley Protocol). These archives not only include the files of the webpages themselves, but also a replay engine, making the web archive effectively browsable. Investigators can interact with a web page as it was at the time of archiving, including interactive elements like clickable links. The inclusion of the replay engine defends against technological change which could render such viewing and engagement impossible, thus making the archive obsolete.

Downstream from this capture of web archives, the Starling Framework confers further forensic qualities on the material by establishing additional data points corroborating provenance and by distributing proofs of integrity. The Framework’s storage policies and choices also aim to keep the material available for the long term.

The legal team included verifications of counter-claims which could have turned out to be exculpatory. Above, an archived Telegram message from a group assessed as pro-Russian, claiming the bombardment of School 17 was legitimate due to the prior presence of armed forces of Ukraine’s Azov Battalion – a claim our submission refutes.

The production of strongly-authenticated field photographs
The field photographs, taken at selected schools in Kharkiv, were captured using guidelines from the Framework. They include an additional layer of context in the form of metadata. Traditionally this includes verifiable location, date, time, etc. Our legal submissions make the point that this metadata should also provide corroborating information about the photographs, the photographer, and their smartphone.

From a custody perspective, this layer of metadata gives the ability to make determinations regarding the person who used the device to take photographs, as well as where and when the photographs were taken. Notably, these determinations rely on the attributability of cryptographic keys. In other words, the photographer could provide access to the smartphone and prove the photographs come from the key contained in a camera app. The method used to cryptographically seal the photographs provides secure timestamping of their capture. According to the guidelines of the Framework, this is attested by a third-party, who also stores proofs of this timestamp, permitting later auditing.

Once again, downstream from this capture of photographs, the Starling Framework confers further forensic qualities on the material by establishing additional data points corroborating provenance and by distributing proofs of integrity. The Framework’s storage policies and choices also aim to keep the material available for the long term.

The verification displays use both Content Credentials Verify tools and metadata registered on distributed ledgers (see paragraph below).

The inclusion of strong metadata and cryptographic registration receipts
Both submissions consist solely of digital evidence. Both sets of original assets were preserved and stored according to the methodology. As such, they possess unique properties. Overall, the Framework aims to produce a more authenticated, traceable, and auditable digital evidence base:

  • Authenticated: The authenticity of digital items preserved through the Framework can be established by comparing the hash value of the files with those submitted to the Court or stored with third-party record holders.
  • Traceable: The provenance of digital items preserved through the Framework can be demonstrated by referring to timestamps proving date of reception (typically through registration with those third parties), and cryptographic receipts proving the preservation of the material on cold storage network systems.
  • Auditable: The auditing and verification of all digital items preserved through the Framework can be carried out by referring to the third-party record holders.
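In practice, the “Authenticated” property above comes down to recomputing a file’s hash and comparing it with the value held by the Court or a third-party record holder. The following is a minimal sketch in Python, assuming SHA-256 as the hash function (the submissions’ exact algorithm is not stated here); the function names are hypothetical and not part of the Framework.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_registered_hash(path: Path, registered_hex: str) -> bool:
    """Compare a freshly computed digest with a previously registered one.

    The registered value is normalized (whitespace, letter case) because
    hex digests are often copied by hand from spreadsheets or explorers.
    """
    return sha256_of_file(path) == registered_hex.strip().lower()
```

A single changed byte in the file produces a different digest, so a match gives strong assurance that the file examined is the one that was registered.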

We argue that these additional integrity characteristics, while not strictly necessary today considering the permissive standards of the International Criminal Court (which rest on a three-part test of relevance, probative value, and prejudice), will be the markers of the strong evidence bases of tomorrow. The Lab’s research covers several industries, all of them dealing with issues related to synthetic content and generative artificial intelligence. We have found that these characteristics are often present in distributed ledgers like blockchains, which also don’t require their users to place their trust in single centralized authorities. As international criminal proceedings tend to span years, if not decades, the evidence bases created today must seize this opportunity to demonstrate their integrity and resilience over time.

One of the photographs, presented in situ in the context of the school’s geography and the other images, with the photograph’s integrity data also displayed.

Cryptographic signatures, employed throughout the submissions to secure the digital items, their integrity proofs, and the transport to and through the archive, are one such strong marker permitting the verification of attestations to individuals and organizations. These might be supported by witness testimony or an affidavit verifying the individuals were in possession of the keys and carried out such and such action.

The inclusion of registration receipts in the submissions, including hyperlinks to online “block explorer” tools pointing to these receipts, aimed to permit some level of auditing by the staff of the Office of the Prosecutor. While conversations with the Office revealed a lack of familiarity with these techniques, we argue that the presentation of this data in a familiar spreadsheet environment and interactive map of photographs enabled a relatively accessible auditing beyond prima facie consumption of the photographs.

The submissions themselves contained lengthy methodological descriptions of the processes followed by the investigation, as well as of the meanings and claims of the integrity layer.

 

Technology

Capture

Web archives (of webpages and social messages) were created using tools developed by Webrecorder. Each of these archives results in a cryptographically-signed file matching the WACZ specification, and can be browsed independently and offline using the ReplayWeb.Page tool. For more about WACZ in this context, see this introductory dispatch from the Lab.
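Because a WACZ file is a ZIP package, its internal consistency can be checked with standard tooling. The sketch below assumes, based on our reading of the WACZ specification, that the package root contains a `datapackage.json` whose `resources` entries each carry a `path` and a `hash` of the form `sha256:<hex>`. It only checks that each listed file matches its recorded hash; it does not validate the cryptographic signature, and the function name is hypothetical.

```python
import hashlib
import json
import zipfile


def check_wacz_resources(wacz_path) -> dict:
    """Map each resource path listed in datapackage.json to True or False.

    True means the file inside the archive still matches its recorded hash.
    `wacz_path` may be a filesystem path or a binary file-like object.
    """
    results = {}
    with zipfile.ZipFile(wacz_path) as archive:
        package = json.loads(archive.read("datapackage.json"))
        for resource in package.get("resources", []):
            # Assumed hash format: "<algorithm>:<hex digest>", e.g. "sha256:ab12..."
            algorithm, _, expected = resource["hash"].partition(":")
            digest = hashlib.new(algorithm)
            with archive.open(resource["path"]) as member:
                for chunk in iter(lambda: member.read(1 << 20), b""):
                    digest.update(chunk)
            results[resource["path"]] = digest.hexdigest() == expected
    return results
```

Any resource reported as `False` has been altered, or was recorded incorrectly, since the package was assembled.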

 

A high level diagram of how the Starling Framework for Data Integrity is applied to photo assets.

The cryptographic signatures of the web archives used AuthSign, another Webrecorder tool, as well as Hala Systems’ and Starling’s SSL certificates. In short, the very certificates used by both organizations to secure their websites are also used in this instance to digitally sign the captured webpages. Notably, this demonstrates support for and oversight of the crawls, as both signing organizations needed to take active steps to enable these signatures. These steps aim to demonstrate that the archiving party “has a secure, private connection” to the signing server on the designated domain, as per the Webrecorder spec.

The field photographs of the schools were taken using Android smartphones equipped with ProofMode, a mobile application from the Guardian Project collecting surrounding metadata for a media capture, as well as notarizing the image hash value and timestamp through third-parties.

Signal Messenger was used to transfer ProofMode bundles from the field to Starling servers. Prior to transfer, we established trust between the Signal phone numbers involved.

From the point where Starling took custody of the field photographs, as well as from the moment the legal team finished assembling web pages they wished to preserve, the bundles of data assembled went through the Starling Integrity Pipeline, a proposed implementation and assembly of tools aiming to standardize the way data is organized and preserved, as well as to trigger cryptographic registrations on the Avalanche and LikeCoin blockchains, which are both publicly inspectable ledgers.

The registration of web archives and field photographs using the Numbers Protocol and International Standard Content Number (ISCN) specification.

Store

Once integrity records were secured on public ledgers, data bundles were prepared for long term preservation. These bundles containing the original assets were encrypted at rest on Starling’s storage pools.

Non-critical metadata about the assets, limited to the hash values of the media and in accordance with the legal team’s investigative plan, were made available on the IPFS public storage network. 

The preservation of the data bundles is backed by best-in-class digital storage. The core of the infrastructure resides in a data center belonging to OVH, one of the world’s largest cloud hosting providers. The facility is protected by a security badge control system, video surveillance, and on-site security personnel 24/7, and the data is further protected by encryption at rest.

Encrypted copies of the files (field photographs, web archives) were also preserved on two decentralized storage providers: Filecoin and Storj. This multi-tiered approach ensures redundancy, data integrity, and compliance with the investigative requirements, while also providing a secure, immutable record of the assets that can be independently verified if needed. While Filecoin offers a robust, blockchain-based verification system for storage proofs, Storj focuses on high performance through its distributed network, providing faster access times and scalability, making each provider uniquely suited to different aspects of secure, decentralized data storage.

 

Verify

A key element of the demonstration of non-tampering of the files is the aforementioned distribution of integrity data on third-party ledgers, thus permitting verification of hash values and signatures at a later stage. In this instance, such non-critical metadata was registered on the Avalanche blockchain through Numbers Protocol, and on the LikeCoin blockchain following the International Standard Content Number (ISCN) specification. While Avalanche’s Numbers blockchain offers fast transaction times and a flexible framework for integrating integrity proofs, LikeCoin’s ISCN-native blockchain provides a specialized content registry tailored for media assets, making it ideal for precise tracking and verification of digital works.

ISCN integrity record of a web archive on LikeCoin viewed through its media asset explorer.
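Public ledgers make later tampering detectable largely because each record commits to the one before it. The following is a deliberately simplified, local illustration of that principle: a hash-chained log. It is not the Numbers Protocol or ISCN API, it involves no network or consensus, and all field and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(previous_hash: str, payload: dict) -> str:
    """Hash an entry's payload together with the previous entry's hash."""
    body = json.dumps({"prev": previous_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain: list, asset_sha256: str, registered_at: str) -> dict:
    """Append a registration record that commits to the previous one."""
    previous_hash = chain[-1]["entry_hash"] if chain else GENESIS
    payload = {"asset_sha256": asset_sha256, "registered_at": registered_at}
    entry = {
        "prev": previous_hash,
        "payload": payload,
        "entry_hash": _entry_hash(previous_hash, payload),
    }
    chain.append(entry)
    return entry


def chain_is_intact(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    previous_hash = GENESIS
    for entry in chain:
        if entry["prev"] != previous_hash:
            return False
        if entry["entry_hash"] != _entry_hash(entry["prev"], entry["payload"]):
            return False
        previous_hash = entry["entry_hash"]
    return True
```

On a real blockchain the same idea is reinforced by many independent parties holding copies of the chain, which is what removes the need to trust a single record keeper.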

The investigative team manually marked relationships between the initial Telegram posts and their subsequent verifications. The resulting “web” of relations can be leveraged and displayed by asset-management software downstream (e.g. Uwazi, a flexible AMS for human rights defenders).
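Such a “web” of relations can be represented as a small labeled graph. The sketch below is one possible in-memory representation, not the data model of Uwazi or of the submissions; the identifiers and relation names are hypothetical.

```python
from collections import defaultdict


class EvidenceGraph:
    """Directed, labeled links between evidence items, keyed by item ID."""

    def __init__(self):
        self._edges = defaultdict(set)

    def relate(self, source_id: str, relation: str, target_id: str) -> None:
        """Record that `source_id` stands in `relation` to `target_id`."""
        self._edges[source_id].add((relation, target_id))

    def related(self, source_id: str, relation: str = None) -> list:
        """Return sorted target IDs, optionally filtered by relation name."""
        return sorted(
            target
            for rel, target in self._edges.get(source_id, set())
            if relation is None or rel == relation
        )
```

For example, `graph.relate("telegram-post-001", "verified_by", "geolocation-note-001")` links a post to its verification, and `graph.related("telegram-post-001", "verified_by")` retrieves every verification attached to that post.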

For the purpose of socializing some of these concepts to the Office of the Prosecutor, the submission included spreadsheets mapping all included digital items with their registration and timestamp data, as well as links to the various ledgers on which the data can be confirmed.
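A spreadsheet of this kind can be generated directly from the registration records. The sketch below writes a CSV manifest; the column names are assumptions made for illustration and do not reproduce the layout of the spreadsheets actually submitted.

```python
import csv

# Hypothetical column layout for the manifest.
FIELDS = ["item_id", "sha256", "registered_at", "ledger", "explorer_url"]


def write_manifest(rows, stream) -> None:
    """Write one CSV row per digital item to an open text stream.

    Each row is a dict keyed by FIELDS; a missing key becomes an empty
    cell, and an unexpected key raises ValueError (csv.DictWriter default).
    """
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

Keeping the hash, the timestamp, and the explorer link on the same row lets a reviewer move from a file to its ledger record without specialized software.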

Finally, the submission included a bespoke online map of the schools visited by the in-country team. Satellite images of the schools were overlaid with the location of the photographers, as well as the vantage points of the photographs themselves, permitting viewers to explore the damage through different angles. Each display of the photographs also directs viewers to the integrity records of each image, permitting manual verification.
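One way to encode a photograph’s vantage point for such a map is a GeoJSON line running from the camera position in the direction the camera was facing. The sketch below assumes a spherical Earth and a known compass bearing for each photograph; it illustrates the idea and is not the code behind the submitted map. The property names and the default 30-meter line length are arbitrary choices.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def project(lat: float, lon: float, bearing_deg: float, distance_m: float):
    """Return (lat, lon) reached from a start point along a compass bearing."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(phi2), math.degrees(lam2)


def vantage_feature(photo_id, lat, lon, bearing_deg, sha256, reach_m=30.0):
    """Build a GeoJSON Feature: a short line from the camera toward its target."""
    end_lat, end_lon = project(lat, lon, bearing_deg, reach_m)
    return {
        "type": "Feature",
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {
            "type": "LineString",
            "coordinates": [[lon, lat], [end_lon, end_lat]],
        },
        "properties": {
            "photo_id": photo_id,
            "bearing_deg": bearing_deg,
            "sha256": sha256,  # lets the map link each image to its integrity record
        },
    }
```

Carrying the hash in the feature’s properties is what allows each photograph on the map to point back to its integrity record.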

Learnings

Articulation of the Framework Benefits in a Legal, Investigative Context

This joint project represented an important milestone for the Lab, insofar as it involved presenting the methodology and the practices the Lab had been working on to an external, legal audience. Abstracting from the seriousness of the allegations contained in the submissions, as well as from the difficult context of a conflict causing suffering for civilians, the team felt an obligation to present its work to the Court as tersely and intelligibly as possible – keenly aware of the novelties it contained.

The Starling Framework had been applied in the past in the shaping of prototypes and projects related to pieces of journalism, e.g. election coverage in Hong Kong or reporting on deforestation in Brazil. This project, however, was the first articulation of the methods, merits and limitations of the Framework in an international law context. It was the legal team’s first contact with some of the specific implementations as well. 

We were pleased to be able to rely on the technical literacy brought by the publication of the Berkeley Protocol for Open Source Investigations. Published in December 2020, this seminal collaboration between Berkeley’s Human Rights Center and the UN Human Rights Office set the stage for investigations relying on openly-accessible material collected online – more commonly referred to as OSINT. The Protocol notably formulates technical high-level recommendations related to hashing, signing, and storage. We built on the Protocol’s guidelines to develop specific technical designs that implement and fulfill these standards.

Similarly, the conversation between the two parts of the team was key to identifying how the methodology and the Framework fit in a legal context. The vehicle for this project, a submission under Article 15, was extremely permissive – it is in fact a free-form communication to the Office of the Prosecutor, who then reviews and assesses on merit. This freedom led to questions regarding the merit of the efforts and innovations deployed in this instance. We found it helpful to reframe the methodology from its original claims of solving problems in the practice of international law, and towards presenting opportunities to produce digital evidence that is of a stronger kind, and possibly more fit to stand forensic scrutiny over a long period of time.

 

Articulation of the Framework Benefits in a Legal, Investigative Context

Further to the section above, project members sometimes had issues finding semantic common ground.

The Lab focuses on three practice areas (Journalism, History, and Law) and builds upon a software engineering background. Consequently, much of the terminology used in our methodology aims to speak to a broad audience. But sometimes, professions see specific words and phrases as terms of art (i.e., holding a narrow meaning within their field).

Take “Authentic”, for example. To a legal audience, “authentic” is a quality of a piece of evidence which was demonstrated as not being a forgery, i.e. genuine. While English dictionaries similarly converge towards this definition (Merriam-Webster: “Conforming to an original so as to reproduce essential features. Not false or imitation.” Collins: “Of undisputed origin or authorship. Reliable or accurate”), it is common vernacular to use “authentic” interchangeably with “genuine”, “truthful”, “real”, etc. Meanwhile, psychologists have tried to define different types of authenticity, including historical, categorical, and values.

What is “Verified”? “Verification” is narrowly-defined in law as a specific process by which a person, under oath or before a notary public, declares that a statement is true. The term is used more broadly in everyday life. One notable example is the choice of this terminology by the Content Authenticity Initiative, a consortium of various companies with stakes in visual media, who call one of their tools Verify. This tool permits the inspection of a digital item in view of surfacing provenance information it might contain.

 

A Need for Cheap, Accessible, Open Tools

In several instances, the issue of accessibility of the tools involved in the project arose. How difficult is it to reproduce the Framework’s processes? Is preserving and storing each item this way prohibitively expensive? Does it require a dedicated engineering team, or advanced technical knowledge to operate?

The underlying question was the opportunity for all parties to potentially benefit from the tools and the methodology. Any choice of tool or technique which would be too complex or expensive to run, or too bespoke for the project, would face hurdles related to fair trial and equality of arms.

The SITU digital evidence platform, developed for the ICC’s Office of the Prosecutor (OTP), is one example. Originally designed for the Al Mahdi case, the platform featured a detailed digital reconstruction of Timbuktu, allowing judges, defense teams, and prosecutors to visually explore and contextualize evidence within a 3D environment. While the platform was effective, critics have pointed out that the high complexity of such systems and the resources they require can be a barrier, particularly for smaller defense teams lacking equivalent technical capacity (see blog from Jonathan Hak KC, for example).

Similarly, OTPLink, another ICC platform, aimed to streamline evidence submission but faced significant accessibility barriers for victims and witnesses, such as language issues and internet access limitations. Critics argue that such platforms, despite their utility, may unintentionally create an imbalance by favoring those with greater resources.

One additional concern when preserving digital evidence is the cost of storage, especially when dealing with large-scale or long-term needs. Decentralized storage options like IPFS, Filecoin, and Storj offer a different approach to traditional cloud storage providers, using distributed networks that not only reduce reliance on a single provider but also help keep costs down. For instance, Filecoin’s pay-as-you-go model can be much more budget-friendly for storing big datasets over time, while Storj uses a global network of nodes to cut costs by spreading the load. In addition, because these networks are decentralized, they’re less likely to be affected by sudden price hikes or monopolistic control, making them a fairer option for smaller teams that need to manage tight budgets without sacrificing security or data integrity.

 
