Data Counterfactuals
What changes when the data world, evaluation world, or trust institution changes?
This is a short memo meant to explain the datacounterfactuals.org project and website. It will be cross-posted on the Data Leverage Substack. The site exists to:
- show how cross-cutting the idea of "data counterfactuals" is (to name just a few relevant topics: data valuation, data dividends, algorithmic collective action, data scaling, data selection, data poisoning, evaluation data, privacy, machine unlearning)
- make data counterfactual measurements easier to understand
- illustrate connections between technical and social data-centric work
There are many reasons we might want to understand how a specific piece of data impacts an AI system. Perhaps we want to inspect particularly valuable training data, reason about data dividends or other ways of paying people based on the impact of their data (though this is a tricky endeavor!), check data for errors, or decide what should be reserved for evaluation. Or perhaps a group of people wants to withhold data for bargaining or protest. Counterfactual questions about how data might change are foundational to many pressing issues about AI's impact on power concentration, knowledge work, information flow, and more. Framing these questions in terms of data counterfactuals is therefore both practically and academically useful.
What is a data counterfactual?
A data counterfactual is a scenario in which the data world around an AI system changes in some way. In the first version of this site, that mostly meant changes to training data. The next layer is broader: the evaluation set, permitted data uses, and the institutions that make a measurement trustworthy can also change. Often, we are interested in comparing two counterfactual scenarios to understand the impact of some change on AI capabilities, measurement, or confidence.
Consider this thought experiment: imagine you are going to train a machine learning model on a very small dataset of just four units of data (or, if that seems implausible, imagine a large dataset split into four distinct bundles). Now imagine a grid where every possible combination of training objects appears as a row, every possible evaluation set appears as a column, and each cell records the performance for a given train/eval pairing. We can call the four data objects A, B, C, and D; these could literally be four single observations in a toy example, or four large datasets we are considering mixing.
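The grid itself is easy to materialize at this scale. A minimal sketch, where `performance` is a hypothetical stand-in for a real train-and-evaluate loop (the names and toy scoring rule below are illustrative, not from the project):

```python
from itertools import combinations

objects = ["A", "B", "C", "D"]

def all_subsets(items):
    """Every possible subset (including the empty set), as tuples."""
    return [s for r in range(len(items) + 1) for s in combinations(items, r)]

def performance(train_set, eval_set):
    """Placeholder scoring function; a real grid would train a model
    on train_set and measure it on eval_set."""
    # Toy rule: score 1 for every evaluation object that also
    # appeared in training.
    return sum(1 for z in eval_set if z in train_set)

# The grid: rows are training subsets, columns are evaluation subsets.
grid = {
    (train, ev): performance(train, ev)
    for train in all_subsets(objects)
    for ev in all_subsets(objects)
}

print(len(grid))  # 16 rows x 16 columns = 256 cells
```

Even this toy grid makes the combinatorics vivid: four data objects already yield 16 possible training rows and 16 possible evaluation columns.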
With this grid in mind, we can explore the most basic useful data counterfactual: "leave-one-out." By comparing a row that includes a given point with the otherwise-identical row in which that point is missing, we can measure the impact (in a causal sense) of adding or removing it: the difference between the two cells tells us how much that data point helped or hurt our model. The same logic extends to groups of points, weighting data points, replacing data with synthetic data, corrupting certain examples, or coordinated withdrawal.
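In code, leave-one-out is just a difference between two cells. A minimal sketch, again using a hypothetical `performance` stand-in for a real train-and-evaluate loop:

```python
def performance(train_set, eval_set):
    # Stand-in: score 1 for each evaluation object seen in training.
    # A real version would train a model and measure it.
    return sum(1 for z in eval_set if z in train_set)

def leave_one_out(point, train_set, eval_set):
    """Impact of `point`: performance with it minus performance without it."""
    with_point = performance(train_set, eval_set)
    without_point = performance(
        tuple(z for z in train_set if z != point), eval_set
    )
    return with_point - without_point

full_data = ("A", "B", "C", "D")
eval_set = ("A", "B")
print(leave_one_out("A", full_data, eval_set))  # positive: A helped here
print(leave_one_out("C", full_data, eval_set))  # zero: C made no difference
```

A positive difference means the point helped on that evaluation set; a negative one means it hurt.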
Very simply, we can imagine training an LLM with a bunch of fiction books, science articles, and social media posts. If we train a second LLM without the science articles and compare the performance, we are exploring the "no science articles" data counterfactual. Researchers have indeed performed such experiments, for instance at non-profit institutions like AI2 and for-profit companies like Meta.
Leave-one-out toy example
Training, evaluation, and trust counterfactuals
The grid also makes a second move visible. Most of the familiar examples are row moves:
$$ f(D_T, D_E) \rightarrow f(D_T \setminus z, D_E) $$
Here the evaluation target stays fixed while the training world changes. But we can also ask column moves:
$$ f(D_T, D_E) \rightarrow f(D_T, D_E \cup z) $$
Here the trained model stays fixed while the evaluation world changes. This is an evaluation counterfactual: the data object changes what we measure, which claims we trust, or which deployment decision we make.
And some questions are really institution moves:
$$ f(D_T, D_E, G) \rightarrow f(D_T, D_E, G') $$
Here $G$ stands for governance or trust state: provenance, licensing, evaluator independence, contamination controls, labeling process, secrecy, and other facts that decide whether a train/eval comparison should count. This does not replace the original training-data frame. It adds a second layer: the first version of the site focuses on changes to training data; the next layer asks what changes when the evaluation set, holdout institution, or permitted data use changes.
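The three move types can be made concrete as transformations of a single state. A sketch, where `World` and its field names are hypothetical illustrations of the $(D_T, D_E, G)$ triple rather than anything from the site:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class World:
    """One cell of the extended grid: a train/eval pairing plus a
    governance/trust state G."""
    train: frozenset        # D_T: the training world
    evals: frozenset        # D_E: the evaluation world
    governance: str         # G: e.g. who controls the holdout

base = World(frozenset({"A", "B", "C"}), frozenset({"D"}), "independent holdout")

# Row move: the training world changes, the evaluation target stays fixed.
row_move = replace(base, train=base.train - {"B"})

# Column move: the trained model stays fixed, the evaluation world changes.
col_move = replace(base, evals=base.evals | {"B"})

# Institution move: the data stays fixed, the trust state changes.
inst_move = replace(base, governance="self-reported")

print(row_move.train, col_move.evals, inst_move.governance)
```

Framing all three as edits to one state makes it easy to see that an apparently simple comparison can couple moves, for example withdrawing a point from training while it simultaneously enters an evaluation set.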
Why data counterfactuals are relevant to data leverage
This frame helps us connect topics that might seem distinct, for instance connecting influence estimation and Shapley values with data strikes and data contribution campaigns. In ML, we often want to ask questions about removing a point, reweighting a group, fitting a scaling curve, and so on, in order to better understand our data and model. But counterfactuals can also be induced by strategic actors. Strikes, boycotts, contribution campaigns, and bargaining efforts all try to impact AI through data.
When people can withhold, redirect, or condition the supply of data, data counterfactual measurement maps directly to governance power! In other words, the experiments we'd want to run as ML researchers trying to improve a model (via data selection or other data-centric approaches) are the same experiments we'd want to run to organize data-related collective action, design data dividend schemes, or set up an efficient data market. A shared bank of results from such experiments would be useful to actors with a wide variety of interests and goals! This frame also makes clear how questions about provenance, licensing, contribution governance, and evaluation use rights determine which training rows and evaluation columns are legally, socially, or politically available in the first place.
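The grid also supports richer counterfactual summaries than leave-one-out. A data point's Shapley value, for example, averages its marginal contribution over every ordering of the dataset. A brute-force sketch for the four-object toy example, again with a hypothetical `performance` stand-in (exact enumeration is only feasible at this tiny scale; real systems rely on sampling or approximation):

```python
from itertools import permutations

def performance(train_set, eval_set):
    # Stand-in: score 1 for each evaluation object seen in training.
    return sum(1 for z in eval_set if z in train_set)

def shapley_value(point, objects, eval_set):
    """Average marginal contribution of `point` over all orderings."""
    total, count = 0.0, 0
    for order in permutations(objects):
        idx = order.index(point)
        before = frozenset(order[:idx])
        total += performance(before | {point}, eval_set) - performance(before, eval_set)
        count += 1
    return total / count

objects = ("A", "B", "C", "D")
print(shapley_value("A", objects, ("A",)))  # 1.0 under this toy scoring rule
print(shapley_value("B", objects, ("A",)))  # 0.0: B never moves this eval score
```

The same quantity that a data-market designer might use to price contributions is exactly what a collective-action organizer would want to know before coordinating a withdrawal.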
- Read the launch memo
More on the motivation for this site
- When the column changes
A short memo on evaluation data, secure holdouts, use rights, and trust as first-class counterfactual objects.
- Open the glossary
Quick definitions for the recurring terms without leaving the main thread for long.
- Open the grid explorer
Explore row moves, column moves, and coupled train/eval comparisons directly.
- Follow the lightweight course path
A suggested reading-and-explorable sequence for using the site as open course material without a heavy LMS.
- Compare formalisms
A more technical companion that lines up neighboring formalisms side by side. Currently very WIP.
- Related work
A dynamic, non-exhaustive set of related areas and readings, updated via Semble.so.