This collection gathers foundational and recent work on **data leverage**—the strategic use of data withholding, contribution, or manipulation as a form of collective action.
## Core Concepts
- **Data Strikes**: Coordinated refusal to contribute data to platforms
- **Data Poisoning**: Intentionally corrupting training data to degrade model performance
- **Conscious Data Contribution**: Strategically directing data to preferred systems
- **Data Valuation**: Methods for quantifying individual data contributions, such as Shapley values
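To make the data valuation concept concrete, here is a minimal sketch of Shapley-style valuation on a toy example. It computes exact Shapley values by enumerating every ordering of contributors, which is only feasible for a handful of contributors. It is not the approximation procedure used in any paper listed below. The contributor names and the `toy_utility` function are hypothetical stand-ins for "model performance when trained on this subset of data".

```python
from itertools import permutations
from math import factorial


def shapley_values(contributors, utility):
    """Exact Shapley values: average each contributor's marginal gain over all orderings."""
    totals = {c: 0.0 for c in contributors}
    for order in permutations(contributors):
        coalition = frozenset()
        previous = utility(coalition)
        for c in order:
            coalition = coalition | {c}
            current = utility(coalition)
            totals[c] += current - previous  # marginal gain from adding c
            previous = current
    n_orders = factorial(len(contributors))
    return {c: total / n_orders for c, total in totals.items()}


def toy_utility(coalition):
    """Hypothetical utility: alice and bob hold redundant data, carol's is unique."""
    score = 0.0
    if "alice" in coalition or "bob" in coalition:
        score += 0.6  # either one alone is enough to earn this part
    if "carol" in coalition:
        score += 0.2
    return score


values = shapley_values(["alice", "bob", "carol"], toy_utility)
for name, value in values.items():
    print(f"{name}: {value:.3f}")
```

In this toy setup, alice and bob split the credit for their redundant data (about 0.3 each) while carol keeps about 0.2. It also hints at why redundancy matters for leverage: if alice withholds her data alone, the toy utility is unchanged because bob's data covers the same ground.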
## Why This Matters
As AI systems become more dependent on user-generated content and behavioral data, data creators gain potential leverage over technology companies. This research explores when and how such leverage can be effectively exercised.
-
Algorithmic Collective Action with Two Collectives
Aditya Karan, Nicholas Vincent, Karrie Karahalios, Hari Sundaram
ACM FAccT (2025)
-
The Economics of AI Training Data: A Research Agenda
Hamidah Oderinwale, Anna Kazlauskas
arXiv preprint (2025)
-
Collective Bargaining in the Information Economy Can Address AI-Driven Power Concentration
Nicholas Vincent, Matthew Prewitt, Hanlin Li
NeurIPS Position Papers (2025)
-
Push and Pull: A Framework for Measuring Attentional Agency on Digital Platforms
Zachary Wojtowicz, Shrey Jain, Nicholas Vincent
ACM FAccT (2025)
-
Poisoning Web-Scale Training Datasets is Practical
Nicholas Carlini, Matthew Jagielski, Christopher A. Choquette-Choo, Daniel Paleka, Will Pearce, Hyrum Anderson, Andreas Terzis, Kurt Thomas, Florian Tramèr
2024
-
Large language models reduce public knowledge sharing on online Q&A platforms
R. Maria del Rio-Chanona, Nadzeya Laurentsyeva, Johannes Wachs
PNAS Nexus (2024)
-
Algorithmic Collective Action in Machine Learning
Moritz Hardt, Eric Mazumdar, Celestine Mendler-Dünner, Tijana Zrnic
International Conference on Machine Learning (ICML) (2023)
-
The Dimensions of Data Labor: A Road Map for Researchers, Activists, and Policymakers to Empower Data Producers
Hanlin Li, Nicholas Vincent, Stevie Chancellor, Brent Hecht
Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (2023)
-
Behavioral Use Licensing for Responsible AI
Danish Contractor, Daniel McDuff, Julia Katherine Haines, Jenny Lee, Christopher Hines, Brent Hecht, Nicholas Vincent, Hanlin Li
ACM FAccT (2022)
-
Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses
Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein
IEEE Transactions on Pattern Analysis and Machine Intelligence (2022)
-
Addressing Documentation Debt in Machine Learning Research: A Retrospective Datasheet for BookCorpus
Jack Bandy, Nicholas Vincent
NeurIPS Datasets and Benchmarks (2021)
-
Machine Unlearning
Lucas Bourtoule, Varun Chandrasekaran, Christopher A. Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, Nicolas Papernot
IEEE Symposium on Security and Privacy (S&P) (2021)
-
Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom B. Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, Nicolas Papernot
Proceedings of USENIX Security Symposium (2021)
-
Can "Conscious Data Contribution" Help Users to Exert "Data Leverage" Against Technology Companies?
Nicholas Vincent, Brent Hecht
Proceedings of the ACM on Human-Computer Interaction (2021)
-
Data Leverage: A Framework for Empowering the Public in its Relationship with Technology Companies
Nicholas Vincent, Hanlin Li, Nicole Tilly, Stevie Chancellor, Brent Hecht
Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (2021)
-
Data Shapley: Equitable Valuation of Data for Machine Learning
Amirata Ghorbani, James Zou
International Conference on Machine Learning (2019)
-
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain
Tianyu Gu, Brendan Dolan-Gavitt, Siddharth Garg
IEEE Access (2019)
-
How Do People Change Their Technology Use in Protest?: Understanding Protest Users
Hanlin Li, Nicholas Vincent, Janice Tsai, Jofish Kaye, Brent Hecht
ACM CSCW (2019)
-
"Data Strikes": Evaluating the Effectiveness of a New Form of Collective Action Against Technology Companies
Nicholas Vincent, Brent Hecht, Shilad Sen
The World Wide Web Conference (WWW) (2019)
-
Examining Wikipedia With a Broader Lens: Quantifying the Value of Wikipedia's Relationships with Other Large-Scale Online Communities
Nicholas Vincent, Isaac Johnson, Brent Hecht
ACM CHI (2018)