Legal Playbook For Natural Language Processing Researchers


Purpose of this Playbook

This playbook is a legal research resource covering data gathering, data governance, and the disposition of an AI model made available as a public resource. It aims to benefit academic and government researchers, including those in New York State, who wish to understand how best to use AI models to provide natural language processing (“NLP”) as public infrastructure but who do not have legal resources. The playbook also aims to be a general informational resource for public organizations, including cross-national organizations focused on non-commercial open science in NLP and on promoting the human right to equal access to scientific advancement under UDHR Art. 27.

With this playbook, we strive to assist researchers who have fewer resources in guiding their communities and their research, including low-income communities that may not have access to legal resources. In particular, this playbook is cross-jurisdictional, and we hope it will be relevant to NLP and data researchers in underserved language communities whose data will be processed (e.g., minority dual-language speakers) and to those who wish to participate and have a stake in AI.

This playbook was drafted as part of the year-long BigScience workshop and is made available under the CC-BY license. It is not legal advice and is intended only as research on the legal landscape at or around the “Last Updated” date.


This playbook is organized by jurisdiction. For each jurisdiction, we provide an overview of the legal system, followed by legal research on common questions that NLP researchers may have relating to intellectual property, licensing, privacy, text data mining, and prohibited or restricted content and technologies.

Authorship & Contributions

The authors of the various playbooks and their executive summaries are named in their sections. The following are editors, reviewers and other contributors to this playbook: Huu Nguyen, Jen Dumas, Somaieh Nikpoor, Tara Davis, Danda Zhao, David Lansky, Stella Biderman, Jessie Dodge, and Yacine Jarnite.