BigScience Episode #5 – Challenges & Perspectives in Creating Large Language Models

ACL 2022 Workshop – May 27th, 2022


Two years after the appearance of GPT-3, large language models seem to have taken over NLP. Their capabilities, limitations, societal impact and the potential new applications they unlocked have been discussed and debated at length. A handful of replication studies have been published since then, confirming some of the initial findings and discovering new limitations. This workshop aims to gather researchers and practitioners involved in the creation of these models in order to:

1. Share ideas on the next directions of research in this field, including – but not limited to – grounding, multi-modal models, continuous updates and reasoning capabilities.
2. Share best-practices, brainstorm solutions to identified limitations and discuss challenges, such as:

Infrastructure. What are the infrastructure and software challenges involved in scaling models to billions or trillions of parameters, and in deploying training and inference across distributed servers when each model replica is itself larger than a single node's capacity?
Data. While the self-supervised setting dispenses with human annotation, the importance of cleaning and filtering, as well as the biases and limitations of existing or reported corpora, has become increasingly apparent in recent years.
Ethical & Legal frameworks. What type of data can or should be used, what type of access should be provided, and what filters are or should be required?
Evaluation. Investigating the diversity of intrinsic and extrinsic evaluation measures, how they correlate, and how the performance of a very large pretrained language model should be evaluated.
Training efficiency. Discussing practical scaling approaches, practical questions around large-scale training hyper-parameters and early-stopping conditions, and measures to reduce the associated energy consumption.

This workshop is organized by the BigScience initiative and will also serve as the closing session of this year-long initiative aimed at developing a multilingual large language model, which has gathered 1,000+ researchers from more than 60 countries and 250 institutions and research labs. Its goal is to investigate the creation of a large-scale dataset and model from a very wide diversity of angles.

Call for Papers

We call for relevant contributions in either long (8 pages) or short (4 pages) format. Accepted papers will be presented during a poster session. Submissions can be archival or non-archival. Submission opens on February 1st, 2022, and should be made via OpenReview through ACL Rolling Review (ARR). For more information about templates, guidelines, and instructions, see the ARR CFP guidelines.


March 7, 2022: Extended Submission Deadline
March 30, 2022: Notification of Acceptance
April 10, 2022: Camera-ready papers due

Program – all times are local (Dublin) time

Panels (pre-recorded)
Scaling: Myle Ott, Connor Leahy, Shruti Bhosale
Moderator: Matthias Gallé

Ethical & Legal considerations: Emily M. Bender, Zeerak Waseem, Laura Weidinger
Moderator: Margaret Mitchell

Large-scale collaborations: Guy Gur-Ari (BIG-bench), Samira Shaikh (GEM), Stella Biderman (EleutherAI), Rosanne Liu (ML Collective), Jade Abbott (Masakhane)
Moderator: Suzana Ilić

10:45am – Coffee break (15min)

Posters (virtual)
11:00am – Virtual poster session
12:30pm – Lunch break

BigScience Talks (hybrid)
2:00pm – BigScience: Thomas Wolf
3:00pm – Data Governance: Huu Nguyen, Margaret Mitchell, Yacine Jernite
3:20pm – Data: Angie McMillan-Major, Pedro Ortiz
3:40pm – Modeling: Iz Beltagy, Julien Launay
4:00pm – Prompt Engineering: Victor Sanh, Stephen Bach, Albert Webson, Colin Raffel
4:20pm – Evaluation: Verena Rieser, Marine Carpuat, Ellie Pavlick, Sebastian Gehrmann, Thomas Scialom, Dan Garrette, Aurélie Névéol


Steering Committee
Yoav Goldberg, Bar Ilan University / Allen Institute for Artificial Intelligence
Percy Liang, Stanford
Margaret Mitchell, HuggingFace / Ethical AI LLC
Alice Oh, KAIST
Alexander Rush, Cornell

Program Committee
Angela Fan, Meta
Matthias Gallé, Naver Labs
Suzana Ilić, HuggingFace
Thomas Wolf, HuggingFace

