Philosophy Fellowship 2023

Applications for the 2023 fellowship are now closed. Thanks to everyone who applied.

The Program

As AI capabilities continue to improve dramatically, the need for safety research has become increasingly apparent. But given the relative youth of the field, much of the conceptual groundwork has yet to be done.

The CAIS Philosophy Fellowship invites philosophers from a variety of backgrounds to acquire an in-depth understanding of the current state of AI safety and contribute to novel and field-orienting research directions.

Philosophy Fellowship Motivation

How the work of philosophers contributes to the broader sociotechnical AI safety community.

1. Conceptual Problem

Identify a lack of conceptual clarity in the existing AI safety literature.

2. Conceptual Clarification

Dissect the problem using rigorous conceptual analysis and relevant philosophical literature.

3. Sociotechnical Orientation

Publish conceptual research to inform sociotechnical strategy.

1. Conceptual Problem

Advanced AI creates unique conceptual difficulties.

Artificial intelligence is reshaping many aspects of day-to-day life. As AI systems continue on a trajectory to outperform humans on a wide range of cognitive tasks, questions about their properties and potential harms grow increasingly urgent.

Conceptual Examples:

How can we build systems that are more likely to behave ethically in the face of a rapidly changing world?
What processes might shape the behavior of advanced AI systems?
Could advanced AI systems pose an existential risk, and if so, how?

2. Conceptual Clarification

Academic philosophers are particularly well-positioned to address these conceptual difficulties.

Philosophers are experts at thinking hard about abstract conceptual problems with no clear answers. Their expertise in working with imprecise concepts makes them the ideal candidates to address the conceptual issues that are characteristic of AI safety.

3. Sociotechnical Orientation

Conceptual clarity orients the broader sociotechnical landscape.

Having the frameworks to analyze which concerns are the most urgent, which are the most likely candidates for serious harm, and how to navigate these risks enables researchers and key decision-makers to reassess their strategies.

Goals & Outcomes:

This fellowship addresses the need for conceptual clarification through research and field-building efforts.

Research:

Our team of philosophers critiques and builds on the existing conceptual AI safety literature, producing new conceptual frameworks to guide technical research.

Thus far, our fellows have collectively produced eighteen original papers, soon to be published, on topics including interpretability, corrigibility, and multipolar scenarios.

Field-building:

We aim for the influence of this fellowship to extend beyond our current cohort, promoting and incentivizing conceptual AI safety research within the broader academic philosophy community.

To date, our fellows have received $50,000 in funding to run a workshop connecting technical and conceptual AI safety researchers, organized numerous workshops, and launched a special issue of Philosophical Studies on AI safety.

2023 Fellowship:

Simon Goldstein

Simon Goldstein is an Associate Professor in philosophy at Australian Catholic University.

Jacqueline Harding

Jacqueline Harding is a PhD Student in Symbolic Systems at Stanford University.

Cameron Kirk-Giannini

Cameron Domenico Kirk-Giannini is an assistant professor of philosophy at Rutgers University–Newark.

Nicholas Laskowski

Nick Laskowski is an Assistant Professor in the Philosophy Department at University of Maryland, College Park.

Nathaniel Sharadin

Nate Sharadin is an Assistant Professor of Philosophy at the University of Hong Kong.

Dmitri Gallow

Dmitri Gallow is a Senior Research Fellow at the Dianoia Institute of Philosophy at the Australian Catholic University.

Mitchell Barrington

Mitchell Barrington is a PhD student in Philosophy at the University of Michigan - Ann Arbor.

Harry Lloyd

Harry R. Lloyd is a PhD student in philosophy at Yale University.

Frank Hong

Frank Hong received his PhD in Philosophy from USC and is an incoming postdoc at the University of Hong Kong.

Bill D'Alessandro

William D’Alessandro is a Postdoctoral Fellow at the Munich Center for Mathematical Philosophy and will be a Marie Curie/UKRI Postdoctoral Fellow at the University of Oxford.

Elliott Thornley

Elliott Thornley is a Postdoctoral Research Fellow in Philosophy at the University of Oxford. He is working on coherence and corrigibility.

Robert Long

Robert Long recently completed his PhD in philosophy at New York University, during which he also worked as a Research Fellow at the Future of Humanity Institute.

2023 Guest Speakers:

Peter Railton

Gregory S. Kavka Distinguished Professor of Philosophy at the University of Michigan - Ann Arbor

Hilary Greaves

Professor of Philosophy at the University of Oxford, Former Director of the Global Priorities Institute

Shelly Kagan

Clark Professor of Philosophy at Yale University

Vincent Müller

Alexander von Humboldt Professor of Ethics and Philosophy of AI at the University of Erlangen-Nuremberg

L.A. Paul

Millstone Family Professor of Philosophy and Professor of Cognitive Science at Yale University

Victoria Krakovna

AI Research Scientist at DeepMind

Jacob Steinhardt

Assistant Professor of Computer Science and AI at UC Berkeley

David Krueger

Assistant Professor of Computer Science and AI at Cambridge University

Walter Sinnott-Armstrong

Chauncey Stillman Professor of Ethics at Duke University

Lara Buchak

Professor of Philosophy at Princeton University

Johann Frick

Associate Professor of Philosophy at the University of California, Berkeley

Wendell Wallach

Hastings Center senior advisor, ethicist, and scholar at Yale’s Center for Bioethics

Rohin Shah

Research Scientist at DeepMind

2023 Fellowship News

Stay up to date on the latest news and research from the CAIS Philosophy Fellowship. Sign up for email alerts and announcements of future programs.

October 24, 2023

Draft Published - Elliott Thornley: The Shutdown Problem: Three Theorems

October 7, 2023

Publication - Jacqueline Harding: Operationalising Representation in Natural Language Processing (British Journal for the Philosophy of Science)

August 28, 2023

Draft Published - Peter Park, Simon Goldstein, and CAIS contributors: AI Deception: A Survey of Examples, Risks, and Potential Solutions

August 9, 2023

Journal Article - Nathaniel Sharadin: Predicting and Preferring (Inquiry)

August 2, 2023

Journal Article - Simon Goldstein & Cameron Kirk-Giannini: Language Agents Reduce the Risk of Existential Catastrophe (AI & Society).

July 21, 2023

Workshop - 1st AI Impacts Workshop. This workshop, hosted by the AI & Humanity Lab at the University of Hong Kong, will focus on the topic of benchmarking for ML and AI systems. (March 14-15, 2024).

July 19, 2023

Draft Published - Mitchell Barrington: Absolutist AI.

July 14, 2023

Op-ed - Jacqueline Harding & Cameron Kirk-Giannini: AI's Future Worries Us. So Does Its Present (Boston Globe).

July 6, 2023

Op-ed - Nathaniel Sharadin: Hong Kong can be a leader in mitigating the dangers of AI (Hong Kong Free Press).

July 4, 2023

Blog Post - Simon Goldstein & Cameron Kirk-Giannini: A Case for AI Wellbeing (DailyNous).

June 30, 2023

Publication - Jacqueline Harding, William D'Alessandro, Nicholas Laskowski, & Robert Long: AI Language Models Cannot Replace Human Research Participants (AI & Society).

June 20, 2023

Call for Papers - Submissions for the Philosophical Studies special edition on AI Safety (edited by Cameron Kirk-Giannini and Dan Hendrycks) are due by November 1!

June 14, 2023

Draft Published - J. Dmitri Gallow: Instrumental Convergence?

June 13, 2023

Draft Published - Frank Hong: Group Prioritarianism: Why AI Should Not Replace Humanity.

June 9, 2023

Op-ed - Nathaniel Sharadin: Growing threat of AI misuse makes the need for effective, targeted regulation all the more urgent (South China Morning Post).

June 8, 2023

Op-ed - Nathaniel Sharadin: Most AI Research Shouldn't be Publicly Released (Bulletin of the Atomic Scientists).

June 6, 2023

Media - Nathaniel Sharadin on Bloomberg Radio London (from 20:00).

June 1, 2023

Media - Nathaniel Sharadin on BBC Radio (from 2:12:00).

May 31, 2023

Media - Simon Goldstein on SBS News.

May 31, 2023

Draft Published - Simon Goldstein: Shutdown-Seeking AI

May 27, 2023

Draft Published - William D'Alessandro: Is Deontological AI Safe?

May 23, 2023

Draft Published - Cameron Kirk-Giannini & Simon Goldstein: The Polarity Problem.

May 12, 2023

Draft Published - Simon Goldstein: Aggregating Utilities for Corrigible AI.

April 27, 2023

Op-ed - Simon Goldstein & Cameron Kirk-Giannini: Is it Ethical to Create Generative Agents? Is it Safe? (ABC News).

February 20, 2023

Draft Published - Elliott Thornley: There are No Coherence Theorems.
