AI safety is a challenge that is beyond any single research lab. We run a variety of initiatives to support and empower the existing research community while lowering barriers to entry and further expanding the community. Our efforts include providing infrastructure and resources for the AI safety research ecosystem, initiating multi-disciplinary projects to explore the societal effects of AI from new perspectives, and creating educational resources to encourage newcomers to join.
All of our currently active projects.
To support progress and innovation in AI safety, we offer researchers free access to our compute cluster, which can run and train large-scale AI systems.
The CAIS Philosophy Fellowship is a seven-month research program that investigates the societal implications and potential risks associated with advanced AI.
Future AI systems need to be able to detect morally ambiguous situations and act cautiously in them. The Moral Uncertainty Competition provides $100,000 in prizes to incentivize research towards machine learning models with the ability to detect substantial moral disagreement.
Lowering the barriers to entry in studying ML safety.
Intro to ML Safety is a comprehensive training program designed for individuals seeking additional support, community, and accountability while completing the ML safety course. Accepted participants receive access to peer discussion groups, mentorship, and a small stipend.
A $2,000 scholarship for undergraduate and master's students who secure ML Safety research mentorship.
An online course that offers a comprehensive introduction to ML safety.
A monthly newsletter detailing the latest advancements in ML safety.
All of CAIS's past projects.
The ML Safety Workshop at NeurIPS 2022 brought together researchers from various fields to discuss and advance ML safety.
Neural Trojans are a growing concern for the security of ML systems, but little is known about the fundamental offense-defense balance of Trojan detection. The Trojan Detection Competition at NeurIPS 2022 poses the question: How hard is it to detect hidden functionality that is trying to stay hidden?
Three best paper awards for research on model robustness to threats beyond small l_p perturbations, including perceptible attacks and unforeseen attacks whose specifications are not known beforehand.