Jack Hessel's Homepage
Research Scientist @ AI2
CV (as of 1/2023).
I am on GitHub, Twitter, Google Scholar, and Semantic Scholar.
I am a Research Scientist at AI2. Previously, I earned a PhD in Computer Science at Cornell University.
These days, I'm most interested in improving human-AI collaboration. This includes:
- developing methods to align model behavior with human intent;
- probing the limits of scaling to identify where models can most benefit from human intervention;
- and expanding models with new modalities (e.g., vision+language) for a more complete view of the world.
Concretely: I work at the union of natural language processing, computer vision, and machine learning. At AI2, I am on the Mosaic team, which focuses on building machines capable of commonsense reasoning. If you're looking for me, I look something like this (facial hair subject to change):
Some recent preprints
CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos
Seungju Han, Jack Hessel, Nouha Dziri, Yejin Choi, Youngjae Yu
Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images
Nitzan Bitton-Guetta, Yonatan Bitton, Jack Hessel, Ludwig Schmidt, Yuval Elovici, Gabriel Stanovsky, Roy Schwartz
WHOOPS Dataset available here
Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest
Jack Hessel, Ana Marasović, Jena D. Hwang, Lillian Lee, Jeff Da, Rowan Zellers, Robert Mankoff, Yejin Choi
The New Yorker Dataset is now available at capcon.dev
Publications (in reverse chronological order)
Multimodal Knowledge Alignment with Reinforcement Learning
Youngjae Yu, Jiwan Chung, Heeseung Yun, Jack Hessel, JaeSung Park, Ximing Lu, Prithviraj Ammanabrolu, Rowan Zellers, Ronan Le Bras, Gunhee Kim, Yejin Choi
Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy, Prithviraj Ammanabrolu, Kianté Brantley, Jack Hessel, Rafet Sifa, Christian Bauckhage, Hannaneh Hajishirzi, and Yejin Choi.
ICLR 2023 (spotlight); also appeared at InterNLP @ NeurIPS 2022.
Check out the RL4LMs library!
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu, Sean Welleck*, Jack Hessel*, Liwei Jiang, Lianhui Qin, Peter West, Prithviraj Ammanabrolu, Yejin Choi.
NeurIPS 2022 (oral)
code, models, etc.
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning.
Jack Hessel*, Jena D Hwang*, Jae Sung Park, Rowan Zellers, Chandra Bhagavatula, Anna Rohrbach, Kate Saenko,
and Yejin Choi.
ECCV 2022 (oral)
dataset/code/leaderboard; press: VentureBeat
Reframing Human-AI Collaboration for Generating Free-Text Explanations.
Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, and Yejin Choi.
Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer.
Yanpeng Zhao, Jack Hessel, Youngjae Yu, Ximing Lu, Rowan Zellers, and Yejin Choi.
Symbolic Knowledge Distillation: from General Language Models to Commonsense Models
Peter West, Chandra Bhagavatula, Jack Hessel, Jena D. Hwang, Liwei Jiang, Ronan Le Bras, Ximing Lu, Sean Welleck, and Yejin Choi
code, explainer video
MERLOT Reserve: Neural Script Knowledge through Sound, Language, and Vision
Rowan Zellers, Jiasen Lu, Ximing Lu, Youngjae Yu, Yanpeng Zhao,
Mohammadreza Salehi, Aditya Kusupati, Jack Hessel, Ali Farhadi, Yejin Choi
CVPR 2022 (oral)
code, project page, Press: VentureBeat, Press: GeekWire, with an interview from Rowan!
MERLOT: Multimodal Neural Script Knowledge Models
Rowan Zellers*, Ximing Lu*, Jack Hessel*, Youngjae Yu, Jae Sung Park, Jize Cao, Ali Farhadi, and Yejin Choi.
NeurIPS 2021 (oral)
code, project page, Press: The Batch, Press: VentureBeat
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, and Yejin Choi
How effective is BERT without word ordering? Implications for language understanding and data privacy.
Jack Hessel and Alexandra Schofield
Does my multimodal model learn cross-modal interactions? It’s harder to tell than you might think!
Jack Hessel and Lillian Lee
Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube
Jack Hessel, Zhenhai Zhu, Bo Pang, and Radu Soricut
Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents
Gregory Yauney, Jack Hessel, and David Mimno
Learning from Multimodal Web Data
Cornell University 2020
Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents
Jack Hessel, Lillian Lee, and David Mimno
A Case Study on Combining ASR and Visual Features for Generating Instructional Video Captions
Jack Hessel, Bo Pang, Zhenhai Zhu, and Radu Soricut
Something's Brewing! Early Prediction of Controversy-causing Posts from Discussion Features
Jack Hessel and Lillian Lee
Quantifying the Visual Concreteness of Words and Topics in Multimodal Datasets
Jack Hessel, David Mimno, and Lillian Lee
- Cats and Captions vs. Creators and the Clock: Comparing Multimodal Content to Context in Predicting Relative Popularity
Jack Hessel, Lillian Lee, David Mimno
- Aligning Images and Text in a Digital Library
Jack Hessel and David Mimno
Computer Vision in Digital Humanities Workshop at DH 2017 (extended abstract)
Science, AskScience, and BadScience: On the Coexistence of Highly Related Communities
Jack Hessel, Chenhao Tan, and Lillian Lee
- What do Vegans do in their Spare Time? Latent Interest Detection in Multi-Community Networks
Jack Hessel, Alexandra Schofield, Lillian Lee, David Mimno
Workshop on Networks in the Social and Information Sciences at NeurIPS 2015
Image Representations and New Domains in Neural Image Captioning
Jack Hessel, Nicolas Savva, and Kimberly J. Wilber
Workshop on Vision and Language at EMNLP 2015
- Using Reproductive Altruism to Evolve Multicellularity in Digital Organisms
Jack Hessel and Sherri Goings
My CV is more up-to-date, but I've been fortunate to speak at (roughly alphabetically): Adobe Research, Carleton College, Cornell, University of North Carolina (Chapel Hill), University of Pittsburgh, Rutgers University, Seoul National University, SRI International, University of Washington, and more!
Service/Guest Lectures/Other Activities
My CV is more up to date, but I have reviewed/ACed/etc. for many NLP/CV/ML venues since 2016 including ACL, EMNLP, NAACL, AACL, EACL, AAAI, CoNLL, ACL Rolling Review, ICML, NeurIPS, ICLR, JAIR, ICWSM and more!
Other fun projects
- I wrote a TreeLSTM in TensorFlow 2; this is a neural network whose computation-graph topology changes with each input example.
- I wrote a factorization machine layer in PyTorch; for speed reasons, the forward and backward passes are written in Cython.
- I implemented Monroe et al.'s "Fightin' Words" algorithm for robustly comparing word frequencies in two corpora. This implementation has been used in several publications, e.g., this and this.
- As part of an NSF REU, I contributed to the implementation of a really (really) fast SVM solver in Kilian Weinberger's lab at Washington University in St. Louis (the lab is now at Cornell!). See "Parallel Support Vector Machines in Practice" by Tyree, S., Gardner, J. R., Weinberger, K. Q., Agrawal, K., & Tran, J. (2014).
- "A Comparative Analysis of Popular Phylogenetic Reconstruction Algorithms." Undergraduate thesis project; best paper award at MICS 2014. Joint work with Evan Albright, Nao Hiranuma, Cody Wang, and Sherri Goings.
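For the curious: the "Fightin' Words" method mentioned above is Monroe et al.'s weighted log-odds with an informative Dirichlet prior. A minimal sketch of the core computation (function name, signature, and the `prior_scale` default are illustrative, not taken from my actual repository):

```python
# Sketch of Monroe et al.'s "Fightin' Words": weighted log-odds-ratio
# between two corpora, smoothed by an informative Dirichlet prior built
# from the pooled word counts.
from collections import Counter
import math

def fightin_words(corpus_a, corpus_b, prior_scale=10.0):
    """Return {word: z-score}; positive favors corpus_a, negative corpus_b.

    corpus_a / corpus_b are token lists. The prior is the pooled word
    distribution scaled by prior_scale.
    """
    counts_a, counts_b = Counter(corpus_a), Counter(corpus_b)
    pooled = counts_a + counts_b
    n_pooled = sum(pooled.values())
    # Informative Dirichlet prior: pooled relative frequencies * prior_scale.
    prior = {w: prior_scale * c / n_pooled for w, c in pooled.items()}
    a0 = sum(prior.values())
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())

    z_scores = {}
    for w in pooled:
        ya, yb, aw = counts_a[w], counts_b[w], prior[w]
        # Difference of smoothed log-odds of w in each corpus.
        delta = (math.log((ya + aw) / (n_a + a0 - ya - aw))
                 - math.log((yb + aw) / (n_b + a0 - yb - aw)))
        # Approximate variance of that difference.
        var = 1.0 / (ya + aw) + 1.0 / (yb + aw)
        z_scores[w] = delta / math.sqrt(var)
    return z_scores
```

The z-scores make words with large but low-frequency raw log-odds ratios less prominent, which is the main advantage over unsmoothed frequency comparisons.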
I grew up in beautiful Portola Valley, California. I earned a B.A. from Carleton College in 2014, studying computer science and mathematics/statistics. During my time in Northfield, I played ice hockey and hosted a radio show. I even returned to Carleton briefly in 2019, this time as a visiting faculty member. I'm a die-hard San Jose Sharks fan, an avid consumer (and very occasional producer) of electronic music, and an amateur lockpicker. During graduate school at Cornell, I was a member of Stewart Little Coop, a community of 15 people, and I played ice hockey in the Ithaca Hockey Association (and, during summer internships in CA, in the San Jose Adult Hockey League).