Jack Hessel's Homepage
Research Scientist @ AI2
CV (as of 1/2023).
I am on GitHub, Twitter, Google Scholar, and Semantic Scholar.
I am a Research Scientist at AI2. Previously, I earned a PhD in Computer Science at Cornell University.
These days, I'm most interested in improving human-AI collaboration. This includes:
- developing methods to align model behavior with human intent;
- probing the limits of scaling to identify where models can most benefit from human intervention;
- and expanding models with new modalities (e.g., vision+language) for a more complete view of the world.
Concretely: I work at the union of natural language processing, computer vision, and machine learning. At AI2, I am on the Mosaic team, which focuses on building machines capable of commonsense reasoning. If you're looking for me, I look something like this (facial hair subject to change):
Some recent preprints
CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos
Seungju Han, Jack Hessel, Nouha Dziri, Yejin Choi, Youngjae Yu
Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images
Nitzan Bitton-Guetta, Yonatan Bitton, Jack Hessel, Ludwig Schmidt, Yuval Elovici, Gabriel Stanovsky, Roy Schwartz
WHOOPS Dataset available here
Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest
Jack Hessel, Ana Marasović, Jena D. Hwang, Lillian Lee, Jeff Da, Rowan Zellers, Robert Mankoff, Yejin Choi
The New Yorker Dataset is now available at capcon.dev
Publications (in reverse chronological order)
Multimodal Knowledge Alignment with Reinforcement Learning
Youngjae Yu, Jiwan Chung, Heeseung Yun, Jack Hessel, JaeSung Park, Ximing Lu, Prithviraj Ammanabrolu, Rowan Zellers, Ronan Le Bras, Gunhee Kim, Yejin Choi
Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy, Prithviraj Ammanabrolu, Kianté Brantley, Jack Hessel, Rafet Sifa, Christian Bauckhage, Hannaneh Hajishirzi, and Yejin Choi.
ICLR 2023 (spotlight); also appeared at InterNLP @ NeurIPS 2022.
Check out the RL4LMs library!
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu, Sean Welleck*, Jack Hessel*, Liwei Jiang, Lianhui Qin, Peter West, Prithviraj Ammanabrolu, Yejin Choi.
NeurIPS 2022 (oral)
code, models, etc.
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning.
Jack Hessel*, Jena D Hwang*, Jae Sung Park, Rowan Zellers, Chandra Bhagavatula, Anna Rohrbach, Kate Saenko,
and Yejin Choi.
ECCV 2022 (oral)
dataset/code/leaderboard; press: VentureBeat
Reframing Human-AI Collaboration for Generating Free-Text Explanations.
Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, and Yejin Choi.
Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer.
Yanpeng Zhao, Jack Hessel, Youngjae Yu, Ximing Lu, Rowan Zellers, and Yejin Choi.
Symbolic Knowledge Distillation: from General Language Models to Commonsense Models
Peter West, Chandra Bhagavatula, Jack Hessel, Jena D. Hwang, Liwei Jiang, Ronan Le Bras, Ximing Lu, Sean Welleck, and Yejin Choi
code, explainer video
MERLOT Reserve: Neural Script Knowledge through Sound, Language, and Vision
Rowan Zellers, Jiasen Lu, Ximing Lu, Youngjae Yu, Yanpeng Zhao,
Mohammadreza Salehi, Aditya Kusupati, Jack Hessel, Ali Farhadi, Yejin Choi
CVPR 2022 (oral)
code, project page, Press: VentureBeat, Press: GeekWire, with an interview from Rowan!
MERLOT: Multimodal Neural Script Knowledge Models
Rowan Zellers*, Ximing Lu*, Jack Hessel*, Youngjae Yu, Jae Sung Park, Jize Cao, Ali Farhadi, and Yejin Choi.
NeurIPS 2021 (oral)
code, project page, Press: The Batch, Press: VentureBeat
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, and Yejin Choi
How effective is BERT without word ordering? Implications for language understanding and data privacy.
Jack Hessel and Alexandra Schofield
Does my multimodal model learn cross-modal interactions? It’s harder to tell than you might think!
Jack Hessel and Lillian Lee
Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube
Jack Hessel, Zhenhai Zhu, Bo Pang, and Radu Soricut
Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents
Gregory Yauney, Jack Hessel, and David Mimno
Learning from Multimodal Web Data
Cornell University 2020
Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents
Jack Hessel, Lillian Lee, and David Mimno
A Case Study on Combining ASR and Visual Features for Generating Instructional Video Captions
Jack Hessel, Bo Pang, Zhenhai Zhu, and Radu Soricut
Something's Brewing! Early Prediction of Controversy-causing Posts from Discussion Features
Jack Hessel and Lillian Lee
Quantifying the Visual Concreteness of Words and Topics in Multimodal Datasets
Jack Hessel, David Mimno, and Lillian Lee
- Cats and Captions vs. Creators and the Clock: Comparing Multimodal Content to Context in Predicting Relative Popularity
Jack Hessel, Lillian Lee, David Mimno
- Aligning Images and Text in a Digital Library
Jack Hessel and David Mimno
Computer Vision in Digital Humanities Workshop at DH 2017 (extended abstract)
Science, AskScience, and BadScience: On the Coexistence of Highly Related Communities
Jack Hessel, Chenhao Tan, and Lillian Lee
- What do Vegans do in their Spare Time? Latent Interest Detection in Multi-Community Networks
Jack Hessel, Alexandra Schofield, Lillian Lee, David Mimno
Workshop on Networks in the Social and Information Sciences at NeurIPS 2015
Image Representations and New Domains in Neural Image Captioning
Jack Hessel, Nicolas Savva, and Kimberly J. Wilber
Workshop on Vision and Language at EMNLP 2015
- Using Reproductive Altruism to Evolve Multicellularity in Digital Organisms
Jack Hessel and Sherri Goings
My CV is more up-to-date, but I've been fortunate to speak at (roughly alphabetically): Adobe Research, Carleton College, Cornell, University of North Carolina (Chapel Hill), University of Pittsburgh, Rutgers University, Seoul National University, SRI International, University of Washington, and more!
Service/Guest Lectures/Other Activities
My CV is more up to date, but I have reviewed/ACed/etc. for many NLP/CV/ML venues since 2016 including ACL, EMNLP, NAACL, AACL, EACL, AAAI, CoNLL, ACL Rolling Review, ICML, NeurIPS, ICLR, JAIR, ICWSM and more!
Other fun projects
- I wrote a TreeLSTM in TensorFlow 2; this is a neural network whose computation-graph topology changes with each input example.
- I wrote a factorization machine layer in PyTorch; for speed reasons, the forward and backward passes are written in Cython.
- I implemented Monroe et al.'s "Fightin' Words" algorithm for robustly comparing word frequencies in two corpora. This implementation has been used in several publications, e.g., this and this.
- As part of an NSF REU, I contributed to the implementation of a really (really) fast SVM solver in Kilian Weinberger's lab at Washington University in St. Louis (the lab is now at Cornell!). See "Parallel Support Vector Machines in Practice" by Tyree, S., Gardner, J. R., Weinberger, K. Q., Agrawal, K., & Tran, J. (2014).
- "A Comparative Analysis of Popular Phylogenetic Reconstruction Algorithms." Undergraduate thesis project; best paper award at MICS 2014. Joint work with Evan Albright, Nao Hiranuma, Cody Wang, and Sherri Goings.
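For the curious: the "Fightin' Words" method mentioned above is Monroe et al.'s weighted log-odds with an informative Dirichlet prior. A minimal sketch of the core computation (function name, signature, and the `prior_scale` default are illustrative, not taken from my actual repository):

```python
# Sketch of Monroe et al.'s "Fightin' Words": weighted log-odds-ratio
# between two corpora, smoothed by an informative Dirichlet prior built
# from the pooled word counts.
from collections import Counter
import math

def fightin_words(corpus_a, corpus_b, prior_scale=10.0):
    """Return {word: z-score}; positive favors corpus_a, negative corpus_b.

    corpus_a / corpus_b are token lists. The prior is the pooled word
    distribution scaled by prior_scale.
    """
    counts_a, counts_b = Counter(corpus_a), Counter(corpus_b)
    pooled = counts_a + counts_b
    n_pooled = sum(pooled.values())
    # Informative Dirichlet prior: pooled relative frequencies * prior_scale.
    prior = {w: prior_scale * c / n_pooled for w, c in pooled.items()}
    a0 = sum(prior.values())
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())

    z_scores = {}
    for w in pooled:
        ya, yb, aw = counts_a[w], counts_b[w], prior[w]
        # Difference of smoothed log-odds of w in each corpus.
        delta = (math.log((ya + aw) / (n_a + a0 - ya - aw))
                 - math.log((yb + aw) / (n_b + a0 - yb - aw)))
        # Approximate variance of that difference.
        var = 1.0 / (ya + aw) + 1.0 / (yb + aw)
        z_scores[w] = delta / math.sqrt(var)
    return z_scores
```

The z-scores make words with large but low-frequency raw log-odds ratios less prominent, which is the main advantage over unsmoothed frequency comparisons.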
I grew up in beautiful Portola Valley, California. I earned a B.A. from Carleton College in 2014, studying computer science and mathematics/statistics. During my time in Northfield, I played ice hockey and hosted a radio show. I even returned to Carleton briefly in 2019, this time as a visiting faculty member. I'm a die-hard San Jose Sharks fan, an avid consumer (and very occasional producer) of electronic music, and an amateur lockpicker. During graduate school at Cornell, I was a member of Stewart Little Coop, a community of 15 people, and I played ice hockey in the Ithaca Hockey Association (and, during summer internships in CA, in the San Jose Adult Hockey League).