vincent sunn chen

vincentsc [at] cs [dot] stanford [dot] edu

Building the data-first platform for AI application development at Snorkel AI. Join us!

Here's an incomplete list of my reading, writing, and photography.


Slice-based Learning: A Programming Model for Residual Learning in Critical Data Slices

Vincent S Chen, Zhenzhen Weng, Alexander Ratner, Christopher Ré

ML models can achieve high-quality performance on coarse-grained metrics (e.g., F1-score, overall accuracy), but they may underperform on critical data subsets, or slices. We introduce Slice-based Learning, a new programming model in which practitioners use slicing functions to specify critical data subsets to which the model should pay attention.

Scene Graph Prediction with Limited Labels

Vincent S Chen, Paroma Varma, Ranjay Krishna, Michael Bernstein, Christopher Ré, Li Fei-Fei

Scene graphs have emerged as useful in a number of computer vision tasks, including visual question answering. However, most scene graph datasets are sparse due to annotator error. This work attempts to overcome the limitations of human annotators with a semi-supervised method that uses both limited labels and unlabeled data to generate training datasets for scene graphs.

Powerful Abstractions for Programming Your Training Data

Sen Wu, Vincent S. Chen, Braden Hancock, Alex Ratner, Chris Ré, and other members of Hazy Lab

We leverage key abstractions in Snorkel to achieve state-of-the-art scores on the SuperGLUE benchmark: (1) weak supervision, (2) data augmentation, and (3) data slicing.

Massive Multi-Task Learning: Bringing More Supervision to Bear

Braden Hancock, Clara McCreery, Ines Chami, Vincent S. Chen, Sen Wu, Jared Dunnmon, Paroma Varma, Max Lam, and Chris Ré

We incorporate a number of supervision sources, including traditional supervision, transfer learning, multi-task learning, weak supervision, and ensembling, to achieve state-of-the-art scores on the GLUE benchmark.

Weakly supervised classification of rare aortic valve malformations using unlabeled cardiac MRI sequences

Jason A Fries, Paroma Varma, Vincent S Chen, Ke Xiao, Heliodoro Tejeda, Priyanka Saha, Jared Dunnmon, Henry Chubb, Shiraz Maskatia, Madalina Fiterau, Scott Delp, Euan Ashley, Christopher Ré, James Priest

Bicuspid aortic valve (BAV) is the most common congenital malformation of the heart, but obtaining training data is a tremendous practical roadblock to building ML models for detecting it. We collaborate with cardiologists from Stanford Medicine to write labeling functions over geometric features of heart MRIs, which produce probabilistic training labels.

Automated Training Set Generation for Aortic Valve Classification

Vincent Chen, Paroma Varma, Madalina Fiterau, James Priest, and Christopher Ré

Using weak supervision, we learn probabilistic training labels for aortic valve MRIs.


CS231N: Convolutional Neural Networks for Visual Recognition

Teaching Assistant, Spring 2018

Hosted office hours, advised student projects, and led discussion sections on backpropagation and weak supervision.