Eric Tang

About Me

I'm currently a software engineer at Anyscale working on LLM post-training. Previously, I was an MS student in Computer Science at Stanford, advised by Juan Carlos Niebles at the Stanford Vision and Learning Lab. Before that, I graduated from UC Berkeley with a BS in EECS, where I worked with Dan Hendrycks and was advised by Dawn Song and Jacob Steinhardt. I've also interned at Meta, where I worked on scaling ads ranking models; at TikTok, where I worked on video understanding for creator tools and recommendation; and at Google DeepMind, where I researched the properties of LLM embeddings for regression.

This site hosts course and personal projects that I thought were cool, notes on what I've been reading and learning, and some of the teaching resources I've compiled over the years!

Notes / Projects / Teaching

Publications

	Understanding LLM Embeddings for Regression Eric Tang, Bangding Yang, Xingyou Song, arXiv preprint [paper] We provide one of the first comprehensive investigations into embedding-based regression and demonstrate that LLM embeddings as features can be better for high-dimensional regression tasks than using traditional feature engineering.
	Streaming Detection of Queried Event Start Cristobal Eyzaguirre, Eric Tang, Shyamal Buch, Adrien Gaidon, Jiajun Wu, Juan Carlos Niebles NeurIPS (Datasets and Benchmarks Track), 2024 [paper][website] We propose a novel task for multimodal video understanding, where the goal is to identify the beginning of a complex event as described by a natural language query, with high accuracy and low latency. We introduce a new benchmark based on the Ego4D dataset, as well as new task-specific metrics to study streaming multimodal detection of diverse events in an egocentric video setting.
	How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios Mantas Mazeika, Eric Tang, Andy Zou, Steven Basart, Jun Shern Chen, Dawn Song, David Forsyth, Jacob Steinhardt, Dan Hendrycks NeurIPS (Datasets and Benchmarks Track), 2022 (Oral) [paper][code] Introduces the VCE and V2V datasets for understanding emotional response and viewer wellbeing for video data.
	Measuring Mathematical Problem Solving with the MATH Dataset Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, Jacob Steinhardt NeurIPS (Datasets and Benchmarks Track), 2021 [paper][code] Introduces the MATH dataset for measuring mathematical reasoning in large language models, and the AMPS dataset for pretraining.
	Imaging Reconfigurable Molecular Concentration on a Graphene Field-Effect Transistor Franklin Liou, Hsin-Zon Tsai, Andrew S Aikawa, Kyler C Natividad, Eric Tang, Ethan Ha, Alexander Riss, Kenji Watanabe, Takashi Taniguchi, Johannes Lischner, Alex Zettl, Michael F Crommie Nano Letters, 2021 [paper] Uses scanning tunneling microscopy to demonstrate that molecules deposited onto graphene field-effect transistors (FETs) exhibit reversible, electrically tunable surface concentration.