Experience

2023 - Present
The EleutherAI Institute

Research Scientist

  • Conduct and assist in LLM mechanistic interpretability & other empirical research projects, including:
    • Circuits Over Time: in-progress research on circuit universality and evolution in different model settings
    • Polygraph: generated dataset for investigating instrumental deception in fine-tuned and RLHF models
    • Current work with Neel Nanda on sentiment, internal summarization and world-modeling
  • Train LLMs on a GPU cluster for various projects as needed, and maintain and improve the GPT-NeoX library
2022 - 2023
NCSU Institute of Transportation Research & Education

Data Scientist

At NCSU ITRE, I built machine learning models for education facility planning, using housing, transportation, population and enrollment data.
2020 - 2021
Taroko.io

Senior Data Analyst

In my role as Senior Data Analyst, I planned and built out our data warehouse in BigQuery, integrating our PostgreSQL databases, Heap Analytics and many other data sources. I also built and deployed statistical & ML solutions for churn and revenue prediction, key conversion path analysis, etc. Finally, I created our company data visualization system in Data Studio, built on a self-written library of hundreds of SQL queries.
2016 - 2019
Taroko.io

Business & Search Engine Marketing Analyst

In this role, I analyzed customer data and wrote SQL queries and reports to deliver various metrics & ad-hoc analyses. I also developed bidding algorithms that rescued failing products, decreasing CPA by 21% and increasing volume.
2014 - 2015
KPIT Extended PLM

Software Engineer

At KPIT (originally I-Cubed, Inc.), I developed and deployed enterprise software (PTC Windchill) customizations for product lifecycle management. I also planned & conducted detailed QA tests for our software products.
2013 - 2014
I-Cubed

Marketer

Led and coordinated a new website design project. Also initiated and led the company blog program, producing daily lead-generating blog posts that showcased company expertise.

Education

Summer 2023
SERI MATS

MATS Scholar

Worked under Neel Nanda's mentorship on research into sentiment representation and internal summarization behavior in LLMs.
2019 - 2021
University of Illinois at Urbana-Champaign

Master of Computer Science

Followed the Data Science track in the UIUC Master of Computer Science program. Notable classes include Natural Language Processing, Deep Learning for Healthcare, Distributed Systems, Scientific Visualization, and Practical Statistical Learning.
2005 - 2012
North Carolina State University

B.S. in Science, Technology and Society

Studied both computer science topics (software development, algorithms) and the ethical and social implications of accelerating technological development. Also studied Japanese and Mandarin Chinese.

High-Level Skills

Core Machine Learning

  • Deep learning
  • Classic machine learning algorithms
  • Data cleaning & exploration
  • Statistics
  • NLP
  • Computer vision
  • Reinforcement learning

Software Engineering

  • Version control
  • ML testing practices
  • Documentation
  • OOP

Deployment & MLOps

  • CI/CD
  • Containerization
  • Machine learning project planning
  • Data management
  • Cloud deployment

Frameworks & Packages

Data Science Fundamentals

  • NumPy
  • Pandas
  • Scikit-Learn
  • Matplotlib
  • RStudio

Deep Learning

  • PyTorch & PyTorch Lightning
  • HuggingFace Transformers

Engineering Tools

  • Git
  • Docker
  • AWS Lambda
  • Google Cloud
  • Weights & Biases
  • Label Studio

Languages

Proficient In:

  • Python
  • SQL

Familiar With:

  • R
  • C++
  • PHP