Software Engineer @ Anyscale
M.S. EECS @ UC Berkeley
San Francisco, CA
I'm Scott, a Bay Area native, tea connoisseur, and turtle enthusiast. Currently, I'm a software engineer at Anyscale, contributing to open-source Ray. In the past, I've also had the pleasure of working at other amazing companies like Lyft, Rubrik, and Brilliant. I fill my free hours by sipping tea, playing video games, climbing rocks, and trying new restaurants.
I'm passionate about designing thoughtful solutions to complex problems.
My current academic and industry interests include:
• artificial intelligence (machine learning, LLMs, computer vision)
• large-scale data engineering (TB-scale data engineering, data for AI/ML)
• the intersection of technology and education
Python & Libraries: Ray, PyTorch, TensorFlow, vLLM, HuggingFace, LangChain
Other Languages & Frameworks: Apache Spark, Arrow, Airflow; SQL, Go, Java, R, JavaScript
Domain Knowledge: Computer Vision, Ad Ops (Google/Meta)
Software Engineer (Ray Data)
• Core developer for Ray Data, part of the open-source Ray project (an industry-standard Python library for distributed computing).
• Developed core features for Ray Data General Availability, including the execution plan optimizer, data ingestion for distributed training, and observability.
• Engineering lead for new LLM workloads at Anyscale, such as text embedding generation and LLM batch inference.
• Published several blog posts and presented a talk at Ray Summit to publicize and share our work.
Graduate Researcher
• Research areas: Computer Vision (Explainability, Few-Shot), Medical Imaging (EKG)
• Key work: Neural-Backed Decision Trees (ICLR 2021)
Software Engineer (Growth Platforms)
• Redesigned two major components of the existing infrastructure for automated driver acquisition, scaling marketing spend from COVID-shutdown levels to $5MM+/month across three paid media channels.
• Drove multi-quarter, mission-critical initiatives that directly impacted key team OKRs, partnering with numerous engineers and scientists in a highly cross-functional environment. Leveraged and extended the team’s core database of 30+ tables to plan short-term strategy and the long-term roadmap.
Head Teaching Assistant
• Managed a team of 50 TAs, 60 tutors, and 150 lab assistants to run a 1,300-student intro data science course.
• Led key infrastructure projects to support scaling across the data science department, including streamlined assignment development, an autograding system, cheating detection, and large-scale course logistics.
A Slackbot packed with features to assist teaching staff members, including roster lookup, Piazza paging, and groupshouts.
An augmented course catalog used by over 30,000 UC Berkeley students that provides data on courses, enrollment trends, grade distributions, and more.
Improving explainability for deep learning image classification using a decision tree-based structure.
A general method for adapting edge detection algorithms to produce edge maps that focus on one or a few individual objects in an image.