Hey, I'm Scott.

I'm a software engineer gone data scientist.


I'm Scott Lee, a Bay Area native, tea connoisseur, and turtle enthusiast. I'm currently double majoring in Computer Science and Statistics at UC Berkeley. My academic and professional interests include artificial intelligence, machine learning, and automation.

I'm also deeply interested in teaching and education. I currently serve as a head TA for UC Berkeley's Data 100. One day, I hope to make accessible, high-quality education a practical reality through technology.

You can find my resume below, or if you prefer, here in a more concise PDF form. The best way to reach me is by email (scott.lee.3898@berkeley.edu).


University of California, Berkeley 2016 - 2019
B.A. Computer Science, B.A. Statistics | 3.9 / 4
• Recent coursework: Algorithms, Data Structures, Machine Learning, Artificial Intelligence, Database Systems, Computer Architecture, Discrete Math, Probability & Random Processes, Statistical Theory, Linear Modeling


Programming: Python (+ Pandas, scikit-learn, TensorFlow, Spark), SQL, R, Java, Go, C

Web Development: HTML, CSS/LESS, JavaScript, jQuery, Bootstrap


Rubrik, Inc.

SWE Intern | May 2018 - Aug 2018

• Designed and implemented a robust database system with efficient random reads and sequential scanning, used to store email metadata in the cloud.
• Also added universal compatibility with various cloud providers (S3, Azure, Google Cloud) and further augmented it with Zstd compression and file deduplication.


Data Science / Education Intern | May 2017 - Aug 2017

• Conducted data analysis using SQL and Python on 1+ million emails to evaluate email algorithm effectiveness.
• Designed and produced curricula for "Math for Computer Science" and "Intro to Data Science" courses.


UC Berkeley Data 100

Head Teaching Assistant | Jan 2018 - Present

• Prepared materials, taught lab section, held office hours, and developed assignments.
• Facilitated class logistics, including managing staff, organizing exams, and overseeing enrollment for a class of 1400 students.
[Teaching Assistant | Aug 2017 - Dec 2017]

UC Berkeley RISELab

Research Assistant | February 2018 - Present

• Applied convolutional neural networks to detect and classify arrhythmias from electrocardiogram readings.
• Also built a hierarchical classifier that parses and analyzes data from doctors' notes.



BerkeleyTime

Data Scientist/Product Manager | Jan 2018 - Present

• BerkeleyTime is an augmented course catalog that provides data on courses, enrollment trends, grade distributions, and more. I serve as the Product Manager and Data Scientist for BerkeleyTime, and was previously Lead Engineer.
• We just wrapped up a course scheduler and are currently working on a data-centered initiative built on student course data, including course recommendations, automatic course plan generation, and intelligent course classification.

Neural Net Applications | Python

• Implemented a neural net from scratch (creating a graph class, implementing matrix operations, backpropagation, etc.).
• Applied NNs to solve various problems, such as digit classification, language classification, and function approximation.

Song Classifier | Python, TensorFlow

• Created a deep learning classifier that categorizes songs as either country or hip-hop based on lyric word frequencies.
• Achieved 89% accuracy and was named one of 10 winners of a 200-member Kaggle contest.

Jesture | HTML, CSS, JS, jQuery, Python, C, AppleScript

• Created a gesture detection application for the Synaptics touchpad; implemented API integrations for Spotify, Slack, Facebook, and more, then linked them to a sleek web UI.