Data Science · Informatics · Psychology

Turning messy data
into decisions.

I'm Yougi — an Informatics & Psychology student at UMass Amherst building dashboards, models, and data pipelines, including a live NYC collisions app over 27K records.

I work at the intersection of data and human behavior — pairing technical analysis with a psychology background to ask better questions of the numbers.

I'm currently pursuing a B.S. in Informatics & Psychology at the University of Massachusetts Amherst (expected May 2027). My projects span the full pipeline: building datasets, writing SQL, training models, and shipping dashboards that non-technical stakeholders can actually use.

Focus
Analytics, ML, data engineering
Education
UMass Amherst — Informatics & Psychology
Graduating
May 2027
Based in
Amherst, MA
Projects shipped
3
Records analyzed
37K+
Survey participants
700+
Live apps
1
Jan 2026 Live Dashboard / Analytics

NYC Motor Vehicle Collisions Dashboard

A public Streamlit app analyzing 27,164 NYC collisions (Jan 2020 – May 2025), covering 13,508 injuries and 64 fatalities across all five boroughs. Four SQL analyses (severity, borough, hour-of-day, monthly trends) run on a validated 23-field schema, with caching for fast loads and pytest/CI for reliability.

Records
27,164
Injuries
13,508
Fatalities
64
  • Python
  • SQL
  • SQLite
  • Streamlit
  • PyDeck
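
As a rough illustration of the hour-of-day analysis described for the collisions dashboard, the sketch below runs a grouped SQL query against an in-memory SQLite database. The table name `collisions`, the columns `crash_time`, `borough`, and `injuries`, and the three sample rows are hypothetical; the app's actual 23-field schema is not reproduced here.

```python
import sqlite3

# Hypothetical mini-schema with made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collisions (crash_time TEXT, borough TEXT, injuries INTEGER)"
)
conn.executemany(
    "INSERT INTO collisions VALUES (?, ?, ?)",
    [
        ("2024-03-01 08:15:00", "BROOKLYN", 1),
        ("2024-03-01 08:40:00", "QUEENS", 0),
        ("2024-03-02 17:05:00", "BRONX", 2),
    ],
)

# Bucket each collision by the hour of its timestamp, then count
# collisions and sum injuries per bucket.
HOURLY_SQL = """
SELECT CAST(strftime('%H', crash_time) AS INTEGER) AS hour,
       COUNT(*)      AS collisions,
       SUM(injuries) AS injuries
FROM collisions
GROUP BY hour
ORDER BY hour
"""

hourly = conn.execute(HOURLY_SQL).fetchall()
print(hourly)
```

In a Streamlit app, a query like this would typically live in a function decorated with `st.cache_data`, so repeat page loads reuse the result instead of hitting the database again.
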
Dec 2025 Machine Learning

Bank Customer Churn Prediction

An end-to-end R report flagging at-risk customers across 10,000 records and 14 features. It compares logistic regression with Random Forest under 5-fold cross-validation, accounts for class imbalance (a 20.4% churn rate), and ties decision thresholds to retention strategy.

Records
10,000
Features
14
Churn rate
20.4%
  • R
  • Classification
  • Random Forest
  • Class imbalance
  • Cross-validation
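
One way to picture tying a decision threshold to retention strategy is to pick the cutoff that minimizes expected cost over scored customers. The report itself is written in R; the Python sketch below is only an illustration, and the scores, labels, and both cost figures are invented. It also assumes, as a simplification, that every customer who receives a retention offer stays.

```python
def expected_cost(scores, labels, threshold, offer_cost, churn_loss):
    """Total cost when every customer scored at or above `threshold` gets an offer."""
    cost = 0.0
    for score, churned in zip(scores, labels):
        if score >= threshold:
            cost += offer_cost   # offer sent (assumed to retain the customer)
        elif churned:
            cost += churn_loss   # churner we failed to flag
    return cost


def best_threshold(scores, labels, offer_cost, churn_loss):
    """Try each observed score (plus 'flag nobody') and keep the cheapest cutoff."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: expected_cost(scores, labels, t, offer_cost, churn_loss),
    )


# Hypothetical model scores and true churn labels (1 = churned).
scores = [0.9, 0.7, 0.4, 0.2, 0.1]
labels = [1, 1, 0, 1, 0]

threshold = best_threshold(scores, labels, offer_cost=10.0, churn_loss=100.0)
print(threshold)
```

When a missed churner costs far more than an offer, the chosen cutoff drops well below 0.5, which is why the threshold is better treated as a business decision than a model default.
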
Aug 2025 Data Architecture

Plug — Campus Marketplace

Selected for Purdue's Market Readiness Incubator. As project manager, I designed the analytics-ready PostgreSQL (Supabase) backend — a relational schema with SQL CRUD plus multi-field search, sorting, and pagination, backed by integrity checks.

Backend
PostgreSQL
Queries
CRUD + search
Selected
Purdue Incubator
  • PostgreSQL
  • Supabase
  • SQL
  • Data Modeling
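
A minimal sketch of multi-field search with sorting and pagination, under stated assumptions: SQLite stands in for the PostgreSQL (Supabase) backend so the example runs anywhere, and the `listings` table, its columns, and the sample rows are hypothetical rather than Plug's actual schema.

```python
import sqlite3

# Whitelist of sortable columns; column names cannot be bound as query parameters.
ALLOWED_SORT = {"price": "price", "title": "title"}


def search_listings(conn, term, sort_by="price", page=1, page_size=2):
    """Match `term` against title or description, then sort and paginate."""
    order_col = ALLOWED_SORT.get(sort_by, "price")
    pattern = f"%{term}%"
    sql = (
        "SELECT id, title, price FROM listings "
        "WHERE title LIKE ? OR description LIKE ? "
        f"ORDER BY {order_col}, id "   # id as tie-breaker keeps pages stable
        "LIMIT ? OFFSET ?"
    )
    offset = (page - 1) * page_size
    return conn.execute(sql, (pattern, pattern, page_size, offset)).fetchall()


conn = sqlite3.connect(":memory:")
# NOT NULL and CHECK constraints act as simple integrity checks.
conn.execute(
    "CREATE TABLE listings ("
    "id INTEGER PRIMARY KEY, "
    "title TEXT NOT NULL, "
    "description TEXT NOT NULL, "
    "price REAL NOT NULL CHECK (price >= 0))"
)
conn.executemany(
    "INSERT INTO listings (title, description, price) VALUES (?, ?, ?)",
    [
        ("Desk lamp", "LED lamp for a dorm desk", 12.0),
        ("Calculus textbook", "Lightly used", 40.0),
        ("Standing desk", "Adjustable height", 95.0),
        ("Mini fridge", "Fits under a desk", 60.0),
    ],
)

page_one = search_listings(conn, "desk", page=1)
page_two = search_listings(conn, "desk", page=2)
print(page_one, page_two)
```

On PostgreSQL, `LIKE` is case-sensitive, so a case-insensitive search would more likely use `ILIKE`, and the parameter placeholder syntax depends on the driver.
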
More projects on GitHub ↗
  1. May 2025

    AI & Psychology Research Intern

    Spiritual Data · Remote

    Built a benchmarking harness comparing a proprietary model against baselines on 3+ metrics, designed evaluation splits and error slices to surface failure modes, and produced a stakeholder evaluation report aligned to business and compliance needs.

  2. Jun 2022

    Research Intern — Impact of Social Media

    UC Berkeley · Lawrence Hall of Science · Berkeley, CA

    Cleaned and prepared survey data from 700+ participants into analysis-ready datasets, then ran quantitative analysis (EDA, regression) on how social media use relates to public attitudes.

University of Massachusetts Amherst

Amherst, MA · Expected May 2027

B.S. Informatics & Psychology — Double Major

A double major bridging data and human behavior — informatics for the technical foundation (programming, databases, analysis) and psychology for the research methods and statistics behind asking sharper questions of the numbers.

Languages

  • Python
  • SQL
  • R
  • Java

Data

  • Pandas
  • NumPy
  • ETL
  • SQLite
  • PostgreSQL

Analytics / BI

  • Streamlit
  • Tableau
  • Power BI
  • Excel
  • KPI design

ML / Stats

  • scikit-learn
  • Classification
  • Regression
  • Model evaluation
  • Statistics

Tools

  • Git / GitHub
  • pytest
  • CI/CD
  • Caching
  • LangChain

Let's build something with data.

Open to data science and analytics roles, internships, and collaborations.