Harish Balaji Boominathan | Portfolio

Professional Experience

Google Summer of Code – Open Source Contributor

University of California, OSPO (Jun 2025 – Sep 2025)

Designed privacy metrics quantifying re-identification risk across structured datasets, including a normalized risk scoring system.
Integrated image dataset support with pre-trained vision models (CLIP, ViT, DINOv2).
Prototyped a RAG-based chatbot for interactive data readiness explanations (92% satisfaction, 1,000+ users).
Extended the evaluation pipeline to support multiple modalities and added visual dashboards for quality, fairness, and privacy metrics.
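
To make the normalized risk score concrete, here is a minimal sketch rather than the project's actual metric. It assumes a k-anonymity-style definition in which a record's risk is 1/k, where k is the number of records sharing its quasi-identifier values, and the dataset score is the mean of those risks. The function name, column names, and sample rows are all hypothetical.

```python
from collections import Counter

def reidentification_risk(records, quasi_identifiers):
    """Return (per-record risks, normalized dataset score in [0, 1])."""
    if not records:
        return [], 0.0
    # Group records by their combination of quasi-identifier values.
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    # A record sharing its combination with k - 1 others gets risk 1/k.
    risks = [1.0 / class_sizes[k] for k in keys]
    return risks, sum(risks) / len(risks)

# Hypothetical sample data, for illustration only.
rows = [
    {"zip": "11201", "age_band": "20-29", "sex": "F"},
    {"zip": "11201", "age_band": "20-29", "sex": "F"},
    {"zip": "11201", "age_band": "30-39", "sex": "M"},
    {"zip": "10003", "age_band": "30-39", "sex": "M"},
]
per_record, score = reidentification_risk(rows, ["zip", "age_band", "sex"])
```

Under this assumed definition a score of 1.0 means every record is unique on the chosen columns, and the score falls toward 0 as equivalence classes grow.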

Graduate Research Assistant – VIDA Lab

New York University (Jan 2025 – Present)

Developed multi-sensor fusion tools for 12+ nodes across Brooklyn intersections, aggregating 2+ hours of multimodal data.
Deployed HRNet-based pose estimation on traffic video feeds, analyzing 1,000+ pedestrians for safety risk.
Built ML pipelines to classify 5,000+ objects and compute pedestrian speed with 95%+ accuracy.
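
As a rough illustration of the speed computation, the sketch below derives average walking speed from a tracked trajectory. It assumes positions have already been projected from pixels to ground-plane metres, a calibration step not shown here. The function name and the sample track are hypothetical and not taken from the lab's pipeline.

```python
import math

def average_speed(track):
    """track: list of (t_seconds, x_m, y_m) sorted by time. Returns m/s."""
    if len(track) < 2:
        return 0.0
    # Sum the straight-line distance between consecutive positions.
    distance = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (_, x1, y1), (_, x2, y2) in zip(track, track[1:])
    )
    elapsed = track[-1][0] - track[0][0]
    return distance / elapsed if elapsed > 0 else 0.0

# Hypothetical trajectory: three detections one second apart.
track = [(0.0, 0.0, 0.0), (1.0, 1.2, 0.0), (2.0, 2.4, 0.5)]
speed = average_speed(track)
```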

Featured Projects

El Silencio Acoustic Explorer – Edge AI for Biodiversity Monitoring

Multi-branch pipeline: RawAudioCNN, EfficientNetB3 (LoRA), ResNet50, PANNs-based embeddings.
Achieved a 385% throughput increase (13,100 FPS) with a 197.5 MB model and 23 ms/sample latency on an RTX 6000.
Edge deployment on Raspberry Pi 5, real-time metrics with Prometheus/Grafana (<75ms latency, 8+ users).
Integrated MLflow for experiment tracking and edge reliability monitoring.

RouteWise – Intelligent Route Optimization Engine

Five algorithmic solvers: Greedy, Brute Force, Dynamic Programming, Constraint Programming (OR-Tools), hybrid genetic-annealing.
Reduced planning time by 85% with smart algorithm selection and weighted preference scoring.
Reduced API usage by 80% via caching and batched requests to the Google Distance Matrix API.
Route generation with Google Maps Static API, supporting multi-day, preference-weighted itineraries.
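
The dynamic programming solver is not described in detail here, so the following is only a sketch of one standard choice, the Held-Karp algorithm, which finds the exact shortest round trip over a small set of stops. The function name and the distance matrix are illustrative assumptions, not RouteWise code.

```python
from itertools import combinations

def held_karp(dist):
    """Exact shortest round trip from stop 0 visiting every stop once.

    dist is a square matrix of pairwise travel costs. Runs in
    O(n^2 * 2^n) time, so it only suits small stop counts.
    """
    n = len(dist)
    if n <= 1:
        return 0.0, list(range(n))
    # best[(mask, j)] = (cost, prev): cheapest path that starts at 0,
    # visits exactly the stops in mask, and ends at j.
    best = {(1 << j, j): (dist[0][j], 0) for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev_mask = mask & ~(1 << j)
                best[(mask, j)] = min(
                    (best[(prev_mask, k)][0] + dist[k][j], k)
                    for k in subset if k != j
                )
    full = (1 << n) - 2  # every stop except the start
    cost, last = min(
        (best[(full, j)][0] + dist[j][0], j) for j in range(1, n)
    )
    # Walk predecessor links backwards to recover the visiting order.
    order, mask = [], full
    while last != 0:
        order.append(last)
        mask, last = mask & ~(1 << last), best[(mask, last)][1]
    return cost, [0] + order[::-1]

# Hypothetical symmetric travel costs between four stops.
dist = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
cost, route = held_karp(dist)
```

Because the running time grows exponentially with the number of stops, an exact solver like this is a plausible candidate for short itineraries, with heuristic solvers taking over for longer ones.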

DailyPod – Automated News-to-Audio Intelligence Platform

Aggregates multilingual news from NewsAPI, deduplicates with custom NLP, ranks by relevance/recency.
Summarizes articles using GPT-3.5 with language-aware prompts.
Text-to-speech audio delivery via WhatsApp Business API.
Backend: Flask, Celery, Redis for async processing, scaling, and retry logic. Monitoring via dashboards.
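
The deduplication and ranking steps could look roughly like the sketch below. It is not the platform's custom NLP: it assumes near-duplicates are detected by Jaccard overlap of title words, and that ranking multiplies a relevance score by an exponential recency decay. The field names, the 0.6 threshold, and the 12-hour half-life are hypothetical.

```python
import re

def tokens(text):
    # \w+ matches Unicode word characters, so non-English titles work too.
    return set(re.findall(r"\w+", text.lower()))

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def deduplicate(articles, threshold=0.6):
    """Keep the first article of each group of near-identical titles."""
    kept = []
    for article in articles:
        words = tokens(article["title"])
        if all(jaccard(words, tokens(k["title"])) < threshold for k in kept):
            kept.append(article)
    return kept

def rank(articles, half_life_hours=12.0):
    """Sort by relevance, halved for every half_life_hours of age."""
    def score(a):
        return a["relevance"] * 0.5 ** (a["age_hours"] / half_life_hours)
    return sorted(articles, key=score, reverse=True)

# Hypothetical articles, for illustration only.
articles = [
    {"title": "Central bank raises interest rates again",
     "relevance": 0.9, "age_hours": 24},
    {"title": "Central bank raises interest rates",
     "relevance": 0.8, "age_hours": 2},
    {"title": "New transit line opens downtown",
     "relevance": 0.6, "age_hours": 1},
]
unique = deduplicate(articles)
ranked = rank(unique)
```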

XGChurn – Predictive Analytics Platform

XGBoost classifier (87% accuracy, 0.85 ROC-AUC) with SHAP interpretability.
Streamlit dashboard for churn probability exploration (2,000+ customers).
Feature engineering: credit score drop, transaction frequency, etc.
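
The two named features can be sketched as follows. The exact definitions used in XGChurn are not given here, so this assumes "credit score drop" means the fall from a customer's peak score to the latest one, and "transaction frequency" means transactions per 30 days over a recent window. The function name, argument names, and 90-day window are hypothetical.

```python
def churn_features(credit_scores, transaction_days_ago, window_days=90):
    """Derive two illustrative churn features for one customer.

    credit_scores: chronological list of scores (oldest first).
    transaction_days_ago: how many days ago each transaction occurred.
    """
    # Drop from the historical peak to the most recent score.
    drop = max(credit_scores) - credit_scores[-1] if credit_scores else 0
    # Count transactions inside the window and normalize to a 30-day rate.
    recent = [d for d in transaction_days_ago if 0 <= d < window_days]
    return {
        "credit_score_drop": drop,
        "txn_per_30d": len(recent) / (window_days / 30),
    }

# Hypothetical customer history.
features = churn_features([720, 710, 680], [3, 10, 45, 80, 120])
```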

Education

New York University
Master of Science in Computer Engineering (Sep 2024 – May 2026)
GPA: 4.00 / 4.00

Research: Fine-tuned RoBERTa on 120K+ news articles with LoRA adapters (92%+ accuracy, 0.5M trainable params).
Core: Machine Learning, Deep Learning, ML Systems Engineering, Applied Matrix Theory

SASTRA University
Bachelor of Technology in Computer Science and Engineering (Aug 2020 – Jun 2024)

Technical Foundation: Algorithms, Operating Systems, Machine Learning, Computer Networks, Databases

Technical Expertise

Languages & Infrastructure: Python, C++, SQL, Bash, FastAPI, Git, Docker, Kubernetes, Linux

ML & Data Science: PyTorch, TensorFlow, ONNX, MLflow, Streamlit, Prometheus, Grafana

Specializations: Distributed Systems, System Design, Model Optimization, Edge Deployment, CI/CD, Privacy-Preserving ML, MLOps

Connect

Email GitHub LinkedIn

This portfolio showcases the systems, models, and tools I've built, from privacy-focused ML infrastructure to scalable backend pipelines and edge-deployed intelligence. Explore the repositories or reach out for collaboration opportunities.