~/whoami → Surya Avala

Staff Machine Learning Engineer

Principal AI Systems Architect · 10 years across 4 regulated industries

"A decade of making ML systems survive production in healthcare and energy — where the plumbing matters more than the model."

Healthcare · Energy · Finance · Tech

  • Python
  • C++
  • Kubernetes
  • PyTorch
  • vLLM
  • LangGraph

Quantified Impact

Six measured outcomes across Healthcare, Energy & Recommender systems.

67%

reduction in feature lead times across the GCP ML platform

Montu — Healthcare ML Platform (2023–2025)

Replaced sprawling notebook-to-production pipeline with Kubeflow + GitOps, turning sprint-length deploys into same-day ships.

methodology

DORA metrics, measured pre/post Kubeflow + GitOps platform migration

93%

clinical NLP accuracy, outperforming Google Healthcare NLP by 14 percentage points

Montu — Clinical Document Intelligence

Built custom clinical NLP pipeline outperforming Google Healthcare NLP on domain-specific medical text extraction.

methodology

Held-out clinician-labelled validation set, 5-fold cross-validation

40%

average infrastructure cost reduction across 10k+ energy sites

Amber — Energy Forecasting Platform (2021–2023)

Designed event-driven FinOps architecture for 10k+ energy sites, eliminating redundant compute and cold-path waste.

methodology

GCP billing diff over 6-month rolling window post-FinOps refactor

71.3%

clinician adoption of the prescription recommender (100k+ patients)

Montu — Two-Tower Recommender System

Hybrid two-tower recommender personalising prescriptions for 100k+ recurring patients with clinician-in-the-loop feedback.

95% CI [68.1%, 74.5%] · bootstrapped over weekly cohorts

methodology

Active-user telemetry / total clinicians eligible, 90-day rolling window
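
A bootstrapped interval of this kind can be sketched roughly as follows. This is a minimal illustration, assuming each weekly cohort is summarised as an (active clinicians, eligible clinicians) pair and that a percentile bootstrap resamples whole cohorts with replacement. The function name `bootstrap_adoption_ci` and the cohort figures are hypothetical, not the production telemetry.

```python
import random


def bootstrap_adoption_ci(cohorts, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the pooled adoption rate.

    cohorts: list of (active, eligible) tuples, one per weekly cohort.
    Returns (lower, upper) bounds of the (1 - alpha) interval.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rates = []
    for _ in range(n_resamples):
        # Resample whole cohorts with replacement, then pool the counts.
        sample = [rng.choice(cohorts) for _ in cohorts]
        active = sum(a for a, _ in sample)
        eligible = sum(e for _, e in sample)
        rates.append(active / eligible)
    rates.sort()
    lower = rates[int((alpha / 2) * n_resamples)]
    upper = rates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical weekly cohorts, for illustration only.
cohorts = [
    (70, 100), (73, 101), (69, 99), (74, 102), (71, 100), (72, 98),
    (68, 97), (75, 103), (70, 99), (73, 100), (71, 101), (72, 100),
]
point = sum(a for a, _ in cohorts) / sum(e for _, e in cohorts)
lo, hi = bootstrap_adoption_ci(cohorts)
print(f"adoption {point:.1%}, 95% CI [{lo:.1%}, {hi:.1%}]")
```

Resampling cohorts rather than individual clinicians keeps week-to-week variation in the interval, which is one plausible reading of "bootstrapped over weekly cohorts".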

70%

clinician case reviews automated by the care quality assessment pipeline

Montu — Care Quality Assessment

Automated clinician case review pipeline combining structured extraction with quality scoring, freeing clinical hours.

methodology

Automated-review count / total reviews, monthly rolling

0.87+

F1 PII redaction score on clinical log sanitisation

Montu — Privacy-by-Design Logging

Privacy-by-Design log sanitisation with tuned NER model, ensuring clinical inputs never leak PII to downstream systems.

methodology

Held-out clinician-annotated PII corpus, micro-averaged F1
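
For readers unfamiliar with the metric, micro-averaged F1 pools true positives, false positives and false negatives across all entity types before computing a single score. A minimal sketch, assuming per-type counts are already available. The entity types and counts below are hypothetical and do not come from the evaluation corpus.

```python
def micro_f1(counts):
    """Micro-averaged F1 from per-entity-type counts.

    counts: dict mapping entity type -> (tp, fp, fn).
    """
    # Pool counts over every entity type first; this is what makes it "micro".
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0  # no correct predictions, so F1 is defined as zero here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts per PII entity type, for illustration only.
example_counts = {
    "NAME": (90, 10, 5),
    "DOB": (40, 5, 10),
    "ADDRESS": (20, 5, 5),
}
print(round(micro_f1(example_counts), 4))
```

Because counts are pooled, frequent entity types dominate the score; a macro average would weight each type equally instead.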

Open to Work

Targeting Principal / Staff IC roles in regulated ML infrastructure.

Volunteering for Social Good · long-form writing on notes.

By the Numbers

  • decade: 10 yrs
  • industries shipped in: 5
  • shipped roles: 7
  • measured outcomes: 6
  • upstream PRs merged: 5

computed at build · not a marketing claim

Featured

scaling-succotash

Production agentic search engine — GraphRAG, Celery DLQ, Circuit Breakers, Kubernetes. Featured on the homepage as the flagship agentic system.

  • GraphRAG
  • LangGraph
  • Kubernetes
  • Celery
  • Circuit Breakers
Latest

traffic_counter

O(1) stdlib optimisation — zero GC thrashing, pure CPython heap/deque. Demonstrates first-principles algorithmic thinking against real workloads.

  • Python
  • Systems
  • O(1)
  • CPython
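
The project's actual design is not described on this page. As a rough illustration of the kind of stdlib pattern the summary points at, here is a minimal sketch of a sliding-window event counter built on `collections.deque`, assuming timestamps arrive in non-decreasing order. The class name `SlidingWindowCounter` and its methods are hypothetical.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen within the last `window_seconds`.

    Assumes timestamps are recorded in non-decreasing order.
    """

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()  # oldest timestamp sits at the left end

    def record(self, timestamp):
        """Register one event; amortised O(1)."""
        self.events.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        """Number of events with timestamp in (now - window, now]."""
        self._evict(now)
        return len(self.events)

    def _evict(self, now):
        # Each timestamp is appended once and popped at most once,
        # so the total eviction work is linear in the number of events.
        cutoff = now - self.window
        while self.events and self.events[0] <= cutoff:
            self.events.popleft()


counter = SlidingWindowCounter(window_seconds=60)
for t in (0, 10, 30):
    counter.record(t)
print(counter.count(50), counter.count(65), counter.count(100))
```

`deque.popleft()` runs in constant time, whereas popping from the front of a list shifts every remaining element, which is the usual reason to reach for a deque here.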

Upstream Contributions

  • tensorflow/tfx #3813
  • kubeflow/pipelines #4702
  • kubeflow/pipelines #4706
  • dask/dask #5828
  • iterative/katacoda-scenarios #docs

5 merged PRs across MLOps + data infra

Technical Leadership

RFC authorship · ML Guild mentoring · cross-functional Build vs Buy negotiation.

"Hired, onboarded and mentored teams of Data Scientists & ML Engineers across multiple orgs."

Beyond Code

Using ML to drive equitable outcomes — climate action, public-policy data, and technology for underserved communities.

"AI is the new electricity."

AI for Social Good · Climate · Rational Thinking