Topic index
Browse the archive by the systems it keeps returning to.
Each topic collects the entries that shaped it, with the newest notes first.
Use the topic index to jump, then scan each section like a curated sub-archive.
Topic
ai
1 entry
Topic
airflow
1 entry
A migration story from Airflow on VMs to cloud-native data pipelines using Argo Workflows and what we'd do differently.
Topic
argocd
1 entry
Implementing canary deployments and automated rollbacks using Argo Rollouts with Prometheus analysis templates.
Topic
canary
1 entry
Implementing canary deployments and automated rollbacks using Argo Rollouts with Prometheus analysis templates.
Topic
data-engineering
1 entry
A migration story from Airflow on VMs to cloud-native data pipelines using Argo Workflows and what we'd do differently.
Topic
devex
1 entry
What separates a good internal developer platform from an expensive Kubernetes wrapper—and how to build the former.
Topic
flux
1 entry
How we standardized cluster configuration across a multi-cloud estate using Flux v2 and a monorepo approach.
Topic
GCP
1 entry
When someone first asks to run an LLM on Kubernetes, the instinct is reasonable: it's a containerised workload, it exposes an HTTP API, it needs to scale. Deploy it like anything else — a `Deployment`, a `Service`, maybe an `HPA` on CPU. That instinct gets you surprisingly far. Until it doesn't.
Topic
gitops
1 entries
How we standardized cluster configuration across a multi-cloud estate using Flux v2 and a monorepo approach.
Topic
GKE
1 entry
When someone first asks to run an LLM on Kubernetes, the instinct is reasonable: it's a containerised workload, it exposes an HTTP API, it needs to scale. Deploy it like anything else — a `Deployment`, a `Service`, maybe an `HPA` on CPU. That instinct gets you surprisingly far. Until it doesn't.
Topic
go
1 entry
Lessons from building and running three custom operators in production, including the parts the tutorials skip.
Topic
k8s
1 entry
A migration story from Airflow on VMs to cloud-native data pipelines using Argo Workflows and what we'd do differently.
Topic
kubernetes
4 entries
What separates a good internal developer platform from an expensive Kubernetes wrapper—and how to build the former.
Running large language model inference workloads on Kubernetes at scale—the infrastructure problems that emerge past the prototype stage.
Lessons from building and running three custom operators in production, including the parts the tutorials skip.
How we standardized cluster configuration across a multi-cloud estate using Flux v2 and a monorepo approach.
Topic
LLM Inference
1 entry
When someone first asks to run an LLM on Kubernetes, the instinct is reasonable: it's a containerised workload, it exposes an HTTP API, it needs to scale. Deploy it like anything else — a `Deployment`, a `Service`, maybe an `HPA` on CPU. That instinct gets you surprisingly far. Until it doesn't.
Topic
llmops
1 entry
Running large language model inference workloads on Kubernetes at scale—the infrastructure problems that emerge past the prototype stage.
Topic
operators
1 entry
Lessons from building and running three custom operators in production, including the parts the tutorials skip.
Topic
platform-engineering
1 entry
What separates a good internal developer platform from an expensive Kubernetes wrapper—and how to build the former.
Topic
progressive-delivery
1 entry
Implementing canary deployments and automated rollbacks using Argo Rollouts with Prometheus analysis templates.