When someone first asks to run an LLM on Kubernetes, the instinct is reasonable: it's a containerised workload, it exposes an HTTP API, it needs to scale. Deploy it like anything else: a `Deployment`, a `Service`, maybe an `HPA` on CPU. That instinct gets you surprisingly far. Until it doesn't.
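
To make that concrete, here's roughly what the instinctive first attempt looks like. It's a sketch, not a recommendation: the vLLM image, model name, and resource figures are stand-ins, but the shape is the familiar one, a stateless `Deployment` behind a `Service`, with an `HPA` keyed to CPU utilisation.

```yaml
# The "deploy it like anything else" starting point.
# Image, model, and resource figures are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: llm-server
  template:
    metadata:
      labels:
        app: llm-server
    spec:
      containers:
        - name: server
          image: vllm/vllm-openai:latest  # pin a real tag in practice
          args: ["--model", "meta-llama/Llama-3.1-8B-Instruct"]
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1
---
apiVersion: v1
kind: Service
metadata:
  name: llm-server
spec:
  selector:
    app: llm-server
  ports:
    - port: 80
      targetPort: 8000
---
# Scaling on CPU: the first thing that will turn out not to work,
# since GPU-bound inference barely moves the CPU needle.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: llm-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: llm-server
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```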