Rubin Observatory / SLAC | Menlo Park, CA REMOTE (US) | SRE / Software Engineer | Full-time
The Vera C. Rubin Observatory in Chile is a new world-class astronomy facility designed to create a 10-year time-lapse map of the southern sky. Featuring the world’s largest digital camera, it produces 20 GB of raw pixels per minute, and we are developing the Prompt Processing System that will distribute an alert in real time on every astrophysical object that has moved, changed, or appeared. (https://rubinobservatory.org/news/first-alerts)
This appointment is based at SLAC, a DOE-funded national laboratory that hosts the data facility and many of the 80 scientists and engineers of the worldwide Rubin Data Management team. (https://www6.slac.stanford.edu/)
Keep Rubin's Alert Stream flowing. We are hiring an SRE to own the reliability of our Prompt Processing Framework, a Kubernetes-based, event-driven system that uses KEDA, Redis Streams, Kafka, and Postgres to scale near-real-time processing. You’ll write software to improve resilience, operate and evolve core infrastructure services, and build the monitoring, alerting, and on-call practices that keep the system reliable during nightly observing operations.
Stack: Python, Kubernetes, Helm, ArgoCD, Kafka, Redis, KEDA, InfluxDB, PostgreSQL, Cassandra.
Details and apply here: ls.st/sre-ad