Senior Site Reliability Engineer – Cloud Automation (Oracle Health | Remote US)
Make a real-world impact at scale. Join Oracle Health to build a modern, automated healthcare platform that millions rely on. You’ll design, automate, and operate secure, highly available cloud services, improving reliability, speed, and efficiency across the platform.
What you’ll do
• Own service reliability end-to-end: architecture, production operations, and on-call response
• Build automation and self-healing systems using IaC (e.g., Terraform) and CI/CD
• Design, implement, and evolve observability (metrics, tracing, logging), SLOs, and error budgets
• Lead capacity planning, performance tuning, and cost/sustainability initiatives
• Develop tooling and services to improve scalability, availability, and developer productivity
• Partner with cross-functional teams to deliver features safely through progressive delivery (canary and blue‑green deployments)
• Drive incident response, root-cause analysis, and prevention through automation
• Prototype and standardize platform services and best practices across teams