Independently triage, diagnose, and resolve production incidents and performance issues in Azure cloud-native workloads with urgency and precision, minimizing downtime for healthcare systems.
Own end-to-end problem management — from root cause analysis (RCA) to permanent fixes, preventive measures, and knowledge sharing to reduce recurrence.
Design, implement, and optimize cloud-native architectures on Azure using PaaS/serverless/container services (e.g., AKS, Azure Container Apps, Azure Functions, Azure API Management, Cosmos DB, Azure SQL) while adhering to healthcare best practices.
Ensure full compliance with healthcare regulations (HIPAA, HITECH, HITRUST CSF where required) by leveraging Azure-native security and governance tools: Microsoft Defender for Cloud, Azure Policy, Private Endpoints, Managed Identities, Key Vault, encryption at rest/in-transit, audit logging, and signed BAAs.
Automate infrastructure, deployments, and operations using IaC (Bicep preferred, Terraform/ARM) and GitOps/CI/CD pipelines (Azure DevOps, GitHub Actions).
Monitor, alert, and optimize Azure environments with Azure Monitor, Application Insights, Log Analytics; implement proactive health checks and chaos/resilience testing.
Lead or support refactoring/migration efforts to cloud-native patterns, improving scalability, cost-efficiency, and security for healthcare workloads (e.g., patient portals, EHR integrations, telehealth).
Collaborate remotely with client teams (dev, ops, security, compliance) while providing clear, concise updates on status, risks, and recommendations — no hand-holding required.
Proactively identify improvement opportunities (cost, performance, security) and drive implementation with minimal oversight.
Maintain detailed documentation, runbooks, and post-incident reviews to support knowledge transfer and audit readiness.
Experience
7+ years in cloud engineering, with 4+ years hands-on in Azure cloud-native environments (AKS, serverless, PaaS-heavy).
3+ years supporting production healthcare or highly regulated environments (HIPAA-eligible workloads, PHI handling).
Proven track record of independently triaging and resolving complex, time-sensitive issues in distributed systems.
Technical Skills
Expert in Azure services for cloud-native: AKS (Kubernetes), Azure Functions, Container Apps, Event Grid/Service Bus, Cosmos DB, Azure SQL, API Management.
Strong Kubernetes experience (AKS deployments, Helm, ingress, monitoring).
IaC mastery: Bicep (strongly preferred), Terraform, ARM templates.
CI/CD & automation: Azure DevOps pipelines, GitHub Actions, scripting (PowerShell, Bash, Python).
Security & compliance: Deep knowledge of HIPAA/HITECH controls in Azure (Defender for Cloud, Policy, Sentinel, encryption, RBAC, Private Link).
Observability: Azure Monitor suite, Application Insights, Log Analytics; bonus: Prometheus/Grafana.
Networking: VNet peering, NSGs, Private Endpoints, hybrid connectivity (if applicable).
Nice-to-Have
Experience with healthcare integrations (HL7/FHIR, EHR systems like Epic/Cerner).
Familiarity with Azure Government (GCC High) or sovereign clouds if relevant.
AI/ML exposure (Azure OpenAI, if HIPAA-eligible).
Certifications: AZ-305 (Azure Solutions Architect), AZ-400 (DevOps), AZ-500 (Security), CKA/CKAD, or healthcare-specific (e.g., CHIME, HITRUST).
Soft Skills & Mindset
Highly self-managed — thrives with autonomy, takes initiative, and delivers results without constant supervision.
Exceptional troubleshooting under pressure in 24/7 environments.
Strong communication: Clear incident reporting, stakeholder updates, and documentation.
Proactive, detail-oriented, with a focus on reliability and compliance in regulated settings.