Posted Apr 15, 2026

Data & AI Operations Specialist

The Data & AI Operations Specialist serves as the Level 3 technical lead for the Artificial Intelligence and Data Platform estate. You will be responsible for the architecture, engineering, and advanced troubleshooting of AI infrastructure, data pipelines, and MLOps lifecycles across a multi-cloud environment (Azure and OCI).

Responsibilities

AI Infrastructure & Platform Engineering
- Design & Architecture: Maintain the monitoring architecture for AI/ML platforms and configure advanced dashboards in Grafana and Azure Monitor.
- Environment Governance: Manage Azure Machine Learning (AML) workspace configurations, compute targets, and Databricks cluster lifecycles (including runtime versions and platform patching).
- Resource Optimization: Oversee GPU resource allocation, reserved capacity, and cost-performance optimization to align with FinOps goals.
- Security Integration: Ensure all AI services use private endpoints, VNET integration, and RBAC controls to protect sensitive citizen data.

Data Pipeline & ETL Management
- Pipeline Engineering: Own the design, optimization, and remediation of Azure Data Factory (ADF) and Synapse pipelines.
- Advanced Troubleshooting: Resolve complex issues related to authentication failures, data format changes, and ETL performance bottlenecks.
- SOP Leadership: Author step-by-step Standard Operating Procedures (SOPs) for the L1 NOC team to handle routine monitoring and first-line triage.

MLOps & Model Lifecycle
- Automation: Implement CI/CD pipelines for model training, testing, and deployment to AML endpoints.
- Model Reliability: Configure data drift detection thresholds and automated retraining triggers.
- Recovery Operations: Develop self-healing scripts and automated recovery runbooks for critical AI workflows.

Governance & Compliance
- Audit Management: Implement and maintain audit logging for all AI decisions and model outputs, ensuring logs flow to the SIEM/vSOC.
- Regulatory Alignment: Conduct quarterly AI governance reviews to ensure compliance with NESA standards and data privacy guidelines.

Requirements
- AI/ML Platforms: Deep expertise in Azure Machine Learning and Databricks.
- Data Integration: Proficiency in Azure Data Factory and Synapse.
- Infrastructure-as-Code (IaC): Experience with Terraform or ARM Templates for reproducible deployments.
- Observability: Ability to use Dynatrace, Grafana, and Azure Monitor for deep-tier diagnostics.
- Containerization: Knowledge of AKS, Istio Service Mesh, and KEDA.
- ITIL Mastery: Strong understanding of ITIL-aligned Incident, Change, and Problem management.
- Security Mindset: Familiarity with NESA standards and UAE data residency requirements.
- Technical Writing: Ability to draft complex SOPs and Root Cause Analysis (RCA) documents within 48 hours of an incident.
- Certifications: Microsoft Azure Data Scientist Associate or Azure AI Engineer Associate is highly preferred.
Interested in this role? Apply on iHire.