Data Center Operations Manager
This role leads day-to-day and strategic operations of mission-critical data center facilities supporting AI infrastructure, managing teams of technicians and overseeing mechanical, electrical, and cooling systems across one or more sites. It differs from hands-on technician roles in its focus on regional or multi-site leadership, vendor management, preventive maintenance programs, and executive-level accountability for uptime and reliability. These managers sit within larger infrastructure operations teams at hyperscale AI cloud providers and partner closely with hardware engineering, construction, and capacity planning functions so that facilities reliably support dense GPU and compute deployments at scale.
Skills
What companies are looking for in this role.
Managing day-to-day operations of data center hardware and infrastructure across multiple geographically distributed sites
Overseeing electrical and mechanical systems including uninterruptible power supplies, power distribution units, and cooling infrastructure
Designing and implementing key performance indicator frameworks to track operational metrics such as uptime, mean time to repair, and power utilization
Managing vendor relationships and third-party operations contracts, including service level agreements and performance monitoring
Leading incident response, root cause analysis, and corrective action planning for infrastructure failures
Leading and developing operations teams across multiple sites while establishing standardized procedures and operational discipline
Ensuring regulatory compliance, safety standards, and environmental controls across mission-critical facilities
Managing hardware lifecycle including component replacement, diagnostics, and return merchandise authorization processes
Monitoring and optimizing power efficiency metrics and energy consumption across hyperscale infrastructure
Coordinating capacity delivery and infrastructure scaling initiatives aligned with business growth requirements
Designing unified monitoring and alerting systems that integrate signals from multiple infrastructure management platforms
Automating operational workflows and manual processes to improve efficiency and reduce human error
Implementing intelligent alert logic that correlates infrastructure signals to reduce alert noise and improve incident triage
Managing logistics operations including inventory control, asset tracking, and material flow in data center environments
Orchestrating seamless transitions of hardware from laboratory testing phases into large-scale production environments
Directing fiber network teams and managing high-capacity connectivity infrastructure for supercomputing clusters
Communicating operational status, challenges, and mitigation strategies to executive leadership and stakeholders
Building and maintaining strong relationships with internal teams, external vendors, and service partners
Coordinating cross-functional efforts across operations, engineering, construction, security, and facilities teams
Establishing and enforcing standardized operational procedures, playbooks, and visual management systems
Managing escalation matrices and incident governance to ensure appropriate prioritization and resolution paths
Driving continuous improvement initiatives through process optimization and performance trend analysis
Influencing stakeholders and driving decisions with limited direct authority in complex matrix environments
Managing change control processes and maintenance governance to ensure safe, structured infrastructure modifications
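Several of the skills above mention operational metrics such as uptime, mean time to repair, and power efficiency. As a rough illustration of how these are commonly computed, here is a minimal Python sketch. The `Incident` record, the function names, and the sample figures are hypothetical and not taken from any employer's tooling. Power usage effectiveness (PUE, total facility energy divided by IT equipment energy) is used here as one common power efficiency metric, which is an assumption, since the listing does not name a specific one.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """One outage, in hours since the start of the reporting period (hypothetical schema)."""
    start_hour: float
    end_hour: float

    @property
    def duration_hours(self) -> float:
        return self.end_hour - self.start_hour


def availability_pct(period_hours: float, incidents: list[Incident]) -> float:
    """Uptime as a percentage of the reporting period, assuming outages do not overlap."""
    downtime = sum(i.duration_hours for i in incidents)
    return 100.0 * (period_hours - downtime) / period_hours


def mttr_hours(incidents: list[Incident]) -> float:
    """Mean time to repair: average outage duration (0.0 when there were no incidents)."""
    if not incidents:
        return 0.0
    return sum(i.duration_hours for i in incidents) / len(incidents)


def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy over IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Illustrative numbers only: a 30-day period (720 hours) with two outages.
incidents = [Incident(100.0, 101.5), Incident(400.0, 400.5)]
print(round(availability_pct(720.0, incidents), 2))  # 99.72
print(mttr_hours(incidents))                         # 1.0
print(pue(1300.0, 1000.0))                           # 1.3
```

In practice these inputs would come from incident tickets and metered power readings, and the definitions (for example, whether planned maintenance counts as downtime) vary by operator.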
Open Jobs
45 open Data Center Operations Manager jobs across 9 companies.
Other Business Operations roles
Secures, procures, and manages power supply for AI compute infrastructure. Covers energy procurement and Power Purchase Agreements (PPAs), commercial energy development, power transaction management, load interconnection with utilities and grid operators, energy market analysis and forecasting, nuclear/renewable commercial development, and on-site/behind-the-meter generation strategy. Distinct from data center operations (which runs facilities day-to-day) and from sales (which sells software to energy companies) — the defining mandate is securing reliable, cost-effective, and increasingly clean power supply at the scale required by modern AI compute.
Oversees construction projects for data centers, offices, and other facilities from planning through completion.
Provides high-level administrative support to senior leadership, managing schedules, communications, and travel.
Drives cross-functional programs and projects from initiation through delivery.
Manages sourcing, vendor relationships, procurement processes, and supply chain logistics.