Infrastructure Engineer
Infrastructure Engineers at AI companies operate the physical and systems-level infrastructure the business depends on—servers, storage arrays, networking equipment, and the Unix/Linux environments hosted on them. The day-to-day is hands-on: diagnosing hardware and firmware faults, managing warranty replacements through vendors, performing root-cause analysis on systemic issues, and maintaining the operational health of data-center and corporate-IT hardware. Cloud and infrastructure-as-code work appears in many of these jobs, but the centre of gravity is closer to traditional systems administration and data-center operations than to cloud platform engineering. These engineers typically sit within IT, infrastructure operations, or data-center teams, partnering with networking, security, and application teams to keep infrastructure running as the business scales.
Skills
What companies are looking for in this role.
Diagnosing and resolving complex hardware and firmware issues in server and data-center environments
Administering and troubleshooting Linux systems at scale from the command line
Troubleshooting complex multi-component system issues across hardware, software, and networking
Monitoring and analyzing system health, performance metrics, and equipment status
Designing and implementing infrastructure automation and scripting solutions
Implementing observability and monitoring solutions for complex distributed systems
Operating and maintaining high-performance computing clusters and distributed systems
Implementing continuous integration and continuous deployment pipelines
Managing virtual machine and containerized workload orchestration platforms
Designing configuration-as-code environments and infrastructure automation frameworks
Designing and optimizing enterprise storage systems for performance and reliability
Conducting performance benchmarking and capacity planning for infrastructure resources
Integrating multiple systems and platforms through APIs and middleware solutions
Managing asset lifecycle and infrastructure inventory tracking systems
Implementing security hardening and zero-trust principles across infrastructure
Managing GPU cluster operations and troubleshooting GPU-specific hardware issues
Optimizing infrastructure for AI and machine learning workload performance
Configuring and operating high-speed interconnect fabrics for AI workloads
Creating and maintaining technical documentation and operational runbooks
Responding to and managing critical infrastructure incidents under pressure
Mentoring junior technicians and support teams, and handling technical issues escalated from them
Leading cross-functional collaboration with infrastructure, platform, and development teams
Managing vendor relationships and processing warranty and replacement requests
Open Jobs
31 open Infrastructure Engineer jobs across 13 companies.
Other Infrastructure & IT roles
Provides end-user technical support including hardware, software, and account troubleshooting.
Designs, deploys, and maintains enterprise IT systems including identity management, SaaS platforms, device management, and business applications. The IT-facing systems engineer managing corporate technology.
Designs, implements, and maintains network infrastructure including LAN, WAN, backbone, and edge networks.
Remotely manages servers, operating systems, hypervisors, and software within data center environments. Focuses on systems administration, monitoring, patching, and troubleshooting at the OS and application layer, not physical hardware installation.
Implements and manages security infrastructure including IAM, endpoint security, SIEM, and security tooling. Operates within IT or infrastructure teams to protect the corporate environment.