Infrastructure & Platform Engineer
Engineers in this role architect and operate the systems that power AI research and product development at scale. They design distributed infrastructure for training, serving, and orchestrating AI workloads across GPU clusters, build internal platforms that improve developer velocity, and optimize the critical path from code to production. The role pairs deep systems engineering expertise (in areas such as Kubernetes, build systems, data pipelines, and performance tuning) with the unique demands of AI workloads. It combines hands-on infrastructure work with close collaboration with researchers and product teams to eliminate the bottlenecks that slow innovation.
Skills
What companies are looking for in this role.
Designing and building internal developer platforms and tooling to improve software delivery workflows
Architecting and maintaining continuous integration and continuous deployment infrastructure at scale
Building and operating cloud infrastructure including networking, compute, and storage systems
Designing distributed systems for high availability, failover, and multi-region deployment
Implementing infrastructure as code and automation frameworks for configuration management
Managing and optimizing large-scale production systems and infrastructure
Building self-service platforms and abstractions that reduce operational friction
Optimizing infrastructure performance and cost efficiency at scale
Building observability and monitoring platforms for large-scale systems
Conducting root cause analysis and driving long-term infrastructure improvements
Designing and maintaining data platforms including batch and streaming pipelines
Designing security, isolation, and compliance layers for multi-tenant systems
Designing progressive delivery strategies including canary deployments and automated rollbacks
Architecting machine learning model deployment and serving infrastructure
Implementing network automation and orchestration systems for enterprise infrastructure
Building event-driven and self-healing automation workflows
Implementing intent-based configuration systems and declarative infrastructure patterns
Collaborating with cross-functional teams to identify and resolve infrastructure bottlenecks
Mentoring and leading engineering teams on infrastructure best practices
Establishing operational standards, SLAs, and incident response procedures
Open Jobs
612 open Infrastructure & Platform Engineer jobs across 102 companies.
Other Engineering roles
General-purpose software engineering roles focused on building and maintaining software systems. Covers generalist SWE positions that don't clearly fall into frontend, backend, fullstack, or other specialized tracks.
Engineers focused on server-side systems, APIs, services, and data processing pipelines. Includes roles explicitly labeled as backend or server-side development.
Engineers specializing in user-facing interfaces, web applications, and client-side development. Includes UI/UX engineering and web development roles.
Engineers working across the entire application stack, handling both frontend and backend responsibilities.
Engineers embedded with customers or deployed on-site to solve domain-specific technical problems. Combines engineering skills with direct client interaction.