Applied Methods

Infrastructure & Platform Engineer

Engineers in this role architect and operate the systems that power AI research and product development at scale. They design distributed infrastructure for training, serving, and orchestrating AI workloads across GPU clusters, build internal platforms that accelerate developer velocity, and optimize the critical path from code to production. The role pairs deep systems engineering expertise (Kubernetes, build systems, data pipelines, performance tuning) with the unique demands of AI workloads. It combines hands-on infrastructure work with close collaboration with researchers and product teams to remove the bottlenecks that slow innovation.

$ titles --canonical
Senior Software Engineer, Infrastructure
Software Engineer, Platform
Software Engineer, AI Platform
Open Jobs: 612
Companies Hiring: 102
$02

Skills

What companies are looking for in this role.

$ skills --core

Designing and building internal developer platforms and tooling to improve software delivery workflows (95%)
Architecting and maintaining continuous integration and continuous deployment infrastructure at scale (92%)
Building and operating cloud infrastructure including networking, compute, and storage systems (88%)
Designing distributed systems for high availability, failover, and multi-region deployment (85%)
Implementing infrastructure as code and automation frameworks for configuration management (84%)
Managing and optimizing large-scale production systems and infrastructure (82%)
Building self-service platforms and abstractions that reduce operational friction (80%)
Optimizing infrastructure performance and cost efficiency at scale (76%)
Building observability and monitoring platforms for large-scale systems (76%)
Conducting root cause analysis and driving long-term infrastructure improvements (74%)
Designing and maintaining data platforms including batch and streaming pipelines (70%)
Designing security, isolation, and compliance layers for multi-tenant systems (68%)
$ skills --emerging

Designing progressive delivery strategies including canary deployments and automated rollbacks (78%)
Architecting machine learning model deployment and serving infrastructure (72%)
Implementing network automation and orchestration systems for enterprise infrastructure (68%)
Building event-driven and self-healing automation workflows (65%)
Implementing intent-based configuration systems and declarative infrastructure patterns (62%)
$ skills --soft

Collaborating with cross-functional teams to identify and resolve infrastructure bottlenecks (79%)
Mentoring and leading engineering teams on infrastructure best practices (75%)
Establishing operational standards, SLAs, and incident response procedures (72%)
$03

Technology

The tools and technologies that define this role.

$ tech --language
Python: very high
Go: high
SQL: high
C++: moderate
$ tech --framework
Spark: high
CUDA: moderate
Flink: moderate
gRPC: moderate
Jinja2: moderate
OpenTelemetry: moderate
Gradio: low
Nornir: low
$ tech --platform
AWS: very high
Kubernetes: very high
Docker: high
EKS: high
Kafka: high
PostgreSQL: high
Azure: moderate
Cassandra: moderate
ClickHouse: moderate
Databricks: moderate
DynamoDB: moderate
Elasticsearch: moderate
GCP: moderate
Google Cloud Pub/Sub: moderate
Istio: moderate
Kinesis: moderate
MySQL: moderate
Redis: moderate
Snowflake: moderate
Hugging Face Spaces: low
$ tech --tool
Airflow: high
Ansible: high
GitHub Actions: high
Helm: high
Nginx: high
Terraform: high
Argo Workflows: moderate
ArgoCD: moderate
CloudFormation: moderate
Dagster: moderate
Datadog: moderate
dbt: moderate
Grafana: moderate
HAProxy: moderate
Kustomize: moderate
Nautobot: moderate
NetBox: moderate
Prometheus: moderate
Pulumi: moderate
Arista: low
Consul: low
Jaeger: low
Juniper: low
New Relic: low
NVIDIA Mellanox: low
Terraform Cloud: low
Vault: low
$ tech --concept
Distributed systems: very high
GPU: very high
Observability: very high
ACID: high
API Gateway: high
Autoscaling: high
Batch processing: high
Blue-green deployment: high
Caching: high
Canary deployment: high
Capacity planning: high
Cost optimization: high
Debugging: high
Disaster recovery: high
Documentation: high
ETL: high
GitOps: high
High availability: high
Incident management: high
LLM inference: high
Load balancing: high
Load testing: high
Model serving: high
NoSQL: high
On-call rotation: high
Performance profiling: high
Replication: high
REST API: high
Root cause analysis: high
Runbook: high
Sharding: high
SLA: high
Stream processing: high
VPC: high
Backpressure: moderate
Chaos engineering: moderate
Circuit breaker: moderate
Compliance: moderate
Data governance: moderate
Data lineage: moderate
Delta Lake: moderate
Edge computing: moderate
GraphQL: moderate
Iceberg: moderate
Identity management: moderate
KMS: moderate
Machine learning operations: moderate
Multi-tenancy: moderate
NFS: moderate
OCI: moderate
Proxy: moderate
Rate limiting: moderate
RBAC: moderate
RDMA: moderate
Retry logic: moderate
RFC: moderate
Routing protocol: moderate
Sandbox: moderate
SDN: moderate
Security scanning: moderate
Service mesh: moderate
SMB: moderate
Time series database: moderate
Vector database: moderate
WebSocket: moderate
ARM: low
BGP: low
BYOC: low
BYOK: low
gNMI: low
GPUDirect Storage: low
Graph database: low
MPI: low
NCCL: low
NETCONF: low
NVMe over Fabric: low
OpenConfig: low
Roofline analysis: low
Simulation: low
TPU: low
UCX: low
YANG: low
$04

Open Jobs

612 open Infrastructure & Platform Engineer jobs across 102 companies.

Snorkel AI · 1d
Senior Software Engineer - Core Services
Redwood City, CA (Hybrid); San Francisco, CA (Hybrid) · Engineering

Crusoe · 3d
Data Center Design Engineer
Tel Aviv - IL · Engineering

True Anomaly · 3d
Platform Engineer, AI (Levels I, II, III)
Denver, CO or Long Beach, CA · Engineering

True Anomaly · 3d
Senior Platform Engineer, AI
Denver, CO or Long Beach, CA · Engineering

True Anomaly · 3d
Staff Platform Engineer, AI
Denver, CO or Long Beach, CA · Engineering

Helsing · 3d
Software Engineer, Platform Engineering
Washington, DC · Engineering

Anthropic · 4d
Staff+ Software Engineer, Claude App Infrastructure
San Francisco, CA | New York City, NY | Seattle, WA · Engineering

n8n · 4d
Sr Cloud Engineer | Infrastructure & Networking | Europe remote
Berlin Office · Engineering

MongoDB · 4d
Senior Staff Engineer
Dublin · Engineering

Abnormal Security · 4d
Senior Software Engineer - Platform Engineering (Fed Ops)
Remote - USA · Engineering

True Anomaly · 5d
Staff Mission Cloud Engineer
Denver, CO or Long Beach, CA · Engineering

CoreWeave · 5d
GPU Performance Engineer
Sunnyvale, CA / Bellevue, WA · Engineering

Aaru · 5d
Software Engineer, Infrastructure
NYC · Engineering

Aaru · 1w
Software Engineer, Platform
NYC · Engineering

Waymo · 1w
Senior Systems Engineer, Depot Automation
Mountain View, CA, USA; San Francisco, CA, USA · Engineering

Anthropic · 1w
Staff+ Software Engineer, Inference Runtime
Remote-Friendly (Travel-Required) | San Francisco, CA | Seattle, WA | New York City, NY · Engineering

Nscale · 1w
Observability Platform Engineer
UK · Engineering

Anthropic · 1w
Staff Software Engineer, Inference
London, UK · Engineering

Anthropic · 1w
Senior Software Engineer, Inference
London, UK · Engineering

Rillet · 1w
Senior Software Engineer, Platform Engineering
New York City · Engineering