
Posted 1 day, 2 hours ago

Fluidstack

Principal Operations Engineer, Hardware

Compensation

USD 150,000 - 250,000 yearly, depending on experience, skills, qualifications, and location
May include equity in the form of stock options.
  • Competitive total compensation package (salary + equity)
  • Retirement or pension plan, in line with local norms
  • Health, dental, and vision insurance
  • Generous PTO policy, in line with local norms

Tech stack

Linux

Location

Remote

Work setup

full-time
Principal
Remote. Willingness to travel extensively across the fleet (50–75%).

Role details

  • Operate as the most senior technical authority for the operational hardware fleet across the hyperscale AI data center portfolio
  • Serve as the technical arm of senior operations leadership in the field, leading site assessments and operational audits
  • Drive technical readiness of teams ahead of site activation
  • Review hardware platforms and integration designs from an operational lens
  • Feed operational learnings back into hardware engineering, deployment, and supply chain organizations
  • Act as a force multiplier across site hardware leads, deployment teams, and reliability engineers
  • Hold OEMs, ODMs, service vendors, and deployment partners accountable and enforce standards without burning relationships
  • Author, approve, and execute high-risk MOPs and change records in live production environments
  • Lead root cause analysis on significant hardware events and drive corrective actions to closure
  • Produce operational health assessments, RCAs, procedure reviews, and design review feedback
  • Operate as the senior technical voice across operations, hardware engineering, network, facilities, supply chain, and customer-facing teams
  • Travel extensively across the fleet (50–75%)
  • 10+ years of hands-on experience operating mission-critical hardware infrastructure
  • At least 5 years as the senior technical voice on a site, campus, or fleet
  • Deep working command of GPU systems, server platforms, storage infrastructure, firmware lifecycle management, and hardware diagnostics
  • Demonstrated ability to author, approve, and execute high-risk MOPs and change records in live production environments
  • Track record of leading root cause analysis on significant hardware events and driving corrective actions to closure
  • Track record of holding OEMs, ODMs, service vendors, and deployment partners accountable
  • Strong written communication
  • Comfort operating as the senior technical voice across operations and cross-functional teams
  • Willingness to travel extensively across the fleet (50–75%)

Application

You will receive a confirmation email once your application has successfully been accepted. If there is an error with your submission and you did not receive a confirmation email, please email careers@fluidstack.io with your resume/CV, the role you've applied for, and the date you submitted your application. Please mention the word LUXURIOUSLY and tag RODguMTk4Ljk5LjE0Mw== when applying to show you read the job post completely (#RODguMTk4Ljk5LjE0Mw==).

Company context

Make humanity more free by delivering frontier compute infrastructure for aligned AI.

Frontier compute infrastructure and hyperscale AI data center operations (design, build, and operate data centers; deliver large-scale compute faster).

Contact

careers@fluidstack.io

Description

About Fluidstack: We exist to make humanity more free. Powerful AI will be the biggest lever for human choice we've ever built, and whoever deploys frontier compute infrastructure fastest will decide whether AI expands human freedom or shrinks it. We're focused on delivering 10 to 100s of GWs of compute faster than anyone else by acquiring power, designing and building data centers, and operating them with teams spanning hardware and software.

About the Role: Fluidstack is seeking a Principal Operations Engineer, Hardware to serve as the most senior technical authority for the operational hardware fleet across a hyperscale AI data center portfolio. The role ensures that deployed GPU systems, servers, and supporting hardware are operated, maintained, and continuously improved to the workload’s standard. It operates as the technical arm of senior operations leadership: leading site assessments and operational audits, driving technical readiness ahead of site activation, reviewing hardware platforms and integration designs from an operational lens, and feeding operational learnings back into hardware engineering, deployment, and supply chain organizations as the company shifts toward a productized, repeatable build model. The role acts as a force multiplier across site hardware leads, deployment teams, and reliability engineers, connecting hardware operations, hardware engineering, network, facilities, and customer-facing teams.

The ideal candidate has spent a career operating hardware at scale in hyperscale data centers, large HPC environments, or comparable 24/7 infrastructure, and can diagnose hardware issues, lead fleet-wide root cause investigations, and push back on vendors over flawed processes. Formal engineering credentials are valued but not required.

Requirements include: 10+ years of hands-on experience operating mission-critical hardware infrastructure, with at least 5 years as the senior technical voice on a site, campus, or fleet; data center operations experience (strongly preferred); a deep working command of GPU systems, server platforms, storage infrastructure, firmware lifecycle management, and hardware diagnostics; the ability to author, approve, and execute high-risk MOPs and change records; experience leading root cause analysis on significant hardware events; a record of holding OEMs, ODMs, service vendors, and deployment partners accountable; strong written communication; comfort operating as the senior technical voice across operations and cross-functional teams; and willingness to travel extensively across the fleet (50–75%).
