Hardware Systems Engineer, NPI AI

Meta

Location: Onsite (Menlo Park, California)
Compensation: $144k - $204k/yr
Employment: Full-time
Level: Senior Level
Posted 3 days ago

About the Role

Meta is seeking a Hardware Systems Engineer to support the new product introduction (NPI) of next-generation AI and high-performance computing infrastructure for large-scale data center deployments. You will work at the intersection of AI silicon, server systems, and data center operations, partnering with cross-functional teams to validate and scale cutting-edge AI hardware systems.

Skills

Hardware Systems Engineering, NPI, AI Infrastructure, System Validation, Silicon Validation, Firmware Validation, Root Cause Analysis, PCIe, NVLink, DRAM, HBM, Linux, JTAG, GDB, Trace32, ASIC Bring-up

Perks

  • Equity

Full job details

Meta is seeking a Hardware Systems Engineer to support the new product introduction (NPI) of next-generation AI and high-performance computing infrastructure for large-scale data center deployments. In this role, you will work at the intersection of AI silicon, server systems, and data center operations, partnering with hardware design, firmware, software, networking, and capacity engineering teams to validate and scale cutting-edge AI hardware systems from early bring-up through production readiness.

Responsibilities

  • Lead end-to-end system validation strategies for AI and HPC hardware platforms, including AI accelerators, GPU clusters, and high-bandwidth memory subsystems in data center environments
  • Drive hands-on bring-up, characterization, and validation of AI server systems and associated components such as PCIe, NVLink, DRAM, and high-speed networking fabrics
  • Develop and maintain test specifications, validation procedures, and debug guides tailored to AI infrastructure NPI programs
  • Investigate and root-cause complex system failures spanning silicon, firmware, software, and hardware layers in collaboration with cross-functional engineering teams
  • Triage and track hardware and firmware defects through resolution while maintaining forward progress on NPI program milestones
  • Identify gaps in test coverage and drive improvements to test methodologies, tooling, and automation frameworks across the NPI lifecycle
  • Partner with AI platform and capacity engineering teams to define acceptance criteria and deployment readiness standards for new AI hardware systems
  • Guide data collection, analysis, and reporting efforts to surface systemic hardware quality trends and inform go/no-go decisions for production deployment
  • Communicate validation status, risk assessments, and technical findings to internal engineering teams and external hardware vendors
  • Collaborate with firmware and software teams to define hardware-software interface requirements for telemetry, diagnostics, and remote management of AI infrastructure


Minimum Qualifications

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 6+ years of experience in hardware systems engineering, silicon validation, firmware validation, or system-level bring-up for AI servers, GPUs, TPUs, or AI accelerator platforms
  • Experience in one or more of the following domains: ASIC bring-up and characterization, board-level debug, firmware validation, or large-scale system validation in data center environments
  • Experience developing test specifications, validation procedures, and debug methodologies for complex hardware systems
  • Experience leading root-cause analysis and troubleshooting of system-level failures across hardware, firmware, and software stacks
  • Experience with high-speed interconnects or memory subsystems such as PCIe, NVLink, DDR5, or HBM in the context of AI or HPC system validation


Preferred Qualifications

  • 3+ years of experience with debugging tools for SoCs including JTAG, GDB, or Trace32, and familiarity with common bus protocols such as I2C, SPI, USB, and PCIe
  • 3+ years of experience defining hardware-software interface requirements for telemetry, diagnostics, and out-of-band management in AI infrastructure deployments
  • Experience integrating lab instrumentation and automation frameworks to support large-scale NPI validation workflows
  • Proficiency in Linux environments and server system management tools used in data center operations


Compensation: $144,000/year to $204,000/year, plus bonus, equity, and benefits