Senior Software Development Engineer, EC2 Trainium AI Infra
Amazon
Onsite (Seattle, Washington)
Senior Level
$168k - $227k/yr
Posted Today
Benefits
- Health insurance
- Medical insurance
- Dental insurance
- Vision insurance
- Prescription insurance
- Life insurance
- 401(k) matching
- Paid time off
- Parental leave
Perks
- Sign-on payments
- Restricted stock units
Skills
System Architecture
Microservices
Software Development
Technical Strategy
Infrastructure Provisioning
AI/ML Infrastructure
Design Patterns
Reliability Engineering
Scaling Systems
Technical Leadership
Mentoring
Full Software Development Life Cycle
Code Reviews
Source Control Management
Build Processes
Testing
About the Role
The Software Development Engineer will lead the team in the technical strategy, design, build, and operation of infrastructure services, including the provisioning and availability of AWS Trainium-based AI servers. This role requires expertise in architecting large-scale systems, building microservices, and collaborating cross-functionally with teams such as capacity management, hardware engineering, and datacenter operations to manage AI/ML infrastructure.
Key job responsibilities
- Design and develop innovative technologies that power the infrastructure supporting AI workloads on UltraServers.
- Lead technical projects establishing EC2 as the pioneer in cloud computing for AI/ML workloads across diverse applications including LLMs, multimodal systems, and emerging model architectures.
- Collaborate with various teams to influence the architecture of provisioning systems and improve them to operate efficiently at scale.
- Build customer relationships by investigating complex performance challenges, developing solutions, and publishing actionable best practices through multiple channels.
About the team
The EC2 UltraServer Provisioning team is a high-performing engineering organization responsible for delivering AWS Trainium-based UltraServers infrastructure at scale. We manage end-to-end provisioning workflows from host ingestion through testing, repair, and recovery.
Basic Qualifications
- 5+ years of non-internship professional software development experience
- 5+ years of programming experience with at least one programming language
- 5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience as a mentor, tech lead, or leader of an engineering team
Preferred Qualifications
- 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, WA, Seattle - 168,100.00 - 227,400.00 USD annually