Software Development Engineer, EC2 Trainium AI Infra
Amazon
Onsite (Seattle, Washington)
Mid Level
$143k - $194k/yr
Posted Today
Benefits
- Health Insurance
- Medical Insurance
- Dental Insurance
- Vision Insurance
- Prescription Insurance
- Life Insurance
- 401(k) Matching
- Paid Time Off
- Parental Leave
Perks
- Sign-on Payments
- Restricted Stock Units
Skills
- System Design
- Microservices
- AWS Native Services
- Object Oriented Design
- Distributed Systems
- C#
- C++
- Java
- Perl
- Cloud Provisioning
- System Architecture
- Code Review
- Observability
- Scaling Infrastructure
- Multi-threaded Applications
- Software Development Life Cycle
About the Role
The EC2 Infrastructure Services organization is responsible for making EC2 instances available to our customers at all times. We are a key part of what makes EC2 elastic. AI infrastructure has become central to EC2, and we are building the systems, services, and automation to operate it at scale.
The Software Development Engineer will design, build, and maintain cloud-based provisioning and recovery systems for AWS Trainium-based AI UltraServers. This role requires expertise in AWS services, system architecture, and cross-functional collaboration with Capacity Management, Hardware Engineering, and Datacenter Operations to manage AI/ML infrastructure.
Key job responsibilities
- Build and maintain scalable microservices
- Design systems that solve the business problem efficiently
- Work in environments where the technology strategy is defined but the solution design is not
- Build cloud-based solutions using AWS native services for scaling infrastructure frameworks
- Create observable systems with appropriate metrics and alarming
- Collaborate with customers and stakeholders to convert business needs into technical designs
- Participate in code reviews and technical assessments
About the team
The EC2 UltraServer Provisioning team is a high-performing engineering organization responsible for delivering AWS Trainium-based UltraServer infrastructure at scale. We manage end-to-end provisioning workflows from host ingestion through testing, repair, and recovery.
Basic Qualifications
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- 1+ years of software development engineer or related occupational experience
- 1+ years of designing and developing large-scale, multi-tiered, multi-threaded, embedded or distributed software applications, tools, systems, and services using: C#, C++, Java, or Perl experience
- 1+ years of Object Oriented Design experience
- Bachelor's degree or foreign equivalent in Computer Science, Engineering, Mathematics, or a related field
- Experience programming with at least one software programming language
Preferred Qualifications
- 3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, WA, Seattle - 143,700.00 - 194,400.00 USD annually