Data Engineer - Battery Data Platform & AI
Apple
- Location: Onsite (Cupertino, California)
- Employment: Full-time
- Level: Senior Level
Posted 5 days ago
About the Role
Join Apple's Battery Engineering team to reinvent how engineers interact with data. You will build the data systems and AI interface that power one of the largest battery datasets, impacting the entire organization.
Skills
Python
SQL
ETL/ELT
Data Modeling
Schema Design
Snowflake
Airflow
AWS
LLM Development
MCP Server Development
Agentic Search
Context Engineering
Embeddings
Tokenization
Data Pipelines
Database Development
Full job details
What if the way an entire engineering organization worked with its data could be reinvented? On Apple's Battery Engineering team, you'll build the data systems and AI interface that battery engineers across the company rely on: reliable pipelines feeding one of the cleanest and largest battery datasets anywhere, and a natural language interface that's changing how engineers work with that data. You'll be building the platform the whole battery organization runs on. It's a rare chance to sharpen your data engineering craft and immerse yourself in applied AI at once.
Description
We're looking for a data engineer to build that platform across two tightly connected fronts.
First, you'll expand the Battery Data Warehouse (BDW), a mature, exceptionally clean dataset that spans the entire battery product development lifecycle: raw materials and characterization, fabrication, performance testing, simulation and modeling, qualification, manufacturing, and field telemetry. You'll build reliable pipelines that bring this data (structured, semi-structured, and unstructured) out of disparate systems owned by teams around the world. A big part of the job is technical; an equally big part is human: earning the trust of source-system owners, opening up new integration opportunities, and establishing and enforcing the SLAs that keep BDW dependable.
Second, you'll build out BARD, the natural language interface to BDW. Done well, BARD will fundamentally change how battery engineers interact with their data, not just replacing dashboards and SQL with conversation, but pairing it with on-demand, in-line charting for real-time analysis and new ways to explore data. Think of it as giving every engineer their own personal data scientist. You'll engineer the full agentic stack: our custom MCP server, agentic search, domain knowledge, tool design, evals, and the end-to-end user experience.
The role combines data engineering and AI engineering work. It calls for someone who's both highly self-directed and an exceptional collaborator. You'll take real ownership and drive projects forward, while staying closely aligned with the team and our broader direction.
Minimum Qualifications
- BS in Computer Science, Engineering, or a related field
- Experience with Python, SQL, and at least one other high-level programming language
- Experience building production data pipelines (ETL/ELT)
Preferred Qualifications
- MS in Computer Science, Engineering, or a related field with 10+ years of relevant industry experience
- Strong database fundamentals: data modeling, schema design, indexing, normalization, ACID, and OLTP vs. OLAP
- Hands-on database development (DML, DDL, materialized views, stored procedures); Snowflake (streams, tasks, dynamic tables) a plus
- Hands-on experience with orchestration (e.g., Airflow), batch/stream processing, and cloud platforms (e.g., AWS)
- Deep curiosity about AI and hands-on experience applying it, at work or in personal projects. You keep up with the latest tools, use AI daily (including for coding), and have strong intuition for context engineering, tokenization, embeddings, and evals, as well as a clear sense of where AI excels and where it doesn't (e.g., generating new code vs. maintaining complex existing code)
- Experience with LLM and MCP server development
- Strong communication and relationship-building skills, with the ability to align stakeholders and drive integrations across organizational boundaries
- Familiarity with batteries or other deep-tech / hardware engineering domains