Amazon

Data Engineer II, QuBIT

Bellevue, WA, USA


Description

What if AI could query your data warehouse and actually understand what the numbers mean, not just return rows? That's the infrastructure we're building, and we need a data engineer to help us scale it.

We've built a semantic layer that sits between raw operational data and AI agents, encoding metric definitions, business logic, entity relationships, data lineage, and query routing into structured knowledge that large language models can consume and reason about. The foundation is in place. Now we need someone to deepen the data models, expand entity coverage, enrich the ontology with causal relationships, and build the pipeline infrastructure that keeps it all fresh and accurate at scale.

In this role, you'll design and maintain the data infrastructure that powers AI-driven analytics for workforce Learning across Amazon's fulfillment network. That means building SQL pipelines in Redshift that process millions of daily records from nine upstream platforms, defining entity schemas with join keys, primary keys, and PII classifications, writing metric definitions with traceable formulas grounded in actual ETL logic, and modeling granularity levels that tell AI agents whether to query at the associate, site, or network level. You'll own the full stack from raw ingestion through transformation to semantic enrichment.

You'll also work directly with business stakeholders to translate their domain expertise into structured metadata. When a Regional Learning Manager explains that "training compliance resets weekly on Sunday" or "this site type structurally can't meet that threshold," you'll encode that context into the semantic layer so AI agents handle it correctly without human intervention. Over time, you'll push this toward a world model: not just what metrics exist, but how they relate causally, what drives them, and what happens when they change.

We're looking for someone who thinks about data infrastructure as more than pipelines and tables. You'll work with knowledge graphs, entity relationship modeling, YAML-based ontologies, vector embeddings for retrieval, and the prompt engineering that ties it all together. If you want to build the data systems that make AI genuinely useful for business decision-making, at Amazon scale, this is the role.

Basic Qualifications

- 3+ years of data engineering experience
- Bachelor's degree or above in Computer Science, Computer Engineering, Data Science, Electrical Engineering, or majors relating to these fields, or 3+ years of professional software development experience
- Experience with one or more object-oriented programming languages (e.g., Java, C/C++, Python)
- Experience in data warehouse technical architectures, data modeling, infrastructure components, ETL/ELT and reporting/analytic tools and environments, data structures and hands-on SQL coding
- Experience with Redshift, Oracle, NoSQL etc.

Preferred Qualifications

- Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, FireHose, Lambda, and IAM roles and permissions
- Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
- Knowledge of software engineering best practices across the development life cycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
- Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
- 1+ years of programming experience with at least one software programming language

