Google

Engineering Manager, ML Performance

Sunnyvale, CA, USA; Kirkland, WA, USA

About the job

Like Google's own ambitions, the work of a Software Engineer goes beyond just Search. Software Engineering Managers not only have the technical expertise to take on and provide technical leadership to major projects, but also manage a team of Engineers. You not only optimize your own code but make sure Engineers are able to optimize theirs. As a Software Engineering Manager, you manage your project goals, contribute to product strategy, and help develop your team. Teams work all across the company, in areas such as information retrieval, artificial intelligence, natural language processing, distributed computing, large-scale system design, networking, security, data compression, and user interface design; the list goes on and is growing every day. Operating with scale and speed, our exceptional software engineers are just getting started, and as a manager, you guide the way. With technical and leadership expertise, you manage engineers across multiple teams and locations, manage a large product budget, and oversee the deployment of large-scale projects across multiple sites internationally.

Google’s Core Machine Learning (ML) organization is looking for an Engineering Manager to join our pioneering TPU Performance team! Our team is responsible for maximizing the speed and efficiency of Google’s custom AI chips (TPUs) for training and running massive AI/ML models.

While we have a rich 10-year history of optimizing Google’s own internal AI models, our team is entering an exciting new phase. As Google expands its focus to become a major hardware provider for the broader tech industry, we are optimization partners for both Google's internal teams and major external AI companies and foundation model builders.

The AI and Infrastructure team is redefining what’s possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide.

We're the driving force behind Google's groundbreaking innovations, empowering the development of our cutting-edge AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware, our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for Google Cloud, Google Global Networking, Data Center operations, systems research, and much more.

Individual pay is determined by factors including job-related skills, experience, and relevant education or training. US: $207,000 - $301,000 (USD) + 20% bonus target + equity + benefits. Learn more about benefits at Google.

Health, dental, vision, life, disability insurance
Retirement Benefits: 401(k) with company match
Paid Time Off: 20 days of vacation per year, accruing at a rate of 6.15 hours per pay period for the first five years of employment
Sick Time: 40 hours/year (increased to 69 hours/year for Seattle) including 5 discretionary sick days per instance
Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
Baby Bonding Leave: 18 weeks
Holidays: 13 paid days per year

Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Sunnyvale, CA, USA; Kirkland, WA, USA.

Minimum qualifications:

Bachelor’s degree or equivalent practical experience.
8 years of experience in software development.
5 years of experience leading ML design and optimizing ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine tuning).
3 years of experience in a technical leadership role.
2 years of experience in a people management or team leadership role.
Experience with ML performance analysis, benchmarking, and computer architecture.

Preferred qualifications:

Master’s degree or PhD in Engineering, Computer Science, or a related technical field.
3 years of experience working in a complex, matrixed organization involving cross-functional or cross-business projects.
Experience in ML accelerators (GPUs, TPUs) and low-level kernel programming/tuning using tools like CUDA, Triton, or Pallas.
Experience with compiler optimization (MLIR, OpenXLA) and integrating frameworks/serving libraries (PyTorch, JAX, vLLM) to maximize hardware efficiency.
Ability to adapt ML models to specific hardware strengths and use performance benchmarking to guide both optimization and future hardware design.

Responsibilities

Lead a team of software engineers focused on identifying and maintaining ML training and serving benchmarks that are representative of Google production and the broader ML industry.
Deliver performance for customer launches and, in the case of third-party/open-source software (OSS) models, for the benchmark submissions the team engages in (MLCommons, InferenceX, etc.).
Use benchmarks to identify performance opportunities and drive both near-term SOTA performance (e.g., custom kernels) and out-of-the-box performance (compiler/runtime optimizations, agentic tooling, auto-sharding), directly and in collaboration with partner teams.
Participate in algorithmic innovations exploiting new TPU hardware features and model-preserving optimizations (speculative decoding, sparsity, quantization, LoRA, etc.).
Participate in co-designing TPU-friendly models that showcase model quality at performance superior to that of OSS models, which are typically designed on GPUs.
