Microsoft

Principal Software Engineering Manager - Substrate efficiency

United States, Washington, Redmond

M365 Copilot inference is a high-impact engineering team advancing applied AI and large-scale machine learning across Microsoft. We design and operate the platform powering Microsoft 365 Copilot experiences, delivering intelligent capabilities to millions of users.

Our team owns one of the world’s largest AI inference platforms, operating at massive GPU scale across global datacenters. We build the core LLM API and routing services that enable low-latency, highly available AI experiences, and continuously push the boundaries of performance, scalability, and efficiency.

We are hiring a Principal Software Engineering Manager to lead a strategic initiative focused on maximizing throughput per GPU across the Copilot inference stack. In this role, you will drive inference engine efficiency by optimizing model execution and runtime performance, with the goals of improving throughput per GPU, reducing cost per query, and unlocking capacity without additional hardware investment.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities
  • Build and lead a high-performing engineering team focused on inference runtime efficiency and model execution performance.
  • Define and drive strategy to improve throughput per GPU through runtime optimizations.
  • Increase engineering agility, enabling faster experimentation, iteration, and rollout of performance improvements.
  • Partner across M365 Core, AI Core, Azure, and Microsoft Research to co-design and productionize advanced inference optimizations.
  • Establish metrics, telemetry, and experimentation frameworks to measure efficiency gains and guide investment decisions.
  • Own live-site performance, reliability, and operational excellence for inference engines at scale.
  • Drive alignment across partner teams on engine interfaces, performance goals, and optimization priorities.

Qualifications

Required Qualifications:

  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python

Other Requirements:

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

  • Master's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python

Software Engineering M6 - The typical base pay range for this role across the U.S. is USD $165,600 - $296,400 per year. A different range applies to specific work locations within the San Francisco Bay Area and the New York City metropolitan area; the base pay range for this role in those locations is USD $220,800 - $331,200 per year.
