Nvidia

Senior Machine Learning Engineer, Quantized Inference

2 Locations

Found: Today

Location:

Redmond, WA or Santa Clara, CA

Compensation:

$152,000 - $287,500/year

Responsibilities:

  • Prototype quantization and sparsity recipes for LLM workloads.
  • Design and execute post-training quantization experiments.
  • Run evaluations of quantized LLM workloads at scale.
  • Develop data analysis tools for debugging.
  • Participate in code reviews and contribute to open-source libraries.

Requirements:

  • Proficient in Python and PyTorch.
  • Experience with quantization and model compression techniques.
  • 3+ years in an applied ML role.
  • MS or PhD in a relevant field.

Ways to stand out:

  • Published work in quantization or training.
  • Experience with SFT or RLHF pipelines.
  • Familiarity with inference frameworks.


Similar Big Tech Jobs - Posted in the Past 24h

Google

AI Engineer, Professional Services, Google Cloud

Austin, TX, USA; Atlanta, GA, USA; +2 more

OpenAI

Research Engineer / Machine Learning Engineer - Applied Voice

San Francisco