NVIDIA
Senior Machine Learning Engineer, Quantized Inference
Location:
Redmond, WA or Santa Clara, CA
Compensation:
$152,000 - $287,500/year
Responsibilities:
- Prototype quantization and sparsity recipes for LLM workloads.
- Design and execute post-training quantization experiments.
- Run evaluations of quantized LLM workloads at scale.
- Develop data analysis tools for debugging.
- Participate in code reviews and contribute to open-source libraries.
Requirements:
- Proficient in Python and PyTorch.
- Experience with quantization and model compression techniques.
- 3+ years in an applied ML role.
- MS or PhD in a relevant field.
Ways to stand out:
- Published work in quantization or training.
- Experience with SFT or RLHF pipelines.
- Familiarity with inference frameworks.