NVIDIA
Senior DL Algorithms Engineer - Inference Performance
Found: October 16, 2025
This role is based in Toronto, Canada.
Compensation:
116,250 CAD - 247,000 CAD depending on level.
Responsibilities:
- Implement language and multimodal model inference as part of NVIDIA Inference Microservices (NIMs).
- Contribute new features, fix bugs, and deliver production code to TensorRT-LLM (TRT-LLM), NVIDIA’s open-source inference serving library.
- Profile and analyze bottlenecks across the full inference stack to enhance performance.
- Benchmark state-of-the-art inference offerings across a range of DL models and perform competitive analysis.
- Collaborate with SW/HW co-design teams for next-gen AI-powered services.
Requirements:
- PhD in CS, EE, or equivalent experience.
- 3+ years of experience in deep learning and neural networks.
- Experience with performance profiling and optimization for GPU-based applications.
- Proficiency in C++, PyTorch, or equivalent languages and frameworks.
- Deep understanding of computer architecture and GPU fundamentals.
Preferred Qualifications:
- Experience with processor and system-level performance optimization.
- Understanding of modern LLM architectures.
- Strong fundamentals in algorithms.
- GPU programming experience (CUDA or OpenCL) is a plus.