NVIDIA
Solutions Architect, Inference Deployments
Found: Today
This role is based in Santa Clara, CA.
Compensation:
$152,000 - $241,500/year
Responsibilities:
- Build inference pipelines with tools like NVIDIA Dynamo.
- Collaborate with DevOps to orchestrate disaggregated inference using Kubernetes.
- Provide mentorship and technical leadership to customers and internal teams.
Requirements:
- 5+ years in solutions architecture, including experience deploying distributed systems.
- Experience with NVIDIA Dynamo, Triton Inference Server, or TensorRT-LLM.
- BS in CS/Engineering or equivalent experience.
Tech stack:
NVIDIA Dynamo, Kubernetes, TensorRT-LLM, GPU orchestration.