Nvidia
AI and Systems Software Intern, At Scale AI - Fall 2026
Found: Today
This internship is based in Santa Clara, CA.
Compensation:
$20 - $71/hour
Responsibilities:
- Investigate failures in large-scale compute clusters, analyzing logs and telemetry.
- Assist in tracking reliability metrics to drive infrastructure improvements.
- Collaborate with a mentor to learn hardware validation and debugging methodologies.
Requirements:
- Pursuing a degree in Computer Science, Computer Engineering, Electrical Engineering, or related field.
- Proficiency in Python and Bash/Shell scripting.
- Debugging skills in complex, distributed systems.
Tech stack:
Python, Bash/Shell, high-performance computing environments, cluster managers (e.g., Slurm, Kubernetes).