NVIDIA
Principal Software Engineer, AIOps and Observability
Found: December 20, 2025
This role is based in Santa Clara, CA.
Compensation:
$248k - $391k/year
Responsibilities:
- Lead the design, development, and deployment of AIOps & Observability platforms.
- Drive the technical vision and roadmap for observability initiatives.
- Collaborate with teams to understand observability needs.
- Establish observability standards and implement new technologies.
- Work with data scientists to implement machine learning models for anomaly detection.
- Develop scalable, reliable, and distributed systems.
- Automate remediation of common issues.
Requirements:
- Bachelor’s degree in computer science or a related field.
- 15+ years of experience in product development and full-stack engineering.
- 5+ years of experience developing observability platforms.
- Experience with observability tools such as Prometheus and Grafana.
- Hands-on knowledge of AIOps tools.
- Experience with Kubernetes, Docker, and microservices.
- Proficiency in programming languages such as Go, Python, Java, and C#.