Nvidia

Manager, Site Reliability Engineer - DGX Cloud

India, Remote

Found: January 8, 2026

This role is based in India, with remote work options available.

What you'll do:

  • Recruit and mentor a team of Site Reliability Engineers, fostering collaboration and technical excellence.
  • Establish SRE practices, including SLOs, SLIs, and incident management processes.
  • Collaborate with engineering teams to design and deploy scalable cloud services.
  • Drive automation across the service lifecycle to eliminate toil.
  • Implement monitoring and alerting solutions for system health.
  • Oversee incident response and lead post-mortems to improve processes.

What we need to see:

  • Bachelor's or Master's degree in Computer Science or related field.
  • 10+ years of experience in Site Reliability Engineering or DevOps, with 5 years in a leadership role.
  • Experience with cloud environments (AWS, GCP, Azure) and Kubernetes.
  • Strong understanding of SRE principles and infrastructure automation tools.
  • Excellent communication and problem-solving skills.
