Microsoft
Site Reliability Engineer II
This role is based in Hyderabad, India.
Responsibilities:
- Work with all aspects of a high-throughput, multi-tenant service.
- Collaborate effectively within the team and with partner teams across Microsoft.
- Be part of the on-call rotation for maintaining service health.
- Design, implement, and refine chosen solutions in close partnership with Product Management and partner teams.
- Champion operational excellence via established metrics, process governance, and policy controls for regular assessment and improvement.
Qualifications:
Master's Degree in Computer Science or a related field with 2+ years of experience in software engineering, or a Bachelor's Degree with 4+ years of experience.
Core Responsibilities:
- System Reliability & Uptime – Ensuring high availability of services.
- Incident Management – Detecting, responding to, and mitigating system failures.
- Performance Monitoring – Tracking system health and resolving bottlenecks.
- Automation & Tooling – Reducing manual work through scripts and automation.
- Capacity Planning – Scaling infrastructure efficiently to handle demand.
- Postmortems & Continuous Improvement – Analyzing failures to prevent recurrence.