Google
Software Engineering Manager, Site Reliability Engineering, AI Security
About the job
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google's services (both our internally critical and our externally visible systems) have reliability, uptime appropriate to users' needs, and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.
Minimum qualifications:
- Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
- 8 years of experience with software development in one or more programming languages.
- 3 years of experience managing people or teams.
- 3 years of experience leading projects.
- 3 years of experience designing, analyzing, and troubleshooting distributed systems.
Preferred qualifications:
- Master's degree in Computer Science or Engineering.
- Experience in designing and implementing security controls for agentic workloads and AI-driven workflows.
- Strong foundation in security principles and risk management, with the ability to identify systemic risks in emerging AI/ML ecosystems.
Responsibilities
- Lead, manage, and grow a talented team of SREs dedicated to secure-by-default principles.
- Define and execute the technical goals, strategy, and roadmap for addressing top security risks across Cloud AI products and infrastructure.
- Lead by example, mentor the team and establish credibility through quality technical execution.
- Manage and take part in on-call rotations across continents, using a follow-the-sun model.
- Design, write and deliver software to improve the availability, scalability, latency and efficiency of Google's services.