Canva

Senior Research Scientist - Reinforcement Learning, MoEs

London, United Kingdom

Found: February 26, 2026

This role is based in London, United Kingdom.

Responsibilities:

Develop agent systems for real tasks in design, vision, and language.
Scale post-training and RL across distributed systems using PyTorch.
Contribute to the research agenda for RL/agentic systems aligned with Canva’s product goals.
Build reward models and learning loops.
Develop simulation tasks to identify failure modes.
Collaborate with product and design teams to implement research findings.
Mentor teammates and share findings with the community.

Requirements:

Experience with reinforcement learning and mixture-of-experts (MoE) models.
Strong proficiency in Python and PyTorch.
Hands-on experience with policy optimization and reward modeling.
Experience with large-scale training and cloud multimodal tooling.

