Canva
Senior Research Scientist - Reinforcement Learning, MoEs
This role is based in Vienna, Austria.
About the role:
You’ll drive research directions and lead hands-on work across the agent stack, from reward design and policy optimization to planning, memory, and tool orchestration.
Responsibilities:
- Develop agent systems for real tasks in design, vision, and language.
- Scale post-training and RL across distributed systems (PyTorch).
- Contribute to the research agenda for RL/agentic systems aligned with Canva’s product goals.
- Build reward models and learning loops.
- Develop simulation and sandbox tasks that surface failure modes.
- Help the team align on rigorous evaluation for agents.
- Collaborate with product, design, safety, and platform teams.
- Mentor teammates and present findings internally.
Requirements:
- Deep experience implementing and post-training MoEs/LLMs/VLMs/diffusion models.
- Strong experience with experimental design and reproducibility.
- Fluency in Python and PyTorch.
- Hands-on experience with policy optimization and reward modeling.
- Experience with large-scale training and cloud multimodal tooling.