Job details

Staff DevOps Engineer | Research Infrastructure Operations

Meet DeepL

DeepL is a global communications platform powered by Language AI. Since 2017, we’ve been on a mission to break down language barriers. Our human-sounding translations and intelligent writing suggestions are designed with enterprise security in mind. Today, they enable over 100,000 businesses to transform communications, reach new markets, and improve productivity, and they empower millions of individuals worldwide to make sense of the world and express their ideas.

Our goal is to become the global leader in Language AI, building products that drive better communication, foster connections, and make a real-life impact. To achieve this, we need talented individuals like you to join our exciting journey. If you're ready to work with a dynamic team and build your career in the fast-moving AI space, DeepL is your next destination.

What sets us apart

What sets us apart is our blend of modern technology, competitive benefits, and an open, welcoming work culture that enables our people to thrive. When we share what it's like to work at DeepL, the reactions are overwhelmingly positive. This may be because of our products that have helped countless people worldwide or our shared mission to improve communication for individuals and businesses, bringing cultures closer together. What we know for sure is this: being part of DeepL means joining a team dedicated to innovation and employee well-being. Discover what our teams have to say about life at DeepL on LinkedIn, Instagram and our Blog.

Meet the team behind this journey

Within the Infrastructure Operations and Security (IOPS) department, our data center unit manages all infrastructure systems across our remote sites. As a key member of the Research Infrastructure Operations (RIO) team, you will architect, design and operate our High-Performance Computing (HPC) infrastructure, making a fundamental contribution to our AI development.

You will work hands-on with our various Nvidia clusters, comprising thousands of GPUs. Given the scale and complexity of our workloads, the job is not just about maintaining our systems but about elevating them. You will use your expertise in tooling and automation to improve the efficiency, reliability and performance of our infrastructure, taking our operations to the next level.

In this role, you will also coordinate with on-site staff and work closely with various teams within our organization. Joining our team means becoming part of a skilled group of engineers ready to support and kick-start your journey with us.

Your responsibilities

  • Design, plan, set up, administer, maintain and troubleshoot our GPU infrastructure

  • Benchmark and optimize the performance of our GPU infrastructure systems

  • Team up with researchers and developers to troubleshoot and fine-tune applications for HPC environments

  • Work on various projects and help keep our sites in a consistent, up-to-date and optimized state, covering all aspects from firmware to architectural deployment plans

  • Support the team in case of unexpected issues and coordinate escalation to specialized teams when needed

  • Make your job easier by automating as much as possible using our advanced toolchain

  • Develop and implement custom monitoring checks to gain insights and respond to technical issues

  • Work with different hardware vendors in a top-notch, high-performance environment

About you

  • Extensive experience in management and troubleshooting of GPU compute clusters at scale

  • Proficiency in containerization and container orchestration technologies such as Docker and K8s

  • Software engineering expertise and fluency in at least one programming language, preferably Go

  • Expertise in patch and OS management at scale

  • Experienced in Linux performance benchmarking, tuning, and troubleshooting

  • Familiarity with distributed storage solutions like Lustre and Ceph

  • Knowledgeable in networking technologies and protocols, including Ethernet and ideally InfiniBand

  • Proactive and solution-oriented mindset

  • Excellent problem-solving skills

  • Initiative-driven and able to take ownership

What we offer

  • Diverse and internationally distributed team: joining our team means becoming part of a large, global community with people of more than 90 nationalities. We're more than just colleagues; we're a group of professionals with a shared mission to connect diverse cultures. Our global presence is growing–we've doubled in size nearly every year, with our employees based in the UK, Germany, the Netherlands, Poland, the US, and Japan, and we continue to expand our network.

  • Open communication, regular feedback: as a language-focused company, we value clear, honest communication. We value smooth collaboration, direct and actionable feedback, and believe that leading with empathy and a growth mindset makes us better together.

  • Hybrid work, flexible hours: we offer a hybrid work schedule, with team members coming into the office twice a week. This allows you to engage directly with your team and experience the unique energy of our workspace, while still enjoying the flexibility and comfort of working from home. With flexible working hours and trust in your productivity, we are in sync with your team’s general locations and time zones to foster effective and seamless collaboration.

  • Regular in-person team events: we bond over vibrant events that are as unique as our team, from local team and business unit gatherings, to new-joiner onboardings, to company-wide events that bring us all together–literally.

  • Monthly full-day hacking sessions: every month, we have Hack Fridays, where you can spend your time diving into a project you're passionate about and get the opportunity to work with other teams–we value your initiatives, impact, and creativity.

  • 30 days of annual leave: we value your peace of mind. With 30 days off (excluding public holidays) and access to mental health resources, we make sure you're as strong mentally as you are professionally.

  • Competitive benefits: just as our team spans the globe, so does our benefits package. We've crafted it to reflect the diversity of our team and tailored it to align with your unique location, to ensure you feel supported every step of the way.

If this role and our mission resonate with you, but you're hesitant because you don't check all the boxes, don't let that hold you back. At DeepL, it's all about the value you bring and the growth we can foster together. Go ahead, apply—let's discover your potential together. We can't wait to meet you!

We are an equal opportunity employer

You are welcome at DeepL for who you are—we appreciate authenticity here. Our product is for everyone, and so is our workplace. The more voices we have represented and amplified in our business, the more we will all succeed, contribute, and think forward! So bring us your personal experience, your perspectives, and your background. It’s in our diversity that we will find the power to break down language barriers in the world.

DeepL Glassdoor Company Review: 4.5
DeepL DE&I Review: 4.7

Average salary estimate

$100,000 / year (est.)
Min: $80,000
Max: $120,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Staff DevOps Engineer | Research Infrastructure Operations, DeepL

At DeepL, we're on a mission to revolutionize the way individuals and businesses communicate, and we're looking for a passionate Staff DevOps Engineer to join our Research Infrastructure Operations (RIO) team in Cologne. This isn’t just about maintaining existing systems; it’s about harnessing the power of our cutting-edge High-Performance Computing (HPC) infrastructure, particularly our extensive Nvidia GPU clusters, to drive innovation in AI development. You'll play a crucial role in architecting, designing, and operating infrastructure that supports linguistic transformation for our global clients. Collaborating with researchers and developers, you’ll troubleshoot applications and optimize system performance, ensuring we stay at the forefront of technology. Your role will include designing and administering GPU infrastructure, automating processes using advanced toolchains, and developing custom monitoring solutions to improve efficiency. We embrace a culture of open communication, flexible working, and regular feedback, making DeepL a place where you can truly thrive. If you’re ready to take ownership of your projects and be part of our diverse, international team dedicated to breaking down language barriers, come join us at DeepL, where your expertise will help shape the future of language AI!

Frequently Asked Questions (FAQs) for Staff DevOps Engineer | Research Infrastructure Operations Role at DeepL
What are the main responsibilities of a Staff DevOps Engineer at DeepL?

As a Staff DevOps Engineer at DeepL, your main responsibilities include designing and maintaining our GPU infrastructure, optimizing system performance, and collaborating with researchers and developers to ensure applications function smoothly in our High-Performance Computing environments. You're expected to automate tasks using advanced tools, implement monitoring checks, and troubleshoot issues efficiently as they arise.

What qualifications are required for the Staff DevOps Engineer position at DeepL?

To qualify for the Staff DevOps Engineer role at DeepL, candidates should have extensive experience in managing and troubleshooting GPU compute clusters, proficiency in containerization technologies like Docker and Kubernetes, and software engineering skills in at least one programming language, preferably Go. Familiarity with Linux performance benchmarking and distributed storage solutions is also important.

How does a Staff DevOps Engineer contribute to AI development at DeepL?

A Staff DevOps Engineer at DeepL plays a vital role in AI development by architecting and operating advanced High-Performance Computing infrastructure. By optimizing the performance of our GPU clusters and working closely with researchers, they ensure that applications are finely tuned for better performance, thus accelerating the development of our language AI products.

What kind of work culture can a Staff DevOps Engineer expect at DeepL?

At DeepL, the work culture is open, welcoming, and values clear communication. Staff DevOps Engineers can expect a hybrid work setup with flexible hours, regular team bonding events, and an emphasis on collaboration and feedback, which fosters both personal and professional growth in a diverse environment.

What career growth opportunities does DeepL offer for a Staff DevOps Engineer?

DeepL offers abundant career growth opportunities for a Staff DevOps Engineer, including engaging in innovative projects like monthly Hack Fridays, collaborating across teams, and participating in skills development workshops. As the company expands, there are numerous paths to advance your career within the organization while contributing to a mission that impacts global communication.

Common Interview Questions for Staff DevOps Engineer | Research Infrastructure Operations
Can you describe your experience with managing GPU compute clusters?

When answering this question, highlight specific projects you've worked on, the scale of the clusters you managed, and any tools or techniques you used for performance optimization or troubleshooting. Provide examples that demonstrate your problem-solving skills and your understanding of the underlying technologies.

How do you ensure the reliability and performance of High-Performance Computing systems?

Discuss the approaches you take for monitoring system performance, including any particular metrics you focus on. Explain how you use automation and custom monitoring checks to respond quickly to technical issues and maintain optimal performance.

What containerization and orchestration tools are you familiar with?

Mention the specific tools you've used, such as Docker and Kubernetes, and detail your experience in deploying applications or managing containerized environments. Discuss how these tools improved your workflow and efficiency in previous roles.

Describe a challenging technical problem you faced and how you solved it.

Choose an example that demonstrates your technical skills and critical thinking. Describe the problem, your analysis, the steps you took to resolve it, and any teamwork involved. This shows your ability to handle pressure and work collaboratively.

How would you approach automation in your work as a Staff DevOps Engineer?

Discuss specific automation tools or scripts you’ve implemented in past roles. Explain your philosophy on automation, focusing on how it can improve efficiency and reduce the chance of human error, and provide examples of what tasks you've automated successfully.

What strategies do you use for effective collaboration with teams?

Talk about your communication style and your experiences in cross-functional teams. Describe how you ensure open lines of communication, share feedback, and work collaboratively to solve complex problems, highlighting the importance of empathy.

What are your techniques for performance benchmarking in Linux systems?

Discuss the specific tools you’ve used, such as monitoring utilities like ‘top’ and ‘htop’ or specialized benchmarking and profiling software. Explain your process for interpreting data and making optimizations based on your findings.

How do you prioritize tasks in a fast-paced environment?

Share your personal strategies for effective time management, potentially discussing tools you use for task management. Explain how you assess and prioritize tasks based on urgency and impact on projects.

Can you give an example of how you’ve handled a system failure?

Provide a detailed account of a specific incident, explaining your immediate response, the steps you took to diagnose the issue, and how you communicated with stakeholders during the recovery process. Highlight what you learned and how it improved future response efforts.

What interests you about working in the language AI industry at DeepL?

Speak to your passion for technology and innovation, mentioning how DeepL's mission resonates with you personally. Share any experiences that connect your skills with the specific needs of the industry and express your enthusiasm for making a difference in communication.

Employment type: Full-time, hybrid
Date posted: March 21, 2025
