Senior Site Reliability Engineer

Gorgias is the conversational AI platform for ecommerce that drives sales and resolves support inquiries. Trusted by over 15,000 ecommerce brands, Gorgias supports growing independent shops to globally recognizable brands.

Built for Shopify and powered by advanced ecommerce integrations, Gorgias's conversational AI understands your brand, tools, policies, and customers to drive personalized, 1-to-1 conversations — from editing orders and initiating returns to making product recommendations. Gorgias, where every customer interaction feels personal, support becomes sales, and conversations shape success.

About The SRE Team
We are seeking a highly skilled and experienced Senior Site Reliability Engineer (SRE) to join our team. As an SRE at Gorgias, you will play a crucial role in ensuring the reliability, scalability, and performance of our systems, enabling the seamless delivery of our products and services.

The SRE team at Gorgias maintains the core infrastructure and services that make up the heart of our product. We have the privilege of working with high-throughput systems and TB-scale data stores serving billions of queries per day, most with sub-millisecond response times.

We also design and maintain the software delivery stack, offering features such as metrics-based canary rollout strategies to all internal development teams.

We currently have a team of 4 Senior and Staff SREs operating together globally, with the aim of growing to 6 in the near term. We focus on scalable methods to provide the largest impact across the organization.

Some achievements we’re proud of:

  • Partitioned multi-TB tables in Postgres to reduce Vacuum time by 5x

  • For the partitioning, we studied the problem and the partitioning strategy, analyzed all queries to avoid bad surprises, and used Debezium and Kafka to do a live copy, completing the migration with a maintenance window of less than 20 minutes and no data loss

  • Split the PostgreSQL connection proxy into multiple pools to guarantee per-service quotas, so that sub-systems that hit the database heavily are contained and do not create a large incident blast radius

  • For the connection proxying, we had to go deeper into the backend to propose solutions: we coded part of the fix in the backend, provided the migration path, and helped teams move to the new methodology, ultimately eliminating incidents caused by DB connection starvation

  • Worked with all product-engineering teams to achieve SOC 2 certification, ran a HackerOne program, refactored our whole incident management process with Rootly for better visibility and resolution time, and improved our overall security posture

  • To keep the lights on, the team is constantly upgrading our self-hosted Postgres and RabbitMQ, alongside other critical infrastructure components, with minimal downtime and high accuracy

What You Will Do:

  • Manage multi-TB PostgreSQL clusters in the public cloud, optimize parameters, storage settings and data structure

  • Operate RabbitMQ and Redis with tens of thousands of operations per second

  • Manage 10+ full-featured GKE clusters worldwide, with 10k+ tenants

  • Adopt a new stack: Kafka, Debezium, and Apache Flink

  • Facilitate rollout strategies at scale with GitLab CI and ArgoCD

  • Roll out best practices around Kubernetes/Helm/Operators, SLIs/SLOs, Incident Management, Observability, Security, and Disaster Recovery to all Product-Engineering teams and drive adoption by them

  • Automate complex infrastructure pieces for our worldwide footprint using IaC best practices with Terraform (TF) and strong scripting in Python/Golang

What You Should Have:

  • Experience with cloud-native web systems at scale

  • Bachelor's degree in Computer Science or equivalent work experience.

  • 5+ years of experience as a Site Reliability Engineer or in a similar role, with a focus on maintaining high-performance, scalable, and reliable high-throughput web systems.

  • Proficiency in using Kubernetes for container orchestration and management.

  • 5+ years of experience with Cloud Providers (AWS, GCP) and a deep understanding of cloud services and architectures.

  • Proficient in scripting and programming languages such as Python, Bash, Go, or NodeJS.

  • Comfortable and confident in Linux systems and the command line.

  • Solid understanding of infrastructure as code (IaC) principles and experience with tools like Terraform.

  • Experience with continuous integration and deployment (CI/CD) pipelines.

  • Excellent problem-solving and troubleshooting skills.

  • Strong communication and collaboration skills with the ability to work effectively in a team environment.

Bonus Points If You Have:

  • Certification in Kubernetes (e.g., Certified Kubernetes Administrator - CKA).

  • Certification in a Cloud Provider platform (e.g., AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect).

  • Experience in managing and optimizing PostgreSQL databases.


Company Benefits and Perks

  • 🏖️ 5-week vacation plus 2 weeks RTT

  • 🤕 Paid sick leave

  • 🌏 6 weeks full remote/year

  • 🧸 Paid parental leave (16 weeks)

  • 🚊 50% of public transportation costs reimbursed

  • 💻 MacBook Pro

  • 🍽️ Personal credit card to buy lunches (we use Swile)

  • 🏥 We provide private health insurance (we use Alan)

  • 💆🏻‍♀️ Get up to €700 to set up your workstation at home (working from home should feel breezy)

  • 📚 Get up to €2000 of learning material and wellness support per year! This includes €1500 for learning material (such as books, courses, and individual coaching sessions) directly linked to your job scope, as well as a €500 wellness budget. Take advantage of these resources to grow in your role and prioritize your personal development and wellness.

  • 🥰 Every quarter, we organize an online company-wide summit to discuss where we’re going and strengthen social bonds. Once per year we organize offsite team retreats and company retreats!

More cool things to know about Gorgias... 😁

Diversity, Equity, and Inclusion at Gorgias

At Gorgias, we’re dedicated to creating a diverse, inclusive, and equitable workplace where everyone is valued. We provide equal opportunities without discrimination based on race, gender, age, disability, or any characteristic protected by law.

We also recognize that individuals from diverse backgrounds—especially women and underrepresented groups—may hesitate to apply if they don’t meet every requirement. If this role excites you and you’re eager to grow, we strongly encourage you to apply, even if you don’t check every box. You might bring something unique and valuable that we didn’t even know we needed.

If you need accommodations to participate in the application or interview process, perform essential job functions, or access other employment benefits, please contact us at accommodation@gorgias.com. Let’s grow together!

What You Should Know About Senior Site Reliability Engineer, Gorgias

At Gorgias, we are searching for a talented and motivated Senior Site Reliability Engineer (SRE) to join our dynamic team in Paris. As part of our fast-growing conversational AI platform for ecommerce, you’ll be instrumental in maintaining the reliability and performance of our systems that power exceptional customer experiences for over 15,000 ecommerce brands. You'll work with sophisticated high-throughput systems, utilizing cutting-edge technologies such as Kubernetes, PostgreSQL, and RabbitMQ to manage billions of queries daily. What’s great about being an SRE at Gorgias is that you will have the opportunity to implement scalable solutions that directly impact our success. You will optimize our multi-TB PostgreSQL clusters and facilitate rollout strategies at scale while collaborating closely with our passionate Product-Engineering teams. In this role, you will also get to automate complex infrastructure components using Terraform and scripting languages like Python or Go, so your inner coder will be delighted! Gorgias values your efforts, offering competitive benefits like five weeks of vacation, a personal work setup allowance, and opportunities for professional development. Join us in creating a unique experience that makes customer interactions feel personal and impactful!

Frequently Asked Questions (FAQs) for Senior Site Reliability Engineer Role at Gorgias
What are the main responsibilities of a Senior Site Reliability Engineer at Gorgias?

As a Senior Site Reliability Engineer at Gorgias, your primary responsibilities include managing multi-TB PostgreSQL clusters, optimizing cloud-native services, and ensuring the reliability of our systems. You will also work on scaling our infrastructure using Kubernetes, automate processes with Terraform, and collaborate across Product-Engineering teams to implement best practices in observability and incident management.

What qualifications do I need to apply for the Senior Site Reliability Engineer position at Gorgias?

To apply for the Senior Site Reliability Engineer position at Gorgias, you should have a Bachelor's degree in Computer Science or equivalent work experience, along with at least 5 years of experience in a Site Reliability Engineer role or similar. Proficiency in Kubernetes, experience with cloud providers like AWS or GCP, and strong skills in scripting languages such as Python, Bash, or Go are essential.

What tools and technologies does a Senior Site Reliability Engineer at Gorgias use?

At Gorgias, as a Senior Site Reliability Engineer, you will utilize tools and technologies including Kubernetes for container orchestration, PostgreSQL, RabbitMQ, and Redis for high-throughput operations. You will also work with CI/CD tools like GitLab and ArgoCD, employ infrastructure as code principles using Terraform, and leverage monitoring tools to ensure system reliability and performance.

What is the company culture like for Senior Site Reliability Engineers at Gorgias?

The company culture at Gorgias is collaborative, inclusive, and supportive. We emphasize teamwork, ongoing learning, and professional growth. As a Senior Site Reliability Engineer, you'll partake in regular team summits and retreats that strengthen social bonds while working on exciting projects that have a real impact on the company and our customers.

What can I expect in terms of career growth as a Senior Site Reliability Engineer at Gorgias?

At Gorgias, as a Senior Site Reliability Engineer, you can expect significant opportunities for career growth. The company encourages continuous learning with a dedicated budget for professional development. Additionally, you will have the chance to mentor junior engineers and may progressively take on leadership roles within your team as Gorgias continues to grow.

Common Interview Questions for Senior Site Reliability Engineer
Can you explain your experience with managing PostgreSQL databases?

In your response, focus on specific projects where you managed or optimized PostgreSQL databases. Highlight your understanding of indexing, query optimization, and partitioning strategies, as well as any incidents you resolved or significant improvements you achieved.

How do you ensure system reliability and minimize downtime?

Discuss your approach to reliability, mentioning techniques like redundancy, failover, automated monitoring, and incident response plans. Cite examples of how you've proactively improved system uptime and managed incidents efficiently in previous roles.

What role does Kubernetes play in your SRE practices?

Highlight your experience with Kubernetes and how you use it to manage containerized applications. Talk about deployment strategies, scaling, and how Kubernetes helps maintain application health and system stability.

Describe a challenging incident you resolved.

When answering this, provide a specific incident and walk through your thought process, steps taken to diagnose the issue, and the resolution path. Emphasize teamwork and communication throughout the incident.

How do you approach automation in infrastructure management?

Discuss your experience with Infrastructure as Code (IaC) tools like Terraform. Provide examples of how you've automated deployments or infrastructure management processes to improve efficiency and reduce human error.

What is your method for monitoring system performance?

Describe the monitoring tools you have used and how you set up alerts for performance metrics. Discuss how you analyze logs and metrics data to identify trends or potential issues before they become critical.

Can you explain your experience with CI/CD pipelines?

Share your knowledge of CI/CD practices and tools you've used. Explain how you implemented these processes to streamline deployments and improve development workflow, perhaps providing statistics on deployment success rates before and after.

How do you ensure security in your systems?

Talk about practices you adhere to for secure system design, monitoring, and incident response. Mention any certifications you hold or security frameworks you implement and how they help safeguard systems against potential threats.

What is your experience with cloud platforms?

Discuss your work with cloud platforms such as AWS or GCP, focusing on services you’ve utilized and architectural decisions made. Share experiences where your understanding of cloud architecture helped address specific scaling challenges.

How do you handle failure in a production environment?

Explain your approach to failure recovery, emphasizing the importance of postmortem analysis, learning from mistakes, and how you ensure that similar incidents are less likely to occur in the future.

Founded in 2015 by Romain Lapeyre and Alex Plugaru, Gorgias is a multi-channel help desk integrated with e-commerce merchants' (BigCommerce, Shopify Plus, and Magento) back-office. It allows merchants to manage all their customer support from a sin...

Employment type: Full-time, hybrid
Date posted: February 19, 2025
