Job details

Senior Site Reliability Engineer

Replicant was founded on the belief that machines are ready to have useful, complex conversations that will transform the way they interact with the world, starting with customer service.

As the leader in Contact Center Automation, Replicant helps companies automate their most common customer service calls while empowering agents to focus on more complex and nuanced customer challenges. Replicant's AI platform allows consumers to engage in natural conversations across voice, messaging, and other digital channels to resolve their customer support issues, without the wait, 24/7. We are now leading the way in using Large Language Models (LLMs) to transform customer service once again.

If you're excited by AI, ChatGPT, LLMs and want to make an impact with other great technologists and strong go-to-market leaders, then look no further. We've grown our team by 3x, increased revenue by 4x, and were named a top enterprise AI company by The Information. We currently serve Fortune 500 customers, run millions of AI calls per month in production, and are increasing our footprint globally.

We're searching for a skilled Site Reliability Engineer to play a crucial role in scaling the infrastructure and systems that power Replicant. As our company expands, we need your expertise to optimize how Replicant's data is managed and delivered, enhance the connectivity of our software applications, and strike the right balance between engineering autonomy and standardization. Our core technology stack includes TypeScript/NodeJS and Python within a Kubernetes environment on GCP, along with tools like Helm, Terraform, Datadog, and Prometheus.

What You'll Do

  • Ensure the smooth operation and high availability of Replicant's production systems.

  • Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency.

  • Develop and maintain tools and automation to prevent and quickly resolve incidents.

  • Collaborate with engineering teams to improve the reliability and scalability of our applications and infrastructure.

  • Participate in on-call rotation to address production issues and ensure service uptime.

  • Contribute to infrastructure design and implementation, focusing on scalability, security, and cost-effectiveness.

  • Stay up-to-date on industry best practices and emerging technologies in SRE and DevOps.

What You'll Bring

  • Proven experience in managing and troubleshooting complex, distributed systems in a production environment.

  • Strong understanding of cloud platforms (GCP preferred) and containerization technologies (Kubernetes).

  • Proficiency in scripting languages and automation tools (e.g., Python, Bash, Terraform).

  • Experience with monitoring and observability systems (e.g., Datadog, Prometheus).

  • Excellent problem-solving skills and a proactive approach to identifying and mitigating potential issues.

  • Strong communication and collaboration skills, with the ability to work effectively in a team environment.

  • A passion for ensuring the reliability and performance of critical systems.

Bonus Points

  • Experience with CI/CD pipelines and infrastructure-as-code practices.

  • Knowledge of networking concepts and protocols.

  • Familiarity with security best practices for cloud-based systems.

  • Familiarity with telephony applications.

For all full-time employees, we offer:

🏠  Remote working environment that respects time zone differences

💸  Highly competitive salaries, equity, and for US Employees, a 401(k) plan

🏥  Top of the line healthcare (medical, vision, and dental)

🏋️  Health and Wellness Perk

🖥️ Equipment Stipend

🌴  Flexible vacation policy

✈️  Amazing team trips & offsites where you can find our CEO baking bread for the team

🌺 Replicants are eligible for a 5-week sabbatical after being at the company for 4.5 years

Our Values

Replicant has three core values. It is critical that everyone who joins the team feels excited and moved by these values as every new team member makes an impact on our culture.

Blade Runners: We take ownership and pride to influence the outcomes of our goals. We are successful, and like a Blade Runner, use the tools at our disposal to reach our objectives. We value open and honest communication and proactively seek feedback along the way. We are a company driven to grow and achieve both individually and as a team.

Bread Makers: We are humble and strive toward an egalitarian culture. No task is too big or too small. We work together to achieve our goals and develop our company mission. We believe that the whole is greater than the sum of its parts in everything that we do.

Självdistans (Self-Distance): Självdistans is Swedish for self-distance. It's the ability to critically reflect on oneself and one's relations from an external perspective. With this in mind, we act with objectivity and always remember that we are not our work. There's no perfect science to growing a team or business, but we trust everyone at Replicant to point out our blind spots and humbly admit their own.

Replicant is proud to be an equal opportunity employer. We are committed to fostering an inclusive, diverse and equitable workplace that is built on trust, support and respect. We welcome all individuals and do not discriminate on the basis of gender identity and expression, race, ethnicity, disability, sexual orientation, colour, religion, creed, gender, national origin, age, marital status, pregnancy, sex, citizenship, education, languages spoken or veteran status. Accommodation is available upon request at any point during our recruitment process. If you require an accommodation, please speak to your talent acquisition partner or email us at hr@replicant.ai and we’ll work to meet your needs.

Replicant Glassdoor company review: 4.0 stars
Replicant DE&I review: 3.9 stars
CEO of Replicant: Gadi Shamia

Average salary estimate

$150,000 / year (est.)
Min: $120,000
Max: $180,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Senior Site Reliability Engineer, Replicant

Replicant is on the lookout for a talented Senior Site Reliability Engineer to join our pioneering team. As a key player in our mission to revolutionize customer service with advanced AI technology, you’ll help scale and optimize the critical infrastructure that supports our leading-edge platform. Your role will involve ensuring our production systems operate smoothly and maintaining high availability while tackling performance bottlenecks and enhancing efficiency. Collaborating closely with engineering teams, you’ll develop and maintain automation tools that prevent incidents and enable rapid resolutions. Your technical expertise in managing complex distributed systems, particularly within a Kubernetes environment on GCP, will be invaluable as we expand our global reach. With technologies like TypeScript, NodeJS, Python, Helm, Terraform, Datadog, and Prometheus at your fingertips, you’ll contribute significantly to our infrastructure's design and implementation. Additionally, you'll participate in on-call rotations to keep our service uptime stellar. As we foster a culture of open communication, humility, and personal growth, we believe your passion for maintaining system reliability will make a noteworthy impact on our dynamic work environment. If you have a proactive mindset and desire to work alongside innovative technologists at Replicant, we can’t wait to hear from you.

Frequently Asked Questions (FAQs) for Senior Site Reliability Engineer Role at Replicant
What are the main responsibilities of a Senior Site Reliability Engineer at Replicant?

As a Senior Site Reliability Engineer at Replicant, your main responsibilities will include ensuring the smooth operation of production systems, monitoring system performance for bottlenecks, implementing optimizations, and developing automation tools to prevent incidents. You will also collaborate with engineering teams to enhance reliability and scalability, participate in on-call rotations, and contribute to the infrastructure design focusing on scalability and security.

What qualifications are required for the Senior Site Reliability Engineer position at Replicant?

Candidates for the Senior Site Reliability Engineer role at Replicant should have proven experience managing complex distributed systems in production environments. A strong understanding of cloud platforms like GCP and containerization technologies such as Kubernetes is essential. Proficiency in scripting languages (like Python and Bash) and automation tools (like Terraform) is crucial to excel in this position.

What technologies does the Senior Site Reliability Engineer at Replicant work with?

At Replicant, the Senior Site Reliability Engineer will work with a technology stack that includes TypeScript, NodeJS, and Python, and will manage environments running on Kubernetes on Google Cloud Platform (GCP). This role also involves using monitoring tools such as Datadog and Prometheus, along with deployment and infrastructure tools like Helm and Terraform.

Is being on-call part of the Senior Site Reliability Engineer role at Replicant?

Yes, the Senior Site Reliability Engineer position at Replicant includes participating in an on-call rotation. This means you will be responsible for addressing production issues as they arise, ensuring that service uptime remains as high as possible and that any incidents are resolved swiftly.

What benefits does Replicant offer to full-time employees in the Senior Site Reliability Engineer position?

Replicant offers a comprehensive benefits package for full-time employees, including remote working options, competitive salaries, equity, a top-notch healthcare plan, health and wellness perks, an equipment stipend, a flexible vacation policy, team trips, and a unique 5-week sabbatical for those who have been with the company for 4.5 years.

Common Interview Questions for Senior Site Reliability Engineer
Can you explain how you ensure high availability in production systems?

In an interview for the Senior Site Reliability Engineer position at Replicant, emphasize your experience with redundancy, load balancing, and failover strategies. Outline specific tools or practices you've used to monitor uptime and performance, such as Datadog or Prometheus, and discuss any automation you implemented to address potential issues proactively.

What is your experience with cloud platforms and containerization technologies?

When answering this question, share specific projects where you used GCP and Kubernetes to manage services. Highlight how you’ve utilized these technologies to improve scalability and reduce downtime, detailing any challenges you faced and how you overcame them.

Describe a time you identified a bottleneck in a system and how you resolved it.

Prepare a specific example from your past work where you detected a performance bottleneck. Explain your thought process in analyzing system metrics, the steps you took to resolve the issue, and the outcome resulting from your actions, demonstrating your problem-solving skills.

How do you handle on-call duties and incident response?

In response to this question, focus on your approach to stress management during on-call duties. Discuss your process for documenting incidents, performing post-mortems, and updating systems to prevent recurrence, highlighting your commitment to continuous improvement.

What tools do you use for monitoring and observability?

Share your experience with monitoring tools such as Datadog and Prometheus, mentioning how you've configured them to alert on performance metrics. You could also provide an example of how these tools helped you diagnose a critical issue quickly.

Can you give an example of a CI/CD process you’ve implemented?

Give a detailed explanation of a CI/CD pipeline you've worked on. Talk about the tools you used (like Jenkins or GitLab), the integration of automated tests, and how this practice enhanced the reliability and speed of deployments.

What steps do you take to ensure security in cloud-based systems?

Discuss your understanding of cloud security best practices, mentioning specific protocols you've implemented to protect data. Talk about experiences with audits, compliance checks, and managing user access within GCP or other cloud platforms.

How do you prioritize tasks during high-pressure situations?

In your answer, focus on your approach to prioritizing issues based on impact and urgency, utilizing tools for task management during incidents. Additionally, emphasize the importance of communication and teamwork during these situations for effective resolution.

What strategies do you use for collaboration with engineering teams?

Describe how you foster collaboration with engineering teams by conducting regular meetings, leveraging collaborative tools, and creating an open dialogue for feedback. Emphasize the importance of collective ownership of system reliability.

Why do you want to work for Replicant as a Senior Site Reliability Engineer?

Share your enthusiasm for Replicant's mission of transforming customer service through AI. Connect this with your passion for reliable systems and how you believe your skills can contribute to the company's goals, showcasing your alignment with their culture and values.

Employment type: Full-time, remote
Date posted: January 7, 2025
