Job details

Senior Site Reliability Engineer

We are expanding our team of motivated engineers with a proven track record of delivering a best-in-class DBaaS platform, ObjectRocket. You will have the opportunity to work with a strong team of engineers on large-scale distributed enterprise systems. You will design, implement, and support complex architectures spanning hardware, software, and networking systems.

You will lead the global SRE team to provide the highest level of reliability to our customers and platform. You will drive improvement through automation and best practices. This includes responding to, mitigating, investigating, and escalating incidents when they occur. You will be responsible for stepping above day-to-day support to synthesize patterns of problems and business needs for the engineering teams. You will also be responsible for ensuring that your services' operations improve over time to enhance our business effectiveness.

Key Responsibilities:

Ensure completeness of the technical infrastructure to support system performance
Stay up to date with emerging technologies and trends in the enterprise hardware, infrastructure and networking industry
Partner with the application engineering team to ensure the stability and performance of our technology solutions
Continuously identify problems in the technology stack and processes, and drive their burndown
Follow and execute Rackspace change management processes
Participate in systems/code reviews and design sessions
Contribute to and organize a central store of knowledge
Take full ownership of the product life cycle
Participate in on-call rotation

Qualifications:

Bachelor’s degree in Computer Science or equivalent experience
8+ years of experience in information systems design/architecture/development
Strong experience in one or more of: Perl, Python, or Bash
Strong experience in one or more of: Ansible, Chef, or Salt
Strong experience working with Unix/Linux systems from kernel to shell and beyond, including system libraries, file systems, and client-server protocols. Networking knowledge, e.g. TCP/IP, UDP, ICMP, MAC addresses, IP packets, DNS, SDN, OSI layers, and load balancing.
Experience in designing, analyzing and troubleshooting large-scale distributed systems.
Intermediate knowledge of operating systems.
Familiarity with algorithms, data structures and complexity analysis.
Intermediate experience designing complex SaaS applications for cloud reliability and scalability.
Intermediate experience with cloud infrastructure automation and CI/CD pipeline design.
Expertise in operational monitoring and management tools (Sensu, Prometheus, Grafana, etc.).
Intermediate written & verbal communication skills, both highly technical and non-technical.
Ability to work closely with non-technical stakeholders and executives.
Systematic problem-solving approach coupled with a strong sense of ownership and drive.
RHCE Preferred.
Preferred:
Experience working with Object Storage systems at Petabyte scale.
Experience using and managing one or more relational databases (e.g. MySQL).
Experience with non-relational databases (preferably Redis, MongoDB).
Experience with cloud service providers (AWS, GCP, Azure, etc.).
Experience with Docker and container management systems (Swarm, Kubernetes, OpenShift, etc.).

$143,700 - $245,520 a year

The following information is required by pay transparency legislation in the following states: CA, CO, HI, NY and WA. This information applies only to individuals working in these states.

The anticipated starting pay range for Colorado is: $143,700 - $210,760.

The anticipated starting pay range for Hawaii and New York (not including NYC) is: $153,000 - $224,400.

The anticipated starting pay range for California, New York City and Washington is: $167,400 - $245,520.

Based on eligibility, compensation for the role may include variable compensation in the form of bonus, commissions, or other discretionary payments.

These discretionary payments are based on company and/or individual performance, and may change at any time.

Actual compensation is influenced by a wide array of factors including but not limited to skill set, level of experience, licenses and certifications, and specific work location.

Information on benefits offered is here.

Average salary estimate

$194,610 per year (est.)

Min: $143,700

Max: $245,520

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Senior Site Reliability Engineer, Rackspace

Are you ready to take your career to the next level as a Senior Site Reliability Engineer at Rackspace? If you’re passionate about building and maintaining top-notch DBaaS platforms like ObjectRocket, this remote position is an exciting opportunity for you! You’ll be joining a dynamic team of talented engineers committed to delivering large-scale distributed systems that keep our customers happy and our technology solutions running smoothly. Your role will encompass designing and implementing complex architectures while overseeing global SRE initiatives to maximize reliability and performance. You’ll play a crucial part in driving best practices and automation, addressing incidents effectively, and continuously improving operational efficiency. Embrace the chance to collaborate with both engineering and application teams to foster a seamless technology environment. With responsibilities that range from problem identification to leading knowledge-sharing initiatives, your expertise will directly impact our organization’s success. To thrive in this role, you should have a strong background in scripting languages such as Perl, Python, or Bash, and a solid understanding of Unix/Linux systems, networking, and cloud infrastructure. At Rackspace, we prioritize innovation and growth, so staying updated on the latest industry trends is a must. If you possess problem-solving skills coupled with a sense of ownership and dedication, we want to hear from you! Join Rackspace and help us elevate our systems to new heights while enjoying the benefits of a flexible remote work environment.

Frequently Asked Questions (FAQs) for Senior Site Reliability Engineer Role at Rackspace

What does a Senior Site Reliability Engineer do at Rackspace?

At Rackspace, a Senior Site Reliability Engineer focuses on ensuring the reliability and performance of our DBaaS platform, ObjectRocket. This role involves designing and implementing complex architectural solutions while leading initiatives that enforce best practices and automation to address incidents effectively.