
Cloud Site Reliability Engineer (SRE)

Company Overview

Promise empowers utilities and government agencies to create flexible, affordable solutions for individuals struggling with debt. Our innovative approach to payment plans and relief distribution significantly improves enrollment and recovery rates, helping individuals clear debts faster and reducing delinquencies for our partners.

We treat people facing financial difficulties with respect and dignity, providing the tools and resources they need to thrive. Our team includes experts from companies like Palantir, Google, and Stripe, as well as esteemed government leaders.

Backed by over $50 million in funding from top investors such as 8VC, Kapor Capital, XYZ Ventures, and Howard Schultz, we've been recognized as one of Fast Company's "World's Most Innovative Companies of 2022."

Role Overview 

We’re looking for a Cloud Site Reliability Engineer (SRE) to build, operate, and optimize the infrastructure that powers our products. You’ll be responsible for ensuring high reliability, performance, and scalability of our cloud-based systems. The ideal candidate is self-sufficient, detail-oriented, and execution-driven, with a strong background in software development, site reliability engineering (SRE), and infrastructure-as-code (IaC).

You’ll collaborate closely with product and engineering teams to improve system architecture, troubleshoot issues, and automate operational processes. This role is ideal for someone who thrives in a hard-working, fast-moving environment, enjoys solving complex technical challenges, and takes personal responsibility for ensuring security outcomes are achieved and aligned to business goals.

What You’ll Do

  • Design, implement, and manage cloud infrastructure to ensure reliability, scalability, and security.

  • Automate infrastructure and operations using Terraform, scripting, and configuration management tools.

  • Develop strong relationships with engineering teams to define system reliability goals and best practices.

  • Troubleshoot and resolve complex network and system issues using observability tools, stack traces, and system logs.

  • Monitor and optimize system performance, implementing best practices for high availability and disaster recovery.

  • Formalize a security design review process and guide the Engineering team through it.

  • Ensure the security and stability of Linux-based production systems.

  • Provide essential support in aligning our technology projects with compliance requirements, navigating the complexities of state and federal regulations, while fostering an environment of innovation. 

  • Serve as a bridge between technical teams and non-technical stakeholders, translating security and compliance needs into actionable plans that support our broader business objectives.

What Will Enable You

  • 4+ years of experience in Linux system administration, managing large-scale production environments.

  • Strong debugging skills, with experience in performance tuning, observability, and system-level troubleshooting.

  • Hands-on experience with cloud platforms (AWS, Azure, or GCP).

  • Expertise in Infrastructure-as-Code (IaC) using Terraform or similar tools.

  • Proficiency in monitoring tools (e.g., Prometheus, Datadog) and health check implementation.

  • Experience with containerization (Docker, Podman, Kubernetes).

  • Scripting experience (Python, Bash, or equivalent) to automate infrastructure management.

  • Knowledge of networking and security best practices for cloud environments.

Promise is an equal opportunity employer and does not discriminate against any applicant or employee because of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, genetic information, age, or military or veteran status. Additionally, the Company complies with applicable state and local laws governing non-discrimination in employment in every jurisdiction in which it operates. Promise is committed to promoting diversity and inclusion in the workplace. We also provide reasonable accommodations to qualified individuals with disabilities, pregnant individuals, and those with sincerely held religious beliefs, in accordance with applicable laws.

Promise engages in US government contracts and restricts hiring to US persons, which includes US citizens and permanent residents (e.g., Green Card holders). Additionally, candidates must reside in the US.

Promise Glassdoor Company Review: 4.1 / 5
Promise DE&I Review: 2.0 / 5

Average salary estimate

$100,000 / year (est.)
Min: $80,000
Max: $120,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About the Cloud Site Reliability Engineer (SRE) Role at Promise

At Promise, we're on a mission to empower utilities and government agencies to offer flexible and affordable solutions for individuals facing financial challenges. We're currently seeking a Cloud Site Reliability Engineer (SRE) to join our dynamic team in Oakland. In this role, you'll play a crucial part in building, operating, and optimizing the cloud infrastructure that powers our innovative payment solutions. As a Cloud SRE, your primary focus will be on ensuring the reliability and performance of our systems while working collaboratively with product and engineering teams to tackle complex technical challenges. You'll get to leverage your expertise in software development and site reliability engineering, utilizing tools like Terraform and various monitoring solutions. Promise values creativity and diligence, and we are looking for someone who thrives in a fast-paced environment, enjoys automating processes, and is dedicated to maintaining the security of our cloud environments. This opportunity is perfect for detail-oriented individuals with substantial experience in Linux administration and a knack for troubleshooting systems to help us maintain our high standards and compliance requirements. If you're ready to make a meaningful impact while working alongside industry experts from Palantir, Google, and Stripe, we can't wait to meet you!

Frequently Asked Questions (FAQs) for Cloud Site Reliability Engineer (SRE) Role at Promise
What are the responsibilities of a Cloud Site Reliability Engineer at Promise?

As a Cloud Site Reliability Engineer (SRE) at Promise, you'll design, implement, and manage cloud infrastructure to ensure it meets requirements for reliability, scalability, and security. Additionally, you'll automate operations using tools like Terraform, cooperate with engineering teams to establish system reliability goals, and monitor system performance. Troubleshooting complex issues, maintaining Linux-based systems, and ensuring compliance with regulations are also key parts of your responsibilities.

What qualifications are required for the Cloud Site Reliability Engineer position at Promise?

To qualify for the Cloud Site Reliability Engineer role at Promise, candidates should have at least 4 years of experience in Linux system administration along with strong skills in debugging, performance tuning, and system-level troubleshooting. Familiarity with cloud platforms like AWS, Azure, or GCP, as well as expertise in Infrastructure-as-Code using Terraform, is essential. Proficiency in monitoring tools and scripting languages, along with knowledge of networking and security best practices, will greatly enhance your candidacy.

What tools and technologies will I use as a Cloud Site Reliability Engineer at Promise?

In the Cloud Site Reliability Engineer position at Promise, you will work with a variety of tools and technologies including Terraform for Infrastructure-as-Code, monitoring platforms like Prometheus or Datadog, and containerization technologies such as Docker and Kubernetes. You'll also utilize scripting languages such as Python and Bash to automate processes and manage our cloud environments effectively.

How does Promise support its Cloud Site Reliability Engineers in achieving their goals?

Promise is committed to fostering an environment where Cloud Site Reliability Engineers can thrive. You'll have the opportunity to collaborate closely with cross-functional teams, supporting your professional growth through exposure to diverse challenges and innovative projects. Additionally, our culture emphasizes open communication, allowing you to voice your ideas to improve system reliability and operational efficiency.

What makes Promise an attractive employer for Cloud Site Reliability Engineers?

Promise stands out as an employer for Cloud Site Reliability Engineers because of its commitment to social impact, innovation, and a collaborative work environment. With a strong team of experts from prestigious companies, ample opportunities for professional growth, and a focus on diversity and inclusion, joining Promise means being part of a mission-driven organization that appreciates and respects its employees.

Common Interview Questions for Cloud Site Reliability Engineer (SRE)
Can you explain the importance of site reliability engineering?

Certainly! Site Reliability Engineering (SRE) is critical because it bridges the gap between software development and IT operations. As a Cloud Site Reliability Engineer, your role ensures that services remain available, reliable, and performant by applying engineering principles to operations. This can directly influence user satisfaction and business success, making it an essential function.

What tools do you typically use for monitoring cloud systems?

For monitoring cloud systems, a Cloud Site Reliability Engineer might use tools like Prometheus for real-time monitoring, Datadog for performance metrics, or ELK Stack for logging and analysis. It's important to choose tools that fit your team's needs and can provide actionable insights into system performance.

How do you ensure the security of cloud infrastructure?

Ensuring the security of cloud infrastructure involves adopting best practices such as regular audits, implementing IAM roles, using encryption for data at rest and in transit, and maintaining up-to-date security patches. As a Cloud SRE, you should also conduct risk assessments and develop incident response strategies to address potential vulnerabilities.

What is Infrastructure as Code and why is it important?

Infrastructure as Code (IaC) is the practice of managing and provisioning computing infrastructure through machine-readable definition files rather than manual hardware configuration or interactive configuration tools. It's crucial because it allows for consistency, reproducibility, and scalability in system management, reducing human error while speeding up deployment and recovery efforts.
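
To make the declarative idea concrete, here is a minimal Python sketch of comparing a desired state against the actual state and deriving the changes needed. It is a simplified illustration only, not how Terraform or any other IaC tool is implemented; the `plan` function and the resource names are hypothetical.

```python
def plan(desired, actual):
    """Return the actions needed to move `actual` toward `desired`.

    Both arguments map a resource name to its settings (hypothetical shape).
    """
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical resources, for illustration only.
desired = {
    "web": {"size": "small", "count": 2},
    "db": {"size": "large", "count": 1},
}
actual = {
    "web": {"size": "small", "count": 1},
    "cache": {"size": "small", "count": 1},
}

for action, name in plan(desired, actual):
    print(action, name)
```

Because the definition is data, running the same comparison against an environment that already matches it yields no actions, which is where the consistency and reproducibility come from.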

How do you approach troubleshooting complex system issues?

When troubleshooting complex system issues, it's important to systematically gather data using observability tools and check system logs. Start by defining the problem clearly, analyzing dependencies, and applying techniques such as the Five Whys or root cause analysis to find effective solutions. Collaboration with teams can also provide valuable insights.

Can you describe your experience with cloud platforms?

I have extensive experience working with cloud platforms such as AWS, Azure, or GCP. I've managed resources, implemented services like load balancers and auto-scaling, and utilized cloud-native security features. My hands-on experience has equipped me to optimize cloud performance while ensuring high availability and minimal downtime.

What are some key performance indicators (KPIs) for site reliability?

Key performance indicators for site reliability include uptime percentage, response time, failure rate, and mean time to recovery (MTTR). Monitoring these KPIs helps ensure that systems meet reliability targets and highlight areas for improvement, ultimately enhancing user experience.
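
As a rough illustration of how two of these KPIs can be computed, the Python sketch below derives uptime percentage and MTTR from a list of outage windows. The `Incident` record, the sample dates, and the 30-day window are assumptions made for the example, and the calculation assumes incidents do not overlap.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """One outage window (hypothetical record shape)."""
    start: datetime
    end: datetime


def total_downtime(incidents):
    """Sum the duration of all incidents; assumes they do not overlap."""
    return sum((i.end - i.start for i in incidents), timedelta())


def uptime_percentage(incidents, window):
    """Share of the reporting window during which the service was up."""
    return 100.0 * (1 - total_downtime(incidents) / window)


def mean_time_to_recovery(incidents):
    """Average incident duration; zero when there were no incidents."""
    if not incidents:
        return timedelta()
    return total_downtime(incidents) / len(incidents)


# Sample data: a 30-minute and a 90-minute outage in a 30-day window.
incidents = [
    Incident(datetime(2025, 3, 1, 10, 0), datetime(2025, 3, 1, 10, 30)),
    Incident(datetime(2025, 3, 12, 2, 0), datetime(2025, 3, 12, 3, 30)),
]
window = timedelta(days=30)

print(f"uptime: {uptime_percentage(incidents, window):.3f}%")  # uptime: 99.722%
print(f"MTTR: {mean_time_to_recovery(incidents)}")  # MTTR: 1:00:00
```

In practice these figures usually come from a monitoring or incident-tracking system rather than hand-entered records, but the arithmetic is the same.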

What role does automation play in site reliability?

Automation plays a significant role in site reliability by streamlining repetitive tasks, reducing human error, and enabling faster deployments. As a Cloud Site Reliability Engineer, you'll automate operational processes such as system monitoring, incident response, and infrastructure management to improve efficiency and reliability across the board.

How do you stay updated with the latest trends in cloud technology?

Staying updated with the latest trends in cloud technology involves regularly following industry blogs, podcasts, and websites, attending webinars, and participating in professional networks. Engaging in continuous learning through courses and certifications can also help you stay informed about the evolving landscape of cloud solutions.

Why should you work with cross-functional teams as a Cloud Site Reliability Engineer?

Working with cross-functional teams is essential for a Cloud Site Reliability Engineer because it fosters collaboration and ensures that all aspects of the system—from development to operations—are aligned. This teamwork leads to faster resolution of issues, improved system design, and an overall more robust infrastructure that supports business objectives.

Employment type: Full-time, on-site
Date posted: March 19, 2025
