Job details

Site Reliability Engineer - SRE

As a Site Reliability Engineer (SRE), you will maintain and improve our platform's reliability, availability, and performance, leveraging Azure as the core cloud platform and using industry-leading tools. You will work closely with cross-functional teams to design, implement, and maintain resilient systems, automating wherever possible to streamline operations and minimize downtime. Your expertise will be instrumental in proactively identifying and resolving potential issues before they impact our customers, and you will contribute to the continuous improvement of our infrastructure and processes.

Key Responsibilities:
• Analyze reliability challenges and develop automated solutions for incident resolution.
• Work with development teams to improve applications' operational features for faster MTTD, MTTR, and auto-recovery.
• Lead the establishment of SLIs, SLOs, error budgets, and policies, and work with the respective engineers to instrument, visualize, and give peer engineers and developers greater insight into operational performance (observability).
• Identify, track, and address toil.
• Conduct post-mortems.
• Identify and implement continuous improvement in various facets of production operations.
• Offer advanced technical support for cross-product issues and incidents.
• Leverage SRE tooling to develop, implement, and deliver on the SRE mission.
• Conduct chaos testing.
• Identify, define, and implement new tools and technologies to improve the quality and efficiency of distributed platforms.
• Drive the reliability and supportability aspects of the cloud service, including change management, triage of customer escalations, remediation plans, playbooks, and automation.
• Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
• Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.

Qualifications:
• 4+ years of experience in reliability engineering
• 2+ recent years of experience with Azure systems
• Advanced knowledge of the New Relic ecosystem
• Working knowledge of monitoring and APM tools such as Azure App Insights, Grafana, and Selenium
• Knowledge of networking and of troubleshooting latency, connectivity, and performance
• Experience working with IaC using Terraform and CaC using Ansible
• Familiarity with one or more databases: SQL Server, MongoDB, and PostgreSQL
• Hands-on experience with SRE practices and with writing and running chaos engineering experiments
• Preferred experience with C#, .NET, and PowerShell or Python or Golang
• Experience with containerization
• Experience with high availability and distributed systems
• Proficiency in Linux and Windows administration, troubleshooting, and support
• Experience with Azure DevOps
• Excellent debugging skills across a variety of integrated platforms

StarCompliance Background Checks
All positions require pre-employment screening because employees may have access to highly sensitive and confidential information involving finance and compliance. Candidates must be trustworthy and have a heightened sensitivity to protecting confidential financial and professional information.
To be eligible for employment with StarCompliance, candidates must undergo a rigorous background investigation with checks including, but not limited to, criminal record history, consumer credit, employment history, qualifications, and education checks.

Equal Opportunity Employer Statement
We prohibit discrimination and harassment of any kind based on race, sex, religion, sexual orientation, national origin, disability, genetic information, pregnancy, gender identity or expression, marital/civil union/domestic partnership status, veteran status, or any other protected characteristic as outlined by country, state, or local laws.
This policy applies to all employment practices within our organisation, including hiring, recruiting, promotion, termination, layoff, recall, leave of absence, compensation, benefits, training, and apprenticeship. StarCompliance makes hiring decisions based solely on qualifications, merit, and business needs at the time. For more information, please request a copy of our Equal Opportunities Policy.

Original job Site Reliability Engineer - SRE posted on GrabJobs.
StarCompliance Glassdoor Company Review: 4.0
StarCompliance DE&I Review: No rating
CEO of StarCompliance: Jennifer Sun

We are Reputation Guardians, on a mission to make compliance simple and easy.

7 jobs
EMPLOYMENT TYPE
Full-time, on-site
DATE POSTED
August 9, 2024
