Job details

Site Reliability Engineer, FedRAMP Cloud Platform (REMOTE) - 31025

Site Reliability Engineer, FedRAMP Cloud Platform - Remote

Job Description

Join us as we pursue our disruptive vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we’re committed to our work, customers, having fun and most significantly to each other’s success. Learn more about Splunk careers and how you can become a part of our journey!

Role:

Splunk's Cloud Services group is looking for a Site Reliability Engineer to help lead, design and build the next generation of our large scale cloud offering. You will be working on core services and applications that form the primitives for our current and future cloud service offerings. Site Reliability Engineers in this role will be engaging with multiple service owners across the platform to teach and implement modern interpretations of SRE, observability, Chaos Engineering and DevOps. This role is highly visible and impactful to the organization and will help shape Splunk's Engineering culture for years to come. Your job, in a nutshell, is to make every team around you better... including your own!

This is a remote role available in all US states except AK, ND, and WY. You also have the option of an office desk in some locations if that's convenient and desirable for you!

You will:
· Own Splunk Cloud in FedRAMP environments.
· Work across the organization to deliver quality products that delight Splunk's passionate users.
· Work with teams of tight-knit engineers who are building a state-of-the-art, cloud-based environment for massive-scale data processing.

Qualifications:
· You have experience or an interest in working with regulated computing environments such as FISMA and/or FedRAMP and are enthusiastic about doing it better.
· This is a fully remote, US-based/work-from-home position. You must be a US Citizen working on US soil to be considered.
· You have worked with Kubernetes, EKS, GKE or AKS and the associated ecosystems. Kubernetes certifications or an interest in obtaining these certifications are a plus, such as those from the Cloud Native Computing Foundation; Certified Kubernetes Administrator (CKA), Certified Kubernetes Application Developer (CKAD), or Certified Kubernetes Security Specialist (CKS).
· You enjoy building and running distributed systems at scale in production. You understand the challenges and trade-offs to be made when building and deploying systems to production.
· You have a good understanding of linux systems (network stack, file system, OS services) and networking (L2 vs. L3, network architecture, VLANs, etc)
· Experience with at least one programming language, preferably golang (go) or python. Knowledge of working with and automating linux systems tasks using this language is required, including working with configuration files and system services. Knowledge of common data structures and algorithms, as well as their performance characteristics is required.
· Knowledge of standard methodologies related to security, performance, and disaster recovery.
· Skilled in identifying performance bottlenecks, identifying anomalous system behavior, and resolving root cause of service issues.
· You have assembled Open Source components into cohesive services.
· You are interested in working hard to make the users of Splunk's products happier every day.

Preferred skills:
· Experience monitoring cloud environments with Splunk.
· Experience with development and deployment in a hosted cloud environment, preferably AWS, Azure or GCP. Cloud certifications are a plus or an interest in obtaining these certifications, such as AWS Certified Solutions Architect, AWS Certified DevOps Engineer, or Google Associate Cloud Engineer (ACE).
· Experience with large scale distributed cloud service development, infrastructure, traffic management and architecture.
· Experience with distributed architectures/systems with optimized and scalable software that operates on a large number of nodes.

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which the candidate is applying. For job positions in San Francisco, CA, and other locations where required, we will consider for employment qualified applicants with arrest and conviction records.

Please note: This position supports United States federal, state, and local government agency customers and is subject to certain U.S. citizenship-based restrictions imposed by law, regulation, executive order, government contract, and/or related determination by the U.S. Attorney General. As such, this position is contingent upon candidates establishing proof of U.S. citizenship status. If Splunk determines that a candidate’s citizenship status will prohibit the candidate from working in this position, Splunk expressly reserves the right to either consider the candidate for a different position that is not subject to such restrictions, on whatever terms and conditions Splunk shall establish in its sole discretion, or, in the alternative, decline to move forward with the candidate’s application.

Splunk is an Equal Opportunity Employer: At Splunk, we believe creating a culture of belonging isn’t just the right thing to do; it’s also the smart thing. We prioritize diversity, equity, inclusion, and belonging to ensure our employees are supported to bring their best, most authentic selves to work where they can thrive. Qualified applicants receive consideration for employment without regard to race, religion, color, national origin, ancestry, sex, gender, gender identity, gender expression, sexual orientation, marital status, age, physical or mental disability or medical condition, genetic information, veteran status, or any other consideration made unlawful by federal, state, or local laws. We consider qualified applicants with criminal histories, consistent with legal requirements.

Note:

Base Pay Range

SF Bay Area, Seattle Metro, and New York City Metro Area
Base Pay Range: $146,400.00 - 201,300.00 per year

California (excludes SF Bay Area), Washington (excludes Seattle Metro), Washington DC Metro, and Massachusetts
Base Pay Range: $131,760.00 - 181,170.00 per year

All other cities and states excluding California, Washington, Massachusetts, New York City Metro Area and Washington DC Metro Area.
Base Pay Range: $117,120.00 - 161,040.00 per year

Splunk provides flexibility and choice in the working arrangement for most roles, including remote and/or in-office roles. We have a market-based pay structure which varies by location. Please note that the base pay range is a guideline and for candidates who receive an offer, the base pay will vary based on factors such as work location as set out above, as well as the knowledge, skills and experience of the candidate. In addition to base pay, this role is eligible for incentive compensation and may be eligible for equity or long-term cash awards.

Benefits are an important part of Splunk's Total Rewards package. This role is eligible for a competitive benefits package which includes medical, dental, vision, a 401(k) plan and match, paid time off and much more! Learn more about our next-level benefits at https://splunkbenefits.com.
Splunk Glassdoor Company Review: 3.9 out of 5 stars
Splunk DE&I Review: No rating
CEO of Splunk: Gary Steele

Average salary estimate

$159,210 / year (est.)
Min: $117,120
Max: $201,300

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Site Reliability Engineer, FedRAMP Cloud Platform (REMOTE) - 31025, Splunk

Are you ready to take your career to the next level as a Site Reliability Engineer with Splunk on our FedRAMP Cloud Platform? This fully remote role offers you the opportunity to be part of a passionate team committed to making machine data accessible and valuable for everyone. At Splunk, we’re not just about the code; we’re about the culture and community we create while doing what we love. You’ll dive deep into a hands-on role where you’ll help lead the design and development of innovative cloud services. Your influence will be felt across various teams, empowering them with modern SRE practices, observability, and Chaos Engineering. The impact you make will help shape our engineering culture for years to come, ensuring that every service built is top-notch and user-centric. Experience the excitement of working with a tight-knit group of engineers dedicated to building scalable systems that perform beautifully under pressure. If you have a background in regulated computing environments like FedRAMP, hands-on experience with Kubernetes and a programming language like Go or Python, we want you on our team! This is your chance to work from anywhere in the U.S. (except AK, ND, and WY), with an option for office space if you prefer a traditional setting. Join us in this mission to make every product delightful for Splunk’s dedicated users because your contributions will not go unnoticed. Let’s build the future together!

Frequently Asked Questions (FAQs) for Site Reliability Engineer, FedRAMP Cloud Platform (REMOTE) - 31025 Role at Splunk
What are the primary responsibilities of a Site Reliability Engineer at Splunk?

As a Site Reliability Engineer at Splunk, your primary responsibilities will include owning Splunk Cloud within FedRAMP environments and collaborating with various teams to deliver high-quality products. Your role will involve implementing modern interpretations of SRE practices, engaging with service owners, and focusing on making every team better, including your own.

What qualifications do I need to apply for the Site Reliability Engineer position at Splunk?

To apply for the Site Reliability Engineer position at Splunk, you should have experience or a keen interest in regulated computing environments such as FISMA and FedRAMP. Experience with Kubernetes and with at least one programming language, preferably Go or Python, is required. A good understanding of Linux systems and networking is also expected, and experience monitoring cloud environments with Splunk is a preferred skill that will strengthen your application.

Is the Site Reliability Engineer position at Splunk remote, and what are the location constraints?

Yes, the Site Reliability Engineer position at Splunk is fully remote, but it is available to candidates in all U.S. states except Alaska, North Dakota, and Wyoming. This flexibility is ideal for individuals looking to balance work with their personal life while contributing to an innovative team.

What kind of environment will I be working in as a Site Reliability Engineer at Splunk?

At Splunk, as a Site Reliability Engineer, you'll be working in a dynamic, collaborative environment with a tight-knit group of engineers focused on building a state-of-the-art cloud platform. You'll engage in discussions on best practices and will take part in crafting scalable, high-performance services for massive data processing.

What benefits can I expect as a Site Reliability Engineer at Splunk?

As a Site Reliability Engineer at Splunk, you can expect a competitive benefits package that includes medical, dental, and vision coverage, a 401(k) plan with a company match, paid time off, and more. Splunk values its employees and provides various rewards and benefits to support your work-life balance.

Common Interview Questions for Site Reliability Engineer, FedRAMP Cloud Platform (REMOTE) - 31025
Can you explain site reliability engineering and its importance at Splunk?

Certainly! Site Reliability Engineering (SRE) is an integration of software engineering and systems engineering to build and run large-scale, distributed, and reliable systems. At Splunk, SRE is vital because it ensures that our services are available, performant, and scalable for our customers, allowing them to harness machine data effectively.

How do you approach incident management as a Site Reliability Engineer?

As a Site Reliability Engineer, I prioritize having a well-documented incident response plan in place. During incidents, I focus on quickly assessing the situation to mitigate impact, communicating effectively with affected teams, and conducting post-mortems afterward to identify lessons learned and improve future incident responses.

What experience do you have with Kubernetes and how have you utilized it for reliability?

I have extensive experience working with Kubernetes, managing deployments and scaling applications. I utilize Kubernetes features like auto-scaling, service discovery, and health checks to enhance system reliability and ensure seamless operational processes, allowing for quick recovery from outages.

How would you identify and resolve a performance bottleneck in a distributed system?

To identify performance bottlenecks in a distributed system, I would first profile the application using monitoring tools to gather metrics. Once a bottleneck is located, I would analyze the system architecture to understand the interactions and implement necessary adjustments, such as optimizing code, scaling out services, or caching strategies to improve performance.

What tools do you use to monitor cloud environment performance?

I frequently use tools like Prometheus and Grafana for monitoring cloud environments, along with Splunk for log management and visualization. These tools allow me to set up alerts for suspicious activity, track performance metrics, and gain insights into the overall health of the deployment.

What programming languages are you comfortable with, and how have you used them in your past roles?

I am proficient in Go and Python, which I have used extensively to automate tasks, manage configurations, and build microservices. Using these languages, I've developed scripts to streamline deployments and monitor system health, enhancing the operating efficiency of the teams I’ve worked with.

Can you explain how Chaos Engineering fits into your reliability strategy?

Chaos Engineering is integral to my approach to reliability as it helps identify weaknesses in a system under stress. By intentionally injecting failures, we can observe how our systems respond, allowing us to strengthen our resilience and recovery strategies, ensuring that our services can handle unexpected incidents.

How do you ensure that security considerations are integrated into your SRE practices?

I believe security is paramount in SRE practices. I ensure that security is embedded in the entire lifecycle of our services by following configuration management and least privilege principles. Additionally, I actively conduct security audits and engage with security-focused teams to safeguard our infrastructure.

Describe your experience with disaster recovery planning.

I have played a key role in disaster recovery planning by developing failover strategies and regularly testing them against real-world scenarios. This includes maintaining regular backups, creating operational runbooks, and implementing systems that ensure rapid restoration of services with minimal downtime.

What excites you about working at Splunk as a Site Reliability Engineer?

I'm excited about the opportunity to work with a talented team at Splunk, where innovation is valued, and collaboration drives success. The chance to contribute to cutting-edge cloud services and influence the future of our engineering culture aligns with my passion for reliability and performance optimization.


Splunk’s purpose is to build a safer and more resilient digital world.

95 jobs
Employment type: Full-time, remote
Date posted: December 18, 2024
