
Senior Site Reliability Engineer - Midnight

Who are we?

IOHK is a technology company focused on blockchain research and development. We are renowned for our scientific approach to blockchain development, emphasizing peer-reviewed research and formal methods to ensure security, scalability, and sustainability. Our projects include decentralized finance (DeFi), governance, and identity management, aiming to advance the capabilities and adoption of blockchain technology globally.

We invest in the unknown, applying our curiosity and desire for positive change to everything we do. By fueling creativity, innovation, and progress within our teams, our products and services are designed for people to be fearless, to be changemakers.

About Midnight:

IOG's Midnight Tribe is a business technology provider and core contributor to the Midnight Network, a blockchain platform for developing decentralized applications that safeguard personal and commercial data. The Midnight Network is the first blockchain to offer programmable data isolation, leveraging zero-knowledge (ZK) proofs to enable selective disclosure of what information is visible on-chain. It is designed to help developers implement necessary business policies, such as meeting regulatory requirements.

What the role involves:

As a Senior SRE, you will be a key player in shaping the reliability and performance of our systems across our cloud infrastructure. You will design and implement solutions that improve our service reliability, automate routine tasks, and facilitate smooth collaboration between development and operations teams. This role demands a blend of deep technical expertise, a proactive mindset, and the ability to take vague or evolving challenges and refine them into robust, workable solutions.

  • Infrastructure & Automation:
    • Design, build, and maintain scalable and highly available systems, primarily on AWS, using best practices.
    • Manage and optimize Kubernetes clusters for high availability and performance, extending them when it makes sense to expand functionality.
    • Leverage GitOps principles to automate deployments and manage container orchestration.
    • Implement and manage CI/CD pipelines that ensure seamless, high-quality deployments: find and remove bottlenecks, improve performance, and work alongside teams to refine feedback loops and automate toil away.
    • Develop automation tools and scripts to improve operational efficiency.

  • Monitoring & Incident Response:
    • Implement robust monitoring solutions with Prometheus and related tooling to ensure system health and performance.
    • Participate in on-call rotations and lead incident response efforts, turning challenges into learning opportunities.
    • Collaborate with dev teams to define and implement SLOs/SLIs.

  • Problem Solving & Communication:
    • Take vague or loosely defined problems, work closely with cross-functional teams, and distill them into clear, actionable plans.
    • Communicate technical solutions and incident retrospectives effectively across both technical and non-technical stakeholders.

  • Innovation & Continuous Improvement:
    • Evaluate and adopt new technologies to keep our systems at the cutting edge; blockchain experience is a particular advantage here.
    • Document processes and best practices, ensuring that knowledge is shared across the team and continuously improved.
    • Balance effective delivery of goals against a measurably high standard for those goals, always applying a layer of polish and due diligence when delivering.

Who you are:

  • 7+ years of experience in SRE, DevOps, or a related role.
  • Understanding of SRE best practices, architectures, and methods.
  • Good knowledge of resiliency patterns and cloud security.
  • Strong programming proficiency in Python, Golang, or JavaScript.
  • Rust experience is advantageous.
  • Demonstrated experience with AWS and modern cloud architectures.
  • Proficiency in Helm, Terraform, and CI/CD tools like GitHub Actions and ArgoCD.
  • Hands-on experience with Kubernetes/EKS and GitOps methodologies.
  • Proven track record with monitoring tools such as Prometheus and OpenTelemetry, as well as familiarity with the LGTM stack or other comparable tools.
  • Blockchain experience is advantageous, offering a unique perspective on distributed systems and security.
  • Exceptional problem-solving skills with a knack for translating vague requirements into clear, strategic plans.
  • Ability to engage in technical discussions and be part of the decision-making process.
  • Strong problem-solving skills and capability to work on complex systems.
  • Experience in working within an Agile environment.
  • Experience in working with a distributed team.
  • Strong communication and collaboration abilities to work seamlessly across different teams.
  • A proactive and innovative mindset, with a passion for continuous improvement and operational excellence.

Are you an IOGer?

Do you find yourself questioning the status quo? Do you tinker with ideas and long to turn those ideas into solutions? Are you able to spark thoughtful debates, bringing out the inquisitiveness in others? Does the promise of continuously growing excite you? Then get ready to reimagine everything you thought wasn’t possible because that’s what it means to be an IOGer - we don’t set limits, we break them.

  • Remote work
  • Laptop reimbursement
  • New starter package to buy hardware essentials (headphones, monitor, etc)
  • Learning & Development opportunities
  • Competitive PTO 

At IOG, we are committed to fostering a diverse and inclusive workplace where all individuals are valued and empowered to succeed. We welcome people of all backgrounds and ensure that employment decisions are based solely on merit, qualifications, and potential. Everyone is given equal opportunities regardless of race, color, religion, national origin, gender, gender identity, sexual orientation, age, marital status, veteran status, disability, or any other characteristic protected by law.

What You Should Know About Senior Site Reliability Engineer - Midnight, Io Global

Join us at IOHK as a Senior Site Reliability Engineer and become a driving force in our Midnight Tribe, where we revolutionize blockchain technology! At IOHK, we pride ourselves on our scientific approach, emphasizing rigorous research to foster groundbreaking developments like decentralized finance and identity management. In this role, you'll be crucial in enhancing the performance and reliability of our cutting-edge systems that power the Midnight Network. Your expertise will guide the design and implementation of innovative solutions across our cloud infrastructure, ensuring smooth collaboration between development and operations teams. With over 7 years of experience in SRE or DevOps roles, you'll leverage your deep understanding of cloud security, resiliency patterns, and modern architectures, particularly on AWS, to improve our services continually. Bring your problem-solving skills and creativity as you automate routine tasks, optimize Kubernetes clusters, and implement robust monitoring systems. If you're excited about the intersection of blockchain and Site Reliability Engineering and have the knack for turning vague challenges into actionable plans, then this opportunity might just be the perfect fit for you. Join us, where an inquisitive mind meets an innovative environment, and let's break limits together!

Frequently Asked Questions (FAQs) for Senior Site Reliability Engineer - Midnight Role at Io Global
What are the daily responsibilities of a Senior Site Reliability Engineer at IOHK?

As a Senior Site Reliability Engineer at IOHK, your daily responsibilities will include designing and implementing scalable cloud infrastructure solutions, managing Kubernetes clusters, and automating deployment processes using GitOps methods. Additionally, you will work collaboratively with development teams to monitor systems' performance and address incidents, ensuring all applications run reliably and efficiently.

What skills are necessary for the Senior Site Reliability Engineer position at IOHK?

To excel as a Senior Site Reliability Engineer at IOHK, candidates should possess strong programming skills in languages like Python, Golang, or JavaScript, and experience with cloud platforms like AWS. A solid understanding of SRE best practices, Kubernetes, CI/CD tools, and monitoring systems like Prometheus is essential. Previous experience in blockchain technology is a bonus but not required.

What does the career progression look like for a Senior Site Reliability Engineer at IOHK?

At IOHK, a Senior Site Reliability Engineer has ample opportunities for career progression. You can evolve into leadership roles, such as a Principal Engineer or SRE Manager, where you can impact broader strategic decisions. Additionally, you'll have the chance to lead innovative projects and mentor junior engineers, further enhancing your professional growth within the dynamic blockchain environment.

What is the work culture like for Senior Site Reliability Engineers at IOHK?

The work culture for Senior Site Reliability Engineers at IOHK is dynamic and inclusive, focusing on collaboration and innovation. Team members are encouraged to question the status quo and are supported in bringing their creative ideas to life, fostering an environment where continuous improvement and operational excellence are paramount.

How does IOHK support the ongoing learning and development of Senior Site Reliability Engineers?

IOHK is committed to the continuous learning and development of its Senior Site Reliability Engineers by providing various opportunities such as workshops, conferences, and access to educational resources. New starters also receive a reimbursement package for essential hardware, ensuring you have the tools you need to succeed in your role.

Common Interview Questions for Senior Site Reliability Engineer - Midnight
Can you describe your experience with cloud infrastructure as a Senior Site Reliability Engineer?

When discussing your experience with cloud infrastructure, focus on specific projects where you've designed or managed cloud solutions, particularly on platforms like AWS. Highlight how you've implemented these systems for reliability and performance and any automation tools you've used to manage deployments effectively.

How do you ensure system reliability and performance?

To ensure system reliability and performance, I focus on implementing robust monitoring systems and defining SLOs/SLIs with cross-team collaboration. I would discuss utilizing tools like Prometheus for monitoring and emphasize continuous testing and automated deployment processes to minimize downtime and quickly address issues.

What tools do you prefer for CI/CD and why?

I typically prefer tools like GitHub Actions and ArgoCD for CI/CD because they offer seamless integration with version control and provide a clear workflow for deployments. I would elaborate on my experiences with these tools and how they've helped streamline development processes and improve collaboration within teams.

How do you handle on-call incidents as a Senior Site Reliability Engineer?

Handling on-call incidents requires a structured approach; I prioritize the issue, quickly gather necessary information, and coordinate with relevant teams for resolution. After addressing the incident, I always conduct a retrospective to identify improvements that can prevent future occurrences.

Can you talk about your experience with Kubernetes?

Certainly! I've worked extensively with Kubernetes for container orchestration, managing deployments, scaling applications, and ensuring high availability. I'd share specific examples of how I've optimized clusters and implemented GitOps principles to streamline operations.

What strategies do you use for automating operational tasks?

I focus on identifying repetitive tasks and leveraging languages like Python or scripting to automate them. Using tools like Terraform for infrastructure as code or Helm for managing Kubernetes deployments can drastically reduce manual work and improve efficiency.

How do you approach cross-team collaboration to address SRE challenges?

I believe regular communication and shared goals are crucial for successful cross-team collaboration. I advocate for joint meetings to discuss challenges, facilitate knowledge sharing, and create an environment where all team members feel empowered to contribute their insights.

What are some best practices for implementing monitoring solutions?

Some best practices for implementing monitoring solutions include defining clear metrics that align with business objectives, ensuring proper alert configurations to avoid alert fatigue, and creating dashboards for easy visualization of system health. Continuous review of monitoring strategies is essential for adapting to changing environments.

Describe a complex problem you solved in your previous SRE role.

I encountered a situation where our application faced intermittent downtime due to resource limits. I would explain how I analyzed the system, identified bottlenecks, and suggested architectural enhancements that improved reliability and performance.

What role do you think documentation plays in Site Reliability Engineering?

Documentation is vital in Site Reliability Engineering as it facilitates knowledge sharing and helps onboard new team members efficiently. It ensures that best practices are available for everyone, ultimately leading to consistency in operations and quicker troubleshooting at all levels.

Employment type: Full-time, remote
Date posted: March 14, 2025
