We’re a tight-knit, efficient team with a bias for action and a strong sense of ownership. Our teams have autonomy, low ego, and are trusted to drive projects end to end. We care deeply about building infrastructure for web3 use cases and collaborate across disciplines to make that happen. If you’re passionate about infrastructure that has a real impact on our users, enjoy solving hard problems, and thrive in a fast-paced environment, you’ll feel right at home.
The Engineering Operations team, including Site Reliability, works closely with engineering teams across Edge & Node to ensure the services we operate are reliable, performant, predictable, and secure. We are on a mission to take our service delivery to the next level.
Lead by example as a hands-on technical contributor, participating in on-call rotations, incident response, and the day-to-day work of the SRE team
Partner with engineering and product leadership to shape roadmaps, define team priorities, and plan work that improves reliability, performance, and scalability across the stack
Work alongside and support other SREs, using your leadership and soft skills to foster a culture of continuous learning, blameless retrospectives, and technical excellence
Own the incident lifecycle, including root cause analysis and follow-up remediation, and work to make our systems increasingly self-healing
Drive SRE team strategy, advocating for industry best practices, standardization, and secure and optimized infrastructure
Architect and improve core infrastructure services, with an eye toward high availability, fault tolerance, performance, and end-to-end observability
Work across teams to challenge assumptions, fundamentally overhaul our systems, and improve documentation
Collaborate with external partners and vendors as needed to ensure the health of critical services
Proven experience as a senior or lead SRE or DevOps engineer, ideally having led large-scale reliability initiatives or infrastructure transformation projects
Strong project or technical leadership skills, with a track record of guiding teammates and setting technical direction while still remaining hands-on
Deep knowledge of the SRE/DevOps domain, including incident response, security awareness, maintaining SLAs and uptime guarantees, observability, supporting internal development teams, project and capacity planning, and/or system architecture
Experience with both cloud and on-prem core infrastructure, ideally with Google Cloud Platform (GCP), bare-metal infrastructure, and Kubernetes (or similar orchestration tools)
Fluency in infrastructure as code, Terraform, automation tooling, CI/CD pipelines, and system monitoring solutions such as Grafana
Excellent interpersonal, leadership, and communication skills, with the ability to align stakeholders and motivate and unblock team members
Experience in web3, crypto, or blockchain is a plus (but not required)
_____
About The Graph
The Graph is the indexing and query layer of the decentralized internet. As the first open data marketplace to introduce and standardize subgraphs, The Graph is a flagship solution for accessing blockchain data across web3.
Since launching in 2018, tens of thousands of developers have built subgraphs to power dapps across 90+ blockchains. As demand for web3 data grows, The Graph is evolving to support a broader range of data services and query languages, expanding what’s possible with decentralized infrastructure—now and in the future.
Discover more about how The Graph is shaping the future of decentralized physical infrastructure networks (DePIN) by following The Graph on X, LinkedIn, Instagram, Facebook, Reddit, and Medium. Join the community on The Graph’s Telegram, and join technical discussions on The Graph’s Discord.
As a Tech Lead for Site Reliability Engineering (SRE) at Edge & Node, you will join a team focused on building The Graph, a decentralized protocol designed to optimize access to the world's knowledge and information. Your expertise will be crucial in enhancing our infrastructure to support web3 applications. In this hands-on role, you'll lead by example in incident response and on-call rotations while driving reliability efforts and technical excellence across the organization. You will help shape project roadmaps alongside engineering and product leadership so that we deliver reliable and performant services. You will support other SREs with your technical insight and champion a culture of learning and continuous improvement, where blameless retrospectives help us grow stronger. Our SREs architect for high availability, fault tolerance, and observability, and strive to make our systems self-healing. We're looking for someone who is knowledgeable in SRE domains such as incident response and system architecture, adept at project leadership, and skilled with tools like Terraform and Kubernetes. If you're excited about making a real impact and fostering collaboration across teams, you'll be a great fit at Edge & Node.