As the Engineering Manager for Observability, you will build and lead a team responsible for the full observability stack, keeping visibility, monitoring, logging, and alerting operating seamlessly at scale. Your team will help product teams observe, monitor, and troubleshoot their services, so that our technology scales to meet the needs of our users without sacrificing performance or increasing operational costs.
You’ll work closely with product teams to ensure that observability practices are deeply integrated into development workflows. Additionally, you'll evaluate our existing observability stack, propose improvements, and lead the implementation of cost-effective and scalable solutions. Your technical depth, strong cross-functional collaboration skills, and leadership experience will guide us as we scale our infrastructure in the cloud.
In this role, you will:
Lead and grow a team of observability engineers, fostering a culture of collaboration and innovation.
Lead a team in building the observability stack, including monitoring, logging, and tracing, ensuring scalability and cost-efficiency.
Work closely with product and infrastructure teams to integrate observability tools into their development workflows.
Scale the observability infrastructure to meet the demands of fast-growing products while managing operational costs.
Ensure system reliability by identifying and addressing performance bottlenecks.
Set the strategic direction for observability tools, processes, and infrastructure, with a focus on scalability and delightful UX.
Stay updated with the latest trends in observability and cloud-native technologies, continuously seeking out improvements.
Build and maintain strong cross-functional relationships, ensuring that all product teams have visibility into their systems and services.
You might thrive in this role if you:
Have experience building and operating an observability stack from scratch, ideally in a cloud-based environment.
Are comfortable working in a fast-moving startup environment and can adapt to the pace of rapid growth.
Have technical expertise in observability tools and technologies (e.g., Datadog, Prometheus, Grafana, ELK stack).
Have a deep understanding of cloud platforms (e.g., AWS, GCP, Azure) and their role in observability.
Understand the challenges of building scalable observability backends and appreciate the importance of creating a user-friendly interface.
Have a strong track record of building and maintaining scalable systems in a cloud-based environment.
Are skilled in collaborating with cross-functional teams and have experience working with various product teams as customers.
Have a humble, coachable attitude and are eager to learn and grow as a leader.
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.
OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement
For US Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.
We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.
OpenAI Global Applicant Privacy Policy
At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
OpenAI is a US-based private research laboratory that aims to develop and direct AI. It is one of the leading artificial intelligence organizations and has developed several large AI language models, including ChatGPT.