About the Team
The fleet team runs the GPU fleet that serves the models backing ChatGPT and the API while also supporting training workloads for our next-generation models. We manage one of the largest cutting-edge GPU fleets in the world, exposing it as a single platform for other OpenAI teams to seamlessly run production Applied AI and training workloads.
We seek to learn from deployment and distribute the benefits of AI, while ensuring that this powerful tool is used responsibly and safely. Safety is more important to us than unfettered growth.
About the Role
As a Technical Program Manager for the GPU Fleet, your role is to help make our future compute plans a reality by coordinating with engineers to bring up, maintain, and serve capacity to all of OpenAI's training and inference workloads. You will be responsible for managing and coordinating the overall body of work across many parallel programs and projects, ensuring cohesive communication and consistent alignment across all teams in platform, to all cross-functional teams, and up to leadership.
This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.
In this role, you will:
Guide the roadmap for automation for a fleet that can grow an order of magnitude in size or more.
Ensure that incoming clusters are tracked and delivered on time while providing a stable supply signal for the OpenAI fleet.
Work with Data, Scheduling, and Hardware teams to drive business metrics across multiple organizations and influence strategic initiatives.
Consistently partner with GPU users across research and applied-product infrastructure to drive high utilization and optimization opportunities.
Work with strategic partners (product engineering, inference, security, research, and finance) on product launches and big project rollouts, and build tooling to ensure that all demand is actualized into scheduled compute.
Collaborate with cross-functional (XFN) partners to build long-term, self-service tooling that allows OpenAI to seamlessly manage a growing compute fleet.
You might thrive in this role if you:
Possess a degree in a hard science, or have a demonstrated track record of engineering expertise.
Have 5+ years of experience in program management for major projects, including capital projects or hyperscaler infrastructure deployment.
Have the ability to dive into ambiguous technical problem spaces that may involve GPU Cluster and Node Lifecycle and AI/ML Platform Infrastructure.
Have a demonstrated ability to serve as the go-to person solely responsible for driving and delivering complex projects.
Are comfortable managing cross-functional and cross-company teams, and have experience driving information and decision hygiene.
Have an extensive track record of successfully delivering high-profile, technical projects against tight deadlines.
Are technically adept and have effectively partnered with engineering or fundamental research teams of the highest caliber.
Have experience interfacing with and leading external vendors, including engineering firms, equipment suppliers, and/or construction firms.
Have expertise in designing and implementing simple, scalable processes that solve complex problems.
Have experience managing complicated dependencies such as logistics and/or supply chains.
Are relentlessly resourceful and thrive in ambiguous, fast-paced environments.
Are interested in and thoughtful about the impacts of AGI.
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.
OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement
For US Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.
We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.
OpenAI Global Applicant Privacy Policy
At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
Are you an innovative Technical Program Manager looking to make an impact? OpenAI is on the lookout for someone just like you to join our Fleet Management Systems team in sunny San Francisco! As the Technical Program Manager for our GPU Fleet, you’ll work hand-in-hand with talented engineers to oversee one of the largest cutting-edge GPU infrastructures in the world. Your mission? Make our future compute plans a reality by managing numerous parallel projects that connect our research and production teams. Your role will be pivotal as you guide the roadmap for automation, ensuring we harness the power of AI responsibly and effectively. You'll collaborate with cross-functional teams to optimize our GPU and AI/ML platform infrastructure while delivering high-profile projects on time. We believe in continuous learning from deployment, and you'll be at the forefront of making that happen. If you possess a degree in a hard science or have an extensive engineering background, along with over 5 years of program management experience, this role is perfect for you. Join us to drive innovation in AI technology while ensuring it benefits humanity. Together, let’s shape the future of technology!
OpenAI is a US-based private research laboratory that aims to develop and direct AI. It is one of the leading artificial intelligence organizations and has developed several large AI language models, including those behind ChatGPT.