About the Team
OpenAI’s Inference team ensures that our most advanced models run efficiently, reliably, and at scale. We build and optimize the systems that power our production APIs, internal research tools, and experimental model deployments. As model architectures and hardware evolve, we’re expanding support for a broader set of compute platforms - for example AMD GPUs - to increase performance, flexibility, and resiliency across our infrastructure.
We are forming a team to generalize our inference stack - including kernels, communication libraries, and serving infrastructure - to alternative hardware platforms such as AMD GPUs.
About the Role
We’re hiring engineers to scale and optimize OpenAI’s inference infrastructure across emerging GPU platforms. You’ll work across the stack - from low-level kernel performance to high-level distributed execution - and collaborate closely with research, infra, and performance teams to ensure our largest models run smoothly on new hardware.
This is a high-impact opportunity to shape OpenAI’s multi-platform inference capabilities from the ground up.
In this role, you will:
Design and optimize high-performance GPU kernels for AMD accelerators using HIP, Triton, or other performance-focused frameworks (a minimal illustrative sketch follows this list).
Build and tune collective communication libraries (e.g., RCCL) used to parallelize model execution across many GPUs.
Integrate internal model-serving infrastructure (e.g., vLLM, Triton) into AMD-backed systems.
Debug and optimize distributed inference workloads across memory, network, and compute layers.
Validate correctness, performance, and scalability of model execution on large AMD GPU clusters.
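To give a flavor of the kernel-level work, here is a minimal sketch of a Triton kernel of the kind referenced above. Triton compiles the same source for NVIDIA and AMD (ROCm) backends; the function names, block size, and launch grid below are illustrative assumptions, not OpenAI's internal code.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the tensors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n_elements = out.numel()
    # Launch enough program instances to cover every element.
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n_elements, BLOCK_SIZE=1024)
    return out

# Example usage on any GPU Triton supports (ROCm devices are exposed via torch.cuda):
# a = torch.rand(4096, device="cuda"); b = torch.rand(4096, device="cuda")
# assert torch.allclose(add(a, b), a + b)
```

In practice the work involves far more complex kernels (attention, GEMM epilogues, quantized matmuls), but the portability story is the same: one kernel source, multiple hardware backends.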
You can thrive in this role if you:
Have experience writing or porting GPU kernels using HIP, CUDA, or Triton, and care deeply about low-level performance.
Are familiar with communication libraries like NCCL/RCCL and understand their role in high-throughput model serving (a brief sketch follows this list).
Have worked on distributed inference systems and are comfortable scaling models across fleets of accelerators.
Enjoy solving end-to-end performance challenges across hardware, system libraries, and orchestration layers.
Are excited to be part of a small, fast-moving team building new infrastructure from first principles.
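As a small illustration of where NCCL/RCCL sit in the stack: on ROCm builds of PyTorch, the "nccl" process-group backend dispatches to RCCL, so a standard all-reduce - the primitive used to combine tensor-parallel partial results - looks the same on AMD and NVIDIA GPUs. The launcher, world size, and tensor contents below are placeholders for illustration.

```python
import os
import torch
import torch.distributed as dist

def run_all_reduce() -> None:
    # MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE are assumed to be set by the
    # launcher (e.g., torchrun). On ROCm, the "nccl" backend is backed by RCCL.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)

    # Each rank contributes a tensor; all_reduce sums them across all GPUs.
    x = torch.ones(4, device="cuda") * dist.get_rank()
    dist.all_reduce(x, op=dist.ReduceOp.SUM)
    print(f"rank {dist.get_rank()}: {x}")

    dist.destroy_process_group()

if __name__ == "__main__":
    run_all_reduce()
```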
Nice to Have:
Contributions to open-source libraries like RCCL, Triton, or vLLM.
Experience with GPU performance tools (Nsight, rocprof, perf) and memory/comms profiling.
Prior experience deploying inference on AMD or other non-NVIDIA GPU environments.
Knowledge of model/tensor parallelism, mixed precision, and serving 10B+ parameter models.
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.
OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement
For US Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.
We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.
OpenAI Global Applicant Privacy Policy
At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.