We’re looking for seasoned ML Infrastructure engineers with experience designing, building and maintaining training and serving infrastructure for ML research.
Responsibilities:
Provide infrastructure support for our ML research and product teams
Build tooling to diagnose cluster issues and hardware failures
Monitor deployments, manage experiments, and generally support our research
Maximize GPU allocation and utilization for both serving and training
Requirements:
4+ years of experience supporting infrastructure in an ML environment
Experience in developing tools used to diagnose ML infrastructure problems and failures
Experience with cloud platforms (e.g., Compute Engine, Kubernetes, Cloud Storage)
Experience working with GPUs
Nice to have:
Experience with large GPU clusters and high-performance computing/networking
Experience supporting large language model training
Experience with ML frameworks like PyTorch/TensorFlow/JAX
Experience with GPU kernel development
Founded in 2021, Character is a leading AI company offering personalized experiences through customizable AI 'Characters.' As one of the most widely used AI platforms worldwide, Character enables users to interact with AI tailored to their unique needs and preferences.
In just two years, we achieved unicorn status and were named Google Play's AI App of the Year – a testament to our groundbreaking technology and vision.
Ready to shape the future of Consumer AI? 🚀
At Character, we value diversity and welcome applicants from all backgrounds. As an equal opportunity employer, we firmly uphold a non-discrimination policy based on race, religion, national origin, gender, sexual orientation, age, veteran status, or disability. Your unique perspectives are vital to our success.
Are you a seasoned Software Engineer looking to make a significant impact in the world of Machine Learning Infrastructure? Join the innovative team at Character.AI! We’re on the lookout for skilled ML Infrastructure engineers who have a knack for designing, building, and maintaining the backbone of our ML research, essentially the engine that powers our cutting-edge applications. Your role involves a variety of tasks such as providing unwavering infrastructure support, diagnosing cluster issues, and managing experiments to ensure our research runs smoothly. You'll be deeply integrated into our operations, maximizing GPU allocation to supercharge both serving and training processes. If you have over 4 years of hands-on experience in an ML environment and proficiency in developing diagnostic tools, this is your chance to shine. At Character.AI, we pride ourselves on creating an inclusive atmosphere and have achieved remarkable milestones, like being named Google Play's AI App of the Year. If you’re passionate about pushing the boundaries of AI and thrive in a fast-paced setting, we’d love to have you on board. Come be a part of our journey in shaping the future of Consumer AI and experience firsthand the exhilaration of technical challenges and remarkable growth opportunities!
Character.AI is a neural language model chatbot service provider based in California that leverages sophisticated language models to facilitate conversations with users. Our mobile app had over 1.7 million downloads within its first week in 2023.