The Cloud Infrastructure team at Kumo is responsible for managing and scaling our Kubernetes-based, cloud-native AI platform across multiple cloud providers. They set service level objectives, optimize resource allocation, enforce security compliance, and drive cost efficiency for the Multi-Cloud Platform.
As a key team member, you will architect and operate a highly scalable, resilient Kubernetes infrastructure to support massive Big Data and AI workloads. You’ll design and implement advanced cluster management strategies, scale fleet capacity, optimize workload scheduling, and enhance observability at scale. Your expertise in Kubernetes internals, networking, and performance tuning will be critical in ensuring high availability and seamless scaling.
Joining early, you'll play a pivotal role in shaping platform reliability, automating infrastructure, and enabling ML engineers through efficient commit-to-production automation, Continuous Provisioning, CI/CD, MLOps, and deployment orchestration workflows. You'll collaborate with ML scientists, product engineers, and leadership to influence scaling strategies, develop self-service tooling, and drive multi-cloud resilience. Engineers at Kumo take ownership of core system design, building infrastructure that powers the next generation of AI applications.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Are you ready to make a significant impact on the future of technology? At Kumo, we are on the lookout for an experienced Software Engineer specializing in Cloud Engineering with a focus on Kubernetes. Based in beautiful Mountain View, CA, you'll join our dedicated Cloud Infrastructure team, where you'll play a crucial role in managing and scaling our advanced Kubernetes-based, cloud-native AI platform. In this role, you will have the opportunity to architect, operate, and optimize large-scale Kubernetes clusters that handle massive Big Data and AI workloads across multiple cloud providers like AWS, GCP, and Azure. Your responsibilities will include developing advanced cluster management strategies, enhancing observability, and ensuring high availability and resilience in our infrastructure. You'll collaborate closely with ML engineers, product engineers, and leadership to craft innovative solutions for workload scheduling and resource allocation. The environment here is dynamic and engaging, as you’ll be influencing strategies, automating infrastructure, and optimizing performance. If you are passionate about driving the next generation of AI applications and enjoy a culture that values ownership and collaboration, Kumo is the perfect place for you to grow your career and do what you love.