
Software Engineer - Cloud Engineering, Kubernetes

The Cloud Infrastructure team at Kumo is responsible for managing and scaling our Kubernetes-based, cloud-native AI platform across multiple cloud providers. They set service level objectives, optimize resource allocation, enforce security compliance, and drive cost efficiency for the Multi-Cloud Platform.


As a key team member, you will architect and operate a highly scalable, resilient Kubernetes infrastructure to support massive Big Data and AI workloads. You’ll design and implement advanced cluster management strategies, scale fleet capacity, optimize workload scheduling, and enhance observability at scale. Your expertise in Kubernetes internals, networking, and performance tuning will be critical in ensuring high availability and seamless scaling.


Joining early, you'll play a pivotal role in shaping platform reliability, automating infrastructure, and enabling ML engineers with efficient commit-to-production automation, continuous provisioning, CI/CD, MLOps, and deployment orchestration workflows. You'll collaborate with ML scientists, product engineers, and leadership to influence scaling strategies, develop self-service tooling, and drive multi-cloud resilience. Engineers at Kumo take ownership of core system design, building infrastructure that powers the next generation of AI applications.


Key Responsibilities
  • Design, build, and scale Kubernetes-based infrastructure to support Kumo’s multi-cloud AI platform, ensuring high availability, resilience, and performance.
  • Architect and optimize large-scale Kubernetes clusters, improving scheduling, networking (CNI), and workload orchestration for production environments.
  • Develop and extend Kubernetes controllers and operators to automate cluster management, lifecycle operations, and scaling strategies.
  • Enhance observability, diagnostics, and monitoring by building tools for real-time cluster health tracking, alerting, and performance tuning.
  • Lead efforts to automate fleet management, optimizing node pools, autoscaling, and multi-cluster deployments across AWS, GCP, and Azure.
  • Define and implement Kubernetes security policies, RBAC models, and best practices to ensure compliance and platform integrity.
  • Collaborate with ML engineers and platform teams to optimize Kubernetes for machine learning workloads, ensuring seamless resource allocation for AI/ML models.
  • Drive commit-to-production automation, cloud connectivity, and deployment orchestration, ensuring seamless application rollouts, zero-downtime upgrades, and global infrastructure reliability.


Required Skills and Experience
  • Kubernetes Mastery: 5-7+ years of experience managing large-scale Kubernetes clusters (EKS, GKE, AKS, or open source) in production. Deep expertise in Kubernetes internals, including controllers, operators, scheduling, networking (CNI), and security policies.
  • Cloud-Native Infrastructure: 5-7+ years of experience building cloud-native Kubernetes-based infrastructure across AWS, Azure, and GCP.
  • Platform Engineering: 5-7+ years of experience building Kubernetes service meshes (Istio/Envoy, Traefik), networking policies (Calico/Tigera), and distributed ingress/egress control.
  • Fleet Management & Scaling: Proven experience in optimizing, scaling, and maintaining Kubernetes clusters across multi-cloud environments, ensuring high availability and performance.
  • Software Development: 5-7+ years of experience writing production-grade controllers and operators in Python, Go, or Rust to extend Kubernetes functionality.
  • Infrastructure-as-Code & Automation: Hands-on experience with Terraform, CloudFormation, Ansible, and Bash and Make scripting to automate Kubernetes cluster provisioning and management.
  • Distributed Systems & SaaS: Expertise in building and operating large-scale distributed systems for cloud-native B2B SaaS applications running on Kubernetes.
  • Cloud Application Deployment: Deep expertise in building container orchestration, workload scheduling, and runtime optimizations using Kubernetes, Argo, or Flux.
  • Education: BS/MS in Computer Science or a related field (PhD preferred)


Nice to Have
  • Proficiency with cloud platforms such as AWS, GCP, or Azure.
  • Familiarity with chaos engineering tools and practices for testing system resilience.
  • Strong understanding of security best practices and compliance standards (GDPR, SOC 2, ISO 27001, vulnerability assessments, GRC, risk management).
  • Contributions to open-source projects, particularly in the Kubernetes or cloud-native ecosystem.
  • Expertise in Docker, Kubernetes, Jenkins, Flux, Argo, and Terraform in a Linux environment.
  • Hands-on experience with monitoring and observability tools such as Prometheus and Grafana.
  • Ability to develop customer-facing web frontends or public APIs/SDKs for platform services.


Benefits
  • Competitive salary and equity options.
  • Comprehensive medical and dental insurance.
  • An inclusive, diverse work environment where all employees are valued and supported.


We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

KUMO Glassdoor Company Review: 2.6
KUMO DE&I Review: 2.1
Average salary estimate

$150,000 / year (est.)
Min: $120,000
Max: $180,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Software Engineer - Cloud Engineering, Kubernetes, KUMO

Are you ready to make a significant impact on the future of technology? At Kumo, we are on the lookout for an experienced Software Engineer specializing in Cloud Engineering with a focus on Kubernetes. Based in beautiful Mountain View, CA, you'll join our dedicated Cloud Infrastructure team, where you'll play a crucial role in managing and scaling our advanced Kubernetes-based, cloud-native AI platform. In this role, you will have the opportunity to architect, operate, and optimize large-scale Kubernetes clusters that handle massive Big Data and AI workloads across multiple cloud providers like AWS, GCP, and Azure. Your responsibilities will include developing advanced cluster management strategies, enhancing observability, and ensuring high availability and resilience in our infrastructure. You'll collaborate closely with ML engineers, product engineers, and leadership to craft innovative solutions for workload scheduling and resource allocation. The environment here is dynamic and engaging, as you’ll be influencing strategies, automating infrastructure, and optimizing performance. If you are passionate about driving the next generation of AI applications and enjoy a culture that values ownership and collaboration, Kumo is the perfect place for you to grow your career and do what you love.

Frequently Asked Questions (FAQs) for Software Engineer - Cloud Engineering, Kubernetes Role at KUMO
What are the responsibilities of a Software Engineer - Cloud Engineering at Kumo?

As a Software Engineer - Cloud Engineering at Kumo, your primary responsibilities include designing and building Kubernetes-based infrastructures that support our multi-cloud AI platform, managing large-scale clusters, optimizing resource allocation, and enhancing system resilience and performance. You'll also automate cluster management, develop tools for monitoring and diagnostics, and collaborate closely with ML engineers to optimize workloads.

What qualifications do I need to apply for the Software Engineer - Cloud Engineering position at Kumo?

To qualify for the Software Engineer - Cloud Engineering role at Kumo, ideally, you should have 5-7 years of experience in managing large-scale Kubernetes clusters in production. Deep expertise in Kubernetes internals, cloud-native infrastructure across AWS, Azure, and GCP, and software development skills in Python, Go, or Rust are also required. A strong understanding of security practices and experience with infrastructure-as-code tools like Terraform or Ansible is a plus.

What programming skills are necessary for the Software Engineer - Cloud Engineering role at Kumo?

The Software Engineer - Cloud Engineering role at Kumo requires you to have a solid foundation in programming, particularly in Python, Go, or Rust. You will be expected to write production-grade controllers and operators to extend Kubernetes functionality, which means proficiency in coding is crucial for success in this position.

How does Kumo ensure a supportive work environment for Software Engineers?

At Kumo, we are committed to fostering an inclusive and diverse work environment where all Software Engineers are valued and supported. We actively promote collaboration across teams, and our culture emphasizes ownership, innovation, and the growth of each individual. Our benefits package and team dynamics reflect our dedication to employee satisfaction and engagement.

Can you describe the team culture for the Software Engineer - Cloud Engineering at Kumo?

The team culture at Kumo for the Software Engineer - Cloud Engineering position is built on collaboration, innovation, and support. You will work with passionate professionals who are committed to driving advancements in AI technology. We believe in shared learning, open communication, and encouraging diverse perspectives, making our work environment both stimulating and respectful.

Common Interview Questions for Software Engineer - Cloud Engineering, Kubernetes
What experience do you have with Kubernetes and how have you used it in previous projects?

In answering this question, focus on specific projects where you managed Kubernetes clusters, detailing your role, the scale of the clusters, and the challenges faced. Highlight your proficiency in Kubernetes internals, controllers, and operators, while providing concrete examples of how you optimized performance and availability.

Can you explain a challenging cloud-native project you’ve worked on and your contribution?

Share a project that involved complex cloud infrastructure, describing the objectives, technologies used (like AWS or GCP), and your specific contributions. Emphasize your problem-solving skills and how you collaborated with others to achieve project goals.

How do you approach optimizing resource allocation in a multi-cloud environment?

Discuss your strategies for ensuring efficient resource allocation, such as autoscaling practices, managing workloads across different cloud providers, and relevant tools you have utilized. Mention how you assess performance metrics to inform your optimization strategies.

What methodologies have you implemented for CI/CD in your previous roles?

Share the CI/CD practices you've implemented, focusing on automation tools you’ve used (like Jenkins or GitLab) and how these practices improved deployment processes. Detail any challenges you faced and how you successfully addressed them.

How do you ensure Kubernetes security policies are effective?

Discuss your experience with defining and implementing security policies in Kubernetes environments. Provide examples of how you've integrated RBAC models, conducted vulnerability assessments, and adhered to compliance standards to maintain platform security.

What techniques do you use for monitoring and observability in Kubernetes?

Explain the tools you have used for monitoring Kubernetes, such as Prometheus or Grafana, and how you set up effective observability. Describe how you track real-time cluster health and the strategies you use for diagnostics.

Can you give an example of how you've automated infrastructure management?

Provide specific examples of automation practices you've implemented for infrastructure management using tools like Terraform or Ansible. Highlight the outcomes and how automation improved efficiency, reduced errors, or enhanced performance.

What is your experience with cloud networking, and how do you optimize it?

Discuss your knowledge of cloud networking, particularly in a Kubernetes context, and the strategies you’ve employed to optimize networking performance. Mention specifics like CNI configurations and any challenges you overcame.

Describe a situation where you had to collaborate with multiple teams to achieve a project goal.

Share an example of a collaborative project that required input from various teams. Highlight your role in facilitating communication, managing differences, and how you brought everyone together for successful outcomes.

How do you keep your skills updated in a fast-evolving tech landscape?

Discuss your approach to continuous learning, such as participating in online courses, attending tech meetups, engaging with the open-source community, or following relevant publications. Mention specific areas you're currently focusing on to stay ahead in cloud engineering.

Employment type: Full-time, on-site
Date posted: March 22, 2025
