Distributed Software Engineer

Cerebras Systems is a pioneer in large-scale AI Supercomputers focusing on building a breakthrough architecture for the AI industry. They are looking for a Senior Distributed Software Engineer with extensive experience in software architecture, system design, and distributed systems.

Skills

  • Software architecture
  • Distributed system development
  • Kubernetes
  • GoLang
  • Python
  • Debugging in distributed environments

Responsibilities

  • Automate bare-metal configuration of networking, OS, and application software in large clusters
  • Implement workflows for cluster upgrades, downgrades, and security patching
  • Develop orchestration and scheduler systems for resource allocation and job submission
  • Support both on-premises and cloud deployment and operations
  • Monitor and handle failures across cluster resources

Education

  • Bachelor's degree in Computer Science or related field

Benefits

  • Job stability with startup vitality
  • Flexible working culture
  • Opportunity to work on cutting-edge AI technology
  • Support for continuous learning and growth

Average salary estimate

$150,000 / year (est.)
min: $120,000
max: $180,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Distributed Software Engineer, Cerebras Systems

Cerebras Systems is on the lookout for a skilled Distributed Software Engineer to join our innovative team in either Sunnyvale, CA or Toronto, Canada. At Cerebras, we pride ourselves on developing the world's largest AI chip, a game-changer in the industry that outperforms traditional GPUs by a staggering margin. As a Distributed Software Engineer, you will play a key role in the Cluster engineering team, focusing on automating the configuration of our powerful Wafer-Scale Engine systems and ensuring they run smoothly in large-scale environments. Your responsibilities will include creating efficient workflows for upgrades and security patches, developing an orchestration system for resource management, and implementing robust monitoring tools to detect failures within clusters. With opportunities to work on groundbreaking systems that impact machine learning applications across various sectors, you'll quickly see how your efforts translate into real-world advancements. We believe in empowering our team members with the freedom to innovate and contribute to one of the fastest-growing fields in technology. Are you excited about transforming AI applications? Do you want to leave your mark in a vibrant work culture that values individual thoughts while providing job stability? If the answer is yes, we invite you to apply today and join us at Cerebras Systems, where your work helps push the boundaries of AI technology.

Frequently Asked Questions (FAQs) for Distributed Software Engineer Role at Cerebras Systems
What are the responsibilities of a Distributed Software Engineer at Cerebras Systems?

As a Distributed Software Engineer at Cerebras Systems, your responsibilities will primarily revolve around automating bare-metal configuration of networking, operating systems, and application software for large clusters of our Wafer-Scale Engine systems. You'll also be involved in developing workflows for cluster upgrades and security patching, managing resource allocation and job submission within a multi-user environment, and enhancing monitoring systems to ensure high availability and performance. Your role will be crucial in maintaining the operational efficiency and reliability of our powerful AI compute clusters.

What qualifications are needed for the Distributed Software Engineer role at Cerebras Systems?

To qualify for the Distributed Software Engineer position at Cerebras Systems, candidates should have at least 6 years of experience in software architecture and development. A strong understanding of distributed cluster environments and expertise in Kubernetes (K8s), Prometheus, and Grafana are essential. Proficiency in programming languages like GoLang and Python, together with strong debugging skills and experience in writing tests for new features, is critical for success in this role.

How does the work environment at Cerebras Systems support new Distributed Software Engineers?

Cerebras Systems fosters a supportive and inclusive work environment that encourages continuous learning and individual growth. As a new Distributed Software Engineer, you will have the opportunity to collaborate with experienced professionals while contributing to groundbreaking AI technologies. The non-corporate culture respects individual beliefs, allowing for personal expression and innovation, making it an ideal place for those eager to evolve in their career.

What technologies should a Distributed Software Engineer be familiar with at Cerebras Systems?

Candidates for the Distributed Software Engineer role at Cerebras Systems should have a deep understanding of distributed systems architecture and be familiar with technologies and tools such as Kubernetes, Prometheus, and Grafana. Proficiency in programming languages like GoLang, Python, and bash is vital, as these are used in developing software solutions tailored for our AI compute clusters. Additionally, experience with debugging and testing in distributed environments will greatly enhance your capability to succeed in this position.

What makes the role of Distributed Software Engineer at Cerebras unique?

The role of Distributed Software Engineer at Cerebras Systems is unique due to our pioneering work with wafer-scale architecture that has transformed the AI landscape. Unlike conventional roles focused solely on software, this position involves creating innovative solutions for one of the largest AI supercomputers in the world. You’ll work at the intersection of hardware and software, all while contributing to an environment that values creativity and technical excellence.

Common Interview Questions for Distributed Software Engineer
Can you describe your experience with distributed systems?

Discuss your past projects involving distributed systems, focusing on your specific roles and contributions. Highlight challenges faced and how you optimized the system’s performance and reliability. Use examples to show your understanding of scaling, monitoring, and debugging distributed applications.

What experience do you have with Kubernetes, and how have you used it in previous roles?

Share specific instances where you implemented Kubernetes in your projects. Discuss how you managed container orchestration, handled deployments, and ensured high availability. Providing metrics on how Kubernetes improved system performance will strengthen your response.

How do you approach automating workflows in clusters?

Explain your methodology for automating workflows, particularly steps you’ve taken in past positions. Discuss tools and scripting languages you used, and how automation improved efficiency and reduced downtime. Emphasize the importance of documentation and testing in your automation strategies.

Can you explain the significance of monitoring in distributed systems?

Discuss the importance of monitoring tools like Prometheus and Grafana, and how you have utilized them in previous distributed system projects. Highlight specific metrics you monitored and how they helped in proactive problem resolution and performance tuning.

How do you ensure the security of clusters in your software engineering practices?

Talk about security best practices you’ve implemented in past roles, including regular security patches, vulnerability assessments, and compliance with security standards. Share examples of specific security measures tailored for clusters that you’ve executed successfully.

What strategies do you use for debugging distributed systems?

Highlight your approach to debugging, emphasizing systematic analysis and using diagnostic tools. If applicable, share a challenging debugging scenario you've faced, what steps you took, and the outcome of your efforts in resolving the issue.

Describe how you handle job scheduling and resource allocation in a multi-user environment.

Discuss the tools and methodologies you have utilized for job scheduling and resource allocation. Explain how you prioritize tasks, manage dependencies, and ensure efficient resource use while maintaining performance across multiple users.

What role do you believe automation plays in software engineering, specifically for clusters?

Share your perspective on automation's vital role in increasing efficiency, accuracy, and scalability in managing clusters. Provide examples where automation has improved your workflow, reduced errors, and enhanced overall system reliability.

How do you keep up with the latest trends and technologies in distributed systems?

Explain your approach to professional development, such as attending conferences, participating in webinars, or following key industry blogs and publications. Mention specific technologies you’re currently exploring or relevant trends you’re applying in your work.

Can you discuss your experience in working within a collaborative team environment?

Highlight collaboration experiences in your past roles, focusing on tools for communication and project management. Discuss how you’ve navigated differing opinions, facilitated discussions, and contributed to achieving team goals in a distributed systems context.

Salary range: $120,000/yr - $180,000/yr
Employment type: Full-time, on-site
Date posted: March 18, 2025