Site Reliability Engineer

Our mission

Genmo makes it easy for anyone to create movies, as if it were magic. Using our web application, any user can create cinematic video using a simple text prompt.

We imagine a world where high-quality cinematic video content is as plentiful as water. Our mission is to empower the next billion video creators to tell their stories.

As a Site Reliability Engineer (SRE) at Genmo, you will be responsible for designing, implementing, and maintaining the infrastructure that powers our large generative AI models. You will work on infrastructure automation and distributed systems design, and you will manage high-performance computing (HPC) and GPU clusters. The ideal candidate will have a strong background in infrastructure automation and distributed systems, along with experience in GPU and HPC environments.

Responsibilities:

  • Design, implement, and maintain scalable infrastructure to support our generative AI models.

  • Develop and maintain infrastructure automation tools using technologies like Docker, Kubernetes, and Terraform.

  • Ensure the reliability, availability, and performance of our systems through proactive monitoring and incident response.

  • Collaborate with software engineers and researchers to design and implement distributed systems.

  • Manage and optimize GPU and HPC clusters for efficient AI model training and inference.

  • Develop and maintain CI/CD pipelines to streamline development and deployment processes.

  • Implement and maintain security best practices across the infrastructure.

Qualifications:

  • 5+ years of experience in site reliability engineering or a similar role.

  • Experience working in a 24x7 enterprise environment.

  • Hands-on experience with infrastructure as code and automation tools (Ansible, Chef, Puppet, Terraform).

  • Strong experience with infrastructure automation tools such as Docker, Kubernetes, and Terraform.

  • Expertise in designing and maintaining distributed systems.

  • Proficiency in scripting and programming languages, particularly Python and C++.

  • Strong understanding of networking, security, and system performance.

  • Excellent problem-solving skills and the ability to work in a fast-paced environment.

Bonus points:

  • Experience with cloud providers like AWS, GCP, or Azure.

  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).

  • Familiarity with CI/CD tools and practices (e.g., Jenkins, GitLab CI/CD).

  • Experience working with AI and machine learning models.

  • Strong passion for artificial intelligence and the drive to learn new technologies.

Genmo is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law. Genmo, Inc. is an E-Verify company and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.

Employment type: Full-time, on-site
Date posted: July 4, 2024
