The Wikimedia Foundation is on the lookout for a talented Staff Site Reliability Engineer (SRE) focused on Machine Learning Infrastructure to join our dynamic remote team. You will collaborate with amazing colleagues across various time zones, from Eastern Americas to Europe and Africa, reporting directly to our Director of Machine Learning, Chris Albon.

In this role, you’ll lead the design, development, maintenance, and scaling of the crucial infrastructure that powers our Machine Learning Engineers and Researchers’ efforts in training, deploying, and monitoring machine learning models. Your day-to-day will encompass sketching out robust ML infrastructure, enhancing the reliability and scalability of our systems, and working hand-in-hand with cross-functional teams to streamline operational processes. We’re seeking someone who can proactively monitor system performance and security while sharing their insights through collaboration and documentation. Plus, mentoring fellow team members is a big part of fostering our culture.

With over 7 years of experience under your belt, particularly in SRE, DevOps, or infrastructure engineering, you’ll bring significant expertise in managing production-grade machine learning systems. If you thrive in an open-source environment and value teamwork within diverse, remote teams, then this opportunity at the Wikimedia Foundation is tailor-made for you. Join us in making knowledge freely accessible, as we believe that together, we can contribute to a world where everyone benefits from shared knowledge!
The Wikimedia Foundation is the nonprofit organization that operates Wikipedia and the other Wikimedia free knowledge projects, among the world's most popular websites. Established in 2003, Wikimedia is headquartered in San Francisco, California, ...