Manager of Infrastructure Operations

Voltage Park is seeking a highly skilled and proactive Manager of Infrastructure Operations to lead our 24/7 Infrastructure Operations team responsible for the stability, scalability, and performance of compute, storage, and platform infrastructure. This role plays a key part in delivering always-on, high-performance environments that support AI/ML training, inference, and HPC workloads at scale. The ideal candidate combines technical depth with strong leadership skills and a passion for operational excellence. 

This position offers full remote flexibility, although candidates must be based in the continental US and available to work during PST hours. Unfortunately, we are unable to provide sponsorship for this role.

Responsibilities:

  • Establish and uphold the standard practices for our expanding InfraOps team.

  • Lead and mentor a 24/7 Infrastructure Operations team responsible for monitoring, maintaining, and supporting our infrastructure.

  • Develop and maintain operational runbooks, escalation procedures, and documentation for critical systems.

  • Collaborate with the Infrastructure Engineering, Network Operations, Datacenter Operations, and Customer Success teams to support infrastructure rollouts, upgrades, and scaling efforts.

  • Oversee observability systems (monitoring, logging, alerting) and drive continuous improvements in automation and root-cause analysis.

  • Drive adoption of “Infrastructure as Code” and automated workflows to reduce manual intervention.

  • Implement and enforce best practices for system availability, performance tuning, capacity planning, and lifecycle management.

  • Be available for on-call support during urgent system incidents.

  • Ensure compliance with security, regulatory, and organizational standards across all environments.

Qualifications:

  • Proficiency in Puppet, Terraform, and Ansible.

  • Strong scripting skills in Bash, Python, or Go.

  • Extensive experience in setting up, deploying, and managing Kubernetes clusters.

  • Proven track record of architecting, building, and delivering complex systems from inception.

  • Ability to strike a balance between pragmatic development and ideal architectures.

  • Skilled at navigating trade-offs between design, risk, cost, and outcomes.

  • Deep understanding of network protocols, network programming, Unix variants, monitoring, and security systems.

  • Excellent written and verbal communication skills.

Leadership Requirements:

  • Demonstrated ability to inspire and lead a team towards common goals, fostering a positive and collaborative work environment.

  • Proven track record of effectively delegating tasks, providing constructive feedback, and developing team members' skills.

  • Strong decision-making skills, capable of guiding the team through complex technical challenges and strategic initiatives.

  • Ability to communicate a clear vision and align team efforts with broader company objectives.

  • Experience in conflict resolution and team building, promoting diversity, equity, and inclusion within the team and the organization.

Culture:

  • Enjoy collaborating with a growing, motivated team focused on execution.

  • Comfortable operating with a high degree of autonomy and able to independently prioritize tasks aligning with company objectives.

  • Possess a breadth of knowledge in your domain while also embracing the opportunity to take on diverse responsibilities.

  • Value the importance of clear communication and documentation in driving success.

Team Charter:

The 24/7 Infrastructure Operations Team ensures the stability, scalability, and performance of Voltage Park’s compute, storage, and platform systems across data centers, cloud, and edge. Supporting AI and HPC GPU environments, the team delivers proactive monitoring, automation toolsets, and continuous optimization to maintain high availability and operational excellence at all times, ensuring the best possible customer experience.

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. If you require an accommodation during the job application process, please notify your recruiter.

Average salary estimate

$125,000 / year (est.)
min: $100,000
max: $150,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Manager of Infrastructure Operations, Voltage Park

At Voltage Park, we’re searching for a proactive and skilled Manager of Infrastructure Operations to join our dynamic team in San Francisco. In this role, you'll lead a dedicated 24/7 Infrastructure Operations team responsible for ensuring the stability, scalability, and performance of our compute, storage, and platform infrastructure. You’ll play a pivotal role in maintaining high-performance environments tailored for AI/ML training, inference, and HPC workloads. If you're passionate about operational excellence and have the technical depth to match your leadership skills, this position is perfect for you! Enjoy the flexibility of remote work while collaborating with our growing team, though please note that candidates need to be based in the continental US and available during PST hours. As you establish standardized practices and mentor your team, you’ll dive deep into automation, observability systems, and lifecycle management. From developing operational runbooks to driving best practices in system performance, your contributions will significantly enhance our infrastructure's reliability and efficiency. While we focus on individual growth here, teamwork and collaboration are just as important, so expect plenty of interaction with Engineering and Customer Success teams. Voltage Park values a strong, diverse workplace and ensures fair consideration for all applicants. If you’re ready to drive exceptional results in a fast-paced environment, we want to hear from you!

Frequently Asked Questions (FAQs) for Manager of Infrastructure Operations Role at Voltage Park
What are the responsibilities of the Manager of Infrastructure Operations at Voltage Park?

The Manager of Infrastructure Operations at Voltage Park is responsible for leading a 24/7 Infrastructure Operations team that ensures the stability, performance, and scalability of infrastructure systems. This includes developing operational procedures, collaborating with various teams, overseeing observability systems, and implementing best practices for system availability and performance tuning.

What qualifications are needed for the Manager of Infrastructure Operations position at Voltage Park?

Candidates should possess proficiency in tools like Puppet, Terraform, and Ansible, along with strong scripting skills in languages such as Bash, Python, or Go. Experience with Kubernetes and a background in architecting complex systems are crucial. Additionally, excellent communication and leadership skills are essential for fostering a positive work environment.

What does the 24/7 Infrastructure Operations team at Voltage Park focus on?

The 24/7 Infrastructure Operations team at Voltage Park is dedicated to ensuring high availability and operational excellence for compute and storage systems across various environments including cloud and edge. They implement proactive monitoring, automation tools, and constant optimization to support AI and HPC workloads.

Is remote work an option for the Manager of Infrastructure Operations role at Voltage Park?

Yes, Voltage Park offers full remote flexibility for the Manager of Infrastructure Operations position. However, candidates must be based in the continental US and available to work during PST hours.

What is the leadership style expected of the Manager of Infrastructure Operations at Voltage Park?

The Manager of Infrastructure Operations is expected to inspire and lead their team towards common goals while promoting a collaborative work environment. Effective delegation, constructive feedback, and skill development of team members are vital components of the leadership style encouraged at Voltage Park.

Common Interview Questions for Manager of Infrastructure Operations
How do you approach leadership within an Infrastructure Operations team?

Describe your leadership philosophy emphasizing collaboration and your strategy for mentoring team members. Highlight your success in guiding teams through challenges and your ability to foster a cohesive work environment.

What experience do you have with Infrastructure as Code?

Discuss your proficiency with tools like Terraform or Ansible and how you've used them to automate deployments, manage infrastructure versions, or improve operational efficiencies in previous roles.

Can you explain your experience managing Kubernetes clusters?

Share specific examples of projects or systems you've deployed using Kubernetes. Highlight any challenges faced and how you overcame them, showcasing your technical knowledge and problem-solving skills.

What strategies do you implement for incident response during system outages?

Detail your processes for incident management, including how you prioritize issues, communicate with stakeholders, and ensure timely resolution of incidents while minimizing downtime.

How do you ensure compliance with security and regulatory standards in your operations?

Discuss your experience with specific regulations relevant to the industry, how you implement security best practices within your infrastructure, and the tools you use to ensure compliance is maintained.

How do you balance innovation with operational stability?

Explain your approach to finding the right balance between introducing new technologies or processes and ensuring the current systems remain stable and performant, providing any examples of successful implementations.

Describe how you utilize observability tools to monitor infrastructure.

Elaborate on the various observability tools you've utilized, their impact on system performance monitoring, and how you've leveraged them to drive continuous improvements in your previous roles.

What role does documentation play in your management strategy?

Discuss the importance of maintaining thorough documentation for operational procedures and systems. Provide insights into how you've implemented effective documentation practices to ensure knowledge sharing and onboarding of new team members.

How do you prioritize tasks in a 24/7 operations environment?

Explain your methods for task prioritization, mentioning any tools or systems you use to manage workloads efficiently and how you ensure alignment with broader organizational objectives.

What practices do you encourage in your team to foster a culture of inclusion and collaboration?

Highlight the initiatives or strategies you've implemented to promote diversity, equity, and inclusion within your teams, as well as how these practices have positively impacted team dynamics and productivity.

Voltage Park is building a new class of cloud infrastructure from the ground up. Join us, we're hiring!

Employment type: Full-time, remote
Date posted: April 9, 2025
