The Principal Service Operations Engineer at Flooid maintains, improves, visualizes, troubleshoots, and delivers technical cloud environments for our highest-profile customers. As a Principal, your focus will be on the process and technology of delivering our SaaS offering.
You will collaborate with teams from across the business to ensure that customer systems are performant, reliable, and resilient during go-live, normal operations, and periods of change. This is an individual contributor role, but because of your technical knowledge, you will be expected to guide and develop the team’s technical skills and drive best practice. The skill set sought combines site reliability engineering with operational expertise.
Role and Responsibilities
This position has great flexibility and the opportunity to work on a global scale. The primary duties of this role include:
- Act as technical SME on cloud environments and related operations with a focus on aspects such as (but not limited to) BCP/DR, capacity, resilience, security, and troubleshooting
- Lead complex technical activities such as incidents, deployments, problems, DR failover, and similar changes with both a hands-on approach and a longer-term, strategic mindset
- Take full ownership of key customers for all technical and operational issues and improvements
- Work closely with stakeholder teams to ensure a joined-up approach across the Operations Engineering teams and Project Delivery team for R&D strategy around our SaaS cloud offering
- Work “hand-in-glove” with delivery to ensure that the balance between agility, reliability, and efficiency is correct and that the customer and operational considerations are always evaluated
- Achieve KPIs and SLAs that span multiple teams, customers, and tech stacks
- Represent the technical and operational interests with multiple stakeholders, including significant global customers, and ensure the right decisions are made to promote stability and availability
- Assist in managing capacity across teams and infrastructure to ensure that Flooid can continue to grow and deliver services to an increasing number of customers
- Ensure the technical teams have the right mix of skills for both current and future scenarios and work to develop individuals as required
- Guide, develop, and mentor the teams and individuals to build best practices and processes
- Identify and implement continual improvements and drive these activities to their conclusion
- Approve the technical aspects/content and approach of change for multiple customers, including change approvals, CAB membership, and scheduling of changes
- Act as a key contributor to post-incident root cause analysis, with a focus on identifying the underlying causes of failure and removing them
- When necessary, support incidents and planned change in an on-call or out-of-hours capacity
Skills and Experience
A successful Principal Service Operations Engineer at Flooid will be a proactive self-starter who collaborates effectively with team members to find creative solutions to problems. The ideal candidate will have the following skills and experience:
- In-depth technical expertise and 5+ years of experience within a similar role
- Experience of working as part of mixed technical teams spread across regions, time zones, support rotas, and follow-the-sun approaches
- Because the Flooid technology stack is both broad and deep, evidence of skills and expertise in as many of the following areas as possible:
  - Public cloud, with a preference for Google Cloud Platform (GCP); experience with at least one hyperscale cloud provider is required
  - Cloud architecture, platform, and tooling, with expertise in operating virtual machines and containers
  - Kubernetes (ideally with GKE experience) hosting applications built natively for containers plus those that require an application server
  - Designing and operating SaaS in the cloud, including high availability, redundancy, and capacity
  - Database engines/platforms such as BigQuery, DB2, PostgreSQL, and MongoDB
  - Event-driven architecture, such as ActiveMQ and Kafka
  - Operating a Java runtime in virtual machines within GCE
  - Security (certificates, firewall, OWASP, VPN, WAFs)
  - Load balancing (large-scale platforms)
  - Monitoring tooling, with a focus on the TICK stack (Telegraf, InfluxDB, Chronograf, and Kapacitor) along with Grafana
  - ITIL as part of incident, change, and problem management
  - Working with the Linux operating system (mainly Ubuntu and SUSE) and Windows (outside of cloud)
- Proven ability to work in collaboration with our cloud partner (GCP) and our Google Cloud Partner
- Ability to understand and distil complex information into high-level views and reports for a variety of audiences, with excellent customer-facing communication skills
- Excellent stakeholder management and ability to work with a variety of people and teams at different levels
- Dynamic and proactive approach to resolving issues
- Working as part of an out-of-hours rota and managing teams and rotas to ensure the right level of cover is always provided
The following skills and experience would be especially beneficial for the role:
- Industry expertise in operating SaaS solutions for enterprise-scale customers, with knowledge of retail being a bonus
- Public cloud certification (e.g., Professional Cloud Architect) or the ability to be certified
Our Company
Flooid is an innovative software technology company offering cutting-edge retail and hospitality solutions to major global brands, from point of sale, mobile, online, social, and beyond, as well as solutions for Cloud and Managed Services, ensuring our retail partners have everything they need to make the sale. Our customers’ needs are at the heart of what we do, and that focus has resulted in great historical success and an exciting strategy for where we are headed in the future.
Location: United States – Remote or In-Office/Hybrid (primary office located in Cincinnati, Ohio). Because this role supports our global business, candidates must be based in the Eastern or Central Time Zones.
Hours: Full-Time – At Flooid, we promote a flexible work environment that allows you to balance your work responsibilities with other priorities, like picking up your children, caring for an aging parent, or attending important family events.
Benefits: Benefits start on day one – medical, dental, vision, life, and disability coverage available; competitive salary; flexible PTO policy that allows for uncapped PTO; fully paid FMLA leave comparable to company-paid short-term disability coverage; 12 weeks of fully paid parental leave; 401(k) plan with company match
Accommodations: Work is performed in an office setting with frequent interruptions. This position requires the ability to sit or stand at a workstation for extended periods of time. The ability to communicate effectively in person, by phone, and on electronic devices is necessary to perform this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.
Equal Employment Opportunity: Flooid is an Equal Employment Opportunity (EEO) Employer and complies with Title VII of the Civil Rights Act of 1964 and all other applicable federal, state, and local laws and regulations pertaining to EEO as well as subsequent guidelines established by the EEO Commission.