Job details

Site Reliability Engineer (SRE) - A25072

Activate Interactive Pte Ltd (“Activate”) is a leading technology consultancy headquartered in Singapore with a presence in Malaysia and Indonesia. We empower our clients with quality, cost-effective, and impactful end-to-end application development, such as mobile and web applications, and with cloud technology that removes technology roadblocks and increases their business efficiency.

We believe in positively impacting the lives of people around us and the environment we live in through the use of technology. Hence, we are committed to providing a conducive environment for all employees to realise their full potential, so that they in turn have the opportunity to continuously drive innovation.

We are searching for our next team members to join our growing team.

If you love the idea of being part of a growing company with exciting prospects in mobile and web technologies that create positive impact on people’s lives, then we would love to hear from you.

The Co-Development Business Unit is looking for a Site Reliability Engineer (SRE).

Internal Code: A25072

This is a one-year contract role

What will you do?

We are seeking a skilled and passionate Engineer to join our team to build and operate a Whole-of-Government (WoG) runtime platform.

You will be responsible for designing and operating the GitLab, AWS, and Kubernetes-based infrastructure and solutions that power our runtime platform, ensuring its stability, scalability, and performance.

  • Toil Reduction & Automation

Identify repetitive tasks and develop automation via CI/CD pipelines, ensuring integration with cross-functional teams to reduce manual intervention and improve operational efficiency.

  • Observability & System Health

Implement comprehensive observability solutions (logs, metrics, traces, alerts) around the four Golden Signals (latency, traffic, errors, saturation), and build automation for proactive system health assessments and self-remediation.

  • Production Support & Incident Management

Participate in on-call rotations, promptly respond to incidents to minimize MTTR, and conduct thorough post-incident reviews to implement preventive measures and improve system resilience.

  • Security & Compliance

Design and implement solutions that are secure and compliant by collaborating with dedicated security teams, conducting regular audits, and integrating advanced vulnerability scanning tools.

  • Maintenance, Optimisation & Performance

Identify and resolve performance bottlenecks and operational issues, define and track KPIs (e.g., MTTR, system uptime, cost efficiency), and drive ongoing optimisation efforts.

  • Strategic Customer Engagement

Act as a technical advisor for tenants, guiding them on containerization and best practices for cloud-native deployments, and participate in strategic initiatives to enhance platform scalability and performance.

  • Knowledge Sharing & Documentation

Develop and maintain detailed playbooks, runbooks, and documentation to facilitate team-wide knowledge sharing, streamline incident response, and ensure that critical processes are well understood across the team.

  • Continuous Learning & Innovation

Stay current with the latest AWS, Kubernetes, and industry developments, and proactively recommend improvements and innovative solutions to maintain a competitive and reliable platform.

What are we looking for?

  • Bachelor's degree or Diploma in Computer Science, Engineering, or a related field (or equivalent experience).
  • Proven experience as a Site Reliability Engineer or similar role, with a strong background in containerization, orchestration, and cloud-native technologies.
  • Proven ability to troubleshoot and resolve complex technical issues in containerized applications.
  • Demonstrated experience with incident management, including post-incident reviews and continuous improvement.
  • Strong documentation skills and experience in knowledge sharing across teams.
  • Deep understanding of AWS, Kubernetes (including AWS EKS), and operational best practices, with familiarity in multi-cloud or hybrid environments.
  • Solid grasp of networking, security, and storage in both AWS and Kubernetes contexts.
  • Experience integrating Kubernetes with AWS cloud technologies (e.g., Secrets Manager, Load Balancers) and using infrastructure-as-code (Terraform or similar).
  • Hands-on experience with containerization tools (Kubernetes, Kustomize, Helm) and automation scripting (Go, Python, Bash, or equivalent).
  • Ability to write and maintain automated tests or conduct thorough manual testing for automation scripts, ensuring the reliability and effectiveness of automated solutions.
  • Familiarity with CI/CD tools (GitLab CI/CD, ArgoCD) and version control systems (Git).
  • Experience with observability/monitoring tools (Prometheus, Grafana, ELK Stack) and defining SLOs and Error Budgets.
  • Certifications such as Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD) are a plus.
  • Experience with developing Kubernetes operators using Go, service mesh technologies, and Chaos Engineering is a plus.

Please note that a coding test will be included in the second round of interviews for selected candidates.

What do we offer in return?

  • Fun working environment
  • Employee Wellness Program

Does it sound like something you are interested in exploring further? Please be in touch with our team for an initial chat at careers@activate.sg

Activate Interactive Singapore is an equal opportunity employer. Employment decisions will be based on merit, qualifications and abilities. Activate Interactive Pte Ltd does not discriminate in employment opportunities or practices on the basis of race, colour, religion, gender, sexuality, national origin, age, disability, marital status or any other characteristics protected by law.

Protecting your privacy and the security of your data are longstanding top priorities for Activate Interactive Pte Ltd.

Your personal data will be processed for the purposes of managing Activate Interactive Pte Ltd’s recruitment related activities, which include setting up and conducting interviews and tests for applicants, evaluating and assessing the results, and as is otherwise needed in the recruitment and hiring processes.

Please consult our Privacy Notice (https://www.activate.sg/privacy-policy) to learn more about how we collect, use, and transfer the personal data of our candidates. There you can find out how to request access to, correction of, and/or withdrawal of your Personal Data.

Average salary estimate

$82,500 / year (est.)
Min: $70,000
Max: $95,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Site Reliability Engineer (SRE) - A25072, Activate Interactive Pte Ltd

Join Activate Interactive Pte Ltd as a Site Reliability Engineer (SRE) and become a vital part of our dynamic team! Based in Singapore, we are a forward-thinking technology consultancy that's reshaping how businesses leverage technology for success. In this one-year contract role, you will play a pivotal part in building and maintaining a Whole-of-Government (WoG) runtime platform. Your expertise will help ensure that our GitLab, AWS, and Kubernetes-based infrastructure runs smoothly and efficiently. You’ll automate repetitive tasks using CI/CD pipelines, improve operational efficiency alongside cross-functional teams, and implement robust observability solutions around the four Golden Signals. Your incident management experience will come into play as you participate in on-call rotations, respond to incidents, and conduct post-incident reviews to boost system resilience. You will collaborate with dedicated security teams to maintain compliance and strengthen our platform's security posture. We believe in continuous learning, so we encourage you to stay current with AWS and Kubernetes developments while sharing your knowledge through playbooks and documentation. If you're ready to embrace challenges and drive innovation in a supportive environment, Activate is excited to hear from you!

Frequently Asked Questions (FAQs) for Site Reliability Engineer (SRE) - A25072 Role at Activate Interactive Pte Ltd
What are the main responsibilities of a Site Reliability Engineer at Activate Interactive?

As a Site Reliability Engineer (SRE) at Activate Interactive, your main responsibilities include building and operating a Whole-of-Government (WoG) runtime platform, designing GitLab and AWS infrastructures, and managing Kubernetes-based solutions. You'll lead initiatives for toil reduction through automation, implement observability solutions, handle production support, and ensure security compliance. Regular strategic customer engagement and documentation of technical processes are also key aspects of this role.

What qualifications are required for the Site Reliability Engineer position at Activate Interactive?

To qualify for the Site Reliability Engineer (SRE) position at Activate Interactive, candidates should have a Bachelor’s degree or diploma in Computer Science, Engineering, or a closely related field. Additionally, proven experience in a similar SRE role, strong skills in cloud-native technologies, and a deep understanding of AWS and Kubernetes are essential. Familiarity with CI/CD tools and experience with automation scripting is highly desirable.

What skills are essential for a Site Reliability Engineer at Activate Interactive?

Essential skills for the Site Reliability Engineer (SRE) role at Activate Interactive include expertise in containerization and orchestration technologies, strong troubleshooting capabilities, and experience with incident management processes. Proficiency in automation scripting languages like Go or Python, understanding networking and security in AWS and Kubernetes environments, as well as familiarity with observability and monitoring tools will significantly benefit candidates in this role.

What can a candidate expect during the interview process for the Site Reliability Engineer role at Activate Interactive?

Candidates interviewing for the Site Reliability Engineer (SRE) position at Activate Interactive can expect a comprehensive assessment that includes a coding test in the second round. The process will also cover technical skills related to Kubernetes, AWS, and incident management experience. Prepare for discussions on real-world scenarios, so be ready to showcase your problem-solving abilities and share examples of your experience managing production environments.

What type of work environment does Activate Interactive offer its Site Reliability Engineers?

Activate Interactive offers a fun and nurturing work environment for its Site Reliability Engineers, emphasizing employee wellness and professional development. Encouraging continuous learning and innovation, Activate fosters a culture where creativity thrives, and team collaboration is highly valued. This supportive atmosphere empowers employees to grow and make significant contributions to client success.

Common Interview Questions for Site Reliability Engineer (SRE) - A25072
Can you explain your experience with Kubernetes and its role in maintaining system reliability?

When answering this question, share specific examples highlighting your previous experience with Kubernetes in production settings. Discuss how you've utilized Kubernetes for container orchestration, resolved scaling issues, or automated deployment using CI/CD pipelines. The goal is to demonstrate your understanding of Kubernetes architecture and its importance in system reliability.

How do you approach incident management and post-incident reviews?

In your response, emphasize the significance of being systematic and thorough during incident management. Discuss your processes for identifying root causes, documenting incidents, and implementing post-incident review meetings. Highlight any tools or practices you’ve used to keep stakeholders informed and improve incident resolution times.

What strategies would you use to reduce toil in a DevOps environment?

To effectively answer this, discuss various automation techniques you’ve implemented, like building CI/CD pipelines to eliminate repetitive tasks. Mention identifying bottlenecks and sharing feedback with cross-functional teams to ensure a streamlined workflow. Provide examples of how these changes improved operational efficiency.

What observability tools have you used, and how do they help maintain system health?

Discuss specific observability tools you've used, such as Prometheus or Grafana, and explain how they help monitor system health using metrics and logs. Provide an example of how implementing observability practices helped resolve a performance issue, illustrating your ability to interpret monitoring data effectively.

How do you ensure the security and compliance of the cloud infrastructure you manage?

Describe your approach to security, including conducting regular audits and integrating vulnerability scanning tools. Mention collaboration with security teams and adherence to regulations. Providing examples where you've identified and mitigated security risks helps showcase your proactive security mindset.

What key performance indicators (KPIs) do you think are essential for an SRE role?

Highlight KPIs that are crucial for assessing the performance and reliability of services, such as Mean Time to Recovery (MTTR), uptime percentages, and cost efficiency. Explain how you would use these metrics to drive further refinements and the rationale behind each choice.

Can you share a challenging technical problem you encountered and how you resolved it?

Be sure to illustrate a specific situation where you troubleshot a significant issue. Focus on your thought process, the steps you took to diagnose the problem, and what the outcome was. This question tests your problem-solving skills—make it engaging by explaining the impact of your solution.

How do you maintain up-to-date knowledge of industry trends and new technologies?

Talk about your approach to continuous learning—attending webinars, conferences, or online courses. Demonstrating your initiative to stay abreast of new developments in AWS, Kubernetes, and related technologies shows commitment to your professional growth.

How would you assist a customer in transitioning to a cloud-native environment?

Outline your approach for engaging with customers, assessing their needs, and planning the transition. Discuss your methods for providing guidance on containerization techniques and best practices. Sharing examples of previous successful migrations can enhance your answer.

What experience do you have with automation testing for your scripts?

In answering this question, emphasize the importance of writing maintainable scripts. Discuss tools you’ve used and practices for testing automation. Providing examples of how your testing ensured reliability will strengthen your response further.

Employment type: Contract, remote
Date posted: April 9, 2025
