Senior SRE Engineer
Remote
Job Description:
We are seeking a highly skilled Senior SRE Engineer with solid experience to help lead transformational initiatives within IT operations, including development. In this role, you will help design and implement cutting-edge SRE solutions, driving IT operations organizations to adopt an engineering-centric approach.

Responsibilities:
• Be well versed in SRE parameters, key metrics, and transformation steps.
• Knowledge of transforming traditional support into SRE is a great advantage.
• Experience working in large-scale production environments with ITIL and SRE processes, and a good understanding of ticket management.
• Strong understanding of Agile/Waterfall/Scrum/Kanban and of leading SRE deliverables.
• Evangelize SRE evolution within IT operations and promote a culture of engineering excellence and best practices.
• Collaborate with development teams on resiliency to ensure that services and applications are designed with operational reliability in mind.
• Implement monitoring systems to assess the performance of applications and infrastructure, and proactively identify areas for optimization.
• Understand incident and problem management processes and post-mortems, and drive improvements to prevent future incidents.

Qualifications:
• Around 8-10 years of hands-on SRE experience with cloud technologies, development, SRE toolsets, and automation.
• Strong hands-on experience with a cloud platform (AWS): Control Tower, project setup, account creation, RDS, SSO.
• Solid understanding of and hands-on experience with Docker/Kubernetes.
• Good experience with Linux commands, GitLab CI/CD setup, and Terraform (state management, etc.).
• Monitoring and alerting setup experience with Splunk, Prometheus, Grafana, Kibana, ELK, etc.
• Hands-on experience with APM tools, preferably Datadog, AppDynamics, or Dynatrace.
• Good understanding of an observability framework that leverages programmatic SLI/SLO blueprints to standardize the collection of golden signals.
• Automation experience (data refresh, releases, DB snapshots) using Ansible or any scripting language.
• Experience with Groovy DSL, Java, Python, and YAML, and with microservices architecture.
• Good understanding of and hands-on experience with MQ and Kafka.
• Experience with databases (Oracle, MySQL).

Good to have:
• Any relevant professional certification: Certified Site Reliability Engineer (CSRE), Certified Kubernetes Administrator (CKA), AWS Certified DevOps Engineer Professional, Google Cloud Professional Cloud DevOps Engineer.

Key Skills:
AWS/GCP/Azure, Terraform, Vault, Datadog/Dynatrace/AppDynamics, GitLab, Python, Bash, Postgres, SQL, Linux, Windows, Docker & Kubernetes (EKS), Java Spring Boot knowledge, networking, security, microservices architecture, Kafka/RabbitMQ, SRE best practices, observability. (Incident management is a must.)