We are seeking an experienced Observability SME with deep expertise in observability architectures and leading monitoring platforms. This role will be responsible for designing, implementing, and optimizing end-to-end observability solutions for applications, infrastructure, and networks. The ideal candidate will have extensive hands-on experience with platforms such as ELK (Elasticsearch, Logstash, Kibana), Dynatrace, BMC TrueSight, and SolarWinds, ensuring seamless monitoring, alerting, and analytics to enhance IT operations and service reliability.
Key Responsibilities:
· Observability Strategy & Architecture: Design and implement comprehensive observability solutions to monitor applications, infrastructure, and network performance.
· Monitoring Tool Implementation & Optimization: Deploy and fine-tune monitoring solutions using ELK, Dynatrace, BMC TrueSight, and SolarWinds.
· Log Management & Analysis: Establish centralized logging, log parsing, and correlation for improved event detection and troubleshooting.
· Metrics & Performance Monitoring: Define KPIs, dashboards, and alerts for proactive IT service monitoring.
· Incident Management & Root Cause Analysis: Collaborate with IT operations, DevOps, and SRE teams to diagnose and resolve performance issues.
· Automation & Integration: Integrate monitoring tools with ITSM platforms, AIOps solutions, and automation frameworks for enhanced efficiency.
· Capacity Planning & Optimization: Analyze historical trends and real-time data to optimize resource allocation and performance.
· Stakeholder Collaboration: Work closely with developers, network engineers, system administrators, and business units to ensure observability best practices are followed.
· Continuous Improvement: Stay updated on emerging observability technologies and recommend improvements to existing processes and tools.
Qualifications:
· Expertise in Observability & Monitoring Platforms: 8+ years of hands-on experience with ELK Stack, Dynatrace, BMC TrueSight, SolarWinds, and similar platforms.
· Strong Knowledge of Infrastructure & Application Monitoring: Experience monitoring cloud, on-premises, and hybrid environments.
· Experience with Log & Event Correlation: Ability to configure and analyze logs for anomaly detection and security insights.
· Automation & Scripting: Proficiency in scripting languages such as Python, PowerShell, or Bash for automation.
· Cloud & DevOps Understanding: Experience with cloud platforms (AWS, Azure, GCP) and CI/CD pipelines.
· ITIL & Incident Management Exposure: Understanding of ITIL processes and IT service management (ITSM) practices.
· Networking & Security Awareness: Knowledge of network monitoring, SNMP, and security monitoring practices.
· Excellent Communication & Documentation Skills: Ability to present findings, create technical documentation, and train teams on observability best practices.
Preferred Qualifications:
· Certifications in Dynatrace, ELK, BMC TrueSight, or SolarWinds.
· Experience with AIOps, Machine Learning for Anomaly Detection, or AI-driven Observability.
· Background in Site Reliability Engineering (SRE) or DevOps.
· Familiarity with Infrastructure as Code (IaC) tools such as Terraform or Ansible.
Are you ready to take your expertise to the next level? Join us as an Expert Observability Engineer! In this exciting role with our team, you'll design, implement, and optimize end-to-end observability solutions that will revolutionize how applications, infrastructure, and networks are monitored. Your hands-on experience with robust platforms like ELK, Dynatrace, BMC TrueSight, and SolarWinds will be crucial in creating seamless monitoring and alerting systems that enhance our IT operations and service reliability. You'll be working closely with developers, system administrators, and network engineers, collaborating through incident management and root cause analysis to ensure everything runs smoothly. As a key player in our strategy and architecture efforts, you'll also be responsible for capacity planning, automation, and continuous improvement of our existing processes. If you’re passionate about observability technologies and enjoy tackling challenges head-on, this is the perfect opportunity for you to thrive in a dynamic environment. We can't wait to see how you can contribute to our team's success and keep our systems running at their best!