
Observability Architect

Company Description

Alter Solutions Portugal is an IT consultancy that promotes digital transformation. It is part of the Alter Solutions Group, founded in Paris in 2006.

In 2022, Alter Solutions joined the act digital group, constituting a global community of talent in Technology, with presence in twelve countries: Germany, Belgium, Brazil, Canada, United States of America, Morocco, Spain, France, Luxembourg, Poland, Portugal and Serbia. Also in 2022, we were certified as a Great Place to Work©.

In Portugal, we have over 120 clients and a team of over 500 people, working on projects for industries as diverse as banking, insurance, transportation, aviation, energy, and telecom.

As the headquarters of the Nearshore IT center, Alter Solutions Portugal has a dedicated team of around 30 specialized professionals, integrated into projects with several internationally renowned clients.

Job Description

An Observability Architect is the professional responsible for defining, designing, and implementing an organization's systems observability strategy. The focus is on ensuring that applications, infrastructure, and services can be monitored efficiently, enabling proactive problem detection, fast troubleshooting, and continuous performance improvement.

Main responsibilities:

  1. Design the observability architecture:
    • Define how logs, metrics, and traces will be collected, stored, and visualized.
    • Choose suitable tools (such as Prometheus, Grafana, OpenTelemetry, Elastic Stack, Datadog, and New Relic).
    • Integrate solutions with CI/CD pipelines and cloud infrastructure (AWS, Azure, GCP).
  2. Implement instrumentation standards:
    • Ensure applications are properly instrumented to generate relevant metrics, logs, and traces.
    • Work with software engineers and SREs to define good observability practices in code.
  3. Define KPIs and SLIs/SLOs:
    • Work with product, DevOps, and business teams to map indicators that reflect system health (e.g., latency, availability, errors, throughput).
  4. Automate and scale observability:
    • Create automations for onboarding new services onto the observability stack.
    • Develop effective dashboards and alerts that avoid noise (alert fatigue).
  5. Foster a culture of observability:
    • Educate teams on the importance of observability for reliability and performance.
    • Lead continuous-improvement initiatives for system visibility.

Common skills and knowledge:

  • Tools: Prometheus, Grafana, Loki, Jaeger, OpenTelemetry, Elastic Stack, Datadog, New Relic, Splunk, etc.
  • Concepts: Telemetry, distributed tracing, metrics, structured logs, SRE, DevOps, SLIs/SLOs/SLAs.
  • Languages and infrastructure: Experience with containers (Docker, Kubernetes), CI/CD pipelines, APIs, and possibly programming (Go, Python, Java).
  • Cloud and automation: AWS CloudWatch, Azure Monitor, Terraform, Ansible, etc.

 

This is a hybrid position based in Porto.

Alter Solutions Glassdoor Company Review: 3.7 stars
Alter Solutions DE&I Review: 3.7 stars
CEO of Alter Solutions: Louis Vachette

Average salary estimate

$70,000 / year (est.)
Min: $60,000
Max: $80,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Observability Architect, Alter Solutions

If you're looking to make a significant impact in your career as an Observability Architect with Alter Solutions Portugal, this is your chance! Based in the vibrant city of Porto, you’ll be stepping into a role that is central to defining, designing, and implementing observability strategies across various systems. Your mission? To ensure that applications, infrastructures, and services are efficiently monitored, enabling proactive problem detection and swift troubleshooting. At Alter Solutions, you’ll get to collaborate with top-notch professionals and work with a vast range of tools and technologies like Prometheus, Grafana, and more. You'll craft the architecture for collecting logs, metrics, and traces while integrating solutions into cloud infrastructures like AWS and Azure. Plus, you’ll play a crucial part in setting up automation that streamlines the onboarding of new services in the observability stack. With the chance to foster a culture where observability is prioritized for reliability and performance throughout the organization, your influence on continuous improvement will be invaluable. If you have experience with tools such as OpenTelemetry and an understanding of fundamental concepts like SLIs and SLOs, this opportunity could be perfect for you! So, if you’re eager to lead initiatives and help teams grasp the importance of observability, Alter Solutions Portugal offers not just a position but an exciting career journey in a Great Place to Work© environment.

Frequently Asked Questions (FAQs) for Observability Architect Role at Alter Solutions
What does an Observability Architect do at Alter Solutions Portugal?

At Alter Solutions Portugal, the Observability Architect is tasked with defining and implementing a comprehensive observability strategy for the organization. This role involves designing how logs, metrics, and traces are collected, stored, and visualized, alongside selecting appropriate tools like Prometheus and Grafana. Additionally, the architect collaborates with software engineers to ensure applications are properly instrumented for optimal observability.

What skills are required for the Observability Architect position at Alter Solutions Portugal?

To excel in the Observability Architect role at Alter Solutions Portugal, candidates should possess strong knowledge of observability tools such as Datadog and Elasticsearch, along with a deep understanding of telemetry, metrics, and tracing concepts. Familiarity with CI/CD pipelines and cloud services like AWS or Azure, as well as programming experience in languages like Go or Python, would be a significant advantage.

How can an Observability Architect contribute to continuous improvement at Alter Solutions Portugal?

An Observability Architect at Alter Solutions Portugal can drive continuous improvement by designing scalable observability solutions that enhance visibility and monitoring of systems. This includes developing effective dashboards, defining KPIs, and fostering a culture of observability across teams, educating them about its impact on system reliability and performance.

What tools will the Observability Architect use at Alter Solutions Portugal?

In the role of Observability Architect at Alter Solutions Portugal, you will work with an array of powerful tools including Prometheus, Grafana, Elastic Stack, and Datadog. These tools help in monitoring applications effectively and collecting relevant metrics that reflect system health, essential for driving reliability.

Is remote work an option for the Observability Architect role at Alter Solutions Portugal?

Yes, the Observability Architect position at Alter Solutions Portugal is hybrid, which means you have the flexibility to work both on-site in Porto and remotely. This allows for a better work-life balance while still being an integral part of the team.

Common Interview Questions for Observability Architect
Can you explain the importance of observability in modern software architecture?

Observability is crucial in modern software architecture as it allows engineers to gain insights into system performance and health. It ensures that teams can detect issues proactively, leading to faster troubleshooting and improved user experiences. When preparing to answer this question, highlight the link between observability and system reliability or performance metrics.

What strategies would you employ to ensure effective observability across diverse systems?

To ensure effective observability, I would implement standardized practices for logging, monitoring, and tracing. This includes defining clear metrics, leveraging observability tools like Grafana and Prometheus, and integrating these solutions into CI/CD pipelines. Focus your answer on collaborative efforts with engineering teams to promote shared observability goals.

How do you define and implement SLIs and SLOs in your observability strategy?

Defining SLIs (Service Level Indicators) and SLOs (Service Level Objectives) starts with identifying key metrics that reflect system performance, such as latency and error rate. In my strategy, I ensure these indicators are aligned with business goals, working closely with product and business teams for consensus. Discuss the importance of continuous monitoring against these metrics in your response.
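
To make the arithmetic behind this answer concrete, here is a minimal sketch in Python of how an availability SLI can be compared against an SLO and expressed as error-budget consumption. The function name, the request counts, and the 99.9% target are hypothetical values chosen for illustration; they do not come from the job posting or from any specific tool.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float              # observed fraction of good requests
    error_budget: float     # allowed fraction of bad requests (1 - target)
    budget_consumed: float  # share of the error budget already used


def evaluate_availability_slo(good: int, total: int, target: float) -> SloReport:
    """Compare an availability SLI (good / total) against an SLO target."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    sli = good / total
    error_budget = 1.0 - target
    # A value above 1.0 means the SLO has been breached for this window.
    budget_consumed = (1.0 - sli) / error_budget
    return SloReport(sli, error_budget, budget_consumed)


# Hypothetical window: 100,000 requests, 50 of them failed, 99.9% target.
report = evaluate_availability_slo(good=99_950, total=100_000, target=0.999)
print(f"SLI={report.sli:.4%}, budget consumed={report.budget_consumed:.0%}")
```

In an interview, a small calculation like this can help show how an SLO target translates into a concrete allowance for failures that product and business teams can agree on.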

What tools have you previously used for observability and how did they help your projects?

In my previous roles, I have effectively utilized tools like Elastic Stack and Datadog, which provided comprehensive visibility into application performance through advanced metrics and logging capabilities. Describe a specific project where these tools enhanced your ability to troubleshoot and improve system uptime.

How would you train teams on best practices in observability?

Training teams on observability best practices involves interactive sessions and hands-on workshops that highlight the importance of instrumenting applications correctly. I would provide resources and real-life examples, along with ongoing support to foster a culture of observability that emphasizes collaboration and knowledge sharing to improve overall system reliability.

What role does automation play in observability as you see it?

Automation is critical in observability, as it helps streamline the onboarding of new services and reduces potential human error in monitoring setups. I would emphasize using tools like Terraform to automate infrastructure as code for observability setups, ensuring consistency and efficiency. Highlight specific automation strategies you’ve used in past projects.

Can you describe a time when you helped improve observability in a previous project?

In my previous role, I identified gaps in our observability practices that had caused a 20% increase in incident response times. By implementing additional monitoring tools and refining our logging practices, I significantly improved our ability to detect issues early and respond more effectively. Use this opportunity to provide metrics or outcomes to demonstrate your impact.

What metrics do you consider critical for monitoring system health?

Critical metrics for monitoring system health typically include latency, error rates, and system throughput. I would focus on establishing a balanced view of these metrics to provide insights into both performance and reliability. Emphasize the importance of selecting metrics based on business requirements in your response.

How do you approach integrating observability solutions with existing systems?

Integrating observability solutions with existing systems requires a thoughtful approach, often involving assessing current monitoring gaps and choosing tools that align with technology stacks already in use. I prioritize gradual integration, continuously testing components, and ensuring minimal disruption. Offer examples of how you’ve successfully managed integrations in the past.

What is your process for troubleshooting performance issues in highly distributed environments?

My process for troubleshooting in distributed environments begins with pinpointing which service or component is underperforming. I utilize tools like Jaeger for tracing request paths and analyzing latencies across services. Detail how you combine observability data with team collaboration during troubleshooting efforts.

Employment type: Full-time, hybrid
Date posted: April 16, 2025
