We partner with the world’s most valuable brands to build digital solutions that transform businesses. As a digital native, we bring a 29-year track record of accelerating business impact through complete and scalable digital solutions. With a global presence of 6,000+ professionals in strategy, research, data science, design, and engineering, we unlock top-line growth, improve customer experience, and drive operational efficiency.

We are looking for a Site Reliability Engineer (SRE) with expertise in IoT, Big Data, streaming services, and mainly Kafka and AWS.

This role is vital for ensuring the reliability, scalability, and performance of our systems, particularly in environments involving IoT, Big Data, streaming services, and Apache Kafka, all built on top of AWS services. You will be responsible for observability, monitoring, and maintaining data quality across our platforms.

Responsibilities:
Design, implement, and maintain observability and monitoring solutions for complex IoT, Big Data, and streaming environments on AWS.
Develop and maintain dashboards, alerts, and metrics to ensure the health and performance of our systems.
Implement and enforce data quality standards and practices.
Collaborate with development and operations teams to improve system reliability and performance.
Automate repetitive tasks and improve system efficiency through scripting and tooling.
Document processes, architectures, and workflows.
Ensure security best practices are followed across all environments.
Provide technical support and troubleshooting for production issues.
Conduct root cause analysis and implement preventive measures to avoid future incidents.

Requirements:
Minimum of 5 years of experience in a Site Reliability Engineering (SRE) or related role.
Advanced/fluent English communication skills (reading, writing, and speaking).
Proven experience with IoT platforms, Big Data technologies, and streaming services.
Strong expertise in monitoring Apache Kafka clusters.
Proven experience with AWS services including MSK.
Proficiency with observability and monitoring tools such as Prometheus, Grafana, ELK stack, or similar.
Experience with infrastructure-as-code tools such as Terraform or CloudFormation.
Strong scripting skills in languages such as Python, Bash, or PowerShell.
Solid understanding of CI/CD pipelines and tools.
Experience with data quality management practices and tools.

Nice to Have:
AWS certifications (e.g., AWS Certified Solutions Architect, AWS Certified DevOps Engineer).
Familiarity with containerization technologies like Docker and Kubernetes.
Knowledge of data engineering principles.
Experience with security best practices and compliance standards.
Experience working with international clients.
Knowledge of FinOps as applied to cloud infrastructure.
Bachelor’s degree in Computer Science, Information Technology, or a related field.

CI&T is an equal-opportunity employer. We celebrate and appreciate the diversity of our CI&Ters’ identities and lived experiences. We are committed to building, promoting, and retaining a diverse, inclusive, and equitable company and culture focused on creating a better tomorrow.

At CI&T, we recognize that innovation and transformation only happen in diverse, inclusive, and safe work environments. Our teams are most impactful when people from all backgrounds and experiences collaborate to share, create, and hear ideas.

Original job (Job-16814): Site Reliability Engineer (SRE) with Kafka, Brazil, posted on GrabJobs.