Job details

Linux Site Reliability Engineer (REF3649Z)

Company Description

The largest ICT employer in Hungary, Deutsche Telekom IT Solutions (formerly IT-Services Hungary, ITSH), is a subsidiary of the Deutsche Telekom Group. Established in 2006, the company provides a wide portfolio of IT and telecommunications services and has more than 5000 employees. ITSH was awarded the Best in Educational Cooperation prize by HIPA in 2019, was acknowledged as one of the most attractive workplaces by PwC Hungary’s independent survey in 2021, and received the title of Most Ethical Multinational Company in 2019. The company continuously develops its four sites in Budapest, Debrecen, Pécs and Szeged and is looking for skilled IT professionals to join its team.

Job Description

We are seeking a skilled and motivated Linux Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a strong background in Linux system administration, automation, and cloud infrastructure, with a passion for building reliable and scalable systems. You will collaborate with development and operations teams to ensure our services are highly available, performant, and fault-tolerant.

  • Onboarding of New Customers: Ensure smooth deployment and operational readiness, document processes and provide initial support during the transition.
  • System Administration: Manage, monitor, and optimize Linux servers in production and development environments. Identify and resolve bottlenecks in application and system performance.
  • Automation: Develop and maintain infrastructure automation using tools like Ansible, Terraform, or similar. Create and maintain hardening and washing scripts (Ansible).
  • Performance Optimization: Diagnose and resolve performance bottlenecks at the OS, application, and network levels. Analyze system demands and plan for scaling.
  • Incident Management: Lead efforts to quickly resolve production incidents, conduct post-mortems, and implement solutions to prevent future occurrences.
  • Scalability: Work on infrastructure scalability and reliability for high-traffic services.
  • Collaboration: Partner with development teams to create CI/CD pipelines and integrate reliability practices into the development lifecycle. Coordinate changes with operations teams.
  • Security: Ensure system security through best practices in access control, patch management, and system hardening.

Qualifications

  • Operating Systems: Extensive experience with Linux distributions like RHEL, CentOS, or Ubuntu
  • Scripting: Proficiency in scripting languages like Bash, Python, or Ruby for automation 
  • Cloud Expertise: Familiarity with cloud platforms like AWS, Azure or GCP and containerization technologies like Docker or Kubernetes
  • Infrastructure as Code (IaC): Hands-on experience with tools such as Terraform, Ansible, or Chef
  • Networking: Solid understanding of networking protocols, DNS, load balancers, and firewalls
  • Version Control: Experience with Git or similar version control systems
  • Web Servers & Middleware: Good skills in configuring and managing Apache, Tomcat, JBoss and NGINX for production environments
  • Problem-Solving: Strong troubleshooting and debugging skills
  • Communication: Strong communication and teamwork abilities for cross-functional work. At least intermediate English language knowledge
  • Mindset: A mindset for optimizing and enhancing systems iteratively

Nice to have/preferred skills and experience

  • Exposure to high-availability architectures and disaster recovery strategies 
  • Certifications: RHCE, AWS Certified SysOps Administrator, or equivalent
  • Knowledge of monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or Datadog
  • Experience with WebSphere
  • German language knowledge

Additional Information

* Please note that remote work is available only within Hungary due to European taxation regulations.

What You Should Know About Linux Site Reliability Engineer (REF3649Z), Deutsche Telekom IT Solutions

Are you a Linux enthusiast looking for a new challenge? Join Deutsche Telekom IT Solutions as a Linux Site Reliability Engineer (SRE) and embark on an exciting journey in Hungary! This is the perfect opportunity for someone who thrives on creating reliable and scalable systems. In our vibrant teams across Budapest, Debrecen, Pécs, and Szeged, you'll collaborate closely with development and operations experts to ensure our services are available and performing at their best. With your skills in managing and optimizing Linux servers, and your knack for automation using tools like Ansible and Terraform, you will help us onboard new customers smoothly and ensure operational readiness. Your days will be filled with diagnosing performance bottlenecks, leading incident management efforts, and implementing solutions that improve our systems. At Deutsche Telekom IT Solutions, we highly value security, so your experience with best practices in system hardening will be vital. We’re not just offering a job; we’re inviting you to be part of a culture that celebrates innovative solutions and continuous improvement. If you have a solid background in scripting with languages like Bash or Python, familiarity with cloud platforms, and a passion for collaboration, this could be the role for you. Let's build the future of IT together!

Frequently Asked Questions (FAQs) for Linux Site Reliability Engineer (REF3649Z) Role at Deutsche Telekom IT Solutions
What are the primary responsibilities of a Linux Site Reliability Engineer at Deutsche Telekom IT Solutions?

As a Linux Site Reliability Engineer at Deutsche Telekom IT Solutions, your primary responsibilities will include managing and optimizing Linux servers, ensuring smooth deployment for new customers, developing infrastructure automation, and leading incident management efforts to resolve production issues efficiently. You will also collaborate with development teams to integrate reliability practices throughout the development lifecycle.

What qualifications are needed for the Linux Site Reliability Engineer position at Deutsche Telekom IT Solutions?

To qualify for the Linux Site Reliability Engineer position at Deutsche Telekom IT Solutions, you should possess extensive experience with Linux distributions (like RHEL, CentOS, or Ubuntu), scripting proficiency in Bash or Python, familiarity with cloud platforms such as AWS or Azure, and experience with infrastructure as code tools like Terraform or Ansible. Strong communication skills and at least intermediate English are also essential.

What tools and technologies should a Linux Site Reliability Engineer be familiar with at Deutsche Telekom IT Solutions?

A Linux Site Reliability Engineer at Deutsche Telekom IT Solutions should be familiar with various tools and technologies such as Ansible, Terraform, Docker, and Kubernetes. Knowledge of networking protocols, web servers like Apache or NGINX, and monitoring tools such as Prometheus or Grafana is also important.

What kind of work environment can a Linux Site Reliability Engineer expect at Deutsche Telekom IT Solutions?

At Deutsche Telekom IT Solutions, Linux Site Reliability Engineers can expect a collaborative and innovative work environment with multiple opportunities for professional growth. Working across various locations like Budapest, Debrecen, Pécs, and Szeged, you will be part of diverse teams passionate about technology and continuously improving IT services.

Can you work remotely as a Linux Site Reliability Engineer at Deutsche Telekom IT Solutions?

While Deutsche Telekom IT Solutions offers flexible work arrangements, remote working opportunities for the Linux Site Reliability Engineer role are available only within Hungary due to European taxation regulations. This enables you to enjoy a balanced work-life experience while contributing to a leading ICT employer.

Common Interview Questions for Linux Site Reliability Engineer (REF3649Z)
How do you approach incident management as a Linux Site Reliability Engineer?

When managing incidents, I prioritize quick resolution by conducting thorough investigations to identify root causes. Following an incident, I lead post-mortem discussions to analyze what went wrong and implement preventative measures, ensuring system reliability and minimizing recurrence.

Can you describe your experience with automation tools like Ansible or Terraform?

I have significant experience using Ansible to automate system configurations and deployments. With Terraform, I create and manage infrastructure as code, which enhances reproducibility and streamlines the infrastructure management process across various projects.

What strategies do you use to optimize system performance?

To optimize system performance, I analyze resource utilization metrics and identify bottlenecks using monitoring tools. I then implement configuration adjustments and scaling solutions, such as load balancing or resource provisioning, to ensure high availability and responsiveness.

How do you ensure security in your Linux environments?

I ensure security in my Linux environments by implementing best practices such as access control, regular patch management, and system hardening. I also perform regular audits and maintain updated security documentation to safeguard the infrastructure.

What experience do you have with containerization technologies like Docker?

I have hands-on experience with Docker for containerization, enabling lightweight application deployment and orchestration. This experience includes creating Docker images, managing container lifecycles, and collaborating on orchestration platforms such as Kubernetes.

How do you handle collaboration with development teams?

I believe in maintaining open lines of communication with development teams. I engage with them early in the development process to integrate SRE practices into CI/CD pipelines, aligning systems operations with development goals and fostering a collaborative culture.

What scripting languages do you use for automation, and how have they benefited you?

I primarily use Bash and Python for automation tasks. These scripting languages have allowed me to streamline workflows, reduce manual configurations, and enhance deployment efficiency, ultimately saving time and minimizing errors.

How do you stay current with industry trends and technologies?

To stay current with industry trends, I regularly participate in online courses, attend webinars, and engage with professional communities on platforms like GitHub and Stack Overflow, which allows me to learn from peers and stay updated on new tools and best practices.

Can you share an experience where you successfully improved a process?

In my previous role, I identified a manual deployment process that was prone to errors. I proposed and led a project to automate this process using Ansible, resulting in a 50% reduction in deployment time and significantly fewer issues in production.

What is your experience with cloud platforms like AWS or Azure?

I have worked extensively with AWS, utilizing various services such as EC2 for compute, S3 for storage, and RDS for database management. This experience has enabled me to design scalable, resilient architectures that meet business needs and leverage cloud capabilities effectively.


Founded in 1995, Deutsche Telekom AG is a Germany-based integrated telecommunications provider, offering its customers around the world a portfolio of services in the areas of telecommunications and information technology.

Employment type: Full-time, hybrid
Date posted: December 15, 2024