Lead Site Reliability Engineer

Job Description

The Lead Site Reliability Engineer is a hands-on role that provides mentorship to other team members on core SRE principles and tools. The Lead SRE will participate in the end-to-end operational aspects of the production environment, work on cloud systems, networks, and databases, and help drive incident lifecycle management. As a member of the SRE team, you will also work closely with the Architects, DevOps, Product, and development teams to ensure we get the most out of the software on the AWS platform. This role requires a highly skilled technology professional with excellent communication skills, a strategic mindset, and strong analytical and troubleshooting skills on the AWS Cloud Platform.

Other responsibilities include working with internal business partners to gather requirements, prototyping, architecting, implementing/updating solutions, building and executing test plans, performing quality reviews, managing operations, and triaging and fixing operational issues. Site Reliability Engineers must be able to adjust to constant business change; common types of changes include new requirements, evolving goals and strategies, and emerging technologies.

About the Role:

  • Be hands-on and provide mentorship to a growing SRE team on core SRE principles and tools.
  • Foster an automation-first mindset in issue resolution: automate everything possible, and involve people only when automation cannot resolve an issue.
  • Lead efforts for updating production with new versions/infrastructures as they are available
  • Lead capacity planning efforts in collaboration with Architects and DevOps engineers to determine changes to infrastructure that are needed to support new load and performance characteristics
  • Lead engagement with software developers, DevOps and other infrastructure engineers to integrate software development and delivery from inception to full operation, ensuring robust released software and systems.
  • Ensure the highest level of uptime to meet the customer SLA by implementing system-wide corrections to prevent recurrence of issues.
  • Mentor other SRE team members to further develop their soft and hard skills
  • Triage, troubleshoot and resolve issues using golden signals.
  • Go beyond golden signals with additional practices such as chaos engineering to detect failure points, synthetic monitoring, and leading game days that test the team's resiliency in incident response and remediation.
  • Lead SRE team members in creating and maintaining recovery procedures and RCAs in collaboration with other engineering teams.
  • Ensure Incidents assigned to the team are being managed within agreed SLAs
  • Ensure alarms are documented in up-to-date Knowledge Base articles.
  • Ensure Production infrastructure is up to date with server/security patches and certificates.
  • Continuous improvement of system and application monitoring and automation
  • Identify and automate manual workarounds and process improvements
  • Proactively monitor the availability, latency, scalability and efficiency of all services
  • Perform periodic on-call duty as part of the SRE team

About You:
 

  • Skilled with cloud operations/administration in Amazon AWS.
  • Tax/Accounting domain experience
  • Bachelor’s or Master’s degree in a Computer Science discipline.
  • 5+ years’ experience focussed on Site Reliability Engineering or a related position on the AWS Cloud Platform.
  • At least 2 AWS certifications are a must (AWS SysOps Administrator and Solutions Architect certifications preferred).
  • Experience working with SQL, Windows Servers, Load balancers, Linux
  • Deep experience with AWS, Docker and Kubernetes, CloudFormation, CloudWatch, CodeDeploy, DynamoDB, Lambda, SQS, Amazon FSx, Elasticsearch and networking concepts is a must.
  • Program at a high level in at least one language such as Java, C#, JavaScript, Python or Ruby.
  • Integration experience with PagerDuty, ServiceNow, Datadog, CloudWatch.
  • Good understanding of Site Reliability Engineering (SRE) philosophies, technologies, platforms and tools, SLO management, incident resolution, and automation
  • Ability to explain technical concepts in clear, non-technical language
  • Working knowledge of infrastructure components (e.g. routers, load balancers, cloud products, container systems, compute, storage, and networks)
  • Knowledge of security and compliance standards such as SOC/PCI is a plus


    What’s in it For You?

    Join us to inform the way forward with the latest AI solutions and address real-world challenges in legal, tax, compliance, and news. Backed by our commitment to continuous learning and market-leading benefits, you’ll be prepared to grow, lead, and thrive in an AI-enabled future. This includes:

    • Industry-Leading Benefits: We offer comprehensive benefit plans to include flexible vacation, two company-wide Mental Health Days off, access to the Headspace app, retirement savings, tuition reimbursement, employee incentive programs, and resources for mental, physical, and financial wellbeing.

    • Flexibility & Work-Life Balance: Flex My Way is a set of supportive workplace policies designed to help manage personal and professional responsibilities, whether caring for family, giving back to the community, or finding time to refresh and reset. This builds upon our flexible work arrangements, including work from anywhere for up to 8 weeks per year, and hybrid model, empowering employees to achieve a better work-life balance.

    • Career Development and Growth: By fostering a culture of continuous learning and skill development, we prepare our talent to tackle tomorrow’s challenges and deliver real-world solutions. Our skills-first approach ensures you have the tools and knowledge to grow, lead, and thrive in an AI-enabled future.

    • Culture: Globally recognized and award-winning reputation for inclusion, innovation, and customer-focus. Our eleven business resource groups nurture our culture of belonging across the diverse backgrounds and experiences represented across our global footprint.

    • Hybrid Work Model: We’ve adopted a flexible hybrid working environment (2-3 days a week in the office depending on the role) for our office-based roles while delivering a seamless experience that is digitally and physically connected.

    • Social Impact:  Make an impact in your community with our Social Impact Institute. We offer employees two paid volunteer days off annually and opportunities to get involved with pro-bono consulting projects and Environmental, Social, and Governance (ESG) initiatives.

    



    Do you want to be part of a team helping re-invent the way knowledge professionals work? How about a team that works every day to create a more transparent, just and inclusive future? At Thomson Reuters, we’ve been doing just that for almost 160 years. Our industry-leading products and services include highly specialized information-enabled software and tools for legal, tax, accounting and compliance professionals combined with the world’s most global news services – Reuters. We help these professionals do their jobs better, creating more time for them to focus on the things that matter most: advising, advocating, negotiating, governing and informing.

    We are powered by the talents of 26,000 employees across more than 70 countries, where everyone has a chance to contribute and grow professionally in flexible work environments that celebrate diversity and inclusion. At a time when objectivity, accuracy, fairness and transparency are under attack, we consider it our duty to pursue them. Sound exciting? Join us and help shape the industries that move society forward. 

    Accessibility 

    As a global business, we rely on diversity of culture and thought to deliver on our goals. To ensure we can do that, we seek talented, qualified employees in all our operations around the world regardless of race, color, sex/gender, including pregnancy, gender identity and expression, national origin, religion, sexual orientation, disability, age, marital status, citizen status, veteran status, or any other protected classification under applicable law. Thomson Reuters is proud to be an Equal Employment Opportunity/Affirmative Action Employer providing a drug-free workplace.

    We also make reasonable accommodations for qualified individuals with disabilities and for sincerely held religious beliefs in accordance with applicable law.


    More information about Thomson Reuters can be found on https://thomsonreuters.com.

    Thomson Reuters Glassdoor Company Review: 4.1 out of 5 stars
    Thomson Reuters DE&I Review: no rating
    CEO of Thomson Reuters: Steve Hasker

    Average salary estimate

    $110,000 per year (est.)
    min: $100,000
    max: $120,000

    If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

    What You Should Know About Lead Site Reliability Engineer, Thomson Reuters

    Are you ready to step into a pivotal role as a Lead Site Reliability Engineer at Thomson Reuters in Hyderabad's Raheja Mindspace? In this hands-on position, you’ll be responsible for mentoring fellow team members while diving deep into the core principles of Site Reliability Engineering (SRE). You’ll engage in all aspects of our production environment, collaborating closely with architects, DevOps, and development teams to maximize our software capabilities on the AWS platform. A big part of your day will involve spearheading efforts to automate incident resolution and ensure our systems maintain the highest levels of uptime according to customer SLAs. Your work will require not just technical prowess, particularly with AWS, Docker, and Kubernetes, but also the ability to analyze complex scenarios and communicate effectively across different teams. If you have a knack for troubleshooting, a passion for continuous improvement, and enjoy leading teams through challenges, this role is tailor-made for you. You'll be involved in capacity planning, drafting quality reviews, automating processes and tools, and proactively managing operational issues. Our commitment to a flexible work environment and continuous learning will empower you to thrive and grow in your career while contributing meaningfully to our mission. So, if you're ready to make a real impact in a company that values innovation, diversity, and community, apply today!

    Frequently Asked Questions (FAQs) for Lead Site Reliability Engineer Role at Thomson Reuters
    What are the key responsibilities of a Lead Site Reliability Engineer at Thomson Reuters?

    As a Lead Site Reliability Engineer at Thomson Reuters, your primary responsibilities will include mentoring your SRE team on core SRE principles, automating issue resolution, leading capacity planning efforts, ensuring system reliability, and collaborating with cross-functional teams. You'll also be tasked with triaging operational issues and proactively managing the production environment to meet SLA requirements.

    What qualifications are needed for the Lead Site Reliability Engineer position at Thomson Reuters?

    To be considered for the Lead Site Reliability Engineer role at Thomson Reuters, candidates should have at least 5 years of experience in a relevant position focused on AWS Cloud Platform. A Bachelor's or Master's degree in Computer Science or a related discipline is essential, along with at least 2 AWS certifications such as AWS SysOps Administrator and Solutions Architect. Proficiency in languages like Java, Python, or C#, along with experience in SQL and cloud operations, is also required.

    How does the Lead Site Reliability Engineer role contribute to Thomson Reuters' mission?

    The Lead Site Reliability Engineer role significantly contributes to Thomson Reuters' mission by ensuring reliable and efficient software delivery on the AWS platform. By fostering automation, enhancing system monitoring, and mentoring a skilled team, you'll help streamline operations that empower professionals in legal, tax, and compliance sectors to make informed decisions and create time for high-value tasks.

    What tools and technologies are important for a Lead Site Reliability Engineer at Thomson Reuters?

    A successful Lead Site Reliability Engineer at Thomson Reuters should be well-versed in technologies such as AWS, Docker, Kubernetes, and various monitoring tools including CloudWatch and Datadog. Familiarity with SQL, Linux, version control systems, and integration tools will also be critical to perform effectively in this role.

    What is the work culture like for a Lead Site Reliability Engineer at Thomson Reuters?

    At Thomson Reuters, the work culture for a Lead Site Reliability Engineer embodies a commitment to continuous learning, innovation, and diversity. You'll be part of a collaborative environment that encourages flexibility and work-life balance, alongside initiatives like employee resource groups and career development opportunities, allowing you to thrive while making a meaningful impact.

    Common Interview Questions for Lead Site Reliability Engineer
    Can you describe your experience with AWS and how it relates to your role as Lead Site Reliability Engineer?

    When addressing your AWS experience in an interview, aim to highlight projects you’ve worked on that demonstrate your technical knowledge and operational expertise. Discuss how you utilized AWS services like EC2, AWS Lambda, and CloudFormation to enhance system reliability and performance. Mention specific metrics or achievements to quantify your impact.

    How do you approach incident management and ensure minimal downtime?

    In your response, emphasize your understanding of incident management frameworks such as ITIL or DevOps methodologies. Illustrate your strategic approach to incident response, including the use of monitoring tools, automation for issue detection, and establishing communication channels for quick resolution. Highlight an example from your past experience to reinforce your narrative.

    What strategies do you apply for capacity planning in cloud environments?

    Discuss how you analyze current resource utilization and forecast future needs based on historical data and business growth. Explain any tools you use for load testing and performance metrics, and how you collaborate with architects and developers to anticipate changes in capacity and make informed infrastructure decisions.

    How would you foster a culture of automation within your SRE team?

    Share your philosophy on automation and provide examples of successful initiatives you’ve led to reduce manual tasks. Describe how you encourage team members to identify opportunities for automation and the benefits it brings, such as increased efficiency, reduced errors, and improved job satisfaction. Mention specific tools you’ve implemented in past roles.

    What experience do you have with chaos engineering, and how do you implement it?

    When discussing chaos engineering, explain its purpose in increasing system resilience by intentionally introducing failures. Share your experience with conducting chaos engineering experiments, what metrics you tracked, and how these initiatives contributed to system improvements and team learning outcomes.

    Describe your experience mentoring other engineers in SRE principles.

    In your answer, detail your mentoring style, focusing on how you provide guidance and share knowledge through hands-on learning opportunities and constructive feedback. Provide illustrations of your mentoring success stories, such as team members who have advanced in their roles or improved their performance under your guidance.

    What tools do you use to monitor system performance and availability?

    Share specific tools you've leveraged for monitoring, such as Datadog, Prometheus, or AWS CloudWatch. Discuss how you set up alerts for critical metrics and use dashboards to provide visibility into system health, ensuring a proactive approach to maintaining uptime and responsiveness.

    How do you handle changes in the production environment?

    Discuss your approach to change management, emphasizing the importance of thorough testing, documentation, and communication of any changes made. Describe how you collaborate with development and operations teams to ensure smooth deployments and mitigate risks associated with changes.

    How do you stay current with the latest SRE practices and technologies?

    Describe the resources you utilize to keep your skills and knowledge sharp, such as industry blogs, conferences, online courses, and community forums. Illustrate how you actively apply new practices and technologies to your work, showcasing a commitment to continuous improvement.

    What steps do you take when investigating an incident?

    Outline your systematic approach for investigating incidents, including gathering logs, collaborating with team members, and using tools that provide insights into system behavior. Emphasize how you prioritize issues based on severity and impact, and ensure root cause analysis is thorough to prevent future incidents.

    Thomson Reuters (NYSE / TSX: TRI) informs the way forward by bringing together the trusted content and technology that people and organizations need to make the right decisions. The company serves professionals across legal, tax, accounting, compl...

    Badges: Changemaker, Diversity Champion, Family Friendly, Flexible Culture, Work&Life Balance
    Employment type: Full-time, hybrid
    Date posted: April 4, 2025
