Job details

Site Reliability Engi

Dice is the leading career destination for tech experts at every stage of their careers. Our client, Tektree Systems Inc., is seeking the following. Apply via Dice today!

Role: Site Reliability Engineer (SRE)
Client: Equifax
Location: Alpharetta, GA (DAY1 Onsite - F2F interview)
Note: Candidates with prior experience at Equifax are preferred

Job Description

Seeking an experienced Site Reliability Engineer who can operate independently with limited guidance and oversight. This individual will be passionate about end-user experience and will be part of a tight-knit, distributed engineering team developing and delivering a comprehensive data operations management solution for Equifax's Data Fabric Platform. SRE is a critical role across the entire SDLC, from coding and scaling to ensuring production stability, which includes responding to on-call incidents.

Data Fabric is a modern, cloud-native data management platform on Google Cloud Platform that allows Equifax to acquire and curate data, provide entity resolution, and ingest data into a single environment. It is deployed globally in multiple regions, is highly secured, and complies with regional and internal regulatory controls under strict governance and oversight.
Business units, data scientists, and many other stakeholders use APIs to consume data managed by the Data Fabric and operate data exchanges to monetize data through B2B and B2C channels.

The data operations management solution consists of:
A web portal UI/UX that provides a single point of access to all data management and data reliability engineering
A suite of backend API services that serves the UI and integrates with low-level Data Fabric and other third-party system APIs
A modern data lakehouse (data lake, data warehouse, batch and streaming ELT pipelines)

The data operations roadmap envisions a set of rich management capabilities including:
Serving a large community of geographically dispersed data operations stakeholders
Data quality and observability management to detect, alert on, and prevent data anomalies
Troubleshooting, triaging, and resolving data and data pipeline issues
OLAP, batch and streaming big data processing, and BI reporting
MLOps
Real-time dashboards, alerting and notifications, case management, user/group management, AuthZ, and many other foundational capabilities

Tech Stack:
Frontend: Angular 17+, JavaScript, TypeScript, HTML, SCSS, Webpack Module Federation, Tailwind CSS, Angular Material, Angular Elements
Backend: Java (JDK 17+), Spring Framework 6.X.X, Spring Boot 3.X.X, NestJS 10.X.X, REST and GraphQL microservices, NodeJS
Tools & Frameworks: Nx build management, Monorepo architecture, Jenkins CI/CD, Fortify, Sonar, GitHub
Cloud & Data: Google Cloud Platform (GKE, Composer + Airflow, Dataflow + Apache Beam, BigQuery, Bigtable, Firestore, GCS, Pub/Sub, Vertex AI), Terraform, Helm Charts, GitOps
Other Technologies: WebSockets, SSE, event-driven architecture

Environment:
Culture: Fast-paced, creative, results-oriented
Team Structure: Agile, working in 2-week sprints using Aha and Jira for project management
Expectations: Self-starters who can work independently with limited guidance, delivering solutions that end-users value and love

General Responsibilities:
Contribute to Development Activities: The SRE is expected to participate in SDLC activities, including design, development, testing, deployment, and operations, covering both frontend and backend
Cross-Functional Work: Collaborate with global teams to integrate with existing internal systems and Google Cloud Platform
Issue Resolution: Triage and resolve product or system issues, ensuring quality and performance
Documentation: Write technical documentation, support guides, and runbooks
Agile Practices: Participate in sprint planning, retrospectives, and other agile activities
Compliance: Ensure software meets secure development guidelines and engineering standards

SRE Accountability:
General: Use coding, automation, and software engineering principles to ensure scalability, performance, and reliability efficiently and without toil
IAC: Build infrastructure as code (IAC) patterns that meet security and engineering standards using one or more technologies (Terraform, scripting with cloud CLI, and programming with cloud SDK)
CI/CD: Build CI/CD pipelines for build, test, and deployment of application and cloud architecture patterns, using platform (Jenkins) and cloud-native toolchains
Automation: Build automated tooling to deploy service requests that push changes into production. Build comprehensive, detailed runbooks to manage, detect, remediate, and restore services
Change Management: Work closely with the dev team to ensure all DevSecOps issues are addressed in a timely manner, in compliance with Equifax security policies and in adherence to the Engineering Handbook
Incident Management: Solve problems and triage across complex distributed architecture service maps. Be on call for high-severity application incidents and improve runbooks to reduce MTTR
RCA and Postmortem: Lead root cause analyses and blameless postmortems, and own the call to action to prevent recurrences
Customer Focus: Address service disruptions and downtime, ensuring end-customer needs are met, and drive processes for a flawless customer experience
Reliability and Availability: Ensure SRE golden signals are monitored and that SLOs, SLIs, and SLAs are honored within error budgets. Work closely with devs, QE, POs, and other stakeholders, providing continuous feedback on uptime, scalability, and reliability, and influence best practices with the aim of providing excellent operational experiences
Reliability Roadmap: Own the reliability roadmap by taking a holistic view of all data operations management capabilities, including participating in Production Readiness Reviews (PRR) and working with stakeholders to ensure DR plans are in place

Must-Have Skills:
General Experience: 5-7 years of experience in software engineering, systems administration, database administration, and networking. System administration skills, including automation and orchestration of Linux/Windows using Terraform, Chef, Ansible and/or containers (Docker, Kubernetes), and shell scripting
Cloud-Native Application Development: 3+ years. Solid experience with developing and supporting cloud-native applications. Experience with cloud-based security: IAM, AuthZ
End-User Application Experience: 3+ years of experience as an SRE supporting an end-user-facing application, e.g., a web/mobile/desktop app that includes UI, APIs, and backend systems
Development Experience: 2+ years of general proficiency with Java or JavaScript/NodeJS
Frontend Experience: Experience with Angular, JavaScript, TypeScript, or modern web application development frameworks
Architecture Knowledge: Understanding of modular systems, performance, scalability, security
Agile Experience: Agile development mindset and experience
Service-Oriented Architecture: Knowledge of RESTful web services, JSON, Avro
Application Troubleshooting: Debugging, performance tuning, production support
Documentation Skills: Strong written and verbal communication
General SDLC: Experience with CI/CD concepts and tools, including Jenkins/Bamboo, and with release management concepts. Understanding of Google Cloud Platform services related to big data, such as BigQuery, Dataflow, Pub/Sub, GCS, and Composer/Airflow, or similar solutions in AWS: Redshift, SNS, SQS, S3, Kinesis, and others

Nice-to-Have Skills:
Big Data Processing: ETL/ELT experience
Scripting Languages: Groovy, Python
Cloud Certification: Relevant certifications in cloud technologies
DICE Glassdoor Company Review: 2.7
DICE DE&I Review: No rating
CEO of DICE: Phillip Hutcheon

Average salary estimate

Estimate provided by employer: $160,000 / annual (est.)
Min: $121K
Max: $199K

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

OUR MISSION At DICE, our mission is to get people out more, so we built a curated platform that connects a global community of fans to personalised, high-quality live experiences in the easiest way possible. OUR VALUES Company values are often f...

665 jobs
Employment type: Full-time, on-site
Date posted: September 2, 2024
