Senior Site Reliability Engineer - Data (REMOTE)

The Discogs Platform team is focused on several objectives: building and supporting performant, cost-effective, reliable infrastructure; developer experience tooling and mentorship; and creating "golden paths" for organization-wide standards and velocity. As a key member of the Platform team, the Senior Site Reliability Engineer - Data will be working closely with other Discogs engineering squads to develop and optimize scalable, well-planned relational database architectures, drive best practices and stability for our use of Kafka and change data capture, and contribute to the Platform team’s operations.

Location

This is a remote position, open to candidates located in OR, WA, CA, CO, TX, and IL.

Compensation

Starting Base Salary Range: $130,000 - $140,000 yearly

 

Who We Are

We are dedicated to supporting a global community of music fans and collectors who share the value, culture, connection, and joy of record collecting. Fostering the exchange of knowledge, records, and curation, we help people help each other deepen their relationship with music. Leveraging the power of community, we are committed to enabling people to explore artists and their recorded works through the world's definitive music discography, stay informed with record collection and sales history data, get organized with specialized collection management tools, and stay connected to a global community of fellow record collectors and sellers. Providing this essential set of resources, tools, and access, we aim to unleash boundless opportunities for people to dig into the depths of their musical interests, build and fortify their record collections, cultivate and bridge communities, and elevate their connection to music and record collecting.

What You’ll Accomplish

Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

  • Stewarding Discogs’ data stores as a key subject matter expert
  • Leading efforts on the reliability and design patterns of our Kafka and Kafka Connect implementations
  • Establishing data contracts and clear communication standards between CDC producers and consumers
  • Working closely with engineering squads to refactor and re-architect MySQL database schema and indexing for long-term scalability, performance, and cost effectiveness
  • Mentoring engineering squads on Platform best practices for MySQL, Kafka, and other software development lifecycle areas 
  • Writing documentation and runbooks that contribute to the engineering organization’s knowledge base
  • Working in a containerized, orchestrated environment
  • Contributing to the Platform team’s disciplines of site reliability and operations, supporting both our squads and Platform’s central infrastructure
  • Participating in on-call rotation, responding to incidents, and troubleshooting data and other operations issues

What You’ll Contribute

Minimum Education and Experience

  • A Bachelor's Degree in Computer Science or similar area of focus, or equivalent relevant work experience.
  • 5+ years of experience working with Kafka and relational database management systems (RDBMS).
  • 6+ years of experience in Ops, DevOps, Site Reliability, Platform, or other systems roles.

Required Skills & Abilities:

  • Relational database schema design, query performance optimization, administration (MySQL, Percona Server, AWS RDS)
  • Kafka: Cluster administration (Strimzi), Kafka Connect (Debezium, JDBC)
  • CI/CD (GitHub Actions)
  • GitOps (ArgoCD)
  • Kubernetes (EKS, Kustomize, Karpenter, administration, application manifests)
  • AWS and cloud development (VPC, EKS, RDS, S3)
  • Observability (Datadog, Sentry)
  • Scripting (Shell, Python)
  • Track record of collaboration and mentorship
  • Excellent written communication and documentation skills
  • Continuous learning
  • Ownership and proactive approach to solving large problems

Preferred:

  • Infrastructure-as-code (Terraform)
  • Elasticsearch (ECK administration, scaling, performance)
  • Python (SQLAlchemy, FastAPI)
  • GraphQL (schema design, Apollo federation)
  • REST API
  • Hashicorp Vault
  • Redis
  • Memcached
  • NoSQL Database
  • Data Lake/Warehouse
  • Data Governance
  • Data Security

The Platform team covers a wide range of technical topics and we'd love to hear about your skills beyond this list!

What We Provide

    • Competitive compensation: salary, plus performance-related bonus program
    • 401(k) with employer match
    • 100% company-paid medical and dental insurance benefits for you and your dependents
    • 4 weeks paid vacation, increasing based on tenure
    • 18 weeks paid leave for birth moms
    • 8 weeks paid parental leave, including for adoption
    • Monthly wellness allowance
    • Annual professional and personal development allowance
    • Work from home office set-up and expense allowances
    • Flexible work location opportunities
    • Employer matching toward charitable contributions

What We Believe In

We're building a world idealized for record collectors, driven by community, and fueled by a shared passion for music. Through culture, information, and innovation, we strive to develop a complete ecosystem of resources to empower music lovers and entrepreneurs everywhere to engage more deeply in the joys and possibilities of record collecting. We foster a collaborative community dedicated to preserving the recording industry's past, present, and unfolding future by cataloging the world's complete, interconnected music discography. Leveraging the power of this dynamic knowledge base, we aim to innovate integrated technologies to empower music fans everywhere to embark on a boundless journey of music discovery and record collecting. We envision this to be the complete collecting journey.

Discogs is an Equal Opportunity Employer.

Applicants needing accommodation to apply should contact us at 503-597-6340.

Discogs does not promote job openings through text messaging. If you receive a text message claiming to offer a position at our company, please disregard it as fraudulent. For a list of our actively open positions and to apply, please visit the official Careers page on our website: https://www.discogs.com/about/careers

If you apply for this role, you will be required to upload a resume and cover letter and answer a few questions regarding your application. Once submitted, our hiring team will review your application and contact you if you are selected for an interview. Whether you are successful or not, we will store your application and data in our system for a maximum period of one year from the application date in case another role becomes available that you are suitable for. If you have any questions or concerns about us storing this data and/or the period of time, please contact us at legal@discogsinc.com and we will respond to you within 30 days.

Average salary estimate

$135,000 / yearly (est.)
Min: $130,000
Max: $140,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Senior Site Reliability Engineer - Data (REMOTE), Discogs

The Discogs team is on the hunt for an enthusiastic and skilled Senior Site Reliability Engineer - Data to join our vibrant Platform team! If you're passionate about building robust and efficient infrastructure while enhancing developer experiences, this opportunity is calling your name. You'll play a pivotal role in refining and optimizing our relational database architectures, primarily MySQL, and in strengthening our Kafka systems. Collaborating with various engineering squads, you'll ensure that our database designs are scalable, efficient, and cost-effective. Your expertise in Kafka and change data capture will be invaluable as you lead efforts to enhance stability and best practices across the organization. You'll also take the lead in creating clear data contracts and communication standards that promote seamless cooperation between CDC producers and consumers. At Discogs, we cherish knowledge sharing, and that's why mentoring your fellow engineers on Platform best practices will be part of your day-to-day activities. If you're ready to immerse yourself in a containerized, orchestrated environment and contribute to our central infrastructure while participating in on-call rotations, we’d love to meet you. Plus, because the position is remote, you have the flexibility to work from the comfort of your home, all while earning a starting base salary of $130,000 - $140,000 a year. We believe in supporting our team members both personally and professionally, offering a wealth of benefits and an environment where your passion for music and technology can thrive together.

Frequently Asked Questions (FAQs) for Senior Site Reliability Engineer - Data (REMOTE) Role at Discogs
What are the main responsibilities of the Senior Site Reliability Engineer - Data at Discogs?

As a Senior Site Reliability Engineer - Data at Discogs, you will take charge of stewarding our data stores, ensuring high reliability and scalability in our database architectures. You will also lead initiatives to optimize Kafka implementations, establish clear communication standards between CDC producers and consumers, and mentor engineering teams on MySQL and Kafka best practices, all contributing to our evolving technical landscape.

What qualifications do I need to apply for the Senior Site Reliability Engineer - Data position at Discogs?

To be considered for the Senior Site Reliability Engineer - Data role at Discogs, you should have a Bachelor's Degree in Computer Science or a similar area of focus, or equivalent relevant work experience, along with 5+ years of experience with Kafka and relational database management systems, and 6+ years in Ops, DevOps, Site Reliability, Platform, or other systems roles. Proficiency in MySQL, Kafka, CI/CD, Kubernetes, and scripting languages is essential.

What skills are essential for the Senior Site Reliability Engineer - Data role at Discogs?

Essential skills for the Senior Site Reliability Engineer - Data at Discogs include strong abilities in relational database design and optimization, Kafka cluster administration, CI/CD practices using GitHub Actions, and proficiency with Kubernetes and AWS. A knack for effective communication and documentation is also crucial to foster collaboration across teams.

What is the work environment for a Senior Site Reliability Engineer - Data at Discogs?

The work environment for a Senior Site Reliability Engineer - Data at Discogs is fully remote, allowing you the flexibility and comfort to work from home. You will engage with a collaborative team focused on pushing the envelope in tech while being supported by our robust infrastructure and modern technologies, ensuring you have the resources you need to thrive.

What opportunities for growth are available for the Senior Site Reliability Engineer - Data at Discogs?

At Discogs, there's a strong culture of continuous learning and professional development for the Senior Site Reliability Engineer - Data. You'll have access to annual professional development allowances, opportunities to mentor and lead projects, and the chance to work on diverse technical topics that will expand your skill set and enhance your career trajectory.

Common Interview Questions for Senior Site Reliability Engineer - Data (REMOTE)
Can you describe your experience with Kafka and how it relates to site reliability engineering?

When addressing your experience with Kafka, speak about specific projects where you implemented it for data streaming and how it improved system reliability. Discuss challenges faced, your solutions, and the impact on the overall infrastructure.

What strategies do you employ to optimize relational database schemas?

You can elaborate on techniques such as indexing, normalization, data partitioning, and query optimization. Use examples from past experiences where you successfully implemented these strategies to improve performance.

How do you prioritize tasks and troubleshoot issues in a high-pressure environment?

Outline your process for prioritizing based on impact and urgency. Describe a situation that involved troubleshooting complex data issues, emphasizing your systematic approach to problem-solving and communicating with stakeholders.

Can you explain the concept of observability and its importance in site reliability?

Discuss the aspects of observability like monitoring, tracing, and logging. Explain how observability helps you preemptively identify and resolve issues, ensuring high availability and performance of services.

What is your experience with CI/CD pipelines and how have they influenced your work?

Share your familiarity with CI/CD tools like GitHub Actions or ArgoCD, emphasizing how they streamline deployment processes, improve code quality, and facilitate collaboration among teams, thereby enhancing reliability.

Describe your approach to mentoring junior engineers in site reliability best practices.

Talk about your commitment to knowledge-sharing, providing examples of mentorship initiatives you've led. Highlight how you customize your approach based on each individual’s learning style and needs.

How do you assess and implement infrastructure-as-code in your projects?

You can mention tools like Terraform and the advantages of IaC in ensuring consistency, repeatability, and scalability. Discuss a specific project where you successfully implemented IaC and its outcomes.

What are your key considerations for database security and data governance?

Provide insights into best practices for securing data stores, including encryption, access controls, and audit logging. Discuss how these practices tie into compliance and overall data governance strategies.

How do you stay current with emerging technologies in site reliability engineering?

Explain your approach to professional development, including following related publications, attending webinars, or participating in forums. Mention how this ongoing learning informs your work.

Tell us about a time you improved system reliability through a specific implementation.

Use the STAR method (Situation, Task, Action, Result) to detail a project where your contribution led to tangible improvements in system reliability, discussing strategic implementation and outcomes.

Discogs is the world's foremost Database, Marketplace, and Community for music. The user-built Database boasts a catalog of more than 11 million releases and 6.1 million artists making it the most extensive physical music Database in the world. By...

Employment type: Full-time, remote
Date posted: April 20, 2025
