The Role
In this role, you'll be given lots of responsibility and the opportunity to have true ownership as we build out the product. This is a unique opportunity to use your engineering powers to make a direct impact on people's lives. We need a Staff Site Reliability Engineer who is enthusiastic about building reliable, scalable, and flexible systems to support our growing team, product, and user base. You'll work with other engineers to reliably release and maintain services, and help define and meet internal and customer-facing SLAs and SLOs.
This position is not eligible to be performed in Hawaii.
What You’ll Do
• Manage and orchestrate Cloud Resource (AWS) configuration using Infrastructure As Code (Terraform) to empower engineering staff to embrace a DevOps culture of Self Service Ownership
• Develop and govern Observability (Datadog) best practices for tracking platform performance and health trends to meet customer SLAs and lead technical decisions with strong supporting evidence
• Create solutions that scale dynamically with demand and are flexible enough to pivot for fast-changing project requirements, while balancing good against perfect
• Provide strong and consistent communication updates on technical progress or blockers to keep stakeholders informed while additionally creating appropriate documentation on technical design to spread knowledge and reduce information silos
• Participate in the 24/7 on-call rotation, respond to critical alerts, and follow documented incident investigation procedures to restore customer-facing feature availability
• Maintain HIPAA, GDPR, and SOC 2 compliance and general security through best practice implementation
Who You Are
• 8+ years of experience in software engineering, with 4+ years of experience in DevOps
• Cloud Provider (AWS, GCP, Azure) experience managing resources through Infrastructure As Code (Terraform)
• Container Orchestration (ECS or K8s) experience to confidently build, test, and release containerized applications for multiple environments and regions
• Knowledge of Observability best practices across common cloud resources (EC2, ECS, RDS, DynamoDB, S3, SQS, EventBridge) with experience rolling out enhancements across a distributed platform with scale in mind
• Experience with shell scripting for *nix systems
• Experience with Networking for web applications
• Effective at communicating ideas through writing and diagramming
• Comfortable working with a distributed development and ops team
• Familiarity with AWS (ECS and cloud hosting); GitLab (CI/CD); Python (Django, Flask, aiohttp); Bash; Data (PostgreSQL, Redis); Monitoring (Datadog and Sentry); IaC (Terraform, Packer)
Benefits
Fundamentals:
• Medical / Dental / Vision / Disability / Life Insurance
• High Deductible Health Plan with Health Savings Account (HSA) option
• Flexible Spending Account (FSA)
• Access to coaches and therapists through Modern Health's platform
• Generous Time Off
• Company-wide Collective Pause Days
Family Support:
• Parental Leave Policy
• Family Forming Benefit through Carrot
• Family Assistance Benefit through UrbanSitter
Professional Development:
• Professional Development Stipend
Financial Wellness:
• 401(k)
• Financial Planning Benefit through Origin
But wait, there’s more!
• Annual Wellness Stipend to use on items that promote your overall well-being
• New Hire Stipend to help cover work-from-home setup costs
• ModSquad Community: Virtual events like active ERGs, holiday-themed activities, team-building events, and more
• Monthly Cell Phone Reimbursement