Job details

Site Reliability Engineer, Global E-commerce - USDS

Responsibilities
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. U.S. Data Security (“USDS”) is a subsidiary of TikTok in the U.S. This new, security-first division was created to bring heightened focus and governance to our data protection policies and content assurance protocols to keep U.S. users safe. Our focus is on providing oversight and protection of the TikTok platform and U.S. user data, so millions of Americans can continue turning to TikTok to learn something new, earn a living, express themselves creatively, or be entertained. The teams within USDS that deliver on this commitment daily span across Trust & Safety, Security & Privacy, Engineering, User & Product Ops, Corporate Functions and more.

Why Join Us
Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible. Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day. To us, every challenge, no matter how difficult, is an opportunity; to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always. At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve. Join us.

About The Team
The Global E-commerce SRE team of US Tech Services works with engineering and product teams to build and run large-scale, globally distributed, observable, fault-tolerant systems. As an SRE, you will deliver on production ownership and be responsible for observability and automation across complex, large-scale service mesh architectures.

In order to enhance collaboration and cross-functional partnerships, among other things, at this time, our organization follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department. We regularly review our hybrid work model, and the specific requirements may change at any time.

What You'll Do
• Own the service level of a critical, revenue generating E-commerce platform as well as all supporting infrastructure and services. This role will focus on service reliability, highly-scalable design and release management in a cloud-native environment.
• Define service level indicators and data-driven objectives to uphold and improve uptime, latency, and system health of a core TikTok production platform.
• Collaborate cross team with engineering and product to ensure that key requirements (such as capacity planning and launch reviews) are performed to enable transparent service delivery to customers.
• Automation geared towards infrastructure-as-code, scalability and service resiliency
• Implement SRE practices around incident management, post-mortems while being part of on-call rotations.

Qualifications
Basic Qualifications:
• Good understanding of Unix/Linux operating systems internals and networking
• Experience writing code in Java, Go, Python or a similar language
• Expertise in designing, analyzing, and troubleshooting large-scale distributed systems (Redis, Elasticsearch, Kafka, Druid, Hadoop, Flink or comparable solutions), relational databases, caching solutions and web service frameworks
• Experience with algorithms, data structures, complexity analysis and software design
• Experience developing tools and APIs to reduce manual interaction with systems and applications using a variety of coding and scripting standards
• Systematic problem-solving approach, coupled with effective communication skills and a sense of drive

Preferred Qualifications
• Familiarity with running production grade web services at scale and understanding cloud native technologies and networking
• Knowledge about a variety of strategies for ingesting, modeling, processing, and persisting data, ETL design, dimensional modeling, and cube design

Candidates for this position must be legally authorized to work in the United States. This position is not eligible for visa sponsorship or support.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at https://shorturl.at/ktJP6

This role requires the ability to work with and support systems designed to protect sensitive data and information. As such, this role will be subject to strict national security-related screening.

Job Information
【For Pay Transparency】Compensation Description (Annually)
The base salary range for this position in the selected city is $137,750 - $237,500 annually. Compensation may vary outside of this range depending on a number of factors, including a candidate’s qualifications, skills, competencies and experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives, and restricted stock units.

Benefits may vary depending on the nature of employment and the country work location. Employees have day one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, short-term and long-term disability coverage, life insurance, wellbeing benefits, among others. Employees also receive 10 paid holidays per year, 10 paid sick days per year and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).

The Company reserves the right to modify or change these benefits programs at any time, with or without notice.

For Los Angeles County (unincorporated) Candidates
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state, and local laws including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Our company believes that criminal history may have a direct, adverse and negative relationship on the following job duties, potentially resulting in the withdrawal of the conditional offer of employment:
• Interacting and occasionally having unsupervised contact with internal/external clients and/or colleagues;
• Appropriately handling and managing confidential information including proprietary and trade secret information and access to information technology systems; and
• Exercising sound judgment.

TikTok Glassdoor Company Review: 3.4 stars
TikTok DE&I Review: No rating
CEO of TikTok: Shou Zi Chew

Average salary estimate

Estimate provided by employer: $167,147 / annual (est.)
Min: $146K
Max: $188K

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Site Reliability Engineer, Global E-commerce - USDS, TikTok

Are you ready to take your skills as a Site Reliability Engineer to the next level with TikTok's Global E-commerce team? Located in the vibrant city of Seattle, WA, this role offers an exciting opportunity to work at the forefront of data security and platform reliability. At TikTok, our mission is all about inspiring creativity and bringing joy to millions of users. In the U.S. Data Security (USDS) division, we're dedicated to protecting sensitive data and ensuring that our platform runs smoothly. You'll be a key player in maintaining and enhancing the reliability of our revenue-generating E-commerce platform while cultivating a collaborative work environment. As a Site Reliability Engineer, you'll take ownership of service levels, define key performance indicators, and drive automation for scalability and resiliency in our cloud-native environment. Your expertise in large-scale distributed systems and programming in languages like Java, Go, or Python will truly shine here. Plus, with a hybrid work schedule that allows for flexibility, you can find balance while contributing to a mission that strives for excellence and innovation. Connect with ambitious tech professionals, tackle challenging problems, and help shape the future of a platform that brings creativity to life. Bring your systematic problem-solving approach and effective communication skills, and join us in this exciting journey at TikTok!

Frequently Asked Questions (FAQs) for Site Reliability Engineer, Global E-commerce - USDS Role at TikTok
What are the main responsibilities of a Site Reliability Engineer at TikTok?

As a Site Reliability Engineer at TikTok, your main responsibilities will include owning the service level of a critical E-commerce platform, ensuring high reliability, and implementing automation strategies to enhance service resiliency. You will define service level indicators, collaborate with cross-functional teams, and manage incidents effectively.

What qualifications do I need to apply for the Site Reliability Engineer position at TikTok?

To qualify for the Site Reliability Engineer role at TikTok, candidates should have a strong understanding of Unix/Linux systems, experience with programming languages like Java, Go, or Python, and expertise in troubleshooting large-scale distributed systems. Familiarity with cloud-native technologies is preferred.

How does TikTok support the growth and development of its Site Reliability Engineers?

TikTok is committed to fostering a culture of learning and growth. As a Site Reliability Engineer, you'll have the chance to tackle complex challenges, collaborate with talented teams, and receive ongoing support for professional development, including access to training resources and innovative projects.

What is the work schedule like for a Site Reliability Engineer at TikTok?

The Site Reliability Engineer role at TikTok follows a hybrid work schedule, requiring employees to work in the office three days a week while allowing for flexibility on the remaining days. This model promotes collaboration while supporting work-life balance.

What career advancement opportunities are available for Site Reliability Engineers at TikTok?

TikTok values career growth and offers various pathways for advancement within the company. Site Reliability Engineers can explore roles in senior leadership, project management, or specialized technical positions as they gain experience and showcase their skills.

Common Interview Questions for Site Reliability Engineer, Global E-commerce - USDS
Can you explain a challenging problem you solved as a Site Reliability Engineer?

When addressing this question, focus on a specific incident where you identified a bottleneck or failure. Explain the steps you took to analyze the issue, outline your solution, and highlight any key metrics that improved as a result, showcasing your problem-solving abilities.

How do you prioritize tasks in a high-pressure environment?

In a high-pressure environment, it's crucial to assess priorities based on impact and urgency. Share your experience using methods like the Eisenhower Matrix to categorize tasks and explain how you communicate with your team to ensure alignment and effective workload management.

What SRE practices do you think are most important?

When discussing SRE practices, consider focusing on incident management, post-mortem analysis, and the importance of automation. Share relevant experiences where implementing these practices improved system reliability or reduced incident resolution time.

Describe your experience with cloud-native technologies.

Be prepared to provide examples of projects you've worked on involving cloud-native technologies. Emphasize any specific services or platforms you've used and how they contributed to the scalability and reliability of applications in your experience.

How do you handle on-call rotations?

Discuss your approach to being on-call, including how you prepare, document key procedures, and respond to incidents. Highlight your emphasis on effective communication with team members and your methods for maintaining a work-life balance during on-call periods.

What programming languages are you proficient in, and how have you applied them?

Share your proficiency in programming languages relevant to the role, such as Java, Go, or Python. Discuss specific projects where you wrote code to automate processes, troubleshoot issues, or improve system performance, demonstrating your technical expertise.

Can you describe your experience with incident management?

Outline your experience in handling incidents, from detection to resolution. Mention specific tools you've utilized and your approach to conducting post-mortems to identify root causes and improve processes for future incidents.

How do you ensure system health and observability?

Discuss the ways you assess system health, such as monitoring key performance indicators or using observability tools. Explain how you leverage the data collected from these practices to make informed decisions and improve system reliability.
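
As one concrete way to talk about this, the minimal Python sketch below computes a request-based availability SLI and the share of error budget still left against an SLO target. The function name, the 99.9% target, and the request counts are illustrative assumptions, not details from the job posting.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float               # observed fraction of successful requests
    allowed_failures: float  # failures the SLO tolerates in this window
    budget_remaining: float  # fraction of the error budget left (negative if overspent)


def slo_report(total_requests: int, failed_requests: int, slo_target: float) -> SloReport:
    """Summarize an availability SLO for one measurement window.

    All inputs are hypothetical; a real system would pull them from its
    monitoring backend.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")

    sli = (total_requests - failed_requests) / total_requests
    allowed_failures = total_requests * (1 - slo_target)
    budget_remaining = 1 - failed_requests / allowed_failures
    return SloReport(sli, allowed_failures, budget_remaining)


if __name__ == "__main__":
    # Made-up window: 1,000,000 requests, 400 failures, 99.9% target.
    report = slo_report(1_000_000, 400, 0.999)
    print(f"SLI: {report.sli:.4%}")
    print(f"Error budget remaining: {report.budget_remaining:.0%}")
```

With these made-up numbers, 400 failures against a budget of about 1,000 leaves roughly 60% of the error budget, which is the kind of figure that can anchor a conversation about whether to slow down releases.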

What strategies do you use for capacity planning?

When discussing capacity planning, explain the importance of analyzing traffic patterns and system usage. Share your experience creating models to predict future growth and how those models informed scaling decisions and resource allocation.
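
A simple growth model of the kind described above can be sketched in a few lines of Python. This example fits a straight line to weekly peak traffic and estimates how many weeks remain before the trend reaches a capacity limit with headroom reserved. The function names, the 20% headroom default, and the sample figures are all hypothetical; real traffic is rarely this linear, so treat it as a starting point rather than a recommended method.

```python
import math
from typing import Optional, Sequence, Tuple


def fit_line(ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def weeks_until_capacity(
    weekly_peak_qps: Sequence[float],
    capacity_qps: float,
    headroom: float = 0.2,  # assumed safety margin, not an industry rule
) -> Optional[int]:
    """Weeks after the last sample until the fitted trend reaches the usable limit.

    Returns None when the trend is flat or declining, and 0 when the
    trend line is already at or above the limit.
    """
    slope, intercept = fit_line(weekly_peak_qps)
    if slope <= 0:
        return None
    usable_limit = capacity_qps * (1 - headroom)
    crossing_week = (usable_limit - intercept) / slope
    last_week = len(weekly_peak_qps) - 1
    return max(0, math.ceil(crossing_week - last_week))


if __name__ == "__main__":
    # Made-up data: peak QPS growing by 100 per week, 2,000 QPS capacity.
    print(weeks_until_capacity([1000, 1100, 1200, 1300], capacity_qps=2000))
```

In an interview answer, the interesting part is usually what the model leaves out: seasonality, launch spikes, and how the headroom figure was chosen.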

How do you stay updated on industry trends and technologies?

Highlight your commitment to continuous learning through attending conferences, following relevant publications, participating in online communities, or taking courses. Mention how you’ve applied this knowledge to improve your work as a Site Reliability Engineer.

BADGES
Flexible Culture
Future Maker
Global Citizen
Innovator
Rapid Growth
CULTURE VALUES
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony
Take Risks
Casual Dress Code
Startup Mindset
Emails over Meetings
Collaboration over Competition
Fast-Paced
Growth & Learning
BENEFITS & PERKS
Medical Insurance
Paid Time-Off
Maternity Leave
Mental Health Resources
Equity
Mixed-Ability Accommodations
Work Visa Sponsorship
Commuter Benefits
Employee Resource Groups
Performance Bonus
Health Savings Account (HSA)
Flexible Spending Account (FSA)
EMPLOYMENT TYPE
Full-time, hybrid
DATE POSTED
December 4, 2024
