Site Reliability Engineer - AML Global Recommendation - USDS
Responsibilities
About the Team:
The Site Reliability Engineering (SRE) group within the Applied Machine Learning (AML) team combines systems engineering and the art of machine learning to develop and run a massively distributed AI/ML recommendation system for the United States and worldwide.
On the SRE team, you'll have the opportunity to sharpen your expertise in coding, performance analysis, and large-scale systems operation. Join us and you'll have the chance to shape the future of AML systems and make a tangible impact on TikTok users.
Responsibilities:
- Design, build, and maintain highly available, scalable, and fault-tolerant systems.
- Monitor and analyze system performance, identifying and resolving issues proactively.
- Develop and maintain automated monitoring, alerting, and incident response systems.
- Collaborate closely with software engineering teams to ensure applications are designed with reliability, scalability, and performance in mind.
- Implement and maintain security best practices and ensure compliance with regulatory requirements.
- Participate in on-call rotations and respond to issues and incidents within and outside of normal hours.
- Conduct root cause analysis of incidents, hold post-mortem reviews, and implement preventative measures.
Qualifications
Minimum Qualifications:
- Expertise in analyzing and troubleshooting Linux-based distributed systems.
- Bachelor's/Master's degree in Computer Science, Computer Engineering, or equivalent experience in SRE or software engineering.
- Experience programming in at least one of the following languages: C, C++, Python, or Go.
- Strong understanding of data structures and algorithms.
- Knowledge of relational database systems.
Preferred Qualifications:
- Experience designing and maintaining large-scale systems.
- Knowledge of code optimization and routine task automation.
- Proficiency in machine learning frameworks like TensorFlow, PyTorch, MXNet, or PaddlePaddle.
About USDS
TikTok is the leading destination for short-form mobile video, inspiring creativity and bringing joy. U.S. Data Security (USDS) is a U.S.-based subsidiary focused on data protection policies and content assurance to keep U.S. users safe, overseeing TikTok platform security and data privacy.
Data Security Statement
This role involves working with systems designed to protect sensitive data and will be subject to strict security screening.
Why Join Us
Join TikTok to be part of a diverse, innovative team that values creativity, impact, and growth. We foster a culture of curiosity, humility, and resilience, aiming for meaningful breakthroughs and shared success.
Diversity & Inclusion
TikTok is committed to an inclusive environment that values diverse skills, experiences, and perspectives, reflecting the communities we serve.
Acknowledgment of Country
We acknowledge the Traditional Custodians of the land across Australia and pay respect to elders past and present, extending this respect to all Aboriginal and Torres Strait Islander peoples.