San Jose, US-United States
Posted 1 year ago
About The Company
This company pioneers short-form video creation and social engagement, boasting a vast, engaged user base. Its platform empowers users with creative tools, filters, and effects. With a diverse content ecosystem, it’s a hub of creativity and expression. The proprietary algorithm ensures personalized content feeds, enhancing user engagement and satisfaction. This company wields significant influence on digital media, making it an invaluable partner for innovative collaborations and marketing endeavors.

Responsibilities for this role include:
1. Building and managing the Global SRE team, including recruitment, training, system operation, and team culture building.
2. Improving cross-team, time zone, and regional cooperation mechanisms, providing SRE solutions aligned with business needs.
3. Arranging and managing the SRE team, enhancing overall efficiency and effectiveness.
4. Developing process specifications and plans for compliant access, configuration, disaster recovery, and fault handling.
5. Continuously improving core SRE capabilities in efficiency, cost, quality, and security.
6. Developing automation, data visualization, and automated monitoring processes to optimize the e-commerce platform infrastructure.
7. Driving the design and engineering of tools and platform solutions to enhance product engineering and operation efficiencies.
8. Managing on-call processes, responding to performance and reliability issues, and establishing best practices for issue resolution.

Requirements for this position include:
1. Bachelor’s degree or above in Computer Science or a related technical discipline, with good English communication skills.
2. Familiarity with SRE-related processes and trends, along with experience building SRE systems in the e-commerce industry (5+ years).
3. Knowledge of cloud computing technologies, particularly Amazon Web Services and Google Cloud Platform.
4. Demonstrable experience in programming languages such as Java, C++, Go, or scripting languages like Shell and Python.
5. Expertise in operations, deployment, high availability, and quality assurance of large-scale distributed systems, with a focus on stability and performance.
6. Strong sense of responsibility, proactive team spirit, and analytical problem-solving skills.

The ideal candidate for this role is agile, a quick self-learner, highly self-motivated, and takes ownership of their work. They should have experience running a 24×7 production infrastructure at scale and possess the ability to independently research and solve complex technical problems. Additionally, the candidate should be an empathetic and results-oriented leader, a good collaborator and team player, and comfortable working in a fast-paced, culturally diverse, and globally distributed team environment.
Job Features
Job Category | DevOps & SRE |
Seniority | Manager / Senior Manager |
Base Salary | $210,000 - $358,000 |
Recruiter | joshua.chen@ocbridge.ai |