Software Engineer – Machine Learning Infrastructure 2945

San Jose, US-United States, Seattle, US-United States
Posted 2 weeks ago
About The Company

This company pioneers short-form video creation and social engagement, boasting a vast, engaged user base. Its platform empowers users with creative tools, filters, and effects. With a diverse content ecosystem, it’s a hub of creativity and expression. The proprietary algorithm ensures personalized content feeds, enhancing user engagement and satisfaction. This company wields significant influence on digital media, making it an invaluable partner for innovative collaborations and marketing endeavors.


About the Team

The mission of our AML team is to advance the next-generation AI infrastructure and recommendation platform for ads ranking, search ranking, and live and e-commerce ranking across the company. The team also drives substantial impact on the company's core businesses. We are currently looking for a Machine Learning Engineer – Machine Learning Infrastructure to join the team and help support and advance that mission.


Responsibilities

– Responsible for the design and implementation of a global-scale machine learning system for feeds, ads and search ranking models.
– Responsible for improving the usability and flexibility of the machine learning infrastructure.
– Responsible for improving the workflow of model training and serving, data pipelines, storage systems and resource management for multi-tenant machine learning systems.
– Responsible for designing and developing key components of ML infrastructure and mentoring interns.


Minimum Qualifications

– Proficient in at least one programming language such as Go/Python in a Linux environment, with excellent coding skills.
– Familiar with open-source distributed scheduling/orchestration/storage frameworks, such as Kubernetes (K8S), YARN (Flink, MapReduce), Mesos, Celery, HDFS, Redis, S3, etc., with extensive hands-on experience in machine learning system development.
– Solid grasp of the principles of distributed systems, with experience in the design, development and maintenance of large-scale distributed systems.
– Excellent logical analysis skills, with the ability to abstract and decompose business logic sensibly.


Preferred Qualifications

– Experience contributing to an open-source machine learning framework (TensorFlow/PyTorch).
– Experience with big data frameworks (e.g., Spark/Hadoop/Flink) and with resource management and task scheduling for large-scale distributed systems.
– Experience in using/designing open-source machine learning lifecycle management systems (e.g., TFX).

Job Features

Job Category: AI Engineering
Seniority: Junior / Mid IC
Base Salary: $160,000 - $290,000
Recruiter: yuxuan.sheng@ocbridge.ai
