Software Engineer, ML System Architecture

Seattle, WA, United States

Responsibilities

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo.

At TikTok, our people are humble, intelligent, compassionate and creative. We create to inspire - for you, for us, and for more than 1 billion users on our platform. We lead with curiosity and aim for the highest, never shying away from taking calculated risks and embracing ambiguity as it comes. Here, the opportunities are limitless for those who dare to pursue bold ideas that exist just beyond the boundary of possibility. Join us and make impact happen with a career at TikTok.

Team Intro

The AML (Applied Machine Learning) Machine Learning System team focuses on the research and implementation of cutting-edge technologies in the field of Machine Learning systems, providing high-performance, highly reliable, scalable systems.

In the team, you'll have the opportunity to build a large-scale heterogeneous system integrating GPUs, RDMA, and storage and keep it stable and reliable; enrich your expertise in coding, performance improvement, and problem analysis; and be involved in the decision-making process.

Responsibilities:

1. Design and develop Machine Learning infrastructure for LLM/AIGC, etc.

2. Build a very large-scale machine learning system integrating GPUs, RDMA networking, and high-performance storage

3. Solve technical problems related to the stability and high availability of the system

4. Organize and coordinate multiple teams to complete the construction of the system, including the data center, network, computing, storage, and resource teams

Qualifications

Minimum Qualifications:

1. Be proficient in 1 to 2 programming languages such as C++/Go/Python/Shell in a Linux environment

2. Understand the principles of distributed systems and have experience in the design, development and maintenance of large-scale machine learning systems

3. Be familiar with Kubernetes architecture, and have extensive experience in system-level development and tuning

4. Have excellent logical analysis ability and be able to reasonably abstract and decompose business logic

5. Have a strong sense of responsibility, good learning ability, communication skills and self-drive

Preferred Qualifications:

1. Familiar with the ML Infrastructure of Large Model training and inference

2. Experience in one of the following fields: AI Infrastructure, HW/SW Co-Design, High Performance Computing, ML Hardware Architecture (GPU, Accelerators, Networking)

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at [email protected]

Job Information:

【For Pay Transparency】Compensation Description (annually)

The base salary range for this position in the selected city is $137,750 - $237,500 annually.

Compensation may vary outside of this range depending on a number of factors, including a candidate’s qualifications, skills, competencies and experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives, and restricted stock units.

Our company benefits are designed to convey company culture and values, to create an efficient and inspiring work environment, and to support our employees to give their best in both work and life. We offer the following benefits to eligible employees:

We cover 100% of the premium for employee medical insurance and approximately 75% of the premium for dependents, and offer a Health Savings Account (HSA) with a company match. We also offer Dental, Vision, Short/Long term Disability, Basic Life, Voluntary Life and AD&D insurance plans, in addition to Flexible Spending Account (FSA) options such as Health Care, Limited Purpose and Dependent Care.

Our time off and leave plans are: 10 paid holidays per year plus 17 days of Paid Personal Time Off (PPTO) (prorated upon hire and increased by tenure) and 10 paid sick days per year as well as 12 weeks of paid Parental leave and 8 weeks of paid Supplemental Disability.

We also provide generous benefits such as mental and emotional health benefits through our EAP and Lyra, a 401K company match, and gym and cellphone service reimbursements. The Company reserves the right to modify or change these benefits programs at any time, with or without notice.
