
Software Engineer, ML Performance

Cupertino, CA, United States

About Etched

Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep chain-of-thought reasoning.

Running millions of tokens per second for large models (e.g., Llama-3-70B) means running into new performance bottlenecks. Even with hardware optimization for the operations that usually bottleneck us (attention, kernel parallelism), we encounter novel bottlenecks and must design our own solutions.

You will work closely with our hardware and software teams to identify and mitigate performance bottlenecks, enabling our chips to achieve unprecedented throughput and efficiency. Your work will involve a blend of low-level programming, performance profiling, and hands-on debugging, all aimed at maximizing the performance of our custom-built AI hardware.

You will also play a key role in developing tools and methodologies to help our customers understand the full potential of our hardware.

Representative projects:

Writing new kernels to improve throughput for LLM embedding

Improving on PagedAttention to prevent fragmentation of the KV cache in memory

Debugging hardware issues on a simulated or emulated chip

Profiling transformers running on our hardware and fixing bottlenecks

Developing ways for customers to work with our chip and understand how their workloads will run on it

You may be a good fit if you:

Have 5+ years of low-level programming experience

Have a strong understanding of data flow and execution paths within embedded systems

Pick up slack, even if it goes outside your job description

Are results-oriented and biased toward shipping products

Understand SoC and computer system architecture, especially for CPU, interconnect, and memory subsystems

Want to learn more about machine learning research

We encourage you to apply even if you do not believe you meet every single qualification.

Strong candidates may also have experience with:

GPU kernel profiling and low-level programming

Transformer optimizations, such as FlashAttention

Ongoing research in machine learning

Palladium emulation

How we’re different:

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in Cupertino, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

Benefits:

Full medical, dental, and vision packages, with 100% of premiums covered (90% for dependents)

Housing subsidy of $2,000/month for those living within walking distance of the office

Daily lunch and dinner in our office

Relocation support for those moving to Cupertino


