Site Reliability Engineer

San Francisco, CA, United States

As a Site Reliability Engineer (SRE) at Together AI, you are responsible for keeping all user-facing services and production systems running smoothly. You are a blend of pragmatic operator and software engineer who applies sound engineering principles, operational discipline, and mature automation to our operating environments and codebase.

You specialize in systems (operating systems, storage subsystems, networking), while implementing best practices for availability, reliability and scalability, with varied interests in algorithms and distributed systems.

Requirements

7+ years of professional SRE or related experience

Bachelor's degree in Computer Science or a related field or equivalent work experience

Expert knowledge of Ansible (roles, playbooks), Terraform, and Kubernetes

Proficiency in programming/scripting languages

Direct experience in monitoring and observability practices

Advanced knowledge of cloud services

Ability to thrive in a collaborative environment involving different stakeholders and subject matter experts

Responsibilities

Participate in an on-call (PagerDuty) rotation to respond to incidents that impact availability

Build and run our infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users

Build monitoring systems to ensure the highest quality service for our customers

Design and implement operational processes (such as deployments and upgrades)

Debug production issues across all services and levels of the stack

Identify improvements for the product architecture from the reliability, performance and availability perspectives

Plan the growth of Together AI's infrastructure

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy

