Similar Jobs

  • Assured

    Staff Site Reliability Engineer (SRE)

    Palo Alto, CA, United States

    Assured is on a mission to modernize insurance. Claims processing (i.e. should we pay this claim?), while often overlooked, is the foundation of the entire industry. It’s currently highly manual, involving phone calls, faxes, and gut instinct—costing tens of billions of dollars a year. We can do better. At Assured

    Job Source: Assured
  • Character.AI

    Staff Site Reliability Engineer (SRE)

    Menlo Park, CA, United States

    • Ending Soon

    About us Character’s mission is to empower everyone with AGI. Our vision is to enable people with our technology so that they can use Character.AI any moment of any day. Character.AI is one of the world’s leading personal AI platforms. Founded in 2021 by AI pioneers Noam Shazeer and Daniel De Freitas, Character.AI is a full-stack AI company w

    Job Source: Character.AI
  • Maxonic

    Site Reliability Engineer(SRE)

    Sunnyvale, CA, United States

    Maxonic maintains a close and long-term relationship with our direct client. In support of their needs, we are looking for a Site Reliability Engineer(SRE) . Job Description: Job Title: Site Reliability Engineer(SRE) Job Location: Sunnyvale, CA Hybrid (2-3 days onsite) Duration: 5 months Pay Rate: $56.05 - $59/hr on W2 Responsibilities: • Gath

    Job Source: Maxonic
  • Dice

    Site Reliability Engineer(SRE)

    Sunnyvale, CA, United States

    Dice is the leading career destination for tech experts at every stage of their careers. Our client, Della Infotech, is seeking the following. Apply via Dice today! Position: Site Reliability Engineer(SRE) Job Posting ID: ISIJP00009758 Duration: ~5 months Hybrid (2-3 days onsite) Gather and analyze metrics from operating systems as well as app

    Job Source: Dice
  • Redolent Infotech Pvt. Ltd.

    Site Reliability Engineer (SRE)

    Sunnyvale, CA, United States

    One of our direct client is urgently looking for an Azure DevOps Engineer@ Sunnyvale CA TITLE:Sr. Azure DevOps Engineer LOCATION: San Bruno, CA Duration: 6 to 12+ Description: Docker, Kubernetes and environment creation using scripts or java / ruby are the key requirements. Bachelors + 5 years or Master + 2 years 1. Resource should have mi

    Job Source: Redolent Infotech Pvt. Ltd.
  • Elastic

    Platform - Site Reliability Engineer (SRE)

    Mountain View, CA, United States

    Elastic is a free and open search company that powers enterprise search, observability, and security solutions built on one technology stack that can be deployed anywhere. From finding documents to monitoring infrastructure to hunting for threats, Elastic makes data usable in real-time and at scale. Thousands of organizations worldwide, including B

    Job Source: Elastic
  • Equifax, Inc.

    Site Reliability Engineer (SRE) - Intermediate

    San Jose, CA, United States

    • Ending Soon

    Site Reliability Engineering (SRE) at Equifax is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations while adhering to Equifax engineering principles. SREs in our

    Job Source: Equifax, Inc.
  • Newsbreakdigest

    Site Reliability Engineering (SRE)

    Mountain View, CA, United States

    • Ending Soon

    Responsibilities Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement. Build and manage systems, infrastructure and applications through automation Support services before they go live through activities such as system design consulting, developing software platforms and fr

    Job Source: Newsbreakdigest

Staff Site Reliability Engineer (SRE)

Menlo Park, CA, United States

About the role

As the founding member of our DevOps/Site Reliability Engineering function here at Character, you'll have the opportunity to support an infrastructure of thousands of nodes, terabytes of data, and millions of daily active users. You'll be responsible for ensuring our product's reliability, scalability, and performance as we aggressively grow our user base, with a goal of reaching 3 billion users. You'll work closely with our development team to design and implement processes and systems that ensure the stability and availability of our service.

Specific Responsibilities:

Maintain production services and keep them operational.

Develop tools, instrumentation, and automation to monitor and optimize the performance and reliability of our service.

Develop, implement and maintain automation tools and processes to prevent and mitigate service disruptions.

Collaborate with development teams to design and implement scalable, reliable systems and CI/CD processes for deployment.

Establish and support SLAs and SLOs for our site.

Provide system monitoring and incident alerts.

Participate in on-call rotations to provide support for critical incidents and outages.

Develop plans for site reliability and disaster recovery.

Job Requirements:

5+ years of experience in a development-focused DevOps/SRE role within a technology organization that has significant scale.

Deep experience with, and proven success in, developing software tools and automation wherever needed using Python and Golang.

Expertise with SQL, Linux, CI/CD, Kubernetes, and Terraform to support a site/application within a large multi-node infrastructure and a growing user base.

Experience working with multiple cloud computing platforms, such as GCP, is also a must.

Demonstrated ability to troubleshoot technical issues successfully and reliably across a range of platforms and systems.

Experience with incident management and event postmortems.

Desired Experience:

Familiarity with GPU clusters and/or HPC environments is preferred.

Experience with monitoring and logging tools such as Prometheus and Grafana.

Hands-on experience scaling a consumer product from early days into hypergrowth.

About Character.AI

Founded in 2021 by AI pioneers Noam Shazeer and Daniel De Freitas, Character is a leading AI company offering personalized experiences through customizable AI 'Characters.' As one of the most widely used AI platforms worldwide, Character enables users to interact with AI tailored to their unique needs and preferences.

Noam co-invented core LLM tech and was recently honored as one of TIME's 100 Most Influential in AI. Daniel created LaMDA, the breakthrough conversational AI now powering Google's Bard.

In just two years, we achieved unicorn status and were named Google Play's AI App of the Year, a testament to our groundbreaking technology and vision.

Ready to shape the future of AGI?

At Character, we value diversity and welcome applicants from all backgrounds. As an equal opportunity employer, we firmly uphold a policy of non-discrimination on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, or disability. Your unique perspectives are vital to our success.
