Site Reliability Engineer

San Francisco, CA, United States

WHY WE'RE LOOKING FOR YOU

As our first Site Reliability Engineer, you will be instrumental in defining and shaping the processes and practices for a pivotal new business offering. You will play a crucial role in ensuring the reliability, scalability, and performance of our services while collaborating closely with our product and GTM teams. This is a unique opportunity to significantly impact the direction and success of a key initiative within our company.

Reducing friction in deploying Retool is one of the largest levers for us to grow efficiently as a business. You’ll be figuring out how to productize a scalable deployment solution that is both effective and delightful for our customers. This role requires a blend of deep technical expertise in site reliability engineering and a keen product sense to create solutions that not only perform well but also provide an exceptional developer experience.

IN THIS ROLE YOU'LL

Infrastructure Management: Design, implement, and manage scalable and resilient infrastructure using AWS, Kubernetes, and Terraform.

Process Shaping: Define and implement processes and practices that will support our new business offering, ensuring they are robust, scalable, and aligned with industry best practices.

Automation: Automate deployment and maintenance tasks to improve the efficiency and scalability of this offering.

Documentation & Knowledge Sharing: Create and maintain comprehensive documentation for systems, processes, and procedures. Mentor and guide other team members on best practices.

Monitoring & Alerting: Leverage existing observability systems to build new products that ensure the health and performance of our services.

THE SKILLSET YOU'LL BRING

Technical Expertise:

Strong experience with AWS and Kubernetes.

Proficiency in managing PostgreSQL databases.

Extensive experience with infrastructure as code (IaC) using Terraform.

Operational Experience:

Previous experience in a similar SRE or DevOps role, ideally within a SaaS environment.

Strong background in monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, Datadog).

Programming Skills:

Proficiency in one or more programming languages (e.g., Python, Go, Java).

Problem-Solving Skills:

Excellent problem-solving skills and the ability to troubleshoot complex issues.

Collaboration & Communication:

Strong interpersonal and communication skills, with the ability to work effectively in a team-oriented environment.

NICE TO HAVE

Experience with CI/CD pipelines and tools (e.g., Buildkite, GitLab CI).

Knowledge of security best practices and tools.
