[Remote] Site Reliability Engineer, Environment Automation

Remote · USA · Full-time

Note: This is a remote position open to candidates in the USA. GitLab is the intelligent orchestration platform for DevSecOps, enabling organizations to increase developer productivity and improve operational efficiency. The Site Reliability Engineer will focus on Environment Automation, ensuring the reliability, scalability, and security of GitLab environments while contributing to automation across the environment lifecycle.

Responsibilities

  • Contribute to automating operational tasks across many GitLab environments, from initial provisioning and configuration updates to upgrades and routine maintenance, helping reduce manual work and improve reliability at scale under the guidance of senior team members
  • Help build and refine the observability stack for multi-tenant GitLab environments so we monitor the right signals across Kubernetes, cloud services, and GitLab applications, supporting early issue detection and basic capacity tracking
  • Assist in responding to platform alerts and incidents, collaborating with Environment Automation SREs and engineering teams to troubleshoot production issues across multiple tenants and document findings
  • Support planning and implementation of infrastructure changes, capacity expansions, and new service rollouts for Dedicated and other managed GitLab environments, contributing to efforts that improve resource efficiency and environment isolation
  • Develop and maintain scripts, automation tools, and infrastructure-as-code workflows that manage parts of the GitLab environment lifecycle, enabling more repeatable, self-service operations over time
  • Apply and help implement best practices for running GitLab on Kubernetes and cloud platforms, focusing on day-to-day reliability, performance, and security while learning how to keep environments consistent
  • Participate in the on-call rotation for production GitLab environments with appropriate support, helping triage and mitigate incidents across clusters and cloud providers and contributing to post-incident reviews
  • Document operational tasks, runbooks, and lessons learned so they become clear, repeatable processes and can be candidates for future automation, improving shared knowledge and reducing manual toil across the team

Skills

  • Experience working as an SRE or in a similar role operating production infrastructure, with an interest in automating the lifecycle of many environments or tenants in parallel, even if you have not yet done so at large scale
  • Hands-on experience with backend programming languages such as Golang, with the ability to read, understand, and modify infrastructure tools
  • Hands-on experience running Kubernetes-based workloads in production, including basic understanding of deployments, rollouts, and debugging common issues like crash loops, failed health checks, and scheduling problems
  • Familiarity with infrastructure automation and configuration management tools such as Terraform and Ansible, including experience working with modules, variables, and managing state safely for multiple environments
  • Solid understanding of Git-based workflows and infrastructure-as-code practices, with the ability to contribute to reusable modules, templates, and pipelines that make automation safer and more consistent
  • Experience working in distributed systems or cloud-based production environments, ideally in SaaS or managed service settings, with comfort participating in incident response and on-call rotations under guidance from more senior team members
  • A proactive mindset focused on automation and documentation—you look for opportunities to remove manual steps, improve runbooks, and turn repetitive tasks into reliable, self-service tools
  • Comfort working asynchronously across distributed teams and a desire to contribute to GitLab's values of collaboration, transparency, and iteration

Benefits

  • Benefits to support your health, finances, and well-being
  • Flexible Paid Time Off
  • Team Member Resource Groups
  • Equity Compensation & Employee Stock Purchase Plan
  • Growth and Development Fund
  • Parental Leave

Company Overview

  • GitLab is a web-based Git repository manager that offers a variety of features for software development teams. It was founded in 2014, and is headquartered in San Francisco, California, USA, with a workforce of 1001-5000 employees. Its website is http://about.gitlab.com.