[Remote] Manager, Software Engineering - Observability
Note: This is a remote position open to candidates in the USA. Figma is dedicated to making design accessible to all and is seeking an Engineering Manager for its Observability team. The role involves leading a team of engineers to improve the visibility and efficiency of Figma's systems, with a focus on core observability platforms and on initiatives for cost transparency and optimization.
Responsibilities
- Lead and grow a team of engineers responsible for the reliability, scalability, and evolution of Figma’s observability and cost engineering platforms
- Own and operate Figma’s core observability stack, including vendor platforms such as Datadog, ensuring high availability, strong data quality, and effective signal-to-noise across metrics, logs, and traces
- Define and drive the technical strategy for instrumentation standards, observability libraries, agents, and operators used to monitor internal- and external-facing services
- Explore and implement innovative, AI-driven approaches to anomaly detection, root cause analysis, signal correlation, and operational automation
- Establish clear frameworks for cost attribution, budgeting, forecasting, and alerting across infrastructure and observability spend, enabling teams to make informed tradeoffs
- Partner with infrastructure, product engineering, finance, and security teams to improve visibility into system health and cost efficiency at scale
- Lead initiatives to optimize observability footprint and spend, balancing depth of insight with performance and cost considerations
- Coach and mentor engineers through career development, performance feedback, and technical leadership, fostering a culture of ownership, collaboration, and high-quality execution
Skills
- 4+ years of experience leading infrastructure, observability, or platform engineering teams, with a track record of delivering highly reliable production systems
- Deep hands-on experience with modern observability platforms (e.g., Datadog, OpenTelemetry) across metrics, logs, and distributed tracing
- Strong understanding of distributed systems, instrumentation best practices, SLO design, and incident response workflows
- Experience driving cost transparency and accountability initiatives, including cost attribution, budgeting, forecasting, and alerting in cloud environments
- Demonstrated ability to set technical direction, drive cross-functional alignment (Engineering, Finance, Security), and make sound architectural decisions in complex environments
- Experience designing or evolving company-wide observability standards, shared libraries, and agent/operator-based integrations
- Background in cost optimization for infrastructure or observability tooling, including vendor negotiations and usage modeling
- Experience applying AI or machine learning techniques to anomaly detection, root cause analysis, or operational automation
- Familiarity with OpenTelemetry and modern instrumentation frameworks across multiple programming languages
- Experience scaling and mentoring high-performing engineering teams through platform expansion or significant architectural change
Benefits
- Equity for employees
- Health, dental & vision
- Retirement with company contribution
- Parental leave & reproductive or family planning support
- Mental health & wellness benefits
- Generous PTO
- Company recharge days
- A learning & development stipend
- A work from home stipend
- Cell phone reimbursement
- Sales incentive pay for most sales roles
- An annual bonus plan for eligible non-sales roles