Senior Site Reliability Engineer, Compute

The Position

Who We Are

Twitter Site Reliability Engineering (SRE) scales Twitter to serve the public conversation around the globe. We inspire engineering confidence by systematically making services reliable and efficient, and ensuring changes are safe and fast.

Compute SRE is responsible for maintaining the availability and reliability of Twitter’s computing platforms. We believe that reliability is the most important feature; without it, other features don’t matter. We use a blend of systems engineering, software development, and architectural skill to do our work, and deeply value collaboration and empathy.

What You’ll Do

Twitter offers engineers the unique opportunity to personally make a noticeable difference at a company that makes a difference in the world. In this role, you will support Twitter’s mission by ensuring the successful operation and continuous improvement of our internal computing platforms. You will operate at scale, understanding engineering needs and working constantly to automate into the future. Your contributions to the team will help maintain a high operational standard through effective monitoring, SLO development, and incident response.

Your responsibilities include, but are not limited to:

  • Serve as a steward of Twitter’s production environment by providing on-call support, incident response, collaborative debugging, and continuous learning via blameless postmortems.

  • Implement systemic improvements to reliability and operational excellence of Twitter’s compute clusters.

  • Collaborate with engineers in peer teams to develop solutions that work effectively in Twitter’s ecosystem.

Qualifications

  • 5+ years of professional experience in reliability engineering, software engineering, or systems engineering, with a blend of competencies in software and systems engineering.

  • Demonstrated understanding of CS fundamentals including OS, Networking, Data Structures, Algorithms, Concurrency and Distributed Systems.

  • Experience operating and scaling a high-performance compute cluster in a production environment.

  • Experience participating in an on-call rotation.

  • Ability to write code in at least one language; comfortable implementing both functionality and tests, and reviewing others’ code.

  • Experience working with Cloud environments such as GCP or AWS.

  • Clear bias for automation. You don’t build infrastructure, you write code to build infrastructure.

Desired Additional Qualifications

  • Experience in hybrid on-premises, multi-cloud environments.

  • Industry experience with large-scale distributed systems.

Company Description

Twitter is what’s happening and what people are talking about right now. For us, life's not about a job, it's about purpose. We feel real change starts with conversation. Here, your voice matters. Come as you are and together we'll do what's right (not what's easy) to serve the public conversation.

Team

Infrastructure Engineering, Software Engineering

Location

San Francisco, New York City, Boston, Atlanta, Seattle

 
