Senior Site Reliability Engineer - Engineering Effectiveness
Who We Are
Twitter's SRE team is a world-class group of talented engineers who operate state-of-the-art platforms and technology.
We are looking for a Site Reliability Engineer to join our Consumer Product SRE team, and to support our Engineering Effectiveness products. Engineering Effectiveness helps engineers at Twitter iterate faster, ship high-quality products, and enjoy doing it. Our vision is to enable software development at the scale of Twitter, but with the ease and speed of a startup.
We do this by supporting the entire software development lifecycle (SDLC), including core developer tools and environments, build tools, code review, source version control, CI/CD pipelines, education, and documentation.
Our customers are all engineers at Twitter, whether they are building or supporting clients, services, libraries, data pipelines, machine learning models, and so on.
SREs contribute to this effort by owning the tools and initiatives for scaling the systems we support, optimizing performance, and improving the reliability and availability of our systems.
What You’ll Do
- You’ll partner with product engineering teams and other SREs to productionalize services through configuration management, monitoring, alerting, and documentation.
- You will optimize performance and solve issues across the entire stack: hardware, software, application, and network.
- You will identify and drive opportunities to improve automation for the company.
- You will represent the SRE organization in design reviews and operational readiness exercises for new and existing services.
Who You Are
- You love solving problems related to scaling production systems.
- You have a deep understanding of systems and application design, including the operational trade-offs of various designs.
- You have practical knowledge of various aspects of service design like messaging protocols & behavior, caching strategies and software design practices.
- You are adaptable, solutions-oriented, and work very well in a team setting.
- You have a track record of successful practical problem solving, excellent written and interpersonal communication skills, and strong documentation skills.
- You are able to prioritize tasks, work independently, and call out exceptions effectively.
- Minimum of 3 years running services in a large-scale environment.
- Expert-level understanding of Linux servers, specifically RHEL/CentOS.
- Practical, proven knowledge of shell scripting and at least one higher-level language (e.g., Python, Ruby, Go).
- Experience with source code and binary repositories, build tools, and CI/CD (Git, Artifactory, Jenkins, etc.).
- Experience running services written in Scala or Java.
- Demonstrable knowledge of TCP/IP, HTTP, web application security, and experience supporting multi-tier web application architectures.
- Able to configure and fix DNS, DHCP, and LAN/WAN technologies.
- You have a degree in computer science or similar field or equivalent experience.
- You are able to participate in an on-call rotation and in periodic conference calls with specialists in other time zones, including but not limited to our headquarters in San Francisco, CA, USA.
Engineering Hiring Process
Once your application is received, a recruiter will reach out if your qualifications are a match for the role.
If your background is a match, you may have 1-2 technical phone interviews or be given the chance to provide a work sample depending on the role.
If the phone interviews go well or your work sample is strong, the final step includes interviews with 5-6 people held onsite in our office.