Site Reliability Engineer - Core Storage Infrastructure, Seattle
As a Site Reliability Engineer (SRE) on Twitter’s Core Storage team, you will work to improve the reliability and performance of the next generation of distributed systems. You will partner with our product engineering teams to design, build, operate, and automate the distributed storage systems at the heart of Twitter’s infrastructure, which are used by millions of people.
• Build tooling to improve the automation of operations. This includes automatic failure detection and remediation, application deployment, OS/Kernel/JVM/Firmware deployment, capacity planning, and fleet management.
• Diagnose and troubleshoot complex distributed systems handling millions of queries per second and petabytes of data, and develop solutions that have a significant impact at our massive scale.
• Collaborate with SWE teams to sustain and optimize the availability, reliability, and performance of production services.
• Collaborate with diverse hardware, software, and networking teams throughout the company to design next-generation distributed storage platforms.
• Troubleshoot issues across the entire stack - hardware, software, application and network.
• Participate in a 24x7 on-call rotation.
• 5+ years of experience managing services in a distributed, internet-scale *nix environment.
• Practical knowledge of at least one programming language (Python, Go, Ruby, Perl).
• Demonstrable knowledge of Linux operating system internals, TCP/IP, filesystems, disk/storage technologies.
• Familiarity with systems management tools (Puppet, Chef, Capistrano, Ansible, etc.).
• Hands-on operational experience managing JVM services.
• Ability to prioritize tasks and work independently.
• Track record of practical problem solving, plus excellent communication and documentation skills.
• BS degree in Computer Science or Engineering, or equivalent experience.
After you apply, a recruiter may reach out to you for an introductory call.
If your background is a match for the role, you may have a phone interview with 1-2 people.
If you continue through the process, you will come onsite 1-2 times to interview with a total of 5-10 people.