Site Reliability Engineer - Hadoop / Data Platforms
Who We Are:
SREs work on improving the availability, scalability, performance and reliability of Twitter’s production services. Come join us.
About This Job:
As a Site Reliability Engineer (SRE) on Twitter’s Hadoop team, you will work to improve the reliability and performance of Hadoop clusters and data management services. The Hadoop clusters at Twitter are among the largest in the world. We manage data used by millions of people as they connect, explore, and interact with information and one another. You will work shoulder-to-shoulder with our engineering teams to design, build, and operate our clusters and services. Your focus will be on debugging, automation, availability, performance, and above all efficiency at ‘reach-every-user-on-the-planet’ scale. We have a wide range of opportunities for varying skill levels and experience.
Work in an engineering team to design, build, and maintain Hadoop clusters and data services
Participate in and build tools to:
- Diagnose and troubleshoot complex distributed systems handling tens of petabytes of data, and develop solutions that have a significant impact at our massive scale.
- Troubleshoot issues across the entire stack - hardware, software, application and network
- Test, monitor, administer, and operate multiple clusters across data centers, primarily in Python and Java.
Collaborate across teams such as Application services, Linux kernel, JVM and Capacity Planning, Hardware, Network, and Datacenter Operations to design next-gen storage platforms.
Take part in a 24x7 on-call rotation
Interact with the open source community
2+ years of managing services in a distributed, internet-scale *nix environment.
Familiarity with systems management tools (Puppet, Chef, Capistrano, etc.)
Demonstrable knowledge of Linux operating system internals, filesystems, disk/storage technologies, storage protocols, and the networking stack.
Hands-on operational experience managing JVM services.
Practical knowledge of shell scripting and at least one scripting language (Python, Ruby, Perl).
Ability to prioritize tasks and work independently
Track record of practical problem solving, with excellent communication and documentation skills
BS or MS degree in Computer Science or Engineering, or equivalent experience.
We are committed to an inclusive and diverse Twitter. Twitter is an equal opportunity employer. We do not discriminate based on race, ethnicity, color, ancestry, national origin, religion, sex, sexual orientation, gender identity, age, disability, veteran status, genetic information, marital status or any other legally protected status.
San Francisco applicants: Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
After you apply, a recruiter may reach out to you for an introductory call.
If your background is a match for the role, you may phone interview with 1-2 people.
If you continue through the process, you will come onsite 1-2 times to interview with a total of 5-10 people.