Senior Site Reliability Engineer - Data Infrastructure / Hadoop



Site Reliability Engineers work on improving the availability, scalability, performance and reliability of Twitter’s production services. Come join us.

Who We Are:

  • As a member of the organization you will be dedicated to improving the reliability of our end-to-end data infrastructure. Your work will integrate directly with Twitter's products.

  • Our core infrastructure receives hundreds of millions of tweets per day and serves tens of billions of API requests. We also serve over 2 billion search queries per day, render millions of ad impressions, and process hundreds of terabytes of log and interaction data daily.

  • We dive deep into gnarly operational issues from the software, systems, automation, and process perspectives. We work to understand the challenges of integrating disparate infrastructures into new facilities, processes, and procedures.

  • We work with open-source technologies and get involved with the SRE and Hadoop communities.

  • We actively participate in the vision to move away from high-operational-cost tasks such as break/fix, cluster migrations, new service buildouts, and abuse handling. You will contribute to services that can shrink and expand based on demand, self-heal, and roll out automatically.

  • We will train and invest in our team members to make sure that they are successful in supporting the large variety of systems and products that Twitter uses.

Your responsibilities include but are not limited to:

  • You will use your expertise to improve the reliability and performance of Hadoop clusters and data management services.

  • You will build tools to diagnose and fix complex distributed systems handling tens of petabytes of data. You will also drive opportunities to improve automation for the company, and scope and create automation for the deployment, management, and transparency of our services.

  • You will tackle issues across the entire stack - hardware, software, application and network.

  • You will test, monitor, administer, and operate multiple clusters across data centers, working primarily in Python and Java.

  • You will take part in a 24x7 on-call/support rotation.

Who You Are:

  • 3+ years of experience handling services in a large-scale distributed systems environment, preferably Hadoop.

  • Familiarity with systems management tools (Puppet, Chef, Capistrano, etc.).

  • Knowledge of Linux operating system internals, filesystems, disk/storage technologies, storage protocols, and the networking stack.

  • Proven knowledge of systems programming (bash and shell tools) and/or at least one scripting language (Python, Ruby, Perl, Scala).

  • Track record of practical problem solving, with excellent communication and documentation skills.

  • Proven understanding of systems and application design, including the operational trade-offs of various designs.

  • Ability to work well with, and influence, a myriad of personalities at all levels.

  • Adaptability and a focus on the simplest, most efficient, and most reliable solutions.


  • Experience with HDFS, YARN, and related Hadoop technologies.

  • Ability to lead technical teams through design and implementation across an organization.

  • B.S. in computer science or similar field.


We are committed to an inclusive and diverse Twitter. Twitter is an equal opportunity employer. We do not discriminate based on race, color, ethnicity, ancestry, national origin, religion, sex, gender, gender identity, gender expression, sexual orientation, age, disability, veteran status, genetic information, marital status or any legally protected status.


Engineering Hiring Process

Step 1

Once your application is received, a recruiter will reach out if your qualifications are a match for the role.

Step 2

If your background is a match, you may have 1-2 technical phone interviews or be given the chance to provide a work sample depending on the role.

Step 3

If the phone interviews go well or your work sample is strong, the final step includes interviews with 5-6 people held onsite in our office.


Twitter does not accept any unsolicited resumes from recruiting agencies and will not pay fees associated with any such resumes. Agencies, please do not send resumes to any Twitter location, employee, or email address.
