Site Reliability Engineer - Hadoop / Data Platforms

San Francisco, CA

Site Reliability Engineers work on improving the availability, scalability, performance and reliability of Twitter’s production services. Come join us.

Who We Are:

  • As a member of the organization you will be dedicated to improving the reliability of our end-to-end data infrastructure. Your work will integrate directly with Twitter's products.
  • Our core infrastructure receives hundreds of millions of tweets per day and serves tens of billions of API requests. We also serve over 2 billion search queries per day, render millions of ad impressions, and process hundreds of terabytes of log and interaction data daily.
  • We dive deep into gnarly operational issues from the software, systems, automation, and process perspectives. We work to understand the challenges of integrating disparate infrastructures into a new facility and new processes and procedures.
  • We work with open-source technologies and get involved with the SRE and Hadoop communities.
  • We actively participate in the vision to move away from high operational cost tasks such as break/fix, cluster migrations, new service buildouts, abuse, etc. You will contribute to services that can shrink and expand based on demand, self-heal, roll out automatically, etc.
  • We will train and invest in our team members to make sure that they are successful in supporting the large variety of systems and products that Twitter uses.

Your responsibilities include but are not limited to:

  • You will use your expertise to improve the reliability and performance of Hadoop clusters and data management services.
  • You will participate in and build tools to diagnose and fix complex distributed systems handling tens of petabytes of data. You will also drive opportunities to improve automation for the company, scoping and creating automation for the deployment, management, and transparency of our services.
  • You will tackle issues across the entire stack - hardware, software, application and network.
  • You will test, monitor, administer, and operate multiple clusters across data centers, primarily in Python and Java.
  • You will take part in a 24x7 on-call/support rotation.

Who You Are:

  • Minimum 3 years of experience handling services in a large-scale distributed systems environment, preferably Hadoop.
  • Familiarity with systems management tools (Puppet, Chef, Capistrano, etc.).
  • Knowledge of Linux operating system internals, filesystems, disk/storage technologies, storage protocols, and the networking stack.
  • Proven knowledge of systems programming (bash and shell tools) and/or at least one scripting language (Python, Ruby, Perl, Scala).
  • Track record of practical problem solving, plus excellent communication and documentation skills.
  • Proven understanding of systems and application design, including the operational trade-offs of various designs.
  • Work well with and be able to influence a myriad of personalities at all levels.
  • Be adaptable and able to focus on the simplest, most efficient & reliable solutions.
  • B.S. in computer science or similar field or equivalent experience.

Desired:

  • Experience with HDFS, YARN, and related Hadoop technologies.
  • Ability to lead technical teams through design and implementation across an organization.

We are committed to an inclusive and diverse Twitter. Twitter is an equal opportunity employer. We do not discriminate based on race, ethnicity, color, ancestry, national origin, religion, sex, sexual orientation, gender identity, age, disability, veteran status, genetic information, marital status or any other legally protected status.

San Francisco applicants: Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Hiring Process

Step 1

After you apply, a recruiter may reach out to you for an introductory call.

Step 2

If your background is a match for the role, you may have a phone interview with 1-2 people.

Step 3

If you continue through the process, you will come onsite 1-2 times to interview with a total of 5-10 people.

Application

U.S. Equal Opportunity Employment Information  (Completion is Voluntary)

At Twitter, we have a bold aspiration to reach every person on the planet. We believe that goal is more attainable with a team that understands and represents different cultures and backgrounds and we are committed to an inclusive and diverse Twitter.

This is where you come in! Please take a few minutes to provide us with your information. You are not required to provide this information and you may decline to disclose. Your decision to provide information (or not) will not affect your employment or opportunities at Twitter.

Twitter is an equal opportunity employer. We do not discriminate based on race, color, ethnicity, ancestry, national origin, religion, sex, gender, gender identity, gender expression, sexual orientation, age, disability, veteran status, genetic information, marital status or any legally protected status.

You can view the ‘EEO is the Law’ poster here.

Twitter does not accept any unsolicited resumes from recruiting agencies and will not pay fees associated with any such resumes. Agencies, please do not send resumes to any Twitter location, employee, or email address.

Twitter, Inc. is committed to working with and providing access and reasonable accommodations to applicants with physical or mental disabilities. If you need an accommodation in order to apply for open job opportunities, please submit a description of your accommodation request to RARequest-Recruiting@twitter.com. This email is only for accommodation requests related to the application process.
