Thread:Kirkburn/@comment-25523075-20150630143256/@comment-9346948-20150630155455

Here's the description/requirements for one of the available positions:

Are you a leader in Data Engineering who is bored working with million-row data sets? Would you love to be in charge of building a brand-new data warehouse for a top-20 website? How does this sound?

We’re one of the largest websites in the world and we’d like you to take the lead on managing our data. With 1.8 billion pageviews a month and a data warehouse of over 13TB, the role of Director of Data Engineering opens you to a vast, growing, dynamic social universe. We want you to help us better understand and serve the biggest, most knowledgeable fans on the planet.

Responsibilities


 * Understand business needs and translate to data requirements


 * Enhance and maintain our current Data Warehouse and ETL infrastructure


 * Lead the design and implementation of our new Data Warehouse & ETL infrastructure


 * Perform ad-hoc SQL-based analysis against MySQL & PostgreSQL data warehouses


 * Translate analysis results into digestible tabular or graphical reports

Requirements


 * Experience maintaining and using relational databases (MySQL, PostgreSQL, Oracle, etc.)


 * Experience building Hadoop systems and migrating data into them


 * Strong knowledge of Hadoop components and how they work together (Hive, HBase, Oozie, Sqoop, Impala, etc.)


 * Knowledge of dimensional modeling and performant data loading strategies


 * Scripting experience (preferably Perl/Python and bash)


 * Grasp of basic statistics concepts


 * Self-starter with appropriate attention to detail


 * 2-5 years of experience managing engineers


 * 5-10 years of web-based software engineering, preferably some using PHP


 * BS in Computer Science or equivalent

Bonus points if you...


 * Have experience wrangling terabyte-sized datasets


 * Know the R statistical package


 * Have implemented Tableau or MicroStrategy (we use Tableau)


 * Have set up lambda architecture-based systems (e.g. Storm)


 * Have experience with Amazon EC2 and EMR