Students: Apply now for Summer of Code

Tuesday, 30 April 2013

We are thrilled to once again participate in and support the Summer of Code program, especially since we so enjoyed being involved for the first time last year.

Unlike many Summer of Code participating organizations that focus on a single ecosystem, we have a variety of projects spanning multiple programming languages and open source communities. Here are a few of our project ideas for this year:

Finagle

Finagle is a protocol-agnostic, asynchronous RPC system for the JVM that makes it easy to build robust clients and servers in Java, Scala or any JVM-hosted language. It is used extensively at Twitter and other companies to run backend services. This summer, we’re offering these project ideas:

  • Distributed debugging: DTrace is a very powerful and versatile tool for debugging local applications. We would like to employ similar types of instrumentation on a cluster of machines that form a distributed system, tracing requests based on specific conditions like the state of the server.
  • Pure Finagle-based ZooKeeper client: ZooKeeper is the open source library for cluster coordination that we use at Twitter. We would like to implement a ZooKeeper client purely in Finagle.


If you’re new to Scala, we recommend you check out our Scala School and Effective Scala guides on GitHub.

Mesos

Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications (or frameworks). It is used extensively at Twitter to run all sorts of jobs and applications. We are looking for a student to help us add security and authentication support to Mesos (including integration with LDAP). We recommend signing up on the Mesos mailing list. If you want to learn more about Mesos, you might enjoy this article in Wired.

Scalding

Scalding is a Scala library that makes it easy to specify Hadoop MapReduce jobs. Scalding is built on top of Cascading, a Java library that abstracts away low-level Hadoop details. Scalding is comparable to Pig, but offers tight integration with Scala, bringing the advantages of Scala to your MapReduce jobs. This summer, we’re looking for students to help with:

  • Scalding read-eval-print loop (REPL): Build a REPL for experimenting with Scalding in local and remote mode. The challenge here is determining which portions of a job can be scheduled to run and which are not yet ready. You will build a DAG, and when a node's output is materialized, you schedule the part of the job that depends on that output.
  • Integrate Algebird and Spire: Spire is a Scala library modeling many algebraic concepts. Algebird is a Twitter library that is very similar and has a subset of the objects in Spire. We would like to use the type classes of Spire in Algebird. Algebird is focused on streaming/aggregation algorithms, which are a subset of Spire’s use cases.
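
To give a feel for the scheduling problem in the REPL project, here is a minimal sketch of dependency-driven scheduling over a DAG. It is written in Python for brevity and is not Scalding code: the `JobGraph` class, its methods, and the step names are all hypothetical, invented purely for illustration. A real REPL would also need to handle failures, remote execution, and steps defined incrementally at the prompt.

```python
class JobGraph:
    """Hypothetical helper: tracks which steps of a job are ready to run."""

    def __init__(self):
        self.deps = {}             # step name -> set of steps it depends on
        self.materialized = set()  # steps whose output already exists
        self.scheduled = set()     # steps already handed off to run

    def add_step(self, name, depends_on=()):
        self.deps[name] = set(depends_on)

    def ready(self):
        """Unscheduled steps whose inputs are all materialized."""
        return sorted(
            step for step, needed in self.deps.items()
            if step not in self.scheduled and needed <= self.materialized
        )

    def schedule_ready(self):
        """Mark every currently runnable step as scheduled and return them."""
        batch = self.ready()
        self.scheduled.update(batch)
        return batch

    def mark_materialized(self, name):
        """Record that a step's output exists; return newly runnable steps."""
        self.materialized.add(name)
        return self.schedule_ready()


# Illustrative job: two reads feed a join, which feeds a count.
graph = JobGraph()
graph.add_step("read_logs")
graph.add_step("read_users")
graph.add_step("join", depends_on=["read_logs", "read_users"])
graph.add_step("count", depends_on=["join"])

first = graph.schedule_ready()                       # both reads can start
after_logs = graph.mark_materialized("read_logs")    # join still waits
after_users = graph.mark_materialized("read_users")  # join becomes runnable
after_join = graph.mark_materialized("join")         # count becomes runnable
```

The point of the sketch is only that a step is released when the last of its inputs is materialized, which is the part of the job "dependent on that output" in the project description.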


You can view all of our project ideas on our wiki.

We strongly recommend that you submit your application early and discuss your ideas with the respective project mentors. The deadline is May 3 at 19:00 UTC, and late applications cannot be accepted for any reason. You can always update your application and answer our questions after you submit it. If you have any questions not covered in the wiki, ask them on our Summer of Code mailing list. We look forward to reading your applications and working with you on open source projects over the summer.

Good luck!

- Chris Aniszczyk, Manager of Open Source (@cra)