Facebook – Real-Time ETL (Puma and pTail with HBase)


Facebook is working on a real-time analytics dashboard that will let users determine which content on their pages is getting the most attention from visitors. As described in an educational session on Wednesday night at Facebook’s Seattle office, the service, which tracks both impressions and actions for plugins and newsfeeds, should be valuable to companies seeking to maximize the effectiveness of their marketing efforts on the popular social media site. However, the highlight of the session was the infrastructure underlying the forthcoming service.

The session video gives plenty of details, but here are some highlights. The analytics service tracks about 100 different metrics; it is built atop HBase, with support from two Facebook-developed tools called pTail and Puma; and it aims for less than 30 seconds of lag time, a goal it has met a majority of the time during testing. It’s interesting that Facebook is becoming such a big user…
