HBase at Stumbleupon

Hello everyone, my name is Ryan Rawson, and I am a software engineer working on data storage at Stumbleupon.com. I am currently working on HBase, an open source distributed storage system, on which I am a committer. HBase is part of the Hadoop ecosystem, and like the rest of the Hadoop projects we take a distributed, low-power and cost-effective approach to storing data. Instead of requiring one or two very large computers, HBase runs on a network of smaller and cheaper machines (cheaper both in absolute terms and in dollars per MIPS, RAM, or disk). By distributing the data across multiple smaller machines, we get better performance, since we can leverage additional resources merely by adding more machines. We also gain independence from the failure of any individual machine.

In addition to writing code for HBase, I have spoken publicly about the technology. I recently gave a talk at a San Francisco event called NoSQL, which was recorded and uploaded to the internet. I provided an overview of where HBase stood heading into the next release (0.20) and talked a little bit about Stumbleupon’s experiences.

NOSQL – HBase from martind on Vimeo.

Recently, Stumbleupon hosted the HBase User Group meeting at our office. This event was followed by the HBase hackathon, where many committers and contributors gathered to plan the work for the next major revision of HBase. Getting together every so often allows us to agree on work plans, after which we can go back to coordinating via IRC and the mailing lists, including on bug reports.

In addition to my open source project work, I am busy making HBase power various projects here at Stumbleupon. The first project to launch using HBase as its storage backend is our short-url service su.pr. We use HBase to power all the data and real-time analytics. There are more applications of Hadoop and Cascading behind the scenes as well.

HBase has a bright future at Stumbleupon. With the release of 0.20, we are pushing forward on the next set of features and looking to expand our use of HBase. Follow me on Twitter at @ryanobjc.
