Emre Sokullu
Sep 26

Note to Self: the difference between Hive and Cassandra

Sat 26 Sep 2009 15:38:31 | 2 comments
After long hours of research and a heated debate at our office, I've finally come to a conclusion on the difference between two fairly new open source projects that are both used by Facebook: Hive and Cassandra.

Cassandra (roughly comparable to BigTable and HBase): saves computational power and time by indexing with a column-based approach, but is a resource hog when it comes to storage.
Hive: doesn't consume that much storage space, but responds more slowly.
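
To make that trade-off concrete, here is a toy Python sketch. It is only an in-memory illustration of "scan everything at query time" versus "pay extra storage for an index up front"; it is not how Hive or Cassandra are actually implemented, and the message records and function names are made up for the example.

```python
# Toy in-memory illustration only; neither Hive nor Cassandra works exactly like this.
from collections import defaultdict

# Hypothetical message records: (recipient, sender, text)
MESSAGES = [
    ("alice", "bob", "hi"),
    ("carol", "alice", "lunch?"),
    ("alice", "carol", "sure"),
]


def inbox_by_scan(messages, user):
    """Scan style: no extra storage, but every query reads every row."""
    return [m for m in messages if m[0] == user]


def build_inbox_index(messages):
    """Index style: store an extra per-user copy of the rows once, up front."""
    index = defaultdict(list)
    for m in messages:
        index[m[0]].append(m)
    return index


def inbox_by_index(index, user):
    """A query is now a single key lookup instead of a full scan."""
    return index.get(user, [])


INDEX = build_inbox_index(MESSAGES)
```

Both `inbox_by_scan(MESSAGES, "alice")` and `inbox_by_index(INDEX, "alice")` return the same two messages. The difference is that the index holds a second copy of the data, which is the storage cost, while the scan touches every record on each call, which is the time cost.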

So how are they used at Facebook?

As far as we were able to figure out, friend recommendations are produced by Hive, with the Hive computations run as cron jobs. Cassandra is used where immediate responsiveness is a requirement, for example in the inbox system.

Know more than that? Have any input on how they are being used? Please let me know in the comments.

Comments

Emree, you've gotta do something about the site's speed, man...
If anyone's going to be at the forefront of scalability it will be Facebook, that's for sure, especially after breaking the 500 million user mark recently! I don't envy the server guy in their basement lol