It was a great conference; thanks to Ozkan Altuner for organizing it:
Since I didn't bring my business cards with me, I invite everyone I couldn't share my contact details with to email me at emre [at] groups-inc.com and/or to sign up here (your answers to the questions will be visible only to me).
After long hours of research and a heated debate at our office, I've finally
come to a conclusion on the difference between two fairly new open source
projects that Facebook also uses: Hive and Cassandra.
Cassandra (in the same family as BigTable and HBase): saves computational power and time (by indexing with a column-based approach) but is a resource hog when it comes to storage.
Hive: doesn't consume that much storage space but responds more slowly.
So how are they used at Facebook?
As far as we were able to figure out, friend recommendations are produced by Hive, and the Hive computations are run by cronjobs. Cassandra is used where immediate responsiveness is a requirement, for example in the inbox system.
Know more than that? Have any input on how they are being used? Please let me know in the comments.
Check its last modification date, a la filemtime of PHP. How can you do that?
Easy:
show table status from DATABASE_NAME like 'TABLE_NAME'\G
This gives you an output such as:
mysql> show table status from groups like 'analytics'\G
*************************** 1. row ***************************
Name: analytics
Engine: MyISAM
Version: 10
Row_format: Dynamic
Rows: 453161
Avg_row_length: 343
Data_length: 15573804
Max_data_length: 281474976710655
Index_length: 2778112
Data_free: 0
Auto_increment: 158700
Create_time: 2009-08-20 14:26:29
Update_time: 2009-08-23 12:33:49
Check_time: 2009-08-20 14:26:31
Collation: utf8_unicode_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.00 sec)
The Update_time field is the last modification date.
If it's NULL, that means you probably can remove it safely.
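If you want to run this check from a script, here is a minimal Python sketch that reads the vertical output shown above and pulls out Update_time. The function names (parse_vertical_row, last_modified) and the shortened SAMPLE text are my own, made up for illustration; the sketch only parses text you have already captured from the mysql client, it does not connect to a database.

```python
from datetime import datetime
from typing import Optional


def parse_vertical_row(text: str) -> dict:
    """Turn one row of the mysql client's vertical output into a dict."""
    row = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, the prompt line, the "*** 1. row ***" separator
        # and the "1 row in set" footer.
        if not line or line.startswith(("mysql>", "*")) or " in set " in line:
            continue
        key, sep, value = line.partition(":")
        if sep:
            row[key.strip()] = value.strip()
    return row


def last_modified(text: str) -> Optional[datetime]:
    """Return Update_time as a datetime, or None when it is NULL or missing."""
    value = parse_vertical_row(text).get("Update_time", "NULL")
    if value in ("", "NULL"):
        return None
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")


# Hypothetical, shortened copy of the output above.
SAMPLE = """\
*************************** 1. row ***************************
Name: analytics
Engine: MyISAM
Update_time: 2009-08-23 12:33:49
Check_time: 2009-08-20 14:26:31
1 row in set (0.00 sec)
"""

print(last_modified(SAMPLE))
```

A None result corresponds to the NULL case mentioned above.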
- MySQL Cluster is not the solution; treat it as a different database, and don't forget that it's not very efficient, so it ends up costing you a lot
- Use Memcached
- Your memcache machine should be close to your application/web servers, not the database
- Sharding is hard. There are things you can do before sharding. Replicate as much as you need, so that you can split read and write operations. One machine dedicated to write only.
- Then the most logical way to shard is by table (usually called vertical or functional partitioning), which means hosting different tables on different machines. But this is needed only in extreme cases, when you grow like crazy, like grou.ps
- Before that, you can maximize your write capacity by getting a beefy server with 128GB of RAM and 4x hexacore processors. Note that you'll need to use XtraDB on such a configuration, because a regular MySQL setup does not scale well beyond 4 cores and instabilities do occur.
- And even before that, try optimizing your code.
- You can optimize your queries by enabling the slow query log
- On a machine with 16GB of RAM, you should not need more than 400 connections.
- If SHOW PROCESSLIST; displays too many Sleeping connections, no problem, don't worry about them.
- There is no escaping table locks with MyISAM; even auto_increment still requires a table lock
- Use the maatkit tools. mk-audit is recommended as a good start
- If you are replicating, you'll need mk-slave-restart at some point, but don't forget that it's dangerous and can create inconsistencies
- Use mk-query-digest to collect information about the incoming MySQL queries; then you can optimize them using the EXPLAIN command. tcpdump is a microsecond-level alternative for the very same job
- Remove unused databases and tables
- For a fast-updating environment, run mk-query-digest and mk-duplicate-key-checker from cron
- Consider a dual master setup; use mmm (the 1.x branch is recommended as of this writing) or flipper, both are good
- Hot backup strategies: lvm_backup, mk-parallel-dump, innobackup
- Do not use SET where you can use ENUM; SET is for options that can be chosen more than once
- vmstat 5 5 to see the status of your disk
- Even if you don't use MyISAM, key_buffer size should be at least 32MB
- InfoBright is good for analytics tables
- XFS is good as the underlying filesystem of innodb systems.
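To make the replication tip above a bit more concrete (one machine dedicated to writes, reads spread over replicas), here is a rough Python sketch of a read/write router. Everything in it is an assumption for illustration: the class name ReadWriteRouter, the hostnames, and the rule of classifying a statement by its first keyword are mine, not something any MySQL tool provides. It only picks a host name; it does not open connections.

```python
import itertools


class ReadWriteRouter:
    """Pick a host for each statement: replicas for reads, master for the rest."""

    # Statements treated as read-only; a deliberately short, hypothetical list.
    READ_VERBS = ("SELECT", "SHOW", "EXPLAIN")

    def __init__(self, master, replicas):
        self.master = master
        # Round-robin over the replicas; fall back to the master if there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb in self.READ_VERBS and self._replicas is not None:
            return next(self._replicas)
        return self.master


# Hypothetical hostnames.
router = ReadWriteRouter("db-master", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM analytics LIMIT 10"))
print(router.route("UPDATE analytics SET hits = hits + 1 WHERE id = 7"))
```

Keep in mind that replicas can lag behind the master, so a read that must see a write you just made (or a SELECT ... FOR UPDATE) is safer sent to the master.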
Other Sources:
http://20bits.com/articles/10-tips-for-optimizing-mysql-queries-that-dont-suck/
http://provenscaling.com/blog/2008/10/09/introducing-flipper-for-managing-mysql-master-pairs/
http://provenscaling.com/software/flipper/docs/html/
http://www.howtoforge.com/mysql_master_master_replication
http://www.howtoforge.org/mysql_database_replication
Last week I informally started an open discussion on this blog, asking
"how to keep good people at your company". The answers have started to come
in, and I have two comments so far, both of them stressing the importance of
material satisfaction:
- Hidayet Dogan, one of the best Turkish PHP programmers I've seen so far (based on his resume and projects) dreams of a Google like office environment, full of gadgets and big LCD displays.
- Eren Yagdiran, whose diverse array of interests can be found on his personal blog, says it's all about money.
But I think I found the best answer in the book that I've been reading this weekend, Outliers by Malcolm Gladwell. I'll quote two of his sentences:
Those three things--autonomy, complexity, and a connection between effort and reward--are, most people agree, the three qualities that work has to have if it is to be satisfying.
Hard work is a prison sentence only if it does not have meaning. Once it does, it becomes the kind of thing that makes you grab your wife around the waist and dance a jig.
At GROU.PS we try to foster a corporate culture that provides those 3 things, all at the same time:
- Autonomy: Our management style can be considered democracy under meritocratic feudalism. There's a loose hierarchy which gives its nodes the freedom to show their creativity without the boundaries of bureaucracy, and the chance to have your own team if you can prove yourself to your "lord". The hierarchy is not fixed; it changes dynamically based on your merits.
- Complexity: What we're building here is literally a social operating system. If you think it's just a content management system or something like that (and if you're qualified enough), you're welcome to come to our office and see what kind of things and what kind of architecture we're working on. I don't claim it's rocket science, but it's pretty close.
- Connection between Effort and Reward: GROU.PS is a global venture with a chance of being acquired or going public. Besides the fixed salary, we are committed to giving a generous amount of sweat equity and/or stock options to our employees according to their commitment and role in the team.
UPDATE: forget my negative observations - it just takes a little while
to get used to it.
Negative Ones
- As you browse through the pages, the screen flickers and temporarily displays randomly located black spots, which makes for a poor user experience.
- There could be a separate button for Kindle Store
- Buttons seem to be low quality
- Packaging was not as good as I was expecting - my anticipation was a true Apple experience, but it was far from that.
- I miss the multi-touch screen of iPhone, it could make note-taking a breeze - which is, I think, a natural component of our reading experience.
- It's a poor blog reading machine, since there are no images and no Flash components, which are crucial parts of blogs.
- You can't delete a book that you've purchased (or there's no easy way)
- Changing pages is slow; once you lose your page, it's very difficult to find where you were
Positive Ones
- You can send yourself docs via email - big WINNER!
- Amazon.com integration is much better than what I was expecting
- Text to speech sounds better than what I was expecting, it's pretty understandable.
- Bonus feature: browse the web - even though it's only in text-mode, fine!
- Very easy to get started
- Letter of Jeff Bezos :) the gadget makes you feel special right from the beginning, because your account info is there, saved, so you don't need to deal with it. Bezos' letter starts with Dear __your name__ :) nice...
All in all, even though my negative points outnumber the positive ones, I
liked the Kindle. These are only my after-purchase impressions; I was already
pretty positive about it before purchasing, and perhaps that's why my
expectations were so high.
I think it's worth the price, especially when you consider that you save the shipping cost and there's no waiting, plus Kindle books are much cheaper since they come with no physical costs. The device looks cool too. Highly recommended.
This is not a blog entry, I'm just asking. Feel free to add your thoughts in
the comments. But I should write a blog entry about this soon too.
It's not that young Turkish-rooted startups like GROU.PS don't excite me.
But if I were starting out today, I wouldn't build a web 2.0 venture. The
number of web 2.0 ventures grew quickly, especially after Yahoo acquired
OddPost, and peaked with the delicious and Flickr acquisitions. For a while we
lived through quite a boom, with the YouTubes and especially those
acquisitions in the advertising industry; then it didn't take long for us to
get a taste of the down economy again. Anyway, things aren't going too badly
in the technology economy, and acquisitions have started to pick up again.
Still, web 2.0 seems to have proven that it is not that attractive a field.
But I think the biggest reason for this is the new dimensions that Facebook Platform and the iPhone have brought to the software world. Especially the iPhone... Location-based services....
There are big opportunities in the gaming world, especially on Facebook and the iPhone. The easiest flip route for a young entrepreneur is undoubtedly writing games on these platforms. Zynga, EA and ngmoco above all are so hungry to acquire young game companies (microstudios) that http://gamemakers.ngmoco.com/ alone should be a sufficient example.
But forget the flip - there are also plenty of opportunities for a few friends to get together and do other big things on these platforms.
I also know of a few ambitious Turkish Facebook developers right now - but I may make them the subject of another post, after getting their permission :)


