
Hi, disclaimer - I work for ScaleBase, which provides truly automated, transparent sharding, so I have lived and breathed sharding for 4 years now...

The main problem is user/session concurrency. On a single machine, it kills performance at some (near) point. A DB does much more work for every write than for a read (look at my blog here: http://database-scalability.blogspot.com/2012/05/were-in-big...). The limit is here and now: even 100 heavy-writing sessions will choke MySQL (or any SQL DB...) on any hardware.

Catch-22: scale out to replication slaves with R/W splitting? This can lower the read load on the master DB, but read load can be lowered even better by caching. The real problem is writes and the small transactional reads that support them, and slaves won't help there, because every slave still has to apply every write. Distributing the data (sharding) is the only way to distribute a write-intensive load. It also helps reads by running them on smaller chunks, and parallelizing them is a sweet, sweet bonus :)
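
To make the "distribute the writes" point concrete, here is a minimal sketch of hash-based shard routing. It is only an illustration under my own assumptions, not how ScaleBase or any particular product does it: the shard names, `shard_for` and `route_writes` are all hypothetical, and a real setup would also have to handle resharding and cross-shard queries.

```python
import hashlib

# Hypothetical shard names; in practice each would map to a separate DB server.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(key, shards=SHARDS):
    """Pick a shard for a sharding key (e.g. a user id) with a stable hash."""
    # md5 is used only for stable, evenly spread hashing, not for security.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def route_writes(keys, shards=SHARDS):
    """Group write keys by target shard, so each server sees only its share."""
    batches = {name: [] for name in shards}
    for key in keys:
        batches[shard_for(key, shards)].append(key)
    return batches


if __name__ == "__main__":
    users = ["user-%d" % i for i in range(1000)]
    for name, batch in route_writes(users).items():
        print(name, len(batch))
```

Each shard ends up taking only a fraction of the write sessions, which is exactly what a read slave cannot give you. Plain modulo hashing is the simplest choice here; the trade-off is that changing the number of shards moves most keys around.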

From what I see around (hundreds of medium-to-large sites), there's no other way...

And one final word about the cloud: "one DB machine" there is confined to a rather limited, not very powerful virtualized compute and I/O space... In the cloud, the limits are here and now! Cloud is all about elasticity and scale-out.

Hope I helped! Doron


