Gee, it seems as if the super-scalable website biz has moved beyond MySQL/Memcached.
But in addition, Todd Hoff's High Scalability post provides a lot of useful links, which DBMS-oriented folks such as myself might have previously overlooked. Following those trails gets one to, among other things:
A September, 2009 post outlining Digg's reasons for moving to Cassandra. The core idea is that joining two tables is expensive; it's cheaper to store the results prejoined on disk. Details are provided.
I also recall seeing something that said "We have 13X as many queries as updates, so of course we should optimize for reads," but I can't find that now. The classical OLTP answer to that would probably be "Yeah, but by the time you're two-phase-committing and integrity-checking all the parts of that update, it turns out updates are still what you should optimize for." Well, what if the update is so simple that that's no longer a valid argument?
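To make the prejoin idea concrete, here is a minimal sketch in Python. It is not Digg's actual schema or code; the "which of my friends dugg this item" query, the in-memory dictionaries, and every name in it (`friends`, `diggs`, `prejoined`, `record_digg`, and so on) are assumptions made purely for illustration. It contrasts answering the question with a read-time join against answering it from results that were fanned out when the update was written.

```python
from collections import defaultdict

# Hypothetical data: each reader maps to the set of users they follow.
friends = {"alice": {"bob", "carol"}, "dave": {"bob"}}

# Normalized log of (user, item) pairs, as a relational table might hold it.
diggs = []

# Prejoined view: (reader, item) -> set of that reader's friends who dugg it.
prejoined = defaultdict(set)

# Reverse index so that a write can find every reader it affects.
followers = defaultdict(set)
for reader, followed in friends.items():
    for user in followed:
        followers[user].add(reader)


def record_digg(user, item):
    """Write path: append to the log, then fan out to the prejoined view."""
    diggs.append((user, item))
    for reader in followers[user]:
        prejoined[(reader, item)].add(user)


def friends_who_dugg_join(reader, item):
    """Read path 1: join the digg log against the friend list at query time."""
    followed = friends.get(reader, set())
    return {u for (u, i) in diggs if i == item and u in followed}


def friends_who_dugg_prejoined(reader, item):
    """Read path 2: a single key lookup against the prejoined results."""
    return set(prejoined.get((reader, item), set()))
```

The write does more work (one extra insert per follower) so that the read becomes a single lookup. With many more queries than updates, and updates this simple, that trade can look attractive; the cost is that the application, not the DBMS, is now responsible for keeping the prejoined copy consistent.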
There certainly seem to be some non-obvious technical choices being made here, with options being conflated that perhaps shouldn't be. In particular, I wonder whether things are being written to cheap disk in a really fast way when it might be better to keep them in more expensive RAM or, perhaps better yet, solid-state memory. Perhaps then the functionality/performance tradeoff wouldn't be so painful.
On the other hand, the designers of the world's most scalable websites -- e-commerce sites perhaps excepted -- seem pretty unanimous in thinking it's best to bake some database/integrity management into the applications, rather than offload it all to an RDBMS. Why? Because the transactions are so simple that hand-coding all that isn't prohibitive. And of course because of their extreme performance and scalability needs.
I'm not sure on what basis one could argue that they're wrong.
Todd Hoff put up a provocative post on High Scalability called "MySQL and Memcached: End of an Era?"