With Firefox 4 entering Release Candidate, Mozilla Labs embarked on a huge user testing programme for the new browser. Data is collected continuously from participants and consists of surveys, direct feedback and automatic monitoring data. The site gives daily sentiment reports and also shows trends by factors such as operating system.
Mozilla Labs wrote up the details of their evaluation process and performance testing. Their final choice was Riak, which probably makes this the highest-profile deployment of Riak to date.
The leaking of the Afghan combat reports represents a significant milestone in crowdsourced data analysis (although the professional media services managed to provide better analysis and background).
The original data resembles a paper form: the sections are consistent but their content varies in length. The match to document databases is immediately apparent, and it is likely that the US government originally stored the reports in some kind of document store. It was interesting to note that no-one seemed to use a relational store to publish or analyse the reports.
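As a rough illustration of why a form with fixed sections and variable-length content maps naturally onto a document store, here is a minimal Python sketch. The field names (`header`, `summary`, `events`) and the values are hypothetical, chosen only to show the shape; they are not the actual schema or content of the published reports.

```python
import json

# Hypothetical field names and placeholder values, for illustration only.
# Every report has the same sections, but the content of each section
# can be any length.
report_short = {
    "header": {"report_id": "example-001", "category": "example"},
    "summary": "One-line summary.",
    "events": ["Single entry."],
}

report_long = {
    "header": {"report_id": "example-002", "category": "example"},
    "summary": "A much longer summary that runs to several sentences.",
    "events": ["First entry.", "Second entry.", "Third entry."],
}


def sections(doc):
    """Return the section names of a report, sorted for comparison."""
    return sorted(doc.keys())


def to_document(doc):
    """Serialise a report to the JSON form a document store would hold."""
    return json.dumps(doc, sort_keys=True)
```

Both reports can be stored whole as single documents, with no need to split the variable-length `events` section into a separate table as a relational schema typically would.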
The Guardian used Google App Engine, which sits on BigTable; if you click through to the event log you can see the blocks of the underlying sections.
An independent CouchDB version has also been created.