Getting the most out of social data analysis

Mario Honrubia | Big Data

Countless websites and applications are dedicated to web-based analysis of social data by engaging the crowd in discussion. Research shows, however, that tools relying on public discussion to produce hypotheses or explanations of patterns and trends in data rarely yield high-quality results in practice. But Wesley Willett and his co-authors from the Stanford VIS Group (now part of the University of Washington) think there is no need to worry: there is an alternative approach, and it's called crowdsourcing! Crowdsourcing differs from pure discussion in that the owner of the data pays workers to generate explanations or analyze patterns.

Imagine a scenario in which someone puts a large amount of data on the web and asks people to interpret the patterns or provide insights. It will be no surprise if the owner gets many comments like "Nice charts!". Apart from the fact that the organizer of a social data analysis effort should not ask questions like "Explain why a chart is interesting", Willett and his colleagues think it is better to go the crowdsourcing route and do it the following way: always provide good examples, and include complementary information and training material on reference gathering and chart reading. Their study shows that such simple modifications increase the quality of responses by 28% for US workers and 196% for non-US workers. Paid crowd workers can reliably generate diverse, high-quality explanations that support the analysis of specific data sets, but we need to give the participants what they need in order to flourish!
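To make the recommendations above a little more concrete, here is a minimal Python sketch of how a task for paid workers might be assembled so that it always carries a specific question, a good example, and short training notes. It is only an illustration, not the setup used in the study: the `ExplanationTask` class, its field names, the placeholder URL, and the sample question and notes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExplanationTask:
    """One paid micro-task asking a worker to explain a pattern in a chart.

    All names here are hypothetical and exist only for illustration.
    """
    chart_url: str       # placeholder location of the chart image
    question: str        # a specific question, not "why is this interesting?"
    good_example: str    # a sample high-quality explanation shown to the worker
    training_notes: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Build the plain-text instructions shown to the worker."""
        parts = [f"Chart: {self.chart_url}", f"Question: {self.question}"]
        if self.training_notes:
            parts.append("Before you answer:")
            parts.extend(f"- {note}" for note in self.training_notes)
        parts.append(f"Example of a good explanation: {self.good_example}")
        return "\n".join(parts)


# Placeholder content, invented purely to show the shape of a task.
task = ExplanationTask(
    chart_url="https://example.com/chart.png",
    question="What could explain the sharp drop in the middle of the series?",
    good_example="A possible cause is <event>, as reported by <source>.",
    training_notes=[
        "Read the axis labels and units before interpreting the trend.",
        "Back up your explanation with a reference you can link to.",
    ],
)
instructions = task.render()
print(instructions)
```

The point of the design is simply that the example and the training notes travel with every task, so no worker sees a bare chart with an open-ended prompt.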
Photo credit: jwyg / Foter / CC BY-SA