Small Data and Big Data Controversies and Alternatives: Perspectives from The Sage Handbook of Social Media Research Methods
Workshop 2C: 14:15-17:30
17:30 Join us for dinner at New Cross House pub
Claudine Bonneau, Université du Québec à Montréal, firstname.lastname@example.org
Anatoliy Gruzd, Ryerson University, email@example.com
Martin Hand, Queen’s University, firstname.lastname@example.org
Guillaume Latzko-Toth, Université Laval, email@example.com
Mélanie Millette, Université du Québec à Montréal, firstname.lastname@example.org
Anabel Quan-Haase, The University of Western Ontario, email@example.com
Diane Rasmussen Pennington, University of Strathclyde, firstname.lastname@example.org
Luke Sloan, Cardiff University, email@example.com
Ravi Vatrapu, Copenhagen Business School, firstname.lastname@example.org
Frauke Zeller, Ryerson University, email@example.com
The panel will provide an overview of critical themes to be covered in the Sage Handbook of Social Media Research Methods, to be published in 2016. The Handbook is the first book not only to cover the entire research process in social media research, from question formulation to the interpretation of research findings, but also to include specific chapters and examples on how data collection and analysis take place on specific social media platforms such as Twitter and Instagram. Our panel will focus on a critical theme that weaves through the entire handbook, namely the tensions and controversies that have emerged around two fundamentally different approaches to the study of social media: big data vs. small data. Three central themes will be explored in an interactive format that includes a live poll and feedback from the audience: (1) the contributions to scholarship that big data and small data make and the contexts in which each approach is appropriate; (2) the tension between big data analytics and small data; (3) ways to combine and integrate the two approaches and how they can potentially inform each other.
The main objective of the present panel is to discuss key challenges in the study of social media, specifically looking at methodological issues. At the core of the discussion will be the tension between large-scale quantitative and small-scale qualitative approaches. These approaches are often perceived to be at opposite ends of the spectrum. Our approach is to discuss first the contributions that each perspective makes to our understanding of social media phenomena. We explore the following two questions: What is the contribution of large-scale quantitative approaches? What is the contribution of small-scale qualitative approaches? Members of the panel will discuss the strengths of each approach and draw specifically on their own research experiences to demonstrate how they have employed either a big data or a small data approach. The aim of the discussion will be to get a better sense of the contexts in which each approach is appropriate and the kinds of insights it allows scholars to gain. We will then explore the tension that exists between big data and small data approaches and also provide an overview of cutting-edge methodological innovations that help bridge the divide. The two questions of interest are: Where does the tension between large-scale quantitative and small-scale qualitative approaches come from? How can small data inform big data and vice versa?
Two key questions will guide the panel discussion and ask for audience members’ opinions as well:
What is the contribution of large-scale quantitative approaches? What is the contribution of small-scale qualitative approaches?
Names of panelists and their perspectives:
Claudine Bonneau and Mélanie Millette
Will discuss the contribution of small-scale qualitative approaches. Drawing on a recent case study in social media research, they will propose various “data thickening methods” to illustrate qualitative strategies whose scope crosses the boundaries between “small” and “big” datasets, and between “trace data” and data of other origins. They show that data thickening is essentially a relational process: it happens when links are created with other data that operate as metadata. In the process of thickening trace data, the thickening occurs not only on the side of digital traces, but it also happens with other qualitative data collected along the way.
Martin Hand
Will discuss the unique contribution of small-scale qualitative approaches based on visual data, emphasizing visual culture methodologies that conceptualize visuality in social media as comprising three broad elements, each of which has to be successfully negotiated in methodological terms. First, images themselves can take many forms, and require careful attention to the established approaches in visual studies while acknowledging the specific qualities of the digital. Second, the circulation of visual data in social media destabilizes the objects of study in ways that challenge visual analysis concerned with locating textual meanings. Third, while the visualization of social practices through social media appears to offer unprecedented access to social life, the detail of such practices often remains obscure if we focus solely on images. We need to ask how visual objects are generated and used, and how people make sense of the visual in using social media. Pulling these three dimensions apart and then together is difficult. This, I suggest, is the current predicament of visual studies of social media.
Ravi Vatrapu
He addresses an academic research gap and a real-world industry need to describe, model, analyse and explain large-scale interactions on organisations’ social media channels as individuals’ associations to ideas, values, identities, etc. Towards this end, we are developing and evaluating a set-theoretical approach to big data analytics termed “Social Set Analysis” (SSA). Social Set Analysis consists of three primary research activities: (a) theorising, modelling, and collecting big social data about organisations (e.g., Danish Cancer Society’s official Facebook page); (b) combining those big social data sets with in-house organisational data sets (e.g., Customer Relationship Management systems); and finally (c) analysing the combined datasets by applying set-theoretical methods and tools (crisp sets, fuzzy sets, rough sets, random sets and Bayesian sets). This talk will outline the SSA approach, report selected empirical findings, discuss implications and limitations, and identify challenges and future research directions.
Diane Rasmussen Pennington
She will argue that quantitative analysis of big data carries risks, but that it can provide value if used appropriately. The misuse of big data presents risks to individuals’ privacy and online security, especially since the process of anonymizing consumer data does not always succeed (Williams & Rasmussen Neal, 2012; Mayer-Schönberger & Cukier, 2013). The patterns that emerge from quantitative big data analysis do not tell researchers much, if anything, about individuals’ preferences and differences. Additionally, the data in “big data” is frequently incorrect, as can be observed anecdotally on credit reports or “people search” websites such as intelius.com. Qualitative analysis of big data has the drawback of not allowing for larger sample sizes or for finding the potentially useful patterns mentioned previously, although it can provide deeper insight into the behaviours or preferences of fewer individuals (Rasmussen Pennington, in press). Both approaches have something important to offer; how can they be used together in a mixed methods approach in order to achieve the most meaningful, accurate, representative results possible?
Where does the tension come from between large-scale quantitative and small-scale qualitative approaches?
Names of panelists and their perspectives:
The algorithmic processing of very large sets of “traces” of user activities collected by digital platforms (so-called “Big Data”) exerts a strong appeal on social media researchers. In the context of a computational turn in the social sciences and humanities, is qualitative research based on small samples and corpuses (“Small Data”) still relevant? It is argued that the unique value of such research lies in data thickness, which is achieved through a process we call thickening. Drawing on recent case studies in social media research that I have conducted, I propose and illustrate three strategies to thicken trace data: trace interviews, manual data collection, and flexible long-term online observation.
Provides an overview of the challenges and opportunities, as well as a discussion of the term “data” and of the nature of social media data. The methods overview, in combination with the introduction and discussion of methods used in other disciplines and in commercial market research, aims to provide a practical and applied guideline for social media research. An applied case study at the end of this chapter describes a novel approach to the analysis of multimodal, large data sets in online communication environments, using a mixed methods design.
Big Data and Political Science. Increasingly, public media is dominated by discussions of the utility of social media-sourced data for use in policymaking, surveillance, and marketing. Academic lag on the subject, however, remains endemic, making its utility in social science research seem elusive for those first broaching it. The panel discussion will explore the utility of social media data as indicative evidence of external social, economic and political relations, trends, and events.
Social media encompass a wide array of platforms ranging from popular sites such as Facebook and Sina Weibo to sites geared to niche communities such as Academia, Pinterest, and Ello. While social media share common features that afford engagement through ‘two-way’ audience interaction, the diversity in design encountered across sites makes it difficult to identify a set of core functionalities. In this discussion, I focus on user engagement in the context of social media at the level of the individual and network experience, i.e., the experiences that motivate users to engage with content created, shared, or endorsed by people in their social networks and encourage them to linger and return. Understanding social media engagement is valuable, and it bridges big data analytics and small-scale research.
Panelists will discuss their positions and ideas on the relevance and importance of both qualitative and quantitative methods in social media research. Some approaches, such as content analysis, can be performed either qualitatively or quantitatively, while others require an exclusion of one approach or the other (Rasmussen Pennington, in review). For example, discourse analysis is entirely qualitative, which presents potential challenges such as a relatively small dataset (Neal, 2010). Automated approaches, such as sentiment analysis, do not necessarily allow for human intervention with the data, and therefore inflections and other subtleties in the sample may not be captured adequately (Thelwall & Buckley, 2013).
The panel addresses several conference themes. The various subtopics under the Theories & Methods category, including Qualitative Approaches, Quantitative Approaches, and Theoretical Models, are represented by the panellists as evidenced by their biographies. Additionally, Big and Small Data (under the Social Media & Big Data category) will be discussed, since data sampling is an issue of interest to researchers who work in the various approaches to social media data collection and analysis.
Overview of session:
| Activity | Who is involved? | Time allocation |
| --- | --- | --- |
| Introduction to the panel | Editors: Anabel Quan-Haase, Luke Sloan | 14:15 |
| Question 1 explored | Panelists discuss question 1: Martin Hand, Claudine Bonneau, Mélanie Millette, Ravi Vatrapu, Diane Rasmussen Pennington | |
| Question 2 explored | Panelists discuss question 2: Anatoliy Gruzd, Frauke Zeller, Guillaume Latzko-Toth, Anabel Quan-Haase | |
| Wrap-up | Poll results or small-group discussion by all panelists. Moderators: Anabel Quan-Haase and Diane Rasmussen Pennington | 16:45 |
| End of workshop and dinner | | 17:30 |
Attendees will be able to engage in the session in three ways: (1) asking panelists questions and discussing three central questions with them; (2) providing input via a live poll during the session; and (3) taking part in an interactive wrap-up session lasting 20 minutes at the end of the panel. In the final wrap-up session, attendees will be able to reflect upon the results of the live poll and give their opinions on them.
Brief Biography of Each Presenter:
Claudine Bonneau, Université du Québec à Montréal
Claudine Bonneau is Associate Professor of Management & Technology at Université du Québec à Montréal (UQAM), where she is a member of the Laboratory on Computer-Mediated Communication (LabCMO) and teaches in graduate and undergraduate programs in Information Technology. Her current work focuses on social media uses and online collaboration practices at work. She is also interested in methodological issues related to qualitative research and online ethnography. Besides her contributions to edited books (such as the Handbook of Social Media Research Methods, Sage, 2016), her work has been published in the International Journal of Project Management (2014), tic&société (2013) and other French-language publications.
Anatoliy Gruzd, Ryerson University
Dr. Gruzd is a Canada Research Chair in Social Media Data Stewardship, Associate Professor at the Ted Rogers School of Management at Ryerson University (Canada), and Director of the Social Media Lab. He is also a co-editor of a multidisciplinary journal on Big Data and Society published by Sage. His research initiatives explore how the advent of social media and the growing availability of user-generated big data are changing the ways in which people communicate, collaborate and disseminate information and how these changes impact the social, economic and political norms and structures of modern society.
Martin Hand, Queen’s University
Martin Hand is an Associate Professor in Sociology at Queen’s University, Kingston, Canada. He is the co-editor of Big Data? Qualitative Approaches to Digital Research (2014; Emerald), author of Ubiquitous Photography (2012; Polity), Making Digital Cultures (2008; Ashgate) and co-author of The Design of Everyday Life (2007; Berg), plus articles and essays about visual culture, technology, and consumption. He is currently conducting research on technology, time and temporality in contemporary Canadian society, funded by the Social Sciences and Humanities Research Council of Canada.
Guillaume Latzko-Toth, Université Laval
Guillaume Latzko-Toth is Associate Professor in the Department of Information and Communication at Université Laval (Quebec City, Canada) and codirector of the Laboratory on Computer-Mediated Communication (LabCMO, http://www.labcmo.ca). Rooted in a Science and Technology Studies (STS) perspective, his research and publications address the role of users in the development of digital media, the transformations of publics and publicness, and methodological and ethical issues related to Internet research. Besides several contributions to edited books, his work appeared in the Journal of Community Informatics (2006), the Bulletin of Science, Technology and Society (2010), tic&société (2013) and the Canadian Journal of Communication (2014).
Mélanie Millette, Université du Québec à Montréal
Mélanie Millette is Professeure substitut at the Département de communication sociale et publique, UQAM, Canada. She is a member of the Laboratory on Computer-Mediated Communication (LabCMO, UQAM and Université Laval). Her work concerns social, political, and cultural aspects of social media uses, more specifically how citizens mobilize online platforms to achieve political participation. She won an SSHRC-Armand-Bombardier grant and a Trudeau Foundation scholarship for her thesis research, which examines media visibility options offered by online channels such as Twitter for francophone minorities in Canada. She is the co-editor of a book on social media (Médias sociaux : enjeux pour la communication, PUQ, 2012) and is a contributor to many edited books (such as the Handbook of Social Media Research Methods, Sage, 2016, and Hashtag Publics: The Power and Politics of Discursive Networks, Peter Lang, 2015).
Anabel Quan-Haase, The University of Western Ontario
Anabel Quan-Haase is Associate Professor of Information and Media Studies and Sociology at Western University. Her research interests include digital scholarship, networked work, serendipity in work practices, serendipity in social media, and the design of discovery systems that promote serendipity. She is the author of “Technology and Society: Social Networks, Inequality and Power” (Oxford University Press, 2015) and co-editor of the Handbook of Social Media Research Methods (Sage, 2016). She is the past president of the Canadian Association of Information Science and current Council Member of the Communication, Information Technology, and Media Sociology section of the American Sociological Association. She has organized several conferences including the Canadian Association of Information Science Annual Meeting and has served on numerous programme committees.
Diane Rasmussen Pennington, University of Strathclyde
Dr. Diane Rasmussen Pennington is a Lecturer in Information Science in the Department of Computer and Information Sciences at the University of Strathclyde in Glasgow, Scotland, where she is a member of the iLab and the Digital Health and Wellness research groups. She is also the Social Media Manager of the Association for Information Science & Technology (ASIS&T). Dr. Rasmussen Pennington has taught classes on research methods, social media, knowledge organisation, and a range of information technology topics. Her diverse research areas encompass non-text information indexing and retrieval, Emotional Information Retrieval (EmIR), user behaviours on social media, and online health information preferences. She is the editor of Indexing and Retrieval of Non-Text Information (2012) and Social Media for Academics: A Practical Guide (2012). She is currently editing a book series entitled Computing for Information Professionals.
Luke Sloan, Cardiff University
Luke Sloan is a Senior Lecturer in Quantitative Methods and Deputy Director of the Social Data Science Lab at the School of Social Sciences, Cardiff University UK. Luke has worked on a range of projects investigating the use of Twitter data for understanding social phenomena covering topics such as election prediction, tracking (mis)information propagation during food scares and ‘crime-sensing’. His research focuses on the development of demographic proxies for Twitter data to further understand who uses the platform and increase the utility of such data for the social sciences. He sits as an expert member on the Social Media Analytics Review and Information Group (SMARIG) which brings together academics and government agencies and he works closely with the Office for National Statistics and Food Standards Agency.
Ravi Vatrapu, Copenhagen Business School
Ravi Vatrapu is a professor of human computer interaction at the Department of IT Management, Copenhagen Business School; professor of applied computing at the Westerdals Oslo School of Arts Communication and Technology; and director of the Computational Social Science Laboratory (http://cssl.cbs.dk). Prof. Vatrapu’s current research focus is on big social data analytics. Based on the enactive approach to the philosophy of mind and phenomenological approach to sociology and the mathematics of classical, fuzzy and rough set theories, his current research program seeks to design, develop and evaluate a new holistic approach to computational social science, Social Set Analytics (SSA). SSA consists of novel formal models, predictive methods and visual analytics tools for big social data. Prof. Vatrapu holds a Doctor of Philosophy (PhD) degree in Communication and Information Sciences from the University of Hawaii at Manoa, a Master of Science (M.Sc) in Computer Science and Applications from Virginia Tech, and a Bachelor of Technology in Computer Science and Systems Engineering from Andhra University.
Frauke Zeller, Ryerson University
Frauke Zeller is Assistant Professor in the School of Professional Communication at Ryerson University in Toronto (ON), Canada. Her research interests include organizational communication, Human-Computer Interaction/Human-Robot Interaction, digital communication, and method development for digital research analyses. She has been awarded a range of major research grants, among them a Marie Curie Fellowship (2011-2013), which is one of Europe’s most distinguished individual research grants. It enabled her to conduct research on big data and multimodal communication analysis tools. She is the co-creator of hitchBOT, Canada’s first hitchhiking robot, and has also been involved in a range of art works and social-scientific experiments relating to robotics and AI. She is co-editor of “Revitalising audience research: Innovations in European audience research” (Routledge 2015).
References:
boyd, D., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological and scholarly phenomenon. Information, Communication & Society, 15(5), 662–679.
Duggan, M., & Smith, A. (2014). Social media update 2013: 42% of online adults use multiple social networking sites, but Facebook remains the platform of choice (Online). Pew Research Internet Project. Retrieved January 14, 2015, from http://www.pewinternet.org/
Grabau, M., & Hegelich, S. (2016). The Gas Game: Simulating decision-making in the European Union’s external natural gas policy. Accepted for publication in Swiss Political Science Review (SPSR).
Hegelich, S., Fraune, C., & Knollmann, D. (2015). Point predictions and the punctuated equilibrium theory: A data mining approach. Policy Studies Journal (PSJ), 43(2), 228–256.
Hegelich, S., & Shahrezaye, M. (2015). The communication behavior of German MPs on Twitter: Preaching to the converted and attacking opponents. European Policy Analysis (EPA), 1(2), 155–174.
Hogan, B., & Quan-Haase, A. (2010). Persistence and change in social media. Bulletin of Science, Technology & Society, 30(5), 309–315.
Mayer-Schönberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work, and think. New York: Houghton Mifflin Harcourt.
Neal, D. M. (2010). Emotion-based tags in photographic documents: The interplay of text, image, and social influence. Canadian Journal of Information and Library Science, 34(3), 329-353.
Pew Research Center. (2015). Social networking fact sheet. Retrieved December 12, 2015 from http://www.pewinternet.org/fact-sheets/social-networking-fact-sheet/
Rasmussen Pennington, D. (in press). ‘The most passionate cover I’ve seen’: Emotional information in fan-created U2 music videos. Journal of Documentation.
Rasmussen Pennington, D. (2017). Coding of non-text documents. In A. Quan-Haase and L. Sloan (Eds.), Handbook of Social Media Research Methods.
Savage, M., & Burrows, R. (2007). The coming crisis of empirical sociology. Sociology, 41(5), 885–899. http://doi.org/10.1177/0038038507080443
Thelwall, M., & Buckley, K. (2013). Topic-based sentiment analysis for the Social Web: The role of mood and issue-related words. Journal of the American Society for Information Science and Technology, 64(8), 1608–1617.
Williams, L., & Rasmussen Neal, D. (2012). The digital aggregated self: A literature review. Paper presented at the IEEE International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Sanya, China.