Jakob Jünger

Mapping the Field of Automated Data Collection on the Web: Collection Approaches, Data Types, and Research Logic

Social scientists work intensively to collect online data; however, it remains unclear how the different techniques should be characterized methodologically (Bruns 2013). As each technique brings its own methodological issues, comparison and categorization are necessary to reach a more general understanding of basic limitations and opportunities. These basic issues must be discussed before automated methods can become part of mainstream research (Parks 2014: 357).

To better understand the specific circumstances surrounding automated data collection, this paper attempts to map the field of automated data collection methods. In the first section, three specific ways of collecting data are discussed and backed up with examples: working with raw data, accessing programming interfaces, and exploiting user interfaces. In the second section, some basic methodological distinctions reveal a wide variety of data types and their respective methodological properties. For instance, while automated methods are usually considered nonreactive, automated data collection on the Web sometimes entails different types of reactivity. The challenges and opportunities that arise from these collection approaches and types of data are discussed in the third section.