A Primer to Web Scraping with R
Simon Munzert, PhD
Wednesday, April 12, 2017
12:00 - 1:30 PM CDT / 1:00 - 2:30 PM EDT / 10:00 - 11:30 AM PDT
ASA-SRMS members will receive AAPOR member pricing on webinars when registering for live webinars or purchasing recordings of webinars.
NOTE: If you purchased the yearly webinar subscription with your membership renewal, you are already registered for this webinar and do not need to re-register. You will automatically receive the login instructions prior to the event.
About This Course:
The internet offers a wealth of opportunities to learn about public opinion and social behavior. Data from social networks, search engines, and web services open up new ways of measuring human behavior and preferences at previously unknown velocity and variety. Fortunately, the open-source programming language R provides advanced functionality for gathering data from virtually any data source on the Web, whether through classical screen scraping, automated browsing, or tapping APIs. This allows researchers to stay in one programming environment throughout data collection, tidying, analysis, and publication. The talk gives an overview of the web technologies fundamental to gathering data from internet resources. We will then learn about state-of-the-art tools and packages for web scraping with R. Finally, we will discuss subtleties of the web scraping workflow, such as how to ensure reproducibility and stay friendly on the web.
- Summarize web technologies fundamental to web scraping
- Demonstrate how to gather data from webpages and APIs
- Apply valuable tips for a reproducible and friendly scraping workflow
About the Instructor:
Simon Munzert is a Research and Teaching Fellow at the Chair of Comparative Political Behavior, Humboldt University of Berlin. He received his doctoral degree in Political Science from the University of Konstanz. His research interests include measuring and forecasting public opinion, political representation, and the use of new media in society. He has published extensively on these topics in various international outlets. He is also an enthusiastic user of the statistical software R and regularly teaches quantitative methods for social science research.