Scraping the Surface: A Guide to Data Collection Using Web Scraping in Sports Medicine.
Pinkoski AM., Ward P., Kluzek S., Arundale AJH., Bullock GS.
Pinkoski, AM, Ward, P, Kluzek, S, Arundale, AJH, and Bullock, GS. Scraping the surface: a guide to data collection using web scraping in sports medicine. J Strength Cond Res 39(12): e1473-e1479, 2025.
Publicly obtained injury data, which can be combined with team and individual performance data, have become an attractive option in sports medicine research. These data are primarily collected by reading a website's source code and converting the results into a format that can be used for further analysis, a process known as web scraping. Despite the growing popularity of web scraping and the public availability of its data sources, studies that use these methods often do not disclose how the data were extracted, which creates replicability issues. Adopting Open Science practices, through transparent methods and shared code or data sets, is one way to ensure reproducible and meaningful results across sports medicine research and applied settings. The purpose of this article was to (a) operationalize and describe data-scraping methods, (b) describe the strengths and weaknesses of data-scraping methods in an applied sports medicine setting, and (c) provide a practical example with adjoining data and code on how to use replicable and reliable data-scraping methods in applied sport settings.
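As a rough illustration of what "reading a website's source code and converting the results into a format that can be used for further analysis" can look like, the following minimal Python sketch parses an HTML table into records and then into CSV text. It is not the authors' code or method: the HTML snippet, the player names, the column names, and the helper names `TableParser`, `scrape_table`, and `to_csv` are all hypothetical, and the page source is embedded as a string so the example runs without network access. A real scraper would first download the page and should respect the site's terms of use.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page source (invented for illustration only).
# A real scraper would download this text from a website.
SAMPLE_HTML = """
<table id="injuries">
  <tr><th>Player</th><th>Injury</th><th>Games Missed</th></tr>
  <tr><td>Player A</td><td>Hamstring strain</td><td>3</td></tr>
  <tr><td>Player B</td><td>Ankle sprain</td><td>5</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def to_csv(records):
    """Serialize the records to CSV text so they can be shared and reanalyzed."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


records = scrape_table(SAMPLE_HTML)
csv_text = to_csv(records)
print(csv_text)
```

Sharing a script like this alongside the extracted data set is one way to make the extraction step itself reproducible, in line with the Open Science practices described above.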