Predicting Film Box Office Performance Using Wikipedia Edit Data

Patel, Niraj (2025) Predicting Film Box Office Performance Using Wikipedia Edit Data. International Journal of Innovative Science and Research Technology, 10 (2): 25FEB802. pp. 1951-1956. ISSN 2456-2165

Abstract

This study explores the potential of Wikipedia edit data as a predictor of opening box office revenues for films released in the US. Using films released from 2007 to 2011, we developed a predictive model based on Wikipedia article edits, with gradient boosting trees as the primary algorithm. The model incorporates features such as the frequency of Wikipedia edits, the size and content of article revisions, and the revenues of similar films. The results demonstrate that Wikipedia activity can serve as a rough indicator of film popularity, though the model’s predictive accuracy is limited. Wikipedia-based features, particularly edit runs and content changes, contribute significantly to the model’s performance, which reaches an R² of 0.54 for films released in 2012. This suggests that while Wikipedia data offers valuable insight into social interest, it is best used in conjunction with other predictors for more reliable revenue estimates.
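For readers unfamiliar with the reported metric, the following is a minimal sketch of how an R² (coefficient of determination) value is computed from actual and predicted revenues. It is not the author's code. The function name `r_squared` and the toy revenue figures are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: error of always predicting the mean revenue.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical toy numbers (millions of USD), NOT data from the study.
actual_revenue = [10.0, 20.0, 30.0, 40.0]
predicted_revenue = [12.0, 18.0, 33.0, 37.0]
score = r_squared(actual_revenue, predicted_revenue)
```

Under this definition, a value of 1 means the predictions match the actual revenues exactly, and a value of 0 means the model does no better than always predicting the mean revenue. Read this way, an R² of 0.54 indicates that the model accounts for roughly half of the variance in opening revenues.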
