Accuracy of Multiple Linear Regression in Predicting View Count on the YouTube Top 100 Songs 2025 Dataset

Authors

Keywords:

Multiple Linear Regression, YouTube, View Count, Outlier, Model Evaluation, Machine Learning

Abstract

The popularity of content on YouTube is often predicted with simple statistical approaches such as linear regression, even though digital engagement data are generally non-linear and affected by extreme outliers. This leaves a research gap concerning how effectively linear regression models view counts for popular music. This study evaluates the accuracy of a multiple linear regression model in predicting music video view counts on the YouTube Top 100 Songs 2025 dataset. The predictor variables are song duration and channel follower count. The methods comprise exploratory data analysis, correlation testing between variables, and evaluation of model performance using the coefficient of determination (R²), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). The results show that the view count distribution is strongly right-skewed, dominated by extreme outliers, and only very weakly correlated with the predictor variables. The model produced a negative R² of −0.0675, meaning it performed worse than simply predicting the mean view count, with an MAE of 161,251,287 and an RMSE of 443,492,327, indicating poor predictive performance. These findings indicate that multiple linear regression is ineffective for modelling YouTube video popularity on this dataset, and that a non-linear approach or a model more robust to skewed data and outliers is needed.
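To make the three evaluation metrics concrete, the following is a minimal sketch of how R², MAE, and RMSE can be computed from paired actual and predicted values. It is not the authors' code: the function name `regression_metrics` and the toy numbers are hypothetical and chosen only to show how a single extreme value can push R² below zero and pull RMSE well above MAE, the same pattern the abstract reports.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for paired lists of actual and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Hypothetical toy values (not data from the study): one extreme outlier.
actual = [10.0, 12.0, 11.0, 13.0, 200.0]

# Baseline that always predicts the mean; its R² is 0 by definition.
baseline = [sum(actual) / len(actual)] * len(actual)

# Hypothetical model output that misses the outlier badly.
poor_model = [30.0, 35.0, 28.0, 40.0, 20.0]

r2_base, mae_base, rmse_base = regression_metrics(actual, baseline)
r2_poor, mae_poor, rmse_poor = regression_metrics(actual, poor_model)

print(f"baseline:   R2={r2_base:.4f}  MAE={mae_base:.2f}  RMSE={rmse_base:.2f}")
print(f"poor model: R2={r2_poor:.4f}  MAE={mae_poor:.2f}  RMSE={rmse_poor:.2f}")
```

Because R² compares the model's squared error with that of a constant mean prediction, any model whose squared error exceeds the baseline's yields a negative value. RMSE squares each error before averaging, so one large miss inflates it far more than it inflates MAE.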

Published

20-12-2025

How to Cite

Keakuratan Regresi Linear Berganda Dalam Memprediksi View Count Pada Dataset Youtube Top 100 Songs 2025. (2025). Jurnal Ilmiah Epigraf: Kajian Ilmu Sosial Multidisiplin, 1(1), 69-78. https://ejournal.inskripsi.org/index.php/epigraf/article/view/21