Yibing Huang1,2, Sisi Zheng2,*
1School of Mathematical Information, Shaoxing University, Shaoxing, Zhejiang, China.
2School of Mathematics and Statistics, Huizhou University, Huizhou, Guangdong, China.
*Corresponding author: Sisi Zheng
Abstract
This paper develops an accuracy-weighted stacking fusion model based on Borderline-SMOTE to predict the repeat purchase behavior of users on the Tmall platform. A GBDT feature selection algorithm is used to rank features by importance and filter out redundant ones, yielding a more efficient feature system. Weight coefficients are constructed from the prediction error of each base learner and applied to that learner's output; the weighted outputs form a new training set that serves as input to the meta-learner. Under this accuracy weighting, better-performing learners receive higher weights, which gives a weighted heterogeneous ensemble and improves overall performance. Experiments show that the proposed Borderline-SMOTE-based accuracy-weighted stacking algorithm achieves high fitting accuracy: AUC, F1, recall, and precision all increase, with AUC reaching 0.93. Its predictive performance surpasses that of XGBoost, LightGBM, CatBoost, Random Forest, and other models. The resulting model can help e-commerce merchants identify potential new users and retain loyal customers, and thereby generate greater business value.
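The accuracy-weighting step described above can be illustrated with a minimal sketch. The abstract does not state the exact weighting formula, so the code assumes one plausible choice: each base learner's weight is its accuracy divided by the sum of all accuracies. The function names `accuracy_weights` and `weighted_meta_features`, the 0.5 decision threshold, and the toy data are all hypothetical and not taken from the paper.

```python
def accuracy_weights(y_true, base_preds, threshold=0.5):
    """Return one weight per base learner, proportional to its accuracy.

    Assumption: weight_i = accuracy_i / sum(accuracies). The paper may use
    a different error-based formula.
    """
    accuracies = []
    for probs in base_preds:
        # A prediction is correct when the thresholded probability matches the label.
        hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, y_true))
        accuracies.append(hits / len(y_true))
    total = sum(accuracies)
    if total == 0:
        # Degenerate case: fall back to equal weights.
        return [1.0 / len(base_preds)] * len(base_preds)
    return [a / total for a in accuracies]


def weighted_meta_features(base_preds, weights):
    """Scale each learner's predicted probabilities by its weight.

    Returns one row per sample and one column per base learner; these rows
    would serve as the meta-learner's training input.
    """
    n_samples = len(base_preds[0])
    return [
        [w * probs[i] for probs, w in zip(base_preds, weights)]
        for i in range(n_samples)
    ]


# Toy data: four samples, two hypothetical base learners.
y_true = [1, 0, 1, 0]
base_preds = [
    [0.9, 0.2, 0.8, 0.1],  # learner A: all four predictions correct
    [0.6, 0.7, 0.4, 0.3],  # learner B: two of four predictions correct
]

weights = accuracy_weights(y_true, base_preds)
meta_X = weighted_meta_features(base_preds, weights)
print(weights)
print(meta_X[0])
```

In this toy case learner A receives twice the weight of learner B, so its column dominates the meta-learner's input. In practice the accuracies would normally be computed on out-of-fold predictions rather than on the data the base learners were trained on, to avoid leaking labels into the meta-features.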