Bag of words interprets online reviews

Online reviews from customers of products and services are widespread and, it seems, influential. A sale or booking can hinge on a constellation of five-star reviews on the likes of TripAdvisor, or a potential customer may be deterred from parting with their hard-earned cash and disappear down a digital black hole if the reviews are anything but stellar.

Writing in the International Journal of Business Intelligence and Data Mining, a team from Turkey has looked at an approach to analysing reviews on the likes of TripAdvisor using a so-called Bag-of-Words (BOW) model. The BOW model is a popular tool in data mining that treats each word or set of words as a distinct feature of a text document, in this case a review. Each feature is then given a numerical weight so that the features can be balanced against one another, and an algorithm processes those weights to reveal the nature of the review in considerable detail.
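As a rough illustration of the idea described above, the following minimal sketch builds a bag-of-words representation for three invented reviews and then weights each word. It is not the team's implementation: the sample reviews, the crude tokenizer, and the function names are hypothetical, and TF-IDF is used here only as one common weighting choice, since the weighting scheme used in the paper is not described in this summary.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())

def bag_of_words(docs):
    """Return the sorted vocabulary and one word-count vector per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    vectors = [[c[word] for word in vocab] for c in counts]
    return vocab, vectors

def tf_idf(vectors):
    """Weight raw counts by inverse document frequency (assumed weighting, not the paper's)."""
    n_docs = len(vectors)
    n_terms = len(vectors[0]) if vectors else 0
    # Document frequency: in how many documents does each term appear?
    df = [sum(1 for v in vectors if v[j] > 0) for j in range(n_terms)]
    return [[v[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
            for v in vectors]

# Invented example reviews, for illustration only.
reviews = [
    "Great pool and great beach",
    "The beach bar was noisy",
    "Lovely museum near the hotel",
]

vocab, vectors = bag_of_words(reviews)
weights = tf_idf(vectors)

for word, count, weight in zip(vocab, vectors[0], weights[0]):
    if count:
        print(f"{word}: count={count}, weight={weight:.3f}")
```

In this toy example a word that is repeated in one review but absent from the others ("great") receives a higher weight than a word shared across reviews ("beach"), which is the kind of balancing between features that the weights are meant to provide.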

The team has successfully shown how this BOW model can be used to analyse reviews of hotels. Specifically, their approach has been applied to almost three-quarters of a million TripAdvisor reviews of more than one thousand hotels in different tourist regions of Turkey. Their workflow makes short work of the processing when compared with how such a rich lode of reviews might be mined using conventional, manual techniques.

The team explains that their approach shows that building a dimensional model dataset before any text mining takes place is an optimal way to make data retrieval much more efficient and to help represent the data by different measures of interest.
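To make the notion of a dimensional model concrete, here is a small sketch under stated assumptions: reviews sit in a fact table that refers to separate dimension tables (here hotel and language), so that reviews can be selected by a dimension of interest before any words are counted. The table layouts, field names, and sample rows are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Dimension tables: small lookup tables keyed by an id (all values invented).
hotel_dim = {
    1: {"name": "Hotel A", "region": "Aegean"},
    2: {"name": "Hotel B", "region": "Istanbul"},
}
language_dim = {1: "en", 2: "tr"}

# Fact table: one row per review, holding dimension keys, a rating, and the text.
# The Turkish-language row is shown in English here purely for readability.
review_facts = [
    {"hotel_id": 1, "language_id": 1, "rating": 5, "text": "sunny beach and pool bar"},
    {"hotel_id": 1, "language_id": 2, "rating": 4, "text": "beach food was tasty"},
    {"hotel_id": 2, "language_id": 1, "rating": 4, "text": "museum tour and old town"},
]

def select_reviews(region=None, language=None):
    """Return fact rows matching the requested dimension values (None means any)."""
    rows = []
    for fact in review_facts:
        if region is not None and hotel_dim[fact["hotel_id"]]["region"] != region:
            continue
        if language is not None and language_dim[fact["language_id"]] != language:
            continue
        rows.append(fact)
    return rows

def term_counts(rows):
    """Count whitespace-separated words across the selected reviews."""
    counts = Counter()
    for row in rows:
        counts.update(row["text"].lower().split())
    return counts

# Slice by a dimension first, then run the text-mining step on that slice only.
aegean_terms = term_counts(select_reviews(region="Aegean"))
english_terms = term_counts(select_reviews(language="en"))
print(aegean_terms.most_common(3))
```

Because the selection step touches only the keys and lookup tables, the same word-counting code can be reused for any combination of region and language without re-reading or re-parsing the full set of reviews.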

The specifics of the study revealed what might be expected of the hotels whose reviews were analysed: hotels in the coastal Aegean and Mediterranean regions were the focus of those seeking fun and sun, whereas hotels in Istanbul and other historic centres were associated more with the cultural and educational aspects of tourism in Turkey. Interestingly, reviews written in English more prominently discussed the bar and à la carte restaurants, whereas reviews in Turkish typically focused on the food itself.

“We conclude that adopting and automating this proposed workflow into the hotel BI systems may prove considerably beneficial, providing hotel managers with essential insights necessary to understand and track customers and competitors,” the team writes. They add that most other research in this field has focused on the Chinese and US markets and the current work adds a novel dimension to the literature.

Bektas, J. and Elsadig, A. (2022) ‘A unified workflow strategy for analysing large-scale TripAdvisor reviews with BOW model’, Int. J. Business Intelligence and Data Mining, Vol. 21, No. 1, pp.102–117.