Using Text Regression for Predicting Usefulness of Customer Opinion

Ngo-Ye, T. L., & Sinha, A. P. (2014). The influence of reviewer engagement characteristics on online review helpfulness: A text regression model. Decision Support Systems, 61, pp. 47-58.

Ngo-Ye and Sinha (2014) developed a text regression model for predicting the helpfulness of 7,465 online restaurant reviews posted at Yelp.com and 584 book reviews posted at Amazon.com. According to the authors, most opinion mining studies of reviews have focused on sentiment rather than quality and have ignored reviewer engagement characteristics. The authors argued that reviewer characteristics influence the perception of helpfulness. They further argued that decision makers at online organizations who rely on user-generated content for marketing advantage could better leverage that content if they understood what makes it helpful to others and could predict that helpfulness.

Ngo-Ye and Sinha took a hybrid approach that combined three techniques: the bag-of-words (BOW) model, a type of vector space model (VSM) that represents a document as an unordered collection of its words; recency, frequency, and monetary (RFM) analysis, a tool commonly used for maximizing response rates in direct marketing; and correlation-based feature selection (CFS), a dimension reduction technique. They compared the predictive strength of their proposed model with that of the ZeroR model, a baseline method that relies only on the target and ignores the predictor variables. The authors also experimented with and cross-compared four common index weighting schemes: binary occurrence, term occurrence, term frequency, and term frequency-inverse document frequency (TF-IDF). The following table summarizes the conceptual models and the predictor variables used by the authors. All models used the number of useful votes as the target variable.

Model	Predictor Variable(s)
ZeroR	NA
RFM	Recency, Frequency, Monetary value
BOW	Word₁, Word₂, ... Word_N
BOW/CFS	Word₁, Word₂, ... Word_p
BOW/CFS + RFM	Word₁, Word₂, ... Word_p, Recency, Frequency, Monetary value

The authors sought to determine whether the BOW/CFS model was a better predictor of review helpfulness than ZeroR, whether the BOW/CFS + RFM model was a better predictor than BOW/CFS alone, and whether BOW/CFS was a better predictor than RFM alone. To do so, they instantiated 28 models using support vector regression (SVR), a regression technique derived from support vector machines (Basak, Pal, & Patranabis, 2007). The results indicated that BOW/CFS + RFM significantly (p < .05) outperformed the other models in predicting review helpfulness for both the Yelp.com and Amazon.com datasets.
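Two pieces of the setup described above are easy to make concrete: the four index weighting schemes and the ZeroR baseline. The sketch below is only an illustration under stated assumptions. The toy reviews and vote counts are invented, the function names (weight_documents, zero_r, rmse) are hypothetical, and the weighting formulas are common textbook definitions that may differ in normalization from the ones the authors' tooling applied. For a numeric target such as the number of useful votes, ZeroR is taken here to mean "always predict the mean of the training targets."

```python
import math
from collections import Counter

# Toy corpus and vote counts: invented for illustration, not data from the paper.
reviews = [
    "great food great service",
    "slow service",
    "great location slow kitchen",
]
useful_votes = [5, 1, 3]


def weight_documents(docs, scheme):
    """Return one {term: weight} dict per document under the given scheme."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vector = {}
        for term, count in counts.items():
            if scheme == "binary":        # 1 if the term occurs at all
                vector[term] = 1.0
            elif scheme == "occurrence":  # raw count of the term
                vector[term] = float(count)
            elif scheme == "tf":          # count relative to document length
                vector[term] = count / len(tokens)
            elif scheme == "tfidf":       # tf scaled by log(N / df)
                tf = count / len(tokens)
                vector[term] = tf * math.log(n_docs / doc_freq[term])
            else:
                raise ValueError(f"unknown scheme: {scheme}")
        vectors.append(vector)
    return vectors


def zero_r(targets):
    """ZeroR for a numeric target: ignore all predictors, predict the mean."""
    return sum(targets) / len(targets)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Show how the first review is represented under each weighting scheme.
for scheme in ("binary", "occurrence", "tf", "tfidf"):
    print(scheme, weight_documents(reviews, scheme)[0])

# Error of the baseline that any feature-based model would need to beat.
baseline = zero_r(useful_votes)
print("ZeroR RMSE:", rmse(useful_votes, [baseline] * len(useful_votes)))
```

A model built on BOW or RFM features is only worth its added complexity if its error on held-out reviews is lower than this baseline error, which is the kind of comparison the authors' research questions describe.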

Anderson, Sweeney, Williams, Camm, and Martin (2012) stated that linear programming has numerous applications in marketing, such as media selection and market research. The research by Ngo-Ye and Sinha (2014) is another example of quantitative modeling in marketing: it demonstrates how such models can help marketers decide which content to display prominently, to which users, and at which points in time. The ability to accurately predict which online reviews users will perceive as most helpful could give a company a competitive edge and increase user satisfaction.

Anderson, D. R., Sweeney, D. J., Williams, T. A., Camm, J. D., & Martin, K. (2012). An introduction to management science: Quantitative approaches to decision making (13th ed.). U.S.: South-Western.

Basak, D., Pal, S., & Patranabis, D. C. (2007). Support vector regression. Neural Information Processing – Letters and Reviews, 11(10), pp. 203-224.

Kathleen Marrs, Ph.D.

Kathleen wants to live in a world filled with open books, open source, open hearts, and open minds in which diversity is embraced and creativity flourishes.

A longtime CPA turned online professor, Kathleen saw her life transformed upon completing her dissertation, “An Investigation of the Factors that Influence Faculty and Student Acceptance of Mobile Learning in Online Higher Education.”

Her statistical analysis was called “pioneering” by her committee chair, Dr. Marlyn K. Littman, and brought Kathleen full circle back to her number-crunching roots, inspiring her to earn a second master’s in Business Intelligence.

Kathleen plans to continue her studies of contemporary issues related to teaching, learning, and technology and loves to help undergrad and grad students achieve their academic and professional goals. As a lifelong learner, she also plans to continue her quest to understand the problems posed by mobile and micro learning formats and to find innovative ways of helping people maximize the benefits these emerging technologies afford.
