CN117094856B - Prediction method for user evaluation behavior after embedding OTA website based on panel Logit model


Info

Publication number
CN117094856B
CN117094856B CN202311074103.8A CN202311074103A CN117094856B CN 117094856 B CN117094856 B CN 117094856B CN 202311074103 A CN202311074103 A CN 202311074103A CN 117094856 B CN117094856 B CN 117094856B
Authority
CN
China
Prior art keywords
comment
logic model
comments
model
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311074103.8A
Other languages
Chinese (zh)
Other versions
CN117094856A (en)
Inventor
张紫琼
王乐
吴少辉
王雪妍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN202311074103.8A
Publication of CN117094856A
Application granted
Publication of CN117094856B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/12Hotels or restaurants
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/10Pre-processing; Data cleansing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • General Business, Economics & Management (AREA)
  • Algebra (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A prediction method for user evaluation behavior after embedding an OTA website based on a panel Logit model, belonging to the technical field of data analysis. Step one: acquire an original data set. Step two: preprocess the original data set to obtain a cleaned data set. Step three: review the literature and theories related to customer evaluation behavior, determine a research model of how the features of reviews embedded in the OTA website influence subsequent review behavior on the original OTA website, define the variables required by the research model, and calculate them from the cleaned data set. Step four: construct a panel Logit model and obtain its coefficients and residual terms. Step five: robustness test; if the panel Logit model passes the robustness test, the model is considered reliable, and the coefficients and residual terms are substituted back into the panel Logit model to obtain a prediction model; if not, the panel Logit model is rebuilt and re-analyzed, that is, step four is executed again. The method is used to predict user evaluation behavior.

Description

Prediction method for user evaluation behavior after embedding OTA website based on panel Logit model
Technical Field
The invention belongs to the technical field of data analysis, and in particular relates to a prediction method for user evaluation behavior after embedding an OTA website based on a panel Logit model.
Background
With the development of e-commerce, online reviews have become a widely accepted and trusted source of information for most consumers. In the travel industry, consumers tend to search the web for information before making a purchase decision and to post satisfaction ratings after consumption. However, the sheer number of available online reviews can cause information overload and impose additional perceived costs on consumers. Moreover, hotel consumers no longer rely on a single information source when making purchase and evaluation decisions: both information internal to an OTA website and information external to it may affect their decisions. Academic research on how external information influences consumer evaluation behavior has produced a sizable body of results, but considerable room for further study remains. Existing research focuses mainly on external review information from social media platforms and e-commerce websites, and few studies examine how external information embedded in an OTA website influences user evaluation behavior. As a result, platforms and hotel managers find it difficult to understand how external information from different channels affects user evaluation behavior, and difficult to predict that behavior well. In addition, regarding the distinctive platform design in which review information from an external OTA website is embedded in the original OTA website, the academic community has not reached a conclusion on its advantages and disadvantages, and few studies address the question through quantitative analysis, which makes it difficult for platform managers to manage this design reasonably.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a prediction method for user evaluation behavior after embedding an OTA website based on a panel Logit model.
The analyzed data are review data obtained from the Yilong website. By analyzing and processing a large volume of review data, key information characterizing the reviews is obtained, such as the positive rating ratio and review count of the internal reviews and the variance of the embedded reviews. A panel Logit model is then used to explore how the embedded external information influences user evaluation behavior when it is consistent or inconsistent with the internal information on these review features. The analysis identifies the key factors influencing user evaluation behavior and yields a model that predicts the rating of a review, which allows platforms and hotel managers to adjust their marketing strategies in time and provides useful guidance on the relationship between external reviews embedded from another OTA website and the internal reviews of the OTA website.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
A prediction method for user evaluation behavior after embedding an OTA website based on a panel Logit model comprises the following steps:
Step one: acquiring an original data set; review data, hotel data and user data of all hotels in a target area that have embedded TripAdvisor user reviews are acquired from an OTA website by a Java crawler program to obtain the original data set;
Step two: data preprocessing; the original data set is cleaned by deleting samples with missing data or abnormal values to obtain a cleaned data set;
Step three: variable definition and calculation; the literature and theories related to customer evaluation behavior are reviewed, a research model of how the features of reviews embedded in the OTA website influence subsequent review behavior on the original OTA website is determined, the variables required by the research model are defined, and the required variables are calculated from the cleaned data set;
Step four: panel Logit model analysis; first, explanatory variables and moderating variables are determined according to the evaluation attribute of each review, and review text length, number of pictures, review device, travel type and hotel room type are included in the analysis to construct a Logit model; second, to obtain robust results, the hotel fixed effect and the time effect of the reviews are controlled by introducing hotel-specific and time-specific variables, and a panel Logit model is constructed with user evaluation behavior as the dependent variable to explore the influence relationships; the variable data calculated in step three are imported into Stata for estimation to obtain the coefficients and residual terms of the panel Logit model;
Step five: robustness test; the sample window of the explanatory variable "positive rating ratio" is enlarged and reduced respectively, the variable is recalculated, and the panel Logit model is re-estimated to obtain new results, which are compared with the previous results; if the panel Logit model passes the robustness test, the model is considered reliable, and the coefficients and residual terms obtained in step four are substituted back into the panel Logit model to obtain a prediction model; if the robustness test is not passed, the panel Logit model is rebuilt and re-analyzed, that is, step four is executed again.
Further, in step three, the variables required by the research model are the positive rating ratio, the number of previous reviews, the average rating of the embedded reviews, and the variance of the embedded reviews.
Further, in step four, the positive rating ratio and the number of previous reviews are used as explanatory variables, and the average rating of the embedded reviews and the variance of the embedded reviews are used as moderating variables.
Compared with the prior art, the invention has the following beneficial effects. First, the method uses review data, hotel data and user data from the OTA website to calculate the variables required by the model, such as the positive rating ratio of previous reviews, the number of previous reviews, the variance of the embedded reviews and the number of embedded reviews. It controls hotel-level characteristics, individual-level characteristics and time effects, constructs a panel Logit model, analyzes how different review features influence user evaluation behavior, explores how the interaction between internal and external reviews influences that behavior, and on this basis predicts user evaluation behavior from the internal and external reviews. In the field of review management, the prediction method takes the influence of external reviews into account and explores the effect of the interaction between the two kinds of reviews on user evaluation behavior. Factors such as text length, number of pictures, review device, travel type and hotel room type are also included when building the panel Logit model, so that platforms and hotel managers can see the weight of each influencing factor and direct effort to the areas that can be improved. Second, the prediction method can account for the unique characteristics of each hotel: a fixed-effects model specific to a hotel can be built from that hotel's data, so that user reviews can be managed better.
Finally, the panel Logit model in the prediction method can also analyze hotel review data at scale, mining and analyzing user reviews and hotel information comprehensively, efficiently, accurately and in a way that is easy to implement.
Drawings
Fig. 1 is a flow chart of the prediction method for user evaluation behavior after embedding an OTA website based on a panel Logit model.
Detailed Description
Embodiment one: as shown in Fig. 1, this embodiment discloses a prediction method for user evaluation behavior after embedding an OTA website based on a panel Logit model, which comprises the following steps:
Step one: acquiring an original data set; review data, hotel data and user data of all hotels in a target area that have embedded TripAdvisor user reviews are acquired from an OTA website by a Java crawler program to obtain the original data set (for example, the review data, hotel data and user data of all hotels in Beijing that have embedded TripAdvisor user reviews);
Step two: data preprocessing; the original data set is cleaned by deleting samples with missing data or abnormal values to obtain a cleaned data set;
Step three: variable definition and calculation; the literature and theories related to customer evaluation behavior are reviewed, a research model of how the features of reviews embedded in the OTA website influence subsequent review behavior on the original OTA website is determined, the variables required by the research model are defined, and the required variables are calculated from the cleaned data set;
Step four: panel Logit model analysis; first, explanatory variables and moderating variables are determined according to the evaluation attribute of each review, and review text length, number of pictures, review device (mobile phone, computer and so on), travel type and hotel room type are included in the analysis to construct a Logit model; second, to obtain robust results, the hotel fixed effect and the time effect of the reviews are controlled by introducing hotel-specific and time-specific variables, and a panel Logit model is constructed with user evaluation behavior (taking the evaluation attribute as an example) as the dependent variable to explore the influence relationships; the variable data calculated in step three are imported into Stata for estimation to obtain the coefficients and residual terms of the panel Logit model (the coefficients reveal how the different features of internal and external reviews, hotel characteristics and user characteristics influence user evaluation behavior; the residual terms represent the part that the model does not explain);
Step five: robustness test; the sample window of the explanatory variable "positive rating ratio" is enlarged and reduced respectively, the variable is recalculated (for example, the positive rating ratio among the 50 reviews and among the 200 reviews preceding the current review), and the panel Logit model is re-estimated to obtain new results, which are compared with the previous results; if the panel Logit model passes the robustness test, the model is considered reliable, and the coefficients and residual terms obtained in step four are substituted back into the panel Logit model to obtain a prediction model; if the robustness test is not passed, the panel Logit model is rebuilt and re-analyzed, that is, step four is executed again.
Further, in step three, the variables required by the research model and their definitions are: the positive rating ratio (the share of positive reviews among the 100 reviews preceding the current review), the number of previous reviews (the number of reviews posted before the current review), the average rating of the embedded reviews (the mean of the embedded review scores), and the variance of the embedded reviews (the variance of the embedded review scores).
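The four variable definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names, the 0/1 encoding of "not recommended"/"recommended", and the choice of population variance (rather than sample variance) are all assumptions, since the text does not specify them.

```python
from statistics import mean, pvariance

def positive_ratio(prior_labels, window=100):
    """Share of positive (1) labels among the last `window` prior reviews.

    `prior_labels` is ordered oldest to newest; 1 = recommended, 0 = not (assumed encoding).
    """
    recent = prior_labels[-window:]
    if not recent:
        return None  # no prior reviews, so the ratio is undefined
    return sum(recent) / len(recent)

def review_variables(prior_labels, embedded_scores, window=100):
    """Compute the four variables for one review from its history (hypothetical helper)."""
    return {
        "RecomRatio": positive_ratio(prior_labels, window),
        "RevNum": len(prior_labels),
        "TripAve": mean(embedded_scores),
        # Population variance is an assumption; the text only says "variance".
        "TripVar": pvariance(embedded_scores),
    }

# Made-up history: four prior reviews and three embedded scores on a five-point scale.
print(review_variables([1, 1, 0, 1], [5, 4, 3]))
```

Passing `window=50` or `window=200` gives the alternative positive rating ratios used in the robustness test of step five.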
Further, in step four, the positive rating ratio and the number of previous reviews are used as explanatory variables, and the average rating of the embedded reviews and the variance of the embedded reviews are used as moderating variables.
Example 1:
This example discloses a prediction method for user evaluation behavior after embedding an OTA website based on a panel Logit model. It uses data analysis to mine the review features (review volume, review valence, review variance and positive rating ratio) of the OTA website and of the embedded external OTA website, and then analyzes the different factors influencing user evaluation behavior through the panel Logit model. The aim is to better predict user evaluation behavior, explore how the various review-related factors influence it, and guide managers of OTA websites and hotels in managing user reviews effectively.
1. Study data and methods
1. Study data
Given the growing information overload faced by users of online review websites, it is increasingly important to find ways to reduce cognitive costs and make useful information reliable. In this example, a Java-based crawler retrieved 36,117 review records with embedded reviews for Beijing hotels listed on an OTA website (for example, the Yilong website) during the data collection period, and obtained review data (timestamp, stay time, review valence), hotel data (hotel level, hotel room type), personal data (review device, travel type) and so on. The example explores how the embedded external information and other features influence user evaluation behavior when the external and internal information are consistent or inconsistent on review features (such as review volume, review valence, review variance and positive rating ratio).
2. Research method
With the development of the Internet, daily life has become far more convenient, and the public increasingly relies on the Internet to book hotels and buy tickets for scenic spots. The hospitality market is becoming more complex, and the most prominent problem is how hotels and their managers should understand consumers' expectations in order to evaluate and improve service quality in a targeted way. Reviews posted by consumers on an OTA website are an important communication channel between hotels and consumers: they are consumers' feedback on their hotel stay, a channel through which hotel managers learn of consumers' demands in time, and a strong influence on the purchase and evaluation decisions of potential users. The invention therefore provides a prediction method for user evaluation behavior after embedding an OTA website based on a panel Logit model. By mining review information (such as review valence, positive rating ratio and review volume) on the internal OTA website and the embedded website, it analyzes how the different features of the OTA website's own reviews and the embedded reviews, together with hotel and individual characteristics, influence user evaluation behavior.
As shown in fig. 1, the method of the present invention comprises the steps of:
(1) Acquiring an original data set; review data, hotel data and user data of all hotels in Beijing that have embedded TripAdvisor user reviews are acquired from the Yilong website by a Java crawler program to obtain the original data set;
(2) Data preprocessing; the original data set is cleaned by deleting samples with missing data or abnormal values to obtain a cleaned data set;
(3) Variable definition and calculation; the literature and theories related to customer evaluation behavior are reviewed, a research model of how the features of reviews embedded in the OTA website influence subsequent review behavior on the original OTA website is determined, the variables required by the research model are defined, and the required variables are calculated from the cleaned data set. The calculation of each variable follows from its definition: the positive rating ratio (the share of positive reviews among the 100 reviews preceding the current review), the number of previous reviews (the number of reviews posted before the current review), the average rating of the embedded reviews (the mean of the embedded review scores) and the variance of the embedded reviews (the variance of the embedded review scores);
(4) Panel Logit model analysis; first, according to the evaluation attribute of each review, the positive rating ratio, the number of previous reviews, the average rating of the embedded reviews and the variance of the embedded reviews are taken as explanatory and moderating variables, and review text length, number of pictures, review device, travel type and hotel room type are included in the analysis to construct a Logit model; second, to obtain robust results, the hotel fixed effect and the time effect of the reviews are controlled by introducing hotel-specific and time-specific variables, and a panel Logit model is constructed with user evaluation behavior (taking the evaluation attribute as an example) as the dependent variable to explore the influence relationships; the variable data calculated in step (3) are imported into Stata for estimation to obtain the coefficients and residual terms of the fixed-effects model, where the coefficients reveal how the different features of internal and external reviews, hotel characteristics, user characteristics and so on influence user evaluation behavior, and the residual terms represent the part that the model does not explain;
(5) Robustness test; the sample window of the explanatory variable "positive rating ratio" is enlarged and reduced respectively, the variable is recalculated (for example, the positive rating ratio among the 50 reviews and among the 200 reviews preceding the current review), and the panel Logit model is re-estimated to obtain new results, which are compared with the previous results; if the panel Logit model passes the robustness test, it is considered reasonably reliable, and the coefficients and residual terms obtained in step (4) are substituted back into the panel Logit model to obtain a prediction model; if the robustness test is not passed, the panel Logit model is rebuilt and re-analyzed, that is, the method returns to step (4).
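The comparison in step (5) can be sketched as follows. The patent does not state the criterion for "passing" the robustness test, so this sketch assumes one common convention: the key coefficients must keep the same sign when the model is re-estimated with the 50-review and 200-review windows. All function names and all coefficient values below are hypothetical placeholders, not results from the patent.

```python
def same_sign(a: float, b: float) -> bool:
    """True when both coefficients are nonzero and share a sign."""
    return a * b > 0

def passes_robustness(baseline: dict, re_estimates: list, key_vars: tuple) -> bool:
    """Assumed criterion: every key coefficient keeps its sign in every re-estimate."""
    return all(
        same_sign(baseline[var], estimate[var])
        for estimate in re_estimates
        for var in key_vars
    )

KEY_VARS = ("RecomRatio", "LgRevNum", "TripAve", "TripVar")

# Hypothetical coefficient sets for the 100-, 50- and 200-review windows.
window_100 = {"RecomRatio": 1.0, "LgRevNum": 0.2, "TripAve": 0.05, "TripVar": -0.1}
window_50 = {"RecomRatio": 0.9, "LgRevNum": 0.3, "TripAve": 0.04, "TripVar": -0.2}
window_200 = {"RecomRatio": 1.1, "LgRevNum": 0.1, "TripAve": 0.06, "TripVar": -0.1}

if passes_robustness(window_100, [window_50, window_200], KEY_VARS):
    print("robust: keep the step (4) coefficients as the prediction model")
else:
    print("not robust: return to step (4)")
```

A stricter check could also compare significance levels or magnitudes; sign agreement is only the simplest version of the idea.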
2. Experiment and analysis
1. Data source and preprocessing
All review data of all Beijing hotels with an embedded TripAdvisor user review module on the Yilong website were collected by a Java program, together with hotel data (hotel level, hotel room type), personal data (review device, travel type) and so on. After samples with missing data or that did not meet the requirements were removed, a total of 36,117 review records with embedded reviews were obtained.
Because the crawled data are varied in type and large in volume, the raw data usually need to be preprocessed to improve their reliability. The specific process is as follows:
(1) Screening out reviews containing missing values in Excel;
(2) Deleting reviews with abnormal values through manual screening.
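The two cleaning steps above were done in Excel and by hand. As a rough programmatic equivalent, the sketch below drops records with missing fields or abnormal values. The field names and the abnormal-value rules are assumptions made for illustration; the patent does not publish its data schema or its screening rules.

```python
# Hypothetical field names; the patent does not publish its schema.
REQUIRED_FIELDS = ("hotel_id", "recommended", "text", "device", "travel_type", "room_type")

def is_clean(record: dict) -> bool:
    """Return True if no required field is missing and no value is abnormal."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or value == "":
            return False  # missing value (step 1)
    # Assumed abnormal-value rules (step 2), for illustration only.
    if record["recommended"] not in (0, 1):
        return False
    if record.get("pic_num", 0) < 0:
        return False
    return True

def clean_dataset(records: list) -> list:
    """Keep only the records that pass both checks."""
    return [record for record in records if is_clean(record)]

raw = [
    {"hotel_id": "h1", "recommended": 1, "text": "good", "device": "mobile",
     "travel_type": "business", "room_type": "double", "pic_num": 2},
    {"hotel_id": "h1", "recommended": 1, "text": "", "device": "mobile",
     "travel_type": "family", "room_type": "double"},          # missing text
    {"hotel_id": "h2", "recommended": 3, "text": "ok", "device": "pc",
     "travel_type": "couple", "room_type": "single"},          # abnormal label
]
print(len(clean_dataset(raw)), "of", len(raw), "records kept")
```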
2. Variable measurement
The data obtained from the website are varied and include user review data, user personal data, basic hotel data and so on, but much of it cannot be used directly to analyze user evaluation behavior. The variables required for modeling are therefore measured from the existing data. The specific variable definitions and measurements are as follows:
(1) Explained variable: hotel reviews on the Yilong website carry only a "recommended" or "not recommended" rating, representing a positive or a negative evaluation respectively. Hotel recommendation (SquRecom) indicates whether the hotel is recommended in a review.
(2) Explanatory variables: the positive rating ratio (RecomRatio) is defined as the share of positive reviews among the 100 reviews preceding the current review, reflecting how positive the internal reviews are. The number of previous reviews (RevNum) is defined as the number of reviews posted before the current review. The average rating of the embedded reviews (TripAve) is measured as the mean of the embedded review scores on TripAdvisor's five-point scale, and represents the emotional tone of the embedded external information observed by the consumer. The variance of the embedded reviews (TripVar) is defined as the variance of the embedded review ratings.
(3) Control variables: variables associated with the review are controlled, including review length (RevLen) and the number of pictures in the current review (PicNum). The device used to post the review (Device) and the travel type, which reflects different travel companions (TravelType), are used to capture consumer heterogeneity. Hotel-specific variables such as room type (RoomType) and the hotel fixed effect (Hotel FE) are controlled. Time-specific variables are also controlled: a vector of month dummy variables is added as a month fixed effect (Month FE) to account for unobserved time heterogeneity.
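The month fixed effect described in (3) amounts to one 0/1 column per calendar month, with one month left out as the baseline. The sketch below shows one way to build such columns from review timestamps. The ISO `YYYY-MM-DD` timestamp format, the column naming and the choice of the earliest month as baseline are assumptions for illustration; in Stata the same effect is usually obtained with factor-variable notation rather than hand-built columns.

```python
def month_dummies(timestamps):
    """One-hot encode the 'YYYY-MM' month of each ISO date string.

    The earliest month is dropped and serves as the baseline category,
    which avoids perfect collinearity with the constant term.
    """
    months = sorted({ts[:7] for ts in timestamps})
    kept = months[1:]  # all months except the baseline
    return [{f"month_{m}": int(ts[:7] == m) for m in kept} for ts in timestamps]

# Made-up timestamps covering two months.
rows = month_dummies(["2019-01-05", "2019-02-10", "2019-01-20"])
for row in rows:
    print(row)
```

Categorical controls such as Device, TravelType and RoomType can be encoded in the same way.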
3. Panel Logit model analysis
According to whether the hotel is recommended in each review, the positive rating ratio of the previous reviews, the number of previous reviews, and the average rating and variance of the embedded reviews are taken as independent variables; individual-level and hotel-level characteristics are controlled, together with the hotel fixed effect and the time effect; and a panel Logit model is constructed to explore the influence relationships and to estimate its coefficients and residual terms. The study finds that the positive rating ratio and number of previous reviews, and the average valence and variance of the embedded reviews, all have a significant impact on users' recommendation behavior. The features of the embedded reviews also significantly moderate the relationship between the internal reviews and users' recommendation behavior.
3.1 Modeling
Because the explained variable is a binary choice, a fixed-effects Logit regression model is used for the regression. Using the variables constructed above, the probability of the user evaluation behavior is modeled from the positive rating ratio and review count of the internal reviews and the average rating and variance of the external reviews, while review length, number of pictures, device, travel type and so on are controlled, to construct the panel Logit model.
In addition, to study the possible interaction between internal and external reviews, interaction terms between the explanatory variables are added to the original model to explore possible moderating effects.
(1) Panel Logit model: the user evaluation behavior prediction model without interaction terms.
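$$
\begin{aligned}
\mathrm{SquRecom}_{ijt} = {} & \alpha + \beta_1 \mathrm{RecomRatio}_{ijt} + \beta_2 \mathrm{LgRevNum}_{ijt} + \beta_3 \mathrm{TripAve}_{jt} + \beta_4 \mathrm{TripVar}_{jt} \\
& + \beta_5 \mathrm{RevLen}_{ijt} + \beta_6 \mathrm{PicNum}_{ijt} + \beta_7 \mathrm{Device}_{ijt} + \beta_8 \mathrm{TravelType}_{it} \\
& + \beta_9 \mathrm{RoomType}_{ijt} + \mathrm{Month\ FE}_{t} + \mathrm{Hotel\ FE}_{j} + \mu_{ijt}
\end{aligned}
$$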
Where i denotes the consumer, j the hotel, and t the time of the current review; α is the constant term; β1 is the coefficient of the positive rating ratio; β2 is the coefficient of the logarithm of the number of previous reviews; β3 is the coefficient of the average rating of the embedded reviews; β4 is the coefficient of the variance of the embedded reviews; β5 is the coefficient of the review length; β6 is the coefficient of the number of pictures in the current review; β7 is the coefficient of the review device; β8 is the coefficient of the travel type; β9 is the coefficient of the hotel room type; Month FE_t is the month fixed effect; Hotel FE_j is the hotel fixed effect; and μ_ijt is the error term.
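Model (1) is written in linear form. The text does not spell out the link function, so the following assumes the standard Logit specification, in which the systematic part of the right-hand side is the log-odds of a positive recommendation. Writing $\eta_{ijt}$ (a symbol introduced here for illustration) for the right-hand side of model (1) without the error term, the predicted probability of a "recommended" rating is:

$$
P(\mathrm{SquRecom}_{ijt} = 1) = \frac{1}{1 + e^{-\eta_{ijt}}},
\qquad
\ln\frac{P(\mathrm{SquRecom}_{ijt} = 1)}{1 - P(\mathrm{SquRecom}_{ijt} = 1)} = \eta_{ijt}
$$

Under this reading, a positive coefficient raises the probability of a positive rating and a negative coefficient lowers it, which is how the signs of the coefficients are interpreted in the results.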
(2) A panel logic model; predictive user assessment behavior models with interactive terms.
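$$
\begin{aligned}
\mathit{SquRecom}_{ijt} = {} & \alpha + \beta_1 \mathit{RecomRatio}_{ijt} + \beta_2 \mathit{LgRevNum}_{ijt} + \beta_3 \mathit{TripAve}_{jt} + \beta_4 \mathit{TripVar}_{jt} \\
& + \beta_5 \mathit{RecomRatio}_{ijt} \times \mathit{TripAve}_{jt} + \beta_6 \mathit{RecomRatio}_{ijt} \times \mathit{TripVar}_{jt} \\
& + \beta_7 \mathit{LgRevNum}_{ijt} \times \mathit{TripAve}_{jt} + \beta_8 \mathit{LgRevNum}_{ijt} \times \mathit{TripVar}_{jt} \\
& + \beta_9 \mathit{RevLen}_{ijt} + \beta_{10} \mathit{PicNum}_{ijt} + \beta_{11} \mathit{Device}_{ijt} + \beta_{12} \mathit{TravelType}_{it} \\
& + \beta_{13} \mathit{RoomType}_{ijt} + \mathit{Month\,FE}_{t} + \mathit{Hotel\,FE}_{j} + \mu_{ijt}
\end{aligned}
$$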
Where i refers to the consumer, j refers to the hotel, t refers to the time of the current review, α is the constant term, β1 is the coefficient of the positive rating ratio, β2 is the coefficient of the logarithm of the number of previous reviews, β3 is the coefficient of the average rating of the embedded reviews, β4 is the coefficient of the variance of the embedded reviews, β5 is the coefficient of the interaction term between the positive rating ratio and the average rating of the embedded reviews, β6 is the coefficient of the interaction term between the positive rating ratio and the variance of the embedded reviews, β7 is the coefficient of the interaction term between the logarithm of the number of previous reviews and the average rating of the embedded reviews, β8 is the coefficient of the interaction term between the logarithm of the number of previous reviews and the variance of the embedded reviews, β9 is the coefficient of the review length, β10 is the coefficient of the number of pictures in the current review, β11 is the coefficient of the review device, β12 is the coefficient of the travel type, β13 is the coefficient of the hotel room type, Month FE_t is the month fixed effect, Hotel FE_j is the hotel fixed effect, and μ_ijt is the error term.
3.2 Calculation of model coefficients and residual terms
In this embodiment, Stata was used to estimate the panel logic model described above, and the output is shown in Table 1, model (1). The results show that the coefficients of the positive rating ratio and of the logarithm of the number of previous comments are both positive and significant, indicating that the proportion of previous recommending comments and the number of comments on the internal booking platform have a positive impact on consumers' subsequent positive ratings. The coefficient of the average rating of the embedded comments is 0.057 and significant, while the coefficient of the variance of the embedded comments is -0.162. These results indicate that the higher the average rating of the embedded comments, the more likely a subsequent comment is to be positive. Conversely, the smaller the variance of the embedded comments, the more likely a subsequent comment is to be positive.
The coefficient of the interaction term between the positive rating ratio and the average rating of the embedded comments is 0.451, indicating that, for a given internal positive rating ratio, a higher average rating of the embedded comments significantly increases the likelihood of a subsequent positive rating. In contrast, the coefficient of the interaction term between the logarithm of the number of previous comments and the average rating of the embedded comments is -0.046, indicating that consumers do not need a large number of comments to make a purchase decision when the average rating of the embedded comments is relatively high. The coefficient of the interaction term between the positive rating ratio and the variance of the embedded comments is -1.270, indicating that the variance of the embedded comments negatively moderates the positive relationship between the internal positive rating ratio and the subsequent positive rating. That is, when the variance of the embedded comments is higher, the internal positive rating ratio raises the likelihood of a subsequent positive rating by a smaller amount. Moreover, the coefficient of the interaction term between the logarithm of the number of previous comments and the variance of the embedded comments is 0.127, indicating that the positive effect of the number of previous comments on the current rating increases as the variance of the embedded ratings increases.
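The moderating effects described above can be read off model (2) by differentiating its right-hand side with respect to each explanatory variable. In the sketch below, η_ijt is notation introduced here for the right-hand side of model (2) without the error term, the interaction coefficients are the values quoted in the text, and β1 and β2 are left symbolic because their values are not stated in the text:

$$
\frac{\partial \eta_{ijt}}{\partial \mathit{RecomRatio}_{ijt}} = \beta_1 + 0.451\,\mathit{TripAve}_{jt} - 1.270\,\mathit{TripVar}_{jt},
\qquad
\frac{\partial \eta_{ijt}}{\partial \mathit{LgRevNum}_{ijt}} = \beta_2 - 0.046\,\mathit{TripAve}_{jt} + 0.127\,\mathit{TripVar}_{jt}
$$

A higher embedded average rating therefore strengthens the effect of the positive rating ratio and weakens the effect of the number of previous comments, while a higher embedded variance does the opposite.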
TABLE 1
Note that the values in brackets are z values (standard scores); asterisks indicate significant coefficients at the 10%, 5% and 1% levels, respectively.
3.3 Robustness check of model
The main empirical findings presented in model (1) of Table 1 are based on the 100 comments preceding the current comment. To ensure the robustness of the results, the variables were also calculated with the 50 and the 200 previous comments, respectively, so that the results are not biased by an arbitrary choice of window. Models (2) and (3) of Table 1 re-estimate the empirical model on the re-aggregated data using the same methods and equations. Overall, the empirical results are consistent with the previously reported results.
According to the regression results of the panel logic model, the coefficients of the independent variables (the previous positive rating ratio, the number of previous comments, and the average rating and variance of the embedded comments), the interaction terms (previous positive rating ratio × average rating of the embedded comments, previous positive rating ratio × variance of the embedded comments, number of previous comments × average rating of the embedded comments, and number of previous comments × variance of the embedded comments), the control variables (comment length, number of pictures, device used for the comment and travel type) and the fixed effects are substituted back into the econometric model to obtain the prediction equation for user evaluation behavior after the OTA website embeds external comments, based on the panel logic model.
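The window-based recalculation used in the robustness check can be sketched as follows. This is a minimal illustration, not the procedure used in the embodiment: the function name `prior_positive_ratio`, the input layout (a chronologically ordered list of 0/1 recommendation flags for one hotel) and the handling of comments with fewer predecessors than the window (the available predecessors are used) are all assumptions.

```python
from typing import List, Optional


def prior_positive_ratio(recommended: List[int], window: int) -> List[Optional[float]]:
    """For each comment (oldest first), return the share of recommending
    comments among up to `window` comments immediately before it.

    Comments with no predecessors get None. Using a partial window for
    early comments is an assumption made for this sketch.
    """
    ratios: List[Optional[float]] = []
    for idx in range(len(recommended)):
        previous = recommended[max(0, idx - window):idx]
        ratios.append(sum(previous) / len(previous) if previous else None)
    return ratios


# Hypothetical recommendation flags for one hotel, oldest comment first.
flags = [1, 1, 0, 1, 0, 1, 1, 1]

# The text uses windows of 50, 100 and 200; small windows keep the demo readable.
for k in (2, 4):
    print(k, prior_positive_ratio(flags, k))
```

Re-running the same estimation on the variables produced with each window size, and comparing signs and significance across the runs, corresponds to the comparison of models (1) to (3).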
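Substituting the coefficients back into the model can be sketched as below. This is a hedged illustration only: the function name, the dictionary layout, the numeric encoding of categorical controls and every numeric value are placeholders chosen for the example, not the estimates reported in Table 1. The combined month and hotel fixed effects are passed as a single hypothetical `fixed_effects` argument.

```python
import math

# Interaction terms of model (2), each the product of two base variables.
INTERACTIONS = [
    ("RecomRatio", "TripAve"),
    ("RecomRatio", "TripVar"),
    ("LgRevNum", "TripAve"),
    ("LgRevNum", "TripVar"),
]


def predict_recommend_probability(features, coef, intercept=0.0, fixed_effects=0.0):
    """Return the predicted probability of a recommending comment.

    `features` maps variable names to numeric values, `coef` maps variable
    (or interaction) names to coefficients. Names missing from `coef` are
    treated as having a zero coefficient.
    """
    x = dict(features)
    for left, right in INTERACTIONS:
        x[f"{left}*{right}"] = x[left] * x[right]
    # Linear index: constant + fixed effects + sum of coefficient * variable.
    eta = intercept + fixed_effects + sum(b * x[name] for name, b in coef.items())
    # Logistic link maps the index to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-eta))


# Placeholder coefficients (NOT the values from Table 1).
coef = {
    "RecomRatio": 1.0, "LgRevNum": 0.1, "TripAve": 0.05, "TripVar": -0.1,
    "RecomRatio*TripAve": 0.4, "RecomRatio*TripVar": -1.0,
    "LgRevNum*TripAve": -0.05, "LgRevNum*TripVar": 0.1,
    "RevLen": 0.001, "PicNum": 0.02,
}

# Hypothetical comment context for one consumer and hotel.
review = {
    "RecomRatio": 0.9, "LgRevNum": 2.0, "TripAve": 4.2, "TripVar": 0.8,
    "RevLen": 120, "PicNum": 3,
}

print(round(predict_recommend_probability(review, coef, intercept=-0.5), 3))
```

Keeping the interaction terms inside the function means a caller supplies only the base variables, so the products cannot drift out of sync with them.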
The foregoing is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art should be able to apply equivalent substitutions or alterations to the technical solution and the inventive concept thereof according to the technical scope of the present invention disclosed herein.

Claims (1)

1. A prediction method for user evaluation behavior after embedding an OTA website based on a panel logic model, characterized in that the method comprises the following steps:
Step one: acquiring an original data set; acquiring, from an OTA website through a Java crawler program, the comment data, hotel data and user data of all hotels in a target area that have embedded TripAdvisor user comments, to obtain an original data set;
Step two: preprocessing data; carrying out data preprocessing on the original data set, namely carrying out data cleaning, deleting samples that have missing data or abnormal values, and obtaining a cleaned data set;
Step three: variable definition and calculation; reviewing the literature and theories related to customer evaluation behavior and defining the required variables, wherein the required variables are the positive rating ratio, the number of previous comments, the average rating of the embedded comments and the variance of the embedded comments, and calculating the required variables using the cleaned data set;
Step four: panel logic model analysis; determining the explanatory variables and the moderating variables according to the evaluation attribute of each comment, wherein the positive rating ratio and the number of previous comments are taken as the explanatory variables, and the average rating of the embedded comments and the variance of the embedded comments are taken as the moderating variables; based on the explanatory variables and the moderating variables, including the comment text length, the number of pictures, the comment device, the travel type and the hotel room type in the analysis, and building a logic model using a fixed-effects logistic regression model; then, by introducing the interaction term between the positive rating ratio and the average rating of the embedded comments, the interaction term between the positive rating ratio and the variance of the embedded comments, the interaction term between the number of previous comments and the average rating of the embedded comments, and the interaction term between the number of previous comments and the variance of the embedded comments, constructing a panel logic model, wherein the panel logic model explores the influence relationship with the user evaluation behavior as the dependent variable; importing the variable data calculated in step three into Stata for estimation, and obtaining the coefficients and residual terms of the panel logic model;
Step five: robustness test; the sample range of the positive evaluation ratio of the interpretation variable is enlarged and reduced respectively, the variable is calculated again, a panel logic model is utilized for analysis to obtain a new data result, the new data result is compared with the previous data result, if the panel logic model passes the robustness test, the panel logic model is proved to have reliability, and the coefficient and residual error items obtained in the fourth step are brought back to the panel logic model to obtain a prediction model; if the robustness test is not passed, the panel logic model needs to be built and analyzed again, namely, the step four is executed again.
CN202311074103.8A 2023-08-24 2023-08-24 Prediction method for user evaluation behavior after embedding OTA website based on panel logic model Active CN117094856B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311074103.8A CN117094856B (en) 2023-08-24 2023-08-24 Prediction method for user evaluation behavior after embedding OTA website based on panel logic model

Publications (2)

Publication Number Publication Date
CN117094856A CN117094856A (en) 2023-11-21
CN117094856B true CN117094856B (en) 2024-04-30

Family

ID=88774932

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant