CN113837802B - Secondhand mobile phone price prediction method integrating time sequence process and mobile phone defect feature depth - Google Patents



Publication number
CN113837802B
Authority
CN
China
Prior art keywords
mobile phone
price
time sequence
time
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111122887.8A
Other languages
Chinese (zh)
Other versions
CN113837802A (en)
Inventor
林乐新
周超
涂家辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shanhui Technology Co ltd
Original Assignee
Shenzhen Shanhui Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Shanhui Technology Co ltd
Priority to CN202111122887.8A
Publication of CN113837802A
Application granted
Publication of CN113837802B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00: Commerce
    • G06Q 30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q 30/0278: Product appraisal
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/205: Parsing
    • G06F 40/216: Parsing using statistical methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/237: Lexical tools
    • G06F 40/242: Dictionaries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/284: Lexical analysis, e.g. tokenisation or collocates
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/047: Probabilistic or stochastic networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Business, Economics & Management (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Probability & Statistics with Applications (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention belongs to the technical field of second-hand mobile phone price prediction, and in particular relates to a second-hand mobile phone price prediction method based on deep fusion of the time-series process and mobile phone defect features. The method comprises: extracting metadata features; calculating the daily average sale price of each phone model as a macroscopic time series; performing text preprocessing on the content of the phone detection report; mapping each word to a 300-dimensional real vector using a word2vec model; and inputting the resulting metadata features, price time series and text vector representation into a price prediction model for prediction, wherein the price prediction model comprises a time-series process modeling module, an attribute feature modeling module and an attention fusion module. In the time-series process modeling, the method considers both the long-term trend of the global price and sudden price fluctuations within a local unit of time, so that changes in the historical sales price series can be modeled more comprehensively and specifically; a CNN with an attention mechanism is used to model the sudden price fluctuations per unit time.

Description

Secondhand mobile phone price prediction method integrating time sequence process and mobile phone defect feature depth
Technical Field
The invention belongs to the technical field of second-hand mobile phone price prediction, and particularly relates to a second-hand mobile phone price prediction method based on deep fusion of the time-series process and mobile phone defect features.
Background
Price prediction for second-hand mobile phones is of great importance to businesses that recycle and resell them. Prices are predicted by modeling features such as the type, age, time in use and geographic location of the phone, which makes this a typical application scenario for regression models.
However, second-hand phone prices are affected by external factors that produce short-term fluctuations which are difficult to predict, and the attribute and metadata features of a phone usually span multiple modalities, which increases the modeling complexity.
Existing techniques in the industry fall into two groups: time-series process modeling and modeling based on attribute features.
Time-series process modeling: this approach models prices from the temporal evolution of aggregate view counts across time slots. Prices change over time and show unexpected, explosive fluctuations under the influence of external factors, so the goal of a time-series model is to capture both the long-term regularity and the short-term fluctuations.
1. The price fluctuation of second-hand mobile phones is treated as a microscopic arrival point process of user purchase behavior, and popularity is predicted by modeling single events with a microscopic point process such as a reinforced Poisson process, a Hawkes point process or a neural network. However, in large-scale applications the number of events (the number of transactions) may burst in a short time, which causes performance problems for microscopic time-series process modeling.
2. Prediction is based on the macroscopic accumulation process of the event count. The Hawkes Intensity Process (HIP) can describe the evolution of the macroscopic temporal process and has been applied successfully to YouTube video popularity prediction. However, HIP makes specific assumptions about the functional form of the time-series process and about the influence of external factors, which limits the expressive power of the model. Other conventional approaches manually extract rising and falling "stages" from the macroscopic time series to capture fluctuations and then apply a stage-based linear regression to predict popularity. However, hand-crafted "stages" cannot cope with the variety of popularity evolution processes and do not generalize.
Modeling based on features: the metadata features of the phone detection items are modeled for price prediction.
For example, features such as the phone screen, the battery and the sales channel are extracted manually and then combined in a traditional regression model for prediction. However, such techniques do not take full advantage of the long text of the inspector's evaluation report or of the metadata features, and they ignore the evolution of phone prices over time. Moreover, manually extracted features are difficult to design and measure, and are typically limited to specific data sets or applications.
Defects of the prior art:
1. External factors affecting phone prices may cover different ranges and durations, so it is difficult to assume the number and shape of price fluctuations in advance. Capturing short-term fluctuations through prior assumptions about external factors, or through manual extraction, ends up limiting the predictive power of the model.
2. When purchasing a phone, the user pays attention not only to the individual detection results but also to the detection report written by the inspector, and the way the report is written (for example, how detailed it is and how strongly it is worded) can influence the price the user is willing to bid. The prior art fails to make full use of the long text of the detection report to predict the price of second-hand mobile phones.
3. Time-series process modeling and content feature modeling are not integrated well enough for each to contribute its strengths. Different phone models may show different price fluctuations, and a naive fusion approach lacks the flexibility to handle the price evolution process.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a secondhand mobile phone price prediction method with a time sequence process and mobile phone defect feature depth fusion, which has the characteristics of convenient use, better processing effect and better flexibility.
In order to achieve the above purpose, the present invention provides the following technical solutions: a price prediction algorithm for second hand mobile phones comprises the following steps:
Step one: extracting metadata features to obtain a feature vector (F_1, F_2, F_3, …, F_n);
Step two: calculating the average daily sale price of each phone model as a macroscopic time series to obtain a time-series feature sequence (P_1, P_2, P_3, …, P_n);
Step three: performing text preprocessing on the phone detection report content, segmenting the text into words, counting word frequencies, and screening out a portion of the high-frequency keywords to serve as a word segmentation dictionary;
Step four: assuming that step three segments the text into k words W_1, W_2, W_3, …, W_k, mapping each word to a 300-dimensional real vector using a word2vec model, that is, mapping the document to a k x 300 matrix;
Step five: inputting the metadata features, the price time series and the text vector representation obtained in steps one, two and four into a price prediction model for prediction, wherein the price prediction model comprises a time-series process modeling module, an attribute feature modeling module and an attention fusion module.
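Step two can be illustrated with a minimal sketch that averages sale prices per phone model and per day. The record layout `(model, day, price)`, the function name `daily_average_prices` and the sample values are assumptions made for illustration only; they do not come from the patent text.

```python
from collections import defaultdict

def daily_average_prices(sales):
    """Group (model, day, price) records and average the price per model per day."""
    sums = defaultdict(lambda: [0.0, 0])  # (model, day) -> [price total, sale count]
    for model, day, price in sales:
        bucket = sums[(model, day)]
        bucket[0] += price
        bucket[1] += 1
    series = defaultdict(list)
    # Sorting by (model, day) yields each model's series in chronological order.
    for (model, day), (total, count) in sorted(sums.items()):
        series[model].append((day, total / count))
    return dict(series)

# Hypothetical sales records: (model name, ISO date, sale price).
sales = [
    ("model-a", "2021-09-01", 1000.0),
    ("model-a", "2021-09-01", 1200.0),
    ("model-a", "2021-09-02", 900.0),
    ("model-b", "2021-09-01", 500.0),
]
series = daily_average_prices(sales)
```

Each value in `series` is one macroscopic price sequence (P_1, …, P_n) for a single phone model.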
In the first step, the metadata features include the mobile phone brand, release time, model report and screen detection items.
In the third step, the preprocessing includes special symbol processing, English case conversion and unification of traditional and simplified Chinese characters.
As a preferred technical solution of the present invention, the third step further includes uniformly representing keywords that are not in the dictionary with the special symbol <unk>.
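The dictionary construction of step three might look like the following sketch. It is a simplification under stated assumptions: whitespace splitting stands in for a real word segmenter, the conversion between traditional and simplified characters is omitted, and the names `normalize`, `build_dictionary` and `encode` and the sample reports are hypothetical.

```python
import re
from collections import Counter

UNK = "<unk>"

def normalize(text):
    """Lowercase the text and replace special symbols with spaces."""
    return re.sub(r"[^\w\s]", " ", text.lower())

def tokenize(text):
    # Whitespace splitting is a stand-in for a real word segmenter.
    return normalize(text).split()

def build_dictionary(reports, max_size):
    """Keep the max_size most frequent tokens as the word segmentation dictionary."""
    counts = Counter(tok for report in reports for tok in tokenize(report))
    return {word for word, _ in counts.most_common(max_size)}

def encode(report, dictionary):
    """Replace every token that is not in the dictionary with <unk>."""
    return [tok if tok in dictionary else UNK for tok in tokenize(report)]

# Hypothetical detection report snippets.
reports = [
    "Screen: minor scratch, battery OK!",
    "Screen cracked; battery OK.",
    "Battery weak, screen OK",
]
dictionary = build_dictionary(reports, max_size=3)
tokens = encode("Screen OK, camera scratch", dictionary)
```

With a dictionary size of three, only the most frequent words survive, and rarer words such as "camera" collapse into the single `<unk>` symbol.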
In the fifth step, the time-series process modeling takes the historical feedback sequence {V_1, V_2, V_3, …, V_t} from the second step as input, uses a recurrent neural network (RNN) to model the long-term trend, and uses a convolutional neural network (CNN) to capture short-term fluctuations.
In the fifth step, the attribute feature modeling uses an embedding network and a hierarchical attention network, which receive the metadata features from the first step and the text vectors from the fourth step as inputs, respectively, and model the metadata and the long text.
In the fifth step, the attention fusion module dynamically integrates the network modules for time-series process modeling and attribute feature modeling.
As a preferable technical scheme of the invention, the invention further comprises a training stage for training the time sequence process modeling, attribute feature modeling and attention fusion module.
As a preferred technical scheme of the invention, the method also comprises an application stage for processing the second-hand mobile phone as online service, and specifically comprises the following steps:
Step 1, extracting the metadata features, model name and detection report text of the mobile phone to be predicted, splitting out the historical sales sequence of the mobile phone, and preprocessing;
Step 2, inputting the processed price history time series, the metadata features and the text sequence into the trained model, and outputting the predicted price category of the mobile phone;
Step 3, determining the sales guidance price of the mobile phone from the prediction result by means of rules.
Compared with the prior art, the invention has the beneficial effects that:
(1) According to the method, the long-term trend of the global price and the sudden fluctuation of the price in the local unit time are considered in the time sequence process modeling of the price prediction of the second-hand mobile phone, so that the change in the historical sales time sequence price data can be modeled more comprehensively and specifically;
(2) The method uses LSTM to model a long-term growth trend, and has two advantages: a) The LSTM is extremely suitable for processing the sequence structure input with the dependency relationship on the time sequence, can capture the relationship between the historical moments, and learns the historical evolution mode of the price; b) The memory unit in the LSTM can memorize the time sequence dependency relationship with longer distance, and can better process the history time sequence input of a long sequence;
(3) The method adopts CNN modeling with attention mechanism to model sudden fluctuation of price in unit time, and has two advantages: a) CNNs are adept at capturing local structures with translational invariance, whereas short-term fluctuations in price "up" and "down" have just this property; b) The attention mechanism enables the model to pay more attention to the points in time that are affected by external factors;
(4) The method adopts a hierarchical attention network to model the detection report long text content of the mobile phone, which is beneficial to obtaining better text semantic representation because the document is encoded into the attention vectors of word level and sentence level in turn in consideration of the inherent hierarchical structure of the document (namely, words form sentences and sentences form the document);
(5) The method uses an embedding technique to embed metadata features of different types into a common dense space, so that the metadata features are fully fused;
(6) The invention adopts a temporal attention fusion mechanism that automatically determines, at each moment, how decisive the output of each module is for the final prediction result, which gives it good flexibility in handling a dynamically evolving time-series process;
(7) Phone models have little historical data shortly after release, which makes early prediction more challenging, yet predictions at an early stage are more valuable. The time-decay loss function proposed by the invention helps the model put more effort into optimizing prediction performance at the early stage;
(8) By combining the above points, the model deeply fuses the time-series process and the attribute features when predicting second-hand mobile phone prices, so that their advantages complement each other and the price can be predicted flexibly at any stage of a phone model's release cycle.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of an algorithm of the present invention;
FIG. 2 is a schematic diagram of a body model structure according to the present invention;
FIG. 3 is a schematic diagram of a time course model structure in the present invention;
FIG. 4 is a schematic diagram of an embedded network structure according to the present invention;
FIG. 5 is a schematic diagram of a hierarchical attention network architecture in accordance with the present invention;
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1-5, the present invention provides the following technical solutions: a price prediction algorithm for second hand mobile phones comprises the following steps:
Step one: extracting metadata features, including the mobile phone brand, release time, model report and screen detection items, to obtain a feature vector (F_1, F_2, F_3, …, F_n);
Step two: calculating the average daily sale price of each phone model as a macroscopic time series to obtain a time-series feature sequence (P_1, P_2, P_3, …, P_n);
Step three: performing text preprocessing on the phone detection report content, where the preprocessing includes special symbol processing, English case conversion and unification of traditional and simplified Chinese characters; segmenting the text into words, counting word frequencies, and screening out a portion of the high-frequency keywords to serve as a word segmentation dictionary; keywords that are not in the dictionary are uniformly represented by the special symbol <unk>;
Step four: assuming that step three segments the text into k words W_1, W_2, W_3, …, W_k, mapping each word to a 300-dimensional real vector using a word2vec model, that is, mapping the document to a k x 300 matrix;
Step five: inputting the metadata features, the price time series and the text vector representation obtained in steps one, two and four into a price prediction model for prediction, wherein the price prediction model comprises a time-series process modeling module, an attribute feature modeling module and an attention fusion module.
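The mapping in step four, from k tokens to a k x 300 matrix, can be sketched as follows. No trained word2vec model is used here: `toy_embedding` is a hypothetical stand-in that produces a repeatable pseudo-random vector per word, so only the shape of the result is meaningful.

```python
import random

EMBED_DIM = 300  # vector size stated in step four

def toy_embedding(word):
    """Deterministic stand-in vector; a trained word2vec model would supply real ones."""
    rng = random.Random(word)  # seeding with the word makes the vector repeatable
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]

def document_matrix(tokens):
    """Stack one 300-dimensional row per token, giving a k x 300 matrix."""
    return [toy_embedding(tok) for tok in tokens]

tokens = ["screen", "ok", "<unk>", "screen"]
doc = document_matrix(tokens)
```

Repeated words map to identical rows, and `<unk>` receives its own row like any other dictionary entry.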
Specifically, as shown in fig. 3, in step five the time-series process modeling takes the historical feedback sequence {V_1, V_2, V_3, …, V_t} from step two as input, uses a recurrent neural network (RNN) to model the long-term trend, and uses a convolutional neural network (CNN) to capture short-term fluctuations.
The time-series process network uses a long short-term memory network (LSTM), a type of RNN, to capture the long-term growth trend of the price as it evolves over time. The advantage of the LSTM for temporal modeling is that its hidden state contains all historical information, so no specific assumption has to be made about the functional form of the historical trend, and the memory cells in the LSTM are better at capturing long-range sequence dependencies. The feedback vector V of each time slot is fed into the LSTM, and the resulting output vector encodes the historical growth pattern.
On the other hand, fluctuations caused by external factors make the price curve show rising and falling stages that look like "mountains" and "valleys"; as shown in fig. 3, these are local structures with translation invariance. The invention therefore captures such short-term fluctuation structures with a one-dimensional convolutional neural network. Furthermore, the influence of different factors lasts over different time ranges, meaning that the "mountains" have different widths, so the invention uses multiple convolution kernels of different sizes to capture different fluctuation ranges and then stacks the outputs of all convolution kernels vertically.
Since a CNN typically requires a fixed-size input, the input window width is set to k: the input of each convolution layer is the truncated sequence {V_{t-k+1}, V_{t-k+2}, …, V_t} of length k ending at time t. "Same" padding is applied, giving an output sequence {C_{t-k+1}, C_{t-k+2}, …, C_t} of length k that captures the most recent historical fluctuation pattern.
Finally, the invention uses an attention mechanism to merge the output sequence {C_{t-k+1}, C_{t-k+2}, …, C_t} along the time dimension into a single CNN output vector. The vectors c at different moments of the convolution output sequence are multiplied by different attention weights a_c, which helps the output focus more on the time points affected by external factors. The weights a_c and the CNN output vector are calculated as follows: [formula not reproduced in this text]
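The short-term branch described above can be sketched in plain Python: several kernels of different widths slide over a length-k window with "same" zero padding, and softmax attention weights then pool the per-step outputs over time. The scoring rule (a dot product with a vector `score_weights`), the kernel values and the sample window are assumptions for illustration; the patent's own attention formula is not available in this text.

```python
import math

def conv1d_same(seq, kernel):
    """1-D cross-correlation with zero 'same' padding: output length equals input length."""
    k = len(kernel)
    left = (k - 1) // 2
    padded = [0.0] * left + list(seq) + [0.0] * (k - 1 - left)
    return [sum(w * padded[i + j] for j, w in enumerate(kernel)) for i in range(len(seq))]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cnn_attention(window, kernels, score_weights):
    # One output channel per kernel; stacking them gives a vector c_t per time step.
    channels = [conv1d_same(window, ker) for ker in kernels]
    steps = [[ch[t] for ch in channels] for t in range(len(window))]
    # Hypothetical scoring: a dot product with a vector that would be learned.
    scores = [sum(w * c for w, c in zip(score_weights, step)) for step in steps]
    attn = softmax(scores)
    pooled = [sum(a * step[d] for a, step in zip(attn, steps)) for d in range(len(kernels))]
    return attn, pooled

# A window with one sudden spike, and two peak-detecting kernels of widths 3 and 5.
window = [1.0, 1.0, 5.0, 1.0, 1.0]
kernels = [[-1.0, 2.0, -1.0], [-1.0, -1.0, 4.0, -1.0, -1.0]]
attn, pooled = cnn_attention(window, kernels, score_weights=[1.0, 1.0])
```

In this toy window the attention weight concentrates on the middle time step, where both kernels respond to the spike.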
Specifically, as shown in fig. 4 and fig. 5, in step five the attribute feature modeling uses an embedding network and a hierarchical attention network, which receive the metadata features from step one and the text vectors from step four as inputs, respectively, to model the metadata and the long text.
The content attribute features of a second-hand phone (the detection report text and the metadata features) largely determine its price range. The metadata features include one-hot encoded features, such as categories, as well as numeric features, such as the average trade price over the past week or day. Instead of manually selecting and combining these features, the invention uses an embedding technique to embed them into dense vectors in a common space and applies a fully connected layer to combine them. As shown in fig. 4, the one-hot encoded features are embedded into dense vectors through embedding matrices, while each numeric feature is multiplied by an embedding vector to map it to a dense vector in the same space. All metadata features are then concatenated and combined by a fully connected layer to obtain the overall metadata representation vector h_e.
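The metadata embedding can be sketched as below: a categorical value selects a row of an embedding table (equivalent to multiplying its one-hot vector by the embedding matrix), a numeric value scales its own embedding vector, and a fully connected layer combines the concatenation. All tables, weights, feature names and the ReLU activation are hypothetical choices for illustration; in practice the parameters would be learned.

```python
def embed_metadata(categorical, numeric, cat_tables, num_vectors):
    """Concatenate embedded categorical and numeric features into one dense vector."""
    parts = []
    for name, value in categorical.items():
        # Looking up a row equals multiplying a one-hot vector by the embedding matrix.
        parts.extend(cat_tables[name][value])
    for name, value in numeric.items():
        # A numeric feature scales its own embedding vector.
        parts.extend(value * w for w in num_vectors[name])
    return parts

def dense(vec, weights, bias):
    """Fully connected layer with ReLU that combines the concatenated features."""
    return [max(0.0, sum(w * x for w, x in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]

# Hypothetical embedding tables and layer weights.
cat_tables = {"brand": {"brand_x": [0.1, 0.2], "brand_y": [0.3, -0.1]}}
num_vectors = {"avg_price_last_week": [0.5, -0.5]}
concat = embed_metadata({"brand": "brand_x"}, {"avg_price_last_week": 2.0},
                        cat_tables, num_vectors)
h_e = dense(concat, weights=[[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]], bias=[0.0, 0.0])
```

The resulting `h_e` plays the role of the overall metadata representation vector.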
Since the phone detection report is usually a long text document, the invention uses a hierarchical attention network (HAN) to model the text content features. Taking into account the inherent hierarchical structure of a document (words form sentences and sentences form the document), the HAN uses a two-level encoder with attention mechanisms to encode the document into word-level and sentence-level attention vectors; both the word-level and the sentence-level encoders are Bi-GRUs. In addition, the phone model name is a high-level description of the phone that conveys its main impression, so a representation vector for the model name is learned in the HAN as a supplement: the model name is encoded into a vector using only the word-level encoder and attention. The detection report document vector and the model name vector are then concatenated as the final text feature h_h.
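The two-level pooling of the HAN can be sketched as follows. The Bi-GRU encoders are deliberately left out, so the inputs are treated as if they were already encoder outputs; the context vectors `word_context` and `sentence_context`, which would be learned, and the toy vectors are assumptions.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, context):
    """Weight each vector by softmax(dot(vector, context)) and sum the results."""
    scores = [sum(v * c for v, c in zip(vec, context)) for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors)) for d in range(dim)]

def document_vector(doc, word_context, sentence_context):
    # doc: list of sentences, each a list of word vectors (encoder outputs in a real HAN).
    sentence_vecs = [attention_pool(sentence, word_context) for sentence in doc]
    return attention_pool(sentence_vecs, sentence_context)

doc = [
    [[1.0, 0.0], [0.0, 1.0]],  # sentence with two words
    [[2.0, 2.0]],              # sentence with one word
]
# Zero context vectors give uniform attention, which keeps the arithmetic easy to follow.
h_doc = document_vector(doc, word_context=[0.0, 0.0], sentence_context=[0.0, 0.0])
# The model name uses word-level attention only, then is concatenated with the document vector.
name_vec = attention_pool([[1.0, 1.0]], [0.0, 0.0])
h_h = h_doc + name_vec
```

With uniform attention the two sentence vectors are [0.5, 0.5] and [2.0, 2.0], so the document vector is their mean.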
Specifically, as shown in fig. 4 and fig. 5, in step five the attention fusion module dynamically integrates the network modules for time-series process modeling and attribute feature modeling.
The fusion module receives four inputs: the output vectors of the RNN and of the CNN, the HAN output h_h, and the metadata embedding output h_e. In the initial stage after a phone model is released, the time-series process modeling can hardly learn the overall price trend, so the prediction should depend mainly on the attribute feature modeling. As time goes by, the observed price gradually stabilizes, so the time-series modeling should play the main role in the prediction. The attention fusion mechanism combines the four outputs with a flexible weight a that depends on the module outputs and on the time t, so it adapts automatically to the outputs of different modules at different moments and handles the dynamic evolution of the time-series process flexibly. Because the attention mechanism combines vectors element by element, the four outputs are first fed into fully connected layers for feature combination and to obtain aligned vectors of matching size. A two-layer neural network is then used to calculate the attention weights a_m in the following manner: [formula not reproduced in this text]
The time representation variable t consists of the periodic property of the given time slot t, the time slot interval and the release time, where the periodic property is a one-hot encoded feature and the time interval is a numeric feature. The invention embeds the time representation variable t into a vector using the same strategy as for the metadata features. All sub-networks are dynamically fused into a single vector through the attention weights a_m. After a fully connected layer and a softmax output layer, a price prediction probability distribution P_t = {p_t(l_1), p_t(l_2), …, p_t(l_n)} is obtained, and the price category with the maximum probability is taken as the final prediction result. The specific calculation process is as follows: [formula not reproduced in this text]
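The fusion and output stages can be illustrated with a generic softmax-weighted sum followed by a linear layer and an argmax. This is only a sketch under stated assumptions: the attention scores are passed in directly instead of being produced by the two-layer network, the module outputs are assumed to be already aligned to the same size, and the weight matrix `class_weights` is hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(module_outputs, scores):
    """Combine same-sized module vectors with softmax attention weights."""
    weights = softmax(scores)
    dim = len(module_outputs[0])
    fused = [sum(w * vec[d] for w, vec in zip(weights, module_outputs)) for d in range(dim)]
    return weights, fused

def predict(fused, class_weights):
    """Linear layer plus softmax, then pick the most probable price category."""
    logits = [sum(w * x for w, x in zip(row, fused)) for row in class_weights]
    probs = softmax(logits)
    return probs, max(range(len(probs)), key=probs.__getitem__)

# Hypothetical aligned outputs of the RNN, CNN, HAN and metadata-embedding modules.
outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Equal scores give equal weights; a trained network would produce time-dependent scores.
weights, fused = fuse(outputs, scores=[0.0, 0.0, 0.0, 0.0])
probs, label = predict(fused, class_weights=[[1.0, 0.0], [0.0, 2.0], [-1.0, -1.0]])
```

Raising the scores of the attribute modules early after release, and of the temporal modules later, would reproduce the behavior the text describes.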
Specifically, according to fig. 4 and fig. 5, the embodiment further includes a training stage for training the time sequence process modeling, attribute feature modeling and attention fusion module.
To ensure the diversity of the training data, phone price prediction is treated as a classification task and the prices are divided into 100 categories: the continuous price values are split into 100 classes by equal-frequency bucketing, and the class is used as the label of the training data. In addition, the maximum and minimum input lengths of the text sequence are limited, and sequences whose historical feedback time series are too short to have predictive value are filtered out.
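Equal-frequency bucketing can be sketched as below, using 3 classes instead of 100 so the result is easy to inspect. The function names and the sample prices are hypothetical, and taking cut points directly from the sorted prices is one simple way to choose quantile boundaries.

```python
from bisect import bisect_right

def equal_frequency_edges(prices, n_classes):
    """Return n_classes - 1 cut points so each class holds about the same number of prices."""
    ordered = sorted(prices)
    n = len(ordered)
    return [ordered[(i * n) // n_classes] for i in range(1, n_classes)]

def price_class(price, edges):
    """Class index in [0, len(edges)]: the number of cut points at or below the price."""
    return bisect_right(edges, price)

prices = [100, 120, 150, 300, 320, 350, 800, 900, 1000]
edges = equal_frequency_edges(prices, n_classes=3)
labels = [price_class(p, edges) for p in prices]
```

Unlike equal-width bins, this keeps the class sizes balanced even when most phones sell in a narrow price band.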
During training, the category with the highest prediction score is taken as the price prediction. The model is trained on the training set until convergence using the Adam optimization algorithm, with the proposed time-decay loss function as the optimization objective. In practical applications, predicting the price at an early stage is more valuable; in addition, the intrinsic relationship between the observed prices and a reasonable price interval makes prediction easier at later stages. To help the model put more effort into optimizing early-stage prediction performance, the final loss function is a weighted sum in which the cross-entropy loss of each single step is multiplied by a time-decay factor D(Δt). The time-decay factor D(Δt) is a monotonically non-increasing function of the time interval Δt between time t and the model release time. The specific forms of D(Δt) and of the loss function J are as follows (only D(Δt) is reproduced in this text):
$$D(\Delta t) = \left\lceil \log_{\gamma}(\Delta t + 1) \right\rceil^{-1}$$
Here, ⌈·⌉ denotes the round-up (ceiling) operator and Δt is the number of time slots between the release time and time t, so Δt and ⌈log_γ(Δt + 1)⌉ are both positive integers. γ > 1 is a hyper-parameter that controls the decay rate. The logarithmic function ensures that the decay rate of D(Δt) becomes smaller and smaller over time, and the ceiling operator limits the initial decay rate of the logarithmic function.
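The decay factor can be computed without floating-point logarithms by finding the smallest integer n with γ^n ≥ Δt + 1, which equals ⌈log_γ(Δt + 1)⌉. The sketch below assumes Δt ≥ 1 and uses γ = 2 as an example value. Because the exact form of J is not available in this text, `decayed_loss` simply averages the D(Δt)-weighted cross-entropy terms, which is an assumption.

```python
import math

def decay(delta_t, gamma=2.0):
    """D(dt) = 1 / ceil(log_gamma(dt + 1)) for dt >= 1 and gamma > 1."""
    if delta_t < 1 or gamma <= 1:
        raise ValueError("requires delta_t >= 1 and gamma > 1")
    # The smallest n with gamma**n >= dt + 1 equals ceil(log_gamma(dt + 1)).
    n, power = 0, 1.0
    while power < delta_t + 1:
        power *= gamma
        n += 1
    return 1.0 / n

def decayed_loss(steps, gamma=2.0):
    """Mean of D(dt) * cross-entropy over (dt, probability of the true class) pairs."""
    return sum(decay(dt, gamma) * -math.log(p) for dt, p in steps) / len(steps)

factors = [decay(dt) for dt in (1, 2, 3, 4, 8)]
loss = decayed_loss([(1, 0.5), (3, 0.5)])
```

The factors fall in steps (1, 1/2, 1/2, 1/3, then 1/4 at Δt = 8), so an equally wrong prediction costs more soon after release than later on.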
Specifically, according to fig. 4 and fig. 5, the embodiment further includes an application stage, which is used as an online service to process the second-hand mobile phone, and specifically includes:
Step 1, extracting the metadata features, model name and detection report text of the mobile phone to be predicted, splitting out the historical sales sequence of the mobile phone, and preprocessing;
Step 2, inputting the processed price history time series, the metadata features and the text sequence into the trained model, and outputting the predicted price category of the mobile phone;
Step 3, determining the sales guidance price of the mobile phone from the prediction result by means of rules.
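The text does not specify the rules used in step 3. As one hypothetical example, the predicted price category could be mapped back to a guidance price by taking the median training price of that category and rounding it to a price step; the function name, the rounding step and the sample data below are all assumptions.

```python
from statistics import median

def guidance_price(predicted_class, class_prices, step=10):
    """Hypothetical rule: median training price of the class, rounded to a price step."""
    mid = median(class_prices[predicted_class])
    return int(round(mid / step)) * step

# Hypothetical training prices grouped by price category.
class_prices = {0: [100, 120, 150], 1: [300, 320, 350], 2: [800, 900, 1000]}
price = guidance_price(1, class_prices)
```

Other rules, such as using the lower bound of the category or applying a channel-specific margin, would fit the same interface.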
The following alternatives are also possible:
(1) The present scheme uses an LSTM to model the long-term price trend; other RNN structures, such as gated recurrent units (GRUs), may be used instead;
(2) The HAN uses a bidirectional gated recurrent unit (Bi-GRU) to encode the text at word level and then at sentence level; other encoders may be used instead, such as unidirectional or bidirectional RNNs and LSTMs, CNNs and Transformers;
(3) The final output can be changed from a classification problem to a regression problem, so that the price is predicted directly;
(4) The method can be extended to any business with a regression scenario, such as news recommendation in a portal or blog recommendation in a social network.
Finally, it should be noted that: the foregoing description is only a preferred embodiment of the present invention, and the present invention is not limited thereto, but it is to be understood that modifications and equivalents of some of the technical features described in the foregoing embodiments may be made by those skilled in the art, although the present invention has been described in detail with reference to the foregoing embodiments. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (4)

1. A second-hand mobile phone price prediction method based on deep fusion of a time series process and mobile phone defect features, characterized by comprising the following steps:
Step one: extracting metadata features, including the mobile phone brand, release time, model report, and screen detection items, to obtain a feature vector $(F_1, F_2, F_3, \ldots, F_n)$;
Step two: calculating the average daily sale price of each phone model as the time series feature, to obtain a time series feature sequence $(P_1, P_2, P_3, \ldots, P_n)$;
Step three: performing text preprocessing on the detection report content, segmenting the text into words, counting word frequencies, selecting a subset of high-frequency keywords as the word segmentation dictionary, and uniformly representing keywords not in the dictionary by the special symbol <unk>;
Step four: segmenting the text as in step three into k words $W_1, W_2, W_3, \ldots, W_k$, and mapping each word to a 300-dimensional real vector using a word2vec model, so that the document is mapped to a $k \times 300$ matrix;
Step five: inputting the metadata features, the time series feature sequence, and the 300-dimensional real vectors obtained in steps one, two, and four into a price prediction model for prediction, wherein the price prediction model comprises a time process modeling module, an attribute feature modeling module, and an attention fusion module;
the time process modeling module takes the time series feature sequence $(P_1, P_2, P_3, \ldots, P_n)$ of step two as input, uses a recurrent neural network (RNN) to model the long-term trend, and uses a convolutional neural network (CNN) to capture short-term fluctuations; a long short-term memory network (LSTM), a type of RNN, captures the long-term growth trend of the price as it evolves over time, the vector of each time slot being fed into the LSTM to obtain an output vector; the CNN fixes the width of the input window at k, the input of each convolutional layer being the truncated sequence of length k ending at time t, to which "same" padding is applied, yielding an output sequence $\{C_{t-k+1}, C_{t-k+2}, \ldots, C_t\}$ of length k, and an attention mechanism merges this output sequence along the time dimension into an output vector;
the attribute feature modeling module uses an embedding network and a hierarchical attention network, which receive the metadata features of step one and the 300-dimensional real vectors of step four respectively as input, to model the metadata and the long text; the metadata features comprise one-hot encoded features and numerical features; the one-hot encoded features are embedded into dense vectors through an embedding matrix, while the numerical features are multiplied by embedding vectors so as to map them to dense vectors of the same kind; all the metadata features are then concatenated and combined by a fully connected layer to obtain an overall metadata representation vector $h_e$; the detection report is a long text document whose content features are modeled with a hierarchical attention network (HAN), which uses a two-level encoder and attention mechanism to encode the document successively into word-level and sentence-level attention vectors, the word-level and sentence-level encoders both being bidirectional gated recurrent units (Bi-GRUs); a model name representation vector is learned at the same time as a supplement in the HAN, the model name being encoded into a vector using only the word-level encoder and attention; the detection report document vector and the model name vector are then concatenated as the final feature $h_h$;
the attention fusion module dynamically integrates the time process modeling and attribute feature modeling elements: the output vectors of the time process modeling module, $h_h$, and $h_e$ are fed into fully connected layers for feature combination to obtain an aligned vector for each element; the attention weights $a_m$ are then calculated using a two-layer neural network, dynamically fusing all sub-networks into a single vector; after a fully connected layer and a softmax output layer, the probability distribution $P_t = \{p_t(l_1), p_t(l_2), \ldots, p_t(l_n)\}$ of the price prediction is obtained, and the price category with the maximum probability is taken as the final prediction result.
2. The second-hand mobile phone price prediction method based on deep fusion of a time series process and mobile phone defect features according to claim 1, characterized in that: in step three, the preprocessing comprises special symbol processing, English case conversion, and unification of traditional and simplified Chinese characters.
3. The second-hand mobile phone price prediction method based on deep fusion of a time series process and mobile phone defect features according to claim 1, characterized in that: a training stage is used to train the time series process modeling, attribute feature modeling, and attention fusion modules.
4. The second-hand mobile phone price prediction method based on deep fusion of a time series process and mobile phone defect features according to claim 1, characterized in that: the method further comprises an application stage, which runs as an online service for processing second-hand mobile phones and specifically comprises the following steps:
Step 1: extracting the metadata features, model name, and detection report text of the mobile phone to be predicted, segmenting the historical sales sequence of the mobile phone, and preprocessing them;
Step 2: inputting the processed price history time series, the metadata features, and the text sequence into the trained model, and outputting the predicted price category of the mobile phone;
Step 3: determining the sales guidance price of the mobile phone from the prediction result by means of rules.
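The attention fusion recited at the end of step five of claim 1 can be illustrated with a small sketch. It assumes the sub-network outputs have already been aligned to a common dimension by their fully connected layers, and it scores each aligned vector with a two-layer network (a tanh hidden layer followed by a linear output). The tanh activation, the parameter names `w1`, `b1`, `w2`, and the toy values are assumptions; the claim states only that a two-layer neural network computes the attention weights $a_m$.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over the sub-network scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_score(x: Sequence[float], w1: Sequence[Sequence[float]],
                    b1: Sequence[float], w2: Sequence[float]) -> float:
    """Two-layer scorer: tanh hidden layer (assumed), then a linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden))


def attention_fuse(aligned: Sequence[Sequence[float]],
                   w1: Sequence[Sequence[float]], b1: Sequence[float],
                   w2: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Weight each aligned sub-network vector by a_m and sum them.

    Returns (fused vector, attention weights a_m).
    """
    weights = softmax([attention_score(x, w1, b1, w2) for x in aligned])
    dim = len(aligned[0])
    fused = [sum(a * x[i] for a, x in zip(weights, aligned)) for i in range(dim)]
    return fused, weights
```

The fused vector would then pass through the fully connected layer and softmax output layer to give the price category distribution. The same weighted-sum pattern could plausibly serve for merging the CNN output sequence along the time dimension, although the claim does not give the scoring function used there.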
CN202111122887.8A 2021-09-24 2021-09-24 Secondhand mobile phone price prediction method integrating time sequence process and mobile phone defect feature depth Active CN113837802B (en)

Publications (2)

Publication Number Publication Date
CN113837802A (en) 2021-12-24
CN113837802B (en) 2024-05-28

Family

ID=78970058

Country Status (1)

Country Link
CN (1) CN113837802B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant