CN113837802B - Secondhand mobile phone price prediction method integrating time sequence process and mobile phone defect feature depth - Google Patents
Secondhand mobile phone price prediction method integrating time sequence process and mobile phone defect feature depth
- Publication number
- CN113837802B (application number CN202111122887.8A)
- Authority
- CN
- China
- Prior art keywords
- mobile phone
- price
- time sequence
- time
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0278—Product appraisal
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Business, Economics & Management (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Probability & Statistics with Applications (AREA)
- Strategic Management (AREA)
- Marketing (AREA)
- Entrepreneurship & Innovation (AREA)
- Game Theory and Decision Science (AREA)
- Economics (AREA)
- General Business, Economics & Management (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention belongs to the technical field of second-hand mobile phone price prediction, and in particular relates to a second-hand mobile phone price prediction method that deeply fuses a time-series process with mobile phone defect features. The method comprises: extracting metadata features; calculating the average daily sale price of each phone model as a macroscopic time series; performing text preprocessing on the content of the model's detection report; mapping each word to a 300-dimensional real vector using a word2vec model; and inputting the resulting metadata features, price time series, and text vector representation into a price prediction model, which comprises a time-series process modeling module, an attribute feature modeling module, and an attention fusion module. In modeling the time-series process, the method considers both the long-term trend of the global price and sudden price fluctuations within local time units, so that changes in the historical sales price series can be modeled more comprehensively and specifically. A CNN with an attention mechanism is used to model the sudden price fluctuations per unit time.
Description
Technical Field
The invention belongs to the technical field of second-hand mobile phone price prediction, and particularly relates to a second-hand mobile phone price prediction method based on deep fusion of a time-series process and mobile phone defect features.
Background
Price prediction for second-hand mobile phones is important to companies that recycle and resell them. Prices are predicted by modeling features such as the phone's type, age, time in use, and geographic location, which makes this a typical application of regression models.
However, phone prices are affected by external factors that produce short-term fluctuations that are hard to predict, and a phone's attribute and metadata features usually span multiple modalities, which increases modeling complexity.
Existing techniques in the industry fall into two groups: time-series process modeling and attribute-feature-based modeling.
Time-series process modeling: these methods model price from the temporal evolution of the aggregate view count across time slots. Prices change over time and can fluctuate abruptly under the influence of external factors, so the goal of a time-series model is to capture both the long-term pattern and the short-term fluctuations.
1. The price fluctuation of a second-hand mobile phone is regarded as a microscopic arrival point process of user purchase behavior, and popularity is predicted by modeling individual events with a microscopic point process based on a reinforced Poisson process, a Hawkes point process, or a neural network. However, in large-scale applications the number of events (transactions) may surge within a short time, which causes performance problems for microscopic time-series process modeling.
2. Prediction is based on the macroscopic accumulation of event counts. The Hawkes Intensity Process (HIP) can describe the evolution of the macroscopic temporal process and has been applied successfully to YouTube video popularity prediction. However, HIP makes specific assumptions about the functional form of the time-series process and of the influence of external factors, which limits the expressive power of the model. Other conventional approaches manually extract rising and falling "phases" from the macroscopic time series to capture fluctuations and then apply phase-based linear regression to predict popularity. However, hand-crafted phases cannot cope with the general evolution of popularity and are not universal.
Feature-based modeling: the metadata features of mobile phone detection items are modeled for price prediction.
For example, features of the phone's screen, battery, and sales channel are extracted manually and then fused in a traditional regression model for prediction. However, such techniques do not make full use of the long text of the inspector's evaluation report together with the metadata features, and they ignore the evolution of phone prices over time. Moreover, manually extracted features are difficult to design and measure and are typically limited to specific data sets or applications.
Defects of the prior art:
1. External factors affecting phone prices may span different ranges and durations, so it is difficult to assume the number and shape of price fluctuations in advance. Capturing short-term fluctuations through prior assumptions about external factors or through manual extraction therefore limits the predictive power of the model.
2. When buying a phone, a user pays attention not only to the individual detection results but also to the detection report written by the inspector, and the way the report is written (for example, its level of detail and its wording) can affect the price the user is willing to bid. The prior art fails to make full use of the long text of the detection report when predicting second-hand phone prices.
3. Time-series process modeling and content feature modeling are not integrated well enough to exploit their respective advantages. Different phone models may exhibit different price fluctuations, and straightforward fusion approaches lack the flexibility to handle the price evolution process.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a second-hand mobile phone price prediction method based on deep fusion of a time-series process and mobile phone defect features, which is convenient to use, gives better results, and is more flexible.
To achieve the above purpose, the invention provides the following technical solution: a price prediction algorithm for second-hand mobile phones, comprising the following steps:
Step one: extracting metadata features to obtain a feature vector (F_1, F_2, F_3, …, F_n);
Step two: calculating the average daily sale price of each phone model as a macroscopic time series to obtain a time-series feature sequence (P_1, P_2, P_3, …, P_n);
Step three: performing text preprocessing on the content of the model's detection report, segmenting the text into words, counting word frequencies, and selecting a set of high-frequency keywords from the text as the word segmentation dictionary;
Step four: assuming that step three segments the text into k words W_1, W_2, W_3, …, W_k, mapping each word to a 300-dimensional real vector using a word2vec model, so that the document is mapped to a k × 300 matrix;
Step five: inputting the metadata features, the price time series, and the text vector representation obtained in steps one, two, and four into a price prediction model for prediction, wherein the price prediction model comprises a time-series process modeling module, an attribute feature modeling module, and an attention fusion module.
In step one, the metadata features include the phone brand, release time, model report, and screen detection items.
In step three, the preprocessing includes special symbol handling, English case conversion, and unification of traditional and simplified Chinese characters.
As a preferred technical solution of the invention, step three further includes uniformly representing keywords that are not in the dictionary with the special symbol <unk>.
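Steps three and four can be illustrated with a minimal Python sketch. Several parts are assumptions made for illustration only: whitespace splitting stands in for a real word segmenter, the conversion between traditional and simplified characters is omitted because it needs a conversion table, and `toy_embedding_table` fills a lookup table with random numbers where a trained word2vec model would supply the vectors. All function names are hypothetical.

```python
import random
import re
from collections import Counter

UNK = "<unk>"  # special symbol for words that are not in the dictionary


def preprocess(text):
    """Lower-case English letters and replace special symbols with spaces."""
    text = text.lower()
    # Keep word characters (letters, digits, CJK) and whitespace only.
    return re.sub(r"[^\w\s]", " ", text)


def tokenize(text):
    # Assumption: whitespace splitting stands in for a real word segmenter.
    return preprocess(text).split()


def build_dictionary(reports, max_size):
    """Keep the max_size most frequent words across all reports."""
    counts = Counter(tok for report in reports for tok in tokenize(report))
    return {word for word, _ in counts.most_common(max_size)}


def encode(text, dictionary):
    """Segment a report and replace out-of-dictionary words with <unk>."""
    return [tok if tok in dictionary else UNK for tok in tokenize(text)]


def toy_embedding_table(words, dim=300, seed=0):
    # Assumption: random vectors stand in for trained word2vec vectors.
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in sorted(words)}


def embed(tokens, table):
    """Map k tokens to a k x dim matrix (a list of k row vectors)."""
    return [table[tok] for tok in tokens]


reports = [
    "Screen: minor SCRATCH!",
    "screen cracked, battery ok",
    "Battery OK; screen fine",
]
dictionary = build_dictionary(reports, max_size=3)
tokens = encode("Screen has a DENT, battery ok", dictionary)
table = toy_embedding_table(dictionary | {UNK})
matrix = embed(tokens, table)
```

Here `tokens` has k = 6 entries, so `matrix` has 6 rows of 300 values each; the three words outside the dictionary share the single `<unk>` vector.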
In step five, the time-series process modeling takes the historical feedback sequence {V_1, V_2, V_3, …, V_t} from step two as input, uses a recurrent neural network (RNN) to model the long-term trend, and uses a convolutional neural network (CNN) to capture short-term fluctuations.
In step five, the attribute feature modeling uses an embedding network and a hierarchical attention network, which take the metadata features from step one and the text vectors from step four as their respective inputs, to model the metadata and the long text.
In step five, the attention fusion module dynamically integrates the network modules for time-series process modeling and attribute feature modeling.
As a preferred technical solution of the invention, the method further comprises a training stage for training the time-series process modeling, attribute feature modeling, and attention fusion modules.
As a preferred technical solution of the invention, the method further comprises an application stage in which second-hand mobile phones are processed as an online service, specifically comprising the following steps:
Step 1: extracting the metadata features, model name, and detection report text of the phone to be predicted, segmenting the phone's historical sales sequence, and preprocessing;
Step 2: inputting the processed price history time series, the metadata features, and the text sequence into the trained model, and outputting the predicted price category of the phone;
Step 3: determining the guidance sale price of the phone from the prediction result by rules.
Compared with the prior art, the invention has the beneficial effects that:
(1) In modeling the time-series process for second-hand phone price prediction, the method considers both the long-term trend of the global price and sudden price fluctuations within local time units, so that changes in the historical sales price series can be modeled more comprehensively and specifically;
(2) The method uses an LSTM to model the long-term growth trend, which has two advantages: a) an LSTM is well suited to sequence inputs with temporal dependencies, can capture relationships between historical time steps, and learns the historical evolution pattern of the price; b) the memory cell in an LSTM can retain longer-range temporal dependencies and therefore handles long historical input sequences better;
(3) The method uses a CNN with an attention mechanism to model sudden price fluctuations per unit time, which has two advantages: a) CNNs are good at capturing local structures with translation invariance, and the short-term "rises" and "falls" of price have exactly this property; b) the attention mechanism lets the model pay more attention to the time points affected by external factors;
(4) The method uses a hierarchical attention network to model the long text of the phone's detection report. Taking into account the inherent hierarchical structure of a document (words form sentences and sentences form the document), the document is encoded successively into word-level and sentence-level attention vectors, which helps obtain a better semantic representation of the text;
(5) The method uses embedding to map metadata features of different types into a common dense space, so that the metadata features are fully fused;
(6) The invention uses a temporal attention fusion mechanism that automatically determines, at each time step, how decisive each module's output is for the final prediction, which gives good flexibility in handling the dynamic evolution of the time-series process;
(7) The lack of historical data shortly after a phone model's release makes early prediction more challenging, yet early predictions are more valuable. The time-decay loss function proposed by the invention helps the model devote more effort to optimizing early-stage prediction performance;
(8) By combining the above points, the model deeply fuses the time-series process with the attribute features when predicting second-hand phone prices, so that the strengths of the two complement each other and the price can be predicted flexibly at any stage after a new phone's release.
Drawings
The accompanying drawings provide a further understanding of the invention and constitute a part of this specification; together with the embodiments, they serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of the algorithm of the present invention;
FIG. 2 is a schematic diagram of the main model structure of the present invention;
FIG. 3 is a schematic diagram of the time-series process model structure of the present invention;
FIG. 4 is a schematic diagram of the embedding network structure of the present invention;
FIG. 5 is a schematic diagram of the hierarchical attention network structure of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
Referring to fig. 1-5, the present invention provides the following technical solution: a price prediction algorithm for second-hand mobile phones, comprising the following steps:
Step one: extracting metadata features, including the phone brand, release time, model report, and screen detection items, to obtain a feature vector (F_1, F_2, F_3, …, F_n);
Step two: calculating the average daily sale price of each phone model as a macroscopic time series to obtain a time-series feature sequence (P_1, P_2, P_3, …, P_n);
Step three: performing text preprocessing on the content of the model's detection report, including special symbol handling, English case conversion, and unification of traditional and simplified Chinese characters; segmenting the text into words, counting word frequencies, and selecting a set of high-frequency keywords from the text as the word segmentation dictionary; keywords that are not in the dictionary are uniformly represented by the special symbol <unk>;
Step four: assuming that step three segments the text into k words W_1, W_2, W_3, …, W_k, mapping each word to a 300-dimensional real vector using a word2vec model, so that the document is mapped to a k × 300 matrix;
Step five: inputting the metadata features, the price time series, and the text vector representation obtained in steps one, two, and four into a price prediction model for prediction, wherein the price prediction model comprises a time-series process modeling module, an attribute feature modeling module, and an attention fusion module.
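As an illustration of step two, the sketch below builds the macroscopic price series for one phone model from individual sales records. The record layout `(model, day, price)` and the function name are assumptions, and days without any sale are simply left out of the series, which is one possible choice among several.

```python
from collections import defaultdict
from datetime import date


def daily_average_prices(sales, model):
    """Return [(day, mean price)] for one phone model, sorted by day.

    sales is an iterable of (model, day, price) records (assumed layout).
    Days without a sale of that model are omitted from the result.
    """
    totals = defaultdict(lambda: [0.0, 0])  # day -> [price sum, sale count]
    for record_model, day, price in sales:
        if record_model == model:
            totals[day][0] += price
            totals[day][1] += 1
    return [(day, total / count) for day, (total, count) in sorted(totals.items())]


sales = [
    ("A", date(2021, 9, 1), 1000.0),
    ("A", date(2021, 9, 1), 1200.0),
    ("B", date(2021, 9, 1), 500.0),
    ("A", date(2021, 9, 3), 900.0),
]
series_a = daily_average_prices(sales, "A")
```

For model "A" the two sales on 1 September are averaged to 1100.0, and 3 September contributes 900.0.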
Specifically, according to fig. 3, in step five of this embodiment, the time-series process modeling takes the historical feedback sequence {V_1, V_2, V_3, …, V_t} from step two as input, uses a recurrent neural network (RNN) to model the long-term trend, and uses a convolutional neural network (CNN) to capture short-term fluctuations.
The time-series process network uses a long short-term memory network (LSTM), a type of RNN, to capture the long-term growth trend of the price over time. The advantage of an LSTM for temporal modeling is that its hidden state summarizes all historical information, so no specific assumption needs to be made about the functional form of the historical trend; in addition, the memory cells in an LSTM are better at capturing long-range dependencies. The feedback vector V of each time slot is fed into the LSTM, and the resulting RNN output vector encodes the historical growth pattern.
On the other hand, fluctuations caused by external factors make the reading quantity curve show rising and falling phases that look like "hills" and "valleys". As shown in fig. 3, these are local structures with translation invariance, so the invention captures such short-term fluctuation structures with a one-dimensional convolutional neural network. Furthermore, the influence of different factors lasts over different time ranges, meaning that the "hills" have different widths, so the invention uses multiple convolution kernels of different sizes to capture different fluctuation ranges and then stacks the outputs of all convolution kernels vertically.
Since a CNN typically requires a fixed-size input, assume an input window of width k. The input to each convolution layer is the truncated sequence {V_{t-k+1}, V_{t-k+2}, …, V_t} of length k ending at time t. Applying "same" padding yields an output sequence {C_{t-k+1}, C_{t-k+2}, …, C_t} of length k, which captures the most recent fluctuation patterns.
Finally, the invention uses an attention mechanism to merge the output sequence {C_{t-k+1}, C_{t-k+2}, …, C_t} along the time dimension into a single CNN output vector. The vectors c at different time steps of the convolutional output sequence are multiplied by different attention weights a_c, which helps the output focus on the time points affected by external factors. The weights a_c and the output vector are calculated as follows:
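The formula for the attention weights is not reproduced in this text, so the following Python sketch only illustrates the general idea under stated assumptions: the input is a scalar price window, each kernel produces one output channel with zero "same" padding, the attention score of a time step is a dot product between the stacked channel vector and a vector `attn_w` (learned in practice, hand-set here), and the scores are normalized with a softmax. These scoring choices and all names are assumptions, not the patent's stated formula.

```python
import math


def conv1d_same(series, kernel):
    """1-D convolution (cross-correlation) with zero 'same' padding."""
    k = len(kernel)
    left = (k - 1) // 2
    padded = [0.0] * left + list(series) + [0.0] * (k - 1 - left)
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(series))
    ]


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def short_term_features(window, kernels, attn_w):
    """Return (attention weights over time, attention-pooled CNN vector)."""
    # One channel per kernel size; C[t] stacks all channels at time step t.
    channels = [conv1d_same(window, kernel) for kernel in kernels]
    C = [[channel[t] for channel in channels] for t in range(len(window))]
    # Assumed scoring: dot product of C[t] with attn_w, then softmax over time.
    scores = [sum(w * c for w, c in zip(attn_w, c_t)) for c_t in C]
    a = softmax(scores)
    pooled = [
        sum(a[t] * C[t][d] for t in range(len(C))) for d in range(len(kernels))
    ]
    return a, pooled


window = [1.0, 1.0, 5.0, 1.0, 1.0]        # a price spike in the middle
kernels = [[1.0], [-1.0, 2.0, -1.0]]      # widths 1 and 3
a, h_c = short_term_features(window, kernels, attn_w=[0.0, 1.0])
```

With these toy values the width-3 kernel responds most strongly at the spike, so nearly all attention weight falls on the middle time step and the pooled vector is dominated by that step.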
Specifically, according to fig. 4 and fig. 5, in step five of this embodiment, the attribute feature modeling uses an embedding network and a hierarchical attention network, which take the metadata features from step one and the text vectors from step four as their respective inputs, to model the metadata and the long text.
The content attribute features of a second-hand phone (including the detection report text and the metadata features) largely determine its price range. The metadata features include one-hot encoded features, such as categories, as well as numeric features, such as the average transaction price over the past week or day. Instead of selecting and combining these features manually, the invention uses embedding to map them to dense vectors in a common space and applies a fully connected layer to combine them. As shown in fig. 4, one-hot encoded features are embedded into dense vectors through embedding matrices, while numeric features are multiplied by embedding vectors to map them to dense vectors in the same space. All metadata feature vectors are then concatenated and combined by the fully connected layer to obtain the overall metadata representation vector h_e.
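The embedding scheme described above can be sketched in plain Python. The weights below are hand-set toy values where a trained model would use learned parameters, no activation function is applied because the text does not name one, and all function names are illustrative.

```python
def embed_categorical(index, embedding_matrix):
    # Multiplying a one-hot vector by the matrix selects a single row.
    return embedding_matrix[index]


def embed_numeric(value, embedding_vector):
    # A numeric feature scales its own embedding vector.
    return [value * e for e in embedding_vector]


def dense(x, weights, bias):
    """Fully connected layer without activation: weights @ x + bias."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)
    ]


def metadata_representation(cat_indices, cat_tables, num_values, num_vectors, W, b):
    """Embed every feature, concatenate, and combine with a dense layer."""
    parts = []
    for index, table in zip(cat_indices, cat_tables):
        parts.extend(embed_categorical(index, table))
    for value, vector in zip(num_values, num_vectors):
        parts.extend(embed_numeric(value, vector))
    return dense(parts, W, b)


brand_table = [[1.0, 0.0], [0.0, 1.0]]   # two brands, embedding size 2 (toy)
price_vector = [0.5, -0.5]               # embedding vector for one numeric feature
W = [[1.0, 1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0]]
b = [0.5, 0.0]
h_e = metadata_representation([1], [brand_table], [2.0], [price_vector], W, b)
```

Here brand index 1 maps to [0.0, 1.0], the numeric value 2.0 maps to [1.0, -1.0], and the dense layer combines the concatenated vector [0.0, 1.0, 1.0, -1.0] into `h_e`.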
Since a phone's detection report is usually a long text document, the invention uses a hierarchical attention network (HAN) to model the text content features. Taking into account the inherent hierarchical structure of a document (words form sentences and sentences form the document), the HAN uses a two-level encoder with attention to encode the document into word-level and sentence-level attention vectors; both the word-level and the sentence-level encoders are Bi-GRUs. In addition, the phone's model name is a high-level description of the phone that conveys its main impression, so a representation vector of the model name is learned in the HAN as a supplement. The model name is encoded into a vector using only the word-level encoder and attention, and the detection report document vector and the model name vector are then concatenated as the final text feature h_h.
Specifically, according to fig. 4 and fig. 5, in step five of this embodiment, the attention fusion module dynamically integrates the network modules for time-series process modeling and attribute feature modeling.
Consider the four module outputs: the RNN output vector, the CNN output vector, the HAN output h_h, and the metadata embedding output h_e. In the initial period after a phone model's release, time-series process modeling can hardly learn the overall price trend, so prediction should rely mainly on attribute feature modeling. As time passes, the observed price gradually stabilizes, so time-series modeling should play the main role in prediction. The attention fusion mechanism therefore combines the four outputs with a flexible weight a that depends on the four outputs and on the time t, so that the method adapts automatically to the outputs of different modules at different times and handles the dynamic evolution of the time-series process flexibly. Because the attention mechanism combines vectors element by element, the four outputs are first fed into fully connected layers for feature combination and to obtain aligned vectors. A two-layer neural network is then used to calculate the attention weights a_m in the following manner:
The time representation variable t consists of the periodicity of the given time slot t and the slot interval relative to the release time, where the periodicity is a one-hot encoded feature and the time interval is a numeric feature. The invention embeds the time representation variable t into a vector using the same strategy as for the metadata features, and dynamically fuses all sub-networks into a single vector through the attention weights a_m. After a fully connected layer and a softmax output layer, a price prediction probability distribution P_t = {p_t(l_1), p_t(l_2), …, p_t(l_n)} is obtained, and the price category with the maximum probability is taken as the final prediction result. The specific calculation process is as follows:
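The fusion formulas are not reproduced in this text. The sketch below shows one plausible reading, with every detail labeled as an assumption: each aligned module vector is concatenated with the time embedding, a two-layer network (tanh hidden layer, linear output) yields one score per module, a softmax over modules gives the weights a_m, and the fused vector is the weighted sum. The prediction head is a dense layer followed by softmax and argmax, as the text describes. Names and toy weights are illustrative.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    return [
        sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)
    ]


def fuse(module_vectors, time_vector, W1, b1, w2):
    """Attention fusion over aligned module vectors of equal length.

    Assumed scoring: score_m = w2 . tanh(W1 [h_m ; t] + b1).
    """
    scores = []
    for h in module_vectors:
        hidden = [math.tanh(z) for z in dense(h + time_vector, W1, b1)]
        scores.append(sum(w * z for w, z in zip(w2, hidden)))
    a = softmax(scores)
    size = len(module_vectors[0])
    fused = [
        sum(a[m] * module_vectors[m][d] for m in range(len(a))) for d in range(size)
    ]
    return a, fused


def predict(fused, W_out, b_out):
    """Dense layer, softmax, then the index of the most probable price class."""
    probs = softmax(dense(fused, W_out, b_out))
    return probs, max(range(len(probs)), key=probs.__getitem__)


# Toy example with two modules instead of four, and three price classes.
module_vectors = [[1.0, 0.0], [0.0, 1.0]]
time_vector = [1.0]
a, fused = fuse(module_vectors, time_vector, W1=[[1.0, 0.0, 0.0]], b1=[0.0], w2=[10.0])
probs, label = predict(
    fused, W_out=[[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], b_out=[0.0, 0.0, 0.0]
)
```

In this toy setup the first module receives almost all of the attention weight, so the fused vector stays close to that module's output and class 0 is predicted.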
Specifically, according to fig. 4 and fig. 5, the embodiment further includes a training stage for training the time-series process modeling, attribute feature modeling, and attention fusion modules.
To ensure the diversity of the training data, phone price prediction is treated as a classification task in which prices are divided into 100 categories: the continuous price values are split into 100 classes by equal-frequency bucketing, and the class is used as the label of the training data. At the same time, the maximum and minimum input lengths of the text sequence are limited, and samples whose historical feedback sequences are too short to have predictive value are filtered out.
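Equal-frequency bucketing can be sketched as follows. The example uses 4 classes instead of 100 to keep it readable; the function names are illustrative, and cut points are taken directly from the sorted prices, so heavily repeated prices can make the buckets unequal in practice.

```python
import bisect
from collections import Counter


def equal_frequency_edges(prices, n_bins):
    """Return n_bins - 1 cut points so each bin holds about as many prices."""
    ordered = sorted(prices)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def price_to_class(price, edges):
    """Class index in [0, len(edges)]; a price equal to an edge goes up."""
    return bisect.bisect_right(edges, price)


prices = [float(p) for p in range(1, 101)]       # toy prices 1.0 .. 100.0
edges = equal_frequency_edges(prices, n_bins=4)
labels = [price_to_class(p, edges) for p in prices]
class_sizes = Counter(labels)
```

With 100 distinct toy prices and 4 bins, the cut points are 26.0, 51.0, and 76.0, and every class receives 25 samples.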
During training, the category with the highest prediction score is selected as the predicted price, and the model is trained on the training set until convergence using the Adam optimization algorithm with the proposed time-decay loss function as the optimization objective. In practical applications, predicting the early price is more valuable; in addition, the intrinsic relationship between the observed price and a reasonable price interval makes prediction easier at later stages. To help the model devote more effort to optimizing early-stage prediction performance, the final loss function is a weighted sum of the single-step cross-entropy losses, each multiplied by a time-decay factor D(Δt). D(Δt) is a monotonically non-increasing function of the interval Δt between time t and the model's release time. The specific forms of D(Δt) and the loss function J are as follows:
D(Δt) = [log_γ(Δt + 1)]^(-1)
Here, [·] denotes the round-up (ceiling) operator and Δt is the number of time slots between the release time and time t, so Δt is a positive integer; log p_t(l_c) is the log-probability term of the single-step cross-entropy loss in J; and γ > 1 is a hyperparameter that controls the decay rate. We use a logarithmic function so that the decay rate of D(Δt) becomes smaller and smaller over time, and the round-up operator limits the initial decay rate of the logarithmic function.
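To make the decay factor concrete, the sketch below computes D(Δt) and a time-decayed loss in Python. The form of the loss used here, J = -Σ_t D(Δt) · log p_t(l_c), is an assumption reconstructed from the prose description of a decay-weighted sum of single-step cross-entropy losses; the function names are hypothetical. The ceiling of the logarithm is computed with an integer loop to avoid floating-point errors at exact powers of γ.

```python
import math


def ceil_log(x, gamma):
    """Smallest integer k with gamma**k >= x, i.e. ceil(log_gamma(x))."""
    k, power = 0, 1.0
    while power < x:
        power *= gamma
        k += 1
    return k


def time_decay(delta_t, gamma):
    """D(dt) = 1 / ceil(log_gamma(dt + 1)) for integer dt >= 1 and gamma > 1."""
    if delta_t < 1 or gamma <= 1:
        raise ValueError("need delta_t >= 1 and gamma > 1")
    return 1.0 / ceil_log(delta_t + 1, gamma)


def time_decay_loss(true_class_probs, gamma):
    """Assumed form of J: decay-weighted sum of single-step cross-entropy terms.
    true_class_probs[i] is the predicted probability of the true class at dt = i + 1."""
    return -sum(
        time_decay(dt, gamma) * math.log(p)
        for dt, p in enumerate(true_class_probs, start=1)
    )


# With gamma = 2 the factor drops quickly at first, then more and more slowly.
print([time_decay(dt, 2) for dt in range(1, 9)])
print(time_decay_loss([0.5, 0.5], 2))
```

Errors made shortly after release are weighted most heavily, and the weight stays constant over progressively longer runs of time slots, which matches the stated goal of prioritising early predictions.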
Specifically, as shown in FIG. 4 and FIG. 5, the embodiment further includes an application stage, which runs as an online service for processing second-hand mobile phones and comprises the following steps:
Step 1: extracting the metadata features, model name, and detection report text of the mobile phone to be predicted, segmenting the historical sales sequence of the mobile phone, and preprocessing the data;
Step 2: inputting the processed price history time series, the metadata features, and the text sequence into the trained model, and outputting the predicted price category of the mobile phone;
Step 3: determining the sales guidance price of the mobile phone from the prediction result by means of rules.
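The rules used in Step 3 are not specified in the text. As one hypothetical example, the Python sketch below maps a predicted class back to the price interval of its bucket and uses the interval midpoint as the guidance price. The midpoint rule, the function names, and the sample cut points are all assumptions for illustration only.

```python
def class_to_price_range(cls, edges, min_price, max_price):
    """Return the (lower, upper) price bounds of bucket `cls`, given the
    sorted cut points `edges` and the overall price range."""
    lower = min_price if cls == 0 else edges[cls - 1]
    upper = max_price if cls == len(edges) else edges[cls]
    return lower, upper


def guidance_price(cls, edges, min_price, max_price):
    """Hypothetical rule: use the midpoint of the predicted bucket."""
    lower, upper = class_to_price_range(cls, edges, min_price, max_price)
    return (lower + upper) / 2


# Illustrative cut points defining four price classes between 50 and 500.
edges = [100, 200, 300]
for cls in range(4):
    print(cls, class_to_price_range(cls, edges, 50, 500), guidance_price(cls, edges, 50, 500))
```

In practice the rule could equally use the bucket median observed in training data or apply business-specific adjustments.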
(1) The present scheme uses an LSTM to model long-term price trends; other RNN structures, such as gated recurrent units (GRUs), may be used instead;
(2) The HAN uses a bidirectional gated recurrent unit (Bi-GRU) to encode the text sequentially at the word level and the sentence level; other encoders, such as unidirectional or bidirectional RNNs and LSTMs, CNNs, and Transformers, may be used instead;
(3) The final output can be changed from a classification problem to a regression problem, so that the price value is predicted directly;
(4) The method can be extended to any regression-type business scenario, such as news in a portal and blog recommendation in a social network.
Finally, it should be noted that the foregoing is only a preferred embodiment of the present invention and does not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or make equivalent replacements of some of their technical features. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.
Claims (4)
1. A secondhand mobile phone price prediction method based on deep fusion of a time sequence process and mobile phone defect features, characterized by comprising the following steps:
Step one: extracting metadata features, including mobile phone brand, release time, model reports, and screen detection items, to obtain the feature vector (F_1, F_2, F_3, …, F_n);
Step two: calculating the average daily sales price of each phone model as the time sequence feature, to obtain the time sequence feature sequence (P_1, P_2, P_3, …, P_n);
Step three: performing text preprocessing on the detection report content, segmenting the text into words, counting word frequencies, selecting a subset of high-frequency keywords as the word segmentation dictionary, and uniformly representing keywords that are not in the dictionary by the special symbol <unk>;
Step four: the word segmentation of step three yields k words W_1, W_2, W_3, …, W_k in total; each word is mapped to a 300-dimensional real vector using a word2vec model, so that the document doc is mapped to a k × 300 matrix;
Step five: inputting the metadata features, the time sequence feature sequence, and the 300-dimensional real vectors obtained in steps one, two, and four into a price prediction model for prediction, wherein the price prediction model comprises a time process modeling module, an attribute feature modeling module, and an attention fusion module;
the time process modeling module takes the time sequence feature sequence (P_1, P_2, P_3, …, P_n) of step two as input, uses a recurrent neural network (RNN) to model long-term trends, and uses a convolutional neural network (CNN) to capture short-term fluctuations; a long short-term memory network (LSTM), a type of RNN, captures the long-term growth trend of the price evolution over time, the feedback vector of each time slot being fed into the LSTM to obtain an output vector; the CNN fixes the width of the input window to k, the input of each convolution layer is the clipped sequence of length k before time t, "same" padding is applied, an output sequence {C_{t-k+1}, C_{t-k+2}, …, C_t} of length k is obtained, and an attention mechanism merges this output sequence along the time dimension into an output vector;
the attribute feature modeling module uses an embedding network and a hierarchical attention network to receive, respectively, the metadata features of step one and the 300-dimensional real vectors of step four as input, and models the metadata and the long text; the metadata features comprise one-hot encoded features and numerical features; the one-hot encoded features are embedded into dense vectors through an embedding matrix, while the numerical features are multiplied by embedding vectors so as to map them to dense vectors of the same kind; all metadata features are then concatenated and combined by a fully connected layer to obtain an overall metadata representation vector h_e; the detection report is a long text document, and its text content features are modeled with a hierarchical attention network (HAN); the hierarchical attention network uses a two-level encoder and attention mechanism to encode the document sequentially into word-level and sentence-level attention vectors, the word-level and sentence-level encoders being bidirectional gated recurrent units (Bi-GRU); a model name representation vector is learned simultaneously in the HAN as a supplement, the model name being encoded into a vector using only the word-level encoder and attention; the detection report document vector and the model name vector are then concatenated as the final feature h_h;
the attention fusion module dynamically integrates the time process modeling and the attribute feature modeling: the sub-network output vectors, including h_h and h_e, are fed into fully connected layers for feature combination to obtain aligned vectors of the elements; the attention weights a_m are then calculated using a two-layer neural network, all sub-networks are dynamically fused into a single vector, the probability distribution P_t = {p_t(l_1), p_t(l_2), …, p_t(l_n)} of the price prediction is obtained after the fully connected layer and the softmax output layer, and the price category with the maximum probability is taken as the final prediction result.
2. The secondhand mobile phone price prediction method based on deep fusion of a time sequence process and mobile phone defect features of claim 1, characterized in that: in step three, the preprocessing comprises special symbol processing, English case conversion, and unification of traditional and simplified Chinese characters.
3. The secondhand mobile phone price prediction method based on deep fusion of a time sequence process and mobile phone defect features of claim 1, characterized in that: the method further comprises a training stage for training the time sequence process modeling, attribute feature modeling, and attention fusion modules.
4. The secondhand mobile phone price prediction method based on deep fusion of a time sequence process and mobile phone defect features of claim 1, characterized in that: the method further comprises an application stage, which runs as an online service for processing second-hand mobile phones and specifically comprises the following steps:
Step 1: extracting the metadata features, model name, and detection report text of the mobile phone to be predicted, segmenting the historical sales sequence of the mobile phone, and preprocessing the data;
Step 2: inputting the processed price history time series, the metadata features, and the text sequence into the trained model, and outputting the predicted price category of the mobile phone;
Step 3: determining the sales guidance price of the mobile phone from the prediction result by means of rules.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111122887.8A CN113837802B (en) | 2021-09-24 | 2021-09-24 | Secondhand mobile phone price prediction method integrating time sequence process and mobile phone defect feature depth |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111122887.8A CN113837802B (en) | 2021-09-24 | 2021-09-24 | Secondhand mobile phone price prediction method integrating time sequence process and mobile phone defect feature depth |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113837802A (en) | 2021-12-24 |
CN113837802B (en) | 2024-05-28 |
Family
ID=78970058
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111122887.8A Active CN113837802B (en) | 2021-09-24 | 2021-09-24 | Secondhand mobile phone price prediction method integrating time sequence process and mobile phone defect feature depth |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113837802B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101061094B1 (en) * | 2010-09-28 | 2011-08-31 | 홍상욱 | Information providing and selling system for used car on network |
WO2012071543A2 (en) * | 2010-11-24 | 2012-05-31 | Decide, Inc. | Price and model prediction system and method |
CN109492838A (en) * | 2019-01-16 | 2019-03-19 | 中国地质大学(武汉) | A kind of stock index price expectation method based on deep-cycle neural network |
AU2020100249A4 (en) * | 2019-04-26 | 2020-03-26 | Shanghai Academy Of Agricultural Sciences | Method and device for predicting product price and computer medium |
CN112016964A (en) * | 2020-08-27 | 2020-12-01 | 李忠耘 | Second-hand vehicle dynamic pricing method and device, electronic equipment and storage medium |
US10878505B1 (en) * | 2020-07-31 | 2020-12-29 | Agblox, Inc. | Curated sentiment analysis in multi-layer, machine learning-based forecasting model using customized, commodity-specific neural networks |
KR102199620B1 (en) * | 2020-05-20 | 2021-01-07 | 주식회사 네이처모빌리티 | System for providing bigdata based price comparison service using time series analysis and price prediction |
KR102234821B1 (en) * | 2020-10-12 | 2021-04-01 | 주식회사 브랜드쉐어 | Electronic device for performing a predection for a price of a product using big data and machine learning model and method for operating thereof |
KR20210063774A (en) * | 2019-11-25 | 2021-06-02 | (주)크래프트테크놀로지스 | Method, computer program and recording medium for predicting price of asset use of convolution neural network |
CN113033903A (en) * | 2021-03-31 | 2021-06-25 | 西安建筑科技大学 | Fruit price prediction method, medium and equipment of LSTM model and seq2seq model |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10185996B2 (en) * | 2015-07-15 | 2019-01-22 | Foundation Of Soongsil University Industry Cooperation | Stock fluctuation prediction method and server |
CN105930931A (en) * | 2016-04-22 | 2016-09-07 | 国网浙江省电力公司经济技术研究院 | Electric power engineering cost management method |
US20210110440A1 (en) * | 2019-10-15 | 2021-04-15 | A La Carte Media, Inc. | Systems and methods for enhanced evaluation of pre-owned electronic devices and provision of protection plans, repair, certifications, etc. |
Non-Patent Citations (2)
Title |
---|
Two-channel Attention Mechanism Fusion Model of Stock Price Prediction Based on CNN-LSTM; Lin Sun et al.; ACM Transactions on Asian and Low-Resource Language Information Processing; pp. 1-12 * |
Stock index price prediction method based on the HP-LSTM model; Yao Yuan et al.; Computer Engineering and Applications; pp. 296-304 * |
Also Published As
Publication number | Publication date |
---|---|
CN113837802A (en) | 2021-12-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||