CN106503228A - A kind of packet scarcity appraisal procedure and its system - Google Patents

A data packet scarcity assessment method and system

Info

Publication number
CN106503228A
Authority
CN
China
Prior art keywords
packet
assessed
data
similarity
scarcity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610970543.5A
Other languages
Chinese (zh)
Inventor
张斌德
王军
孙玉权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guoxin Youe Data Co Ltd
Original Assignee
Guoxin Youe Data Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guoxin Youe Data Co Ltd filed Critical Guoxin Youe Data Co Ltd
Priority to CN201610970543.5A priority Critical patent/CN106503228A/en
Publication of CN106503228A publication Critical patent/CN106503228A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention provides a data packet scarcity assessment method and system. The method comprises the following steps. S100: obtain multiple related data packets related to specified content. S200: determine a data packet to be assessed, determine the similarity between the data packet to be assessed and the other data packets, and select the data packets whose similarity to the data packet to be assessed is higher than a predetermined threshold as comparison data packets. S300: determine the scarcity of the data packet to be assessed using a preset processing method. By assessing the scarcity of a data packet, the invention makes it possible to understand the quality of the data packet and provides a reference basis for assessing the value of data.

Description

A data packet scarcity assessment method and system
Technical field
The present invention relates to the field of big data, and in particular to a data packet scarcity assessment method and system.
Background
Data trading is still at an early stage as an industry. It is developing quickly but lacks mature theoretical guidance. Quantifying the value of data is extremely difficult, owing both to the intrinsic characteristics of data and to the current business environment. The work is also hindered by numerous objective factors, such as accurately assessing the cost of compiling data, the depreciation and life-cycle changes of data, and the added value of data. As trading in data products becomes increasingly common, how to judge the value of data is a problem not only for data sellers but also for buyers.
It is a well-known view that what is rare is precious, and data is no exception: the scarcer the data, the greater its value. The scarcity of data information resources has two aspects. The first is the root source of scarcity, namely the objective value of data information resources. The second is the form in which scarcity is expressed: the usefulness of data information resources makes scarcity possible, and their non-homogeneity makes scarcity inevitable.
Therefore, how to assess the scarcity of data, so as to better serve the data trading market, has become a problem in urgent need of a solution.
Summary of the invention
To address the above technical problem, the present invention provides a data packet scarcity assessment method and system.
The technical solution adopted by the present invention is as follows.
An embodiment of the present invention provides a data packet scarcity assessment method, comprising:
S100: obtaining multiple related data packets related to specified content;
S200: determining a data packet to be assessed, determining the similarity between the data packet to be assessed and the other data packets, and selecting the data packets whose similarity to the data packet to be assessed is higher than a predetermined threshold as comparison data packets;
S300: determining the scarcity of the data packet to be assessed using a preset processing method, specifically by assessing the scarcity of the data packet to be assessed with the following formula:
$$f = \frac{2e^{-y/x}}{1 + e^{-y/x}}$$
where f is the scarcity score of the data packet to be assessed, with values in the range [0, 1]; y is the total number of data records in all data packets other than the data packet to be assessed; and x is the number of data records in the data packet to be assessed.
Preferably, in step S200 a text similarity algorithm is used to calculate the similarity between the data packet to be assessed and the other data packets, which specifically includes:
S210: reading the text in the data packet to be assessed and in the comparison data packets into an R program, splitting the text in each data packet into individual words with a word segmentation tool or user-defined segmentation rules, determining the feature words, counting the frequency with which each feature word occurs, and building a term-document matrix;
S220: calculating the similarity between the data packet to be assessed and a comparison data packet based on the following formula:
$$G = \frac{N_1 M_1 + N_2 M_2 + \dots + N_m M_m}{\sqrt{N_1^2 + N_2^2 + \dots + N_m^2} \times \sqrt{M_1^2 + M_2^2 + \dots + M_m^2}}$$
where G is the similarity between the data packet to be assessed and another data packet, in the range [0, 1]; N1, N2 … Nm and M1, M2 … Mm are the numbers of times each feature word occurs in the data packet to be assessed and in the other data packet, respectively.
Preferably, when G is greater than 0.5, the data packet to be assessed and the comparison data packet are similar; when G is greater than 0.85, the data packet to be assessed and the comparison data packet are highly similar.
Preferably, when f = 0, the data in the data packet to be assessed is not scarce; when f = 1, the data in the data packet to be assessed does not exist in the other comparison data packets and is very scarce.
Preferably, the multiple related data packets related to the specified content are obtained by crawling network data from multiple data platforms on the Internet.
Another embodiment of the present invention provides a data packet scarcity assessment system, comprising:
a data acquisition module, which obtains multiple related data packets related to specified content;
a similarity assessment module, which determines a data packet to be assessed, determines the similarity between the data packet to be assessed and the other data packets, and selects the data packets whose similarity to the data packet to be assessed is higher than a predetermined threshold as comparison data packets;
a scarcity assessment module, which determines the scarcity of the data packet to be assessed using a preset processing method, specifically by assessing the scarcity of the data packet to be assessed with the following formula:
$$f = \frac{2e^{-y/x}}{1 + e^{-y/x}}$$
where f is the scarcity score of the data packet to be assessed, with values in the range [0, 1]; y is the total number of data records in all data packets other than the data packet to be assessed; and x is the number of data records in the data packet to be assessed.
Optionally, the similarity assessment module includes:
a feature extraction unit, which determines the feature words of the text in the data packet to be assessed and in the comparison data packets, using a keyword extraction tool or user-defined rules;
a term-document matrix building unit, which reads the text in the data packet to be assessed and in the comparison data packets into an R program, splits the text in each data packet into individual words with a word segmentation tool or user-defined segmentation rules, counts the frequency with which each feature word occurs, and builds a term-document matrix;
a similarity calculation unit, which calculates the similarity between the data packet to be assessed and a comparison data packet based on the following formula:
$$G = \frac{N_1 M_1 + N_2 M_2 + \dots + N_m M_m}{\sqrt{N_1^2 + N_2^2 + \dots + N_m^2} \times \sqrt{M_1^2 + M_2^2 + \dots + M_m^2}}$$
where G is the similarity between the data packet to be assessed and another data packet, in the range [0, 1]; N1, N2 … Nm and M1, M2 … Mm are the numbers of times each feature word occurs in the data packet to be assessed and in the other data packet, respectively.
Optionally, the predetermined threshold is 0.5. When G is greater than 0.5, the data packet to be assessed and the comparison data packet are similar; when G is greater than 0.85, they are highly similar.
Optionally, when f = 0, the data in the data packet to be assessed is not scarce; when f = 1, the data in the data packet to be assessed does not exist in the other comparison data packets and is very scarce.
Optionally, the data acquisition module obtains the multiple related data packets related to the specified content by crawling network data from multiple data platforms on the Internet.
By assessing the scarcity of a data packet, the present invention makes it possible to understand the quality of the data packet and provides a reference basis for assessing the value of data.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the data packet scarcity assessment method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the data packet scarcity assessment system provided by an embodiment of the present invention.
Detailed description
Specific embodiments of the present invention are described below with reference to the accompanying drawings.
【Embodiment 1】Data packet scarcity assessment method
Fig. 1 is a schematic flowchart of the data packet scarcity assessment method provided by an embodiment of the present invention. As shown in Fig. 1, the data packet scarcity assessment method provided by this embodiment includes:
S100: Obtain related data packets
Specifically, based on the specified content, related data packets on the major data trading websites can be crawled with a Python program, and the crawled data is stored in a MySQL relational database. A data packet may contain files of various data types, such as JSON, image, video and audio files. The crawling process is as follows: after the user enters a URL, the server host is located through a DNS server and a request is sent to the server; after parsing the request, the server sends files such as HTML, JS and CSS to the user's browser, which parses and renders them. The web page the user sees is therefore essentially made up of HTML code, and this is the content the crawler retrieves. By analyzing and filtering the HTML code, resources such as images, text and uploaded attachments can be crawled, so that content related to data packets, such as data packet descriptions, can be crawled from the major data trading websites. In this way, multiple related data packets with the same subject content can be obtained. Of course, data packets that were obtained before the assessment operation may also be selected for assessment, rather than being crawled in real time during the assessment operation.
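The "analyze and filter the HTML code" step can be sketched as follows. This is a minimal illustration using only Python's standard `html.parser` module; the class name `PackageListingParser`, the helper `extract_package_description` and the sample page are hypothetical and do not come from the patent, and fetching pages and storing results in MySQL are left out.

```python
from html.parser import HTMLParser


class PackageListingParser(HTMLParser):
    """Collect visible text, link targets and image sources from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []        # href values of <a> tags (e.g. uploaded attachments)
        self.images = []       # src values of <img> tags
        self.text_parts = []   # visible text fragments
        self._skip_depth = 0   # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is outside script/style blocks and not blank.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract_package_description(html):
    """Return the text, links and images found in an HTML document."""
    parser = PackageListingParser()
    parser.feed(html)
    parser.close()
    return {
        "text": " ".join(parser.text_parts),
        "links": parser.links,
        "images": parser.images,
    }


# Hypothetical page standing in for a data packet description page.
SAMPLE_HTML = """
<html><head><style>p {color: red}</style></head>
<body><h1>Import and export data</h1>
<p>Customs declarations by port.</p>
<a href="/files/sample.json">sample</a>
<img src="/img/preview.png">
<script>var x = 1;</script>
</body></html>
"""
result = extract_package_description(SAMPLE_HTML)
```

A real crawler would also need request handling, rate limiting and per-site extraction rules, none of which the source specifies.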
S200: Calculate the similarity between data packets and select the data packets whose similarity exceeds the predetermined threshold
Specifically, a data packet to be assessed can be determined according to the actual situation. For example, if the scarcity of the data of a certain data supply platform needs to be assessed, the data packet provided by that platform can be designated as the data packet to be assessed. A text similarity algorithm is then used to calculate the similarity between the data packet to be assessed and the other data packets, and the data packets whose similarity exceeds the predetermined threshold are selected as comparison data packets. Step S200 may specifically include:
S210: The text in the data packets is read into an R program, the text in each of the related data packets is split into individual words with a word segmentation tool or user-defined segmentation rules, the feature words are determined and the frequency of each feature word is counted, and a term-document matrix is built. For example, for three data packets on import and export products, the resulting term-document matrix may be as shown in Table 1 below:
Table 1: Term-document matrix
Feature word  Declaration  Export  Port  Province/city  Quantity  Origin  Category  Amount  Specification
Text 1        2            4       1     2              6         2       2         7       0
Text 2        1            5       4     3              8         2       2         5       1
Text 3        3            1       4     0              1         8       7         2       3
Each number in Table 1 is the number of times the feature word in that column occurs in the corresponding text.
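Building such a matrix can be sketched as below. The patent performs this step in an R program; the Python version here is a swapped-in illustration, and the function names, the whitespace-and-letters tokenizer and the sample texts are all assumptions (Chinese text would need a proper word segmentation tool).

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic words (a stand-in for a real segmenter)."""
    return re.findall(r"[a-z]+", text.lower())


def build_term_document_matrix(texts, feature_words=None):
    """Return (feature_words, {text name: list of counts in feature-word order})."""
    counts = {name: Counter(tokenize(body)) for name, body in texts.items()}
    if feature_words is None:
        # Without a supplied feature list, use every word seen in any text.
        feature_words = sorted(set().union(*counts.values()))
    matrix = {name: [c[word] for word in feature_words] for name, c in counts.items()}
    return feature_words, matrix


# Hypothetical sample texts, not the patent's data.
texts = {
    "text 1": "port export port quantity",
    "text 2": "export amount amount port",
}
features, matrix = build_term_document_matrix(
    texts, feature_words=["export", "port", "quantity", "amount"]
)
```

Each row of `matrix` plays the role of one row of Table 1: a fixed-order vector of feature-word counts that can be compared across data packets.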
S220: Calculate the similarity between data packets
The similarity between two data packets can be calculated using Formula 1 below:
【Formula 1】
$$G = \frac{N_1 M_1 + N_2 M_2 + \dots + N_m M_m}{\sqrt{N_1^2 + N_2^2 + \dots + N_m^2} \times \sqrt{M_1^2 + M_2^2 + \dots + M_m^2}}$$
where G is the similarity between the two data packets, in the range [0, 1]; N1, N2 … Nm and M1, M2 … Mm are the numbers of times each feature word occurs in the two data packets being compared, respectively. In this embodiment, the predetermined threshold may be 0.5; that is, when G is greater than 0.5 the two data packets are similar, and when G is greater than 0.85 the two data packets are highly similar.
Taking Table 1 as an example, the words that occur in Text 1 are C1, C2, C3, C4 … Cm, and they occur N1, N2, N3 … Nm times respectively; the words that occur in Text 2 are C1, C2, C3, C4 … Cm, and they occur M1, M2, M3 … Mm times respectively. Here C1 denotes the same word in both texts, and N1 and M1 are its counts in Text 1 and Text 2 respectively. The similarity between Text 1 and Text 2 can then be calculated with the above formula.
Since the similarity score between Text 1 and Text 2 is 0.97, which is greater than 0.85, the data packet containing Text 1 and the data packet containing Text 2 can be judged to be highly similar. If the scarcity of Text 1 needs to be assessed, the data packet containing Text 2 can be used as a comparison data packet. Likewise, the similarity between Text 1 and Text 3 can be calculated; the resulting similarity score is 0.4, which is less than 0.5. This means that the similarity between the data packet containing Text 1 and the data packet containing Text 3 is not high, so the data packet containing Text 3 cannot be used as a comparison data packet. Of course, when the scarcity of Text 2 needs to be assessed, the similarity between Text 2 and the other texts is calculated in the same way as for Text 1, and likewise when the scarcity of Text 3 needs to be assessed.
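The selection of comparison data packets can be sketched as follows, assuming Formula 1 is the cosine similarity of the two count vectors. The function names are hypothetical, and the count vectors are taken from Table 1 as printed in this text; scores computed from them may not match the rounded figures quoted above, so only the threshold comparisons are relied on.

```python
import math


def cosine_similarity(n, m):
    """Formula 1: dot product divided by the product of the vector lengths."""
    if len(n) != len(m):
        raise ValueError("both vectors must cover the same feature words")
    dot = sum(a * b for a, b in zip(n, m))
    norm = math.sqrt(sum(a * a for a in n)) * math.sqrt(sum(b * b for b in m))
    return dot / norm if norm else 0.0


def select_comparison_packets(target, candidates, threshold=0.5):
    """Keep the candidates whose similarity to the target exceeds the threshold."""
    selected = {}
    for name, counts in candidates.items():
        score = cosine_similarity(target, counts)
        if score > threshold:
            selected[name] = score
    return selected


# Feature-word counts as printed in Table 1.
TABLE_1 = {
    "text 1": [2, 4, 1, 2, 6, 2, 2, 7, 0],
    "text 2": [1, 5, 4, 3, 8, 2, 2, 5, 1],
    "text 3": [3, 1, 4, 0, 1, 8, 7, 2, 3],
}
target = TABLE_1["text 1"]
candidates = {name: v for name, v in TABLE_1.items() if name != "text 1"}
comparison = select_comparison_packets(target, candidates)
```

With the 0.5 threshold, only Text 2 is kept as a comparison data packet for Text 1, which agrees with the conclusion drawn in the text.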
S300: Calculate the scarcity of the data packet to be assessed
To calculate scarcity, a data packet to be assessed must be selected; it can be determined according to the actual situation. The more data of the same kind exists, the lower the scarcity; the less data of the same kind exists, the higher the scarcity.
For the designated data packet to be assessed, its scarcity can be assessed using Formula 2 below:
【Formula 2】
$$f = \frac{2e^{-y/x}}{1 + e^{-y/x}}$$
where f is the scarcity score of the data packet to be assessed, with values in the range [0, 1]; y is the total number of data records in all data packets other than the data packet to be assessed; and x is the number of data records in the data packet to be assessed. What counts as one data record can be determined by a preset rule; for example, a record may be a single sentence or a passage of text about a particular event. When f = 0, the data in the data packet to be assessed is not scarce; when f = 1, the data in the data packet to be assessed does not exist in the other comparison data packets and is very scarce.
The assessment of scarcity is illustrated below with an example.
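The behaviour of the scarcity score at its two extremes can be checked with a short sketch. The function name `scarcity_score` and the input validation are assumptions; the expression itself follows the formula stated in the claims.

```python
import math


def scarcity_score(x, y):
    """Scarcity f = 2e^(-y/x) / (1 + e^(-y/x)).

    x: number of data records in the data packet to be assessed (must be > 0)
    y: total number of data records in the other data packets (must be >= 0)
    """
    if x <= 0:
        raise ValueError("x must be positive")
    if y < 0:
        raise ValueError("y must be non-negative")
    t = math.exp(-y / x)
    return 2 * t / (1 + t)


no_similar_data = scarcity_score(100, 0)       # nothing comparable exists
swamped = scarcity_score(1, 1000)              # far more comparable data elsewhere
equal_sizes = scarcity_score(1000, 1000)       # as much comparable data as own data
```

When y = 0 the score is exactly 1, and the score falls towards 0 as y/x grows, so f = 0 is approached only in the limit. Algebraically the same expression equals 1 - tanh(y / (2x)), which makes the monotonic decrease easy to see.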
Example
First, based on the specified content "informatization-related", related data packets 1 and 2 are crawled from two data supply platforms 1 and 2 with a Python program, and it is decided that the scarcity of data packet 1 is to be assessed.
Then, following step S200, a term-document matrix is built for the data packets of the two data supply platforms, as shown in Table 2 below:
Table 2
Feature word  Data  Field  Information  Microblog  Machine  Society  Time  Public sentiment  Study  Collection
Packet 1      1     2      3            2          1        1        1     1                 1      1
Packet 2      1     1      1            2          0        0        1     3                 2      5
Next, the similarity between the two data packets is calculated using Formula 1 above. The resulting similarity score is 0.63, which shows that the two data packets are similar.
Statistics show that data packets 1 and 2 contain 6,000,000 data records in total, of which data packet 1 contains 5,000,000 and data packet 2 contains 1,000,000. With x = 5,000,000 and y = 1,000,000, Formula 2 above gives the scarcity of data packet 1 as:
$$f = \frac{2e^{-1000000/5000000}}{1 + e^{-1000000/5000000}} = \frac{2e^{-0.2}}{1 + e^{-0.2}} \approx 0.90$$
This indicates that data packet 1 is very scarce.
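The numbers in this example can be reproduced with a short script. This is a sketch under stated assumptions: the similarity is the cosine of the two count vectors, the count vectors are read from Table 2 with ten feature words per packet, and the variable and function names are illustrative only.

```python
import math

# Feature-word counts read from Table 2 (ten feature words per packet).
packet_1_counts = [1, 2, 3, 2, 1, 1, 1, 1, 1, 1]
packet_2_counts = [1, 1, 1, 2, 0, 0, 1, 3, 2, 5]

# Record counts stated in the example.
packet_1_records = 5_000_000   # x: records in the packet to be assessed
packet_2_records = 1_000_000   # y: records in the comparison packet


def cosine(n, m):
    """Cosine similarity of two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(n, m))
    return dot / (math.sqrt(sum(a * a for a in n)) * math.sqrt(sum(b * b for b in m)))


def scarcity(x, y):
    """Scarcity score f = 2e^(-y/x) / (1 + e^(-y/x))."""
    t = math.exp(-y / x)
    return 2 * t / (1 + t)


similarity = cosine(packet_1_counts, packet_2_counts)
is_comparison_packet = similarity > 0.5          # predetermined threshold
score = scarcity(packet_1_records, packet_2_records)

print(f"similarity = {similarity:.2f}, scarcity = {score:.2f}")
```

The similarity rounds to 0.63, so packet 2 qualifies as a comparison data packet, and the scarcity of packet 1 rounds to 0.90.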
【Embodiment 2】Data packet scarcity assessment system
Fig. 2 is a schematic structural diagram of the data packet scarcity assessment system provided by an embodiment of the present invention. As shown in Fig. 2, the data packet scarcity assessment system provided by this embodiment includes a data acquisition module, a similarity assessment module and a scarcity assessment module.
The data acquisition module is used to obtain multiple related data packets related to specified content. The multiple related data packets can be obtained by crawling network data from multiple data supply platforms over the network. For example, based on the specified content, related data packets on the major data trading websites can be crawled with a Python program, and the crawled data is stored in a MySQL relational database. A data packet may contain files of various data types, such as JSON, image, video and audio files. The crawling process is as follows: after the user enters a URL, the server host is located through a DNS server and a request is sent to the server; after parsing the request, the server sends files such as HTML, JS and CSS to the user's browser, which parses and renders them. The web page the user sees is therefore essentially made up of HTML code, and this is the content the crawler retrieves. By analyzing and filtering the HTML code, resources such as images, text and uploaded attachments can be crawled, so that content related to data packets, such as data packet descriptions, can be crawled from the major data trading websites. In this way, multiple related data packets with the same subject content can be obtained. Of course, data packets that were obtained before the assessment operation may also be selected for assessment, rather than being crawled in real time during the assessment operation.
The similarity assessment module is used to determine a data packet to be assessed, determine the similarity between the data packet to be assessed and the other data packets, and select the data packets whose similarity to the data packet to be assessed is higher than a predetermined threshold as comparison data packets. The similarity assessment module may include: a term-document matrix building unit, which reads the text in the data packet to be assessed and in the comparison data packets into an R program, splits the text in each data packet into individual words with a word segmentation tool or user-defined segmentation rules, determines the feature words, counts the frequency with which each feature word occurs, and builds a term-document matrix; and a similarity calculation unit, which calculates the similarity between the data packet to be assessed and a comparison data packet based on the following formula:
$$G = \frac{N_1 M_1 + N_2 M_2 + \dots + N_m M_m}{\sqrt{N_1^2 + N_2^2 + \dots + N_m^2} \times \sqrt{M_1^2 + M_2^2 + \dots + M_m^2}}$$
where G is the similarity between the data packet to be assessed and another data packet, in the range [0, 1]; N1, N2 … Nm and M1, M2 … Mm are the numbers of times each feature word occurs in the data packet to be assessed and in the other data packet, respectively. The predetermined threshold may be 0.5: when G is greater than 0.5, the data packet to be assessed and the comparison data packet are similar; when G is greater than 0.85, they are highly similar.
The scarcity assessment module is used to determine the scarcity of the data packet to be assessed using a preset processing method, specifically by assessing the scarcity of the data packet to be assessed with the following formula:
$$f = \frac{2e^{-y/x}}{1 + e^{-y/x}}$$
where f is the scarcity score of the data packet to be assessed, with values in the range [0, 1]; y is the total number of data records in all data packets other than the data packet to be assessed; and x is the number of data records in the data packet to be assessed. When f = 0, the data in the data packet to be assessed is not scarce; when f = 1, the data in the data packet to be assessed does not exist in the other comparison data packets and is very scarce.
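One way the three modules could fit together is sketched below. All class and attribute names are hypothetical, the data acquisition module is reduced to a list of already-obtained packets, and y is assumed to be summed over the comparison data packets only, which is one reading of the definition above.

```python
import math
from dataclasses import dataclass


@dataclass
class DataPacket:
    name: str
    term_counts: list    # feature-word frequencies, same word order for every packet
    record_count: int    # number of data records in the packet


class SimilarityAssessmentModule:
    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def similarity(self, a, b):
        """Cosine similarity of the two packets' feature-word counts."""
        dot = sum(p * q for p, q in zip(a.term_counts, b.term_counts))
        norm = math.sqrt(sum(p * p for p in a.term_counts)) * math.sqrt(
            sum(q * q for q in b.term_counts)
        )
        return dot / norm if norm else 0.0

    def comparison_packets(self, target, others):
        return [p for p in others if self.similarity(target, p) > self.threshold]


class ScarcityAssessmentModule:
    def score(self, target, comparison):
        y = sum(p.record_count for p in comparison)   # assumption: comparison packets only
        t = math.exp(-y / target.record_count)
        return 2 * t / (1 + t)


class ScarcityAssessmentSystem:
    """Wires the modules together; `packets` stands in for the data acquisition module."""

    def __init__(self, packets):
        self.packets = packets
        self.similarity_module = SimilarityAssessmentModule()
        self.scarcity_module = ScarcityAssessmentModule()

    def assess(self, target_name):
        target = next(p for p in self.packets if p.name == target_name)
        others = [p for p in self.packets if p is not target]
        comparison = self.similarity_module.comparison_packets(target, others)
        return self.scarcity_module.score(target, comparison), comparison


# Packets 1 and 2 use the Table 2 figures; packet 3 is a made-up dissimilar packet.
system = ScarcityAssessmentSystem([
    DataPacket("packet 1", [1, 2, 3, 2, 1, 1, 1, 1, 1, 1], 5_000_000),
    DataPacket("packet 2", [1, 1, 1, 2, 0, 0, 1, 3, 2, 5], 1_000_000),
    DataPacket("packet 3", [0, 0, 0, 0, 5, 0, 0, 0, 0, 0], 9_000_000),
])
score, comparison = system.assess("packet 1")
```

Because packet 3 falls below the similarity threshold, its records do not count towards y, and the score for packet 1 stays at about 0.90.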
It should be noted that assessing the value of a data file involves many factors, and the final valuation of a data file can only be reached by considering all of them. The present invention addresses only one aspect, the assessment of data scarcity, and thereby provides one reference basis for the valuation of data files.
In summary, the present invention introduces the scarcity analysis method from economics to value data assets, so as to better serve data market activity and promote data market transactions and the rapid implementation of data projects.
Those skilled in the art should appreciate that embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.
Although preferred embodiments of the application have been described, those skilled in the art, once aware of the basic inventive concept, can make further changes and modifications to these embodiments. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the application. Obviously, those skilled in the art can make various changes and variations to the embodiments of the application without departing from their spirit and scope. Thus, if such modifications and variations fall within the scope of the claims of the application and their technical equivalents, the application is intended to cover them as well.

Claims (10)

1. A data packet scarcity assessment method, characterized by comprising:
S100: obtaining multiple related data packets related to specified content;
S200: determining a data packet to be assessed, determining the similarity between the data packet to be assessed and the other data packets, and selecting the data packets whose similarity to the data packet to be assessed is higher than a predetermined threshold as comparison data packets;
S300: determining the scarcity of the data packet to be assessed using a preset processing method, specifically by assessing the scarcity of the data packet to be assessed with the following formula:
$$f = \frac{2e^{-y/x}}{1 + e^{-y/x}}$$
where f is the scarcity score of the data packet to be assessed, with values in the range [0, 1]; y is the total number of data records in all data packets other than the data packet to be assessed; and x is the number of data records in the data packet to be assessed.
2. The method according to claim 1, characterized in that in step S200 a text similarity algorithm is used to calculate the similarity between the data packet to be assessed and the other data packets, which specifically includes:
S210: reading the text in the data packet to be assessed and in the comparison data packets into an R program, splitting the text in each data packet into individual words with a word segmentation tool or user-defined segmentation rules, determining the feature words, counting the frequency with which each feature word occurs, and building a term-document matrix;
S220: calculating the similarity between the data packet to be assessed and a comparison data packet based on the following formula:
$$G = \frac{N_1 M_1 + N_2 M_2 + \dots + N_m M_m}{\sqrt{N_1^2 + N_2^2 + \dots + N_m^2} \times \sqrt{M_1^2 + M_2^2 + \dots + M_m^2}}$$
where G is the similarity between the data packet to be assessed and another data packet, in the range [0, 1]; N1, N2 … Nm and M1, M2 … Mm are the numbers of times each feature word occurs in the data packet to be assessed and in the other data packet, respectively.
3. The method according to claim 2, characterized in that when G is greater than 0.5, the data packet to be assessed and the comparison data packet are similar; when G is greater than 0.85, the data packet to be assessed and the comparison data packet are highly similar.
4. The method according to claim 1, characterized in that when f = 0, the data in the data packet to be assessed is not scarce; when f = 1, the data in the data packet to be assessed does not exist in the other comparison data packets and is very scarce.
5. The method according to claim 1, characterized in that the multiple related data packets related to the specified content are obtained by crawling network data from multiple data platforms on the Internet.
6. A data packet scarcity assessment system, characterized by comprising:
a data acquisition module, which obtains multiple related data packets related to specified content;
a similarity assessment module, which determines a data packet to be assessed, determines the similarity between the data packet to be assessed and the other data packets, and selects the data packets whose similarity to the data packet to be assessed is higher than a predetermined threshold as comparison data packets;
a scarcity assessment module, which determines the scarcity of the data packet to be assessed using a preset processing method, specifically by assessing the scarcity of the data packet to be assessed with the following formula:
$$f = \frac{2e^{-y/x}}{1 + e^{-y/x}}$$
where f is the scarcity score of the data packet to be assessed, with values in the range [0, 1]; y is the total number of data records in all data packets other than the data packet to be assessed; and x is the number of data records in the data packet to be assessed.
7. The system according to claim 6, characterized in that the similarity assessment module includes:
a term-document matrix building unit, which reads the text in the data packet to be assessed and in the comparison data packets into an R program, splits the text in each data packet into individual words with a word segmentation tool or user-defined segmentation rules, determines the feature words, counts the frequency with which each feature word occurs, and builds a term-document matrix;
a similarity calculation unit, which calculates the similarity between the data packet to be assessed and a comparison data packet based on the following formula:
$$G = \frac{N_1 M_1 + N_2 M_2 + \dots + N_m M_m}{\sqrt{N_1^2 + N_2^2 + \dots + N_m^2} \times \sqrt{M_1^2 + M_2^2 + \dots + M_m^2}}$$
where G is the similarity between the data packet to be assessed and another data packet, in the range [0, 1]; N1, N2 … Nm and M1, M2 … Mm are the numbers of times each feature word occurs in the data packet to be assessed and in the other data packet, respectively.
8. The system according to claim 7, characterized in that when G is greater than 0.5, the data packet to be assessed and the comparison data packet are similar; when G is greater than 0.85, the data packet to be assessed and the comparison data packet are highly similar.
9. The system according to claim 6, characterized in that when f = 0, the data in the data packet to be assessed is not scarce; when f = 1, the data in the data packet to be assessed does not exist in the other comparison data packets and is very scarce.
10. The system according to claim 6, characterized in that the data acquisition module obtains the multiple related data packets related to the specified content by crawling network data from multiple data platforms on the Internet.
CN201610970543.5A 2016-10-28 2016-10-28 A kind of packet scarcity appraisal procedure and its system Pending CN106503228A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610970543.5A CN106503228A (en) 2016-10-28 2016-10-28 A kind of packet scarcity appraisal procedure and its system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610970543.5A CN106503228A (en) 2016-10-28 2016-10-28 A kind of packet scarcity appraisal procedure and its system

Publications (1)

Publication Number Publication Date
CN106503228A true CN106503228A (en) 2017-03-15

Family

ID=58321845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610970543.5A Pending CN106503228A (en) 2016-10-28 2016-10-28 A kind of packet scarcity appraisal procedure and its system

Country Status (1)

Country Link
CN (1) CN106503228A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109345301A (en) * 2018-09-26 2019-02-15 国信优易数据有限公司 A kind of data price-determining system and determining method
CN110766428A (en) * 2018-07-25 2020-02-07 国信优易数据有限公司 Data value evaluation system and method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102411583A (en) * 2010-09-20 2012-04-11 阿里巴巴集团控股有限公司 Method and device for matching texts
CN105488699A (en) * 2015-12-25 2016-04-13 国信优易数据有限公司 Data asset value assessment method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
科技日报: "全球首个数据资产评估模型发布", 《科技日报》 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170315

RJ01 Rejection of invention patent application after publication