CN113763097A - Method and device for updating article information - Google Patents

Method and device for updating article information

Publication number
CN113763097A
Authority
CN
China
Prior art keywords
item information
information table
time point
article
primary key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011476702.9A
Other languages
Chinese (zh)
Inventor
王安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202011476702.9A priority Critical patent/CN113763097A/en
Publication of CN113763097A publication Critical patent/CN113763097A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0623Item investigation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Probability & Statistics with Applications (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for updating article information, and relates to the technical field of computers. One embodiment of the method comprises: acquiring an item information table according to a set time point, wherein a primary key of the item information table comprises an item identifier, an attribute name and a first attribute value corresponding to the attribute name; and comparing the item information table at a first time point with the item information table at a second time point according to the primary key of the item information table to determine an update result of the item information table, wherein the first time point is the time point immediately preceding the second time point. The embodiment can store large volumes of item information while also improving the speed at which item information is updated; in addition, the method and device place lower demands on the performance of the machine's physical hardware, which saves hardware cost.

Description

Method and device for updating article information
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for updating article information.
Background
With the increasing popularity of electronic commerce, the commodity detail pages maintained by e-commerce platforms are updated almost every day: for example, new commodities are launched, commodities that sell poorly or have been discontinued are removed, prices of existing commodities are adjusted, new styles are added to existing commodities, and so on. E-commerce platforms generally use a graph database for data storage. Taking neo4j as an example, the data entities, attributes and relationships involved in the commodity detail page are extracted, and graph data is constructed with entities as nodes and relationships as edges, where both entities and relationships can have attributes.
In the process of implementing the invention, the inventor found that the prior art has at least the following problems. When the timeliness requirement on data is high and the data is updated at high frequency, the complexity of the relationships in graph data makes it difficult to determine what has changed, so the data is often updated in full; when the data volume reaches the billion level, the update speed cannot meet the high-frequency update requirement. In addition, a graph database places high performance demands on the machine, often requiring a large memory and a multi-core processor, which increases hardware cost.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for updating article information, which can not only implement storage of article information with large data volume, but also improve the updating speed of article information; on the other hand, the method and the device for updating the article information provided by the embodiment of the invention have lower requirements on the performance of physical devices of the machine, and save the hardware cost.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a method for updating item information, including: acquiring an item information table according to a set time point, wherein a primary key of the item information table comprises an item identifier, an attribute name and a first attribute value corresponding to the attribute name; and comparing the item information table at a first time point with the item information table at a second time point according to the primary key of the item information table to determine an update result of the item information table, wherein the first time point is the time point immediately preceding the second time point.
Optionally, in the method for updating the item information, the primary key of the item information table further comprises one or more of the following: an attribute group to which the attribute name belongs, an attribute alias of the attribute name, and a second attribute value corresponding to the attribute alias.
Optionally, the method for updating the item information further includes: for a primary key appearing only in the item information table at the first time point, determining that the item information represented by the primary key has been deleted; and for a primary key appearing only in the item information table at the second time point, determining that the item information represented by the primary key is item information newly added at the second time point.
Optionally, the method for updating the item information further includes: associating the item information table at the first time point with the item information table at the second time point, using the primary key of the item information table as the join key.
Optionally, the method for updating the item information further includes: recording the update result in an item information statistical table, wherein the item information statistical table comprises a primary key column and a valid state column, the primary key of the item information statistical table is the same as that of the item information table, and the valid state column represents the update result of the item information table.
Optionally, the method for updating the item information further includes: if the update result of the item information table indicates that the item information represented by the primary key has been deleted, marking the valid state corresponding to the primary key as invalid in the item information statistical table; and if the update result of the item information table indicates that the item information represented by the primary key is newly added item information, adding a new record to the item information statistical table, wherein the valid state corresponding to the primary key of the new record is valid.
Optionally, in the method for updating the item information, the item information statistical table further includes a state time column, and the state time column indicates the time point at which the valid state occurred.
Optionally, the method for updating the item information further includes: storing the item information table in a Hive database, and storing the item information statistical table in an Elasticsearch engine.
To achieve the above object, according to a second aspect of embodiments of the present invention, there is provided an apparatus for updating item information, including: an acquisition module, configured to acquire an item information table according to a set time point, wherein a primary key of the item information table comprises an item identifier, an attribute name and a first attribute value corresponding to the attribute name; and an update result determining module, configured to compare the item information table at a first time point with the item information table at a second time point according to the primary key of the item information table to determine the update result of the item information table, wherein the first time point is earlier than the second time point.
Optionally, the apparatus for updating item information further includes a statistics module, configured to record the update result in an item information statistical table, where the item information statistical table includes a primary key column and a valid state column, the primary key of the item information statistical table is the same as the primary key of the item information table, and the valid state column indicates the update result of the item information table.
To achieve the above object, according to a third aspect of embodiments of the present invention, there is provided a computer readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements any one of the methods for updating item information described above.
One embodiment of the above invention has the following advantages or benefits: because item information tables are obtained at set time points and the tables at different time points are associated through the same primary key to obtain the item information update result, large volumes of item information can be stored and the speed at which item information is updated can be improved; in addition, the method and device for updating item information provided by the embodiment of the invention place lower demands on the performance of the machine's physical hardware, which saves hardware cost.
Further effects of the above-mentioned non-conventional optional implementations will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 illustrates a data flow in an item information storage and update process according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of a flow of item information updates according to an embodiment of the invention;
FIGS. 3A and 3B are schematic diagrams illustrating an item information table according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a process for determining an update to item information according to an embodiment of the invention;
FIG. 5 is a schematic diagram of a process for updating the validity status of item information according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the main modules of an apparatus for item information update according to an embodiment of the present invention;
FIG. 7 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
With the increasing popularity of electronic commerce, the detail pages of commodities maintained by e-commerce platforms are updated almost every day, for example, new commodities are pushed out, commodities with undesirable sales or stopped production are deleted, prices of existing commodities are adjusted, new styles are added to existing commodities, and the like. The updating of the item detail page will result in the updating of the item information concerned. However, although the product data provided by the e-commerce platform is enormous, the amount of product information that actually changes every day is small compared to the entire amount of product information. There is thus a need for a method that can efficiently update information of goods (articles) so that the article information that needs to be updated can be quickly located among a large amount of article information.
Fig. 1 shows the data flow in an item information storage and update process according to an embodiment of the present invention. As shown in fig. 1, the item detail page (i.e., the "merchant detail page" shown in fig. 1) contains a large amount of item information, such as an item identifier (e.g., a Stock Keeping Unit, SKU), an item attribute name (e.g., color), and the attribute value (e.g., black) corresponding to the item attribute name, so the required item information can be extracted from the item detail page. The entities involved in the item information, and the relationships between those entities, are then obtained by semantic analysis or similar means; each entity in the graph is centered on a commodity. For example, one entity may be "sku: 65972622357", another entity may be "color", and yet another may be "sole material". Further, the edge between the entity "sku: 65972622357" and the entity "color" has the edge attribute "black", indicating the relationship "the color of the sports shoe is black"; the edge between the entity "sku: 65972622357" and the entity "sole material" has the edge attribute "rubber", indicating the relationship "the sole material of the sports shoe is rubber". After the entities, relationships and other information are obtained, a primary key of the item information is determined according to the item identifier (e.g., SKU), the attribute name (e.g., color) and the first attribute value (e.g., black) corresponding to the attribute name. On the one hand, the entities, relationships and other information are stored in a database such as neo4j; on the other hand, an item information table is constructed based on the primary key so that item information at different time points can be compared. Specifically, the item information table may be stored in a Hive database.
Finally, based on the item information table, the item information update is obtained and synchronized to another database, for example an Elasticsearch (ES) engine.
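The step from graph edges to table rows described above can be sketched as follows. This is only an illustration under stated assumptions: the `Triple` type, the `triples_to_rows` helper, the underscore separator and the `"source"` field are hypothetical names chosen here, not part of the described embodiment.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One graph edge: item entity -> attribute entity, with the value on the edge."""
    item_id: str         # item identifier, e.g. a SKU
    attribute_name: str  # attribute entity, e.g. "color"
    value: str           # edge attribute, e.g. "black"


def triples_to_rows(triples):
    """Turn graph triples into (primary_key, other_fields) rows of an item information table.

    The key joins item identifier, attribute name and first attribute value
    with "_" (the separator is an assumption made for this sketch).
    """
    rows = []
    for t in triples:
        primary_key = "_".join([t.item_id, t.attribute_name, t.value])
        rows.append((primary_key, {"source": "detail_page"}))
    return rows


triples = [
    Triple("65972622357", "color", "black"),
    Triple("65972622357", "sole material", "rubber"),
]
rows = triples_to_rows(triples)
```

Each row carries everything needed to compare two snapshots by key, so the graph database and the table can be maintained side by side.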
An article information updating method according to an embodiment of the present invention will be described below with reference to fig. 2, fig. 3A, and fig. 3B, where fig. 2 is a schematic diagram of a flow of article information updating according to an embodiment of the present invention, and fig. 3A and 3B show schematic diagrams of an article information table according to an embodiment of the present invention.
As shown in fig. 2, in S201, item information is acquired at a set time point. For example, at a fixed time each day, the item information is extracted from the item detail pages so that an item information table can be generated in a subsequent step; each item information table thus contains the information on all items included in the item detail pages over the past 24 hours.
In S202, the primary key of the item information table is determined from the acquired item information. Specifically, the primary key includes an item identifier, an attribute name, and a first attribute value corresponding to the attribute name. As shown in fig. 3A, in the item information collected on December 1, the primary key "65972622357_material_sole material_rubber" has the following meaning: "65972622357" is the item identifier, such as the SKU of a commodity; "sole material" is the attribute name; and "rubber" is the attribute value corresponding to the attribute name "sole material". Further, the primary key of the item information table may also include one or more of the following: the attribute group to which the attribute name belongs, an attribute alias of the attribute name, and a second attribute value corresponding to the attribute alias. As shown in fig. 3A, in the item information collected on December 1, the primary key "60375970464_material_fabric_polyester fiber_fabric 2_polyester" has the following meaning: "60375970464" is the item identifier, for example the SKU of a commodity; "fabric" is the attribute name; "polyester fiber" is the attribute value corresponding to the attribute name "fabric"; "material" is the attribute group to which the attribute name "fabric" belongs; "fabric 2" is an attribute alias of the attribute name "fabric"; and "polyester" is the attribute value corresponding to the attribute alias "fabric 2". Because the information of the item can be read directly from the primary key, a separate lookup of the item information corresponding to the primary key is unnecessary.
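A minimal sketch of how such a composite primary key might be assembled is given below. The function name `build_primary_key`, its parameters and the underscore separator are assumptions made for illustration; the component order (identifier, group, name, value, alias, alias value) is inferred from the two example keys above.

```python
def build_primary_key(item_id, attribute_name, attribute_value,
                      attribute_group=None, attribute_alias=None,
                      alias_value=None, sep="_"):
    """Assemble a composite primary key for one item information record.

    Required parts: item identifier, attribute name, first attribute value.
    Optional parts: attribute group, attribute alias, second attribute value.
    """
    parts = [item_id]
    if attribute_group is not None:
        parts.append(attribute_group)
    parts += [attribute_name, attribute_value]
    # The alias and its value only make sense together, so add them as a pair.
    if attribute_alias is not None and alias_value is not None:
        parts += [attribute_alias, alias_value]
    return sep.join(parts)


key_sole = build_primary_key("65972622357", "sole material", "rubber",
                             attribute_group="material")
key_fabric = build_primary_key("60375970464", "fabric", "polyester fiber",
                               attribute_group="material",
                               attribute_alias="fabric 2",
                               alias_value="polyester")
```

Because every attribute of the record is encoded in the key itself, two snapshots can be compared on the key column alone.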
In S203, an item information table is constructed. As described above, the item information table obtained at a certain time point (also referred to as the second time point) includes all the item information accumulated since the previous time point (also referred to as the first time point). Once the primary key of the item information table has been determined, the item information table is obtained. As shown in fig. 3A, the item information table built from the item information collected on December 1 includes two columns: "1201. primary key" and "1201. other fields". Here, "1201. primary key" holds the primary key of the item information table; as described above, in the present invention the main information of the item is obtained from the primary key. "1201. other fields" may be used to store other information about the item, such as the specific source of the item information. That is, the "other fields" column of the item information table is preferably used to store information unrelated to the item name, attribute values, and the like. Preferably, the item information table is stored in a Hive database.
In S204, it is checked whether the full data set of item information at the current time point is valid. Specifically, after the item information table at the current time point is constructed, its data size is determined. If the data size is far smaller or far larger than the normal amount (for example, compared with the data size of the item information table at another time point), an abnormal operation very likely occurred while the item information table was being constructed. In an example not shown in fig. 2, if the data size of the item information at the current time point is far smaller or far larger than normal, the abnormal operation may lie in the step of extracting the item information or in the step of determining the primary key of the item information table, and the preceding steps must be traced back one by one to find where the full data set became abnormal. Checking the data size of the item information table at the current time point avoids the following situation: if the current day's data size is abnormal, then when it is compared with the previous day's data, a large amount of data is wrongly treated as data to be deleted or marked as invalid, which affects the overall storage of the item information. If the data size is comparable to the normal amount, the item information update flow proceeds to S205.
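The size check in S204 can be sketched as a simple ratio test against a reference snapshot. The text does not specify what "far smaller or far larger" means, so the `tolerance` parameter and its default of 0.5 are purely illustrative assumptions, as is the function name.

```python
def is_size_plausible(current_rows, reference_rows, tolerance=0.5):
    """Return True if the current table size is close enough to a reference size.

    current_rows:   row count of the table at the current time point
    reference_rows: row count of a table at another time point
    tolerance:      allowed relative deviation (an assumed threshold)
    """
    if reference_rows <= 0:
        # No usable baseline: treat as abnormal so the pipeline stops for review.
        return False
    ratio = current_rows / reference_rows
    return (1 - tolerance) <= ratio <= (1 + tolerance)


ok_small_change = is_size_plausible(1000, 980)
ok_collapse = is_size_plausible(10, 1000)
ok_explosion = is_size_plausible(5000, 1000)
```

Only when the check passes would the flow continue to the comparison step; otherwise the earlier extraction and key-building steps would be inspected first.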
In S205, the item information tables at the current time point and the previous time point are compared. In S206, the newly added item information is obtained. In S207, the item information to be deleted is obtained. Specifically, according to the primary key of the item information table, the item information table at a first time point is compared with the item information table at a second time point to determine an update result of the item information table, where the first time point is earlier than the second time point; for example, the second time point is the current time point, and the first time point is the time point immediately preceding it. The comparison process is described below using fig. 3A and 3B as an example. In fig. 3A and 3B, the first time point is December 1 and the second time point is December 2, and the item information table of December 1 and the item information table of December 2 are associated with each other using the primary key of the item information table as the join key. As shown by the dotted-line frame in fig. 3A, the item information table of December 1 includes item information records with the primary keys "60375970464_material_fabric_polyester fiber_fabric 2_polyester" and "65432149846_main body parameter_color_black", whereas the item information table of December 2 contains neither of these two primary keys; it can therefore be determined that the item information corresponding to these two primary keys was deleted on December 2.
As shown by the dotted-line frame in fig. 3B, the item information table of December 2 includes item information records with the primary keys "70375970464_material_fabric_polyester fiber_fabric 2_polyester" and "75432149846_main body parameter_color_black", whereas the item information table of December 1 contains neither of these two primary keys; it can therefore be determined that the item information corresponding to these two primary keys was newly added on December 2. In short, for a primary key appearing only in the item information table at the first time point, it is determined that the item information represented by the primary key has been deleted; and for a primary key appearing only in the item information table at the second time point, it is determined that the item information represented by the primary key is item information newly added at the second time point.
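The comparison rule can be illustrated with plain set operations on the primary key columns of the two snapshots. This is a sketch, not the embodiment's implementation: in a Hive deployment the same result would presumably come from a join on the primary key, and the function name `diff_item_tables` and the in-memory lists are assumptions. The keys follow the examples of fig. 3A and 3B.

```python
def diff_item_tables(first_keys, second_keys):
    """Compare two snapshots by primary key.

    Keys only in the first snapshot are deleted; keys only in the second
    snapshot are newly added; keys in both are unchanged.
    """
    first, second = set(first_keys), set(second_keys)
    return {
        "deleted": sorted(first - second),
        "added": sorted(second - first),
        "unchanged": sorted(first & second),
    }


DEC_1 = [
    "65972622357_material_sole material_rubber",
    "75972622357_material_sole material_rubber",
    "60375970464_material_fabric_polyester fiber_fabric 2_polyester",
    "65432149846_main body parameter_color_black",
]
DEC_2 = [
    "65972622357_material_sole material_rubber",
    "75972622357_material_sole material_rubber",
    "70375970464_material_fabric_polyester fiber_fabric 2_polyester",
    "75432149846_main body parameter_color_black",
]
result = diff_item_tables(DEC_1, DEC_2)
```

Only the "deleted" and "added" lists need to be propagated downstream, which is what keeps the update incremental.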
In S208, the item information statistical table is updated. Table 1 shows an example of an item information statistical table.
Table 1: Item information statistical table
Primary key | Valid state | State time
65972622357_material_sole material_rubber | Valid | December 2 or earlier
60375970464_material_fabric_polyester fiber_fabric 2_polyester | Invalid | December 2
65432149846_main body parameter_color_black | Invalid | December 2
75972622357_material_sole material_rubber | Valid | December 2 or earlier
70375970464_material_fabric_polyester fiber_fabric 2_polyester | Valid | December 2
75432149846_main body parameter_color_black | Valid | December 2
As shown in table 1, the item information statistical table includes a primary key column and a valid state column; the primary key of the item information statistical table is the same as the primary key of the item information table, and the valid state column indicates the update result of the item information table. Preferably, a state time column is also included. Specifically, the item information statistical table records the valid state (i.e., valid or invalid) of the item, and the state time records the time point at which that valid state occurred. According to fig. 3A and 3B, once the item information table of December 2 has been constructed and compared with that of the previous day, the item update record of December 2 is known: the items (information) whose primary keys are "60375970464_material_fabric_polyester fiber_fabric 2_polyester" and "65432149846_main body parameter_color_black" were deleted, and the items (information) whose primary keys are "70375970464_material_fabric_polyester fiber_fabric 2_polyester" and "75432149846_main body parameter_color_black" were newly added. Therefore, in table 1, the valid state of the items (information) whose primary keys are "60375970464_material_fabric_polyester fiber_fabric 2_polyester" and "65432149846_main body parameter_color_black" is updated to "invalid", and their state time is recorded as December 2; the items (information) whose primary keys are "70375970464_material_fabric_polyester fiber_fabric 2_polyester" and "75432149846_main body parameter_color_black" are added as new records, with the valid state recorded as "valid" and the state time recorded as December 2.
On the other hand, for the two pieces of item information that were not updated on December 2 (primary keys "65972622357_material_sole material_rubber" and "75972622357_material_sole material_rubber"), the valid state remains "valid" in the item information statistical table of table 1, and the state time may be recorded as December 2 to indicate the latest query time, or as a time before December 2, for example the time when the two pieces of item information were first created. The state time is maintained because, when a historical state query for specific item information is received, the item information statistical table shows both whether the item is valid and the last time point at which that valid state was confirmed. The query range in the full database (for example, neo4j shown in fig. 1) or in the item information table at that time point is thereby narrowed, so that further information on the item can be retrieved more efficiently. In one embodiment, once the update result of the item information at the current time point relative to the previous time point has been obtained, the difference data (the item information to be added and the item information to be deleted) can be imported into the item information statistical table by multiple tasks in parallel. Preferably, the item information statistical table is stored in an Elasticsearch engine (as an ES mapping).
In summary, if the update result of the item information table indicates that the item information represented by the primary key has been deleted, the valid state corresponding to the primary key is marked as invalid in the item information statistical table; and if the update result of the item information table indicates that the item information represented by the primary key is newly added item information, a new record is added to the item information statistical table, with the valid state corresponding to the primary key of the new record set to valid.
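The rule summarized above can be sketched with an in-memory dictionary standing in for the item information statistical table. The dictionary layout, the field names `valid` and `state_time`, and the function `apply_update` are assumptions made for this sketch; an actual deployment would write to the statistics store (Elasticsearch in the preferred embodiment) instead.

```python
def apply_update(stats, deleted, added, time_point):
    """Apply one comparison result to the statistical table.

    stats:      dict mapping primary key -> {"valid": bool, "state_time": str}
    deleted:    primary keys found only in the earlier snapshot
    added:      primary keys found only in the later snapshot
    time_point: label of the later snapshot, recorded as the state time
    """
    for key in deleted:
        # Keep the record but mark it invalid, so its history stays queryable.
        entry = stats.setdefault(key, {})
        entry["valid"] = False
        entry["state_time"] = time_point
    for key in added:
        stats[key] = {"valid": True, "state_time": time_point}
    return stats


stats = {
    "65972622357_material_sole material_rubber": {"valid": True, "state_time": "12-01"},
    "65432149846_main body parameter_color_black": {"valid": True, "state_time": "12-01"},
}
apply_update(
    stats,
    deleted=["65432149846_main body parameter_color_black"],
    added=["75432149846_main body parameter_color_black"],
    time_point="12-02",
)
```

Records that appear in neither list are left untouched here, which corresponds to the option of keeping an earlier state time for unchanged items.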
At this point, the updating of the item information is complete. Practice has verified that, for an e-commerce platform, the amount of item information that changes each day is small compared with the total amount of item information the platform holds, so the incremental update approach greatly reduces database update time and improves working efficiency. Taking item information records at the hundred-million level as an example, when the fully updated database runs on 3 physical machines, each with a 32-core CPU, 200 GB of memory and a 2 TB hard disk, a full update takes about 10 hours. When the incrementally updated ES cluster is configured with 4 data nodes, an 8 GB JVM and 16 GB of memory, an incremental update takes about 25 minutes; by comparison, the incremental update method provided by the embodiment of the invention improves efficiency by nearly 60 times.
Fig. 4 is a schematic diagram of a process of determining an update condition of the article information according to an embodiment of the present invention.
In S401, an item information table is obtained at a set time point, where a primary key of the item information table includes an item identifier, an attribute name, and a first attribute value corresponding to the attribute name;
In S402, the item information table at a first time point is compared with the item information table at a second time point according to the primary key of the item information table to determine an update result of the item information table, where the first time point is the time point immediately preceding the second time point. Specifically, for a primary key appearing only in the item information table at the first time point, it is determined that the item information represented by the primary key has been deleted; and for a primary key appearing only in the item information table at the second time point, it is determined that the item information represented by the primary key is item information newly added at the second time point.
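The comparison in S401 and S402 can be sketched with set differences over primary keys. This is only an illustration under stated assumptions: each snapshot is reduced to a set of key strings, the key is built by joining the item identifier, attribute name, and attribute value with underscores, and the names `make_primary_key`, `compare_snapshots`, and `UpdateResult` are hypothetical. The embodiment itself associates Hive tables by primary key rather than using in-memory sets.

```python
from typing import NamedTuple, Set


class UpdateResult(NamedTuple):
    deleted: Set[str]  # keys present only at the first time point
    added: Set[str]    # keys present only at the second time point


def make_primary_key(item_id: str, attr_name: str, attr_value: str) -> str:
    """Build a primary key; the underscore-joined layout is an assumption."""
    return "_".join([item_id, attr_name, attr_value])


def compare_snapshots(first: Set[str], second: Set[str]) -> UpdateResult:
    """Compare two snapshots of the item information table by primary key."""
    return UpdateResult(deleted=first - second, added=second - first)


# Hypothetical snapshots at two consecutive time points.
day1 = {
    make_primary_key("65972622357", "sole material", "rubber"),
    make_primary_key("65972622357", "color", "black"),
}
day2 = {
    make_primary_key("65972622357", "sole material", "rubber"),
    make_primary_key("65972622357", "color", "white"),
}

result = compare_snapshots(day1, day2)
print(sorted(result.deleted), sorted(result.added))
```

A changed attribute value shows up as one deleted key and one added key, because the attribute value is part of the primary key.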
Fig. 5 is a schematic diagram of a process of updating the validity status of item information according to an embodiment of the present invention.
In S501, the update result is recorded in an item information statistical table, where the item information statistical table includes a primary key column and a valid state column, the primary key of the item information statistical table is the same as the primary key of the item information table, and the valid state column indicates the update result of the item information table.
In S502, if the update result of the item information table indicates that the item information represented by a primary key has been deleted, the valid state corresponding to that primary key is marked as invalid in the item information statistical table; and if the update result of the item information table indicates that the item information represented by a primary key is newly added item information, a new record is added to the item information statistical table, and the valid state corresponding to the primary key of the new record is valid.
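Steps S501 and S502 can be sketched as follows, again with a dictionary standing in for the item information statistical table. The names `StatRecord` and `apply_update_result`, the state time field, and the date format are assumptions made for illustration; a real deployment would write to the search engine instead.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class StatRecord:
    valid: bool       # valid state column
    status_time: str  # state time column (assumed ISO date string)


def apply_update_result(
    stats: Dict[str, StatRecord],
    deleted: Iterable[str],
    added: Iterable[str],
    time_point: str,
) -> None:
    """Record an update result in the statistical table, in place."""
    for key in deleted:
        record = stats.get(key)
        if record is not None:
            # Deleted item information is kept but marked invalid.
            record.valid = False
            record.status_time = time_point
    for key in added:
        # Newly added item information gets a new record in the valid state.
        stats[key] = StatRecord(valid=True, status_time=time_point)


# Hypothetical table contents before the update of the second time point.
stats = {
    "A_color_black": StatRecord(True, "2020-12-01"),
    "A_sole material_rubber": StatRecord(True, "2020-12-01"),
}
apply_update_result(
    stats,
    deleted=["A_color_black"],
    added=["A_color_white"],
    time_point="2020-12-02",
)
```

Records that appear in neither list are left untouched, so only the difference data is written, which is the point of the incremental approach.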
Fig. 6 is a schematic diagram of the main modules of an apparatus for updating item information according to an embodiment of the present invention. The apparatus includes an acquisition module and an update result determination module. Specifically, the acquisition module is configured to acquire an item information table at a set time point, where a primary key of the item information table includes an item identifier, an attribute name, and a first attribute value corresponding to the attribute name; and the update result determination module is configured to compare the item information table at a first time point with the item information table at a second time point according to the primary key of the item information table to determine the update result of the item information table, where the first time point is earlier than the second time point. Further, the update result determination module is also configured to determine, for a primary key appearing only in the item information table at the first time point, that the item information represented by the primary key has been deleted; and to determine, for a primary key appearing only in the item information table at the second time point, that the item information represented by the primary key is item information newly added at the second time point.
In addition, the apparatus also includes a statistics module configured to record the update result in an item information statistical table, where the item information statistical table includes a primary key column and a valid state column, the primary key of the item information statistical table is the same as that of the item information table, and the valid state column indicates the update result of the item information table.
Fig. 7 is an exemplary system architecture diagram in which embodiments of the present invention may be employed.
As shown in Fig. 7, the system architecture 700 may include terminal devices 701, 702, 703, a network 704, and a server 705. The network 704 serves as a medium providing communication links between the terminal devices 701, 702, 703 and the server 705. The network 704 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 701, 702, 703 to interact with a server 705 over a network 704, to receive or send messages or the like. The terminal devices 701, 702, 703 may be terminals for presenting item detail pages to the user for the user to make an online purchase.
The terminal devices 701, 702, 703 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 705 may be a server that provides various services, such as a background server (for example only) that analyzes detailed pages of merchandise displayed on the terminal devices 701, 702, and 703 and extracts information to update the item information.
It should be noted that the item information updating method provided by the embodiment of the present invention is generally executed by the server 705.
It should be understood that the number of terminal devices, networks, and servers in fig. 7 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 8, shown is a block diagram of a computer system 800 suitable for use with a terminal device or server implementing an embodiment of the present invention. The terminal device or the server shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in Fig. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. The RAM 803 also stores various programs and data necessary for the operation of the system 800. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as necessary, so that a computer program read from it can be installed into the storage section 808 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. The computer program executes the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 801.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units (or "modules") mentioned in the embodiments of the present invention may be implemented in software or in hardware. The described units (or "modules") may also be provided in a processor, which may be described, for example, as: a processor including an acquisition unit (or "module"), an update result determination unit, and a statistics unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the acquisition unit may also be described as a "unit that collects information from a web page rendered by a client".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform operations comprising: acquiring an item information table according to a set time point, wherein a primary key of the item information table comprises an item identifier, an attribute name, and a first attribute value corresponding to the attribute name; and comparing the item information table at a first time point with the item information table at a second time point according to the primary key of the item information table to determine an update result of the item information table, wherein the first time point is the time point preceding the second time point.
According to the technical solution of the embodiment of the invention, the item information table is acquired at set time points, and the item information tables at different time points are associated through the same primary key to obtain the item information update result. This technical means not only makes it possible to store large volumes of item information but also increases the speed at which item information is updated. In addition, the method and device for updating item information provided by the embodiment of the invention place lower demands on the performance of the physical machines, which saves hardware cost.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (12)

1. A method for updating item information, comprising:
acquiring an item information table according to a set time point, wherein a primary key of the item information table comprises an item identifier, an attribute name and a first attribute value corresponding to the attribute name;
and comparing the item information table at a first time point with the item information table at a second time point according to the primary key of the item information table to determine an updating result of the item information table, wherein the first time point is a previous time point of the second time point.
2. The method of claim 1, wherein the primary key of the item information table further comprises one or more of: an attribute group corresponding to the attribute name, an attribute alias, and a second attribute value corresponding to the attribute alias.
3. The method according to claim 1, wherein the comparing the item information table at the first time point and the item information table at the second time point according to the primary key of the item information table to determine the update result of the item information table comprises:
for a primary key only appearing in the item information table at the first time point, determining that the item information represented by the primary key has been deleted;
and for a primary key only appearing in the item information table at the second time point, determining that the item information represented by the primary key is item information newly added at the second time point.
4. The method of claim 1, wherein comparing the item information table at the first point in time with the item information table at the second point in time comprises:
and using the primary key of the item information table as the join key to associate the item information table at the first time point with the item information table at the second time point.
5. The method of claim 2, further comprising:
and recording the update result in an item information statistical table, wherein the item information statistical table comprises a primary key column and a valid state column, the primary key of the item information statistical table is the same as that of the item information table, and the valid state column represents the update result of the item information table.
6. The method of claim 5, further comprising:
if the update result of the item information table indicates that the item information represented by the primary key is deleted, marking the valid state corresponding to the primary key as invalid in the item information statistical table;
and if the update result of the item information table indicates that the item information represented by the primary key is newly added item information, adding a new record to the item information statistical table, wherein the valid state corresponding to the primary key of the new record is valid.
7. The method of claim 5, wherein the item information statistics table further comprises a status time column indicating a time point when the valid status occurs.
8. The method of any of claims 1-5, further comprising:
and storing the item information table in a Hive database, and storing the item information statistical table in an Elasticsearch search engine.
9. An apparatus for updating information of an article, comprising:
an acquisition module, configured to acquire an item information table according to a set time point, wherein a primary key of the item information table comprises an item identifier, an attribute name and a first attribute value corresponding to the attribute name;
and the updating result determining module is used for comparing the item information table at a first time point with the item information table at a second time point according to the primary key of the item information table to determine the updating result of the item information table, wherein the first time point is earlier than the second time point.
10. The apparatus according to claim 9, further comprising a statistics module, configured to record the update result to an item information statistics table, where the item information statistics table includes a primary key column and a valid status column, the primary key of the item information statistics table is the same as the primary key of the item information table, and the valid status column indicates the update result of the item information table.
11. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.
12. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-8.
CN202011476702.9A 2020-12-14 2020-12-14 Method and device for updating article information Pending CN113763097A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011476702.9A CN113763097A (en) 2020-12-14 2020-12-14 Method and device for updating article information

Publications (1)

Publication Number Publication Date
CN113763097A true CN113763097A (en) 2021-12-07

Family

ID=78786207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011476702.9A Pending CN113763097A (en) 2020-12-14 2020-12-14 Method and device for updating article information

Country Status (1)

Country Link
CN (1) CN113763097A (en)

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004023364A1 (en) * 2002-09-04 2004-03-18 Sang-Young Cha Method and system for updating goods information
KR100578595B1 (en) * 2005-09-15 2006-05-12 박태홍 A method for internet shopping
CN104699712A (en) * 2013-12-09 2015-06-10 阿里巴巴集团控股有限公司 Method and device for updating stock record information in database
US20160125021A1 (en) * 2014-10-31 2016-05-05 Microsoft Corporation Efficient updates in non-clustered column stores
CN106326243A (en) * 2015-06-19 2017-01-11 苏宁云商集团股份有限公司 Data processing method and apparatus
CN107203629A (en) * 2017-05-31 2017-09-26 北京京东尚科信息技术有限公司 Page rendering method, system and device
CN107368502A (en) * 2016-05-13 2017-11-21 北京京东尚科信息技术有限公司 Information synchronization method and device
CN107818496A (en) * 2017-11-14 2018-03-20 天脉聚源(北京)科技有限公司 A kind of merchandise news update method and device
CN107844994A (en) * 2017-11-14 2018-03-27 天脉聚源(北京)科技有限公司 A kind of merchandise information processing method and device
CN108805596A (en) * 2017-04-28 2018-11-13 北京京东尚科信息技术有限公司 Merchandise valuation information processing method, device, electronic equipment and storage medium
CN108958959A (en) * 2017-05-18 2018-12-07 北京京东尚科信息技术有限公司 The method and apparatus for detecting hive tables of data
CN109558448A (en) * 2018-10-10 2019-04-02 北京海数宝科技有限公司 Data processing method, device, computer equipment and storage medium
CN110727724A (en) * 2019-09-09 2020-01-24 上海陆家嘴国际金融资产交易市场股份有限公司 Data extraction method and device, computer equipment and storage medium
CN110879808A (en) * 2019-11-04 2020-03-13 泰康保险集团股份有限公司 Information processing method and device
CN111159207A (en) * 2019-12-16 2020-05-15 中国建设银行股份有限公司 Information processing method and device
US20200286014A1 (en) * 2017-10-18 2020-09-10 Beijing Jingdong Century Trading Co., Ltd. Information updating method and device
CN111798261A (en) * 2020-03-24 2020-10-20 北京沃东天骏信息技术有限公司 Information updating method and device
CN111930958A (en) * 2020-07-13 2020-11-13 车智互联(北京)科技有限公司 Graph database construction method, computing device and readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
赵晓永;王磊;: "电商网页中商品规格信息自动抽取方法研究", 计算机工程与应用, no. 24, 15 December 2017 (2017-12-15), pages 168 - 171 *
陈长英: ""浙江专业市场借助电商平台实现转型升级"", 《电子商务》, 31 January 2015 (2015-01-31), pages 64 - 67 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination