CN110222249B - Data storage method, device, computer equipment and storage medium - Google Patents

Info

Publication number
CN110222249B
Authority
CN
China
Prior art keywords
enterprise
data
text
information list
history information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910370895.0A
Other languages
Chinese (zh)
Other versions
CN110222249A (en)
Inventor
吴壮伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910370895.0A priority Critical patent/CN110222249B/en
Publication of CN110222249A publication Critical patent/CN110222249A/en
Application granted granted Critical
Publication of CN110222249B publication Critical patent/CN110222249B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/338Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9538Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data storage method, a data storage device, computer equipment and a storage medium. The method comprises the following steps: crawling a corresponding source data set according to a received enterprise name, and preprocessing the source data set to obtain processed data in one-to-one correspondence with the plurality of text data it contains; de-duplicating the processed data to obtain a corresponding de-duplicated data set; obtaining the enterprise history information lists included in the de-duplicated data set, and adding each enterprise history information list to the corresponding time node of a time axis or calendar according to the enterprise event occurrence time contained in that list, so as to obtain development track data corresponding to the enterprise name; and displaying the development track data through the time axis or calendar. The method uses a data crawling technology to automatically crawl enterprise-related information from the network, extracts the key information from it, and sorts all the key information by time to obtain the development track data of the enterprise.

Description

Data storage method, device, computer equipment and storage medium
Technical Field
The present invention relates to the field of data acquisition technologies, and in particular, to a data storage method, a data storage device, a computer device, and a storage medium.
Background
At present, information about major enterprise events is scattered over time and is difficult to collect and organize centrally. In addition, enterprises generally compile their development data through manual text work and lack tools for automatic acquisition. Moreover, each enterprise generally organizes only its own history data, so no enterprise history query platform exists on which the historical events of all enterprises can be viewed centrally.
Disclosure of Invention
The embodiments of the invention provide a data storage method, a data storage device, computer equipment and a storage medium, aiming to solve the problem in the prior art that enterprises compile their data largely through manual text work and lack automatic acquisition and organization, which makes data compilation inefficient.
In a first aspect, an embodiment of the present invention provides a data storage method, including:
crawling a corresponding source data set according to a received enterprise name, and preprocessing the source data set to obtain processed data in one-to-one correspondence with the plurality of text data contained in the source data set, wherein each processed data corresponds to an enterprise history information list;
de-duplicating the processed data that corresponds one-to-one to the text data contained in the source data set, to obtain a de-duplicated data set corresponding to the source data set;
obtaining the enterprise history information lists included in the de-duplicated data set, and adding each enterprise history information list to the corresponding time node of a time axis or calendar according to the enterprise event occurrence time contained in that list, so as to obtain development track data corresponding to the enterprise name, wherein the name of the time axis or calendar is generated from the enterprise name corresponding to the enterprise history information lists, and the development track data is the time axis or calendar to which the enterprise history information lists have been added; and
displaying the development track data through the time axis or calendar.
In a second aspect, an embodiment of the present invention provides a data storage device, including:
a source data crawling unit, configured to crawl a corresponding source data set according to a received enterprise name, and to preprocess the source data set to obtain processed data in one-to-one correspondence with the plurality of text data contained in the source data set, wherein each processed data corresponds to an enterprise history information list;
a de-duplication unit, configured to de-duplicate the processed data that corresponds one-to-one to the text data contained in the source data set, to obtain a de-duplicated data set corresponding to the source data set;
a track data acquisition unit, configured to obtain the enterprise history information lists included in the de-duplicated data set, and to add each enterprise history information list to the corresponding time node of a time axis or calendar according to the enterprise event occurrence time contained in that list, so as to obtain development track data corresponding to the enterprise name, wherein the name of the time axis or calendar is generated from the enterprise name corresponding to the enterprise history information lists, and the development track data is the time axis or calendar to which the enterprise history information lists have been added; and
a data display unit, configured to display the development track data through the time axis or calendar.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the data storage method described in the first aspect when the processor executes the computer program.
In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, where the computer readable storage medium stores a computer program, where the computer program when executed by a processor causes the processor to perform the data storage method according to the first aspect.
The embodiments of the invention provide a data storage method, a data storage device, computer equipment and a storage medium. The method comprises: crawling a corresponding source data set according to a received enterprise name, and preprocessing the source data set to obtain processed data in one-to-one correspondence with the plurality of text data contained in the source data set, wherein each processed data corresponds to an enterprise history information list; de-duplicating the processed data to obtain a de-duplicated data set corresponding to the source data set; obtaining the enterprise history information lists included in the de-duplicated data set, and adding each enterprise history information list to the corresponding time node of a time axis or calendar according to the enterprise event occurrence time contained in that list, so as to obtain development track data corresponding to the enterprise name; and displaying the development track data through the time axis or calendar. The method uses a data crawling technology to automatically crawl enterprise-related information from the network, extracts the key information from it, and sorts all the key information by time to obtain the development track data of the enterprise.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of an application scenario of a data storage method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a data storage method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another flow chart of a data storage method according to an embodiment of the present invention;
FIG. 4 is a schematic sub-flowchart of a data storage method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another sub-flowchart of a data storage method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another sub-flowchart of a data storage method according to an embodiment of the present invention;
FIG. 7 is a schematic block diagram of a data storage device provided by an embodiment of the present invention;
FIG. 8 is another schematic block diagram of a data storage device provided in an embodiment of the present invention;
FIG. 9 is a schematic block diagram of a subunit of a data storage device according to an embodiment of the present invention;
FIG. 10 is a schematic block diagram of another subunit of a data storage device according to an embodiment of the present invention;
FIG. 11 is a schematic block diagram of another subunit of a data storage device according to an embodiment of the present invention;
FIG. 12 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to fig. 1 and fig. 2, fig. 1 is a schematic application scenario diagram of a data storage method according to an embodiment of the present invention, and fig. 2 is a flowchart of the data storage method according to an embodiment of the present invention, where the data storage method is applied to a server, and the method is executed by application software installed in the server.
As shown in fig. 2, the method includes steps S110 to S140.
S110, crawling a corresponding source data set according to the received enterprise name, and preprocessing the source data set to obtain processed data corresponding to a plurality of text data contained in the source data set one by one; wherein each processed data corresponds to a list of enterprise history information.
In this embodiment, in order to obtain the development tracks of one or more enterprises, a crawler tool may crawl a large amount of web page content related to those enterprises. For example, when web page content related to enterprise A needs to be acquired, a list of web pages to be crawled that relate to enterprise A is configured for a crawler tool deployed in the server, and the crawler tool acquires the content from the web servers hosting the listed pages as the source data set. The source data set is then preprocessed to obtain historical text data related to the development of the enterprise. Specifically, a title keyword list and a text keyword list can be extracted from each piece of text data crawled from the web pages.
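By way of illustration only, the crawling of a to-be-crawled web page list can be sketched as follows. The function name `crawl_source_data`, the record layout and the injected `fetch` callable are assumptions made for this sketch and are not specified by the embodiment; the page download itself is deliberately left to the caller.

```python
from typing import Callable, Dict, Iterable, List


def crawl_source_data(urls: Iterable[str],
                      fetch: Callable[[str], str]) -> List[Dict[str, str]]:
    """Collect the pages of a to-be-crawled list into a source data set.

    `fetch` is a hypothetical callable that returns the page content for a
    URL and raises OSError when the page cannot be retrieved.
    """
    source_data = []
    for url in urls:
        try:
            html = fetch(url)
        except OSError:
            # Skip unreachable pages; the remaining pages still form the set.
            continue
        source_data.append({"url": url, "html": html})
    return source_data
```

In practice `fetch` might wrap an HTTP client; injecting it keeps the sketch independent of any particular crawler tool.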
In one embodiment, as shown in fig. 4, step S110 includes:
S111, extracting keyword information from the title and the body text of each of the plurality of text data included in the source data set through a term frequency-inverse document frequency (TF-IDF) model, to obtain a title keyword list and a text keyword list corresponding to each text data;
S112, identifying the publishing time and the enterprise event occurrence time of each of the plurality of text data included in the source data set, to obtain text key time information corresponding to each text data;
S113, packaging the title keyword list, the text keyword list and the text key time information corresponding to each text data in the source data set into an enterprise history information list corresponding to that text data, so as to obtain the processed data corresponding to each text data.
In this embodiment, keyword information is extracted from the title and the body text of each text data in the source data set by a TF-IDF (term frequency-inverse document frequency) model, so that each text data can be simplified and expressed as a corresponding keyword list.
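A minimal sketch of TF-IDF keyword extraction is given below for illustration. It assumes that each title or body text has already been split into tokens (Chinese text would first need a word segmenter) and uses the plain weighting tf × log(N/df); the function name `tfidf_keywords` and the `top_k` parameter are hypothetical, and the embodiment does not prescribe a particular weighting variant.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms of each tokenized text."""
    n_docs = len(docs)
    # Document frequency: in how many texts each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    keyword_lists = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keyword_lists.append(ranked[:top_k])
    return keyword_lists
```

A term that occurs in every text, such as the enterprise name itself, receives a score of zero under this weighting and therefore ranks behind the terms that distinguish one text from another.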
Next, the publishing time and the enterprise event occurrence time of each text data in the source data set are identified. The publishing time is obtained by scanning the text directly below the title (on most websites, the publishing time of a news item appears below its title). The enterprise event occurrence time is obtained by scanning the body text of the text data and judging whether it contains a time expression of the form "X1 year X2 month X3 day X4 hour X5 minute"; if so, that expression is recognized as the occurrence time of the enterprise event.
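The recognition of the enterprise event occurrence time can be sketched with a regular expression, as below. The sketch assumes the body text is Chinese and that the time expression uses the characters 年, 月, 日, 时 and 分 for year, month, day, hour and minute, with the hour and minute optional; the pattern and the function name `find_event_time` are illustrative assumptions only.

```python
import re
from datetime import datetime

# Year-month-day is required; hour and minute are optional (assumed layout).
EVENT_TIME = re.compile(
    r"(\d{4})年(\d{1,2})月(\d{1,2})日(?:(\d{1,2})时(?:(\d{1,2})分)?)?"
)


def find_event_time(body):
    """Return the first time expression found in the body text, or None."""
    match = EVENT_TIME.search(body)
    if match is None:
        return None
    year, month, day, hour, minute = (int(g) if g else 0 for g in match.groups())
    try:
        return datetime(year, month, day, hour, minute)
    except ValueError:
        # Digits that do not form a valid date, e.g. month 13.
        return None
```

Only the first match is returned here; a text that mentions several dates would need an additional rule for choosing among them, which the embodiment does not specify.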
Since each text data is reduced to a combination of a title keyword list, a text keyword list and text key time information, each text data in the source data set can be represented by an enterprise history information list; that is, the processed data corresponding to each text data is its enterprise history information list.
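One possible in-memory representation of an enterprise history information list is sketched below. The class name `EnterpriseHistoryInfo`, its field names and the helper `package_history_info` are hypothetical; the embodiment only requires that the three components be packaged together.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class EnterpriseHistoryInfo:
    """Packaged result of steps S111 to S113 for one text data."""
    title_keywords: Tuple[str, ...]
    body_keywords: Tuple[str, ...]
    publish_time: Optional[datetime]   # part of the text key time information
    event_time: Optional[datetime]     # enterprise event occurrence time


def package_history_info(title_keywords: Sequence[str],
                         body_keywords: Sequence[str],
                         publish_time: Optional[datetime],
                         event_time: Optional[datetime]) -> EnterpriseHistoryInfo:
    # Tuples keep the record immutable and hashable, which is convenient
    # when records are later compared for duplicates.
    return EnterpriseHistoryInfo(tuple(title_keywords), tuple(body_keywords),
                                 publish_time, event_time)
```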
And S120, performing de-duplication on the processed data which are contained in the source data and correspond to the text data one by one, so as to obtain a de-duplicated data set corresponding to the source data.
In this embodiment, duplicate data must be identified because the crawler may crawl news content from a plurality of web pages that all describe the same enterprise event, for example when one news item has been reposted by several websites. Identical web page content yields identical processed data after preprocessing, so only one copy of each group of identical processed data needs to be kept and the rest are deleted, which effectively prevents duplicate enterprise event information from being obtained.
In one embodiment, as shown in fig. 5, step S120 includes:
S121, acquiring each piece of processed data, and judging, according to the title keyword list and the text keyword list included in each piece of processed data, whether duplicate processed data exists among the processed data that corresponds one-to-one to the plurality of text data included in the source data set;
S122, if duplicate processed data exists, arbitrarily selecting one piece from each group of identical processed data to keep and deleting the other pieces in that group, to obtain the de-duplicated data set corresponding to the source data set.
In this embodiment, if the processed data that corresponds one-to-one to the text data in the source data set contains duplicates, the crawler has crawled news content from a plurality of web pages that all describe the same enterprise event. In that case only the processed data of one of those web pages needs to be kept, and the other redundant copies are deleted, so as to obtain the de-duplicated data set corresponding to the source data set. If the processed data contains no duplicates, the processed data is retained unchanged as the de-duplicated data set. Through de-duplication, the size of the data set is effectively reduced.
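Steps S121 and S122 can be sketched as follows, assuming that each piece of processed data is a dictionary with `title_keywords` and `body_keywords` entries (hypothetical field names) and that two pieces are duplicates when both keyword lists are identical. The embodiment allows any one of the identical pieces to be kept; this sketch keeps the first one encountered.

```python
def deduplicate(records):
    """Keep one record per distinct (title keywords, body keywords) pair."""
    seen = set()
    unique = []
    for record in records:
        key = (tuple(record["title_keywords"]), tuple(record["body_keywords"]))
        if key in seen:
            # Same keyword lists as an earlier record: treat as a repost.
            continue
        seen.add(key)
        unique.append(record)
    return unique
```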
S130, acquiring enterprise history information lists included in the de-duplicated data set, and correspondingly adding each enterprise information list to a time node corresponding to a time axis or a calendar according to the occurrence time of an enterprise event in the included text and the corresponding enterprise history information list to obtain development track data corresponding to an enterprise name; wherein the names of the time axis or the calendar are generated by the enterprise names corresponding to the enterprise information lists, and the development track data is data in the form of the time axis or the calendar added with the enterprise history information list.
In this embodiment, because duplicate enterprise history information lists have been deleted from the de-duplicated data set and each enterprise history information list includes an enterprise event occurrence time, the enterprise history information lists can be sorted by that occurrence time and each list added to the corresponding time node, so as to obtain the development track data of the enterprise.
Specifically, a named time axis or calendar is generated from the enterprise name corresponding to the enterprise history information lists, and each enterprise history information list is then added to the corresponding time node of the time axis or calendar according to its enterprise event occurrence time. On the time axis or calendar, such a time node is shown only as a highlighted mark; when the time node is clicked, the one or more enterprise history information lists for that day pop up.
In one embodiment, as shown in fig. 6, step S130 includes:
S131, sorting the enterprise history information lists in descending order of the enterprise event occurrence time included in each list, to obtain a sorted enterprise history information list set;
S132, sequentially adding each enterprise history information list in the sorted set, in descending order of enterprise event occurrence time, to the corresponding time node of a preset time axis or calendar, so as to obtain the development track data corresponding to the enterprise name.
In this embodiment, the enterprise history information lists are sorted by enterprise event occurrence time, and each list is added to the corresponding date on the time axis or calendar, so as to obtain development track data that uses the time axis or calendar as its carrier. A date can then be selected to view the enterprise history information lists for that date and their specific text content.
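Steps S131 and S132 can be illustrated with the following sketch, in which each enterprise history information list is a dictionary with an `event_time` entry (a hypothetical field name) and a time node is one calendar day. The returned dictionary layout, with `name` and `nodes` keys, is an assumption of this sketch rather than a storage format defined by the embodiment.

```python
from datetime import datetime


def build_track(enterprise_name, records):
    """Group records by event date, most recent time node first."""
    # S131: sort in descending order of enterprise event occurrence time.
    ordered = sorted(records, key=lambda r: r["event_time"], reverse=True)
    # S132: add each record to the time node (calendar day) of its event.
    nodes = {}
    for record in ordered:
        nodes.setdefault(record["event_time"].date(), []).append(record)
    # The time axis or calendar is named after the enterprise.
    return {"name": enterprise_name, "nodes": nodes}
```

Because Python dictionaries preserve insertion order, iterating over `nodes` yields the time nodes from the most recent date to the earliest, and looking up one date returns every list recorded for that day.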
And S140, displaying the development track data through a time axis or a calendar.
In this embodiment, after each enterprise history information list has been added to the corresponding date on the time axis or calendar according to its enterprise event occurrence time, the development track data is displayed through the time axis or calendar. To save display space, initially only the time axis or calendar named after the enterprise is displayed, with the time nodes that hold enterprise history information lists highlighted; after the user clicks such a time node, the corresponding enterprise history information lists are displayed.
In one embodiment, as shown in fig. 3, the method further includes, after step S140:
S150, if an entered search keyword is detected, comparing the search keyword with the enterprise history information lists, and taking the enterprise history information lists whose Pearson similarity with the search keyword exceeds a preset similarity threshold as the search push result.
In this embodiment, after each source data in the source data set has been preprocessed and de-duplicated, an enterprise history information list is obtained for each text data, and the enterprise history information lists are mapped to the corresponding time nodes in chronological order to form the development track data of the enterprise. Because each enterprise history information list includes a title keyword list, a text keyword list and text key time information, an entered search keyword can be compared with the keywords in each enterprise history information list, and the enterprise history information lists containing keywords similar to the search keyword are selected as the search push result.
Specifically, the enterprise history information lists whose Pearson similarity with the search keyword exceeds a preset similarity threshold are taken as the search push result. To calculate the Pearson similarity between each keyword in an enterprise history information list and the search keyword, each keyword and the search keyword are first converted into word vectors through a Word2Vec model (a model that converts words into word vectors), and the Pearson similarity between the word vector of each keyword and the word vector of the search keyword is then calculated.
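The Pearson correlation coefficient between any two vectors X and Y is calculated by the following formula:
$$\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E[(X-\mu_X)(Y-\mu_Y)]}{\sigma_X \sigma_Y}$$
wherein E represents the mathematical expectation, $\mu_X$ and $\mu_Y$ are the means of X and Y, and $\sigma_X$ and $\sigma_Y$ are their standard deviations;
$\rho_{X,Y}$ takes values in [-1, 1]; the closer $\rho_{X,Y}$ is to 1, the higher the degree of similarity of the two vectors, and the closer $\rho_{X,Y}$ is to 0, the lower the degree of similarity of the two vectors.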
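The search of step S150 can be sketched as follows. The word vectors are supplied as a plain dictionary standing in for the output of a Word2Vec model, and a list is pushed when the best Pearson similarity between the search keyword and any of its keywords exceeds the threshold; the function names, the field names and this "best keyword" aggregation rule are assumptions of the sketch, since the embodiment does not state how per-keyword similarities are combined.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # A constant vector has no defined correlation.
    return cov / (norm_x * norm_y)


def search(query_vector, records, word_vectors, threshold=0.8):
    """Return the records whose best keyword similarity exceeds the threshold."""
    hits = []
    for record in records:
        keywords = list(record["title_keywords"]) + list(record["body_keywords"])
        best = max(
            (pearson(query_vector, word_vectors[k])
             for k in keywords if k in word_vectors),
            default=0.0,
        )
        if best > threshold:
            hits.append(record)
    return hits
```

Keywords without a word vector are skipped, so a list whose keywords are all unknown to the model is never pushed.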
The method realizes that the enterprise related information is automatically crawled from the network, and after the key information of the enterprise related information is extracted, all the key information is ordered according to time, so that the development track data of the enterprise is obtained.
The embodiment of the invention also provides a data storage device which is used for executing any embodiment of the data storage method. In particular, referring to fig. 7, fig. 7 is a schematic block diagram of a data storage device according to an embodiment of the present invention. The data storage device 100 may be configured in a server.
As shown in fig. 7, the data storage device 100 includes a source data crawling unit 110, a deduplication unit 120, a trajectory data acquisition unit 130, and a data display unit 140.
A source data crawling unit 110, configured to crawl a corresponding source data set according to the received enterprise name, and pre-process the source data set to obtain processed data corresponding to a plurality of text data included in the source data set one-to-one; wherein each processed data corresponds to a list of enterprise history information.
In this embodiment, in order to obtain the development tracks of one or more enterprises, a crawler tool may crawl a large amount of web page content related to those enterprises. For example, when web page content related to enterprise A needs to be acquired, a list of web pages to be crawled that relate to enterprise A is configured for a crawler tool deployed in the server, and the crawler tool acquires the content from the web servers hosting the listed pages as the source data set. The source data set is then preprocessed to obtain historical text data related to the development of the enterprise. Specifically, a title keyword list and a text keyword list can be extracted from each piece of text data crawled from the web pages.
In one embodiment, as shown in FIG. 9, the source data crawling unit 110 comprises:
a keyword extraction unit 111, configured to extract keyword information from the title and the body text of each of the plurality of text data included in the source data set through a term frequency-inverse document frequency (TF-IDF) model, to obtain a title keyword list and a text keyword list corresponding to each text data;
a time information identifying unit 112, configured to identify the publishing time and the enterprise event occurrence time of each of the plurality of text data included in the source data set, to obtain text key time information corresponding to each text data;
a data combination unit 113, configured to package the title keyword list, the text keyword list and the text key time information corresponding to each text data in the source data set into an enterprise history information list corresponding to that text data, so as to obtain the processed data corresponding to each text data.
In this embodiment, keyword information is extracted from the title and the body text of each text data in the source data set by a TF-IDF (term frequency-inverse document frequency) model, so that each text data can be simplified and expressed as a corresponding keyword list.
Next, the publishing time and the enterprise event occurrence time of each text data in the source data set are identified. The publishing time is obtained by scanning the text directly below the title (on most websites, the publishing time of a news item appears below its title). The enterprise event occurrence time is obtained by scanning the body text of the text data and judging whether it contains a time expression of the form "X1 year X2 month X3 day X4 hour X5 minute"; if so, that expression is recognized as the occurrence time of the enterprise event.
Since each text data is reduced to a combination of a title keyword list, a text keyword list and text key time information, each text data in the source data set can be represented by an enterprise history information list; that is, the processed data corresponding to each text data is its enterprise history information list.
And the deduplication unit 120 is configured to deduplicate the processed data that corresponds to the plurality of text data included in the source data one by one, so as to obtain a deduplicated dataset that corresponds to the source data.
In this embodiment, duplicate data must be identified because the crawler may crawl news content from a plurality of web pages that all describe the same enterprise event, for example when one news item has been reposted by several websites. Identical web page content yields identical processed data after preprocessing, so only one copy of each group of identical processed data needs to be kept and the rest are deleted, which effectively prevents duplicate enterprise event information from being obtained.
In one embodiment, as shown in fig. 10, the deduplication unit 120 includes:
A data comparison unit 121, configured to obtain each piece of processed data, and determine, according to the title keyword list and the text keyword list included in each piece of processed data, whether repeated processed data exists among the processed data corresponding one by one to the plurality of text data included in the source data set;
the data deleting unit 122 is configured to, if repeated processed data exists among the processed data corresponding one by one to the plurality of text data included in the source data set, keep any one selected group of the plurality of identical processed data and delete the remaining unselected groups, so as to obtain a de-duplicated data set corresponding to the source data.
In this embodiment, if the processed data corresponding one by one to the text data in the source data set includes repeated processed data, this indicates that the crawler has crawled news content from a plurality of web pages that all describe the same enterprise event. In that case only the news content of one of the web pages needs to be saved, and the other redundant processed data are deleted, so as to obtain the de-duplicated data set corresponding to the source data. Through this deduplication, the size of the data set is effectively reduced.
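The comparison and deletion performed by units 121 and 122 can be sketched as follows. This is a sketch under stated assumptions: records are plain dictionaries with hypothetical keys `title_keywords` and `body_keywords`, two records count as repeated only when both keyword lists match exactly, and the first occurrence is the one kept (the source allows any one of them to be kept).

```python
def deduplicate(records):
    """Keep one record per (title keyword list, text keyword list) pair."""
    seen = set()
    kept = []
    for record in records:
        # Tuples make the keyword lists hashable; order is part of the key.
        key = (tuple(record["title_keywords"]), tuple(record["body_keywords"]))
        if key in seen:
            continue  # a repeated record: drop it
        seen.add(key)
        kept.append(record)
    return kept

# Hypothetical processed data: the first two come from reprints of one article.
records = [
    {"url": "site-a", "title_keywords": ["acme", "acquisition"], "body_keywords": ["robotics"]},
    {"url": "site-b", "title_keywords": ["acme", "acquisition"], "body_keywords": ["robotics"]},
    {"url": "site-c", "title_keywords": ["acme", "factory"], "body_keywords": ["expansion"]},
]
deduplicated = deduplicate(records)
```

Exact matching catches verbatim reprints only; lightly edited reprints would produce different keyword lists and would need a similarity-based comparison, which the source does not describe.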
The track data obtaining unit 130 is configured to obtain an enterprise history information list included in the deduplicated dataset, and correspondingly add each enterprise information list to a time node corresponding to a time axis or a calendar according to the occurrence time of an enterprise event in the included text and the corresponding enterprise history information list, so as to obtain development track data corresponding to an enterprise name; wherein the names of the time axis or the calendar are generated by the enterprise names corresponding to the enterprise information lists, and the development track data is data in the form of the time axis or the calendar added with the enterprise history information list.
In this embodiment, because duplicate enterprise history information lists are deleted from the deduplicated dataset, and each enterprise history information list includes enterprise event occurrence times, each enterprise history information list may be ordered according to the enterprise event occurrence times, and each enterprise history information list may be added to a corresponding time node according to the enterprise event occurrence times, so as to obtain development track data of an enterprise.
Specifically, a time axis or calendar named after the enterprise is generated from the enterprise name corresponding to the enterprise information lists, and each enterprise history information list is then added to the corresponding time node of the time axis or calendar according to the occurrence time of the enterprise event. Only the time nodes to which an enterprise history information list has been added are highlighted, and when such a time node is clicked, the one or more enterprise history information lists corresponding to that day pop up.
In one embodiment, as shown in fig. 11, the trajectory data acquisition unit 130 includes:
A descending order sorting unit 131, configured to sort the enterprise history information lists in descending order according to the occurrence time of the enterprise event included in each enterprise information list, so as to obtain a sorted enterprise history information list set;
the track data mapping unit 132 is configured to sequentially add each of the ordered enterprise history information list sets to a time node corresponding to a preset time axis or calendar in descending order of occurrence time of enterprise events, so as to obtain development track data corresponding to the enterprise name.
In this embodiment, the enterprise history information lists may be ordered according to occurrence time of the enterprise event, and each enterprise history information list is added to a time axis or a date corresponding to a calendar according to occurrence time of the enterprise event, so as to obtain development track data using the time axis or the calendar as a carrier, and a date may be selected to view the enterprise history information list and specific text content thereof.
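The descending sort and the mapping to time nodes can be sketched as follows. The sketch assumes one time node per calendar day, represents each record as a dictionary with a hypothetical `event_time` key, and uses a hypothetical function name `build_trajectory`; the rendering of the time axis or calendar itself is outside its scope.

```python
from datetime import date, datetime

def build_trajectory(enterprise_name, records):
    """Sort records by event time, newest first, and group them by day."""
    ordered = sorted(records, key=lambda r: r["event_time"], reverse=True)
    nodes = {}  # dicts keep insertion order, so nodes stay newest first
    for record in ordered:
        nodes.setdefault(record["event_time"].date(), []).append(record)
    # The time axis or calendar is named after the enterprise.
    return {"name": enterprise_name, "nodes": nodes}

# Hypothetical de-duplicated records for one enterprise.
records = [
    {"title": "a", "event_time": datetime(2019, 1, 5, 9, 0)},
    {"title": "b", "event_time": datetime(2019, 3, 1, 10, 0)},
    {"title": "c", "event_time": datetime(2019, 1, 5, 15, 0)},
]
trajectory = build_trajectory("Acme", records)
highlighted = list(trajectory["nodes"])             # days that would be highlighted
same_day = trajectory["nodes"][date(2019, 1, 5)]    # lists shown when that day is clicked
```

Only days that actually hold a record appear as keys, which corresponds to highlighting only the time nodes to which an enterprise history information list has been added.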
And a data display unit 140, configured to display the development track data through a time axis or a calendar.
In this embodiment, after each enterprise history information list is added to the corresponding date of the time axis or calendar according to the occurrence time of the enterprise event, the development track data is displayed through the time axis or calendar. To save display space, generally only the time axis or calendar named after the enterprise is displayed initially, with a highlight identifier on each time node to which an enterprise history information list has been added; after the user clicks such a time node, the corresponding enterprise history information list is displayed.
In one embodiment, as shown in FIG. 8, the data storage device 100 further comprises:
And the search unit 150 is configured to, if an entered search keyword is detected, compare the search keyword with the enterprise history information lists, and obtain each enterprise history information list whose Pearson similarity with the search keyword exceeds a preset similarity threshold, as a search pushing result.
In this embodiment, after preprocessing and deduplication are performed on each source data in the source data set, an enterprise history information list corresponding one by one to each text data is obtained, and the enterprise history information lists are mapped to the corresponding time nodes in chronological order to form the development track data of the enterprise. Because each enterprise history information list includes a title keyword list, a text keyword list and text key time information, the entered search keyword can be compared with the keywords in each enterprise history information list, and the enterprise history information lists containing keywords similar to the search keyword are screened out as the search pushing result.
Specifically, each enterprise history information list whose Pearson similarity with the search keyword exceeds a preset similarity threshold is obtained as the search pushing result. To calculate the Pearson similarity between each keyword in an enterprise history information list and the search keyword, each keyword and the search keyword can first be converted into word vectors through a Word2Vec model (a model for converting words into word vectors), and the Pearson similarity between the word vector of each keyword and the word vector of the search keyword is then calculated.
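The similarity screening can be sketched as follows. Several points are assumptions: the word vectors are hand-written stand-ins for the output of a trained Word2Vec model, a list is treated as matching when its best-matching keyword exceeds the threshold (the source does not say how per-keyword similarities are combined), and the names `pearson`, `search` and the threshold value 0.8 are hypothetical.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / (norm_u * norm_v)

def search(query_vector, records, vectors, threshold=0.8):
    """Return the records whose best keyword similarity exceeds the threshold."""
    hits = []
    for record in records:
        best = max(
            (pearson(query_vector, vectors[k]) for k in record["keywords"] if k in vectors),
            default=-1.0,  # no keyword has a vector: never a hit
        )
        if best > threshold:
            hits.append(record)
    return hits

# Stand-in word vectors; a real system would look these up in a Word2Vec model.
vectors = {
    "acquisition": [0.9, 0.1, 0.3],
    "merger": [0.8, 0.2, 0.35],
    "factory": [0.1, 0.9, 0.2],
}
records = [
    {"title": "deal", "keywords": ["merger"]},
    {"title": "plant", "keywords": ["factory"]},
]
results = search(vectors["acquisition"], records, vectors)
```

Pearson similarity is the cosine similarity of the mean-centered vectors, so it ignores any constant offset shared by all components of a word vector.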
The device realizes that the enterprise related information is automatically crawled from the network, and after the key information of the enterprise related information is extracted, all the key information is ordered according to time, so that the development track data of the enterprise is obtained.
The data storage means described above may be implemented in the form of a computer program which is executable on a computer device as shown in fig. 12.
Referring to fig. 12, fig. 12 is a schematic block diagram of a computer device according to an embodiment of the present invention. The computer device 500 is a server, and the server may be a stand-alone server or a server cluster formed by a plurality of servers.
With reference to FIG. 12, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032, when executed, may cause the processor 502 to perform a data storage method.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the execution of a computer program 5032 in the non-volatile storage medium 503, which computer program 5032, when executed by the processor 502, causes the processor 502 to perform a data storage method.
The network interface 505 is used for network communication, such as providing for transmission of data information, etc. It will be appreciated by those skilled in the art that the structure shown in FIG. 12 is merely a block diagram of some of the structures associated with the present inventive arrangements and does not constitute a limitation of the computer device 500 to which the present inventive arrangements may be applied, and that a particular computer device 500 may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
Wherein the processor 502 is configured to execute a computer program 5032 stored in a memory to perform the following functions: crawling a corresponding source data set according to the received enterprise name, and preprocessing the source data set to obtain processed data corresponding to a plurality of text data contained in the source data set one by one; wherein each processed data corresponds to an enterprise history information list; performing duplication removal on the processed data which are contained in the source data and correspond to the text data one by one to obtain a duplication-removed data set corresponding to the source data; acquiring enterprise history information lists included in the duplicate-removed data set, and correspondingly adding each enterprise information list to a time node corresponding to a time axis or a calendar according to the occurrence time of an enterprise event in the included text and the corresponding enterprise history information list to obtain development track data corresponding to an enterprise name; the names of the time axis or the calendars are generated by the enterprise names corresponding to the enterprise information lists, and the development track data are data in the form of the time axis or the calendars added with the enterprise history information list; and displaying the development track data through a time axis or a calendar.
In an embodiment, after performing the steps of obtaining the enterprise history information lists included in the de-duplicated data set, adding each enterprise information list to the corresponding time node of a time axis or a calendar according to the occurrence time of the enterprise event in the included text and the corresponding enterprise history information list to obtain development track data corresponding to the enterprise name, and displaying the development track data through the time axis or the calendar, the processor 502 further performs the following operations: if an entered search keyword is detected, comparing the search keyword with the enterprise history information lists, and acquiring each enterprise history information list whose Pearson similarity with the search keyword exceeds a preset similarity threshold, to serve as a search pushing result.
In one embodiment, when the step of preprocessing the source data set to obtain processed data corresponding to a plurality of text data included in the source data set is performed by the processor 502, the following operations are performed: extracting keyword information from the titles and texts of each text data in the plurality of text data included in the source data set through a word frequency-inverse text frequency index model respectively to obtain a title keyword list and a text keyword list corresponding to each text data; identifying the publishing time of each text data and the occurrence time of enterprise events in a plurality of text data included in the source data set to obtain text key time information corresponding to each text data; and packaging and combining a title keyword list, a text keyword list and text key time information corresponding to each piece of text data in the source data set to obtain an enterprise historical information list corresponding to each piece of text data so as to obtain processed data corresponding to each piece of source data.
In one embodiment, when the step of performing deduplication on the processed data corresponding to the plurality of text data included in the source data one by one to obtain the deduplicated data set corresponding to the source data, the processor 502 performs the following operations: acquiring each piece of processed data, and judging whether the processed data corresponding to a plurality of pieces of text data included in the source data set one by one has repeated processed data according to a title keyword list and a text keyword list included in each piece of processed data; if the processed data corresponding to the text data included in the source data set one by one has repeated processed data, any selected group of the processed data is stored and the processed data which is not selected to be stored is deleted, so that the de-duplicated data set corresponding to the source data is obtained.
In one embodiment, when executing the step of adding each enterprise information list to a time node corresponding to a time axis or a calendar according to the occurrence time of an enterprise event in the included text and the corresponding enterprise history information list, the processor 502 executes the following operations: according to the occurrence time of the enterprise event included in each enterprise information list, ordering each enterprise history information list in a descending order to obtain an ordered enterprise history information list set; and sequentially adding each of the ordered enterprise history information list sets to a time node corresponding to a preset time axis or calendar according to the descending order of the occurrence time of the enterprise event so as to obtain development track data corresponding to the enterprise name.
Those skilled in the art will appreciate that the embodiment of the computer device shown in fig. 12 is not limiting of the specific construction of the computer device, and in other embodiments, the computer device may include more or less components than those shown, or certain components may be combined, or a different arrangement of components. For example, in some embodiments, the computer device may include only a memory and a processor, and in such embodiments, the structure and function of the memory and the processor are consistent with the embodiment shown in fig. 12, and will not be described again.
It should be appreciated that in embodiments of the present invention, the processor 502 may be a central processing unit (CPU), and may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
In another embodiment of the invention, a computer-readable storage medium is provided. The computer readable storage medium may be a non-volatile computer readable storage medium. The computer readable storage medium stores a computer program, wherein the computer program when executed by a processor performs the steps of: crawling a corresponding source data set according to the received enterprise name, and preprocessing the source data set to obtain processed data corresponding to a plurality of text data contained in the source data set one by one; wherein each processed data corresponds to an enterprise history information list; performing duplication removal on the processed data which are contained in the source data and correspond to the text data one by one to obtain a duplication-removed data set corresponding to the source data; acquiring enterprise history information lists included in the duplicate-removed data set, and correspondingly adding each enterprise information list to a time node corresponding to a time axis or a calendar according to the occurrence time of an enterprise event in the included text and the corresponding enterprise history information list to obtain development track data corresponding to an enterprise name; the names of the time axis or the calendars are generated by the enterprise names corresponding to the enterprise information lists, and the development track data are data in the form of the time axis or the calendars added with the enterprise history information list; and displaying the development track data through a time axis or a calendar.
In an embodiment, after the steps of obtaining the enterprise history information lists included in the de-duplicated data set, adding each enterprise information list to the corresponding time node of a time axis or a calendar according to the occurrence time of the enterprise event in the included text and the corresponding enterprise history information list to obtain development track data corresponding to the enterprise name, and displaying the development track data through the time axis or the calendar, the method further comprises the following steps: if an entered search keyword is detected, comparing the search keyword with the enterprise history information lists, and acquiring each enterprise history information list whose Pearson similarity with the search keyword exceeds a preset similarity threshold, to serve as a search pushing result.
In an embodiment, the preprocessing the source data set to obtain processed data corresponding to a plurality of text data included in the source data set one to one includes: extracting keyword information from the titles and texts of each text data in the plurality of text data included in the source data set through a word frequency-inverse text frequency index model respectively to obtain a title keyword list and a text keyword list corresponding to each text data; identifying the publishing time of each text data and the occurrence time of enterprise events in a plurality of text data included in the source data set to obtain text key time information corresponding to each text data; and packaging and combining a title keyword list, a text keyword list and text key time information corresponding to each piece of text data in the source data set to obtain an enterprise historical information list corresponding to each piece of text data so as to obtain processed data corresponding to each piece of source data.
In an embodiment, the performing deduplication on the processed data corresponding to the plurality of text data included in the source data one by one to obtain a deduplicated data set corresponding to the source data includes: acquiring each piece of processed data, and judging whether the processed data corresponding to a plurality of pieces of text data included in the source data set one by one has repeated processed data according to a title keyword list and a text keyword list included in each piece of processed data; if the processed data corresponding to the text data included in the source data set one by one has repeated processed data, any selected group of the processed data is stored and the processed data which is not selected to be stored is deleted, so that the de-duplicated data set corresponding to the source data is obtained.
In an embodiment, the adding each enterprise information list to a time node corresponding to a time axis or a calendar according to the occurrence time of the enterprise event in the included text and the corresponding enterprise history information list, to obtain development track data corresponding to the enterprise name includes: according to the occurrence time of the enterprise event included in each enterprise information list, ordering each enterprise history information list in a descending order to obtain an ordered enterprise history information list set; and sequentially adding each of the ordered enterprise history information list sets to a time node corresponding to a preset time axis or calendar according to the descending order of the occurrence time of the enterprise event so as to obtain development track data corresponding to the enterprise name.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus, device and unit described above may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein. Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the units is merely a logical function division, there may be another division manner in actual implementation, or units having the same function may be integrated into one unit, for example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices, or elements, or may be an electrical, mechanical, or other form of connection.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment of the present invention.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
If implemented in the form of software functional units and sold or used as stand-alone products, the integrated units may be stored in a storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part thereof that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (8)

1. A method of data storage, comprising:
Crawling a corresponding source data set according to the received enterprise name, and preprocessing the source data set to obtain processed data corresponding to a plurality of text data contained in the source data set one by one; wherein each processed data corresponds to an enterprise history information list;
performing duplication removal on the processed data which are contained in the source data and correspond to the text data one by one to obtain a duplication-removed data set corresponding to the source data;
Acquiring enterprise history information lists included in the duplicate-removed data set, and correspondingly adding each enterprise information list to a time node corresponding to a time axis or a calendar according to the occurrence time of an enterprise event in the included text and the corresponding enterprise history information list to obtain development track data corresponding to an enterprise name; the names of the time axis or the calendars are generated by the enterprise names corresponding to the enterprise information lists, and the development track data are data in the form of the time axis or the calendars added with the enterprise history information list; and
Displaying the development track data through a time axis or a calendar;
the step of adding each enterprise information list to a time node corresponding to a time axis or a calendar according to the enterprise event occurrence time and the corresponding enterprise history information list in the included text to obtain development track data corresponding to the enterprise name, including:
According to the occurrence time of the enterprise event included in each enterprise information list, ordering each enterprise history information list in a descending order to obtain an ordered enterprise history information list set;
sequentially adding each of the ordered enterprise history information list sets to a time node corresponding to a preset time axis or calendar according to the descending order of the occurrence time of the enterprise event so as to obtain development track data corresponding to the enterprise name;
and only highlighting the time node corresponding to the time axis or the calendar added with the enterprise history information list.
2. The method for storing data according to claim 1, wherein the obtaining the enterprise history information list included in the de-duplicated dataset includes adding each enterprise information list to a time node corresponding to a time axis or a calendar according to the occurrence time of the enterprise event in the included text and the corresponding enterprise history information list, and obtaining development track data corresponding to the enterprise name; after displaying the development track data through a time axis or a calendar, the method further comprises the following steps:
if the entered search keywords are detected, comparing the search keywords with an enterprise historical information list, and acquiring an enterprise historical information list with the Pearson similarity with the search keywords exceeding a preset similarity threshold value to serve as a search pushing result.
3. The data storage method according to claim 1, wherein preprocessing the source data set to obtain processed data corresponding to a plurality of text data included in the source data set one-to-one, includes:
Extracting keyword information from the titles and texts of each text data in the plurality of text data included in the source data set through a word frequency-inverse text frequency index model respectively to obtain a title keyword list and a text keyword list corresponding to each text data;
Identifying the publishing time of each text data and the occurrence time of enterprise events in a plurality of text data included in the source data set to obtain text key time information corresponding to each text data;
And packaging and combining a title keyword list, a text keyword list and text key time information corresponding to each piece of text data in the source data set to obtain an enterprise historical information list corresponding to each piece of text data so as to obtain processed data corresponding to each piece of source data.
4. The method for storing data according to claim 1, wherein the step of performing deduplication on the processed data, which is one-to-one corresponding to the plurality of text data included in the source data, to obtain a deduplicated data set corresponding to the source data, includes:
acquiring each piece of processed data, and judging whether the processed data corresponding to a plurality of pieces of text data included in the source data set one by one has repeated processed data according to a title keyword list and a text keyword list included in each piece of processed data;
If the processed data corresponding to the text data included in the source data set one by one has repeated processed data, any selected group of the processed data is stored and the processed data which is not selected to be stored is deleted, so that the de-duplicated data set corresponding to the source data is obtained.
5. A data storage device, comprising:
the source data crawling unit is used for crawling a corresponding source data set according to the received enterprise name, preprocessing the source data set, and obtaining processed data corresponding to a plurality of text data included in the source data set one by one; wherein each processed data corresponds to an enterprise history information list;
the de-duplication unit is used for de-duplication the processed data which are contained in the source data and correspond to the text data one by one to obtain a de-duplicated data set which corresponds to the source data; and
The track data acquisition unit is used for acquiring enterprise history information lists included in the duplicate-removed data set, and correspondingly adding each enterprise information list to a time node corresponding to a time axis or a calendar according to the occurrence time of an enterprise event in the included text and the corresponding enterprise history information list to obtain development track data corresponding to an enterprise name; the names of the time axis or the calendars are generated by the enterprise names corresponding to the enterprise information lists, and the development track data are data in the form of the time axis or the calendars added with the enterprise history information list;
The data display unit is used for displaying the development track data through a time axis or a calendar;
wherein the track data acquisition unit comprises:
a descending-order sorting unit, configured to sort the enterprise history information lists in descending order of the enterprise event occurrence time included in each list, obtaining a sorted set of enterprise history information lists; and
a track data mapping unit, configured to add each entry of the sorted set of enterprise history information lists, in descending order of enterprise event occurrence time, to the time node corresponding to that occurrence time on a preset time axis or calendar, so as to obtain the development track data corresponding to the enterprise name;
wherein only the time nodes of the time axis or calendar to which an enterprise history information list has been added are highlighted.
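The de-duplication, descending-order sorting and time-node mapping recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name `build_track_data`, the dictionary keys `event_time` and `event`, and the use of exact matching for de-duplication are all assumptions made for illustration only.

```python
from datetime import date


def build_track_data(enterprise_name, history_lists):
    """Build timeline-style development track data for one enterprise.

    history_lists: iterable of dicts with the hypothetical keys
    'event_time' (datetime.date) and 'event' (str).
    """
    # De-duplicate on (event_time, event). Exact matching is an assumption;
    # the claim does not fix a particular de-duplication criterion here.
    seen = set()
    unique = []
    for item in history_lists:
        key = (item["event_time"], item["event"])
        if key not in seen:
            seen.add(key)
            unique.append(item)

    # Sort in descending order of event occurrence time (newest first).
    unique.sort(key=lambda item: item["event_time"], reverse=True)

    # Add each list to the time node for its date. A dict keeps insertion
    # order, so the nodes also end up in descending order of time.
    nodes = {}
    for item in unique:
        nodes.setdefault(item["event_time"], []).append(item)

    return {
        "name": enterprise_name,      # the time axis is named after the enterprise
        "nodes": nodes,
        "highlighted": list(nodes),   # only nodes that received a list
    }


# Usage with made-up data: the repeated entry is dropped before mapping.
sample = [
    {"event_time": date(2018, 1, 5), "event": "Series A financing"},
    {"event_time": date(2019, 3, 1), "event": "New office opened"},
    {"event_time": date(2018, 1, 5), "event": "Series A financing"},
]
track = build_track_data("Example Co", sample)
```

Keeping the highlighted nodes as a separate list means a display layer can render the full time axis or calendar and mark only the dates that actually carry an enterprise history information list.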
6. The data storage device of claim 5, further comprising:
a retrieval unit, configured to, if an entered retrieval keyword is detected, compare the retrieval keyword with the enterprise history information lists and acquire, as a retrieval push result, the enterprise history information lists whose Pearson similarity with the retrieval keyword exceeds a preset similarity threshold.
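A minimal Python sketch of threshold-based retrieval by Pearson similarity follows. The claim does not state how text is turned into numeric vectors, so the word-count vectors over a vocabulary shared by the query and all candidate texts are an assumption, as are the function names `pearson`, `retrieve` and the default threshold of 0.5.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n == 0:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # The coefficient is undefined for a constant vector; treat as no match.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def retrieve(query, history_texts, threshold=0.5):
    """Return the history texts whose Pearson similarity to the query exceeds the threshold."""
    # Assumption: each text is represented by word counts over one vocabulary
    # built from the query and every candidate text.
    counts = [Counter(text.lower().split()) for text in history_texts]
    query_counts = Counter(query.lower().split())
    vocab = sorted(set(query_counts).union(*counts))
    query_vec = [query_counts[word] for word in vocab]
    results = []
    for text, count in zip(history_texts, counts):
        text_vec = [count[word] for word in vocab]
        if pearson(query_vec, text_vec) > threshold:
            results.append(text)
    return results


# Usage with made-up history entries.
entries = ["acme acquires beta", "gamma opens new office"]
hits = retrieve("acme acquires beta", entries)
```

Building the vocabulary over the whole candidate set, rather than over each query and text pair, avoids constant vectors in the common case where a text contains every word of the pair.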
7. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the data storage method according to any of claims 1 to 4 when executing the computer program.
8. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the data storage method according to any one of claims 1 to 4.
CN201910370895.0A 2019-05-06 2019-05-06 Data storage method, device, computer equipment and storage medium Active CN110222249B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910370895.0A CN110222249B (en) 2019-05-06 2019-05-06 Data storage method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910370895.0A CN110222249B (en) 2019-05-06 2019-05-06 Data storage method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110222249A CN110222249A (en) 2019-09-10
CN110222249B CN110222249B (en) 2024-05-31

Family

ID=67820532

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910370895.0A Active CN110222249B (en) 2019-05-06 2019-05-06 Data storage method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110222249B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807702A (en) * 2019-11-05 2020-02-18 深圳前海微众银行股份有限公司 Method, device, equipment and storage medium for managing information after loan
CN111061826A (en) * 2019-12-19 2020-04-24 瞰湖科技(武汉)有限公司 Vehicle data information sorting method, storage medium and system
CN113553407B (en) * 2021-06-18 2022-09-27 北京百度网讯科技有限公司 Event tracing method and device, electronic equipment and storage medium
CN114021788B (en) * 2021-10-25 2022-07-26 深圳市维度数据科技股份有限公司 Prediction method, prediction device, electronic equipment and storage medium
CN115964190B (en) * 2022-12-07 2023-07-14 中科雨辰科技有限公司 Data processing system for updating historical event information

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107123049A (en) * 2017-05-02 2017-09-01 灯塔财经信息有限公司 Listed company's dragon and tiger list region display methods and system based on electronic map
CN107464068A (en) * 2017-09-18 2017-12-12 前海梧桐(深圳)数据有限公司 Enterprise development trend forecasting method and its system based on neutral net
JP2018041350A (en) * 2016-09-09 2018-03-15 カシオ計算機株式会社 Information processing device and program
CN108133047A (en) * 2018-01-15 2018-06-08 北京金堤科技有限公司 Methods of exhibiting, device and the electronic equipment of enterprise's collection of illustrative plates
CN109360057A (en) * 2018-10-12 2019-02-19 平安科技(深圳)有限公司 Information-pushing method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN110222249A (en) 2019-09-10

Similar Documents

Publication Publication Date Title
CN110222249B (en) Data storage method, device, computer equipment and storage medium
US11853334B2 (en) Systems and methods for generating and using aggregated search indices and non-aggregated value storage
JP4919515B2 (en) Duplicate document detection and display function
Bergman White paper: the deep web: surfacing hidden value
CN108509547B (en) Information management method, information management system and electronic equipment
US9922092B2 (en) Devices, systems, and methods for context management
US20090198677A1 (en) Document Comparison Method And Apparatus
US20030172357A1 (en) Knowledge management using text classification
US20100115003A1 (en) Methods For Merging Text Snippets For Context Classification
US20100115001A1 (en) Methods For Pairing Text Snippets To File Activity
AU2004201344A1 (en) Computer searching with associations
MX2009000584A (en) RANKING FUNCTIONS USING AN INCREMENTALLY-UPDATABLE, MODIFIED NAÏVE BAYESIAN QUERY CLASSIFIER.
CN106294595A (en) A kind of document storage, search method and device
CN106250552B (en) Aggregating WEB pages on search engine results pages
US10783195B2 (en) System and method for constructing search results
US20080147631A1 (en) Method and system for collecting and retrieving information from web sites
US20120239657A1 (en) Category classification processing device and method
US20170147652A1 (en) Search servers, end devices, and search methods for use in a distributed network
US20070208684A1 (en) Information collection support apparatus, method of information collection support, computer readable medium, and computer data signal
CN108763961B (en) Big data based privacy data grading method and device
US10255379B2 (en) System and method for displaying timeline search results
US10474700B2 (en) Robust stream filtering based on reference document
CN105243073A (en) Bookmark access method and device and terminal
EP4002152A1 (en) Data tagging and synchronisation system
CN115292478A (en) Method, device, equipment and storage medium for recommending search content

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant