CN113360780A - Big data based information recommendation method and system - Google Patents

Big data based information recommendation method and system

Info

Publication number
CN113360780A
CN113360780A CN202110911121.1A CN202110911121A CN113360780A CN 113360780 A CN113360780 A CN 113360780A CN 202110911121 A CN202110911121 A CN 202110911121A CN 113360780 A CN113360780 A CN 113360780A
Authority
CN
China
Prior art keywords
information
result
recommendation
feature
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110911121.1A
Other languages
Chinese (zh)
Inventor
杨昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Heima Qifu Technology Co ltd
Original Assignee
Beijing Heima Qifu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Heima Qifu Technology Co ltd
Priority to CN202110911121.1A
Publication of CN113360780A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an information recommendation method and system based on big data. The method comprises: obtaining a first information set; performing feature extraction on the first information set to obtain a first feature information set; obtaining a first information source; performing cluster analysis on the first information source to obtain a first clustering result, and constructing a first information source database according to the first clustering result; traversing the first feature information set in the first information source database and marking the feature information that belongs to the first clustering result to obtain first label information; performing dynamic sorting according to the quantity of the first label information to obtain a first dynamic sorting result; and inputting the first dynamic sorting result into a first recommendation model to obtain a first output result, wherein the first output result comprises first recommendation information and second recommendation information. This addresses the technical problem that, in the prior art, information obtained mainly by analyzing a user's historical data lacks real-time performance.

Description

Big data based information recommendation method and system
Technical Field
The invention relates to the technical field of the general public, in particular to an information recommendation method and system based on big data.
Background
With the development of the internet and the arrival of the big data era, people have gradually moved from an era of information scarcity to an era of information overload, and recommendation systems emerged so that users can efficiently obtain the information they need from massive amounts of information. The main task of a recommendation system is to connect users with information: on the one hand it helps users find information that is valuable to them, and on the other hand it presents information to the users who are interested in it, which is a win-win for information consumers and information producers.
A recommendation system based on big data learns a user's preferences by analyzing the user's history, so that information of interest is actively recommended to the user and the user's need for personalized recommendation is met.
However, in the process of implementing the technical solution in the embodiments of the present application, the inventors found that the above technology has at least the following technical problem:
In the prior art, recommendation is mainly performed by analyzing a user's historical data, so the obtained information lags behind and lacks real-time performance.
Disclosure of Invention
The embodiments of the present application provide an information recommendation method and system based on big data, which solve the technical problem that, in the prior art, recommendation is mainly performed by analyzing a user's historical data, so the obtained information lags behind and has poor real-time performance. Features are extracted from the information acquired by a big data platform, matched against different information source groups and labeled; the labels are dynamically sorted according to the trend in their number; and the dynamic sorting result is analyzed with an intelligent model. This achieves the technical effect that the recommendation information has real-time performance.
In view of the foregoing problems, embodiments of the present application provide an information recommendation method and system based on big data.
In a first aspect, an embodiment of the present application provides an information recommendation method based on big data, the method comprising: obtaining a first information set; performing feature extraction on the first information set to obtain a first feature information set; obtaining a first information source; performing cluster analysis on the first information source to obtain a first clustering result, and constructing a first information source database according to the first clustering result; traversing the first feature information set in the first information source database and marking the feature information that belongs to the first clustering result to obtain first label information; performing dynamic sorting according to the quantity of the first label information to obtain a first dynamic sorting result; and inputting the first dynamic sorting result into a first recommendation model to obtain a first output result, wherein the first output result comprises first recommendation information and second recommendation information.
In a second aspect, an embodiment of the present application provides an information recommendation system based on big data, the system comprising: a first obtaining unit, configured to obtain a first information set; a second obtaining unit, configured to perform feature extraction on the first information set to obtain a first feature information set; a third obtaining unit, configured to obtain a first information source; a fourth obtaining unit, configured to perform cluster analysis on the first information source to obtain a first clustering result, and to construct a first information source database according to the first clustering result; a fifth obtaining unit, configured to traverse the first feature information set in the first information source database and mark the feature information that belongs to the first clustering result to obtain first label information; a sixth obtaining unit, configured to perform dynamic sorting according to the quantity of the first label information to obtain a first dynamic sorting result; and a seventh obtaining unit, configured to input the first dynamic sorting result into a first recommendation model to obtain a first output result, wherein the first output result comprises first recommendation information and second recommendation information.
In a third aspect, an embodiment of the present application provides a big data-based information recommendation system, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the method according to any one of the first aspect when executing the program.
One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:
The method obtains a first information set; performs feature extraction on the first information set to obtain a first feature information set; obtains a first information source; performs cluster analysis on the first information source to obtain a first clustering result, and constructs a first information source database according to the first clustering result; traverses the first feature information set in the first information source database and marks the feature information that belongs to the first clustering result to obtain first label information; and performs dynamic sorting according to the quantity of the first label information to obtain a first dynamic sorting result. With these steps, the embodiments of the present application provide an information recommendation method and system based on big data that extract features from the information acquired by a big data platform, match them against different information source groups and label them, sort dynamically according to the trend in the number of labels, and analyze the dynamic sorting result with an intelligent model, thereby achieving the technical effect that the recommendation information has real-time performance.
The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the description, and so that the above and other objects, features, and advantages of the present application are more readily apparent, the detailed description of the present application follows.
Drawings
FIG. 1 is a schematic flow chart of an information recommendation method based on big data according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of another big data-based information recommendation method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of another big data-based information recommendation method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an information recommendation system based on big data according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an exemplary electronic device according to an embodiment of the present application.
Description of reference numerals: a first obtaining unit 11, a second obtaining unit 12, a third obtaining unit 13, a fourth obtaining unit 14, a fifth obtaining unit 15, a sixth obtaining unit 16, a seventh obtaining unit 17, an electronic device 300, a memory 301, a processor 302, a communication interface 303, and a bus architecture 304.
Detailed Description
The embodiments of the present application provide an information recommendation method and system based on big data, which solve the technical problem that, in the prior art, recommendation is mainly performed by analyzing a user's historical data, so the obtained information lags behind and has poor real-time performance. Features are extracted from the information acquired by a big data platform, matched against different information source groups and labeled; the labels are dynamically sorted according to the trend in their number; and the dynamic sorting result is analyzed with an intelligent model. This achieves the technical effect that the recommendation information has real-time performance.
Summary of the application
With the development of the internet and the arrival of the big data era, people have gradually moved from an era of information scarcity to an era of information overload, and recommendation systems emerged so that users can efficiently obtain the information they need from massive amounts of information. The main task of a recommendation system is to connect users with information: on the one hand it helps users find information that is valuable to them, and on the other hand it presents information to the users who are interested in it, which is a win-win for information consumers and information producers. A recommendation system based on big data learns a user's preferences by analyzing the user's history, so that information of interest is actively recommended to the user and the user's need for personalized recommendation is met. However, in the prior art, recommendation is mainly performed by analyzing a user's historical data, so the obtained information lags behind and lacks real-time performance.
In view of the above technical problems, the technical solution provided by the present application has the following general idea:
The embodiments of the present application provide an information recommendation method based on big data, the method comprising: obtaining a first information set; performing feature extraction on the first information set to obtain a first feature information set; obtaining a first information source; performing cluster analysis on the first information source to obtain a first clustering result, and constructing a first information source database according to the first clustering result; traversing the first feature information set in the first information source database and marking the feature information that belongs to the first clustering result to obtain first label information; performing dynamic sorting according to the quantity of the first label information to obtain a first dynamic sorting result; and inputting the first dynamic sorting result into a first recommendation model to obtain a first output result, wherein the first output result comprises first recommendation information and second recommendation information.
Having thus described the general principles of the present application, various non-limiting embodiments thereof will now be described in detail with reference to the accompanying drawings.
Example one
As shown in fig. 1, an embodiment of the present application provides an information recommendation method based on big data, where the method includes:
S100: obtaining a first set of information;
Specifically, the first information set refers to usage data of a service product, acquired through big data within a set time interval. Two non-limiting examples are the books that readers borrow and purchase in a library, and the shops and products that users browse and purchase on an online shopping platform. Preferably, the collected information is stored by time, and the time interval is set according to the actual situation of the service product, so that the information within the interval reflects the product's recent usage characteristics. Obtaining the first information set facilitates the subsequent information feedback processing.
S200: performing feature extraction on the first information set to obtain a first feature information set;
Specifically, the first feature information set is feature information that characterizes the attributes of the service product, obtained by performing feature extraction on the first information set. Continuing the library example above, it may include the type of books borrowed (literature, learning materials, mathematics), the type of books purchased, the borrowing time and duration, the borrowing frequency, the number of books purchased, and other information. Feature extraction preferably uses a feature extraction model trained on a convolutional neural network; convolution can serve as a feature extractor in machine learning, so the extracted feature information is concentrated and representative. The convolution features of the first information set are obtained and form an accurate first feature information set. Feature extraction yields the main attribute information of the service product, reduces data redundancy, and improves data processing efficiency.
S300: obtaining a first information source;
S400: performing clustering analysis on the first information source to obtain a first clustering result, and constructing a first information source database according to the first clustering result;
Specifically, the first information source refers to data, read and stored through big data, about the information providers corresponding to the first information set; optionally and without limitation, these are the users, enterprises, groups, organizations and the like that use the service product. Cluster analysis groups data objects according to information found in the data that describes the objects and their relationships, with the aim that objects within a group are similar to each other while objects in different groups are unrelated. The groups of the first information source are classified and clustered according to their attributes, and the classification depends on the service object of the recommendation system. For example, in a library, the groups that borrow and purchase books are preferably divided by profession, such as students, civil engineers, lawyers, writers and the like; on an online store platform, the division can be made mainly by the age intervals of the groups that browse and purchase. Further, the first clustering result is obtained by classifying with a cluster analysis method that matches the first information source well. The first information source database refers to the first information source data set stored and managed according to the first clustering result; optionally, the data in the first information source database can be updated in real time as the first information set is updated. Clustering the first information source yields cluster groups of different categories that make up the database, which facilitates management and fast retrieval.
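Steps S300 to S400 can be sketched roughly as follows for the online-store example, where sources are grouped by age interval. This is only an illustrative sketch: the record fields, the interval boundaries, and the names `AGE_BOUNDS`, `AGE_LABELS` and `cluster_sources` are assumptions that do not appear in the application, and fixed-interval bucketing stands in for whichever cluster analysis method an implementation would actually select.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical age-interval boundaries; the application leaves the intervals open.
AGE_BOUNDS = [18, 30, 45, 60]
AGE_LABELS = ["<18", "18-29", "30-44", "45-59", "60+"]


def cluster_sources(sources):
    """Group source records by age interval.

    Returns a dict mapping cluster label -> list of source ids, which plays
    the role of the first information source database in this sketch.
    """
    database = defaultdict(list)
    for record in sources:
        # bisect_right finds the interval that contains the record's age.
        label = AGE_LABELS[bisect_right(AGE_BOUNDS, record["age"])]
        database[label].append(record["id"])
    return dict(database)


# Hypothetical source records for illustration only.
sources = [
    {"id": "u1", "age": 17},
    {"id": "u2", "age": 24},
    {"id": "u3", "age": 29},
    {"id": "u4", "age": 52},
]
source_db = cluster_sources(sources)
```

Keeping the database as a mapping from cluster label to members makes the later traversal and labeling step a simple lookup per cluster.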
S500: traversing operation is carried out on the first feature information set in the first information source database, and feature information belonging to the first clustering result is marked to obtain first label information;
S600: performing dynamic sorting according to the quantity of the first label information to obtain a first dynamic sorting result;
Specifically, the first label information is obtained by comparing and matching the first feature information set against the different categories in the first information source database; first feature information that belongs to a given category of the first information source is labeled with that data category from the first clustering result. For example, in a library, learning books may be labeled with the number of students, teachers and lawyers who borrow them and the corresponding borrowing amounts, and with the number of students, teachers and lawyers who purchase them and the corresponding purchase amounts. Further, the first dynamic sorting result refers to sorting the first feature information set according to the quantity of the first label information. For example, given the number of people in each profession who borrow learning-material books, the current number of borrowers and the number of borrowers over a future period can be predicted from the trend in those numbers, and dynamic sorting is performed according to how the predicted information changes; optionally, the predicted information is inferred from the linear relationship of the changes in the dynamic ranking. The first dynamic sorting result makes it possible to predict the ranking for the present and for a future period, and the ranking can be adjusted according to real-time feedback from the first information set, so that the ranking result is timely.
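The dynamic sorting in S600 can be illustrated with a small sketch that assumes the optional linear-relationship prediction mentioned above. The label counts per period, the group names, and the functions `linear_forecast` and `dynamic_sort` are hypothetical; an ordinary least-squares line is used here as one plausible reading of "linear relation", not as the method the application prescribes.

```python
def linear_forecast(counts):
    """Fit y = a + b*t by least squares over t = 0..n-1 and predict t = n.

    `counts` must be a non-empty list of label counts, oldest period first.
    """
    n = len(counts)
    if n == 1:
        return float(counts[0])
    t_mean = (n - 1) / 2
    y_mean = sum(counts) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(counts))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    # The fitted line passes through (t_mean, y_mean); extrapolate to t = n.
    return y_mean + slope * (n - t_mean)


def dynamic_sort(label_counts):
    """Rank groups by the forecast label count for the next period."""
    forecasts = {group: linear_forecast(c) for group, c in label_counts.items()}
    return sorted(forecasts, key=forecasts.get, reverse=True)


# Hypothetical label counts per period: borrowers of learning-material books.
label_counts = {
    "students": [40, 38, 36],
    "teachers": [10, 20, 30],
    "lawyers": [5, 5, 5],
}
ranking = dynamic_sort(label_counts)
```

In this made-up data the students still lead in the latest period, but the teachers' rising trend places them first in the forecast ranking, which is the kind of forward-looking reordering the paragraph describes.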
S700: and inputting the first dynamic sorting result information into a first recommendation model to obtain a first output result, wherein the first output result comprises first recommendation information and second recommendation information.
Specifically, the first output result is the recommendation information produced when the first dynamic sorting result is input into the first recommendation model for intelligent analysis. The first recommendation model is built on a neural network model and has the characteristics of such a model. An artificial neural network is an abstract mathematical model, proposed and developed on the basis of modern neuroscience, that aims to reflect the structure and function of the human brain. It is an operational model formed by a large number of interconnected nodes (also called neurons). Each node represents a specific output function called an excitation function, and each connection between two nodes carries a weighting value, called a weight, for the signal passing through that connection, which is equivalent to the memory of the artificial neural network. The output of the network depends on how the network is connected. The first recommendation model built on the neural network model can output an accurate first output result that contains the first recommendation information and the second recommendation information; through representation analysis, the best service product is predicted and recommended to each cluster group. The model therefore has strong analysis and computation capability and achieves accurate and efficient results.
Further, the inputting of the first dynamic sorting result information into the first recommendation model to obtain a first output result, wherein the first output result includes first recommendation information and second recommendation information, that is, step S700, includes:
S710: inputting the first dynamic sorting result information into the first recommendation model;
S720: the first recommendation model is obtained by training on multiple sets of training data, and each set of data in the multiple sets of data comprises: the first dynamic sorting result information and identification information identifying the first output result;
S730: obtaining a first output result.
Specifically, the first recommendation model is a neural network model in machine learning, which reflects many basic features of human brain function and is a highly complex nonlinear dynamic learning system. The first recommendation model is obtained by training on multiple sets of training data, each of which comprises the first dynamic sorting result information and identification information identifying the first output result. The first recommendation model continuously corrects itself, and the supervised learning process ends when its output reaches a preset accuracy or a convergence state. Training makes the first recommendation model process input data more accurately, so the first output result is more accurate; this achieves the technical effects of obtaining data information accurately and making the evaluation result more intelligent.
Further, where the first recommendation model is obtained by training on multiple sets of training data, each set comprising the first dynamic sorting result information and identification information identifying the first output result, the method further includes step S800, as shown in fig. 2:
S810: evaluating the accuracy of the first output result to obtain a first performance index of the first recommendation model;
S820: deleting the first sorting information in the first dynamic sorting result to obtain a second dynamic sorting result;
S830: training the first recommendation model by using the second dynamic sorting result to obtain a second output result;
S840: evaluating the accuracy of the second output result to obtain a second performance index of the first recommendation model;
S850: obtaining a first performance threshold;
S860: calculating a difference value between the first performance index and the second performance index, wherein if the difference value meets the first performance threshold, the first sorting information is deletable information, and a first deletion instruction is obtained;
S870: deleting the first sorting information according to the first deletion instruction.
Specifically, the first performance index is the accuracy of the first output result, evaluated with the identification information of the first output result while every group of sorting information in the first dynamic sorting result is present; it serves as the reference index. The second dynamic sorting result is obtained by deleting one randomly chosen group of sorting information, namely the first sorting information, from the first dynamic sorting result. Further, the second dynamic sorting result is input into the first recommendation model as training data to obtain the second output result; the second performance index is the accuracy of the first recommendation model trained on the second dynamic sorting result, again evaluated with the identification information of the first output result. If the difference between the first performance index and the second performance index meets the first performance threshold, the two indexes differ little, which indicates that the first sorting information has little influence on the output result, and the first sorting information is deleted. Each group of sorting information in the first dynamic sorting result is processed in this way by traversal; sorting data with little influence is deleted, which reduces the redundancy of the training data and improves the training efficiency of the first recommendation model.
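The pruning loop of S810 to S870 can be sketched as below. The callable `train_and_score` is a hypothetical stand-in for training the first recommendation model and evaluating its accuracy; the toy scorer, the contribution values, and the choice to refresh the baseline after each deletion are assumptions made for illustration, since the application does not specify them.

```python
def prune_rankings(rankings, train_and_score, threshold):
    """Drop sorting groups whose removal changes accuracy by at most `threshold`.

    `train_and_score` is a hypothetical callable that trains the recommendation
    model on the given groups and returns its accuracy in [0, 1].
    """
    kept = list(rankings)
    baseline = train_and_score(kept)          # first performance index
    for candidate in list(kept):
        reduced = [r for r in kept if r != candidate]
        if not reduced:
            break                             # never delete the last group
        score = train_and_score(reduced)      # second performance index
        if abs(baseline - score) <= threshold:
            kept = reduced                    # small influence: deletable
            baseline = score                  # assumption: re-baseline after deletion
    return kept


# Toy stand-in for training: each group contributes a fixed share of accuracy.
CONTRIBUTION = {"students": 0.50, "teachers": 0.30, "lawyers": 0.01}


def toy_score(groups):
    return sum(CONTRIBUTION[g] for g in groups)


kept = prune_rankings(["students", "teachers", "lawyers"], toy_score, threshold=0.05)
```

With these made-up contributions only the "lawyers" group falls within the threshold, so it is the one removed from the training data.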
Further, the performing of feature extraction on the first information set to obtain a first feature information set, that is, step S200, includes:
S210: inputting the first information set into a first feature extraction model to obtain a first feature set;
S220: performing missing value analysis on the first feature set to obtain a first analysis result;
S230: obtaining a first missing threshold;
S240: determining whether the first analysis result is within the first missing threshold;
S250: if the first analysis result is within the first missing threshold, obtaining a first supplement instruction;
S260: supplementing the first feature set according to the first supplement instruction and the first analysis result;
S270: taking the supplemented first feature set as the first feature information set.
Specifically, the first feature extraction model refers to an intelligent analysis model for extracting information features, and the first feature set is the set of feature information obtained by inputting the first information set into the first feature extraction model. The first feature extraction model is preferably trained on a convolutional neural network; convolution can serve as a feature extractor in machine learning, so the extracted feature information is concentrated and representative. The convolution features of the first information set are obtained and form an accurate first feature set. Feature extraction yields the main attribute information of the service product, reduces data redundancy, and improves data processing efficiency.
Further, missing values are a common problem in data processing. If they are not handled properly, some analysis procedures simply discard the cases with missing values, and the accuracy of the analysis results may drop, leading to biased or even erroneous conclusions. The first analysis result is data that represents the degree to which information in the first feature set is missing, obtained by performing missing value analysis on the first feature set. Optionally, missing information is identified by comparing the information types of the first feature set one by one with the information types of the same category groups in the historical data, and the first feature information with missing information is marked and stored.
Furthermore, the first missing threshold is a value representing the maximum acceptable degree of missingness of the first feature set; its specific value is determined according to the actual situation of the service object of the recommendation system. The first analysis result is compared with the first missing threshold. If the first analysis result is smaller than or equal to the first missing threshold, the amount of missing data is small and the data is easy to supplement, so the missing values of the first feature set are supplemented according to the first supplement instruction to obtain the first feature information set. Analyzing the missing values of the first feature set and supplementing the missing information ensures the accuracy and integrity of the resulting first feature information set.
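The missing value check of S220 to S250 might look like the following sketch. It assumes that the "degree of missingness" is the fraction of expected fields that are absent, and that the expected fields have already been derived from the historical data for the same category group; the field names, the threshold value, and the functions `missing_ratio` and `should_supplement` are all hypothetical.

```python
def missing_ratio(feature_set, expected_fields):
    """Return (ratio, missing), where missing lists (record index, field) gaps."""
    missing = [
        (index, field)
        for index, record in enumerate(feature_set)
        for field in expected_fields
        if record.get(field) is None
    ]
    total = len(feature_set) * len(expected_fields)
    return (len(missing) / total if total else 0.0), missing


def should_supplement(ratio, missing_threshold):
    """A supplement instruction is issued only when the gap is small enough."""
    return ratio <= missing_threshold


# Hypothetical fields, assumed to come from historical data of the same group.
EXPECTED = ["book_type", "borrow_days", "frequency"]
features = [
    {"book_type": "literature", "borrow_days": 14, "frequency": 2},
    {"book_type": "mathematics", "borrow_days": None, "frequency": 1},
]
ratio, missing = missing_ratio(features, EXPECTED)
supplement_needed = should_supplement(ratio, missing_threshold=0.2)
```

Returning the list of gaps alongside the ratio lets the later supplement step read the missing positions directly instead of rescanning the feature set.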
Further, supplementing the first feature set according to the first supplement instruction and the first analysis result, step S260, includes:
S261: reading the first feature set according to the first analysis result to obtain first missing data;
S262: determining a first missing type according to the first missing data;
S263: inputting the first missing data and the first missing type into the first missing data processing model to obtain a first processing result, and supplementing the first feature set with the first processing result.
Specifically, the first missing data is the first feature set together with its corresponding missing information, as stored after comparing the information types of the first feature set one by one with the information types of the same category group in the historical data. The first missing type is the type of the first missing data determined according to the attributes of the missing information; the preferred division is into missing completely at random, missing not at random, and the like. The first missing data processing model is an intelligent model for performing supplementary processing on the first missing data, optionally an intelligent model based on neural network training. Its training mode is similar to that of the first recommendation model and uses multiple sets of data, where each set of data comprises the first missing data information and the first missing type information. The first processing result is repeatedly compared one by one with the historical data until no information type is missing, and the first feature set is then supplemented with the first processing result. Data with a small amount of missing information can be supplemented through the first missing data processing model, which ensures the integrity of the information and improves the accuracy of data processing.
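Steps S261 to S263 can be illustrated with the following minimal Python sketch. It deliberately replaces the neural-network-based first missing data processing model with a much simpler stand-in (filling a gap with the most frequent historical value), so it shows only the data flow of "missing data plus missing type in, supplemented feature set out". The `MissingType` enum, the function names, and the choice to leave not-at-random gaps unfilled are all assumptions of this sketch.

```python
from collections import Counter
from enum import Enum
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


class MissingType(Enum):
    COMPLETELY_AT_RANDOM = "MCAR"
    NOT_AT_RANDOM = "MNAR"


def most_common_value(history: List[Record], field: str) -> Optional[str]:
    """Most frequent non-missing historical value of a field, or None."""
    values = [r[field] for r in history if r.get(field) is not None]
    return Counter(values).most_common(1)[0][0] if values else None


def supplement(record: Record, missing_fields: List[str],
               missing_type: MissingType, history: List[Record]) -> Record:
    """Return a copy of the record with missing fields filled where possible.

    Stand-in for the trained model: only gaps that are missing completely
    at random are filled from history; other gaps are left as None, since a
    frequency-based fill would likely be biased for them.
    """
    filled = dict(record)
    for field in missing_fields:
        if missing_type is MissingType.COMPLETELY_AT_RANDOM:
            filled[field] = most_common_value(history, field)
        else:
            filled.setdefault(field, None)
    return filled
```

In a real system the dispatch on the missing type would be learned by the model rather than hard-coded as it is here.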
Further, if the first analysis result is not within the first missing threshold, step S240 further includes:
S241: obtaining a first deletion instruction, and deleting the first feature set of which the first analysis result is not within the first missing threshold;
S242: obtaining a first acquisition instruction, and acquiring the second information set;
S243: inputting the second information set and the first information set into the first feature extraction model to obtain a second feature set.
Specifically, when the first analysis result is not within the first missing threshold, the amount of first missing data is large, and information acquisition needs to be performed again based on the big data. Before acquisition, the first feature information whose first analysis result is not within the first missing threshold is deleted according to the first deletion instruction. Further, the second information set is the result obtained by reading the information source corresponding to the deleted first feature information and acquiring information from that source according to the first acquisition instruction. Furthermore, the second feature set is the result of inputting the second information set and the first information set into the first feature extraction model for feature extraction; the second feature set is processed in the same way as the first feature set until the features of the first feature information set are complete. Data with excessive information loss is deleted, the corresponding information is collected again based on the big data, and feature extraction is then performed to supplement the information, which achieves the technical effect of ensuring the integrity of the first feature information set.
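The delete-and-reacquire branch of steps S241 and S242 can be sketched as follows. This is an illustration under stated assumptions, not the embodiment's implementation: each record is assumed to carry a hypothetical `"source"` key identifying its information source, and `collect` is a caller-supplied function standing in for acquisition from the big data platform.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, Optional[str]]


def split_by_missing(records: List[Record], expected_fields: List[str],
                     missing_threshold: float) -> Tuple[List[Record], List[Record]]:
    """Separate records within the missing threshold from those beyond it."""
    if not expected_fields:
        raise ValueError("expected_fields must not be empty")
    kept: List[Record] = []
    dropped: List[Record] = []
    for record in records:
        missing = sum(record.get(f) is None for f in expected_fields)
        ratio = missing / len(expected_fields)
        (kept if ratio <= missing_threshold else dropped).append(record)
    return kept, dropped


def recollect(dropped: List[Record],
              collect: Callable[[str], Record]) -> List[Record]:
    """Re-acquire each deleted record from its information source.

    Records without a usable "source" key cannot be re-acquired and are skipped.
    """
    return [collect(r["source"]) for r in dropped if r.get("source") is not None]
```

The re-acquired records would then be passed, together with the first information set, through feature extraction again (step S243), which is outside the scope of this sketch.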
Further, as shown in FIG. 3, step S900 of the method includes:
S910: constructing a first change curve according to the first dynamic sorting result;
S920: obtaining a first change trend sorting result according to the first change curve;
S930: correcting the first output result according to the first change trend sorting result to obtain a third output result.
Specifically, the first change curve is a curve obtained by fitting multiple sets of data in the first dynamic sorting result; the fitting manner is not limited. As a non-limiting example, in a library, one of the first clustering results is the borrowing volume of literature books by students: a set time interval is divided into several periods, the borrowing volume of each period is recorded, and a fitted borrowing curve is obtained. Other professions, such as teachers, lawyers and writers, are fitted in the same way, which yields the first change curves. Further, the first change trend sorting result is the result of performing trend analysis on the first change curves. As a non-limiting example, after the first change curves of the different professions are fitted, suppose the students' curve shows the most pronounced upward trend although their borrowing volume ranks second, while the teachers' borrowing volume ranks first but their recent curve is relatively flat; students are then, for the time being, the first recommended users of literature books. This example merely illustrates the principle and does not limit the actual, more complex analysis process. Furthermore, in the manner described above, the first output result is corrected with the first change trend sorting result to obtain the third output result. Analyzing the first dynamic sorting curves reveals the potential development trends of the different clustering information, and information is recommended in real time or predictively according to the inference result, which improves the ability to track changes in user attributes.
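The library example can be made concrete with a short Python sketch. Since the fitting manner is not limited, the sketch assumes a straight-line least-squares fit per group and ranks groups by the fitted slope; the function names and the sample borrowing counts are hypothetical.

```python
from typing import Dict, List


def slope(counts: List[float]) -> float:
    """Least-squares slope of the counts against period index 0..n-1."""
    n = len(counts)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(counts))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def rank_by_trend(series: Dict[str, List[float]]) -> List[str]:
    """Order groups from the steepest upward trend to the flattest."""
    return sorted(series, key=lambda group: slope(series[group]), reverse=True)


# Hypothetical borrowing volumes per period for literature books.
borrowing = {
    "teachers": [50, 51, 50, 51],  # highest volume, nearly flat
    "students": [10, 20, 30, 40],  # lower volume, clearly rising
}
trend_order = rank_by_trend(borrowing)
```

With these sample values, students are ranked ahead of teachers by trend even though teachers borrow more in every period, which mirrors the correction of the first output result described above.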
To sum up, the information recommendation method and system based on big data provided by the embodiment of the application have the following technical effects:
1. The information recommendation method and system based on big data provided by the embodiment of the application adopt the following steps: obtaining a first information set; performing feature extraction on the first information set to obtain a first feature information set; obtaining a first information source; performing clustering analysis on the first information source to obtain a first clustering result, and constructing a first information source database according to the first clustering result; performing a traversal operation on the first feature information set in the first information source database, and marking feature information belonging to the first clustering result to obtain first label information; and performing dynamic sorting according to the number information of the first label information to obtain a first dynamic sorting result. By extracting the features of the information acquired by the big data platform, matching the information with different information source groups, marking it, dynamically sorting according to the changing trend of the number of marks, and analyzing the dynamic sorting result with an intelligent model, the technical effect that the recommendation information has real-time performance is achieved.
2. By processing each group of ranking information in the first dynamic ranking result information in a traversing manner, ranking data with small influence are deleted, the redundancy of training data is reduced, and the training efficiency of the first recommendation model is improved.
3. Analyzing the first dynamic sorting curves reveals the potential development trends of the different clustering information, and information is recommended in real time or predictively according to the inference result, which improves the ability to track changes in user attributes.
Example two
Based on the same inventive concept as the big data based information recommendation method in the foregoing embodiment, as shown in fig. 4, an embodiment of the present application provides a big data based information recommendation system, where the system includes:
a first obtaining unit 11, the first obtaining unit 11 being configured to obtain a first information set;
a second obtaining unit 12, where the second obtaining unit 12 is configured to perform feature extraction on the first information set to obtain a first feature information set;
a third obtaining unit 13, where the third obtaining unit 13 is configured to obtain a first information source;
a fourth obtaining unit 14, where the fourth obtaining unit 14 is configured to perform cluster analysis on the first information source, obtain a first clustering result, and construct a first information source database according to the first clustering result;
a fifth obtaining unit 15, where the fifth obtaining unit 15 is configured to perform traversal operation on the first feature information set in the first information source database, mark feature information belonging to the first clustering result, and obtain first label information;
a sixth obtaining unit 16, where the sixth obtaining unit 16 is configured to perform dynamic sorting according to the number of the first tag information, and obtain a first dynamic sorting result;
a seventh obtaining unit 17, where the seventh obtaining unit 17 is configured to input the first dynamic ranking result information into a first recommendation model, and obtain a first output result, where the first output result includes first recommendation information and second recommendation information.
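The work of the fifth and sixth obtaining units (traversal, marking, and dynamic sorting by label count) can be sketched as follows. The representation is an assumption made for illustration: each cluster of the first clustering result is modeled as a set of feature keywords, and the first label information as a per-cluster count.

```python
from collections import Counter
from typing import Dict, List, Set


def tag_features(features: List[str], clusters: Dict[str, Set[str]]) -> Counter:
    """Traverse the features and count, per cluster, how many belong to it."""
    tags: Counter = Counter()
    for feature in features:
        for name, members in clusters.items():
            if feature in members:
                tags[name] += 1
    return tags


def dynamic_ranking(tags: Counter) -> List[str]:
    """Order clusters by descending label count."""
    return [name for name, _ in tags.most_common()]
```

Re-running `tag_features` on each new batch of feature information and re-sorting gives a ranking that changes over time, which is one possible reading of "dynamic sorting" here.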
Further, the system comprises:
a first input unit for inputting the first dynamic ranking result information into the first recommendation model;
a first training unit, configured to obtain the first recommendation model by training on multiple sets of training data, where each set of data in the multiple sets of data includes: the first dynamic sorting result information and identification information identifying the first output result;
an eighth obtaining unit, configured to obtain a first output result.
Further, the system comprises:
a ninth obtaining unit, configured to evaluate accuracy of the first output result, and obtain a first performance index of the first recommendation model;
a first deleting unit, configured to delete the first sequencing information in the first dynamic sequencing result to obtain a second dynamic sequencing result;
a tenth obtaining unit, configured to train the first recommendation model using the second dynamic sequencing result, and obtain a second output result;
an eleventh obtaining unit, configured to evaluate accuracy of the second output result, and obtain a second performance index of the first recommendation model;
a twelfth obtaining unit, configured to obtain a first performance threshold;
a thirteenth obtaining unit, configured to calculate a difference between the first performance index and the second performance index, where if the difference meets the first performance threshold, the first sequencing information is deletable information, and a first deletion instruction is obtained;
and a second deleting unit, configured to delete the first sequencing information according to the first deletion instruction.
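The deletion test performed by the ninth to thirteenth obtaining units resembles an ablation check, and can be sketched in Python as below. Two assumptions are made: `evaluate` is a caller-supplied function that stands in for "train the first recommendation model and evaluate the accuracy of its output", and a difference "meets" the first performance threshold when its absolute value does not exceed the threshold. Neither detail is fixed by the embodiment.

```python
from typing import Callable, Dict, List

# Hypothetical shape: sequencing entry name -> its recorded values.
Ranking = Dict[str, List[int]]


def deletable_entries(ranking: Ranking,
                      evaluate: Callable[[Ranking], float],
                      performance_threshold: float) -> List[str]:
    """Names of entries whose removal changes the performance index only slightly."""
    first_index = evaluate(ranking)
    deletable: List[str] = []
    for name in ranking:
        # Second dynamic sequencing result: the ranking without this entry.
        reduced = {k: v for k, v in ranking.items() if k != name}
        second_index = evaluate(reduced)
        if abs(first_index - second_index) <= performance_threshold:
            deletable.append(name)
    return deletable
```

Each entry is tested against the full ranking independently, so the function reports candidates for deletion; removing several of them at once would call for re-evaluation.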
Further, the system comprises:
a fourteenth obtaining unit, configured to input the first information set into a first feature extraction model, and obtain a first feature set;
a fifteenth obtaining unit, configured to perform missing value analysis on the first feature set to obtain a first analysis result;
a sixteenth obtaining unit, configured to obtain a first missing threshold;
a first judging unit, configured to judge whether the first analysis result is within the first missing threshold;
a seventeenth obtaining unit, configured to obtain a first supplement instruction if the first analysis result is within the first missing threshold;
a first supplementing unit, configured to supplement the first feature set according to the first supplementing instruction and according to the first analysis result;
a first setting unit configured to take the supplemented first feature set as the first feature information set.
Further, the system comprises:
an eighteenth obtaining unit, configured to read the first feature set according to the first analysis result, and obtain first missing data;
a first determining unit, configured to determine a first missing type according to the first missing data;
a second input unit, configured to input the first missing data and the first missing type into the first missing data processing model, obtain a first processing result, and supplement the first feature set with the first processing result.
Still further, the system further comprises:
a nineteenth obtaining unit, configured to obtain a first deletion instruction and delete the first feature set of which the first analysis result is not within the first missing threshold;
a twentieth obtaining unit, configured to obtain a first acquisition instruction and acquire the second information set;
a twenty-first obtaining unit, configured to input the second information set and the first information set into the first feature extraction model, and obtain a second feature set.
Further, the system further comprises:
a first construction unit, configured to construct a first change curve according to the first dynamic sequencing result;
a twenty-second obtaining unit, configured to obtain a first change trend sorting result according to the first change curve;
and a twenty-third obtaining unit, configured to correct the first output result according to the first change trend sorting result to obtain a third output result.
Exemplary electronic device
The electronic device of the embodiment of the present application is described below with reference to FIG. 5.
Based on the same inventive concept as the big data based information recommendation method in the foregoing embodiments, the embodiment of the present application further provides a big data based information recommendation system, including: a processor coupled to a memory, the memory being used for storing a program that, when executed by the processor, causes the system to perform the method of any of the first aspects.
The electronic device 300 includes: processor 302, communication interface 303, memory 301. Optionally, the electronic device 300 may also include a bus architecture 304. Wherein, the communication interface 303, the processor 302 and the memory 301 may be connected to each other through a bus architecture 304; the bus architecture 304 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus architecture 304 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 5, but this is not intended to represent only one bus or type of bus.
Processor 302 may be a CPU, microprocessor, ASIC, or one or more integrated circuits for controlling the execution of programs in accordance with the teachings of the present application.
The communication interface 303 may be any device, such as a transceiver, for communicating with other devices or communication networks, such as an ethernet, a Radio Access Network (RAN), a Wireless Local Area Network (WLAN), a wired access network, and the like.
The memory 301 may be a ROM or other type of static storage device that can store static information and instructions, a RAM or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to such. The memory may be self-contained and coupled to the processor through the bus architecture 304. The memory may also be integral to the processor.
The memory 301 is used for storing computer-executable instructions for executing the present application, and is controlled by the processor 302 to execute. The processor 302 is configured to execute the computer-executable instructions stored in the memory 301, so as to implement a big data-based information recommendation method provided by the above-mentioned embodiments of the present application.
Optionally, the computer-executable instructions in the embodiments of the present application may also be referred to as application program codes, which are not specifically limited in the embodiments of the present application.
The embodiment of the application provides an information recommendation method based on big data, wherein the method comprises the following steps: obtaining a first set of information; performing feature extraction on the first information set to obtain a first feature information set; obtaining a first information source; performing clustering analysis on the first information source to obtain a first clustering result, and constructing a first information source database according to the first clustering result; traversing operation is carried out on the first feature information set in the first information source database, and feature information belonging to the first clustering result is marked to obtain first label information; performing dynamic sorting according to the number information of the first label information to obtain a first dynamic sorting result; and inputting the first dynamic sorting result information into a first recommendation model to obtain a first output result, wherein the first output result comprises first recommendation information and second recommendation information. The technical effect that the recommendation information has real-time performance is achieved by extracting the features of the information acquired by the big data platform, matching the information with different information source groups, marking, dynamically sequencing according to the number change trend of the marks and analyzing by using an intelligent model according to the dynamic sequencing result.
Those of ordinary skill in the art will understand that: the various numbers of the first, second, etc. mentioned in this application are only used for the convenience of description and are not used to limit the scope of the embodiments of this application, nor to indicate the order of precedence. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one" means one or more. At least two means two or more. "at least one," "any," or similar expressions refer to any combination of these items, including any combination of singular or plural items. For example, at least one (one ) of a, b, or c, may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or multiple.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device.
The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
The various illustrative logical units and circuits described in this application may be implemented or operated upon by design of a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of a method or algorithm described in the embodiments herein may be embodied directly in hardware, in a software unit executed by a processor, or in a combination of the two. The software unit may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. For example, a storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, which may be disposed in a terminal. In the alternative, the processor and the storage medium may reside in different components within the terminal. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Although the present application has been described in conjunction with specific features and embodiments thereof, it will be evident that various modifications and combinations can be made thereto without departing from the spirit and scope of the application. Accordingly, the specification and figures are merely exemplary of the present application as defined in the appended claims and are intended to cover any and all modifications, variations, combinations, or equivalents within the scope of the present application. It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations.

Claims (9)

1. A big data-based information recommendation method, wherein the method comprises the following steps:
obtaining a first set of information;
performing feature extraction on the first information set to obtain a first feature information set;
obtaining a first information source;
performing clustering analysis on the first information source to obtain a first clustering result, and constructing a first information source database according to the first clustering result;
traversing operation is carried out on the first feature information set in the first information source database, and feature information belonging to the first clustering result is marked to obtain first label information;
performing dynamic sorting according to the number information of the first label information to obtain a first dynamic sorting result;
and inputting the first dynamic sorting result information into a first recommendation model to obtain a first output result, wherein the first output result comprises first recommendation information and second recommendation information.
2. The method of claim 1, wherein the inputting the first dynamic ranking result information into a first recommendation model, obtaining a first output result, the first output result comprising first recommendation information, second recommendation information, comprises:
inputting the first dynamic sorting result information into the first recommendation model;
the first recommendation model is obtained by training multiple sets of training data, and each set of data in the multiple sets of data comprises: the first dynamic sorting result information and identification information identifying the first output result;
a first output result is obtained.
3. The method of claim 2, wherein the first recommendation model is trained from a plurality of sets of training data, each of the plurality of sets of data comprising: the first dynamic ranking result information and identification information identifying the first output result, including:
evaluating the accuracy of the first output result to obtain a first performance index of the first recommendation model;
deleting the first sequencing information in the first dynamic sequencing result to obtain a second dynamic sequencing result;
training the first recommendation model by using the second sequencing result to obtain a second output result;
evaluating the accuracy of the second output result to obtain a second performance index of the first recommendation model;
obtaining a first performance threshold;
calculating a difference value between the first performance index and the second performance index, wherein if the difference value meets the first performance threshold, the first sequencing information is deletable information, and a first deletion instruction is obtained;
and deleting the first sequencing information according to the first deleting instruction.
4. The method of claim 1, wherein said extracting features from said first information set to obtain a first feature information set comprises:
inputting the first information set into a first feature extraction model to obtain a first feature set;
performing missing value analysis on the first feature set to obtain a first analysis result;
obtaining a first missing threshold;
determining whether the first analysis result is within the first missing threshold;
if the first analysis result is within the first missing threshold, obtaining a first supplement instruction;
supplementing the first feature set according to the first supplement instruction and the first analysis result;
and taking the supplemented first feature set as the first feature information set.
5. The method of claim 4, wherein said supplementing said first feature set according to said first supplement instruction and said first analysis result comprises:
reading the first feature set according to the first analysis result to obtain first missing data;
determining a first missing type according to the first missing data;
and inputting the first missing data and the first missing type into the first missing data processing model to obtain a first processing result, and supplementing the first feature set by using the first processing result.
6. The method of claim 4, wherein if the first analysis result is not within the first missing threshold, the method further comprises:
obtaining a first deletion instruction, and deleting the first feature set of which the first analysis result is not within the first missing threshold;
acquiring a first acquisition instruction, and acquiring the second information set;
and inputting the second information set and the first information set into the first feature extraction model to obtain a second feature set.
7. The method of claim 1, wherein the method further comprises:
constructing a first change curve according to the first dynamic sorting result;
obtaining a first change trend sorting result according to the first change curve;
and correcting the first output result according to the first change trend sorting result to obtain a third output result.
8. A big data based information recommendation system, wherein the system comprises:
a first obtaining unit for obtaining a first set of information;
a second obtaining unit, configured to perform feature extraction on the first information set to obtain a first feature information set;
a third obtaining unit, configured to obtain a first information source;
a fourth obtaining unit, configured to perform cluster analysis on the first information source, obtain a first clustering result, and construct a first information source database according to the first clustering result;
a fifth obtaining unit, configured to perform traversal operation on the first feature information set in the first information source database, and mark feature information belonging to the first clustering result to obtain first label information;
a sixth obtaining unit, configured to perform dynamic sorting according to the number of the first tag information, and obtain a first dynamic sorting result;
a seventh obtaining unit, configured to input the first dynamic ranking result information into a first recommendation model, and obtain a first output result, where the first output result includes first recommendation information and second recommendation information.
9. A big-data based information recommendation system, comprising: a processor coupled with a memory for storing a program that, when executed by the processor, causes a system to perform the method of any of claims 1 to 7.
CN202110911121.1A 2021-08-10 2021-08-10 Big data based information recommendation method and system Pending CN113360780A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110911121.1A CN113360780A (en) 2021-08-10 2021-08-10 Big data based information recommendation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110911121.1A CN113360780A (en) 2021-08-10 2021-08-10 Big data based information recommendation method and system

Publications (1)

Publication Number Publication Date
CN113360780A true CN113360780A (en) 2021-09-07

Family

ID=77540915

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110911121.1A Pending CN113360780A (en) 2021-08-10 2021-08-10 Big data based information recommendation method and system

Country Status (1)

Country Link
CN (1) CN113360780A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113643286A (en) * 2021-10-12 2021-11-12 南通海美电子有限公司 Electronic component assembly detection method and system
CN113689247A (en) * 2021-10-27 2021-11-23 冰联(广州)网络科技有限公司 Block chain electronic ticket marking method and system based on information flow parallel connection
CN113762163A (en) * 2021-09-09 2021-12-07 杭州澳亚生物技术股份有限公司 GMP workshop intelligent monitoring management method and system
CN114038512A (en) * 2021-11-02 2022-02-11 卫星化学股份有限公司 Method and system for optimizing catalytic conditions for synthesizing acrylic acid
CN114037677A (en) * 2021-11-05 2022-02-11 安徽宇呈数据技术有限公司 Portable map acquisition equipment capable of accessing charge pal
CN114417954A (en) * 2021-12-01 2022-04-29 江苏权正检验检测有限公司 Information processing method and system for improving food detection effect
CN116578755A (en) * 2022-03-30 2023-08-11 江苏控智电子科技有限公司 Information analysis system and method based on artificial intelligence and big data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109597941A (en) * 2018-12-12 2019-04-09 拉扎斯网络科技(上海)有限公司 Sort method and device, electronic equipment and storage medium
CN110827129A (en) * 2019-11-27 2020-02-21 中国联合网络通信集团有限公司 Commodity recommendation method and device
CN111339406A (en) * 2020-02-17 2020-06-26 北京百度网讯科技有限公司 Personalized recommendation method, device, equipment and storage medium
CN112348629A (en) * 2020-10-26 2021-02-09 邦道科技有限公司 Commodity information pushing method and device
CN112632385A (en) * 2020-12-29 2021-04-09 中国平安人寿保险股份有限公司 Course recommendation method and device, computer equipment and medium
CN112667899A (en) * 2020-12-30 2021-04-16 杭州智聪网络科技有限公司 Cold start recommendation method and device based on user interest migration and storage equipment

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113762163A (en) * 2021-09-09 2021-12-07 杭州澳亚生物技术股份有限公司 GMP workshop intelligent monitoring management method and system
CN113643286A (en) * 2021-10-12 2021-11-12 南通海美电子有限公司 Electronic component assembly detection method and system
CN113643286B (en) * 2021-10-12 2023-06-13 南通海美电子有限公司 Electronic component assembly detection method and system
CN113689247A (en) * 2021-10-27 2021-11-23 冰联(广州)网络科技有限公司 Block chain electronic ticket marking method and system based on information flow parallel connection
CN113689247B (en) * 2021-10-27 2022-02-15 冰联(广州)网络科技有限公司 Block chain electronic ticket marking method and system based on information flow parallel connection
CN114038512A (en) * 2021-11-02 2022-02-11 卫星化学股份有限公司 Method and system for optimizing catalytic conditions for synthesizing acrylic acid
CN114038512B (en) * 2021-11-02 2024-04-30 卫星化学股份有限公司 Catalytic condition optimization method and system for synthesizing acrylic acid
CN114037677A (en) * 2021-11-05 2022-02-11 安徽宇呈数据技术有限公司 Portable map acquisition equipment capable of accessing charge pal
CN114417954A (en) * 2021-12-01 2022-04-29 江苏权正检验检测有限公司 Information processing method and system for improving food detection effect
CN114417954B (en) * 2021-12-01 2023-12-26 江苏权正检验检测有限公司 Information processing method and system for improving food detection effect
CN116578755A (en) * 2022-03-30 2023-08-11 江苏控智电子科技有限公司 Information analysis system and method based on artificial intelligence and big data
CN116578755B (en) * 2022-03-30 2024-01-09 张家口微智网络科技有限公司 Information analysis system and method based on artificial intelligence and big data

Similar Documents

Publication Publication Date Title
CN113360780A (en) Big data based information recommendation method and system
CN109033101B (en) Label recommendation method and device
CN111488385B (en) Data processing method and device based on artificial intelligence and computer equipment
CN111797320B (en) Data processing method, device, equipment and storage medium
CN113312468B (en) Conversation mode-based conversation recommendation method, device, equipment and medium
CN111949887A (en) Item recommendation method and device and computer-readable storage medium
CN111190968A (en) Data preprocessing and content recommendation method based on knowledge graph
CN114565196B (en) Multi-event trend prejudging method, device, equipment and medium based on government affair hotline
CN113554175A (en) Knowledge graph construction method and device, readable storage medium and terminal equipment
CN115809376A (en) Intelligent recommendation method based on big teaching data
CN110083766B (en) Query recommendation method and device based on meta-path guiding embedding
CN110968802A (en) User characteristic analysis method, analysis device and readable storage medium
CN113313470B (en) Employment type assessment method and system based on big data
CN108021713B (en) Document clustering method and device
CN113934937A (en) Intelligent content recommendation method and device, terminal and storage medium
CN113282831A (en) Search information recommendation method and device, electronic equipment and storage medium
CN110837732B (en) Method and device for identifying intimacy between target persons, electronic equipment and storage medium
CN112163415A (en) User intention identification method and device for feedback content and electronic equipment
CN112598405A (en) Business project data management method and system based on big data
CN117076770A (en) Data recommendation method and device based on graph calculation, storage value and electronic equipment
CN114238615B (en) Enterprise service result data processing method and system
CN114692978A (en) Social media user behavior prediction method and system based on big data
CN111460300B (en) Network content pushing method, device and storage medium
CN113704617A (en) Article recommendation method, system, electronic device and storage medium
Kuznietsova et al. Business intelligence techniques for missing data imputation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210907)