CN115564393A - Recruitment requirement similarity-based job recommendation method - Google Patents

Recruitment requirement similarity-based job recommendation method Download PDF

Info

Publication number
CN115564393A
Authority
CN
China
Prior art keywords
recruitment
job
text information
text
post
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211299760.8A
Other languages
Chinese (zh)
Other versions
CN115564393B (en)
Inventor
刘红岩
高歌
车尚锟
杜思霖
景昊
谢志辉
吴显仁
徐伟招
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Today Talent Information Technology Co ltd
Original Assignee
Shenzhen Today Talent Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Today Talent Information Technology Co ltd filed Critical Shenzhen Today Talent Information Technology Co ltd
Priority to CN202211299760.8A priority Critical patent/CN115564393B/en
Publication of CN115564393A publication Critical patent/CN115564393A/en
Application granted granted Critical
Publication of CN115564393B publication Critical patent/CN115564393B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management
    • G06Q10/105Human resources
    • G06Q10/1053Employment or hiring
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/194Calculation of difference between files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a job recommendation method based on recruitment requirement similarity, which comprises the following steps. Step S0: a matching prediction model construction stage; a Word2Vec model and a matching prediction model are trained. Step S1: an HNSW graph construction stage; feature vectors are calculated for all enterprise recruitment requirements in the system using the matching prediction model, and an HNSW graph is constructed. Step S2: a user service stage; post recruitment requirements suitable for the job seeker are obtained, and the N most similar posts are searched for in the constructed HNSW graph and recommended to the job seeker. The invention obtains and satisfies the real job-hunting intention of job seekers; it increases the difference between the feature vectors of positions whose requirement texts are similar but whose suitable candidate groups are completely different; it reduces the difference (i.e., increases the similarity) between the feature vectors of positions whose texts have low similarity but which suit the same candidate group; and it is efficient, making it well suited to providing real-time service to users.

Description

Recruitment requirement similarity-based job recommendation method
Technical Field
The invention relates to the field of computer software, in particular to a job position recommendation method based on recruitment requirement similarity.
Background
One prior patent is "A position recommendation method and computing device", application number 201710534021.5. It first uses position co-occurrence information to screen a candidate position set with a high number of co-occurrences with the target position, then computes the similarity between recruitment requirement texts purely from the text information using an LDA topic model, and recommends to the current user those candidate positions whose text similarity to the target position exceeds a given threshold.
Another prior patent is a "job recommendation method", application number 202110661337.7, which trains a deep learning model (without using text information) on users' job browsing sequences, generates a feature vector representation for each job, and recommends further jobs to a user according to the similarity of the job feature vectors.
The prior art has the following defects:
1. The interaction records of users on the platform are not exploited, so the real job-hunting intention of a user cannot be obtained. The text information is under-utilized, and the association between the text information and the talent-post matching degree is not fully mined.
2. JDs whose texts are highly similar but which actually suit completely different candidates cannot be distinguished by the prior art and may be recommended to the same candidate at the same time. Conversely, JDs whose texts have low similarity but which actually suit the same candidates cannot be linked and recommended to that candidate.
3. When computing the positions most similar to those suitable for a given user in order to make recommendations, the computation is slow.
Thus, the prior art is deficient and needs improvement.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a job recommendation method based on recruitment requirement similarity that obtains and satisfies the real job-hunting intention of job seekers; that increases the difference between the feature vectors of positions whose requirement texts are similar but whose actually suitable candidate groups differ; that reduces the difference (i.e., increases the similarity) between the feature vectors of positions whose texts have low similarity but which suit the same candidate group; and that is efficient enough to provide real-time service to users.
The technical scheme of the invention is as follows: a job recommendation method based on recruitment requirement similarity comprises the following steps. Step S0: a matching prediction model construction stage; text information is acquired from personal resumes and enterprise recruitment requirements and used to train a Word2Vec model based on the Skip-Gram Model; a matching prediction model based on TextCNN + MLP is then trained using the acquired text information, the trained Word2Vec model and the collected historical resume-post matching information. Step S1: an HNSW graph construction stage; feature vectors are calculated for all enterprise recruitment requirements in the system using the matching prediction model, and an HNSW graph is constructed from the obtained feature vectors. Step S2: a user service stage; a post recruitment requirement suitable for the job seeker is obtained, the N most similar posts are searched for in the constructed HNSW graph according to that post recruitment requirement, and these N posts are recommended to the job seeker, where N is a natural number.
In the job recommendation method based on recruitment requirement similarity, in step S0, after the text information is acquired, text preprocessing operations of word segmentation, stop-word removal and punctuation removal are performed on it to obtain preprocessed text information; the Word2Vec Model based on the Skip-Gram Model is then trained using the preprocessed text information, and the matching prediction model based on TextCNN + MLP is trained using the preprocessed text information, the trained Word2Vec model and the collected historical resume-post matching information.
In the job recommendation method based on the recruitment requirement similarity, in the step S0, text information is acquired from all personal resumes and enterprise recruitment requirements stored in the system; for the personal resume, extracting the whole text of the personal statement, the work experience and the project experience, and splicing the text into one section as the text information of the personal resume; for the enterprise recruitment requirement, extracting a whole text segment related to the recruitment requirement from the enterprise recruitment requirement, and splicing the text segments into a segment to be used as the enterprise recruitment requirement text information.
In the job recommendation method based on recruitment requirement similarity, in step S0, when training the Word2Vec Model based on the Skip-Gram Model, the specified window size WIN of the Skip-Gram Model is 2, and the dimension D of the high-dimensional vector into which the Word2Vec model converts each word is 128.
In the job recommendation method based on recruitment requirement similarity, in step S0, when training the matching prediction model based on TextCNN + MLP, historical information of successful resume-post matches is collected from the historical data of the system.
In the job recommendation method based on recruitment requirement similarity, after the historical resume-post matching information is collected, the successfully matched data are taken as positive examples, negative examples are randomly sampled at a 1:1 positive-to-negative ratio, and finally the positive and negative samples are mixed together to form the training data, in which each record comprises personal resume text information, enterprise recruitment requirement text information and a label indicating whether the match was successful.
In the job position recommendation method based on recruitment requirement similarity, in step S1, when calculating feature vectors for all enterprise recruitment requirements in the system by using the matching prediction model, text information of the enterprise recruitment requirements is input to the matching prediction model, and the feature vectors of the enterprise recruitment requirements are obtained from the matching prediction model.
In the job recommendation method based on recruitment requirement similarity, in step S1, when the obtained feature vectors are used to construct the HNSW graph, an NSW graph is constructed first, and the HNSW graph is then constructed on the basis of the NSW graph.
In the job recommendation method based on the recruitment requirement similarity, in the step S2, one or more job recruitment requirements suitable for a job seeker are acquired according to the interaction record of the job seeker and the system.
In the job recommendation method based on recruitment requirement similarity, in step S2, a post to which the job seeker has delivered a resume and obtained an interview opportunity is taken, and the text information of that post is obtained as a post recruitment requirement suitable for the job seeker.
The invention has the beneficial effects that:
1. The interaction information of job seekers on the platform is fully utilized, so that the real job-hunting intention of the job seeker is obtained and satisfied.
2. Supervised learning with the text data, the historical matching data and the TextCNN model trains a deep learning model that produces the feature vectors of all positions; this increases the difference between the feature vectors of positions whose requirement texts are similar but whose actually suitable candidate groups are completely different, and reduces the difference (i.e., increases the similarity) between the feature vectors of positions whose texts have low similarity but which suit the same candidate group.
3. When computing the positions similar to a given position for recommendation, the HNSW fast retrieval algorithm is adopted; compared with the sorting method adopted in existing patents it is faster and more efficient, and better suited to providing real-time service to users.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of the Skip-gram model of the present invention;
FIG. 3 is a schematic structural diagram of a Word2Vec model based on Skip-gram in the present invention;
FIG. 4 is a schematic structural diagram of a matching prediction model based on TextCNN in the present invention;
FIG. 5 is a schematic diagram of the TextCNN of the present invention;
FIG. 6 is a schematic diagram of the construction of HNSW in the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and the specific embodiments.
This embodiment provides a job recommendation method based on recruitment requirement similarity, which comprises 3 stages: S0, a model construction stage; S1, a graph construction stage; and S2, a user service stage. The S0 model construction stage comprises the following 4 steps. S0-0: acquire text information from personal resumes and enterprise recruitment requirements. S0-1: perform text preprocessing operations such as word segmentation, stop-word removal and punctuation removal on the text to obtain the preprocessed text. S0-2: train the Word2Vec Model based on the Skip-Gram Model using the preprocessed text information. S0-3: train the matching prediction model based on TextCNN + MLP using the preprocessed text information, the trained Word2Vec model and the historical resume-post matching information collected from the platform. The S1 graph construction stage comprises the following 2 steps. S1-0: calculate feature vectors for all enterprise recruitment requirements in the system using the matching prediction model. S1-1: construct an HNSW graph from the feature vectors of all enterprise recruitment requirements. The S2 user service stage comprises 2 steps. S2-0: for a job seeker who is looking for work, acquire one or more suitable post recruitment requirements according to information such as the interaction records between the job seeker and the system. S2-1: using a post recruitment requirement the user is interested in, search the HNSW graph constructed in stage S1 for the N most similar posts and recommend them to the user. The flow chart is shown in FIG. 1.
In the embodiment of the invention, the related English shorthand Chinese meanings are as follows:
CV: Curriculum Vitae, the personal resume provided by a job seeker.
JD: job Description, enterprise recruitment requirements published by the enterprise.
Word2Vec: word to vector, which refers to inputting a Chinese Word into a model, the model outputs a high-dimensional vector which can represent the semantic meaning of the Word.
TextCNN: text conditional Neural Network, text Convolutional Neural Network.
MLP: multilayer Perceptron, a Multilayer Perceptron, i.e., a Multilayer neural network.
HNSW: hierarchical Navigable Small World graphs, a graph structure that allows for rapid retrieval of target nodes.
The specific implementation mode of the invention is as follows:
1. S0: a model construction stage;
(1) S0-0: extracting texts from CV (personal resume) and JD (enterprise recruitment requirement);
This step is used to extract text information from all CVs and JDs stored in the system; the input is the complete CV and JD. For a CV, the full text of the personal statement, work experience and project experience is extracted and spliced into one passage as the CV text information. For a JD, the full text related to the recruitment requirement is extracted and spliced into one passage as the JD text information.
(2) S0-1: preprocessing a text;
This step is used to preprocess all CV text information and JD text information. Each CV or JD text is first split into sentences at periods, exclamation marks and question marks. Punctuation is then removed from each sentence by programmatically locating punctuation marks and replacing them with spaces. Next comes word segmentation, implemented with the "Jieba" toolkit in the Python programming language. Then stop words, i.e., words without practical meaning such as "yes", "also" and "and", are removed from the text. At this point the CV and JD text information has been preprocessed. The processed information is referred to below as "CV text" or "JD text", and these terms in the following description always refer to the preprocessed CV and JD text information.
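As an illustration of this preprocessing step, a minimal Python sketch follows; the regular expressions and the stop-word list are illustrative assumptions, not details taken from the patent.

import re
import jieba

STOP_WORDS = {"的", "了", "是", "也", "和", "与", "等"}   # hypothetical stop-word list

def preprocess(raw_text):
    """Split into sentences, strip punctuation, segment words, drop stop words."""
    sentences = re.split(r"[。！？.!?]", raw_text)               # sentence split on period / exclamation / question marks
    processed = []
    for sent in sentences:
        sent = re.sub(r"[^\w\s]", " ", sent)                     # replace punctuation marks with spaces
        tokens = [w for w in jieba.lcut(sent) if w.strip()]      # word segmentation with Jieba
        tokens = [w for w in tokens if w not in STOP_WORDS]      # remove stop words
        if tokens:
            processed.append(tokens)
    return processed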
(3) S0-2: training a Word2Vec model;
This step is used to train the Word2Vec model on the processed CV and JD texts described above, using the Word2Vec structure based on the Skip-gram model. A schematic of the Skip-gram model is shown in fig. 2. A window size, denoted WIN, must first be specified; then, for each word in a sentence, the current word is used to predict the WIN words before and after it. In short, the surrounding words are predicted from the current word, so that the semantic relations between words are learned. The Skip-gram based Word2Vec model is a general word vector representation model widely used in natural language processing research.
The structure of the Skip-gram based Word2Vec model is shown in fig. 3. Taking the sentence "I am good at implementing machine learning algorithms" as an example, starting from the first word, each word in turn is taken as the "center word" to predict its "surrounding words". Take the center word "machine learning" as an example; the neural network shown in fig. 3 is used to predict which words surround it. The vocabulary of all words is denoted V, so the number of words in the vocabulary is |V|; because the Word2Vec model converts each word into a high-dimensional vector, the dimension of this vector is denoted D. The Word2Vec model works as follows:
Input layer: a one-hot encoding layer. One-hot encoding represents a text with a vector of length |V|, each dimension of which corresponds to one of the |V| words in the vocabulary; the position corresponding to a word contained in the current text takes the value 1, and all other positions take the value 0. Taking the word "machine learning" in this example, only the dimension corresponding to "machine learning" in the length-|V| vector is 1, and all other dimensions are 0. The center word is thus transformed into a 1 × |V| matrix.
Hidden layer: the hidden layer consists of a matrix called the "word embedding matrix", which converts the input word into a word vector. Because the word vector has dimension D, the word embedding matrix has shape |V| × D; it is denoted W1. Multiplying the input one-hot code by the W1 matrix yields a 1 × D vector, which is the word vector of the current center word.
Output layer: the output layer also consists of a parameter matrix, denoted W2, of shape D × |V|. The word vector output by the hidden layer is multiplied by the matrix W2, yielding a 1 × |V| vector.
Output result: the 1 × |V| vector produced by the output layer is passed through a Softmax layer to obtain a new 1 × |V| vector whose dimensions sum to 1; the value at each position represents the probability that the word at that position occurs. For example, suppose the word "good at" is at the 100th position of the 1 × |V| vector used in one-hot encoding. If the 100th value of the output result is 0.3, the model predicts that when the center word is "machine learning", the probability that the surrounding words contain "good at" is 0.3.
Loss function: cross-entropy loss is used. Let the value of each dimension i of the output result (the 1 × |V| vector) be P_i, and let a second 1 × |V| vector T be the ground-truth vector, in which the positions corresponding to the true "surrounding words" are 1 and all other positions are 0. The cross-entropy loss is computed as:
Loss = − Σ_{i=1..|V|} T_i · log(P_i)
Training data are fed into the model batch by batch to compute this loss function; the gradients with respect to the parameters W1 and W2 are computed, and the two groups of parameters are continuously updated by gradient descent.
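To make the matrix mechanics above concrete, the following NumPy lines trace one Skip-gram forward pass and its cross-entropy loss; the vocabulary size, dimensions and word positions are toy values made up for illustration.

import numpy as np

V, D = 10, 4                                   # toy vocabulary size and word-vector dimension
W1 = np.random.randn(V, D) * 0.01              # word embedding matrix, |V| x D
W2 = np.random.randn(D, V) * 0.01              # output parameter matrix, D x |V|

center = np.zeros(V); center[3] = 1.0          # one-hot code of the center word
truth = np.zeros(V); truth[[2, 4]] = 1.0       # true "surrounding words" (positions 2 and 4)

h = center @ W1                                # hidden layer: 1 x D word vector
scores = h @ W2                                # output layer: 1 x |V| scores
P = np.exp(scores) / np.exp(scores).sum()      # Softmax probabilities, summing to 1

loss = -(truth * np.log(P)).sum()              # cross-entropy loss as defined above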
In this embodiment, WIN is 2 and D is 128.
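In practice a Skip-gram Word2Vec model with these hyperparameters could be trained with an off-the-shelf library; the gensim sketch below is one possible realization, where the corpus variable and the epoch count are assumptions.

from gensim.models import Word2Vec

# `corpus` is assumed to be the preprocessed CV and JD texts:
# a list of token lists such as those produced by the preprocessing sketch above.
model = Word2Vec(
    sentences=corpus,
    vector_size=128,   # D = 128
    window=2,          # WIN = 2
    sg=1,              # Skip-gram rather than CBOW
    min_count=1,
    epochs=10,         # number of passes over the corpus (assumption)
)
vector = model.wv["机器学习"]   # 128-dimensional word vector, if the word is in the vocabulary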
(4) S0-3: training a matching prediction model based on TextCNN;
This step is used to train a matching prediction model by combining the Word2Vec model, the TextCNN model and the CV-JD matching data; the trained model can be used to obtain the feature vectors of CVs and JDs.
The structure of the matching prediction model based on the TextCNN is shown in fig. 4, and is divided into the following steps:
CV-JD matching results are collected from the historical data of the system; a successful match means that a candidate (CV) delivered a resume to a position (JD) and obtained an interview opportunity after resume screening. These successfully matched pairs are taken as positive examples. Negative examples are randomly sampled at a positive-to-negative ratio of 1:1. Finally the positive and negative samples are mixed together to form the training data; each record contains a CV text, a JD text and a label indicating whether the match was successful, where 1 means matched and 0 means unmatched.
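One way to assemble such training data is sketched below; the random re-pairing of CVs and JDs used to create negative examples is an assumption, since the patent only specifies the 1:1 ratio.

import random

def build_training_data(positive_pairs, all_cv_texts, all_jd_texts, seed=0):
    """positive_pairs: list of (cv_text, jd_text) that were matched successfully."""
    rng = random.Random(seed)
    positives = set(positive_pairs)
    negatives = []
    while len(negatives) < len(positive_pairs):               # 1:1 positive-to-negative ratio
        cv = rng.choice(all_cv_texts)
        jd = rng.choice(all_jd_texts)
        if (cv, jd) not in positives:                         # avoid sampling a true match as a negative
            negatives.append((cv, jd, 0))
    data = [(cv, jd, 1) for cv, jd in positive_pairs] + negatives
    rng.shuffle(data)
    return data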
For each piece of training data, the sentences contained in one CV or JD text are spliced into one whole text, and each word in the text is converted into a D-dimensional word vector by the Word2Vec model trained in S0-2. Assuming the CV or JD contains X words, the CV or JD text processed by the Word2Vec model becomes a matrix of size X × D.
This word vector matrix is taken as the input of the TextCNN model. The TextCNN model uses 16 convolution kernels, two each of sizes 1 × D, 2 × D, 3 × D, …, 8 × D. Max pooling is applied after convolution. The TextCNN finally outputs a 16-dimensional CV TextCNN vector and a 16-dimensional JD TextCNN vector.
A skip-connection is adopted: the X × D word vector matrix of the CV is averaged across rows to obtain a D-dimensional vector, which is then concatenated with the vector output by the TextCNN to form the feature vector of the CV. The JD is handled in the same way. After this processing, the CV and the JD are each converted into a 144-dimensional feature vector.
The cosine similarity of the CV and JD feature vectors is computed, and the resulting value is taken as the probability, predicted by the model, that the current CV and JD match. The cross-entropy loss is computed from this predicted probability and the true label (0 or 1) of whether the CV and JD actually matched. The whole neural network model is trained by back-propagation, updating the parameters of the TextCNN.
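The following PyTorch sketch shows one way such a TextCNN-plus-skip-connection matching model could be assembled; using a single encoder shared by CV and JD, clamping the cosine similarity into (0, 1) so it can serve as a probability, and the optimizer settings are all assumptions made for illustration, not details taken from the patent.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNNEncoder(nn.Module):
    """Encodes an X x D word-vector matrix into a 144-dim feature vector (128-dim average + 16 pooled kernels)."""
    def __init__(self, dim=128, kernel_heights=range(1, 9), kernels_per_size=2):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(1, kernels_per_size, (h, dim)) for h in kernel_heights
        )  # two kernels each of sizes 1 x D ... 8 x D, 16 kernels in total

    def forward(self, x):                          # x: (batch, X, D)
        avg = x.mean(dim=1)                        # skip-connection: average across rows -> (batch, D)
        x = x.unsqueeze(1)                         # (batch, 1, X, D) for Conv2d
        pooled = [F.relu(conv(x)).squeeze(3).max(dim=2).values for conv in self.convs]
        cnn = torch.cat(pooled, dim=1)             # (batch, 16) after max pooling
        return torch.cat([avg, cnn], dim=1)        # (batch, 144) feature vector

class MatchPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = TextCNNEncoder()            # a single shared encoder (assumption)

    def forward(self, cv_mat, jd_mat):
        cv_vec = self.encoder(cv_mat)
        jd_vec = self.encoder(jd_mat)
        sim = F.cosine_similarity(cv_vec, jd_vec, dim=1)
        return sim.clamp(1e-6, 1 - 1e-6)           # treated as the match probability (assumption)

# One training step: binary cross-entropy against the 0/1 match label, then back-propagation.
model = MatchPredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
cv_batch = torch.randn(8, 50, 128)                 # toy batch: 8 CVs of 50 words each, D = 128
jd_batch = torch.randn(8, 40, 128)                 # toy batch: 8 JDs of 40 words each
labels = torch.randint(0, 2, (8,)).float()
loss = F.binary_cross_entropy(model(cv_batch, jd_batch), labels)
loss.backward()
optimizer.step()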
The flow above does not describe the specific operating principles of the convolution and pooling layers of TextCNN. FIG. 5 is a schematic of TextCNN; this embodiment uses fig. 5 to explain how TextCNN works, and the TextCNN used in this embodiment differs from the exemplary model below only in the size and number of convolution kernels. TextCNN comprises the following steps (a worked example in code is given after these steps):
Input data: for the X × D matrix introduced in the flow above, X is the number of words and D is the word vector dimension; for simplicity of explanation, the example in fig. 5 uses X = 7 and D = 5.
Convolution: convolution operations are performed with several convolution kernels of different sizes. A convolution kernel is an N × D matrix; the first kernel shown in fig. 5 is a 4 × 5 matrix. The convolution operation for an N × D kernel takes N consecutive rows from the top of the X × D input matrix (i.e., an N × D sub-matrix), multiplies it element-wise with the kernel matrix, sums the products, adds a random bias, and passes the result through a RELU activation function to obtain one convolution value. The window is then slid down by one row in the input matrix (i.e., an N × D sub-matrix starting from row 2 is taken) and the convolution operation is repeated to obtain the next value. Following this procedure, an X × D input matrix convolved with an N × D kernel yields a convolution result vector of length X − N + 1.
Pooling: max pooling is used, i.e., the maximum value of the convolution vector produced by each convolution kernel is taken as its pooling result.
Output result: the pooling results of all convolution kernels are concatenated into one vector, which is the output of TextCNN.
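The following NumPy lines reproduce this procedure for the fig. 5 case (X = 7, D = 5, one 4 × 5 kernel); the random input values are placeholders only.

import numpy as np

rng = np.random.default_rng(0)
X, D, N = 7, 5, 4                        # 7 words, 5-dim word vectors, one 4 x 5 kernel
inp = rng.standard_normal((X, D))        # toy input word-vector matrix
kernel = rng.standard_normal((N, D))
bias = rng.standard_normal()

conv = np.empty(X - N + 1)               # convolution result vector of length X - N + 1 = 4
for i in range(X - N + 1):
    window = inp[i:i + N]                                        # N consecutive rows of the input
    conv[i] = np.maximum((window * kernel).sum() + bias, 0.0)    # multiply, sum, add bias, RELU

pooled = conv.max()                      # max pooling: one value per convolution kernel
print(conv, pooled)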
2. S1: a graph construction stage;
(1) S1-0: calculating the feature vectors of all JD;
For all JDs, their texts are respectively input into the matching prediction model of S0-3, and the "feature vector of the JD" is obtained from the model and recorded; in this embodiment this feature vector has 144 dimensions.
(2) S1-1: constructing HNSW by using the feature vectors of all JD;
Constructing the HNSW graph has two parts: the first part is constructing an NSW graph (Navigable Small World graph), and the second part is constructing the HNSW graph on the basis of the NSW graph.
The NSW graph is constructed as follows, taking the number of nearest neighbours as 3 (a code sketch follows these steps):
First, every JD that needs to be added to the graph is assigned a unique ID; the JDs are sorted by ID and placed in a "candidate set". The JDs are then inserted into the graph one by one in this order.
Assume the IDs are 0, 1, 2, 3, ….
Initially, the first JD (ID = 0) is taken from the candidate set and placed directly in the graph.
JD 1 is taken and placed in the graph, and since only JD 0 exists in the graph at this time, JD 0 is the neighbor node, and an edge connecting JD 1 and JD 0 is added in the graph.
Similarly, take JD 2, add JD 2 to JD 1 and JD 2 to JD 0 edges.
Take JD 3, add JD 3 to JD 2, and JD 3 to JD 0 edges.
Taking JD 4, the 3 JDs whose feature vectors are most similar to that of JD 4 are selected from the nodes already in the graph, and edges from JD 4 to those 3 JDs are added.
JD 5, JD 6, … are processed in the same way as JD 4, until all JDs have been added to the graph.
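A minimal Python sketch of this construction procedure follows; cosine similarity between feature vectors is assumed as the similarity measure, and for brevity the neighbours of each newly inserted JD are found by brute force over the nodes already in the graph rather than by the greedy search described below.

import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def build_nsw(vectors, n_neighbors=3):
    """vectors: dict {jd_id: feature vector}. Returns an adjacency dict {jd_id: set of neighbour jd_ids}."""
    graph = {}
    for jd_id in sorted(vectors):                       # insert JDs one by one in ID order
        if not graph:
            graph[jd_id] = set()                        # the first JD is placed directly in the graph
            continue
        nearest = sorted(graph,
                         key=lambda other: cosine(vectors[jd_id], vectors[other]),
                         reverse=True)[:n_neighbors]    # the 3 most similar nodes already in the graph
        graph[jd_id] = set(nearest)
        for other in nearest:
            graph[other].add(jd_id)                     # edges are undirected
    return graph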
With the above steps the NSW graph is constructed. Searching the NSW graph is efficient: for example, if a user is suitable for JD X, the matching prediction model is first used to obtain the feature vector of JD X, and the next task is to find which JDs are most similar to JD X. Assuming the m most similar JDs are wanted, the search proceeds as follows (a code sketch follows this list):
A "traversed set", a "candidate set" and a "result set" are established, all initially empty;
A node JD Y is chosen at random as the starting node; JD Y is put into the traversed set, and JD Y together with the similarity between its feature vector and that of JD X is put into the candidate set.
For every point in the candidate set, its neighbour nodes are looked up; a neighbour already in the traversed set is ignored, otherwise it is put into the traversed set.
The similarities between these neighbour nodes and JD X are calculated, and each neighbour node together with its similarity to JD X is put into the candidate set.
The nodes in the candidate set are sorted in descending order of their similarity to JD X.
Whether the first m points in the result set are identical to the first m points in the sorted candidate set is checked; if so, the iteration ends and these m points are the m most similar JDs; otherwise the result set is emptied, the first m points of the candidate set are copied into it, and the procedure returns to the third step.
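The following Python sketch follows the search steps above literally; it assumes the graph and vectors dictionaries from the construction sketch and cosine similarity as the measure, and a production NSW search would additionally bound the size of the candidate set.

import random
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nsw_search(graph, vectors, query_vec, m, seed=0):
    """Greedy search over the NSW graph: returns the m JD IDs most similar to query_vec."""
    rng = random.Random(seed)
    start = rng.choice(list(graph))                        # random starting node JD Y
    traversed = {start}
    candidates = {start: cosine(vectors[start], query_vec)}
    result = []
    while True:
        for node in list(candidates):                      # expand the neighbours of candidate-set nodes
            for nb in graph[node]:
                if nb not in traversed:
                    traversed.add(nb)
                    candidates[nb] = cosine(vectors[nb], query_vec)
        ranked = sorted(candidates, key=candidates.get, reverse=True)[:m]
        if ranked == result:                               # top-m unchanged: iteration ends
            return ranked
        result = ranked                                    # otherwise copy top-m into the result set and repeat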
The above is the flow of NSW. The construction of the HNSW graph is shown in fig. 6: the bottom layer is the NSW graph constructed above. The HNSW is specified to have N layers (3 layers in fig. 6); starting from the bottom layer, some nodes are selected by a random method and retained in the second layer, and in the same way a third layer can be generated from the second layer. The edge relations between nodes remain unchanged. Searching for the most similar JDs is consistent with NSW: the starting point JD Y is taken at the topmost layer of the HNSW, and when looking for the neighbours of a node, if the current layer has no neighbour, the search descends to the next layer.
The HNSW algorithm source can refer to:
Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(4), 824-836.
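In practice the HNSW index can also be built with an off-the-shelf library; the hnswlib sketch below is one possible realization, in which the parameter values (ef_construction, M, ef) and the toy data are illustrative assumptions rather than values specified by the patent.

import numpy as np
import hnswlib

jd_vectors = np.random.rand(1000, 144).astype(np.float32)   # toy stand-in for the 144-dim JD feature vectors
jd_ids = np.arange(len(jd_vectors))

index = hnswlib.Index(space="cosine", dim=144)
index.init_index(max_elements=len(jd_vectors), ef_construction=200, M=16)
index.add_items(jd_vectors, jd_ids)
index.set_ef(50)                                             # query-time search breadth

query = jd_vectors[0]                                        # feature vector of a JD the user is suited to
labels, distances = index.knn_query(query, k=10)             # IDs of the 10 most similar JDs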
3. S2: a user service stage;
(1) S2-0: acquiring the job seeker's intended posts through interaction records;
A post to which the user delivered a resume and obtained an interview opportunity is taken as the user's real intended post, and the JD text of that post is obtained.
(2) S2-1: calculating similar posts and recommending them to the user;
The m JDs most similar to this JD text are searched for in the graph constructed in S1; these are the new JDs recommended to the user, where m is a natural number that can be set according to the actual application scenario.
The foregoing is a detailed description of the technical solution of the present invention.
The invention can solve the following technical problems:
1. Acquiring the real job-hunting intention of a candidate during a job search, and recommending positions according to that real intention. Specifically, the technical problems that can be solved include:
(1) Each candidate can fit several types of positions. For example, consider a test development engineer with rich programming experience: the position categories the resume fits are test engineer and software development engineer, and the job seeker wants to develop towards software development rather than continue working as a test engineer. In this case, existing job recommendation methods recommend both test engineer and software development posts to the user according to the matching degree between the recruitment requirements and the user's work experience, but cannot capture the real job-hunting intention and make a reasonable recommendation.
(2) The motivations of candidates in seeking employment are manifold; dislike of the current job type and hope for new opportunities for career development form an important category. For example, a user with junior product-operation experience may want to transition to product manager when seeking a job; existing job recommendation methods, relying only on resume information and recruitment requirement information, cannot obtain the job seeker's real intention and would only recommend product-operation posts. How to capture the user's real job-hunting intention from the information in the system and make recommendations accordingly is a technical problem to be solved.
2. Calculating the real similarity and dissimilarity relations between recruitment requirements. Specifically, two types of practical problems are involved:
(1) Many positions whose recruitment requirement texts (JDs) differ little and have high textual similarity nevertheless have very different requirements for candidates. For example, the job requirement texts of an algorithm engineer and a senior algorithm engineer overlap in most of their content, differing only in the required proficiency and work experience; if recommendation were based only on the similarity of job requirement texts, the two kinds of positions might be recommended to the same candidate. In an ideal model, however, the similarity between these two kinds of positions should be low when recommending similar positions.
(2) Many positions whose recruitment requirement texts (JDs) differ greatly and have low textual similarity nevertheless suit the same candidates. Different JDs have different levels of detail and writing styles, so the texts may differ greatly although the actual positions recruited for are consistent. For example, "the candidate is required to be familiar with various natural language processing algorithms and with the corresponding code tool frameworks and code implementation" and "the candidate is required to be familiar with text processing methods such as word segmentation, syntactic structure analysis, sentiment tendency recognition and automatic question answering algorithms, and to be able to implement them proficiently in code" differ greatly in text content, but in fact recruit the same category of talent. In an ideal similar-job recommendation model, the similarity between these two categories of positions should be high.
3. In recommendation based on job demand similarity, when recommending positions similar to a given position, the similarities between that position and all other positions must be computed and sorted, which is time-consuming and has poor real-time performance. The solution provided by the invention can recommend similar positions quickly, with high computational efficiency, and is suitable for serving users in real time.
Compared with the prior art, the invention has the advantages that:
1. The interaction information of job seekers on the platform is fully utilized, so that the real job-hunting intention of the job seeker is obtained and satisfied.
2. Supervised learning with the text data, the historical matching data and the TextCNN model trains a deep learning model that produces the feature vectors of all positions; this increases the difference between the feature vectors of positions whose requirement texts are similar but whose actually suitable candidate groups are completely different, and reduces the difference (i.e., increases the similarity) between the feature vectors of positions whose texts have low similarity but which suit the same candidate group.
3. When computing the positions similar to a given position for recommendation, the HNSW fast retrieval algorithm is adopted; compared with the sorting method adopted in existing patents it is faster and more efficient, and better suited to providing real-time service to users.
The present invention is not limited to the above preferred embodiments; any modifications, equivalent substitutions and improvements made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.

Claims (10)

1. A job recommendation method based on recruitment requirement similarity is characterized by comprising the following steps:
step S0: a matching prediction model construction stage; acquiring text information from personal resumes and enterprise recruitment requirements, and using the acquired text information to train a Word2Vec Model based on the Skip-Gram Model; training a matching prediction model based on TextCNN + MLP by using the acquired text information, the trained Word2Vec model and the collected resume and post historical matching information;
step S1: an HNSW graph construction stage; calculating feature vectors for all enterprise recruitment requirements in the system by using the matching prediction model, and constructing an HNSW graph by using the obtained feature vectors;
step S2: a user service stage; obtaining post recruitment requirements suitable for the job seeker, searching for the N most similar posts in the constructed HNSW graph according to the post recruitment requirements, and recommending the N most similar posts to the job seeker, wherein N is a natural number.
2. The method for job recommendation based on recruitment requirement similarity according to claim 1, wherein: in step S0, after the text information is obtained, text preprocessing operations of word segmentation, stop-word removal and punctuation removal are performed on the obtained text information to obtain preprocessed text information; moreover, the Word2Vec Model based on the Skip-Gram Model is trained using the preprocessed text information; and the matching prediction model based on TextCNN + MLP is trained using the preprocessed text information, the trained Word2Vec model and the collected resume and post historical matching information.
3. The method for job recommendation based on recruitment requirement similarity according to claim 2 wherein: in the step S0, acquiring text information from all personal resumes and enterprise recruitment requirements stored in the system; for the personal resume, extracting the whole text of the personal statement, the work experience and the project experience, and splicing the text into one section as the text information of the personal resume; for the enterprise recruitment requirement, extracting a whole text segment related to the recruitment requirement from the enterprise recruitment requirement, and splicing the text segments into a segment to be used as the enterprise recruitment requirement text information.
4. The method for job recommendation based on recruitment requirement similarity according to claim 3 wherein: in step S0, when training the Word2Vec Model based on the Skip-Gram Model, the specified window size WIN of the Skip-Gram Model is 2, and the dimension D of the Word2Vec Model in converting each Word into a high-dimensional vector is 128.
5. The recruitment requirement similarity based position recommendation method of claim 4, wherein: in step S0, when training the matching prediction model based on TextCNN + MLP, historical information of successful resume-post matches is collected from the historical data of the system.
6. The method for job recommendation based on recruitment requirement similarity according to claim 5, wherein: in step S0, after the historical resume-post matching information is collected, the successfully matched data are taken as positive examples, negative examples are randomly sampled at a 1:1 positive-to-negative ratio, and finally the positive and negative samples are mixed together to form the training data, each piece of which comprises personal resume text information, enterprise recruitment requirement text information and an identifier of whether the resume and post were successfully matched.
7. The method for job recommendation based on recruitment requirement similarity according to claim 1 wherein: in the step S1, when the feature vectors are calculated for all enterprise recruitment requirements in the system by using the matching prediction model, the text information of the enterprise recruitment requirements is input into the matching prediction model, and the feature vectors of the enterprise recruitment requirements are obtained from the matching prediction model.
8. The method for job recommendation based on recruitment requirement similarity according to claim 7 wherein: in step S1, when constructing the HNSW graph using the obtained feature vectors, an NSW graph is constructed first, and then the HNSW graph is constructed on the basis of the NSW graph.
9. The recruitment requirement similarity based position recommendation method of claim 1, wherein: in step S2, one or more post recruitment requirements suitable for the job seeker are obtained according to the interaction records of the job seeker with the system.
10. The method for job recommendation based on recruitment requirement similarity according to claim 9, wherein: in step S2, a post to which the job seeker has delivered a resume and obtained an interview opportunity is taken, and the text information of that post is obtained as the post recruitment requirement suitable for the job seeker.
CN202211299760.8A 2022-10-24 2022-10-24 Position recommendation method based on recruitment demand similarity Active CN115564393B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211299760.8A CN115564393B (en) 2022-10-24 2022-10-24 Position recommendation method based on recruitment demand similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211299760.8A CN115564393B (en) 2022-10-24 2022-10-24 Position recommendation method based on recruitment demand similarity

Publications (2)

Publication Number Publication Date
CN115564393A true CN115564393A (en) 2023-01-03
CN115564393B CN115564393B (en) 2024-05-10

Family

ID=84746723

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211299760.8A Active CN115564393B (en) 2022-10-24 2022-10-24 Position recommendation method based on recruitment demand similarity

Country Status (1)

Country Link
CN (1) CN115564393B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115865777A (en) * 2023-02-15 2023-03-28 无锡一技信息科技有限公司 RPA technology-based recruitment order intelligent distribution routing method
CN115934899A (en) * 2023-02-28 2023-04-07 天津徙木科技有限公司 IT industry resume recommendation method and device, electronic equipment and storage medium
CN116452169A (en) * 2023-06-14 2023-07-18 北京华品博睿网络技术有限公司 Online recruitment generation type recommendation system and method
CN117689354A (en) * 2024-02-04 2024-03-12 芯知科技(江苏)有限公司 Intelligent processing method and platform for recruitment information based on cloud service
CN117709916A (en) * 2024-02-01 2024-03-15 武汉厚溥数字科技有限公司 Employment information processing method and device, electronic equipment and storage medium
CN117709917A (en) * 2024-02-05 2024-03-15 芯知科技(江苏)有限公司 Intelligent data processing method and system for recruitment platform

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110313963A1 (en) * 2010-01-22 2011-12-22 AusGrads Pty Ltd Recruiting system
CN107590133A (en) * 2017-10-24 2018-01-16 武汉理工大学 The method and system that position vacant based on semanteme matches with job seeker resume
AU2017245283A1 (en) * 2016-09-14 2018-03-29 Herington, Leonard MR Yardie app - Yardie is an online Lawn Mowing / Landscaping network through the use of the "Yardie" app, allows users to request lawn mowing, landscaping and gardening services via an On-Demand request or selecting a Yardie Partner(contractor) profile, directly requesting the yardie for said services.
US20210117459A1 (en) * 2019-10-18 2021-04-22 Baidu Usa Llc Efficient retrieval of top similarity representations
CN113268560A (en) * 2020-02-17 2021-08-17 北京沃东天骏信息技术有限公司 Method and device for text matching
WO2021212749A1 (en) * 2020-04-24 2021-10-28 平安科技(深圳)有限公司 Method and apparatus for labelling named entity, computer device, and storage medium
US20210357869A1 (en) * 2020-05-15 2021-11-18 Microsoft Technology Licensing, Llc Instant content notification with user similarity
CN115034178A (en) * 2022-07-01 2022-09-09 杨双远 Method and storage medium for knowledge graph of human sentry demand text

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110313963A1 (en) * 2010-01-22 2011-12-22 AusGrads Pty Ltd Recruiting system
AU2017245283A1 (en) * 2016-09-14 2018-03-29 Herington, Leonard MR Yardie app - Yardie is an online Lawn Mowing / Landscaping network through the use of the "Yardie" app, allows users to request lawn mowing, landscaping and gardening services via an On-Demand request or selecting a Yardie Partner(contractor) profile, directly requesting the yardie for said services.
CN107590133A (en) * 2017-10-24 2018-01-16 武汉理工大学 The method and system that position vacant based on semanteme matches with job seeker resume
US20210117459A1 (en) * 2019-10-18 2021-04-22 Baidu Usa Llc Efficient retrieval of top similarity representations
CN113268560A (en) * 2020-02-17 2021-08-17 北京沃东天骏信息技术有限公司 Method and device for text matching
WO2021212749A1 (en) * 2020-04-24 2021-10-28 平安科技(深圳)有限公司 Method and apparatus for labelling named entity, computer device, and storage medium
US20210357869A1 (en) * 2020-05-15 2021-11-18 Microsoft Technology Licensing, Llc Instant content notification with user similarity
CN115034178A (en) * 2022-07-01 2022-09-09 杨双远 Method and storage medium for knowledge graph of human sentry demand text

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Junren (王军仁): "Research on Person-Post Matching Based on User Profiles" (基于用户画像的人岗匹配研究), Wanfang Data (《万方数据》), pages 1-70 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115865777A (en) * 2023-02-15 2023-03-28 无锡一技信息科技有限公司 RPA technology-based recruitment order intelligent distribution routing method
CN115865777B (en) * 2023-02-15 2023-06-02 无锡一技信息科技有限公司 Recruitment order intelligent distribution routing method based on RPA technology
CN115934899A (en) * 2023-02-28 2023-04-07 天津徙木科技有限公司 IT industry resume recommendation method and device, electronic equipment and storage medium
CN116452169A (en) * 2023-06-14 2023-07-18 北京华品博睿网络技术有限公司 Online recruitment generation type recommendation system and method
CN116452169B (en) * 2023-06-14 2023-11-24 北京华品博睿网络技术有限公司 Online recruitment generation type recommendation system and method
CN117709916A (en) * 2024-02-01 2024-03-15 武汉厚溥数字科技有限公司 Employment information processing method and device, electronic equipment and storage medium
CN117689354A (en) * 2024-02-04 2024-03-12 芯知科技(江苏)有限公司 Intelligent processing method and platform for recruitment information based on cloud service
CN117689354B (en) * 2024-02-04 2024-04-19 芯知科技(江苏)有限公司 Intelligent processing method and platform for recruitment information based on cloud service
CN117709917A (en) * 2024-02-05 2024-03-15 芯知科技(江苏)有限公司 Intelligent data processing method and system for recruitment platform
CN117709917B (en) * 2024-02-05 2024-06-07 芯知科技(江苏)有限公司 Intelligent data processing method and system for recruitment platform

Also Published As

Publication number Publication date
CN115564393B (en) 2024-05-10

Similar Documents

Publication Publication Date Title
CN115564393B (en) Position recommendation method based on recruitment demand similarity
CN112131350B (en) Text label determining method, device, terminal and readable storage medium
US5671333A (en) Training apparatus and method
CN110597988A (en) Text classification method, device, equipment and storage medium
Huang et al. Expert as a service: Software expert recommendation via knowledge domain embeddings in stack overflow
CN111858896B (en) Knowledge base question-answering method based on deep learning
CN113821605B (en) Event extraction method
CN111831789A (en) Question-answer text matching method based on multilayer semantic feature extraction structure
CN112307164A (en) Information recommendation method and device, computer equipment and storage medium
US20220327492A1 (en) Ontology-based technology platform for mapping skills, job titles and expertise topics
CN111241361A (en) Intelligent referral system and method for enterprises and colleges based on cloud platform
CN117033571A (en) Knowledge question-answering system construction method and system
CN112559723A (en) FAQ search type question-answer construction method and system based on deep learning
CN115203507A (en) Event extraction method based on pre-training model and oriented to document field
CN113961666B (en) Keyword recognition method, apparatus, device, medium, and computer program product
Tallapragada et al. Improved Resume Parsing based on Contextual Meaning Extraction using BERT
Liu et al. Resume parsing based on multi-label classification using neural network models
CN113516094A (en) System and method for matching document with review experts
Chang et al. Knowledge element extraction for knowledge-based learning resources organization
US11379763B1 (en) Ontology-based technology platform for mapping and filtering skills, job titles, and expertise topics
CN113342964B (en) Recommendation type determination method and system based on mobile service
Roberts-Witt Practical taxonomies
CN112579666A (en) Intelligent question-answering system and method and related equipment
Liu et al. Suggestion mining from online reviews usingrandom multimodel deep learning
Balbi et al. A Text Mining Strategy based on local contexts of words

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant