CN115774996B - Intelligent interview follow-up question generation method and device, and electronic device - Google Patents


Info

Publication number: CN115774996B
Authority: CN (China)
Prior art keywords: word, answer, candidate, knowledge, words
Legal status: Active
Application number: CN202211548972.5A
Other languages: Chinese (zh)
Other versions: CN115774996A
Inventors: 戴科彬, 闻洪海, 陈少波
Assignees: Duomian Beijing Technology Co ltd, Tongdao Jingying Tianjin Information Technology Co ltd, Yingshi Internet Beijing Information Technology Co ltd
Application filed by the assignees; priority to CN202211548972.5A; publication of CN115774996A; application granted; publication of CN115774996B

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses an intelligent interview follow-up question generation method and device, and an electronic device, and relates to the technical field of business interview information data processing. The method comprises the following steps: parsing the candidate's answer to a question and identifying a plurality of semantic entities from the answer; extracting relations among the plurality of semantic entities to obtain entity relation information; obtaining the standard answer corresponding to the question, and determining the candidate's answer result information based on the standard answer and the entity relation information; and determining a follow-up strategy based on the answer result information, determining follow-up knowledge points in the knowledge graph corresponding to the question according to the follow-up strategy, and generating follow-up questions corresponding to those knowledge points. The invention enables targeted follow-up questioning, so that follow-up questions help the interviewer probe the depth and breadth of the candidate's knowledge and ultimately determine how well the candidate matches the post.

Description

Intelligent interview follow-up question generation method and device, and electronic device
Technical Field
The invention relates to the technical field of business interview information data processing, and in particular to an intelligent interview follow-up question generation method and device, and an electronic device.
Background
Intelligent interview systems currently on the market usually preset several follow-up dimensions for each source question, use a model to select a target dimension to pursue, and recommend follow-up questions under that dimension to the interviewer.
The inventors found that current schemes can only ask questions one way according to a fixed system during the interview; they lack an effective understanding of the candidate's answer content and cannot ask targeted follow-up questions, so the follow-up questions cannot help the interviewer probe the depth and breadth of the candidate's knowledge and ultimately determine how well the candidate matches the post.
Disclosure of Invention
In order to solve the above technical problems, or at least partially solve them, embodiments of the invention provide an intelligent interview follow-up question generation method and device, and an electronic device. Connections between knowledge points are established in the form of a knowledge graph, so that each round of follow-up questioning can determine its direction from the candidate's response and can target sub-knowledge points, peer knowledge points, unmentioned knowledge points, and so on; follow-up questions thus help the interviewer probe the depth and breadth of the candidate's knowledge and ultimately determine how well the candidate matches the post.
An embodiment of the invention provides an intelligent interview follow-up question generation method, which comprises the following steps:
parsing the candidate's answer to a question, and identifying a plurality of semantic entities from the answer; extracting attribute relations among the plurality of semantic entities to obtain entity relation information among the plurality of semantic entities; obtaining the standard answer corresponding to the question, and determining answer result information of the candidate based on the standard answer and the entity relation information, the answer result information representing how well the candidate has mastered the plurality of knowledge points involved in the question; and determining at least one follow-up strategy based on the answer result information, determining follow-up knowledge points in the knowledge graph corresponding to the question according to the follow-up strategy, and generating follow-up questions corresponding to the follow-up knowledge points.
An embodiment of the invention also provides an intelligent interview follow-up question generation device, which comprises:
an identification module configured to parse the candidate's answer to a question and identify a plurality of semantic entities from the answer; an extraction module configured to extract attribute relations among the plurality of semantic entities to obtain entity relation information of the plurality of semantic entities; a determination module configured to obtain the standard answer corresponding to the question and determine answer result information of the candidate based on the standard answer and the entity relation information, the answer result information representing how well the candidate has mastered the plurality of knowledge points involved in the question; and a generation module configured to determine at least one follow-up strategy based on the answer result information, determine follow-up knowledge points in the knowledge graph corresponding to the question according to the follow-up strategy, and generate follow-up questions corresponding to the follow-up knowledge points.
An embodiment of the invention also provides an electronic device, which comprises:
one or more processors and a storage means for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the follow-up question generation method described above.
An embodiment of the invention also provides a computer-readable storage medium on which a computer program is stored, which, when executed by a processor, implements the follow-up question generation method described above.
An embodiment of the present invention also provides a computer program product comprising a computer program or instructions which, when executed by a processor, implement the follow-up question generation method described above.
Compared with the prior art, the technical scheme provided by the embodiments of the invention has the following advantages:
(1) Connections between knowledge points are established in the form of a knowledge graph; each round of follow-up questioning can determine its direction from the candidate's response and can target sub-knowledge points, peer knowledge points, unmentioned knowledge points, and so on, so that follow-up questions help the interviewer probe the depth and breadth of the candidate's knowledge and ultimately determine how well the candidate matches the post.
(2) For the knowledge points involved in each question, the prior knowledge of those knowledge points is used to enhance semantic entity recognition, and the method adapts effectively to standard-answer core words in a variety of situations.
Drawings
The above and other features, advantages and aspects of embodiments of the present invention will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow chart of an intelligent interview follow-up question generation method in an embodiment of the invention;
FIG. 2 is a flow chart of a method for enhancing dictionary-plus-rule semantic entity recognition with prior knowledge in an embodiment of the present invention;
FIG. 3 is a flow chart of a method for enhancing machine learning/deep learning semantic entity recognition with prior knowledge in an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an intelligent interview follow-up question generation device according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device in an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the invention are shown in the drawings, it should be understood that the invention may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the invention. It should be understood that the drawings and embodiments of the invention are for illustration purposes only and are not intended to limit the scope of the present invention.
It should be understood that the various steps recited in the method embodiments of the present invention may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the invention is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., including, but not limited to. The term "based on" is "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments. Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like herein are merely used for distinguishing between different devices, modules, or units and not for limiting the order or interdependence of the functions performed by such devices, modules, or units.
It should be noted that references to "a" and "a plurality" in this disclosure are illustrative rather than limiting; those skilled in the art will appreciate that they should be construed as "one or more" unless the context clearly indicates otherwise.
The main problems of intelligent interview systems currently on the market are that the interview process is too mechanical: during the interview, questions can only be asked one way according to preset assessment dimensions, the tree- or net-shaped structure of a knowledge system is ignored, and connections between knowledge points are missing to some extent, so the systems lack an effective understanding of the candidate's answer content and cannot ask targeted follow-up questions. Therefore, embodiments of the invention disclose a knowledge-graph-based follow-up questioning method that establishes connections between knowledge points in the form of a knowledge graph: each question can examine one or more knowledge points, each follow-up can expand to sub-knowledge points, peer knowledge points, unmentioned knowledge points, and so on, and the depth and breadth of the candidate's knowledge can be mined.
Referring to fig. 1, an embodiment of the present invention provides a flowchart of an intelligent interview follow-up question generation method.
Step S110, parsing the candidate's answer to a question, and identifying a plurality of semantic entities from the answer.
Semantic entity recognition is performed on the candidate's answer to identify any semantic entities of interest. Semantic entities are noun word combinations in documents/files/sentences that describe exact real-world objects, such as person names, organization names, place names, and all other entities identified by names; broader entities also include numbers, dates, currencies, addresses, and so on. Entity types can differ greatly across fields; for example, important entity types in the medical field typically include gene names, protein structure attribute names, compound names, drug names, and disease names.
Semantic entity recognition may be implemented based on a dictionary plus bidirectional maximum matching rules, or based on any sequence labeling algorithm, including but not limited to recurrent neural networks (e.g., RNN, LSTM) and pre-trained models (e.g., BERT).
For example, the candidate's answer is "the transaction isolation levels include committed read and repeatable read; the committed read level can solve dirty reads, and the repeatable read level can solve phantom reads and non-repeatable reads", from which the following semantic entities are extracted: "transaction isolation level", "committed read level", "repeatable read level", "dirty read", "non-repeatable read", "phantom read".
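As a hedged illustration of the dictionary-plus-bidirectional-maximum-matching option (a sketch under assumed names, not code from the patent; the entity dictionary and sample sentence are illustrative), an entity recognizer might look like this:

```python
# Minimal sketch of dictionary + bidirectional maximum matching for
# semantic entity recognition. Dictionary and text are placeholders.
ENTITY_DICT = {"transaction isolation level", "committed read",
               "repeatable read", "dirty read", "phantom read"}
MAX_LEN = max(len(e.split()) for e in ENTITY_DICT)

def forward_max_match(tokens):
    """Greedily match the longest dictionary entry from the left."""
    i, found = 0, []
    while i < len(tokens):
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            cand = " ".join(tokens[i:i + n])
            if cand in ENTITY_DICT:
                found.append(cand)
                i += n
                break
        else:
            i += 1  # no entity starts at this token; skip it
    return found

def backward_max_match(tokens):
    """Same idea, scanning from the right."""
    i, found = len(tokens), []
    while i > 0:
        for n in range(min(MAX_LEN, i), 0, -1):
            cand = " ".join(tokens[i - n:i])
            if cand in ENTITY_DICT:
                found.insert(0, cand)
                i -= n
                break
        else:
            i -= 1
    return found

def recognize_entities(text):
    tokens = text.lower().split()
    fwd, bwd = forward_max_match(tokens), backward_max_match(tokens)
    # Simple disambiguation convention: keep whichever direction
    # recovers more entity mentions.
    return fwd if len(fwd) >= len(bwd) else bwd

print(recognize_entities(
    "the repeatable read level can solve phantom read and dirty read"))
# ['repeatable read', 'phantom read', 'dirty read']
```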
Step S120, extracting attribute relations among the plurality of semantic entities to obtain entity relation information of the plurality of semantic entities.
Based on the semantic entity recognition results, any two semantic entities are combined into a semantic entity pair, and a relation judgment is made for each pair. The relation judgment may be implemented using any text classification algorithm, including but not limited to traditional machine learning (e.g., LR, SVM), convolutional neural networks (e.g., TextCNN), recurrent neural networks (e.g., RNN, LSTM), and pre-trained models (e.g., BERT).
Continuing with the example in step S110, the following relations are extracted for the identified semantic entities: the committed read level belongs to the transaction isolation level, the repeatable read level belongs to the transaction isolation level, the committed read level can solve dirty reads, the repeatable read level can solve phantom reads, and the repeatable read level can solve non-repeatable reads.
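A minimal sketch of this pair-and-classify step; the keyword rule below is a toy stand-in for a trained classifier (e.g., a fine-tuned BERT model with the entity pair marked in the input), and the names and phrases are illustrative assumptions:

```python
from itertools import combinations

def classify_relation(answer: str, head: str, tail: str) -> str:
    """Toy stand-in for a trained relation classifier."""
    if f"{head} belongs to the {tail}" in answer:
        return "belongs_to"
    if f"{head} can solve {tail}" in answer:
        return "solves"
    return "no_relation"

def extract_relations(answer, entities):
    triples = []
    for a, b in combinations(entities, 2):
        for head, tail in ((a, b), (b, a)):   # relations are directed
            label = classify_relation(answer, head, tail)
            if label != "no_relation":
                triples.append((head, label, tail))
    return triples

answer = "the committed read level can solve dirty reads"
print(extract_relations(answer, ["committed read level", "dirty reads"]))
# [('committed read level', 'solves', 'dirty reads')]
```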
Step S130, obtaining the standard answer corresponding to the question, and determining answer result information of the candidate based on the standard answer and the entity relation information.
The answer result information represents how well the candidate has mastered the plurality of knowledge points involved in the question.
In general, a question can involve a plurality of knowledge points, and one or more semantic entity relations in the answer correspond to one knowledge point. The main judgment is whether each extracted semantic entity relation is consistent with the semantic entity relations in the standard answer: knowledge points corresponding to consistent parts indicate that the candidate has mastered them well, knowledge points corresponding to inconsistent parts indicate that the candidate has mastered them incorrectly, and knowledge points corresponding to missing parts indicate gaps in the candidate's mastery.
Continuing the example above, the standard answer includes: the uncommitted read level belongs to the transaction isolation level, the committed read level belongs to the transaction isolation level, the repeatable read level belongs to the transaction isolation level, the serialization level belongs to the transaction isolation level, the committed read level can solve dirty reads, the repeatable read level can solve non-repeatable reads, the serialization level can solve dirty reads, the serialization level can solve non-repeatable reads, and the serialization level can solve phantom reads. Comparing the standard answer with the semantic entity relations in step S120, it can be found that 4 relations are consistent ("the committed read level belongs to the transaction isolation level", "the repeatable read level belongs to the transaction isolation level", "the committed read level can solve dirty reads", "the repeatable read level can solve non-repeatable reads"), 1 relation is inconsistent ("the repeatable read level can solve phantom reads"), and 5 relations are unmentioned ("the uncommitted read level belongs to the transaction isolation level", "the serialization level belongs to the transaction isolation level", "the serialization level can solve dirty reads", "the serialization level can solve non-repeatable reads", "the serialization level can solve phantom reads"). This shows that the candidate's mastery of the "repeatable read level" direction of the transaction isolation level contains an error, and that the candidate's mastery of the "uncommitted read level" and "serialization level" directions of the transaction isolation level has gaps.
Optionally, the entity relation information includes entity relation information of a plurality of attributes, the attributes being determined based on the question. Generally, semantic entity relations can be categorized by attribute, and the attributes are determined according to the question's standard answer. Continuing with the above example, the standard answer involves which levels the transaction isolation level includes and which problems each level can solve, so it can be determined that there are two relation attributes, a "belongs-to relation" and a "solves relation". After the attributes are determined, the semantic entity relations are classified according to the attributes.
Further, this step may determine the answer result information in the following manner:
the semantic entity relation answers of each attribute are combed out of the standard answer; then, for each attribute, the entity relation information of that attribute is matched against the semantic entity relation answers to obtain the candidate's answer result information under that attribute.
Continuing the above example, the standard answer includes 4 sets of "belongs-to relation" semantic entity relation answers: "the uncommitted read level belongs to the transaction isolation level", "the committed read level belongs to the transaction isolation level", "the repeatable read level belongs to the transaction isolation level", "the serialization level belongs to the transaction isolation level"; and 5 sets of "solves relation" semantic entity relation answers: "the committed read level can solve dirty reads", "the repeatable read level can solve non-repeatable reads", "the serialization level can solve dirty reads", "the serialization level can solve non-repeatable reads", "the serialization level can solve phantom reads". The entity relation information extracted from the candidate's answer includes 2 sets of "belongs-to relation" entries: "the committed read level belongs to the transaction isolation level", "the repeatable read level belongs to the transaction isolation level"; and 3 sets of "solves relation" entries: "the committed read level can solve dirty reads", "the repeatable read level can solve phantom reads", "the repeatable read level can solve non-repeatable reads".
Comparing the two yields the candidate's answer result information under each attribute: under the "belongs-to relation", the knowledge points "uncommitted read level" and "serialization level" are missing; under the "solves relation", the knowledge points about which problems the "serialization level" can solve are missing, and the knowledge point about which problems the "repeatable read level" can solve contains an error.
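Concretely, this attribute-wise comparison reduces to set operations over (head, relation, tail) triples; a minimal sketch, with names that are illustrative rather than from the patent:

```python
from collections import defaultdict

def grade_answer(candidate_triples, standard_triples):
    """Compare extracted triples with the standard answer, grouped by
    relation attribute: consistent = mastered well, wrong = mastered
    incorrectly, missing = mastery gap."""
    by_attr = defaultdict(lambda: (set(), set()))
    for t in candidate_triples:
        by_attr[t[1]][0].add(t)          # t[1] is the relation attribute
    for t in standard_triples:
        by_attr[t[1]][1].add(t)
    return {attr: {"consistent": cand & gold,
                   "wrong": cand - gold,
                   "missing": gold - cand}
            for attr, (cand, gold) in by_attr.items()}
```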
Step S140, determining at least one follow-up strategy based on the answer result information, determining follow-up knowledge points in the knowledge graph corresponding to the question according to the follow-up strategy, and generating follow-up questions corresponding to the follow-up knowledge points.
In this scheme, a plurality of follow-up strategies are available for selection, and each strategy considers both the consistency of the knowledge points in the candidate's answer with the standard answer and the divergent expansion of associated knowledge points. The follow-up strategies include a peer knowledge point strategy (i.e., breadth-first: starting from the knowledge points the candidate is missing, search for related peer knowledge points), a sub-knowledge point strategy (i.e., depth-first: starting from the knowledge points the candidate has mastered well, search for related sub-knowledge points), and an erroneous knowledge point strategy (i.e., error clarification: ask again about the knowledge points the candidate has mastered incorrectly).
Specifically, a correspondence between the candidate's answer result information and follow-up strategies can be established in advance: knowledge points mastered well correspond to the sub-knowledge point strategy, missing knowledge points correspond to the peer knowledge point strategy, and incorrectly mastered knowledge points correspond to the erroneous knowledge point strategy.
Furthermore, in this scheme a tree-shaped knowledge graph of knowledge points is pre-established, which includes the hierarchical relations among knowledge points and the recommended questions corresponding to each knowledge point. The knowledge point corresponding to the question is first located in the knowledge graph; after the follow-up strategy is determined, the corresponding follow-up knowledge points can be found in the knowledge graph according to the strategy, and follow-up questions are selected from the recommended questions corresponding to those knowledge points.
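A hedged sketch of the strategy dispatch over such a tree-shaped graph; the class, field, and strategy names are assumptions made for illustration, not identifiers from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeNode:
    """One knowledge point: hierarchy links plus recommended questions."""
    name: str
    questions: list = field(default_factory=list)
    parent: "KnowledgeNode" = None
    children: list = field(default_factory=list)

    def add_child(self, child: "KnowledgeNode") -> "KnowledgeNode":
        child.parent = self
        self.children.append(child)
        return child

STRATEGY_BY_RESULT = {                    # pre-established correspondence
    "mastered_well": "sub_knowledge",     # depth-first: drill into children
    "missing":       "peer_knowledge",    # breadth-first: ask about siblings
    "wrong":         "error_clarify",     # ask the same point again
}

def follow_up_nodes(node: KnowledgeNode, strategy: str) -> list:
    if strategy == "sub_knowledge":
        return node.children
    if strategy == "peer_knowledge":
        siblings = node.parent.children if node.parent else []
        return [n for n in siblings if n is not node]
    return [node]                          # error_clarify: revisit the node

def follow_up_questions(node: KnowledgeNode, result: str) -> list:
    nodes = follow_up_nodes(node, STRATEGY_BY_RESULT[result])
    return [q for n in nodes for q in n.questions]
```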
Further, since step S140 is decoupled from step S130, the follow-up strategy can be flexibly adjusted according to the actual business scenario. The follow-up knowledge points may be determined as follows:
determining the follow-up strategy corresponding to each attribute according to the answer result information under that attribute; for each attribute, determining follow-up knowledge points in the knowledge graph corresponding to the question according to the follow-up strategy corresponding to that attribute; and determining follow-up questions in the question bank based on each follow-up knowledge point.
Continuing the above example, since the candidate's "uncommitted read level" and "serialization level" knowledge points are missing, the peer knowledge point strategy is triggered for "uncommitted read level" and "serialization level" to find the peer knowledge points associated with them. For example, since the candidate did not mention that "the serialization level belongs to the transaction isolation level", the follow-up question "Do you know about serialization? What does serialization refer to?" may be triggered.
The candidate has mastered well the knowledge point of which problems the committed read level can solve, so the sub-knowledge point strategy is triggered for that knowledge point. For example, since the candidate correctly answered that "the committed read level can solve dirty reads", the follow-up question "How is the committed read level implemented?" or "What does dirty read mean?" can be triggered.
The candidate's mastery of the knowledge point of which problems the repeatable read level can solve contains an error, so the erroneous knowledge point strategy is triggered for that knowledge point. For example, since the candidate wrongly answered that "the repeatable read level can solve phantom reads", the follow-up question "Which problems can the repeatable read level solve?" can be triggered.
According to the technical scheme provided by the embodiments of the invention, connections between knowledge points are established in the form of a knowledge graph; the direction of each follow-up question can be determined from the candidate's response, and targeted follow-up questions can be asked about sub-knowledge points, peer knowledge points, unmentioned knowledge points, and so on, so that follow-up questions help the interviewer probe the depth and breadth of the candidate's knowledge and ultimately determine how well the candidate matches the post.
Each question in the question bank carries a standard answer or information about its assessment points, which may collectively be referred to as prior knowledge. The inventors found that, for the knowledge points involved in each question, the prior knowledge of those knowledge points can be added to the semantic entity recognition step, and the candidate's answer can be preprocessed (text correction, text completion, and replacement of similar words) to enhance the accuracy of semantic entity recognition.
As an optional implementation manner of the embodiment of the present invention, before performing step S110, the method further includes:
extracting the knowledge points involved in the question to obtain prior knowledge information; the prior knowledge information includes the semantic entities, common prefixes, common suffixes, and polysemous word senses in the standard answer.
For example, the standard answer to the question "Which systems does an automobile chassis consist of?" is "transmission system, brake system, running system, steering system", and the four semantic entities in the standard answer share the common suffix "system". Candidates usually omit this common suffix when answering, for example answering directly "transmission, brake, running, steering"; the prior knowledge can then be used to map the mention "transmission" directly to the semantic entity "transmission system".
For another example, the standard answer to the question "plants of the Rosaceae family" is "apple, rose", where "apple" is a polysemous word that can refer both to a plant and to a mobile phone brand; the sense of "apple" in this question is the plant rather than the phone brand. When the candidate's answer mentions "apple", the prior knowledge can be used to map that mention directly to the plant sense of the semantic entity "apple".
As some alternative implementations of the embodiments of the present invention, there are two main types of conventional semantic entity recognition: the dictionary-plus-rule approach and the machine learning/deep learning approach. The prior knowledge obtained from question analysis can enhance the accuracy of semantic entity recognition in both modes. Fig. 2 shows the prior-knowledge enhancement method for the dictionary-plus-rule mode, and fig. 3 shows the prior-knowledge enhancement method for the machine learning/deep learning mode.
As shown in fig. 2, the semantic entity recognition method comprises the following steps:
Step S210, determining the core words in the standard answer based on the prior knowledge information, performing finest-granularity word segmentation on each core word to obtain a plurality of first word segments included in each core word, and calculating the weight of each first word segment.
Specifically, the core words are determined according to the semantic entities in the prior knowledge information. Each core word is segmented using the finest-granularity segmentation rule, and the weight of each first word segment is calculated. For example, take the question "Which systems does an automobile chassis consist of?", whose standard answer "transmission system, brake system, running system, steering system" includes the four core words "transmission system", "brake system", "running system", and "steering system". Taking "brake system" as an example, finest-granularity segmentation yields the two first word segments "brake" and "system", and the weight of each is calculated using a formula (rendered as an image in the original publication and not reproduced in this text) with the following parameters:
where FreqD is the frequency of occurrence of the first word segment in all documents, FreqQ is the frequency of occurrence of the first word segment in the present standard answer, BonusPrefix is a reward parameter applied when the first word segment occurs as a prefix of at least two core words in the present standard answer, and BonusSuffix is a reward parameter applied when the first word segment occurs as a suffix of at least two core words in the present standard answer; the reward parameters are preset constants greater than 1. All of the above parameters can be obtained from the prior knowledge.
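Since the formula itself is lost from this text, the following is only a plausible reconstruction consistent with the stated variable definitions (segments that are rare in the corpus but frequent in the standard answer score high, with multiplicative prefix/suffix bonuses); it is an assumption, not the patent's actual formula:

```python
def segment_weight(freq_d: float, freq_q: float,
                   shared_prefix: bool, shared_suffix: bool,
                   bonus_prefix: float = 1.5,
                   bonus_suffix: float = 1.5) -> float:
    """Assumed form only: weight grows with FreqQ, shrinks with FreqD,
    and gains constant >1 bonuses for shared prefixes/suffixes."""
    w = freq_q / max(freq_d, 1e-9)   # guard against a zero corpus count
    if shared_prefix:
        w *= bonus_prefix
    if shared_suffix:
        w *= bonus_suffix
    return w
```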
Step S220, segmenting the answer to obtain a plurality of second word segments; the length range of the second word segments is determined based on the lengths of the first word segments.
The candidate's answer is segmented using an N-gram model to obtain the second word segment combinations in the answer, where the lower limit of the length range of each second segment (i.e., the value of N) is the minimum length of all first word segments minus 2, and the upper limit is the maximum length of all first word segments plus 2. For example, the longest first word segment, "brake", has length 2 in the original Chinese and the shortest, "system", has length 1, so the upper limit of N is 2 + 2 = 4 and the lower limit is 1 - 2 = -1; the meaningful values of N in this example are therefore 1 to 4.
For example, if the candidate's answer is "drive automatic travel" (a six-character string in the original Chinese), all character N-grams with N from 1 to 4 are enumerated as second word segments: 6 unigrams, 5 bigrams, 4 trigrams, and 3 four-grams, 18 combinations in total.
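A minimal sketch of that enumeration, using the ±2 bounds above and 1-based positions to match the matching-result examples later in the text (the function name is an assumption):

```python
def ngram_segments(answer: str, first_segments: list) -> list:
    """Enumerate all character N-grams whose length N ranges from
    (shortest first segment - 2) to (longest first segment + 2)."""
    lo = max(1, min(len(s) for s in first_segments) - 2)
    hi = max(len(s) for s in first_segments) + 2
    return [(answer[i:i + n], list(range(i + 1, i + n + 1)))  # 1-based
            for n in range(lo, hi + 1)
            for i in range(len(answer) - n + 1)]

# A six-character answer with N in 1..4 yields 6 + 5 + 4 + 3 = 18 segments.
print(len(ngram_segments("abcdef", ["xy", "z"])))  # 18
```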
Step S230, sorting the first word segments in each core word according to their weights, sequentially calculating, in weight order, the similarity between each first word segment of the core word and each second word segment, screening out candidate second word segments whose similarity is higher than a first preset similarity threshold, and outputting a plurality of first matching results, where a first matching result comprises the first word segment, the candidate second word segment, the similarity and similarity type between them, and the position of the candidate second word segment in the answer.
For each core word, its first word segments are sorted by the weights calculated in step S210. For example, for the core word "brake system", the weight of the first word segment "brake" is 0.9 and the weight of the first word segment "system" is 0.1, so sorting the first word segments of "brake system" yields "brake", "system".
Further, the similarity between each first word segment and each second word segment of the candidate's answer is calculated sequentially in weight order, and the second word segments satisfying the condition are screened out by a similarity threshold. Preferably, the similarity includes edit distance similarity, semantic similarity, and pronunciation similarity. The edit distance is the minimum number of operations (insertion, deletion, and substitution) needed to convert one string into another; semantic similarity is a vector similarity computed from word embeddings; pronunciation similarity judges the similarity of two Chinese characters from their initials, finals, and tones.
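Of the three measures, edit distance is the most mechanical; a minimal sketch of a normalized edit-distance similarity follows (the normalization by the longer length is an assumption, since the patent does not specify one):

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via a single-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[len(b)]

def edit_similarity(a: str, b: str) -> float:
    return 1.0 - edit_distance(a, b) / max(len(a), len(b), 1)

print(edit_similarity("brake system", "brake sistem"))  # 11/12 = 0.9166...
```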
More preferably, the first-ranked first word segment is compared against all second word segments of the candidate's answer; the second-ranked first word segment is compared only against the second word segments adjacent to the target second word segment that matched the first-ranked segment; and so on. For example, for "brake system", the first-ranked first word segment "brake" is first compared with each second word segment of the candidate's answer "drive automatic travel", where the pronunciation similarity between "brake" and "automatic" is 0.99, above the threshold; the second-ranked first word segment "system" is then compared with the second word segments adjacent to the target segment "automatic" in the answer, and each similarity is below the threshold. Thus, through threshold comparison, the candidate second word segment "automatic" is screened out for the first word segment "brake", which can be recorded as: ("brake", "automatic", [3,4], 0.99, pronunciation-similar), where [3,4] denotes the 3rd and 4th character positions of "automatic" within "drive automatic travel". The first word segment "system" has no candidate second word segment, which can be recorded as ("system", "", [], 0, none). Because "system" has no candidate, the candidate second word segment "automatic" alone constitutes the matching result for the core word "brake system" in the answer.
Further, the five fields of the matching result in this step can be summarized as: the first word segment of a core word in the standard answer; the N-gram segment in the candidate's answer; the position of that N-gram segment in the original answer text; the similarity between the two; and the similarity type (pronunciation similarity, edit distance similarity, semantic similarity, segment missing, etc.).
Step S240, merging the plurality of first matching results into a second matching result, calculating the weighted similarity between the core word and the second matching result based on the weight of each first word segment in the core word and the similarity between each first word segment and its corresponding candidate second word segment, and screening out the second matching results whose weighted similarity is higher than a second preset similarity threshold.
For example, the matching results ("brake", "automatic", [3,4], 0.99, pronunciation-similar) and ("system", "", [], 0, none) from step S230 are merged to obtain the matching result for the core word. For each core word, the weight of each of its first word segments is multiplied by the similarity of that segment's candidate second word segment, and the products are summed to obtain the weighted similarity between the core word and the matching result. That is: the weight of the first word segment "brake" of the core word "brake system" is 0.9 and the similarity of its candidate second word segment "automatic" is 0.99, while the weight of the first word segment "system" is 0.1 and it has no candidate second word segment, so the weighted similarity score between the core word "brake system" and the matching result is 0.9 × 0.99 + 0.1 × 0 = 0.891. Assuming this value is higher than the preset threshold of 0.85, the merged matching result ("brake system", "automatic", [3,4], 0.891, [pronunciation-similar]) is screened out.
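The merge step is a weighted sum of segment weights and their best candidate similarities; a minimal sketch reproducing the 0.891 from the example (data shapes are illustrative):

```python
def weighted_similarity(core_segments: dict, matches: dict) -> float:
    """core_segments: {first_segment: weight};
    matches: {first_segment: best candidate similarity},
    with 0.0 assumed when a segment found no candidate."""
    return sum(w * matches.get(seg, 0.0)
               for seg, w in core_segments.items())

# 0.9 * 0.99 + 0.1 * 0.0 == 0.891, matching the "brake system" example.
print(weighted_similarity({"brake": 0.9, "system": 0.1},
                          {"brake": 0.99}))
```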
Step S250, based on the second matching result, determining a plurality of semantic entities in the answer.
On the basis of steps S210-S250, further: if a candidate second word segment at the same position in the answer appears in a plurality of second matching results, a matching conflict occurs. The weighted similarities between core words and second matching results are then optimized based on a conflict matching algorithm until each candidate second word segment at a given position appears in only one second matching result, and a preset number of second matching results with the highest weighted similarity are taken as the final matching results; the plurality of semantic entities in the answer is determined based on the screened final matching results.
In this step, a conflict means that a character at the same position in the answer appears in more than one second word segment. For example, the standard answer is "chassis system" and the candidate's answer is "the chassis is subdivided into a suspension system and a steering system"; step S240 outputs three matching results: ("chassis system", "chassis", [1,2], 0.9, [segment missing]), ("chassis system", "chassis sub", [1,2,3], 0.88, [edit-distance-similar, pronunciation-similar]), ("chassis system", "chassis subd", [1,2,3,4], 0.85, [edit-distance-similar, pronunciation-similar]). All three attempt to match "chassis system" in the standard answer, and there are position conflicts; for example, the position list [1,2] of the first matching result and the position list [1,2,3] of the second share the elements [1,2]. This step selects the most suitable M matching results from the N matching results, which must satisfy two conditions: 1. the M matching results do not conflict with one another; 2. the average similarity score of the M matching results is highest. Preferably, the matching conflict algorithm solves for the optimum using divide-and-conquer and dynamic programming.
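The patent solves this selection optimally with divide-and-conquer and dynamic programming; as a simpler hedged stand-in, a greedy score-ordered filter already enforces the non-conflict constraint, though it does not guarantee the highest average score:

```python
def resolve_conflicts(matches: list) -> list:
    """Greedy stand-in for the patent's divide-and-conquer/DP solver:
    keep the highest-scoring matches whose position lists do not
    overlap. Each match is (core_word, text, positions, score, types)."""
    chosen, used = [], set()
    for m in sorted(matches, key=lambda m: m[3], reverse=True):
        positions = set(m[2])
        if not positions & used:
            chosen.append(m)
            used |= positions
    return chosen

results = [("chassis system", "chassis", [1, 2], 0.90, ["segment missing"]),
           ("chassis system", "chassis sub", [1, 2, 3], 0.88, ["edit"]),
           ("chassis system", "chassis subd", [1, 2, 3, 4], 0.85, ["edit"])]
print(resolve_conflicts(results)[0][1])  # 'chassis': highest score wins
```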
As shown in fig. 3, the semantic entity recognition method comprises:
step S310, carrying out text correction on the answer based on the prior knowledge information, the similarity of the first word segmentation and the second word segmentation to obtain a corrected answer; wherein the text modification includes correction of text similar pronunciation, expansion of text prefix and suffix, and substitution of text similar words.
In this step, the correction includes correction of text-like pronunciations of the answer, expansion of text prefixes and substitution of text-like words. Specific:
for example, if the standard answer is "brake system", and the candidate answer is "drive-thru", step S240 outputs a matching result ("brake system", "auto", [3,4],0.891, [ pronunciation similar ]), which is combined from the following 2 sub-matching results: ("brake", "automatic", [3,4],0.99, similar pronunciation), ("tie", [ ],0, none), the candidate answer "drive automatic travel" is corrected to "drive brake travel" according to the ("brake", "automatic", [3,4],0.99, similar pronunciation).
For example, if the standard answer is "before and after destruction", the answer of the candidate is "before and after destruction", the matching result ("before destruction", "1, 2, 3", "1.0", "similar edit distance"), ("after destruction", "1, 2, 4", "0.9", "similar edit distance ]) is outputted in step S240, and it can be seen that the two matches share" destruction ", so that the answer of the candidate" before and after destruction "is expanded to" before and after destruction ".
For example, the text of the answer of the candidate is replaced by a similar word, the similar word is used as a class of priori knowledge, and the similar word is provided by an expert when the question is input, and if the answer of the candidate is an 'anonymous pipe', the similar word is replaced by an 'anonymous pipe'.
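Hedged sketches of two of these preprocessing steps follow; the span-replacement convention and the synonym map are illustrative assumptions, not interfaces defined by the patent:

```python
def correct_pronunciation(answer: str, match: tuple) -> str:
    """Replace a pronunciation-confused span with the standard-answer
    segment, e.g. ('brake', 'automatic', [3, 4], 0.99, 'pronunciation')."""
    first_seg, _, positions, _, _ = match
    start, end = positions[0] - 1, positions[-1]   # 1-based -> slice bounds
    return answer[:start] + first_seg + answer[end:]

def replace_synonyms(answer: str, synonym_map: dict) -> str:
    """Map expert-provided variants to their canonical form."""
    for variant, canonical in synonym_map.items():
        answer = answer.replace(variant, canonical)
    return answer

print(replace_synonyms("an unnamed pipe", {"unnamed pipe": "anonymous pipe"}))
# an anonymous pipe
```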
Step S320, inputting the corrected answer into the deep learning model in character form and the final matching results in word form, and outputting the plurality of recognized semantic entities in the answer.
Specifically, the deep learning model is described taking the FLAT algorithm model as an example. Suppose the standard answer is "brake system", "running system" and the candidate's answer is "automatic running"; assume that after conflict-matching optimization the two matching results are ("brake system", "automatic", [1,2], 0.891, [pronunciation-similar]) and ("running system", "running", [3,4], 0.9, [...]), and that after pronunciation correction the candidate's answer becomes "brake running" (制动行驶 in the original Chinese). The data input into the FLAT algorithm model includes two parts. The first part is character-level information, i.e., the four characters (制, 1, 1), (动, 2, 2), (行, 3, 3), (驶, 4, 4), where the three fields are (character, start position in the original text, end position in the original text). The second part is word-level information from the matching results, i.e., the two words ("brake", 1, 2) and ("running", 3, 4), where the three fields are (word, start position of the word in the original text, end position of the word in the original text). The output of the FLAT algorithm model is a sequence label for each character-level item, e.g., a fourth field is added: (制, 1, 1, "B-LOC"), (动, 2, 2, "E-LOC"), (行, 3, 3, "B-LOC"), (驶, 4, 4, "E-LOC"), where "B-LOC" marks the beginning of an entity and "E-LOC" marks the end of an entity, so the final result is to output the two entities "brake" (制动) and "running" (行驶).
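A minimal sketch of assembling that flat lattice input (positions 1-based; the helper name and ASCII stand-in text are assumptions):

```python
def build_flat_input(text: str, word_spans: list) -> list:
    """FLAT-style flat lattice: every character and every matched word
    becomes a (token, head, tail) item with 1-based span positions."""
    chars = [(c, i, i) for i, c in enumerate(text, 1)]
    words = [(word, pos[0], pos[-1]) for word, pos in word_spans]
    return chars + words

# ASCII stand-in for the corrected four-character answer:
print(build_flat_input("abcd", [("brake", [1, 2]), ("running", [3, 4])]))
# [('a',1,1), ('b',2,2), ('c',3,3), ('d',4,4), ('brake',1,2), ('running',3,4)]
```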
In one embodiment, referring to fig. 4, a schematic structural diagram of a follow-up question generation device is provided. The device may be used to perform the follow-up question generation method shown in any of figs. 1-3, and comprises: an identification module 410, an extraction module 420, a determination module 430, and a generation module 440; where,
the identification module 410 is configured to parse the candidate's answer to a question and identify a plurality of semantic entities from the answer;
the extraction module 420 is configured to extract relations among the plurality of semantic entities to obtain entity relation information of the plurality of semantic entities;
the determination module 430 is configured to obtain the standard answer corresponding to the question and determine answer result information of the candidate based on the standard answer and the entity relation information, the answer result information representing how well the candidate has mastered the plurality of knowledge points involved in the question; and
the generation module 440 is configured to determine at least one follow-up strategy based on the answer result information, determine follow-up knowledge points in the knowledge graph corresponding to the question according to the follow-up strategy, and generate follow-up questions corresponding to the follow-up knowledge points; the follow-up strategies include a peer knowledge point strategy, a sub-knowledge point strategy, and an erroneous knowledge point strategy.
Optionally, the device further comprises an extraction module configured to extract the knowledge points involved in the question to obtain prior knowledge information; the prior knowledge information includes the semantic entities, common prefixes, common suffixes, and polysemous word senses in the standard answer.
Optionally, the identification module 410 is further configured to determine the core words in the standard answer based on the prior knowledge information, perform finest-granularity word segmentation on the core words to obtain a plurality of first word segments, and calculate the weight of each first word segment; segment the answer to obtain a plurality of second word segments, the length range of the second word segments being determined based on the lengths of the first word segments; sort the first word segments in each core word according to their weights, sequentially calculate, in weight order, the similarity between each first word segment of the core word and each second word segment, screen out candidate second word segments whose similarity is higher than a first preset similarity threshold, and output a plurality of first matching results, where a first matching result comprises the first word segment, the candidate second word segment, the similarity and similarity type between them, and the position of the candidate second word segment in the answer; merge the plurality of first matching results into a second matching result, calculate the weighted similarity between the core word and the second matching result based on the weight of each first word segment in the core word and the similarity between each first word segment and its corresponding candidate second word segment, and screen out the second matching results whose weighted similarity is higher than a second preset similarity threshold; and determine the plurality of semantic entities in the answer based on the screened second matching results.
Optionally, the identification module 410 is further configured such that, if a candidate second word segment at the same position in the answer appears in a plurality of second matching results, a matching conflict occurs; the weighted similarities between core words and second matching results are optimized based on a conflict matching algorithm until each candidate second word segment at a given position appears in only one second matching result, and a preset number of second matching results with the highest weighted similarity are taken as the final matching results; the plurality of semantic entities in the answer is determined based on the screened final matching results.
Optionally, the identification module 410 is further configured to perform text correction on the answer based on the prior knowledge information and the similarities between the first and second word segments to obtain a corrected answer, where the text correction includes correction of pronunciation-similar text, expansion of text prefixes and suffixes, and replacement of text by similar words; and to input the corrected answer into the deep learning model in character form and the final matching results in word form, and output the plurality of recognized semantic entities in the answer.
Optionally, the generation module 440 is further configured to: if the answer result information indicates that a knowledge point is mastered well, trigger the sub-knowledge point strategy, preferentially determine follow-up sub-knowledge points in the knowledge graph corresponding to the question, and generate follow-up questions corresponding to those sub-knowledge points; if the answer result information indicates that a knowledge point is mastered incorrectly, trigger the error clarification strategy and preferentially ask again about that knowledge point in the knowledge graph corresponding to the question; and if the answer result information indicates that a knowledge point is missing, trigger the peer knowledge point strategy, preferentially determine follow-up peer knowledge points in the knowledge graph corresponding to the question, and generate follow-up questions corresponding to those peer knowledge points.
It should be noted that the follow-up question generation device provided by the embodiments of the present invention may be used to execute the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. Referring now in particular to fig. 5, a schematic diagram of an electronic device 500 suitable for use in implementing embodiments of the present invention is shown. The electronic device 500 in the embodiment of the present invention may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), wearable electronic devices, and the like, and fixed terminals such as digital TVs, desktop computers, smart home devices, and the like. The electronic device shown in fig. 5 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present invention.
As shown in fig. 5, the electronic device 500 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 501, which may perform various suitable actions and processes to implement the methods of embodiments of the present invention according to programs stored in a Read Only Memory (ROM) 502 or loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
In general, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 507 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, and the like; storage 508 including, for example, magnetic tape, hard disk, etc.; and communication means 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 shows an electronic device 500 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided; more or fewer means may alternatively be implemented or provided.
The above description is only illustrative of the preferred embodiments of the present invention and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure referred to in the present invention is not limited to the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the spirit of the disclosure, for example solutions in which the above features are interchanged with technical features having similar functions disclosed in (but not limited to) the present invention.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Also, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the invention. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (8)

1. An intelligent interview follow-up question generation method, characterized by comprising the following steps:
determining core words in a standard answer based on prior knowledge information, performing finest-granularity word segmentation on the core words to obtain a plurality of first word segments, and calculating the weight of each first word segment; segmenting the answer to obtain a plurality of second word segments, the length range of the second word segments being determined based on the lengths of the first word segments; sorting the first word segments in each core word according to their weights, sequentially calculating, in weight order, the similarity between each first word segment of the core word and each second word segment, screening out candidate second word segments whose similarity is higher than a first preset similarity threshold, and outputting a plurality of first matching results, wherein a first matching result comprises the first word segment, the candidate second word segment, the similarity and similarity type between them, and the position of the candidate second word segment in the answer; merging the plurality of first matching results into a second matching result, calculating the weighted similarity between the core word and the second matching result based on the weight of each first word segment in the core word and the similarity between each first word segment and its corresponding candidate second word segment, and screening out the second matching results whose weighted similarity is higher than a second preset similarity threshold; and determining a plurality of semantic entities in the answer based on the screened second matching results;
extracting attribute relations among the plurality of semantic entities to obtain entity relation information among the plurality of semantic entities;
obtaining the standard answer corresponding to the question, and determining answer result information of the candidate based on the standard answer and the entity relation information, wherein the answer result information represents how well the candidate has mastered a plurality of knowledge points involved in the question;
and determining at least one follow-up strategy based on the answer result information, determining follow-up knowledge points in the knowledge graph corresponding to the question according to the follow-up strategy, and generating follow-up questions corresponding to the follow-up knowledge points.
2. The method of claim 1, wherein prior to parsing the candidate's answer to the question, the method further comprises:
extracting the knowledge points involved in the question to obtain the prior knowledge information, wherein the prior knowledge information comprises the semantic entities, common prefixes, common suffixes, and polysemous word senses in the standard answer.
3. The method as recited in claim 1, further comprising:
if a candidate second word segment at the same position in the answer appears in a plurality of second matching results, a matching conflict occurs;
Optimizing the weighted similarity between the core word and the second matching result based on a conflict matching algorithm until the candidate second words at the same position in the answer only appear in one second matching result, and taking the preset number of second matching results with highest weighted similarity as final matching results;
based on the screened final matching result, a plurality of semantic entities in the answer to the answer are determined.
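One plausible reading of this conflict resolution, sketched in Python; the patent names only a "conflict matching algorithm", so the greedy keep-highest-score strategy below is an assumption:

```python
def resolve_conflicts(results, top_k=3):
    """results: list of (weighted_similarity, {answer positions covered})."""
    results = sorted(results, key=lambda r: -r[0])  # best score first
    kept, taken = [], set()
    for score, positions in results:
        if positions & taken:   # a position already claimed -> conflict
            continue            # drop the lower-scored competitor
        kept.append((score, positions))
        taken |= positions
    return kept[:top_k]         # preset number of final matching results

print(resolve_conflicts([(0.9, {3, 4}), (0.8, {4, 5}), (0.7, {7})]))
# -> [(0.9, {3, 4}), (0.7, {7})]
```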
4. The method according to claim 1 or 3, wherein the step of parsing the candidate's answer to the question and identifying a plurality of semantic entities from the answer comprises:
performing text correction on the answer based on the prior knowledge information and the similarity between the first and second word segments to obtain a corrected answer, wherein the text correction comprises correcting similarly pronounced text, expanding text prefixes, and replacing similar words;
and inputting the corrected answer into a deep learning model in character form, inputting the final matching results into the deep learning model in word form, and outputting the plurality of semantic entities recognized in the answer.
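A hedged sketch of this pipeline follows; `EntityModel` is a placeholder for whatever deep learning model the patent trains, and its interface and the toy correction table are assumptions:

```python
def correct_answer(answer: str, replacements: dict) -> str:
    """Toy text correction: swap known near-homophones / similar words.
    Real correction would also expand prefixes via the prior knowledge."""
    for wrong, right in replacements.items():
        answer = answer.replace(wrong, right)
    return answer

class EntityModel:
    """Stand-in for a model that fuses character-level input with
    word-level match features to tag entity spans."""
    def predict(self, chars, word_matches):
        # A real model would tag spans over `chars`; here we simply
        # echo the matched words for illustration.
        return [w for w, _ in word_matches]

def identify_entities(answer, replacements, final_matches):
    corrected = correct_answer(answer, replacements)
    return EntityModel().predict(list(corrected), final_matches)

print(identify_entities("TCP uses a three way hand shake",
                        {"hand shake": "handshake"},
                        [("TCP", 0), ("handshake", 5)]))
# -> ['TCP', 'handshake']
```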
5. The method of claim 1, wherein the step of determining at least one follow-up strategy based on the answer result information, determining follow-up knowledge points in the knowledge graph corresponding to the question according to the follow-up strategy, and generating follow-up questions corresponding to the follow-up knowledge points comprises:
if the answer result information indicates that a knowledge point is well mastered, triggering a sub-knowledge-point strategy, preferentially determining sub-knowledge points to pursue in the knowledge graph corresponding to the question, and generating follow-up questions corresponding to those sub-knowledge points;
if the answer result information indicates that a knowledge point is mastered incorrectly, triggering a re-questioning strategy and preferentially asking again about that knowledge point in the knowledge graph corresponding to the question;
and if the answer result information indicates that a knowledge point is missing from the answer, triggering a peer-knowledge-point strategy, preferentially determining peer knowledge points to pursue in the knowledge graph corresponding to the question, and generating follow-up questions corresponding to those peer knowledge points.
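The three-way strategy dispatch of claim 5 can be illustrated as follows; the mastery labels, the stub knowledge graph, and the question template are assumptions:

```python
class ToyGraph:
    """Stub knowledge graph; a real one would be traversed by node id."""
    def child_of(self, node):   return f"a sub-topic of {node}"
    def sibling_of(self, node): return f"a peer topic of {node}"

def choose_follow_up(mastery: str, point: str, graph) -> str:
    if mastery == "good":      # well mastered -> probe depth via a child node
        target = graph.child_of(point)
    elif mastery == "wrong":   # answered incorrectly -> re-ask the same point
        target = point
    else:                      # "missing" -> probe breadth via a peer node
        target = graph.sibling_of(point)
    return f"Could you tell me more about {target}?"  # toy question template

print(choose_follow_up("good", "TCP congestion control", ToyGraph()))
```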
6. An intelligent interview follow-up question generating device, characterized by comprising:
a recognition module, used for determining core words in a standard answer based on prior knowledge information, performing finest-granularity word segmentation on the core words to obtain a plurality of first word segments, and calculating a weight for each first word segment; segmenting the candidate's answer to obtain a plurality of second word segments, wherein the length of the second word segments is determined based on the length of the first word segments; sorting the first word segments within a core word by their weights and, in descending order of weight, calculating the similarity between each first word segment and each second word segment, screening out candidate second word segments whose similarity exceeds a first preset similarity threshold, and outputting a plurality of first matching results, wherein each first matching result comprises the first word segment, the candidate second word segment, the similarity between them, the type of that similarity, and the position of the candidate second word segment in the answer; combining the plurality of first matching results into a second matching result, calculating a weighted similarity between the core word and the second matching result based on the weight of each first word segment in the core word and its similarity to the corresponding candidate second word segment, and screening out the second matching results whose weighted similarity exceeds a second preset similarity threshold; and determining a plurality of semantic entities in the answer based on the screened second matching results;
an extraction module, used for extracting attribute relations among the plurality of semantic entities to obtain entity relation information among them;
a determining module, used for obtaining the standard answer corresponding to the question and determining answer result information of the candidate based on the standard answer and the entity relation information, wherein the answer result information characterizes the candidate's mastery of a plurality of knowledge points related to the question;
and a generating module, used for determining at least one follow-up strategy based on the answer result information, determining follow-up knowledge points in the knowledge graph corresponding to the question according to the follow-up strategy, and generating follow-up questions corresponding to the follow-up knowledge points.
7. The intelligent interview follow-up question generating device of claim 6, wherein the recognition module is further configured to determine that a matching conflict occurs if a candidate second word segment at the same position in the answer appears in a plurality of second matching results; to optimize the weighted similarity between the core word and the second matching results based on a conflict matching algorithm until each candidate second word segment at a given position in the answer appears in only one second matching result, taking a preset number of second matching results with the highest weighted similarity as the final matching results; and to determine a plurality of semantic entities in the answer based on the screened final matching results.
8. An electronic device, comprising:
one or more processors;
a storage means for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1 to 5.
CN202211548972.5A 2022-12-05 2022-12-05 Intelligent interview topdressing problem generation method and device and electronic equipment Active CN115774996B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211548972.5A CN115774996B (en) 2022-12-05 2022-12-05 Intelligent interview topdressing problem generation method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN115774996A (en) 2023-03-10
CN115774996B (en) 2023-07-25

Family

ID=85391370

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211548972.5A Active CN115774996B (en) 2022-12-05 2022-12-05 Intelligent interview topdressing problem generation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN115774996B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116485597B (en) * 2023-04-17 2024-05-07 北京正曦科技有限公司 Standardized training method based on post capability model

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126553A (en) * 2019-12-25 2020-05-08 平安银行股份有限公司 Intelligent robot interviewing method, equipment, storage medium and device

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6741504B2 (en) * 2016-07-14 2020-08-19 株式会社ユニバーサルエンターテインメント Interview system
CN109978339A (en) * 2019-02-27 2019-07-05 平安科技(深圳)有限公司 AI interviews model training method, device, computer equipment and storage medium
CN111445200A (en) * 2020-02-25 2020-07-24 平安国际智慧城市科技股份有限公司 Interviewing method and device based on artificial intelligence, computer equipment and storage medium
CN114528894A (en) * 2020-11-09 2022-05-24 无锡近屿智能科技有限公司 Training method of follow-up model and follow-up method of surface test questions
CN113392187A (en) * 2021-06-17 2021-09-14 上海出版印刷高等专科学校 Automatic scoring and error correction recommendation method for subjective questions
CN113946651B (en) * 2021-09-27 2024-05-10 盛景智能科技(嘉兴)有限公司 Maintenance knowledge recommendation method and device, electronic equipment, medium and product
CN114048327A (en) * 2021-11-15 2022-02-15 浙江工商大学 Automatic subjective question scoring method and system based on knowledge graph
CN115048532A (en) * 2022-06-16 2022-09-13 中国第一汽车股份有限公司 Intelligent question-answering robot for automobile maintenance scene based on knowledge graph and design method

Also Published As

Publication number Publication date
CN115774996A (en) 2023-03-10

Similar Documents

Publication Publication Date Title
CN109885660B (en) Knowledge graph energizing question-answering system and method based on information retrieval
CN111475623B (en) Case Information Semantic Retrieval Method and Device Based on Knowledge Graph
CN111708873A (en) Intelligent question answering method and device, computer equipment and storage medium
US20170161619A1 (en) Concept-Based Navigation
CN111046133A (en) Question-answering method, question-answering equipment, storage medium and device based on atlas knowledge base
CN112035730B (en) Semantic retrieval method and device and electronic equipment
CN112163077B (en) Knowledge graph construction method for field question and answer
CN113505586A (en) Seat-assisted question-answering method and system integrating semantic classification and knowledge graph
CN106708929B (en) Video program searching method and device
US20180068225A1 (en) Computer and response generation method
CN102855317A (en) Multimode indexing method and system based on demonstration video
CN112685550B (en) Intelligent question-answering method, intelligent question-answering device, intelligent question-answering server and computer readable storage medium
CN109522396B (en) Knowledge processing method and system for national defense science and technology field
CN107844531B (en) Answer output method and device and computer equipment
CN115774996B (en) Intelligent interview topdressing problem generation method and device and electronic equipment
CN111400584A (en) Association word recommendation method and device, computer equipment and storage medium
WO2019173085A1 (en) Intelligent knowledge-learning and question-answering
CN111369980A (en) Voice detection method and device, electronic equipment and storage medium
CN112613293A (en) Abstract generation method and device, electronic equipment and storage medium
CN112686051A (en) Semantic recognition model training method, recognition method, electronic device, and storage medium
CN106570196B (en) Video program searching method and device
Long An agent-based approach to table recognition and interpretation
CN109684357B (en) Information processing method and device, storage medium and terminal
CN115828854A (en) Efficient table entity linking method based on context disambiguation
CN116976321A (en) Text processing method, apparatus, computer device, storage medium, and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant