CN116306569A - Model evaluation method and device, electronic equipment and storage medium - Google Patents

Model evaluation method and device, electronic equipment and storage medium

Info

Publication number
CN116306569A
CN116306569A CN202310074952.7A
Authority
CN
China
Prior art keywords
characteristic information
feedback data
public opinion
information
opinion feedback
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310074952.7A
Other languages
Chinese (zh)
Inventor
尹杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd filed Critical Vivo Mobile Communication Co Ltd
Priority to CN202310074952.7A priority Critical patent/CN116306569A/en
Publication of CN116306569A publication Critical patent/CN116306569A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/166Editing, e.g. inserting or deleting
    • G06F40/169Annotation, e.g. comment data or footnotes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/268Morphological analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a model evaluation method and device, electronic equipment, and a storage medium, belonging to the technical field of artificial intelligence. The method comprises the following steps: obtaining public opinion feedback data; acquiring, through a target model, a plurality of pieces of feature information corresponding to the public opinion feedback data, wherein the feature information comprises a scene, an attribute, a viewpoint, and an emotion; determining the correctness of the feature information according to the emotion of the feature information and the emotion of pre-annotated annotation information corresponding to the public opinion feedback data; and determining an evaluation result of the target model according to the number of pieces of correct feature information.

Description

Model evaluation method and device, electronic equipment and storage medium
Technical Field
The application belongs to the technical field of artificial intelligence, and in particular relates to a model evaluation method and device, electronic equipment, and a storage medium.
Background
In public opinion data, a user gives public opinion feedback in terms of scenes, attributes, and viewpoints, carrying the user's own emotion. To extract the user's specific opinion, a quadruple of scene, attribute, viewpoint, and emotion is used to represent the public opinion. Because users express themselves in many ways and Chinese expressions are diverse, a single word or multiple words may convey the specific content of the feedback, so the public opinion extracted by a model may not be consistent with what the user intended to express. Existing model evaluation methods cannot accurately judge such cases, which reduces the accuracy of evaluating how well a model extracts public opinion.
Disclosure of Invention
The embodiments of the application aim to provide a model evaluation method and device, electronic equipment, and a storage medium, which can solve the problem of low accuracy in evaluating how well a model extracts public opinion.
In a first aspect, an embodiment of the present application provides a method for evaluating a model, where the method includes:
obtaining public opinion feedback data;
acquiring, through a target model, a plurality of pieces of feature information corresponding to the public opinion feedback data, wherein the feature information comprises a scene, an attribute, a viewpoint, and an emotion;
determining the correctness of the characteristic information according to the emotion of the characteristic information and the emotion of the labeling information corresponding to the public opinion feedback data, which is labeled in advance;
and determining the evaluation result of the target model according to the number of the correct characteristic information.
In a second aspect, an embodiment of the present application provides a model evaluation device, where the device includes:
the first acquisition module is used for acquiring public opinion feedback data;
the second acquisition module is used for acquiring, through a target model, a plurality of pieces of feature information corresponding to the public opinion feedback data, wherein the feature information comprises a scene, an attribute, a viewpoint, and an emotion.
the first determining module is used for determining the correctness of the characteristic information according to the emotion of the characteristic information and the emotion of the labeling information which is labeled in advance and corresponds to the public opinion feedback data;
And the second determining module is used for determining the evaluation result of the target model according to the number of the correct characteristic information.
In a third aspect, embodiments of the present application provide an electronic device comprising a processor and a memory storing a program or instructions executable on the processor, which when executed by the processor, implement the steps of the method as described in the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and where the processor is configured to execute a program or instructions to implement a method according to the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product stored in a storage medium, the program product being executable by at least one processor to implement the method according to the first aspect.
In the embodiments of the application, feature information corresponding to the public opinion feedback data is obtained through the target model, the feature information comprising a scene, an attribute, a viewpoint, and an emotion. The correctness of the feature information is judged through the emotion of the feature information and the emotion of the annotation information, so whether the feature information is correct can be accurately judged. The evaluation result of the target model is then determined according to the number of pieces of correct feature information: the better the target model performs, the more pieces of correct feature information it produces, so this number accurately reflects the target model's performance and improves the accuracy of its evaluation result.
Drawings
FIG. 1 is a flow chart of a model evaluation method in an embodiment of the present application.
FIG. 2 is a flow chart of a model evaluation method in another embodiment of the present application.
FIG. 3 is a schematic diagram of a model evaluation device in an embodiment of the present application.
Fig. 4 is a schematic diagram of an electronic device in an embodiment of the present application.
Fig. 5 is a schematic diagram of a hardware structure of an electronic device in an embodiment of the present application.
Detailed Description
Technical solutions in the embodiments of the present application will be clearly described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application are within the scope of the protection of the present application.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the application are capable of operation in sequences other than those illustrated or otherwise described herein, and that the objects identified by "first," "second," etc. are generally of a type and do not limit the number of objects, for example, the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/", generally means that the associated object is an "or" relationship.
The model evaluation method provided by the embodiment of the application is described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
As shown in FIG. 1, the present embodiment describes a model evaluation method, which includes steps 1100-1400.
Step 1100: and obtaining public opinion feedback data.
The public opinion feedback data may be feedback from a user on the use of a product, and it can reflect the user's experience with the product. For example, the public opinion feedback data may be "The mobile card works on another phone, but on this phone there is no signal and no service; what is going on?".
Step 1200: and acquiring a plurality of characteristic information corresponding to the public opinion feedback data through a target model, wherein the characteristic information comprises scenes, attributes, views and emotions.
The scene, the attribute, and the viewpoint of the feature information are directly extracted from the public opinion feedback data; each of them is a character string appearing in the data. Multiple scenes, multiple attributes, and multiple viewpoints can be extracted from one piece of public opinion feedback data through the target model. For example, for the public opinion feedback data "The mobile card works on another phone, but on this phone there is no signal and no service; what is going on?", the scene is "mobile card", the attributes are "service" and "signal", and the viewpoint is "no".
According to the multiple scenes, multiple attributes, and multiple viewpoints of the public opinion feedback data, multiple relation-matching pairs with association relations can be obtained, each relation-matching pair comprising one scene, one attribute, and one viewpoint. Emotion judgment is performed on each relation-matching pair to obtain a corresponding emotion judgment result, and the result is combined with the relation-matching pair to obtain a piece of feature information. Feature information may be represented in the form of a quadruple. For example, for the relation-matching pair <scene: mobile card; attribute: service; viewpoint: no>, emotion judgment yields a negative emotion, so the feature information is <scene: mobile card; attribute: service; viewpoint: no; emotion: negative>.
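The quadruple described above can be sketched as a small data structure. This is a minimal illustration in Python; the class and field names are hypothetical, not from the application:

```python
from dataclasses import dataclass

@dataclass
class Quadruple:
    """One piece of feature information: scene, attribute, viewpoint, emotion."""
    scene: str
    attribute: str
    viewpoint: str
    emotion: str

# A relation-matching pair whose emotion judgment came back negative:
feature = Quadruple(scene="mobile card", attribute="service",
                    viewpoint="no", emotion="negative")
print(feature.emotion)  # prints "negative"
```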
Step 1300: and determining the correctness of the characteristic information according to the emotion of the characteristic information and the emotion of the labeling information corresponding to the public opinion feedback data, which is labeled in advance.
The annotation information is obtained by manually annotating the public opinion feedback data, so its correctness is high. The annotation information also comprises a scene, an attribute, a viewpoint, and an emotion, and can likewise be expressed as a quadruple. It is judged whether the emotion of the feature information is consistent with the emotion of the annotation information; if so, the feature information is considered correct. For example, if the emotion of the feature information is "negative" and the emotion of the annotation information is also "negative", the two match, and the feature information is correct.
As shown in fig. 2, after the annotation information and the feature information are acquired, emotion judgment is performed on them, and it is judged whether the emotion of the annotation information is consistent with the emotion of the feature information. If the two emotions are inconsistent, the annotation information does not match the feature information, i.e., the feature information is wrong.
Step 1400: and determining the evaluation result of the target model according to the number of the correct characteristic information.
After the correctness of each characteristic information is obtained, the number of the correct characteristic information is counted. The more the number of correct feature information, the better the performance of the target model to extract feature information.
In this embodiment, the correctness of the feature information is judged through emotion, which guarantees the correctness of the extracted semantics and filters out extraction results whose semantics are incorrect. The correctness of the feature information can thus be accurately judged, improving the accuracy of the target model's evaluation result.
In this embodiment, step 1300 includes: determining that the characteristic information is wrong under the condition that the emotion of the characteristic information is inconsistent with the emotion of the labeling information corresponding to the public opinion feedback data; and under the condition that the emotion of the characteristic information is consistent with the emotion of the labeling information corresponding to the public opinion feedback data, determining that the characteristic information is correct.
Judging whether the emotion of the characteristic information is consistent with the emotion of the labeling information corresponding to the public opinion feedback data. The emotion of the marked information is marked manually, and the accuracy is high. If the emotion of the characteristic information is inconsistent with the emotion of the labeling information, the characteristic information is indicated to be wrong.
For example, if the emotion of the feature information is "negative", the emotion of the label information is "positive", and the emotion of the feature information is inconsistent with the emotion of the label information, the feature information is erroneous.
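The consistency check above can be sketched as follows, assuming emotions are represented as plain strings (the function name is illustrative, not from the application):

```python
def is_emotion_consistent(feature_emotion: str, annotation_emotion: str) -> bool:
    """Feature information is correct only when the two emotions match."""
    return feature_emotion == annotation_emotion

# "negative" vs. "positive": inconsistent, so the feature information is wrong
print(is_emotion_consistent("negative", "positive"))  # prints False
```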
In this embodiment, feature information corresponding to the public opinion feedback data is obtained through the target model, the feature information comprising a scene, an attribute, a viewpoint, and an emotion. The correctness of the feature information is judged through the emotion of the feature information and the emotion of the annotation information, so whether the feature information is correct can be accurately judged. The evaluation result of the target model is then determined according to the number of pieces of correct feature information: the better the target model performs, the more pieces of correct feature information it produces, so this number accurately reflects the target model's performance and improves the accuracy of its evaluation result.
In this embodiment, step 1300 includes steps 1310-1330.
Step 1310: and under the condition that the emotion of the characteristic information is consistent with the emotion of the labeling information, acquiring a first character string corresponding to the characteristic information, wherein the first character string comprises a scene of the characteristic information, an attribute of the characteristic information and a viewpoint of the characteristic information.
The annotation information is manually annotated information corresponding to the public opinion feedback data, and it also comprises a scene, an attribute, a viewpoint, and an emotion; it may be represented as a quadruple. For example, for the public opinion feedback data "The mobile card works on another phone, but on this phone there is no signal and no service; what is going on?", the annotation information may be <scene: mobile card; attribute: service; viewpoint: no; emotion: negative>.
After the feature information is obtained, it is judged whether the emotion of the feature information is consistent with the emotion of the annotation information. If so, the feature information is likely to be correct, and it is judged further.
The scene, the attribute, and the viewpoint of the feature information are spliced to obtain the first character string. For example, for the feature information <scene: mobile card; attribute: service; viewpoint: no; emotion: negative>, the first character string obtained after splicing is "mobile card no service".
Step 1320: and acquiring the longest public subsequence according to the first character string and the second character string, wherein the second character string comprises the scene of the annotation information, the attribute of the annotation information and the viewpoint of the annotation information, and the longest public subsequence comprises the same characters in the first character string and the second character string.
The scene, the attribute, and the viewpoint of the annotation information are spliced to obtain the second character string. The longest common subsequence contains all the characters that the first character string and the second character string have in common. For example, if the first character string is "mobile card no service" and the second character string is also "mobile card no service", the two are identical, and the longest common subsequence is "mobile card no service". As another example, if the first character string is "battery lasts 5 hours" and the second character string is "battery lasts 5 hours in power-saving mode", the two differ, and the longest common subsequence is "battery lasts 5 hours".
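The longest common subsequence can be computed with the classic dynamic-programming algorithm. A minimal character-level sketch in Python (the function name is illustrative):

```python
def longest_common_subsequence(a: str, b: str) -> str:
    """Return the longest (not necessarily contiguous) common subsequence of a and b."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS of the prefixes a[:i] and b[:j]
    dp = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + a[i]
            else:
                dp[i + 1][j + 1] = max(dp[i + 1][j], dp[i][j + 1], key=len)
    return dp[m][n]

# Identical strings: the LCS is the whole string
print(longest_common_subsequence("mobile card no service",
                                 "mobile card no service"))
```

Storing strings in the table keeps the sketch short; a production version would store lengths and backtrack, but the quadratic-table idea is the same.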
Step 1330: and determining the correctness of the characteristic information according to the longest public subsequence.
The longest common subsequence may reflect a degree of similarity between the first string and the second string, where a higher degree of similarity indicates a higher accuracy of the feature information corresponding to the first string. And judging whether the characteristic information is correct according to the similarity degree of the first character string and the second character string.
According to the embodiment, the feature information is further judged under the condition that the emotion of the feature information is consistent with the emotion of the labeling information, so that the accuracy of the accuracy judgment result of the feature information is improved, and the accuracy of the target model evaluation result is improved.
In this embodiment, step 1310 includes steps 1311-1312.
Step 1311: and acquiring the scene of the characteristic information, the attribute of the characteristic information and the order of the views of the characteristic information in the public opinion feedback data.
The public opinion feedback data is a character string, and the scene, the attribute, and the viewpoint of the feature information are all substrings of it. The position of each of the three in the public opinion feedback data can be obtained, and the order of the scene, the attribute, and the viewpoint in the data is determined from these three positions.
For example, the public opinion feedback data is "The mobile card works on another phone, but on this phone there is no signal and no service; what is going on?", and the corresponding feature information is <scene: mobile card; attribute: service; viewpoint: no; emotion: negative>. The position of the scene "mobile card" in the public opinion feedback data is 0, the position of the attribute "service" is 22, and the position of the viewpoint "no" is 2. The scene comes before the viewpoint and the attribute, and the viewpoint lies between the scene and the attribute, so the order is scene, viewpoint, attribute.
Step 1312: and splicing the scene of the characteristic information, the attribute of the characteristic information and the viewpoint of the characteristic information according to the sequence to obtain the first character string.
The scene, the attribute, and the viewpoint of the feature information are spliced according to the order. For example, for the feature information <scene: mobile card; attribute: service; viewpoint: no; emotion: negative> with the order scene, viewpoint, attribute, the first character string obtained after splicing is "mobile card no service".
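The ordered splice can be sketched as below; `str.find` gives each part's first position in the feedback text (this assumes each part actually occurs in the text, as the application states, and the function name is illustrative):

```python
def splice_in_order(feedback: str, scene: str, attribute: str, viewpoint: str) -> str:
    """Concatenate scene, attribute, and viewpoint by their order of appearance
    in the feedback text."""
    # sort the three parts by their first position in the feedback string
    parts = sorted([scene, attribute, viewpoint], key=feedback.find)
    # joined with spaces for readability in English; Chinese text would
    # concatenate the parts directly
    return " ".join(parts)

first_string = splice_in_order(
    "mobile card has no signal and no service",
    scene="mobile card", attribute="service", viewpoint="no")
print(first_string)  # prints "mobile card no service"
```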
In this embodiment, splicing according to the order in which the scene, the attribute, and the viewpoint of the feature information appear in the public opinion feedback data allows the first character string to accurately reflect the real semantics of the data.
In this embodiment, step 1330 includes steps 1331-1332.
Step 1331: a ratio between the length of the longest common subsequence and the length of the second string is calculated.
After the longest common subsequence is obtained, its length and the length of the second character string are calculated. The length of a character string is the number of characters it contains. For example, if the first character string and the second character string are both "mobile card no service", the longest common subsequence is "mobile card no service", and the lengths of the longest common subsequence and the second character string are both 6, so the ratio is 1. As another example, if the first character string is "battery lasts 5 hours" and the second character string is "battery lasts 5 hours in power-saving mode", the longest common subsequence is "battery lasts 5 hours"; its length is 9 and the length of the second character string is 13, so the ratio is about 0.69.
Step 1332: determining that the characteristic information is correct if the ratio is greater than or equal to a threshold; in the case where the ratio is smaller than the threshold value, it is determined that the characteristic information is erroneous.
After the ratio of the length of the longest common subsequence to the length of the second character string is obtained, the ratio is compared with a threshold. The ratio reflects the similarity between the feature information and the annotation information: the higher the ratio, the more characters the feature information shares with the annotation information, and the closer the two are. If the ratio is greater than or equal to the threshold, the feature information is considered correct; if it is smaller than the threshold, the feature information is considered wrong. For example, with a ratio of 0.69 and a threshold of 0.6, the ratio exceeds the threshold, so the feature information is considered correct.
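The ratio test follows directly from the two lengths; a sketch in Python, with the 0.6 threshold taken from the application's example (the function name is illustrative):

```python
def is_feature_correct(lcs: str, second_string: str, threshold: float = 0.6) -> bool:
    """Correct iff len(LCS) / len(second string) >= threshold."""
    return len(lcs) / len(second_string) >= threshold

# a length-9 LCS against a length-13 annotation string: 9/13 ≈ 0.69 >= 0.6
print(is_feature_correct("battery5h", "battery5hmode"))  # prints True
```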
In this embodiment, the ratio between the length of the longest common subsequence and the length of the second character string is calculated. The ratio reflects the similarity between the feature information and the annotation information, and judging the correctness of the feature information according to the ratio improves the accuracy of the correctness judgment result.
In this embodiment, step 1200 includes steps 1210-1240.
Step 1210: and acquiring at least one scene of the public opinion feedback data, at least one attribute of the public opinion feedback data and at least one view of the public opinion feedback data through the target model.
Keywords, phrases, and the like are extracted from the public opinion feedback data through the target model. A piece of public opinion feedback data may contain multiple scenes, multiple attributes, and multiple viewpoints. For example, for the public opinion feedback data "The mobile card works on another phone, but on this phone there is no signal and no service; what is going on?", the scene is "mobile card", the attributes are "service" and "signal", and the viewpoint is "no".
Step 1220: and combining the scene of the public opinion feedback data, the attribute of the public opinion feedback data and the viewpoint of the public opinion feedback data to obtain a plurality of character strings to be judged, wherein each character string to be judged comprises one scene of the public opinion feedback data, one attribute of the public opinion feedback data and one viewpoint of the public opinion feedback data.
Based on the viewpoints of the public opinion feedback data, each viewpoint is matched in turn with each scene and each attribute to obtain the character strings to be judged. For example, the scene of the public opinion feedback data is "mobile card", the attributes are "service" and "signal", and the viewpoint is "no". Matching the viewpoint "no" with the scene "mobile card" and the attribute "service" yields the character string to be judged "mobile card no service".
Step 1230: and carrying out emotion judgment on the character string to be judged to obtain emotion corresponding to the character string to be judged.
And judging the emotion of each character string to be judged through the target model to obtain the emotion of each character string to be judged. For example, the scene, the attribute and the viewpoint of the public opinion feedback data are combined to obtain four character strings to be judged, and emotion judgment is carried out on each character string to be judged to obtain four emotion judgment results corresponding to the four character strings to be judged.
Step 1240: and obtaining a plurality of characteristic information according to the character strings to be judged and the corresponding emotions.
Each character string to be judged is combined with its corresponding emotion to obtain a piece of feature information. For example, if the character string to be judged is "mobile card no service" and the corresponding emotion is "negative", the feature information <scene: mobile card; attribute: service; viewpoint: no; emotion: negative> is obtained.
In this embodiment, step 1400 includes steps 1410-1440.
Step 1410: and acquiring the total quantity of all feature information extracted by the target model and corresponding to the public opinion feedback data and the total quantity of marking information marked in advance and corresponding to the public opinion feedback data.
The objective model may extract a plurality of characteristic information corresponding to the public opinion feedback data. When the public opinion feedback data is manually marked, a plurality of marking information corresponding to the public opinion feedback data can be marked. For example, for the same piece of public opinion feedback data, 4 pieces of characteristic information are extracted through a target model, and 5 pieces of labeling information are manually labeled.
Step 1420: and calculating the accuracy of the target model according to the number of the correct characteristic information and the total number of the characteristic information which is extracted by the target model and corresponds to the public opinion feedback data.
In the feature information extracted by the object model, there may be some feature information that is erroneous. The accuracy of the target model is the ratio of the number of correct feature information extracted by the target model to the total number of extracted feature information. For example, the target model extracts 4 pieces of characteristic information, wherein the number of correct characteristic information is 3, and the accuracy of the target model is 75%.
Step 1430: and calculating the recall rate of the target model according to the quantity of all the correct characteristic information and the total quantity of the pre-labeled labeling information corresponding to the public opinion feedback data.
All labeling information corresponding to the public opinion feedback data can be labeled in a manual labeling manner, and the manually labeled information is generally correct. The characteristic information extracted by the target model may cover only part of the manually labeled information. The recall rate is the ratio of the number of correct characteristic information to the total number of labeling information. For example, if the target model extracts 4 pieces of characteristic information, of which 3 are correct, and 5 pieces of labeling information are manually labeled, the recall rate is 60%.
Step 1440: determining an evaluation result of the target model according to the accuracy rate of the target model and the recall rate of the target model.
The F1 value of the target model is calculated according to the recall rate and the accuracy rate of the target model. The F1 value is positively correlated with both the accuracy rate and the recall rate; the higher the F1 value, the better the extraction effect of the target model. The F1 value can be calculated by the following formula: F1 = 2 × recall × precision / (recall + precision).
Wherein, recall represents the recall rate of the target model, and precision represents the accuracy rate of the target model.
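The calculation of steps 1410 to 1440 can be sketched as follows. This is a minimal illustration using the running example from the description (4 extracted, 3 correct, 5 pre-labeled), not the patent's actual implementation.

```python
# Minimal sketch of steps 1420-1440: accuracy (precision), recall and F1
# from the counts described above. Function and variable names are
# illustrative assumptions.

def evaluate(num_correct, num_extracted, num_labeled):
    precision = num_correct / num_extracted             # step 1420
    recall = num_correct / num_labeled                  # step 1430
    f1 = 2 * recall * precision / (recall + precision)  # step 1440
    return precision, recall, f1

precision, recall, f1 = evaluate(3, 4, 5)
print(precision, recall, round(f1, 4))  # 0.75 0.6 0.6667
```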
The execution subject of the model evaluation method provided by the embodiments of the present application may be a model evaluation device. In the embodiments of the present application, the model evaluation device is described by taking a model evaluation device executing the model evaluation method as an example.
As shown in fig. 3, this embodiment describes a model evaluating apparatus 300, including:
the first obtaining module 301 is configured to obtain public opinion feedback data.
And a second obtaining module 302, configured to obtain, through a target model, a plurality of feature information corresponding to the public opinion feedback data, where the feature information includes a scene, an attribute, a viewpoint, and an emotion.
The first determining module 303 is configured to determine the correctness of the feature information according to the emotion of the feature information and the emotion of the labeling information corresponding to the public opinion feedback data, which is labeled in advance.
And the second determining module 304 is configured to determine an evaluation result of the target model according to the number of correct feature information.
According to this embodiment, the characteristic information corresponding to the public opinion feedback data is obtained through the target model, and the characteristic information includes a scene, an attribute, a viewpoint and an emotion. The correctness of the characteristic information is judged by comparing the emotion of the characteristic information with the emotion of the labeling information, so whether the characteristic information is correct can be judged accurately. The evaluation result of the target model is determined according to the number of correct characteristic information: the better the performance of the target model, the more correct characteristic information is obtained through it, so the number of correct characteristic information accurately reflects the performance of the target model, thereby improving the accuracy of the evaluation result of the target model.
Optionally, the first determining module is specifically configured to:
determining that the characteristic information is wrong under the condition that the emotion of the characteristic information is inconsistent with the emotion of the labeling information;
and determining that the characteristic information is correct under the condition that the emotion of the characteristic information is consistent with the emotion of the labeling information.
Optionally, the first determining module is specifically configured to:
acquiring a first character string corresponding to the characteristic information under the condition that the emotion of the characteristic information is consistent with the emotion of the labeling information, wherein the first character string comprises a scene of the characteristic information, an attribute of the characteristic information and a viewpoint of the characteristic information;
obtaining a longest common subsequence according to the first character string and a second character string, wherein the second character string comprises a scene of the annotation information, an attribute of the annotation information and a viewpoint of the annotation information, and the longest common subsequence comprises the characters common to the first character string and the second character string;
and determining the correctness of the characteristic information according to the longest common subsequence.
Optionally, the first determining module is specifically configured to:
Acquiring the scene of the characteristic information, the attribute of the characteristic information and the sequence of the views of the characteristic information in the public opinion feedback data;
and splicing the scene of the characteristic information, the attribute of the characteristic information and the viewpoint of the characteristic information according to the sequence to obtain the first character string.
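The splicing described above can be sketched as follows. This is an assumed illustration: it orders the three parts by where each first appears in the feedback text and concatenates them; the sample text and parts are invented for the example.

```python
# Hypothetical sketch: splice scene, attribute and viewpoint into the first
# character string in the order they appear in the public opinion feedback
# text. The sample text and parts are illustrative assumptions.

def splice_in_text_order(text, parts):
    """Concatenate parts sorted by their first occurrence in `text`."""
    return "".join(sorted(parts, key=text.find))

text = "the mobile card has no service"
parts = ["service", "mobile card", "no"]
print(splice_in_text_order(text, parts))  # mobile cardnoservice
```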
Optionally, the first determining module is specifically configured to:
calculating a ratio between the length of the longest common subsequence and the length of the second string;
determining that the characteristic information is correct if the ratio is greater than or equal to a threshold;
in the case where the ratio is smaller than the threshold value, it is determined that the characteristic information is erroneous.
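The longest-common-subsequence check described above can be sketched as follows. The LCS routine is the classic dynamic-programming algorithm; the threshold value 0.8 is an illustrative assumption, since the document does not fix a specific threshold.

```python
# Sketch of the correctness check: compute the longest common subsequence
# (LCS) between the extracted string and the labeled string, then compare
# len(LCS) / len(labeled string) against a threshold. The 0.8 threshold is
# an assumed value for illustration.

def lcs_length(a, b):
    """Classic dynamic-programming longest common subsequence length."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            if ca == cb:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def is_correct(extracted, labeled, threshold=0.8):
    """Characteristic information counts as correct if the ratio clears the threshold."""
    return lcs_length(extracted, labeled) / len(labeled) >= threshold

print(is_correct("mobile cardnoservice", "mobile card no service"))  # True
```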
Optionally, the second obtaining module is specifically configured to:
acquiring at least one scene of the public opinion feedback data, at least one attribute of the public opinion feedback data and at least one view of the public opinion feedback data through the target model;
combining the scene of the public opinion feedback data, the attribute of the public opinion feedback data and the viewpoint of the public opinion feedback data to obtain a plurality of character strings to be judged, wherein each character string to be judged comprises one scene of the public opinion feedback data, one attribute of the public opinion feedback data and one viewpoint of the public opinion feedback data;
Carrying out emotion judgment on the character string to be judged to obtain emotion corresponding to the character string to be judged;
and obtaining a plurality of characteristic information according to the character strings to be judged and the corresponding emotions.
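The combination step described above can be sketched as a Cartesian product of the extracted scenes, attributes and viewpoints. This is an assumed illustration; the concatenation without separators mirrors the splicing described in the document, and the sample values are invented.

```python
# Sketch: enumerate candidate strings to be judged as every combination of
# one scene, one attribute and one viewpoint. Sample values are
# illustrative assumptions.
from itertools import product

def candidate_strings(scenes, attributes, viewpoints):
    """One candidate per (scene, attribute, viewpoint) combination."""
    return ["".join(combo) for combo in product(scenes, attributes, viewpoints)]

cands = candidate_strings(["mobile card"], ["service", "signal"], ["none"])
print(cands)  # ['mobile cardservicenone', 'mobile cardsignalnone']
```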
Optionally, the second determining module is specifically configured to:
acquiring the total quantity of all feature information which is extracted by the target model and corresponds to the public opinion feedback data and the total quantity of marking information which is marked in advance and corresponds to the public opinion feedback data;
calculating the accuracy of the target model according to the number of the correct characteristic information and the total number of the characteristic information which is extracted by the target model and corresponds to the public opinion feedback data;
calculating the recall rate of the target model according to the number of the correct characteristic information and the total number of the pre-labeled labeling information corresponding to the public opinion feedback data;
and determining an evaluation result of the target model according to the accuracy rate of the target model and the recall rate of the target model.
The model evaluation device in the embodiments of the present application may be an electronic device, or may be a component in an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. By way of example, the electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted electronic device, a mobile internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a personal digital assistant (PDA), and may also be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine or a self-service machine, etc.; the embodiments of the present application are not specifically limited thereto.
The model evaluation device in the embodiments of the present application may be a device with an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The model evaluation device provided in the embodiment of the present application can implement each process implemented by the embodiments of the methods of fig. 1 to fig. 2, and in order to avoid repetition, a detailed description is omitted here.
Optionally, as shown in fig. 4, the embodiment of the present application further provides an electronic device 400, including a processor 401 and a memory 402, where the memory 402 stores a program or an instruction that can be executed on the processor 401, and the program or the instruction implements each step of the embodiment of the model evaluation method when executed by the processor 401, and the steps can achieve the same technical effect, so that repetition is avoided, and no further description is given here.
The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device described above.
Fig. 5 is a schematic hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 1000 includes, but is not limited to: radio frequency unit 1001, network module 1002, audio output unit 1003, input unit 1004, sensor 1005, display unit 1006, user input unit 1007, interface unit 1008, memory 1009, and processor 1010.
Those skilled in the art will appreciate that the electronic device 1000 may also include a power source (e.g., a battery) for powering the various components, which may be logically connected to the processor 1010 by a power management system to perform functions such as managing charge, discharge, and power consumption by the power management system. The electronic device structure shown in fig. 5 does not constitute a limitation of the electronic device, and the electronic device may include more or less components than shown, or may combine certain components, or may be arranged in different components, which are not described in detail herein.
The processor 1010 is configured to obtain public opinion feedback data.
The processor 1010 is further configured to obtain, through a target model, a plurality of feature information corresponding to the public opinion feedback data, where the feature information includes a scene, an attribute, a viewpoint, and an emotion.
The processor 1010 is further configured to determine the correctness of the feature information according to the emotion of the feature information and the emotion of the pre-labeled labeling information corresponding to the public opinion feedback data.
The processor 1010 is further configured to determine an evaluation result of the target model according to the number of all correct feature information.
According to this embodiment, the characteristic information corresponding to the public opinion feedback data is obtained through the target model, and the characteristic information includes a scene, an attribute, a viewpoint and an emotion. The correctness of the characteristic information is judged by comparing the emotion of the characteristic information with the emotion of the labeling information, so whether the characteristic information is correct can be judged accurately. The evaluation result of the target model is determined according to the number of correct characteristic information: the better the performance of the target model, the more correct characteristic information is obtained through it, so the number of correct characteristic information accurately reflects the performance of the target model, thereby improving the accuracy of the evaluation result of the target model.
Optionally, the processor 1010 is further specifically configured to:
determining that the characteristic information is wrong under the condition that the emotion of the characteristic information is inconsistent with the emotion of the labeling information;
and determining that the characteristic information is correct under the condition that the emotion of the characteristic information is consistent with the emotion of the labeling information.
Optionally, the processor 1010 is further specifically configured to:
acquiring a first character string corresponding to the characteristic information under the condition that the emotion of the characteristic information is consistent with the emotion of the labeling information, wherein the first character string comprises a scene of the characteristic information, an attribute of the characteristic information and a viewpoint of the characteristic information;
obtaining the longest common subsequence according to the first character string and the second character string, wherein the second character string comprises the scene of the annotation information, the attribute of the annotation information and the viewpoint of the annotation information;
and determining the correctness of the characteristic information according to the longest common subsequence.
Optionally, the processor 1010 is further specifically configured to:
acquiring the scene of the characteristic information, the attribute of the characteristic information and the sequence of the views of the characteristic information in the public opinion feedback data;
And splicing the scene of the characteristic information, the attribute of the characteristic information and the viewpoint of the characteristic information according to the sequence to obtain the first character string.
Optionally, the processor 1010 is further specifically configured to:
calculating a ratio between the length of the longest common subsequence and the length of the second string;
determining that the characteristic information is correct if the ratio is greater than or equal to a threshold;
in the case where the ratio is smaller than the threshold value, it is determined that the characteristic information is erroneous.
Optionally, the processor 1010 is further specifically configured to:
acquiring at least one scene of the public opinion feedback data, at least one attribute of the public opinion feedback data and at least one view of the public opinion feedback data through the target model;
combining the scene of the public opinion feedback data, the attribute of the public opinion feedback data and the viewpoint of the public opinion feedback data to obtain a plurality of character strings to be judged, wherein each character string to be judged comprises one scene of the public opinion feedback data, one attribute of the public opinion feedback data and one viewpoint of the public opinion feedback data;
Carrying out emotion judgment on the character string to be judged to obtain emotion corresponding to the character string to be judged;
and obtaining a plurality of characteristic information according to the character strings to be judged and the corresponding emotions.
Optionally, the processor 1010 is further specifically configured to:
acquiring the total quantity of all feature information which is extracted by the target model and corresponds to the public opinion feedback data and the total quantity of the pre-labeled labeling information corresponding to the public opinion feedback data;
calculating the accuracy of the target model according to the number of all the correct characteristic information and the total number of the characteristic information extracted by the target model;
calculating the recall rate of the target model according to the number of the correct characteristic information and the total number of the pre-labeled labeling information corresponding to the public opinion feedback data;
and determining an evaluation result of the target model according to the accuracy rate of the target model and the recall rate of the target model.
It should be understood that in the embodiment of the present application, the input unit 1004 may include a graphics processor (Graphics Processing Unit, GPU) 10041 and a microphone 10042, and the graphics processor 10041 processes image data of still pictures or videos obtained by an image capturing device (such as a camera) in a video capturing mode or an image capturing mode. The display unit 1006 may include a display panel 10061, and the display panel 10061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 1007 includes at least one of a touch panel 10071 and other input devices 10072. The touch panel 10071 is also referred to as a touch screen. The touch panel 10071 can include two portions, a touch detection device and a touch controller. Other input devices 10072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein.
The memory 1009 may be used to store software programs as well as various data. The memory 1009 may mainly include a first storage area storing programs or instructions and a second storage area storing data, where the first storage area may store an operating system and application programs or instructions required for at least one function (such as a sound playing function and an image playing function). Further, the memory 1009 may include volatile memory or nonvolatile memory, or the memory 1009 may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synchlink DRAM (SLDRAM), or a direct Rambus RAM (DRRAM). The memory 1009 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
The processor 1010 may include one or more processing units; optionally, the processor 1010 integrates an application processor that primarily processes operations involving an operating system, user interface, application programs, and the like, and a modem processor that primarily processes wireless communication signals, such as a baseband processor. It will be appreciated that the modem processor described above may not be integrated into the processor 1010.
The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the processes of the embodiment of the model evaluation method are implemented, and the same technical effects can be achieved, so that repetition is avoided, and no redundant description is provided herein.
Wherein, the processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
The embodiment of the application further provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled with the processor, and the processor is used for running a program or an instruction, so that each process of the embodiment of the model evaluation method can be implemented, the same technical effect can be achieved, and in order to avoid repetition, the description is omitted here.
It should be understood that the chips referred to in the embodiments of the present application may also be referred to as system-on-chip chips, chip systems, or system-on-chip chips, etc.
The embodiments of the present application provide a computer program product, which is stored in a storage medium, and the program product is executed by at least one processor to implement the respective processes of the embodiments of the model evaluation method described above, and achieve the same technical effects, so that repetition is avoided, and a detailed description is omitted here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a computer software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those of ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are also within the protection of the present application.

Claims (10)

1. A model evaluation method, comprising:
obtaining public opinion feedback data;
acquiring a plurality of characteristic information corresponding to the public opinion feedback data through a target model, wherein the characteristic information comprises scenes, attributes, views and emotions;
determining the correctness of the characteristic information according to the emotion of the characteristic information and the emotion of the labeling information corresponding to the public opinion feedback data, which is labeled in advance;
and determining the evaluation result of the target model according to the number of the correct characteristic information.
2. The method of claim 1, wherein the determining the correctness of the characteristic information based on the emotion of the characteristic information and the emotion of the pre-labeled labeling information corresponding to the public opinion feedback data comprises:
acquiring a first character string corresponding to the characteristic information under the condition that the emotion of the characteristic information is consistent with the emotion of the labeling information, wherein the first character string comprises a scene of the characteristic information, an attribute of the characteristic information and a viewpoint of the characteristic information;
obtaining a longest common subsequence according to the first character string and a second character string, wherein the second character string comprises a scene of the annotation information, an attribute of the annotation information and a viewpoint of the annotation information, and the longest common subsequence comprises the characters common to the first character string and the second character string;
and determining the correctness of the characteristic information according to the longest common subsequence.
3. The method according to claim 2, wherein the obtaining the first string corresponding to the feature information includes:
acquiring the scene of the characteristic information, the attribute of the characteristic information and the sequence of the views of the characteristic information in the public opinion feedback data;
and splicing the scene of the characteristic information, the attribute of the characteristic information and the viewpoint of the characteristic information according to the sequence to obtain the first character string.
4. The method of claim 2, wherein said determining the correctness of said characteristic information based on said longest common subsequence comprises:
calculating a ratio between the length of the longest common subsequence and the length of the second string;
determining that the characteristic information is correct if the ratio is greater than or equal to a threshold;
in the case where the ratio is smaller than the threshold value, it is determined that the characteristic information is erroneous.
5. The method of claim 1, wherein the obtaining, by a target model, a plurality of feature information corresponding to the public opinion feedback data comprises:
Acquiring at least one scene of the public opinion feedback data, at least one attribute of the public opinion feedback data and at least one view of the public opinion feedback data through the target model;
combining the scene of the public opinion feedback data, the attribute of the public opinion feedback data and the viewpoint of the public opinion feedback data to obtain a plurality of character strings to be judged, wherein each character string to be judged comprises one scene of the public opinion feedback data, one attribute of the public opinion feedback data and one viewpoint of the public opinion feedback data;
carrying out emotion judgment on the character string to be judged to obtain emotion corresponding to the character string to be judged;
and obtaining a plurality of characteristic information according to the character strings to be judged and the corresponding emotions.
6. The method according to claim 1, wherein the determining the evaluation result of the target model according to the number of correct feature information comprises:
acquiring the total quantity of all feature information which is extracted by the target model and corresponds to the public opinion feedback data and the total quantity of marking information which is marked in advance and corresponds to the public opinion feedback data;
Calculating the accuracy of the target model according to the number of the correct characteristic information and the total number of the characteristic information which is extracted by the target model and corresponds to the public opinion feedback data;
calculating the recall rate of the target model according to the quantity of all the correct characteristic information and the total quantity of the pre-labeled labeling information corresponding to the public opinion feedback data;
and determining an evaluation result of the target model according to the accuracy rate of the target model and the recall rate of the target model.
7. A model evaluation device, characterized by comprising:
the first acquisition module is used for acquiring public opinion feedback data;
the second acquisition module is used for acquiring a plurality of characteristic information corresponding to the public opinion feedback data through a target model, wherein the characteristic information comprises scenes, attributes, views and emotions;
the first determining module is used for determining the correctness of the characteristic information according to the emotion of the characteristic information and the emotion of the labeling information which is labeled in advance and corresponds to the public opinion feedback data;
and the second determining module is used for determining the evaluation result of the target model according to the quantity of all the correct characteristic information.
8. The apparatus of claim 7, wherein the first determining module is specifically configured to:
acquiring a first character string corresponding to the characteristic information under the condition that the emotion of the characteristic information is consistent with the emotion of the labeling information, wherein the first character string comprises a scene of the characteristic information, an attribute of the characteristic information and a viewpoint of the characteristic information;
obtaining a longest common subsequence according to the first character string and a second character string, wherein the second character string comprises a scene of the annotation information, an attribute of the annotation information and a viewpoint of the annotation information, and the longest common subsequence comprises the characters common to the first character string and the second character string;
and determining the correctness of the characteristic information according to the longest common subsequence.
9. An electronic device comprising a processor and a memory storing a program or instructions executable on the processor, which when executed by the processor, implement the steps of the model evaluation method according to any one of claims 1-6.
10. A readable storage medium, characterized in that the readable storage medium has stored thereon a program or instructions which, when executed by a processor, implement the steps of the model evaluation method according to any one of claims 1-6.
CN202310074952.7A 2023-01-29 2023-01-29 Model evaluation method and device, electronic equipment and storage medium Pending CN116306569A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310074952.7A CN116306569A (en) 2023-01-29 2023-01-29 Model evaluation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310074952.7A CN116306569A (en) 2023-01-29 2023-01-29 Model evaluation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116306569A true CN116306569A (en) 2023-06-23

Family

ID=86826505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310074952.7A Pending CN116306569A (en) 2023-01-29 2023-01-29 Model evaluation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116306569A (en)

Similar Documents

Publication Publication Date Title
WO2019184217A1 (en) Hotspot event classification method and apparatus, and storage medium
CN111625635A (en) Question-answer processing method, language model training method, device, equipment and storage medium
CN110263248B (en) Information pushing method, device, storage medium and server
CN112559800B (en) Method, apparatus, electronic device, medium and product for processing video
CN109558513B (en) Content recommendation method, device, terminal and storage medium
CN112287914B (en) PPT video segment extraction method, device, equipment and medium
CN110347866B (en) Information processing method, information processing device, storage medium and electronic equipment
CN110413787A (en) Text Clustering Method, device, terminal and storage medium
WO2021254251A1 (en) Input display method and apparatus, and electronic device
CN112631437A (en) Information recommendation method and device and electronic equipment
CN112861750B (en) Video extraction method, device, equipment and medium based on inflection point detection
CN113407775B (en) Video searching method and device and electronic equipment
CN113869063A (en) Data recommendation method and device, electronic equipment and storage medium
CN112231507A (en) Identification method and device and electronic equipment
CN116149528A (en) Document display method and device and electronic equipment
CN116127062A (en) Training method of pre-training language model, text emotion classification method and device
CN116306569A (en) Model evaluation method and device, electronic equipment and storage medium
CN110276001B (en) Checking page identification method and device, computing equipment and medium
CN111476028A (en) Chinese phrase identification method, system, storage medium and electronic equipment
CN114998896B (en) Text recognition method and device
CN112765447B (en) Data searching method and device and electronic equipment
CN111782060B (en) Object display method and device and electronic equipment
CN116187341A (en) Semantic recognition method and device
CN118158188A (en) Information pushing method and device
CN116049494A (en) Comment information identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination