CN111768767B - User tag extraction method and device, server and computer readable storage medium - Google Patents

User tag extraction method and device, server and computer readable storage medium

Info

Publication number
CN111768767B
CN111768767B (application CN202010440700.8A)
Authority
CN
China
Prior art keywords
logic
objects
logical
formula
conditions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010440700.8A
Other languages
Chinese (zh)
Other versions
CN111768767A (en)
Inventor
欧阳湘粤 (Ouyang Xiangyue)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhuiyi Technology Co Ltd
Original Assignee
Shenzhen Zhuiyi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Zhuiyi Technology Co Ltd filed Critical Shenzhen Zhuiyi Technology Co Ltd
Priority to CN202010440700.8A priority Critical patent/CN111768767B/en
Publication of CN111768767A publication Critical patent/CN111768767A/en
Application granted granted Critical
Publication of CN111768767B publication Critical patent/CN111768767B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/18: Speech classification or search using natural language modelling
    • G10L15/1822: Parsing for meaning understanding
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60: Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/61: Indexing; Data structures therefor; Storage structures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60: Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686: Retrieval characterised by using metadata using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a user tag extraction method and device, a server and a computer readable storage medium. In the process of extracting a user tag from the audio data of a conversation between the robot and the user, the audio data is acquired and input into a preset rule module. The preset rule module comprises a logical formula; the data structure adopted by the logical formula is a prefix expression, and the logical formula comprises conditions and the logical relations among the conditions. The audio data is analyzed through the logical formula to obtain the user's tag. Because the logical formula is obtained in the data structure of a prefix expression, it comprises only simple operators and operands. Compared with the traditional JSON data structure, the prefix expression greatly reduces the space consumed when storing the logic of a usage scenario, and thereby improves operating efficiency.

Description

User tag extraction method and device, server and computer readable storage medium
Technical Field
The present application relates to the field of artificial intelligence technology, and in particular, to a method and apparatus for extracting a user tag, a server, and a computer readable storage medium.
Background
With the continuous development of artificial intelligence and natural language processing technology, dialogue robots have gradually come into wide use in business scenarios such as financial services, home life, and personal assistants, improving the quality and efficiency of these services.
However, the usage scenarios of a dialogue robot are complex, and so is natural language. When the logic of a usage scenario is stored in the traditional JSON (JavaScript Object Notation) data structure, it occupies too much space in the database, consuming a large amount of storage and reducing operating efficiency.
Disclosure of Invention
The embodiments of the application provide a user tag extraction method and device, a server, and a computer readable storage medium, which can reduce the space consumed when storing the logic of a usage scenario, thereby improving operating efficiency.
A user tag extraction method, comprising:
acquiring audio data of the conversation robot and a user;
inputting the audio data into a preset rule module, wherein the preset rule module comprises a logic formula, a data structure adopted by the logic formula is a prefix expression, and the logic formula comprises conditions and logic relations among the conditions;
and analyzing the audio data through the logic formula to obtain the label of the user.
In one embodiment, the analyzing the audio data by the logic formula to obtain the tag of the user includes:
inquiring the logic formula from a database through the preset rule module;
parsing the character strings in the logical formula to obtain objects and the logical relations between the objects;
rendering the objects and the logical relationship between the objects into a logical tree;
and analyzing the audio data through the logic tree to obtain the label of the user.
In one embodiment, the parsing the character string corresponding to the logical expression to obtain the object and the logical relationship between the objects includes:
and analyzing the character strings in the logic formula through a first function to obtain objects and logic relations among the objects.
In one embodiment, the parsing the character string in the logical expression by the first function to obtain the object and the logical relationship between the objects includes:
sequentially inputting the character strings in the logic formulas into a state machine for reading;
respectively executing different commands in sequence according to the read values to obtain return values;
based on all the returned values, the objects and the logical relationship between the objects are obtained.
In one embodiment, the method further comprises: abstracting the objects in a preset logic tree to obtain the logical formula, and storing the logical formula in a database.
In one embodiment, the method further comprises: abstracting the objects in the preset logic tree into the logical formula through a second function, and storing the logical formula in a database.
In one embodiment, the abstracting, by the second function, the object in the preset logic tree into the logic formula, and storing the logic formula in the database includes:
circulating through all objects in a preset logic tree in a recursion mode, and acquiring values corresponding to logic relations among the objects and values corresponding to the objects;
and converting the values corresponding to the logical relations among the objects and the values corresponding to the objects into character strings to obtain logical formulas, and storing the logical formulas into a database.
In one embodiment, the logical relationship is represented in the form of a prefix.
In one embodiment, the logical formula comprises condition sets and the logical relations between the condition sets; each condition set comprises sub-condition sets and the logical relations between the sub-condition sets; each sub-condition set comprises at least one condition and the logical relations between the conditions.
A user tag extraction apparatus comprising:
the acquisition module is used for acquiring the audio data of the conversation robot and the user;
the input module is used for inputting the audio data into a preset rule module, wherein the preset rule module comprises a logical formula, the data structure adopted by the logical formula is a prefix expression, and the logical formula comprises conditions and the logical relations among the conditions;
and the analysis module is used for analyzing the audio data through the logic formula to obtain the label of the user.
In one embodiment, the analysis module comprises:
the logic formula query unit is used for querying the logic formula from the database through the preset rule module;
the character string analysis unit is used for analyzing the character string corresponding to the logic formula to obtain an object and a logic relationship between the objects;
a logic tree generating unit, configured to render the object and a logic relationship between the objects into a logic tree;
and the analysis unit is used for analyzing the audio data through the logic tree to obtain the label of the user.
In one embodiment, the string parsing unit is further configured to parse the string in the logical expression by using a first function to obtain an object and a logical relationship between the objects.
In one embodiment, the character string parsing unit is further configured to sequentially input the character strings in the logic formula to a state machine for reading; respectively executing different commands in sequence according to the read values to obtain return values; based on all the returned values, the objects and the logical relationship between the objects are obtained.
In one embodiment, the apparatus further comprises: and the logic formula storage module is used for abstracting the objects in the preset logic tree to obtain the logic formula and storing the logic formula into a database.
In one embodiment, the logic formula storage module is further configured to abstract an object in a preset logic tree into the logic formula through a second function, and store the logic formula into a database.
In one embodiment, the logic formula storage module is further configured to recursively cycle through all objects in a preset logic tree, and obtain a value corresponding to a logic relationship between the objects and a value corresponding to the objects; and converting the values corresponding to the logical relations among the objects and the values corresponding to the objects into character strings to obtain logical formulas, and storing the logical formulas into a database.
In one embodiment, the logical relationship is represented in the form of a prefix.
In one embodiment, the logical formula comprises condition sets and the logical relations between the condition sets; each condition set comprises sub-condition sets and the logical relations between the sub-condition sets; each sub-condition set comprises at least one condition and the logical relations between the conditions.
A server comprising a memory and a processor, the memory having stored therein a computer program which, when executed by the processor, causes the processor to perform the steps of the method as above.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method as above.
In the user tag extraction method and device, the server, and the computer readable storage medium, in the process of extracting the user tag from the audio data of the conversation between the robot and the user, the audio data is acquired and input into the preset rule module. The preset rule module comprises a logical formula; the data structure adopted by the logical formula is a prefix expression, and the logical formula comprises conditions and the logical relations among the conditions. The audio data is analyzed through the logical formula to obtain the user's tag. Because the logical formula is obtained in the data structure of a prefix expression, it comprises only simple operators and operands. Compared with the traditional JSON data structure, the prefix expression greatly reduces the space consumed when storing the logic of a usage scenario, and thereby improves operating efficiency.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the application, and other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is an application environment diagram of a user tag extraction method in one embodiment;
FIG. 2 is a flow chart of a user tag extraction method in one embodiment;
FIG. 3 is a flowchart of the step in FIG. 2 of analyzing the audio data through the logical formula to obtain the user's tag;
FIG. 4 is a schematic diagram of the structure of a logical tree in one embodiment;
FIG. 5 is a schematic diagram of a logical tree rendered according to a prefix expression, according to one embodiment;
FIG. 6 is a flowchart of a method for resolving a string in a logical formula to obtain an object and a logical relationship between objects according to a first function in one embodiment;
FIG. 7 is a flowchart of a method for abstracting an object in a preset logical tree into a logical formula and storing the logical formula in a database according to a second function in one embodiment;
FIG. 8 is a block diagram of a user tag extraction apparatus in one embodiment;
FIG. 9 is a block diagram of the analysis module of FIG. 8;
FIG. 10 is a schematic diagram of an internal structure of a server in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It will be understood that the terms first, second, etc. as used herein may be used to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another element.
Fig. 1 is an application scenario diagram of the user tag extraction method in one embodiment. As shown in fig. 1, the application environment includes a conversation robot 120 and a server 140. The server 140 acquires the audio data of the conversation between the robot and the user from the conversation robot 120 and inputs the audio data into a preset rule module, wherein the preset rule module comprises a logical formula, the data structure adopted by the logical formula is a prefix expression, and the logical formula comprises conditions and the logical relations among the conditions. The audio data is analyzed through the logical formula to obtain the user's tag.
Fig. 2 is a flowchart of a user tag extraction method in one embodiment, and as shown in fig. 2, a user tag extraction method is provided, which is applied to a server and includes steps 220 to 260.
Step 220, audio data of the conversation robot and the user are acquired.
The conversation robot is a voice robot that converses with a user in place of a human to realize certain functions; voice assistants such as Microsoft's Cortana and Xiaozi, Apple's Siri, *** Now, Alin Honey, *** Secret, the Turing robot, and similar question-answering assistants all belong to this category of conversation robot. Of course, the conversation robot may also be a voice robot that replaces a traditional manual call center, for example a voice robot replacing the manual call centers of the banking, carrier, or other industries. The conversation robot communicates with the user and records during the communication, generating audio data; semantic recognition is then performed on the audio data to obtain semantically recognized audio data.
Step 240, inputting the audio data into a preset rule module, wherein the preset rule module comprises a logic formula, the data structure adopted by the logic formula is a prefix expression, and the logic formula comprises conditions and logic relations among the conditions.
After the audio data generated during communication between the conversation robot and the user is acquired and semantically recognized, the semantically recognized audio data is input into a preset rule module on the server. The preset rule module can analyze the input audio data according to preset rules to obtain the user's tag. The preset rules can be expressed in the form of a logical formula, which comprises conditions and the logical relations among the conditions. A condition is a judgment condition used in analyzing the audio data to obtain the user tag. The logical relations between the conditions include OR and AND relations; of course, other types of logical relations may also be included, which the application does not limit.
Here, the data structure adopted by the logical formula in the preset rule module is a prefix expression, that is, an expression in which the operator is placed before its operands. Correspondingly, there are also infix expressions and suffix (postfix) expressions. Because the logical formula adopts a prefix expression, the operator expressing the logical relation between conditions is set before the conditions it connects.
In the conventional rule module, the logical formula adopts a JSON data structure, which contains complex objects and nested structures. The logical formula in JSON form is therefore larger, occupies more memory, is slow to parse, and reduces operating speed. For example, {"logics": "|", "values": [{"logics": "&", "values": [{"logics": "|", "values": ["E", "F", "G"]}, "B", "C"]}, "A", {"logics": "&", "values": ["H", "I"]}, "D"]} is such a logical structure in the JSON data structure.
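To make the size difference concrete, the following Python sketch compares the bytes needed to store the same rule logic as JSON and as a prefix expression. The "logics"/"values" key names mirror the example above and are illustrative only, not a real schema.

```python
import json

# Illustrative comparison only: the "logics"/"values" key names and the
# rule content mirror the example in the text, not an actual schema.
json_form = {
    "logics": "|",
    "values": [
        {"logics": "&", "values": [
            {"logics": "|", "values": ["E", "F", "G"]}, "B", "C"]},
        "A",
        {"logics": "&", "values": ["H", "I"]},
        "D",
    ],
}
prefix_form = "(|A,(&B,C,(|E,F,G)),(|D,(&H,I)))"

json_bytes = len(json.dumps(json_form, separators=(",", ":")))
prefix_bytes = len(prefix_form)
print(json_bytes, prefix_bytes)  # the prefix string is several times shorter
```

Even with whitespace-free serialization, the JSON form is several times larger than the equivalent prefix string, which is the saving the application relies on.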
In step 260, the audio data is analyzed by logic formula to obtain the label of the user.
After the audio data generated during communication between the conversation robot and the user is acquired, it is input into the preset rule module on the server and analyzed through the logical formula in the preset rule module to obtain the user's tag. Because the logical formula comprises conditions and the logical relations among the conditions, whether the audio data satisfies the conditions is judged in turn through the logical formula, finally yielding the user's tag.
In the embodiment of the application, in the process of extracting the user tag from the audio data of the conversation between the robot and the user, the audio data is acquired and input into a preset rule module. The preset rule module comprises a logical formula; the data structure adopted by the logical formula is a prefix expression, and the logical formula comprises conditions and the logical relations among the conditions. The audio data is analyzed through the logical formula to obtain the user's tag. Because the logical formula is obtained in the data structure of a prefix expression, it comprises only simple operators and operands. Compared with the traditional JSON data structure, the prefix expression greatly reduces the space consumed when storing the logic of a usage scenario, and thereby improves operating efficiency.
In one embodiment, as shown in fig. 3, step 260, analyzing the audio data by a logic formula to obtain a tag of the user includes:
in step 262, the logic formula is queried from the database through a preset rule module.
When the service is created, a corresponding logic tree is organized according to the actual business scenario, the objects in the logic tree are abstracted to obtain a logical formula, and the logical formula is stored in a database. The data structure adopted by the logical formula in the application is a prefix expression, for example (|A, (&B, C, (|E, F, G)), (|D, (&H, I))). The structure of the logic tree expressed by this logical formula is shown in fig. 4.
Later, when user tag extraction is needed in the actual business scenario, the audio data of the conversation between the robot and the user is acquired and input into the preset rule module, and the logical formula is queried from the database through the preset rule module.
In step 264, the character strings in the logical expression are parsed to obtain objects and logical relationships between the objects.
After the logical formula is obtained from the database, the character strings in it are parsed. Parsing the character strings in the logical formula and abstracting the objects of a logic tree into a logical formula are two opposite, corresponding processes. Parsing the character strings in the logical formula yields the objects and the logical relations between the objects.
Step 266, the objects and logical relationships between the objects are rendered as a logical tree.
After the character strings in the logical formula are parsed to obtain the objects and the logical relations among them, the objects and their logical relations can be rendered into a logic tree. As shown in fig. 5, in the business scenario of bank collection, the logic tree is rendered according to the prefix expression. The logic tree contains the conditions and the logical relations between the conditions.
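The rendering step can be sketched in Python as follows. The (operator, children) tuple shape is an assumed intermediate form for the parsed objects, not the patent's actual object model.

```python
# Minimal sketch of rendering parsed objects into a logic tree; the
# (operator, children) tuple form is an assumed intermediate shape.
tree = ("|", ["A",
              ("&", ["B", "C", ("|", ["E", "F", "G"])]),
              ("|", ["D", ("&", ["H", "I"])])])

def render(node, depth=0):
    """Return the logic tree as a list of indented text lines."""
    pad = "  " * depth
    if isinstance(node, str):                 # leaf: a single condition
        return [pad + node]
    op, children = node                       # internal node: a logical relation
    lines = [pad + {"|": "OR", "&": "AND"}[op]]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines

print("\n".join(render(tree)))
```

Each nesting level of the prefix expression becomes one indentation level of the rendered tree, mirroring the structure shown in fig. 4.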
In step 268, the audio data is analyzed by the logic tree to obtain a tag of the user.
The audio data is analyzed through the conditions in the logic tree and the logical relations between them. For example, condition 1 in fig. 5 comprises condition 1.1, condition 1.2, condition 1.3, condition 1.4, and condition 1.5, and the logical relation among condition 1.1 through condition 1.5 is AND.
Further, condition 1.5 comprises condition 1.5.1, condition 1.5.2, condition 1.5.3, condition 1.5.4, and condition 1.5.5, and the logical relation among condition 1.5.1 through condition 1.5.5 is AND.
Further, condition 1.5.5 comprises condition 1.5.5.1, condition 1.5.5.2, condition 1.5.5.3, and condition 1.5.5.4, and the logical relation among them is AND. The content of these conditions includes items such as an ID, a word-slot value, a hang-up node, a data threshold, an on-condition, a play status, a hang-up party, and a trigger in the flow, which the application does not limit.
The audio data of the conversation between the robot and the user is input into the logic tree for analysis to obtain an analysis result, from which the user's tag is obtained. The user tags are the categories into which users are divided in the business scenario. For example, in the business scenario of bank collection, the user may be given labels such as "call connected and collection successful", "call connected but collection failed", or "call not connected", and other types of labels may also be included. The audio data is therefore input into the logic tree for analysis, and the user is labeled with any one or more of the above labels according to the analysis result.
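As a hedged sketch of how such a tree could assign a label, the following Python evaluates an assumed condition tree against facts extracted from the recognized audio. The condition names, facts, and label strings are illustrative assumptions, not the patent's actual bank-collection conditions.

```python
# Hedged sketch: condition names, facts, and labels are illustrative.
def evaluate(node, facts):
    """Evaluate an (operator, children) logic tree against a set of facts."""
    if isinstance(node, str):                    # leaf: a single condition
        return node in facts
    op, children = node
    results = [evaluate(child, facts) for child in children]
    return any(results) if op == "|" else all(results)

tree = ("&", ["call_connected",
              ("|", ["promised_repayment", "repaid"])])
facts = {"call_connected", "repaid"}             # e.g. from recognized audio

label = ("call connected and collection successful"
         if evaluate(tree, facts)
         else "call connected but collection failed")
print(label)  # call connected and collection successful
```

Each leaf check stands in for a real judgment condition (a word-slot value, a hang-up node, and so on); the tree combines them exactly as the stored logical relations dictate.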
In the embodiment of the application, the logical formula is queried from the database through the preset rule module, and the character strings in the logical formula are parsed to obtain the objects and the logical relations between the objects. The objects and their logical relations are rendered into a logic tree, and the audio data is analyzed through the logic tree to obtain the user's tag. In this way, the logic tree of the actual business scenario is finally rendered by parsing the prefix expression, a logical expression with a simple structure, and the user's tag can be obtained by inputting the audio data into the logic tree for analysis. A logical formula employing the prefix expression has a greatly reduced data size compared with the traditional JSON data structure. Storing the logical formula therefore consumes far less space in the database, and the smaller data volume also improves the operating efficiency of the two processes of abstracting the logic tree into the logical formula and parsing the logical formula back into objects. The efficiency of extracting the user tag from the audio data of the conversation between the robot and the user is thereby finally improved.
In one embodiment, parsing the character string corresponding to the logical expression to obtain the object and the logical relationship between the objects includes:
and analyzing the character strings in the logic formula through the first function to obtain the objects and the logic relations among the objects.
The first function is logicExpression2Object; it parses the character strings in the logical formula to obtain the objects and the logical relations between the objects. For example, the logical formula (|A, (&B, C, (|E, F, G)), (|D, (&H, I))) can be parsed to arrive at the objects and the logical relations between them. Specifically, the character strings in the logical formula are input into a state machine in sequence for reading; different commands are executed in turn according to the read values to obtain return values, and the objects and the logical relations between the objects are obtained based on all the return values.
In the embodiment of the application, the first function parses the simple logical formula obtained from the database and recovers the information it represents; that is, the character strings in the logical formula are parsed through the first function to obtain the objects and the logical relations between them. The objects and their logical relations can then be rendered as a logic tree, and the audio data analyzed through the logic tree to obtain the user's tag.
In one embodiment, as shown in fig. 6, parsing the character string in the logical formula by the first function to obtain the object and the logical relationship between the objects includes:
step 620, sequentially inputting the character strings in the logic formula into a state machine for reading;
step 640, respectively executing different commands in turn according to the read values to obtain return values;
step 660, based on all returned values, obtaining the objects and the logical relationships between the objects.
Wherein the source code of the logicExpression2Object function is as follows:
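The patent's code listing is not reproduced in this text. The following Python sketch is a hedged reconstruction of such a logicExpression2Object routine: it reads the prefix-expression string character by character, much as the described state machine does, and returns nested (operator, children) nodes. The node shape and all details are assumptions, not the patent's actual source.

```python
# Hedged reconstruction only: the patent's actual logicExpression2Object
# source is elided here. This parser reads the prefix-expression string
# character by character and returns nested (operator, children) nodes.
def logic_expression_to_object(expr):
    pos = 0

    def parse():
        nonlocal pos
        if expr[pos] == "(":             # sub-expression: "(" <op> items ")"
            pos += 1
            op = expr[pos]               # prefix form: "&" or "|" comes first
            pos += 1
            children = []
            while expr[pos] != ")":
                if expr[pos] == ",":     # separator between items
                    pos += 1
                    continue
                children.append(parse())
            pos += 1                     # consume ")"
            return (op, children)
        start = pos                      # leaf condition: read to delimiter
        while pos < len(expr) and expr[pos] not in ",)":
            pos += 1
        return expr[start:pos]

    return parse()

print(logic_expression_to_object("(|A,(&B,C,(|E,F,G)),(|D,(&H,I)))"))
```

Because the operator always appears first in the prefix form, a single left-to-right pass with recursion for sub-expressions is enough; no operator-precedence handling is needed.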
in the embodiment of the application, the simple logic formula acquired from the database is realized through the first function, and the information represented in the logic formula is analyzed, namely, the character strings in the logic formula are analyzed through the first function to obtain the objects and the logic relation among the objects. Thus, objects and logical relationships between objects can be rendered as logical trees. And analyzing the audio data through the logic tree to obtain the label of the user.
In one embodiment, a user tag extraction method is provided, which further includes: abstracting the objects in a preset logic tree to obtain a logical formula, and storing the logical formula in a database.
When the service is created, the corresponding logic tree is organized according to the actual business scenario. As shown in fig. 5, the logic tree is rendered in the bank-collection business scenario and contains the conditions and the logical relations between the conditions. The objects in the logic tree are abstracted to obtain a logical formula, and the logical formula is stored in the database. The data structure adopted by the logical formula in the application is a prefix expression; for example, the logical formula obtained by abstracting the objects in the logic tree shown in fig. 5 is: (&1.1, 1.2, 1.3, 1.4, (&1.5.1, 1.5.2, 1.5.3, 1.5.4, (&1.5.5.1, 1.5.5.2, 1.5.5.3, 1.5.5.4))). In the embodiment of the application, the objects in the preset logic tree are abstracted to obtain a logical formula whose data structure is a prefix expression. Compared with the traditional JSON data structure, the prefix expression greatly reduces the space consumed when storing the logic of a usage scenario, and thereby improves operating efficiency.
In one embodiment, a method for extracting a user tag is provided, which further includes: and abstracting the objects in the preset logic tree into logic formulas through a second function, and storing the logic formulas into a database.
The second function is object2LogicExpression, which abstracts the objects in the preset logic tree into a logic formula and stores the logic formula in the database. Specifically, all objects in the preset logic tree are traversed recursively, and the values corresponding to the logical relationships between the objects and the values corresponding to the objects are obtained. These values are then converted into character strings to obtain the logic formula, and the logic formula is stored in the database. For example, the logic formula abstracted from the objects in the logic tree shown in fig. 5 is: (&1.1,1.2,1.3,1.4,(&1.5.1,1.5.2,1.5.3,1.5.4,(&1.5.5.1,1.5.5.2,1.5.5.3,1.5.5.4))).
In this embodiment of the application, when the service is created, a corresponding logic tree is laid out according to the actual service scenario, the objects in the logic tree are abstracted to obtain a logic formula, and the logic formula is stored in the database. In a subsequent service scenario, the corresponding logic formula can then be obtained directly from the database, and the character strings in the logic formula can be parsed to obtain the objects and the logical relationships between the objects. The objects and the logical relationships between the objects are rendered as a logic tree, and the audio data can be analyzed through the logic tree to obtain the tag of the user. Moreover, the logic formula adopts the data structure of the prefix expression, which greatly reduces the space consumed when the logic formula is stored and further improves operation efficiency.
In one embodiment, as shown in fig. 7, abstracting the objects in the preset logic tree into a logic formula through the second function and storing the logic formula in the database includes:
step 720, recursively traversing all objects in the preset logic tree to obtain values corresponding to the logical relationships between the objects and values corresponding to the objects;
step 740, converting the values corresponding to the logical relationships between the objects and the values corresponding to the objects into character strings to obtain the logic formula, and storing the logic formula in the database.
The second function object2LogicExpression implements steps 720 and 740 above.
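A minimal sketch of such a function follows. This is a hedged illustration in Python: the name object2logic_expression, the dictionary-based tree shape, and the string format are assumptions, and database storage is reduced to returning the string.

```python
def object2logic_expression(node) -> str:
    # Step 720: recursively traverse all objects in the preset logic tree,
    # collecting the value of each object and of each logical relationship.
    if isinstance(node, str):             # a leaf condition such as "1.5.2"
        return node
    # Step 740: concatenate those values into a prefix-expression string.
    children = ",".join(object2logic_expression(c) for c in node["values"])
    return f"({node['logic']}{children})"

tree = {"logic": "&", "values": ["1.1", "1.2",
        {"logic": "|", "values": ["1.3", "1.4"]}]}
print(object2logic_expression(tree))      # (&1.1,1.2,(|1.3,1.4))
# The resulting string would then be written to the database.
```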
In this embodiment of the application, the objects in the preset logic tree are abstracted into a logic formula through the second function, and the logic formula is stored in the database. The logic formula adopts the data structure of the prefix expression, which greatly reduces the space consumed when the logic formula is stored and thus improves operation efficiency.
In one embodiment, the logical relationship is represented in the form of a prefix.
A prefix expression is an expression in which the operator is placed before the operands; correspondingly, there are also infix expressions and suffix (postfix) expressions. Because the data structure adopted by the logic formula in the preset rule module is a prefix expression, the logical relationship between the conditions in the logic formula is set before the conditions. For example, in (|A,(&B,C,(|E,F,G)),(|D,(&H,I))), the operator | comes first, followed by three condition groups: the first condition group is A, the second condition group is (&B,C,(|E,F,G)), and the third condition group is (|D,(&H,I)).
The logical relationship between the three condition groups A, (&B,C,(|E,F,G)) and (|D,(&H,I)) is an OR relationship. Further, the second condition group (&B,C,(|E,F,G)) includes three sub-condition groups B, C and (|E,F,G), and the logical relationship between these three sub-condition groups is an AND relationship. Still further, the logical relationship between the three conditions E, F and G in the sub-condition group (|E,F,G) is an OR relationship.
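The OR and AND semantics described above can be made concrete with a small evaluator over a parsed formula. This is a hedged sketch in Python; the nested-dictionary shape and the truth-table input are assumptions for illustration.

```python
# Evaluate a parsed logic formula against conditions that were (or were not)
# hit in the audio data.

def evaluate(node, truth):
    if isinstance(node, str):                     # a single condition
        return truth.get(node, False)
    results = [evaluate(child, truth) for child in node["values"]]
    return all(results) if node["logic"] == "&" else any(results)

# (|A,(&B,C,(|E,F,G)),(|D,(&H,I))) as a nested structure:
formula = {"logic": "|", "values": [
    "A",
    {"logic": "&", "values": ["B", "C",
        {"logic": "|", "values": ["E", "F", "G"]}]},
    {"logic": "|", "values": ["D",
        {"logic": "&", "values": ["H", "I"]}]},
]}

# B, C and G hold, so the AND group (&B,C,(|E,F,G)) is satisfied and the
# outer OR relationship makes the whole formula true.
print(evaluate(formula, {"B": True, "C": True, "G": True}))   # True
# E alone satisfies only the inner OR group, not B and C, so the formula fails.
print(evaluate(formula, {"E": True}))                         # False
```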
Conventionally, the logic formula in the preset rule module adopts a JSON data structure, which includes complex objects and nested array structures. The logic formula in the JSON data structure is therefore larger, so it occupies more storage space, parses slowly, and reduces operation speed. For example, the same formula in a JSON data structure is: {"logic":"|","values":["A",{"logic":"&","values":["B","C",{"logic":"|","values":["E","F","G"]}]},{"logic":"|","values":["D",{"logic":"&","values":["H","I"]}]}]}.
In this embodiment of the application, the logic formula of the complex JSON data structure is expressed in the form of a prefix expression; that is, the logical relationship is represented as a prefix in the logic formula. The logic formula of the conventional JSON data structure is converted into the prefix expression (|A,(&B,C,(|E,F,G)),(|D,(&H,I))). Obviously, the logic formula using the prefix expression has a greatly reduced data size compared with the conventional JSON data structure. Therefore, storing the logic formula in the database consumes much less storage space, and the smaller data volume also improves operation efficiency in the two processes of abstracting the logic tree into the logic formula and parsing the logic formula into objects. This ultimately improves the efficiency of extracting the user tag from the audio data of the conversation robot and the user.
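The size reduction claimed above can be checked directly by serializing the same formula both ways. This is an illustrative measurement; the exact JSON schema with "logic"/"values" keys is an assumption reconstructed from the example above.

```python
import json

# Prefix-expression form of (|A,(&B,C,(|E,F,G)),(|D,(&H,I))):
prefix = "(|A,(&B,C,(|E,F,G)),(|D,(&H,I)))"

# The same formula in the assumed JSON schema:
as_json = {"logic": "|", "values": [
    "A",
    {"logic": "&", "values": ["B", "C",
        {"logic": "|", "values": ["E", "F", "G"]}]},
    {"logic": "|", "values": ["D",
        {"logic": "&", "values": ["H", "I"]}]},
]}
json_str = json.dumps(as_json, separators=(",", ":"))

# Even with the most compact JSON separators, the prefix form is several
# times shorter than the JSON form.
print(len(prefix), len(json_str))
assert len(prefix) < len(json_str)
```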
In one embodiment, the logical formula includes a set of conditions and a logical relationship between the set of conditions, the set of conditions including a set of sub-conditions and a logical relationship between the set of sub-conditions, the set of sub-conditions including at least one condition and a logical relationship between the conditions.
Specifically, for example, one logic formula is: (|A,(&B,C,(|E,F,G)),(|D,(&H,I))). The logic formula includes condition groups and the logical relationship between the condition groups; here it includes three condition groups: A, (&B,C,(|E,F,G)) and (|D,(&H,I)), and the logical relationship between these three condition groups is an OR relationship.
Further, a condition group includes sub-condition groups and the logical relationship between the sub-condition groups. The second condition group (&B,C,(|E,F,G)) includes three sub-condition groups: B, C and (|E,F,G). The logical relationship between these three sub-condition groups is an AND relationship.
Still further, a sub-condition group includes at least one condition and the logical relationship between the conditions. The sub-condition groups B and C each include only one condition. The sub-condition group (|E,F,G) includes three conditions E, F and G, and the logical relationship between these three conditions is an OR relationship.
In this embodiment of the application, the logic formula is divided layer by layer from the outside in: condition groups at the top, sub-condition groups under the condition groups, and conditions under the sub-condition groups. There is a logical relationship between the condition groups, set in the form of a prefix at the front of all the condition groups; a logical relationship between the sub-condition groups, set in the form of a prefix at the front of all the sub-condition groups; and a logical relationship between the conditions, set in the form of a prefix at the front of all the conditions. The logical relationships in the logic formula are thus clear and definite, which facilitates parsing or abstraction in subsequent operations.
Compared with the conventional JSON data structure, the logic formula adopting the prefix expression has a greatly reduced data size. Therefore, storing the logic formula in the database consumes much less storage space, and the smaller data volume also improves operation efficiency in the two processes of abstracting the logic tree into the logic formula and parsing the logic formula into objects. This ultimately improves the efficiency of extracting the user tag from the audio data of the conversation robot and the user.
In one embodiment, as shown in fig. 8, there is provided a user tag extraction apparatus 800 comprising:
an acquisition module 820 for acquiring audio data of the conversation robot and the user;
the input module 840 is configured to input the audio data into a preset rule module, where the preset rule module includes a logic formula, the data structure adopted by the logic formula is a prefix expression, and the logic formula includes conditions and logical relationships between the conditions;
the analysis module 860 is configured to analyze the audio data through a logic formula to obtain a tag of the user.
In one embodiment, as shown in FIG. 9, the analysis module 860 includes:
a logic formula query unit 862 for querying a logic formula from the database through a preset rule module;
a character string analysis unit 864, configured to analyze a character string corresponding to the logical expression to obtain an object and a logical relationship between the objects;
a logical tree generating unit 866, configured to render the objects and the logical relationships between the objects into a logical tree;
and the analysis unit 868 is used for analyzing the audio data through the logic tree to obtain the label of the user.
In one embodiment, the character string parsing unit is further configured to parse the character string in the logic formula through a first function to obtain the object and a logic relationship between the objects.
In one embodiment, the character string analysis unit is further configured to sequentially input the character strings in the logic formula into the state machine for reading; respectively executing different commands in sequence according to the read values to obtain return values; based on all the returned values, the objects and the logical relationships between the objects are obtained.
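The state-machine reading described above can be sketched as follows. This is a non-authoritative illustration in Python: the function name, the dictionary shape, and the per-character commands are assumptions not fixed by the text.

```python
# Each character of the logic formula is fed into the machine in turn, a
# different command runs depending on the value read, and the return values
# are combined into objects and their logical relationships.

def state_machine_parse(expr: str):
    root = {"logic": None, "values": []}
    stack, buf = [root], ""

    def flush():
        nonlocal buf
        if buf:                          # return value: a finished condition
            stack[-1]["values"].append(buf)
            buf = ""

    for ch in expr:                      # read characters sequentially
        if ch == "(":                    # command: open a new condition group
            node = {"logic": None, "values": []}
            stack[-1]["values"].append(node)
            stack.append(node)
        elif ch in "&|":                 # command: record the logical relationship
            stack[-1]["logic"] = ch
        elif ch == ",":                  # command: finish the current condition
            flush()
        elif ch == ")":                  # command: close the current group
            flush()
            stack.pop()
        else:                            # command: accumulate a condition id
            buf += ch
    return root["values"][0]

print(state_machine_parse("(&1.1,1.2,(|1.3,1.4))"))
# {'logic': '&', 'values': ['1.1', '1.2', {'logic': '|', 'values': ['1.3', '1.4']}]}
```

The stack here plays the role of the machine's state, so nested condition groups of any depth are handled by the same five commands.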
In one embodiment, there is provided a user tag extraction apparatus further comprising: the logic formula storage module is used for abstracting the objects in the preset logic tree to obtain a logic formula, and storing the logic formula into the database.
In one embodiment, the logic formula storage module is further configured to abstract the object in the preset logic tree into a logic formula through the second function, and store the logic formula into the database.
In one embodiment, the logic formula storage module is further configured to recursively cycle through all objects in the preset logic tree, and obtain a value corresponding to a logic relationship between the objects and a value corresponding to the objects; and converting the values corresponding to the logical relations among the objects and the values corresponding to the objects into character strings to obtain logical formulas, and storing the logical formulas into a database.
In one embodiment, the logical relationship is represented in the form of a prefix.
In one embodiment, the logical formula includes a set of conditions and a logical relationship between the set of conditions, the set of conditions including a set of sub-conditions and a logical relationship between the set of sub-conditions, the set of sub-conditions including at least one condition and a logical relationship between the conditions.
The above-mentioned division of the respective modules in the user tag extraction apparatus is only for illustration, and in other embodiments, the user tag extraction apparatus may be divided into different modules as needed to complete all or part of the functions of the user tag extraction apparatus.
FIG. 10 is a schematic diagram of the internal structure of a server in one embodiment. As shown in fig. 10, the server includes a processor and a memory connected through a system bus. The processor is configured to provide computing and control capabilities to support the operation of the entire server. The memory may include a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The computer program is executable by the processor to implement the user tag extraction method provided in the embodiments of the application. The internal memory provides a cached running environment for the operating system and the computer program in the non-volatile storage medium. The server may be a cell phone, a tablet computer, a personal digital assistant, a wearable device, or the like.
The modules in the user tag extraction apparatus provided in the embodiments of the application may be implemented in the form of a computer program. The computer program may run on a terminal or a server, and its program modules may be stored in the memory of the terminal or server. When the computer program is executed by a processor, the steps of the method described in the embodiments of the application are performed.
The embodiments of the application also provide a computer-readable storage medium: one or more non-transitory computer-readable storage media containing computer-executable instructions that, when executed by one or more processors, cause the processors to perform the steps of the user tag extraction method.
A computer program product comprising instructions which, when run on a computer, cause the computer to perform a user tag extraction method.
Any reference to memory, storage, a database, or another medium used in the embodiments of the application may include non-volatile and/or volatile memory. Suitable non-volatile memory can include Read-Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The foregoing examples illustrate only a few embodiments of the application, which are described in detail and are not to be construed as limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (14)

1. A method for extracting a user tag, comprising:
acquiring audio data of the conversation robot and a user;
inputting the audio data into a preset rule module, wherein the preset rule module comprises a logic formula, a data structure adopted by the logic formula is a prefix expression, and the logic formula comprises conditions and logic relations among the conditions;
inquiring the logic formula from a database through the preset rule module;
analyzing the character strings in the logic formula to obtain objects and a logical relationship between the objects;
rendering the objects and the logical relationship between the objects into a logical tree;
and analyzing the audio data through the logic tree to obtain the label of the user.
2. The method of claim 1, wherein the parsing the string corresponding to the logical expression to obtain the object and the logical relationship between the objects comprises:
and analyzing the character strings in the logic formula through a first function to obtain objects and logic relations among the objects.
3. The method according to claim 2, wherein the parsing the string in the logical expression by the first function to obtain the object and the logical relationship between the objects includes:
sequentially inputting the character strings in the logic formulas into a state machine for reading;
respectively executing different commands in sequence according to the read values to obtain return values;
based on all the returned values, the objects and the logical relationship between the objects are obtained.
4. The method according to claim 1, wherein the method further comprises: and abstracting an object in a preset logic tree to obtain the logic formula, and storing the logic formula into a database.
5. The method according to claim 1, wherein the method further comprises: and abstracting the objects in the preset logic tree into the logic formula through a second function, and storing the logic formula into a database.
6. The method of claim 5, wherein abstracting the objects in the preset logical tree into the logical formula by the second function, storing the logical formula in the database, comprises:
circulating through all objects in a preset logic tree in a recursion mode, and acquiring values corresponding to logic relations among the objects and values corresponding to the objects;
and converting the values corresponding to the logical relations among the objects and the values corresponding to the objects into character strings to obtain logical formulas, and storing the logical formulas into a database.
7. The method according to any of claims 1-6, wherein the logical relationship is represented in the form of a prefix.
8. The method of claim 7, wherein the logical formula comprises a set of conditions and a logical relationship between the set of conditions, the set of conditions comprising a set of sub-conditions and a logical relationship between the set of sub-conditions, the set of sub-conditions comprising at least one condition and a logical relationship between the conditions.
9. A user tag extraction apparatus, comprising:
the acquisition module is used for acquiring audio data of the conversation robot and the user;
the input module is used for inputting the audio data into a preset rule module, the preset rule module comprises a logic formula, a data structure adopted by the logic formula is a prefix expression, and the logic formula comprises conditions and logical relationships among the conditions;
the analysis module is used for inquiring the logic formula from the database through the preset rule module; analyzing the character strings in the logic formula to obtain objects and a logical relationship between the objects; rendering the objects and the logical relationship between the objects into a logical tree; and analyzing the audio data through the logic tree to obtain the label of the user.
10. The apparatus of claim 9, wherein the analysis module comprises:
the logic formula query unit is used for querying the logic formula from the database through the preset rule module;
the character string analysis unit is used for analyzing the character string corresponding to the logic formula to obtain an object and a logic relationship between the objects;
a logic tree generating unit, configured to render the object and a logic relationship between the objects into a logic tree;
and the analysis unit is used for analyzing the audio data through the logic tree to obtain the label of the user.
11. The apparatus of claim 10, wherein the string parsing unit is further configured to parse a string in the logical formula by a first function to obtain an object and a logical relationship between the objects.
12. The apparatus of claim 11, wherein the string parsing unit is further configured to sequentially input the strings in the logical formula to a state machine for reading; respectively executing different commands in sequence according to the read values to obtain return values; based on all the returned values, the objects and the logical relationship between the objects are obtained.
13. A server comprising a memory and a processor, the memory having stored therein a computer program which, when executed by the processor, causes the processor to perform the steps of the user tag extraction method of any of claims 1 to 8.
14. A computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the user tag extraction method according to any of claims 1 to 8.
CN202010440700.8A 2020-05-22 2020-05-22 User tag extraction method and device, server and computer readable storage medium Active CN111768767B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010440700.8A CN111768767B (en) 2020-05-22 2020-05-22 User tag extraction method and device, server and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111768767A CN111768767A (en) 2020-10-13
CN111768767B (en) 2023-08-15





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant