Disclosure of Invention
In view of the above, the present invention provides a smart city monitoring method, apparatus and information processing device to address the above-mentioned problems.
In a first aspect of an embodiment of the present invention, there is provided a smart city monitoring method applied to a smart city monitoring system, the smart city monitoring system including an information processing device and an acquisition device that are in communication with each other, the acquisition device being disposed in different areas of a building and configured to acquire monitoring information in the different areas of the building, the method including:
the acquisition device monitors a target area of the building in real time, acquires monitoring information of the target area, and uploads the monitoring information to the information processing device in real time through a preset information transmission channel; the information transmission channel is generated by the information processing device according to a matching degree between first parameter structured information obtained by the information processing device and second parameter structured information of the acquisition device, wherein the first parameter structured information represents a communication parameter logical distribution of the information processing device, and the second parameter structured information represents a communication parameter logical distribution of the acquisition device; the monitoring information includes image information corresponding to the target area and voice information corresponding to the target area;
the information processing device receives, through the information transmission channel, the monitoring information corresponding to the target area uploaded by the acquisition device, and performs information category identification on the monitoring information to obtain first category information representing the image information and second category information representing the voice information, both of which are included in the monitoring information; and classifies the monitoring information according to the first category information and the second category information to obtain the image information corresponding to the first category information and the voice information corresponding to the second category information;
the information processing device extracts a first information feature of the image information and a second information feature of the voice information, performs feature recognition on the first information feature based on a first preset feature database to obtain a first recognition result, and performs feature recognition on the second information feature based on a second preset feature database to obtain a second recognition result; the first preset feature database is an image feature database, and the second preset feature database is a voice feature database;
the information processing device obtains a second confidence coefficient of the second recognition result with the first recognition result as a reference, and a first confidence coefficient of the first recognition result with the second recognition result as a reference; obtains, according to the first confidence coefficient and the second confidence coefficient, a weighting coefficient for weighting the first recognition result and the second recognition result; performs weighted summation on the first recognition result and the second recognition result based on the weighting coefficient to obtain a third recognition result, and determines an early warning level corresponding to the third recognition result; and judges that the target area is abnormal when the early warning level exceeds a set level.
Optionally, the extracting the first information feature of the image information and the second information feature of the voice information includes:
determining image coding information of the image information, dividing the image coding information according to coding segmentation identifiers in the image coding information to obtain a plurality of consecutive coding information segments, determining matching coefficients between every two coding information segments, and performing relevance correction on each coding information segment according to all the determined matching coefficients to obtain a target coding information segment corresponding to each coding information segment;
determining the code character distribution corresponding to each target code information segment, listing the character structure topology of each code character distribution, determining the character distribution characteristics corresponding to each code character distribution according to the character structure topology, and integrating all the character distribution characteristics to obtain the first information characteristics of the image information;
extracting a spectrogram of the voice information, separating a voiceprint curve in the voice information from the spectrogram, and acquiring voiceprint characteristics corresponding to the voiceprint curve;
obtaining text information corresponding to the voice information according to the voice information, performing word segmentation on the text information to obtain a plurality of keywords, determining topic information corresponding to the text information according to semantic connection relations among the keywords, and extracting topic features of the topic information;
determining a first association weight of the voiceprint feature relative to the topic feature and a second association weight of the topic feature relative to the voiceprint feature, and performing weighted summation on the voiceprint feature and the topic feature based on the first association weight and the second association weight to obtain the second information feature of the voice information.
Optionally, the performing feature recognition on the first information feature based on a first preset feature database to obtain a first recognition result includes:
acquiring, from the first preset feature database, a first target feature having the minimum cosine distance to the first information feature;
determining a behavior category corresponding to the first target feature from a preset first mapping relation list, wherein the behavior category is a behavior category of a person obtained from the image information;
and generating the first recognition result based on the behavior category.
Optionally, the performing feature recognition on the second information feature based on a second preset feature database to obtain a second recognition result includes:
determining, from the second preset feature database, at least some second target features whose similarity values with the second information feature are smaller than or equal to a set threshold value;
and determining a semantic result corresponding to each second target feature according to a preset second mapping relation list, and fusing all the semantic results to obtain the second recognition result.
Optionally, the performing weighted summation on the first recognition result and the second recognition result based on the weighting coefficient to obtain a third recognition result includes:
listing the first recognition result and the second recognition result in numerical code form, respectively, to obtain a first numerical code list corresponding to the first recognition result and a second numerical code list corresponding to the second recognition result;
pairing each first list unit in the first numerical code list with each second list unit in the second numerical code list to obtain at least some list unit groups, wherein each list unit group comprises one first list unit and one second list unit;
and generating a third numerical code list according to the at least some list unit groups, weighting the third numerical code list based on the weighting coefficient to obtain a fourth numerical code list, and converting the fourth numerical code list into the third recognition result based on either the first conversion logic (the logic by which the first recognition result is listed in numerical code form to obtain the first numerical code list) or the second conversion logic (the logic by which the second recognition result is listed in numerical code form to obtain the second numerical code list).
In a second aspect of the embodiments of the invention, a smart city monitoring system is provided, comprising an information processing device and an acquisition device that communicate with each other, wherein the acquisition device is disposed in different areas of a building;
the acquisition device is configured to monitor a target area of the building in real time, acquire monitoring information of the target area, and upload the monitoring information to the information processing device in real time through a preset information transmission channel; the information transmission channel is generated by the information processing device according to a matching degree between first parameter structured information obtained by the information processing device and second parameter structured information of the acquisition device, wherein the first parameter structured information represents a communication parameter logical distribution of the information processing device, and the second parameter structured information represents a communication parameter logical distribution of the acquisition device; the monitoring information includes image information corresponding to the target area and voice information corresponding to the target area;
the information processing device is configured to receive, through the information transmission channel, the monitoring information corresponding to the target area uploaded by the acquisition device, and perform information category identification on the monitoring information to obtain first category information representing the image information and second category information representing the voice information, both of which are included in the monitoring information; classify the monitoring information according to the first category information and the second category information to obtain the image information corresponding to the first category information and the voice information corresponding to the second category information; extract a first information feature of the image information and a second information feature of the voice information, perform feature recognition on the first information feature based on a first preset feature database to obtain a first recognition result, and perform feature recognition on the second information feature based on a second preset feature database to obtain a second recognition result, the first preset feature database being an image feature database and the second preset feature database being a voice feature database; obtain a second confidence coefficient of the second recognition result with the first recognition result as a reference, and a first confidence coefficient of the first recognition result with the second recognition result as a reference; obtain, according to the first confidence coefficient and the second confidence coefficient, a weighting coefficient for weighting the first recognition result and the second recognition result; perform weighted summation on the first recognition result and the second recognition result based on the weighting coefficient to obtain a third recognition result, and determine an early warning level corresponding to the third recognition result; and judge that the target area is abnormal when the early warning level exceeds a set level.
Optionally, the information processing device is specifically configured to:
determining image coding information of the image information, dividing the image coding information according to coding segmentation identifiers in the image coding information to obtain a plurality of consecutive coding information segments, determining matching coefficients between every two coding information segments, and performing relevance correction on each coding information segment according to all the determined matching coefficients to obtain a target coding information segment corresponding to each coding information segment;
determining the code character distribution corresponding to each target code information segment, listing the character structure topology of each code character distribution, determining the character distribution characteristics corresponding to each code character distribution according to the character structure topology, and integrating all the character distribution characteristics to obtain the first information characteristics of the image information;
extracting a spectrogram of the voice information, separating a voiceprint curve in the voice information from the spectrogram, and acquiring voiceprint characteristics corresponding to the voiceprint curve;
obtaining text information corresponding to the voice information according to the voice information, performing word segmentation on the text information to obtain a plurality of keywords, determining topic information corresponding to the text information according to semantic connection relations among the keywords, and extracting topic features of the topic information;
determining a first association weight of the voiceprint feature relative to the topic feature and a second association weight of the topic feature relative to the voiceprint feature, and performing weighted summation on the voiceprint feature and the topic feature based on the first association weight and the second association weight to obtain the second information feature of the voice information.
Optionally, the information processing device is specifically configured to:
acquiring, from the first preset feature database, a first target feature having the minimum cosine distance to the first information feature;
determining a behavior category corresponding to the first target feature from a preset first mapping relation list, wherein the behavior category is a behavior category of a person obtained from the image information;
and generating the first recognition result based on the behavior category.
Optionally, the information processing device is specifically configured to:
determining, from the second preset feature database, at least some second target features whose similarity values with the second information feature are smaller than or equal to a set threshold value;
and determining a semantic result corresponding to each second target feature according to a preset second mapping relation list, and fusing all the semantic results to obtain the second recognition result.
Optionally, the information processing device is specifically configured to:
listing the first recognition result and the second recognition result in numerical code form, respectively, to obtain a first numerical code list corresponding to the first recognition result and a second numerical code list corresponding to the second recognition result;
pairing each first list unit in the first numerical code list with each second list unit in the second numerical code list to obtain at least some list unit groups, wherein each list unit group comprises one first list unit and one second list unit;
and generating a third numerical code list according to the at least some list unit groups, weighting the third numerical code list based on the weighting coefficient to obtain a fourth numerical code list, and converting the fourth numerical code list into the third recognition result based on either the first conversion logic (the logic by which the first recognition result is listed in numerical code form to obtain the first numerical code list) or the second conversion logic (the logic by which the second recognition result is listed in numerical code form to obtain the second numerical code list).
Advantageous effects
According to the smart city monitoring method and system provided by the embodiments of the invention, first, the acquisition device uploads the acquired monitoring information of the target area to the information processing device through the preset information transmission channel, so that the timeliness and accuracy of monitoring information transmission can be improved.
Secondly, the information processing device extracts features from the image information and the voice information in the monitoring information, and performs feature recognition to obtain a first recognition result and a second recognition result.
Finally, a weighting coefficient is determined based on the first confidence coefficient and the second confidence coefficient obtained from the first recognition result and the second recognition result, weighted summation of the first recognition result and the second recognition result is performed to obtain a third recognition result, and the target area is judged to be abnormal when the early warning level of the third recognition result exceeds the set level.
In this way, the monitoring information can be deeply mined from the perspective of the association between the image information and the voice information, thereby obtaining a more comprehensive and reliable monitoring analysis result and ensuring the safety of the building.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
In order to address the problem that monitoring information is insufficiently mined when a building is monitored, the embodiments of the invention provide a smart city monitoring method and a smart city monitoring system, which can deeply mine the monitoring information to obtain a more comprehensive and reliable monitoring analysis result and thereby ensure the safety of the building.
Referring to fig. 1, a schematic architecture diagram of a smart city monitoring system 100 according to an embodiment of the present invention is provided, where the smart city monitoring system 100 includes an information processing device 200 and an acquisition device 300 that communicate with each other. In this embodiment, the acquisition device 300 may be a camera or a microphone, which is not limited herein. The acquisition device 300 may be disposed in different areas of a building and configured to acquire monitoring information in those areas.
Referring to fig. 2 in combination, a flowchart of a smart city monitoring method according to an embodiment of the invention is provided, and the method is applied to the smart city monitoring system 100 in fig. 1. The method may specifically include the following steps.
Step S21, the acquisition device monitors a target area of the building in real time, acquires monitoring information of the target area, and uploads the monitoring information to the information processing device in real time through a preset information transmission channel; the information transmission channel is generated by the information processing device according to a matching degree between first parameter structured information obtained by the information processing device and second parameter structured information of the acquisition device, wherein the first parameter structured information represents a communication parameter logical distribution of the information processing device, and the second parameter structured information represents a communication parameter logical distribution of the acquisition device; the monitoring information includes image information corresponding to the target area and voice information corresponding to the target area.
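The embodiment does not specify how the matching degree between the two pieces of parameter structured information is computed. The following minimal Python sketch assumes each communication parameter logical distribution is represented as a numeric vector and takes their cosine similarity as the matching degree; the function names, vector representation, and threshold are all illustrative assumptions rather than details from the embodiment.

```python
import math


def matching_degree(first_params, second_params):
    """Cosine similarity between two communication-parameter vectors
    (hypothetical realization of the 'matching degree')."""
    dot = sum(a * b for a, b in zip(first_params, second_params))
    norm = (math.sqrt(sum(a * a for a in first_params))
            * math.sqrt(sum(b * b for b in second_params)))
    return dot / norm if norm else 0.0


def generate_channel(first_params, second_params, threshold=0.8):
    """Establish the information transmission channel only when the
    matching degree reaches the (assumed) threshold."""
    degree = matching_degree(first_params, second_params)
    return {"established": degree >= threshold, "matching_degree": degree}
```

Under this sketch, the channel descriptor returned by `generate_channel` would then be shared with the acquisition device for the real-time upload of monitoring information.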
Step S22, the information processing device receives, through the information transmission channel, the monitoring information corresponding to the target area uploaded by the acquisition device, and performs information category identification on the monitoring information to obtain first category information representing the image information and second category information representing the voice information, both of which are included in the monitoring information; and classifies the monitoring information according to the first category information and the second category information to obtain the image information corresponding to the first category information and the voice information corresponding to the second category information.
Step S23, the information processing device extracts a first information feature of the image information and a second information feature of the voice information, performs feature recognition on the first information feature based on a first preset feature database to obtain a first recognition result, and performs feature recognition on the second information feature based on a second preset feature database to obtain a second recognition result; the first preset feature database is an image feature database, and the second preset feature database is a voice feature database.
Step S24, the information processing device obtains a second confidence coefficient of the second recognition result with the first recognition result as a reference, and a first confidence coefficient of the first recognition result with the second recognition result as a reference; obtains, according to the first confidence coefficient and the second confidence coefficient, a weighting coefficient for weighting the first recognition result and the second recognition result; performs weighted summation on the first recognition result and the second recognition result based on the weighting coefficient to obtain a third recognition result, and determines an early warning level corresponding to the third recognition result; and judges that the target area is abnormal when the early warning level exceeds a set level.
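Step S24 leaves the derivation of the weighting coefficient open. One plausible sketch, assuming the two recognition results are scalar scores, that the weights are the normalized cross-referenced confidence coefficients, and that the early warning level is obtained by thresholding the fused score — all of which are assumptions, not details from the embodiment — is:

```python
def weighting_coefficients(first_conf, second_conf):
    """Normalize the two cross-referenced confidence coefficients into
    weights summing to one (hypothetical scheme)."""
    total = first_conf + second_conf
    if total == 0:
        return 0.5, 0.5
    return first_conf / total, second_conf / total


def third_recognition_result(first_result, second_result, first_conf, second_conf):
    """Weighted summation of the first and second recognition results."""
    w1, w2 = weighting_coefficients(first_conf, second_conf)
    return w1 * first_result + w2 * second_result


def early_warning_level(third_result, thresholds=(0.3, 0.6, 0.9)):
    """Map the fused score to a discrete early warning level 0..3
    (the threshold values are illustrative)."""
    return sum(third_result >= t for t in thresholds)
```

The target area would then be judged abnormal when `early_warning_level(...)` exceeds the set level.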
It can be understood that, based on the descriptions in the above steps S21 to S24, first, the collecting device uploads the collected monitoring information of the target area to the information processing device through the preset information transmission channel, so that the reliability and accuracy of the transmission of the monitoring information can be improved. Secondly, the information processing equipment performs feature extraction on the image information and the voice information in the monitoring information, and performs feature recognition to obtain a first recognition result and a second recognition result. And finally, determining a weighting coefficient based on the first confidence coefficient and the second confidence coefficient which are obtained according to the first recognition result and the second recognition result, realizing weighted summation of the first recognition result and the second recognition result to obtain a third recognition result, and then judging that the target area is abnormal when the early warning level of the third recognition result exceeds the set level. Therefore, the monitoring information can be deeply mined from the association angle of the image information and the voice information, and further a more comprehensive and reliable monitoring analysis result is obtained, so that the safety of the building is ensured.
In an alternative embodiment, in order to accurately determine the first information feature and the second information feature, the extraction of the first information feature of the image information and the second information feature of the voice information in step S23 may specifically include the following steps.
Step S231, determining image coding information of the image information, dividing the image coding information according to coding segmentation identifiers in the image coding information to obtain a plurality of consecutive coding information segments, determining matching coefficients between every two coding information segments, and performing relevance correction on each coding information segment according to all the determined matching coefficients to obtain a target coding information segment corresponding to each coding information segment.
Step S232, determining the code character distribution corresponding to each target code information segment, listing the character structure topology of each code character distribution, determining the character distribution characteristics corresponding to each code character distribution according to the character structure topology, and integrating all the character distribution characteristics to obtain the first information characteristics of the image information.
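The division and relevance correction of steps S231 and S232 can be sketched as follows, assuming the image coding information is a string in which a separator character serves as the coding segmentation identifier and the matching coefficient is the fraction of aligned equal characters; these representations and the function names are illustrative assumptions only.

```python
def split_coding_info(encoded, identifier="|"):
    """Divide the image coding information at each coding segmentation
    identifier into consecutive coding information segments."""
    return encoded.split(identifier)


def matching_coefficient(seg_a, seg_b):
    """Hypothetical pairwise measure: fraction of aligned equal characters."""
    if not seg_a or not seg_b:
        return 0.0
    hits = sum(a == b for a, b in zip(seg_a, seg_b))
    return hits / max(len(seg_a), len(seg_b))


def relevance_correct(segments):
    """Attach to each segment its mean matching coefficient with all other
    segments, yielding the target coding information segments."""
    corrected = []
    for i, seg in enumerate(segments):
        coeffs = [matching_coefficient(seg, other)
                  for j, other in enumerate(segments) if j != i]
        corrected.append((seg, sum(coeffs) / len(coeffs) if coeffs else 0.0))
    return corrected
```

The per-segment weights produced here would then feed the character distribution analysis of step S232.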
Step S233, extracting a spectrogram of the voice information, separating a voiceprint curve in the voice information from the spectrogram, and obtaining voiceprint characteristics corresponding to the voiceprint curve.
Step S234, obtaining text information corresponding to the voice information according to the voice information, performing word segmentation on the text information to obtain a plurality of keywords, determining topic information corresponding to the text information according to semantic connection relations among the keywords, and extracting topic features of the topic information.
Step S235, determining a first association weight of the voiceprint feature relative to the topic feature and a second association weight of the topic feature relative to the voiceprint feature, and performing weighted summation on the voiceprint feature and the topic feature based on the first association weight and the second association weight to obtain the second information feature of the voice information.
It will be appreciated that the first information feature and the second information feature can be accurately determined by the description of steps S231 to S235.
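The weighted summation of step S235 can be sketched as below, under the assumption that the voiceprint feature and the topic feature are equal-length numeric vectors and that the two association weights are already known; how the weights themselves are derived is not specified in the embodiment.

```python
def fuse_voice_features(voiceprint, topic, w_voiceprint, w_topic):
    """Element-wise weighted sum of the voiceprint feature and the topic
    feature, producing the second information feature of the voice
    information (vector representation is an assumption)."""
    if len(voiceprint) != len(topic):
        raise ValueError("feature vectors must have equal length")
    return [w_voiceprint * v + w_topic * t for v, t in zip(voiceprint, topic)]
```

For example, with weights 0.5 and 0.5 the fused feature is simply the element-wise mean of the two vectors.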
In a specific implementation, the feature recognition of the first information feature based on the first preset feature database in step S23 may specifically include the following steps.
(11) Acquiring, from the first preset feature database, a first target feature having the minimum cosine distance to the first information feature.
(12) Determining a behavior category corresponding to the first target feature from a preset first mapping relation list, wherein the behavior category is a behavior category of a person obtained from the image information.
(13) Generating the first recognition result based on the behavior category.
In this embodiment, the first recognition result including the behavior category can be accurately determined through the descriptions in the steps (11) - (13).
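Steps (11)-(13) above amount to a nearest-neighbor lookup under cosine distance followed by a table lookup. A minimal sketch, in which the database, the mapping list, and the behavior category labels are all hypothetical:

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a))
            * math.sqrt(sum(y * y for y in b)))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def first_recognition(feature, feature_db, mapping_list):
    """Find the first target feature (minimum cosine distance) in the
    database and look up its behavior category in the first mapping
    relation list."""
    best_id = min(feature_db,
                  key=lambda fid: cosine_distance(feature, feature_db[fid]))
    return mapping_list[best_id]
```

A usage example with assumed labels: with `feature_db = {"f1": [1.0, 0.0], "f2": [0.0, 1.0]}` and `mapping_list = {"f1": "loitering", "f2": "running"}`, an extracted feature close to `[1, 0]` resolves to the "loitering" category.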
On the basis of the above, the feature recognition of the second information feature based on the second preset feature database in step S23 may specifically include the following steps.
(21) Determining, from the second preset feature database, at least some second target features whose similarity values with the second information feature are smaller than or equal to a set threshold value.
(22) Determining a semantic result corresponding to each second target feature according to a preset second mapping relation list, and fusing all the semantic results to obtain the second recognition result.
In the present embodiment, the second recognition result can be accurately determined through the contents described in the above steps (21) and (22).
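Steps (21) and (22) can be sketched as below. The embodiment does not define the "similarity value"; since features at most a set threshold are selected, this sketch treats it as a distance-like quantity (mean absolute difference), and the fusion of semantic results is shown as a simple concatenation; every name and label here is an assumption.

```python
def similarity_value(a, b):
    """Hypothetical distance-like similarity value: mean absolute
    difference (smaller means more alike, matching the 'at most a
    set threshold' selection rule)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def second_recognition(feature, feature_db, mapping_list, threshold=0.3):
    """Select second target features within the threshold and fuse their
    semantic results into the second recognition result."""
    selected = [fid for fid, vec in feature_db.items()
                if similarity_value(feature, vec) <= threshold]
    return " / ".join(mapping_list[fid] for fid in selected)
```

With an assumed database `{"g1": [0.1, 0.1], "g2": [0.9, 0.9]}` and mapping `{"g1": "shouting detected", "g2": "alarm keyword"}`, a feature near the origin selects only `g1`.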
In a specific implementation, the weighted summation of the first recognition result and the second recognition result based on the weighting coefficient to obtain the third recognition result in step S24 may specifically include the following steps.
Step S241, listing the first recognition result and the second recognition result in numerical code form, respectively, to obtain a first numerical code list corresponding to the first recognition result and a second numerical code list corresponding to the second recognition result.
Step S242, pairing each first list unit in the first numerical code list with each second list unit in the second numerical code list to obtain at least some list unit groups, wherein each list unit group comprises one first list unit and one second list unit.
Step S243, generating a third numerical code list according to the at least some list unit groups, weighting the third numerical code list based on the weighting coefficient to obtain a fourth numerical code list, and converting the fourth numerical code list into the third recognition result based on either the first conversion logic (the logic by which the first recognition result is listed in numerical code form to obtain the first numerical code list) or the second conversion logic (the logic by which the second recognition result is listed in numerical code form to obtain the second numerical code list).
It is understood that the third recognition result can be accurately obtained based on the descriptions of the above steps S241 to S243.
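Steps S241 to S243 can be sketched as follows, assuming the conversion logic is a token-to-code vocabulary and the third list is formed by summing each paired list unit group; the vocabulary, pairing rule, and function names are illustrative assumptions.

```python
def to_code_list(result_tokens, vocab):
    """Conversion logic (steps S241): list a recognition result in
    numerical code form via a token-to-code vocabulary."""
    return [vocab[token] for token in result_tokens]


def fuse_code_lists(first_list, second_list, weighting_coefficient):
    """Pair list units (step S242), build the third numerical code list
    from the pairs, and weight it into the fourth list (step S243)."""
    unit_groups = list(zip(first_list, second_list))   # list unit groups
    third_list = [a + b for a, b in unit_groups]       # third numerical code list
    return [weighting_coefficient * v for v in third_list]
```

The fourth list returned here would then be mapped back through the inverse of either conversion logic to yield the third recognition result.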
Referring to fig. 3 in combination, a block diagram of an information processing device 200 according to an embodiment of the invention is shown. The information processing device 200 in the embodiment of the present invention has data storage, transmission and processing functions. As shown in fig. 3, the information processing device 200 includes: a memory 211, a processor 212, a network module 213 and a smart city monitoring device 201.
The memory 211, the processor 212 and the network module 213 are electrically connected to one another, directly or indirectly, to enable transmission or interaction of data. For example, these components may be electrically connected to each other via one or more communication buses or signal lines. The memory 211 stores the smart city monitoring device 201, which includes at least one software function module stored in the memory 211 in the form of software or firmware. The processor 212 executes various functional applications and data processing by running the software programs and modules stored in the memory 211, such as the smart city monitoring device 201 in the embodiment of the present invention, thereby implementing the smart city monitoring method in the embodiment of the present invention.
The memory 211 may be, but is not limited to, a random access memory (Random Access Memory, RAM), a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable Read-Only Memory, PROM), an erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), etc. The memory 211 is used for storing a program, and the processor 212 executes the program after receiving an execution instruction.
The processor 212 may be an integrated circuit chip having data processing capabilities. The processor 212 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc. The methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed by the processor 212. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The network module 213 is configured to establish a communication connection between the information processing apparatus 200 and other communication terminal apparatuses through a network, and implement a transceiving operation of network signals and data. The network signals may include wireless signals or wired signals.
It is to be understood that the structure shown in fig. 3 is merely illustrative, and the information processing apparatus 200 may further include more or fewer components than those shown in fig. 3, or have a configuration different from that shown in fig. 3. The components shown in fig. 3 may be implemented in hardware, software, or a combination thereof.
Embodiments of the present invention also provide a computer-readable storage medium including a computer program. When run, the computer program controls the information processing apparatus 200 in which the readable storage medium is located to execute the steps corresponding to the information processing apparatus 200 in the smart city monitoring method shown in fig. 2.
In summary, according to the smart city monitoring method and system provided by the embodiment of the invention, firstly, the acquisition equipment uploads the acquired monitoring information of the target area to the information processing equipment through the preset information transmission channel, so that the efficiency and accuracy of monitoring information transmission can be improved.
Secondly, the information processing equipment performs feature extraction on the image information and the voice information in the monitoring information, and performs feature recognition to obtain a first recognition result and a second recognition result.
Finally, a weighting coefficient is determined based on the first confidence coefficient and the second confidence coefficient, which are obtained according to the first recognition result and the second recognition result; a weighted summation of the first recognition result and the second recognition result is then performed to obtain a third recognition result, and the target area is judged to be abnormal when the early warning level of the third recognition result exceeds the set level.
Therefore, the monitoring information can be deeply mined from the association angle of the image information and the voice information, and further a more comprehensive and reliable monitoring analysis result is obtained, so that the safety of the building is ensured.
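The fusion described in the summary above can be sketched as follows. The function names, the normalization of the weighting coefficient from the two confidences, and the threshold comparison are illustrative assumptions; the embodiment does not prescribe a specific fusion formula.

```python
# Hypothetical sketch of the confidence-weighted fusion: a weighting
# coefficient derived from the first and second confidence coefficients
# combines the first and second recognition results into a third result,
# which is then compared against a set early-warning level. All names
# and the fusion rule are assumptions for illustration.

def fuse_results(first_result, second_result, first_conf, second_conf):
    """Weighted sum of the two recognition results, with the weight
    derived by normalizing the first confidence against their sum."""
    w = first_conf / (first_conf + second_conf)
    return w * first_result + (1 - w) * second_result

def is_abnormal(third_result, set_level):
    """Judge the target area abnormal when the early warning level of
    the third recognition result exceeds the set level."""
    return third_result > set_level

# Example: fusing an image-based and a voice-based recognition result.
third = fuse_results(0.9, 0.6, first_conf=0.75, second_conf=0.25)
alarm = is_abnormal(third, set_level=0.5)
```

Under this sketch, a higher first confidence shifts the fused result toward the image-based recognition result, which matches the intent of mining the monitoring information from the association of image and voice information.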
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus and method embodiments described above are merely illustrative, for example, flow diagrams and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present invention may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, an information processing device 200, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.