CN109977816B - Information processing method, device, terminal and storage medium - Google Patents


Info

Publication number
CN109977816B
Authority
CN
China
Prior art keywords
information
data
images
image
pixels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910189975.6A
Other languages
Chinese (zh)
Other versions
CN109977816A (en)
Inventor
魏亚男
姜譞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201910189975.6A priority Critical patent/CN109977816B/en
Publication of CN109977816A publication Critical patent/CN109977816A/en
Application granted granted Critical
Publication of CN109977816B publication Critical patent/CN109977816B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the disclosure discloses an information processing method, an information processing device, a terminal, and a storage medium. The method includes: obtaining an image sequence consisting of a plurality of images, wherein at least one image of the plurality of images contains an object to be recognized; analyzing the correlation among the plurality of images to obtain first data associated with the form of the object to be recognized; determining, based on the first data, second data associated with attribute information of the object to be recognized; and obtaining, based on at least the first data and the second data, features of the object to be recognized in the image sequence.

Description

Information processing method, device, terminal and storage medium
Technical Field
The present disclosure relates to, but is not limited to, the field of computer technology, and in particular to an information processing method, an information processing apparatus, a terminal, and a storage medium.
Background
At present, in the related art, when an object to be recognized is analyzed, a single image containing the object to be recognized is first acquired and the object is then analyzed based on that single image alone; it is difficult for this analysis method to yield accurate analysis results.
Disclosure of Invention
In order to solve the above technical problem, embodiments of the present disclosure are expected to provide an information processing method, an apparatus, a terminal, and a storage medium, which solve the problem in the related art that an accurate analysis result is difficult to obtain when an object to be recognized is analyzed, ensure the accuracy of the analysis result, and improve the intelligence of information processing.
The technical solution of the present disclosure is implemented as follows:
an information processing method, the method comprising:
obtaining an image sequence consisting of a plurality of images; wherein at least one image of the plurality of images comprises an object to be identified;
analyzing the correlation among the plurality of images to obtain first data associated with the form of the object to be recognized;
determining, based on the first data, second data associated with attribute information of the object to be identified, the attribute information being characterized by attribute feature parameters of the object to be identified in a single image;
obtaining features of the object to be identified in the image sequence based on at least the first data and the second data.
Optionally, the obtaining an image sequence composed of a plurality of images includes:
obtaining video information containing the object to be identified;
extracting a plurality of images from the video information according to the time sequence to obtain the image sequence; wherein each image of the plurality of images is a two-dimensional image or a three-dimensional image.
Optionally, the analyzing the correlation between the plurality of images to obtain first data associated with the morphology of the object to be recognized includes:
performing, through the same model, correlation analysis between pixels of different images and between different pixels of the same image to obtain the first data.
Optionally, the performing, by the same model, correlation analysis between pixels of different images and between different pixels of the same image to obtain the first data includes:
performing, through the same model, correlation analysis between pixels of different images and between different pixels of the same image based on an nth specific time interval, to obtain an analysis result corresponding to the nth specific time interval; wherein any two adjacent specific time intervals are different in length, n is a positive integer greater than or equal to 1 and less than or equal to N, and N is the total number of specific time intervals;
generating the first data based on a plurality of the analysis results.
Optionally, the analyzing the plurality of images based on the nth specific time interval to obtain an analysis result corresponding to the nth specific time interval includes:
determining a dilation rate corresponding to the nth specific time interval in a dilated convolutional neural network model;
and performing, by using the dilated convolutional neural network model, correlation analysis between pixels of different images and between different pixels of the same image based on the dilation rate corresponding to the nth specific time interval, to obtain an analysis result corresponding to the nth specific time interval.
Optionally, the performing, by the same model, correlation analysis between pixels of different images and between different pixels of the same image to obtain the first data includes:
performing, through the same model, correlation analysis between pixels of different images and between different pixels of the same image based on an n'th specific spatial interval, to obtain an analysis result corresponding to the n'th specific spatial interval; wherein any two adjacent specific spatial intervals are different in length, n' is a positive integer greater than or equal to 1 and less than or equal to N', and N' is the total number of specific spatial intervals; and each image of the plurality of images is a two-dimensional image;
generating the first data based on a plurality of the analysis results.
Optionally, the performing, through the same model, correlation analysis between pixels of the different images and between different pixels of the same image based on the n'th specific spatial interval to obtain an analysis result corresponding to the n'th specific spatial interval includes:
determining the dilation rate corresponding to the n'th specific spatial interval in the dilated convolutional neural network model;
and performing, by using the dilated convolutional neural network model, correlation analysis between pixels of different images and between different pixels of the same image based on the dilation rate corresponding to the n'th specific spatial interval, to obtain an analysis result corresponding to the n'th specific spatial interval.
Optionally, the generating the first data based on a plurality of the analysis results includes:
and weighting the plurality of analysis results through a connection layer connected to a dilated convolutional neural network model to obtain the first data.
Optionally, the determining, based on the first data, second data associated with attribute information of the object to be identified includes:
processing the first data through a deep neural network model to obtain feature information associated with the attribute information of the object to be identified in each image of the plurality of images; wherein there are a plurality of pieces of attribute information, the plurality of pieces of attribute information correspond to multi-scale feature information, and each piece of attribute information is characterized by attribute feature parameters of the object to be identified in a single image;
determining a plurality of characteristic information with position mapping relation in the plurality of characteristic information;
and fusing the plurality of characteristic information with the position mapping relation to obtain the second data.
Optionally, the obtaining the feature of the object to be recognized in the image sequence based on at least the first data and the second data includes:
determining a first mark point of a target object in the object to be identified in a first direction and length information in the first direction based on the first data;
determining a second marker point of the target object in a second direction, first boundary information in the second direction, and second boundary information in the second direction based on the second data;
obtaining positioning information of the target object based on the first mark point, the length information, the second mark point, the first boundary information and the second boundary information; the characteristic of the object to be identified comprises the positioning information, and an included angle between the first direction and the second direction is 90 degrees.
Optionally, the obtaining the positioning information of the target object based on the first mark point, the length information, the second mark point, the first boundary information, and the second boundary information includes:
determining a size of the target object in the first direction based on the first marker point and the length information;
determining a plane position of the target object in the second direction based on the second marker point, the first boundary information and the second boundary information;
and obtaining the positioning information based on the size and the plane position.
Optionally, the obtaining the positioning information based on the size and the plane position includes:
determining a localization area from the object to be identified based on the size and the planar position by a deep neural network model;
obtaining a confidence that the positioning region comprises the target object through the deep neural network model;
determining the location area and the confidence as the location information.
An information processing apparatus, comprising: a first acquisition module, a first processing module, a second processing module, and a third processing module, wherein:
the first acquisition module is used for acquiring an image sequence formed by a plurality of images; wherein at least one image of the plurality of images comprises an object to be identified;
the first processing module is used for analyzing the correlation among the plurality of images to obtain first data associated with the form of the object to be recognized;
the second processing module is used for determining second data associated with attribute information of the object to be identified based on the first data, wherein the attribute information is characterized by attribute characteristic parameters of the object to be identified in a single image;
the third processing module is configured to obtain a feature of the object to be recognized in the image sequence based on at least the first data and the second data.
Optionally, the first obtaining module includes: a first acquisition unit and a first processing unit, wherein:
the first obtaining unit is used for obtaining video information containing the object to be identified;
the first processing unit is used for extracting a plurality of images from the video information in a time sequence to obtain the image sequence; wherein each image of the plurality of images is a two-dimensional image or a three-dimensional image.
Optionally, the first processing module is further configured to perform correlation analysis between pixels of different images and between different pixels of the same image through the same model, so as to obtain the first data.
Optionally, the first processing module includes: a second processing unit and a third processing unit, wherein:
the second processing unit is used for performing, through the same model, correlation analysis between pixels of different images and between different pixels of the same image based on an nth specific time interval, to obtain an analysis result corresponding to the nth specific time interval; wherein any two adjacent specific time intervals are different in length, n is a positive integer greater than or equal to 1 and less than or equal to N, and N is the total number of specific time intervals;
the third processing unit is configured to generate the first data based on a plurality of the analysis results.
Optionally, the second processing unit is further configured to:
determining a dilation rate corresponding to the nth specific time interval in the dilated convolutional neural network model;
and performing, by using the dilated convolutional neural network model, correlation analysis between pixels of different images and between different pixels of the same image based on the dilation rate corresponding to the nth specific time interval, to obtain an analysis result corresponding to the nth specific time interval.
Optionally, the first processing module includes: a fourth processing unit and a fifth processing unit, wherein:
the fourth processing unit is configured to perform, through the same model, correlation analysis between pixels of different images and between different pixels of the same image based on an n'th specific spatial interval, to obtain an analysis result corresponding to the n'th specific spatial interval; wherein any two adjacent specific spatial intervals are different in length, n' is a positive integer greater than or equal to 1 and less than or equal to N', and N' is the total number of specific spatial intervals; and each image of the plurality of images is a two-dimensional image;
the fifth processing unit is configured to generate the first data based on a plurality of the analysis results.
Optionally, the fourth processing unit is further configured to:
determining the dilation rate corresponding to the n'th specific spatial interval in the dilated convolutional neural network model;
and performing, by using the dilated convolutional neural network model, correlation analysis between pixels of different images and between different pixels of the same image based on the dilation rate corresponding to the n'th specific spatial interval, to obtain an analysis result corresponding to the n'th specific spatial interval.
Optionally, the fifth processing unit is further configured to:
and weighting the plurality of analysis results through a connection layer connected to the dilated convolutional neural network model to obtain the first data.
Optionally, the second processing module includes: a sixth processing unit, a first determining unit, and a seventh processing unit, wherein:
the sixth processing unit is configured to process the first data through a deep neural network model to obtain feature information associated with the attribute information of the object to be identified in each of the plurality of images; wherein there are a plurality of pieces of attribute information, the plurality of pieces of attribute information correspond to multi-scale feature information, and each piece of attribute information is characterized by attribute feature parameters of the object to be identified in a single image;
the first determining unit is configured to determine a plurality of pieces of feature information having a position mapping relationship among the plurality of pieces of feature information;
the seventh processing unit is configured to fuse the plurality of feature information having the position mapping relationship to obtain the second data.
Optionally, the third processing module includes: a second determining unit, a third determining unit, and an eighth processing unit, wherein:
the second determining unit is used for determining a first mark point of a target object in the object to be identified in a first direction and length information in the first direction based on the first data;
the third determining unit is configured to determine, based on the second data, a second marker point of the target object in a second direction, first boundary information in the second direction, and second boundary information in the second direction;
the eighth processing unit is configured to obtain positioning information of the target object based on the first marker point, the length information, the second marker point, the first boundary information, and the second boundary information; the characteristic of the object to be identified comprises the positioning information, and an included angle between the first direction and the second direction is 90 degrees.
Optionally, the eighth processing unit is further configured to:
determining a size of the target object in the first direction based on the first marker point and the length information;
determining a plane position of the target object in the second direction based on the second marker point, the first boundary information and the second boundary information;
and obtaining the positioning information based on the size and the plane position.
Optionally, the eighth processing unit is further configured to:
determining a localization area from the object to be identified based on the size and the planar position by a deep neural network model;
obtaining a confidence that the positioning region comprises the target object through the deep neural network model;
determining the location area and the confidence as the location information.
Optionally, the eighth processing unit is further configured to:
if the confidence coefficient meets a threshold range, determining that the positioning area comprises the target object;
and if the confidence coefficient does not meet the threshold range, determining that the positioning area does not comprise the target object.
A terminal, the terminal comprising: a processor, a memory, and a communication bus;
the communication bus is used for realizing communication connection between the processor and the memory;
the processor is used for executing the information processing program stored in the memory so as to realize the steps in the information processing method provided by the embodiment of the disclosure.
A storage medium storing one or more programs, which are executable by one or more processors to implement steps in an information processing method provided by an embodiment of the present disclosure.
The information processing method, the information processing device, the terminal and the storage medium provided by the embodiment of the disclosure are used for obtaining an image sequence formed by a plurality of images; wherein at least one image of the plurality of images comprises an object to be identified; analyzing the correlation among the plurality of images to obtain first data associated with the form of the object to be recognized; determining second data associated with attribute information of the object to be identified based on the first data; obtaining features of an object to be identified in the image sequence based on at least the first data and the second data; the problem that in the related art, when an object to be recognized is analyzed, an accurate analysis result is difficult to obtain is solved, the accuracy of the analysis result is ensured, and the intelligent degree of information processing is improved.
Drawings
Fig. 1 is a schematic flowchart of an information processing method according to an embodiment of the present disclosure;
fig. 2 is a schematic flow chart of another information processing method provided by the embodiment of the disclosure;
fig. 3 is a schematic flowchart of another information processing method provided by an embodiment of the present disclosure;
fig. 4 is a schematic architecture diagram of an infrastructure network according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an information processing apparatus according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure.
An embodiment of the present disclosure provides an information processing method, which is applied to a terminal and is shown in fig. 1, and the method includes the following steps:
step 101, obtaining an image sequence consisting of a plurality of images.
Wherein at least one image of the plurality of images comprises an object to be identified.
In the embodiment of the present disclosure, the plurality of images have temporal correlation or spatial correlation with one another. Illustratively, temporal correlation indicates that the image sequence is obtained by sampling at specific time intervals, that is, the image sequence characterizes video information; spatial correlation indicates that the image sequence is obtained by sampling at specific spatial intervals, that is, the image sequence characterizes three-dimensional information. In this way, the terminal obtains a plurality of images having spatial correlation or temporal correlation and constructs the image sequence based on the plurality of images.
Step 102, analyzing the correlation among the plurality of images to obtain first data associated with the form of the object to be recognized.
In the embodiment of the present disclosure, the terminal analyzing the correlation between the plurality of images to obtain the first data associated with the form of the object to be recognized may be implemented by the following step: performing, through the same model, correlation analysis between pixels of different images and between different pixels of the same image to obtain the first data. That is, after acquiring the image sequence, the terminal may perform, through the same model, correlation analysis between pixels of different images and between different pixels of the same image, thereby obtaining the first data associated with the form of the object to be recognized, that is, the spatial geometric form of the object to be recognized in three-dimensional space. It should be noted that, compared with the related-art approach of analyzing only a single image containing the object to be recognized, analyzing the correlation between pixels of different images and between different pixels of the same image fully captures the association between the multiple images and the detail features within each image, and on that basis obtains the first data associated with the spatial appearance of the object to be recognized, which reduces discontinuity in the analysis result and improves the accuracy of the analysis result.
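To make the idea concrete, the following minimal sketch (an illustrative assumption, not the patent's reference implementation) shows one way a single model can jointly analyze correlation between pixels of different images and between pixels of the same image: the image sequence is stacked into a 3D volume and passed through a 3D convolution whose kernel spans both the in-image axes and the inter-image axis. The class name, channel counts, and tensor sizes are invented for illustration.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: one model analyzes correlation between pixels of
# different images and between pixels of the same image by treating the
# image sequence as a 3D volume.
class SequenceCorrelationModel(nn.Module):
    def __init__(self, in_channels=1, out_channels=32):
        super().__init__()
        # the kernel spans 3 neighbouring images and a 3x3 pixel window,
        # so each output value mixes intra-image and inter-image pixels
        self.conv = nn.Conv3d(in_channels, out_channels,
                              kernel_size=(3, 3, 3), padding=1)

    def forward(self, image_sequence):
        # image_sequence: (batch, channels, num_images, height, width)
        return self.conv(image_sequence)

sequence = torch.randn(1, 1, 8, 64, 64)             # 8 images of 64 x 64 pixels
first_data = SequenceCorrelationModel()(sequence)   # "first data" feature volume
print(first_data.shape)                              # torch.Size([1, 32, 8, 64, 64])
```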
Step 103, determining second data associated with the attribute information of the object to be identified based on the first data.
Wherein the attribute information is characterized by attribute characteristic parameters of the object to be identified in the single image.
In the embodiment of the present disclosure, the attribute information of the object to be recognized includes the attribute feature parameters of the object to be recognized in each of the plurality of images, for example, a plurality of pieces of information corresponding to a plurality of attributes characterizing the object to be recognized in each image, such as contour parameters, color and brightness, and feature distribution; that is, the second data is the multi-scale feature data contained in each single image of the plurality of images. Of course, the second data associated with the attribute information of the object to be recognized may also include information within a specific range around the object to be recognized, such as association information between the object to be recognized and other objects within a certain range around it. Here, after acquiring the first data, the terminal determines, based on the first data, the second data associated with the attribute information of the object to be recognized.
And 104, obtaining the characteristics of the object to be identified in the image sequence at least based on the first data and the second data.
In the embodiment of the disclosure, after the terminal acquires the first data and the second data, the terminal processes the first data and the second data, that is, the terminal processes the first data associated with the characteristics of the object to be identified in the three-dimensional space and the second data composed of the multi-scale characteristic data on the plane, so as to obtain the characteristics of the object to be identified in the image sequence; therefore, in the process of analyzing the image sequence, not only the correlation among a plurality of ordered images containing the object to be identified is considered, but also the multi-scale characteristics of the object to be identified in the image sequence are considered, so that the accuracy of the analysis result is ensured, and the intelligence degree of information processing is improved.
The information processing method provided by the embodiment of the disclosure obtains an image sequence formed by a plurality of images; wherein at least one image of the plurality of images comprises an object to be identified; analyzing the correlation among the plurality of images to obtain first data associated with the form of the object to be recognized; determining second data associated with attribute information of the object to be identified based on the first data; obtaining features of an object to be identified in the image sequence based on at least the first data and the second data; the problem that in the related art, when an object to be recognized is analyzed, an accurate analysis result is difficult to obtain is solved, the accuracy of the analysis result is ensured, and the intelligent degree of information processing is improved.
Based on the foregoing embodiments, an embodiment of the present disclosure provides an information processing method, which is applied to a terminal and shown in fig. 2, and includes the following steps:
step 201, video information containing an object to be identified is obtained.
In the embodiment of the disclosure, during its operation the terminal performs video acquisition on the object to be identified and obtains video information containing the object to be identified.
For example, the object to be recognized may include a specific place, and the specific place may include an airport, a bus stop, an operation site, and the like. Taking an airport as an example of the specific place, the terminal obtains video information containing the airport, such as video information of an airport waiting room, and uses the video information as a reference for analyzing specific information about the airport waiting room, thereby obtaining an analysis result about that specific information; for example, an analysis result about the traffic flow or object flow of the airport waiting room can be obtained based on the video information.
For example, the specific place may also be an airport runway; the terminal obtains video information containing the airport runway and uses it as a reference for analyzing the takeoff process of an airplane on the airport runway, so as to obtain an analysis result about the takeoff process of the airplane on the airport runway.
Step 202, extracting a plurality of images from the video information in time sequence to obtain an image sequence.
Wherein each image of the plurality of images is a two-dimensional image or a three-dimensional image.
Illustratively, still taking the video information as the video information of the airport waiting room as an example, the terminal extracts a plurality of two-dimensional images from the video information according to the time sequence to obtain an image sequence; these two-dimensional images are then used to analyze the characteristics of the traffic flow or object flow in the waiting room of the airport.
Illustratively, still taking the video information as the video information of the airport runway as an example, the terminal extracts a plurality of three-dimensional images from the video information according to the time sequence to obtain an image sequence; these three-dimensional images are then used to analyze the relevant characteristics of the takeoff trajectory of the aircraft on the airport runway during a particular period of time.
In the embodiment of the disclosure, the terminal extracts a plurality of images from the video information in time sequence, the plurality of extracted images including the object to be recognized have time correlation, that is, a specific time interval exists between the plurality of extracted images including the object to be recognized, and the terminal determines the plurality of extracted images as an image sequence.
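As an illustration of this extraction step, the following sketch (using OpenCV, which is an assumption; the patent does not name a library) keeps one frame per fixed time interval so that the retained frames form a chronologically ordered image sequence. The function name and the default interval are hypothetical.

```python
import cv2  # OpenCV, assumed available for this illustration

def extract_image_sequence(video_path, interval_seconds=1.0):
    """Sketch: extract frames from the video at a fixed time interval,
    keeping them in chronological order to form the image sequence."""
    capture = cv2.VideoCapture(video_path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0       # fall back if FPS is unknown
    step = max(1, int(round(fps * interval_seconds)))
    sequence, index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0:        # keep one frame per specific time interval
            sequence.append(frame)
        index += 1
    capture.release()
    return sequence
```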
And 203, performing correlation analysis on the pixels of different images and the pixels of the same image based on the nth specific time interval through the same model to obtain an analysis result corresponding to the nth specific time interval.
The lengths of any two adjacent specific time intervals are different, n is a positive integer greater than or equal to 1 and less than or equal to N, and N is the total number of specific time intervals.
In the embodiment of the disclosure, after obtaining the image sequence, the terminal may analyze a plurality of images included in the image sequence based on a plurality of different specific time intervals to obtain an analysis result corresponding to each specific time interval. Illustratively, since the specific time intervals are different, the terminal performs correlation analysis between pixels of different images and between different pixels of the same image based on the nth specific time interval, and the correlation between the obtained images is also different.
In this embodiment of the disclosure, in step 203, the performing, through the same model, correlation analysis between pixels of different images and between different pixels of the same image based on the nth specific time interval to obtain the analysis result corresponding to the nth specific time interval may include the following steps: firstly, determining the dilation rate corresponding to the nth specific time interval in a dilated convolutional neural network model; secondly, performing, by using the dilated convolutional neural network model, correlation analysis between pixels of different images and between different pixels of the same image based on the dilation rate corresponding to the nth specific time interval, to obtain the analysis result corresponding to the nth specific time interval.
In the embodiment of the disclosure, the dilated convolutional neural network model includes a dilation rate corresponding to each specific time interval, so that when any ordered set of images is analyzed, the correlation between pixels of different images separated by a specific time interval and between different pixels of the same image can be deeply mined, and the features of the object to be recognized can be obtained more comprehensively. This avoids the omission of features caused by discontinuity when analysis is based on a single image and reduces discontinuity in the analysis result; further, in the process of analyzing the object to be recognized, the correlation distribution between the same or corresponding features along the time dimension is fully considered, so as to obtain a more accurate analysis result.
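A hedged sketch of this idea follows: each specific time interval is mapped to a dilation rate along the image axis of a 3D convolution, so that a rate r correlates pixels of images that are r frames apart as well as pixels within each image. The rates (1, 2, 4), channel count, and tensor sizes are placeholders, not values from the patent.

```python
import torch
import torch.nn as nn

# Illustrative sketch (an assumption, not taken from the patent): the dilation
# rate along the image axis plays the role of the nth specific time interval;
# a rate r correlates pixels of images that are r frames apart, as well as
# pixels within each image.
def branch_for_time_interval(rate, channels=32):
    # dilate only across the image (depth) axis; padding keeps the shape
    return nn.Conv3d(channels, channels, kernel_size=3,
                     padding=(rate, 1, 1), dilation=(rate, 1, 1))

features = torch.randn(1, 32, 16, 64, 64)                    # (batch, C, images, H, W)
analysis_results = [branch_for_time_interval(r)(features)    # one result per interval
                    for r in (1, 2, 4)]                       # placeholder rates
```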
Step 204, generating first data based on the plurality of analysis results.
In the embodiment of the present disclosure, the generating of the first data based on a plurality of analysis results in step 204 may include the following step: weighting the plurality of analysis results through a connection layer connected to the dilated convolutional neural network model to obtain the first data.
In the embodiment of the disclosure, after the terminal obtains the analysis result corresponding to each group of images, the terminal performs weighting processing on the plurality of analysis results through the connection layer connected to the dilated convolutional neural network to obtain the first data.
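Under the assumption that the weighting is realized as a learnable linear combination, the connection layer can be sketched by concatenating the per-interval analysis results along the channel axis and applying a 1 × 1 × 1 convolution; the class name and dimensions below are illustrative.

```python
import torch
import torch.nn as nn

# Hypothetical "connection layer": concatenate the analysis results of the
# individual intervals and weight them with a learnable 1x1x1 convolution,
# which amounts to a weighted combination producing the first data.
class ConnectionLayer(nn.Module):
    def __init__(self, channels=32, num_results=3):
        super().__init__()
        self.weighting = nn.Conv3d(channels * num_results, channels, kernel_size=1)

    def forward(self, analysis_results):
        stacked = torch.cat(analysis_results, dim=1)   # channel-wise concatenation
        return self.weighting(stacked)                 # weighted fusion

analysis_results = [torch.randn(1, 32, 16, 64, 64) for _ in range(3)]
first_data = ConnectionLayer()(analysis_results)        # shape (1, 32, 16, 64, 64)
```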
Step 205, processing the first data through the deep neural network model to obtain feature information associated with the attribute information of each image and the object to be identified in the plurality of images.
The attribute information includes a plurality of pieces of information corresponding to a plurality of attributes characterizing the object to be recognized in each image.
In the embodiment of the present disclosure, the first data includes data associated with the features of the object to be recognized in three-dimensional space, and the first data may be referred to as base-layer features. The terminal inputs the base-layer features into the deep neural network, and feature information of each of the plurality of images associated with the attribute information of the object to be recognized can be obtained. That is, the feature information includes features in each of the plurality of images, and the features in each image include attribute feature parameters of the object to be identified in that image.
And step 206, determining a plurality of pieces of feature information with position mapping relations in the plurality of pieces of feature information.
In the embodiment of the disclosure, after the terminal processes the first data through the deep neural network to obtain a plurality of feature information, the terminal determines the plurality of feature information having a position mapping relationship among the plurality of feature information.
And step 207, fusing the plurality of characteristic information with the position mapping relation to obtain second data.
In the embodiment of the present disclosure, the fusing, by the terminal, the multiple pieces of feature information having the position mapping relationship may include: the terminal fuses a plurality of feature information with position mapping relation into a new feature, and the new feature can be represented in a feature matrix form; and then extracting to obtain second data based on the feature matrix. It should be noted that the terminal fuses a plurality of feature information with position mapping relationships, so that more detailed features can be captured in the process of analyzing an object to be identified with a relatively small size, and omission of features is avoided; for an object to be recognized having a relatively large size, a more accurate analysis result can be obtained.
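One plausible reading of this fusion step, sketched below purely for illustration, is to resample the feature maps that share a position mapping onto a common spatial grid so that corresponding positions align, and then concatenate them into a single feature matrix from which the second data is extracted; the sizes and channel counts are assumptions.

```python
import torch
import torch.nn.functional as F

# Illustrative fusion of feature maps that share a position mapping relation:
# resize every map to a common spatial grid so corresponding positions align,
# then concatenate them into one feature matrix. Sizes are assumptions.
def fuse_mapped_features(feature_maps, target_size=(32, 32)):
    aligned = [F.interpolate(fm, size=target_size, mode="bilinear",
                             align_corners=False)
               for fm in feature_maps]
    return torch.cat(aligned, dim=1)   # new fused feature matrix

maps = [torch.randn(1, 16, 32, 32),
        torch.randn(1, 16, 16, 16),
        torch.randn(1, 16, 8, 8)]
second_data = fuse_mapped_features(maps)   # shape (1, 48, 32, 32)
```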
And 208, determining a first mark point of a target object in the object to be identified in the first direction and length information in the first direction based on the first data.
In the embodiment of the present disclosure, the target object is an object included in the object to be recognized. The terminal determines a first mark point of a target object in the object to be recognized in the first direction and length information of the target object in the first direction based on the first data, and further positions the spatial position of the target object in the object to be recognized.
For example, when the object to be recognized is a specific field, the target object may be a human body or an object contained in the specific field, and the terminal locates the target object among the objects to be recognized based on the first data.
Step 209, based on the second data, determining a second marker point of the target object in the second direction, first boundary information in the second direction, and second boundary information in the second direction.
In the embodiments of the present disclosure, the second direction is different from the first direction. Taking the spatial coordinates as an example, the first direction may be a Z-axis direction, and the second direction may be a direction in a plane enclosed by the X-axis and the Y-axis. And the terminal determines a second marking point of the target object in the second direction, first boundary information in the second direction and second boundary information in the second direction based on the second data, and further positions the cross section position of the target object in the object to be identified.
Step 210, obtaining positioning information of the target object based on the first mark point, the length information, the second mark point, the first boundary information and the second boundary information.
The characteristics of the object to be identified comprise positioning information, and an included angle between the first direction and the second direction is 90 degrees.
In this embodiment of the present disclosure, the obtaining, in step 210, the positioning information of the target object based on the first mark point, the length information, the second mark point, the first boundary information, and the second boundary information may include the following steps:
a1, determining the size of the target object in the first direction based on the first mark point and the length information.
In the embodiment of the present disclosure, still taking the first direction as the Z-axis direction as an example, the terminal determines, based on the first mark point and the length information, a size of the target object in the Z-axis direction, that is, determines a size of the target object in the Z-axis direction in the three-dimensional space, such as a starting point of the target object in the Z-axis direction and a length of the target object in the Z-axis direction.
a2, and determining the plane position of the target object in the second direction based on the second mark point, the first boundary information and the second boundary information.
In the embodiment of the present disclosure, still taking the second direction as a direction in the plane enclosed by the X axis and the Y axis as an example, the terminal determines the plane position of the target object in the second direction, that is, the plane position of the target object in the XY plane, based on the second marker point, the first boundary information, and the second boundary information.
a3, based on the size and the plane position, obtaining positioning information.
In the embodiment of the present disclosure, the a3 obtains the positioning information based on the size and the plane position, and may include the following steps:
a31, extracting a positioning area from the object to be recognized based on the size and the plane position through the deep neural network model.
In the embodiment of the disclosure, after determining the size of the target object in the Z-axis direction in the three-dimensional space and the plane position of the target object in the XY plane, the terminal extracts the positioning region from the object to be recognized through the deep neural network based on the size and the plane position, that is, a spatial rectangular box is framed in the three-dimensional space.
and a32, obtaining the confidence that the positioning area comprises the target object through the deep neural network model.
In the embodiment of the disclosure, while framing the spatial rectangular box, the terminal can also obtain, through the deep neural network, the confidence that the spatial rectangular box includes the target object. Here, the confidence characterizes the probability that the spatial rectangular box includes the target object.
a33, determining the positioning area and the confidence as the positioning information.
In the embodiment of the disclosure, the terminal determines the positioning region and the confidence as the analysis result, thereby positioning the target object within the object to be recognized and obtaining a measure of the accuracy of the positioning result.
It should be noted that, for the descriptions of the same steps and the same contents in this embodiment as those in other embodiments, reference may be made to the descriptions in other embodiments, which are not described herein again.
Based on the foregoing embodiments, an embodiment of the present disclosure provides an information processing method in which the plurality of images have spatial correlation. The method is applied to a terminal and includes the following steps:
step 301, obtaining an image sequence composed of a plurality of images.
Wherein at least one image of the plurality of images comprises an object to be identified.
In the embodiment of the disclosure, in the running process of the terminal, an image sequence formed by a plurality of images is obtained.
For example, the information processing method provided by the embodiment of the present disclosure may be applied to a scenario in which an object to be recognized is detected, such as a security check; understandably, the spatial correlation between the plurality of images in this case is related to the spacing between the plurality of images. For example, in the process of detecting the object to be recognized, images of the object to be recognized are acquired at a specific angle and a specific spacing, so as to obtain a set of sequential images. Referring to fig. 3 and 4, the terminal obtains this sequence of images as input information.
Step 302, performing, through the same model, correlation analysis between pixels of different images and between different pixels of the same image based on the n'th specific spatial interval, to obtain an analysis result corresponding to the n'th specific spatial interval.
The lengths of any two adjacent specific spatial intervals are different, n' is a positive integer greater than or equal to 1 and less than or equal to N', and N' is the total number of specific spatial intervals. Each image of the plurality of images is a two-dimensional image.
In the embodiment of the disclosure, after obtaining the image sequence, the terminal may analyze a plurality of images included in the image sequence based on a plurality of different specific spatial intervals to obtain an analysis result corresponding to each specific spatial interval. Illustratively, since the plurality of specific spatial intervals are different, the correlation between the images obtained by analyzing the plurality of images by the terminal is also different.
In this embodiment of the disclosure, in step 302, the performing, through the same model, correlation analysis between pixels of different images and between different pixels of the same image based on the n'th specific spatial interval to obtain the analysis result corresponding to the n'th specific spatial interval may include the following steps: firstly, determining the dilation rate corresponding to the n'th specific spatial interval in a dilated convolutional neural network model; secondly, performing, by using the dilated convolutional neural network model, correlation analysis between pixels of different images and between different pixels of the same image based on the dilation rate corresponding to the n'th specific spatial interval, to obtain the analysis result corresponding to the n'th specific spatial interval.
In the embodiment of the disclosure, the dilated convolutional neural network includes a dilation rate corresponding to each specific spatial interval, so that when any ordered set of images is analyzed, the correlation between pixels of different images separated by a specific spatial interval and between different pixels of the same image can be deeply mined, and the features of the object to be recognized can be obtained more comprehensively.
In the embodiment of the present disclosure, take as an example a dilated convolutional neural network that includes three dilation rates, each corresponding to one specific spatial interval. Referring to fig. 3 and 4, arrow 31 represents analyzing the plurality of images through the first dilation rate in the dilated convolutional neural network, for example, taking N1 as the first interval and analyzing the correlation between pixels of different images that are N1 apart and between different pixels of the same image, so as to obtain a first analysis result; arrow 32 represents analyzing the plurality of images through the second dilation rate, for example, taking N2 as the second interval and analyzing the correlation between pixels of different images that are N2 apart and between different pixels of the same image, so as to obtain a second analysis result; and arrow 33 represents analyzing the plurality of images through the third dilation rate, for example, taking N3 as the third interval and analyzing the correlation between pixels of different images that are N3 apart and between different pixels of the same image, so as to obtain a third analysis result.
In this way, because the dilated convolutional neural network includes a dilation rate corresponding to each specific spatial interval, when any ordered set of images is analyzed, the correlation between pixels of different images separated by a specific spatial interval and between different pixels of the same image can be deeply mined, and the features of the object to be recognized can be obtained more comprehensively; this avoids the omission of features caused by discontinuity when analysis is based on a single image and reduces discontinuity in the analysis result. Further, when the object to be recognized is analyzed, the correlation distribution between the same or corresponding features along the spatial dimension is fully considered, so as to obtain a more accurate analysis result.
Step 303, generating first data based on the plurality of analysis results.
In this embodiment of the disclosure, the generating of the first data based on a plurality of analysis results in step 303 may include the following step: weighting the plurality of analysis results through a connection layer connected to the dilated convolutional neural network to obtain the first data.
Step 304, processing the first data through the deep neural network model to obtain feature information of each image in the plurality of images associated with the attribute information of the object to be identified.
The attribute information is multiple, and the multiple attribute information corresponds to the multi-scale feature information.
In the embodiment of the disclosure, the terminal inputs the first data obtained through the dilated convolutional neural network into the deep neural network, and processes the first data through the feature extractor of the terminal to extract the second data, that is, the multi-scale features. Here, the feature extractor aims to identify feature information of the object to be identified on the cross section. The feature extractor comprises six groups of convolution blocks, Block0 to Block5. Each of Block1 to Block5 consists of two convolutional layers with kernel sizes of 1 × 1 and 3 × 3 and strides of 1 and 2, respectively; one zero is padded on each side of the feature so that the shape of the output feature of the i-th block is 2^(6-i) × 2^(6-i). Combined with the features of the 3D convolutional layer in Block0, features of six scales can be collected and then fed as one of the inputs of the output predictor of the terminal.
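The block structure described above can be sketched as follows; the channel counts, the activation functions, and the assumed 32 × 32 output of Block0 are illustrative choices, not values taken from the patent.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of one feature-extractor block (Block1 .. Block5):
# a 1x1 convolution with stride 1 followed by a 3x3 convolution with stride 2
# and one-pixel zero padding, which halves the spatial size at each block.
def make_block(in_channels, out_channels):
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=2, padding=1),
        nn.ReLU(inplace=True),
    )

feature = torch.randn(1, 64, 32, 32)        # assumed output of Block0 (2^5 x 2^5)
scales = [feature]
for _ in range(5):                          # Block1 .. Block5
    feature = make_block(feature.shape[1], 64)(feature)
    scales.append(feature)
print([f.shape[-1] for f in scales])        # [32, 16, 8, 4, 2, 1] -> six scales
```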
And 305, determining a plurality of pieces of feature information with position mapping relations in the plurality of pieces of feature information.
And step 306, fusing the plurality of characteristic information with the position mapping relation to obtain second data.
Step 307, based on the first data, determining a first mark point of a target object in the object to be identified in the first direction and length information in the first direction.
Step 308, based on the second data, determining a second marker point of the target object in the second direction, first boundary information in the second direction, and second boundary information in the second direction.
In the embodiment of the disclosure, the output predictor of the terminal receives two features as input features: one is the multi-scale feature output by the feature extractor, and the other is the base-layer feature output by the dilated convolutional neural network. The two input features are then processed through two channels in the output predictor. In the multi-scale feature channel, C2d11 predicts the position of the target object in the cross section based on the features of Block0 to Block5, obtaining (x, y, W, H), where x and y represent the second marker point in the second direction, W represents the first boundary information in the second direction, and H represents the second boundary information in the second direction. Meanwhile, C2d12 obtains the confidence of the prediction result, that is, the probability associated with the predicted position of the target object in the cross section. In the base-layer channel, the base-layer features maintain continuity between images; using them, C3d2 predicts the starting point of the target object on the Z axis and the length L, obtaining (z, L), where z represents the first marker point in the first direction and L represents the length information in the first direction.
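A minimal sketch of such a two-channel output predictor is given below; the use of fully connected heads, the feature dimensions, and the sigmoid on the confidence are assumptions made only to illustrate the two outputs (x, y, W, H) with confidence and (z, L).

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the two-channel output predictor: a multi-scale
# channel that regresses the cross-section box (x, y, W, H) plus its
# confidence, and a base-layer channel that regresses the Z-axis start
# point and length (z, L). Feature dimensions are assumptions.
class OutputPredictor(nn.Module):
    def __init__(self, multiscale_dim=128, base_dim=128):
        super().__init__()
        self.cross_section_head = nn.Linear(multiscale_dim, 4)   # x, y, W, H
        self.confidence_head = nn.Linear(multiscale_dim, 1)      # confidence c1
        self.z_axis_head = nn.Linear(base_dim, 2)                # z, L

    def forward(self, multiscale_feature, base_feature):
        box = self.cross_section_head(multiscale_feature)
        confidence = torch.sigmoid(self.confidence_head(multiscale_feature))
        z_and_length = self.z_axis_head(base_feature)
        return box, confidence, z_and_length

box, c1, z_l = OutputPredictor()(torch.randn(1, 128), torch.randn(1, 128))
```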
Step 309, obtaining the positioning information of the target object based on the first mark point, the length information, the second mark point, the first boundary information and the second boundary information.
The characteristics of the object to be identified comprise positioning information, and an included angle between the first direction and the second direction is 90 degrees.
In this embodiment of the disclosure, the step 309 of obtaining the positioning information of the target object based on the first mark point, the length information, the second mark point, the first boundary information, and the second boundary information may include the following steps:
b1, determining the size of the target object in the first direction based on the first mark point and the length information.
b2, determining the plane position of the target object in the second direction based on the second mark point, the first boundary information and the second boundary information.
b3, obtaining positioning information based on the size and the plane position.
In the embodiment of the present disclosure, the obtaining of the positioning information by b3 based on the size and the plane position may include the following steps:
b31, determining a locating area from the object to be identified based on the size and the plane position through the deep neural network model.
b32, obtaining the confidence that the positioning area comprises the target object through the deep neural network model.
b33, determining the positioning area and the confidence coefficient as the positioning information.
In the embodiment of the disclosure, if the terminal determines that the confidence meets the threshold range, it further determines that the positioning area includes the target object; if the terminal determines that the confidence does not meet the threshold range, it further determines that the positioning area does not include the target object. In this way, the identification and positioning of the target object in the object to be identified are realized.
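The combination of the two channel outputs into final positioning information, together with the confidence threshold check, can be sketched in plain Python; the threshold value of 0.5 and the tuple layout are assumptions for illustration.

```python
# Sketch only: combine the cross-section box and the Z-axis prediction into a
# cuboid and apply the confidence threshold (threshold value is an assumption).
def locate_target(box, z_and_length, confidence, threshold=0.5):
    x, y, W, H = box
    z, L = z_and_length
    contains_target = confidence >= threshold   # threshold-range check
    return (x, y, z, W, H, L), confidence, contains_target

cuboid, conf, found = locate_target((10, 20, 30, 40), (5, 12), 0.87)
print(cuboid, conf, found)                      # (10, 20, 5, 30, 40, 12) 0.87 True
```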
Based on the above, the information processing method provided by the embodiment of the present disclosure is suitable for security inspection of scenes, for positioning a specific target object in an industrial computed tomography (CT) image, and for calibrating the motion trajectory of a target object in a specific environment; of course, the method can also be used in scenarios of lesion detection or lesion localization in medical CT images.
In another embodiment of the present disclosure, the information processing method provided by the embodiments of the present disclosure may be applied to a scenario of lesion detection or lesion localization. In the related art, when a lesion is located, only a single case image of the target individual, such as a single medical CT slice, is detected, and the spatial correlation among the CT images is completely ignored; as a result, the detected lesion region may be discontinuous, a single lesion may be split and detected as multiple lesions, and more false detections and missed detections occur. In addition, there are solutions that locate lesions based on point cloud and RGB-D data. Their disadvantage is that point cloud data contains only information about the target surface, and detecting a target from point cloud data mostly attempts to detect and classify the target through the surface and edge information of the object, which differs from CT data in data format. RGB-D target detection techniques attempt to achieve target detection using added depth information, and a CT image may be converted into RGB-D data form by 3D reconstruction. However, after the conversion, objects other than the object to be recognized, such as organs other than the liver, for example the kidney or the pancreas, are introduced. For the detection of a lesion region such as a liver lesion region, the introduction of other organs causes noise, so that the detection accuracy is low.
By adopting the information processing method provided by the embodiment of the disclosure, based on a 3D convolutional neural network with holes (dilated convolutions), the input can be a fixed-length CT sequence and the output is the maximum circumscribed cuboid of the lesion region in the object to be identified, thereby realizing lesion localization. Meanwhile, in the embodiment of the disclosure, in order to solve the problem that different CT frames represent different spatial thicknesses, a convolutional neural network model with holes is provided to capture multi-scale tumor features. By adopting the information processing method provided by the embodiment of the disclosure, the spatial correlation of the CT images can be fully captured, the method adapts to multi-scale tumor features, and discontinuity in the detection results is reduced.
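For illustration only, the following is a minimal sketch, assuming PyTorch (which the disclosure does not name), of how a 3D convolution with a hole (dilation) rate of 2 along the Z axis enlarges the receptive field across CT slices without adding parameters; the tensor sizes are toy values.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 16, 64, 64)  # toy CT sequence: batch 1, 1 channel, 16 slices of 64x64

# Ordinary 3x3x3 convolution: each output voxel sees 3 neighbouring slices.
conv_dense = nn.Conv3d(1, 8, kernel_size=3, padding=1, dilation=1)

# Hole (dilated) 3x3x3 convolution with rate 2 along Z: each output voxel sees
# slices z-2, z, z+2, covering thicker anatomy with the same parameter count.
conv_dilated = nn.Conv3d(1, 8, kernel_size=3, padding=(2, 1, 1), dilation=(2, 1, 1))

print(conv_dense(x).shape)    # torch.Size([1, 8, 16, 64, 64])
print(conv_dilated(x).shape)  # torch.Size([1, 8, 16, 64, 64])
```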
Referring to fig. 3 and fig. 4, liver lesion examination is taken as an example, in which the lesion is located in an abdominal CT image. The CT sequence is acquired before positioning: an abdominal CT image is obtained by a CT machine that scans the abdominal area from bottom to top. During a scan, a series of CT slices over the cross section is acquired at equal spacing, with values in Hounsfield units (HU); a CT sequence showing the condition of the abdominal organs is obtained, in which each cross section has a size of 512 × 512 and the sequence comprises D images. Thus, a CT sequence of size 512 × 512 × D is constructed.
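For illustration only, the following sketch shows how such a 512 × 512 × D input volume might be assembled from D cross-sectional slices; reading real scanner data is omitted, and random Hounsfield-unit-like values stand in for it.

```python
import numpy as np

D = 48  # number of cross-sectional slices in the fixed-length sequence

# One 512 x 512 image per cross-section; random values stand in for HU data.
slices = [np.random.randint(-1000, 1000, size=(512, 512), dtype=np.int16)
          for _ in range(D)]

volume = np.stack(slices, axis=-1)  # ordered from the lower abdomen upwards
print(volume.shape)                 # (512, 512, 48)
```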
In the embodiment of the present disclosure, when the terminal processes a CT sequence, the processing may be divided into the following three modules: a basic network module, a feature extractor module and an output predictor module. Here, the basic network module is configured to process the input information, such as the CT sequence, to obtain base-layer features representing the form of the object to be recognized in three-dimensional space; the feature extractor module is used to process the base-layer features to obtain the multi-scale features of the object to be identified; the base-layer features are then input into a base-layer channel of the output predictor module, and the multi-scale features are input into a multi-scale feature channel of the output predictor module. The multi-scale feature channel processes the multi-scale features to obtain the positioning information of the object to be identified on the cross section and the confidence corresponding to the positioning information, and the base-layer channel processes the base-layer features to obtain the starting point and length information of the object to be identified in the Z-axis direction. Finally, the terminal determines the spatial range of the target object in the object to be recognized based on the positioning information on the cross section, the starting point in the Z-axis direction and the length information, together with the confidence that the spatial range includes the target object. It should be noted that, when the base network processes three-dimensional images and two-dimensional images, different convolution parameters are adopted, for example, different convolution kernel sizes.
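For illustration only, the following is a schematic skeleton of the three modules described above, assuming PyTorch; the specific layers, channel counts and global pooling are assumptions of this sketch, not the disclosure's architecture.

```python
import torch
import torch.nn as nn

class BaseNetwork(nn.Module):
    """Basic network module: CT sequence -> base-layer features."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv3d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, 3, padding=(2, 1, 1), dilation=(2, 1, 1)), nn.ReLU(),
        )

    def forward(self, ct):               # ct: (B, 1, D, H, W)
        return self.layers(ct)

class FeatureExtractor(nn.Module):
    """Feature extractor module: base-layer features -> multi-scale features."""
    def __init__(self):
        super().__init__()
        self.branch_a = nn.Conv3d(64, 64, 3, padding=1, dilation=1)
        self.branch_b = nn.Conv3d(64, 64, 3, padding=2, dilation=2)

    def forward(self, base):
        return torch.cat([self.branch_a(base), self.branch_b(base)], dim=1)

class OutputPredictor(nn.Module):
    """Output predictor module with a multi-scale channel and a base-layer channel."""
    def __init__(self):
        super().__init__()
        self.multiscale_channel = nn.Linear(128, 5)   # (x, y, W, H, c1)
        self.base_channel = nn.Linear(64, 2)          # (z, L)

    def forward(self, base, multiscale):
        ms = multiscale.mean(dim=(2, 3, 4))           # global pooling, for brevity
        bs = base.mean(dim=(2, 3, 4))
        return self.multiscale_channel(ms), self.base_channel(bs)

ct = torch.randn(1, 1, 8, 64, 64)        # toy stand-in for a (1, 1, D, 512, 512) sequence
base = BaseNetwork()(ct)
multi = FeatureExtractor()(base)
plane, axial = OutputPredictor()(base, multi)
print(plane.shape, axial.shape)          # torch.Size([1, 5]) torch.Size([1, 2])
```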
In practical application, the CT sequence is used as the input information of the basic network; base features are extracted by the basic network with 3D hole (dilated) convolutions, and the feature extractor module extracts multi-scale features. The multi-scale features and the base-layer features are fed simultaneously into the output predictor, which has two channels. The first channel, the multi-scale channel, uses the multi-scale information to predict the plane position and size of the lesion area in the object to be identified, i.e., (x, y, W, H), and gives a confidence c1 of the prediction. The second channel, the base-layer channel, uses the base-layer features to predict the position and length of the lesion region in the Z-axis direction, i.e., (z, L). The output results of the two channels are combined to obtain the final output (x, y, z, W, H, L) and c2, i.e., the maximum circumscribed cuboid of the lesion region and the confidence that the maximum circumscribed cuboid is the lesion region. The base network may be a 3D Visual Geometry Group (VGG) network or a residual neural network; as shown in fig. 4, a corresponding example diagram is given by taking the 3D-VGG-based base network as an example. Illustratively, for the convolution parameters (128, 3 × 3 × 3, 1, 1, 1, [1,1,2]), the number of convolution kernels is 128, the size of each convolution kernel is 3 × 3 × 3, the stride is 1 in each dimension, and the hole (dilation) rate is [1,1,2], i.e., the hole rate in the Z direction is 2. In this way, accurate localization of the lesion region in the object to be identified is achieved.
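For illustration only, the following sketch (again assuming PyTorch) shows a convolution layer with the quoted parameters (128 kernels, 3 × 3 × 3, stride 1, hole rate 2 along Z) and the combination of the two channel outputs into (x, y, z, W, H, L) with a confidence; the input channel count and the way c2 is derived from c1 are assumptions of this example.

```python
import torch
import torch.nn as nn

# 128 kernels of size 3x3x3, stride 1, dilation 2 along the Z axis only.
# PyTorch orders the spatial dims as (D, H, W), so the Z dilation comes first;
# the input channel count (64) is assumed for the example.
conv = nn.Conv3d(in_channels=64, out_channels=128, kernel_size=3,
                 stride=1, padding=(2, 1, 1), dilation=(2, 1, 1))
print(conv.weight.shape)  # torch.Size([128, 64, 3, 3, 3])

def combine(plane_pred, axial_pred):
    """plane_pred = (x, y, W, H, c1) from the multi-scale channel,
    axial_pred = (z, L) from the base-layer channel."""
    x, y, w, h, c1 = plane_pred
    z, length = axial_pred
    cuboid = (x, y, z, w, h, length)  # maximum circumscribed cuboid
    c2 = c1                           # confidence of the cuboid (illustrative choice)
    return cuboid, c2

print(combine((120.0, 200.0, 64.0, 48.0, 0.83), (30.0, 12.0)))
```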
It should be noted that, for the descriptions of the same steps and the same contents in this embodiment as those in other embodiments, reference may be made to the descriptions in other embodiments, which are not described herein again.
The embodiment of the present disclosure provides an information processing apparatus. Fig. 5 is a schematic diagram of the composition structure of the information processing apparatus according to the embodiment of the present disclosure. As shown in fig. 5, the information processing apparatus 4 includes: a first obtaining module 41, a first processing module 42, a second processing module 43 and a third processing module 44; wherein:
the first obtaining module is used for obtaining an image sequence composed of a plurality of images; wherein at least one image of the plurality of images comprises an object to be identified;
the first processing module is used for performing correlation analysis between pixels of different images and between different pixels of the same image through the same model to obtain first data associated with the form of the object to be recognized;
the second processing module is used for determining second data related to attribute information of the object to be recognized based on the first data, and the attribute information is characterized by attribute characteristic parameters of the object to be recognized in a single image;
and the third processing module is used for obtaining the characteristics of the object to be identified in the image sequence at least based on the first data and the second data.
In other embodiments, the first obtaining module comprises: a first acquisition unit and a first processing unit, wherein:
a first acquisition unit configured to acquire video information including an object to be identified;
the first processing unit is used for extracting a plurality of images from the video information in time sequence to obtain an image sequence; wherein each image of the plurality of images is a two-dimensional image or a three-dimensional image.
In other embodiments, the first processing module comprises: a second processing unit and a third processing unit, wherein:
the second processing unit is used for performing correlation analysis between pixels of different images and between different pixels of the same image based on the nth specific time interval through the same model to obtain an analysis result corresponding to the nth specific time interval; the intervals of two adjacent specific time intervals are different, n is a positive integer which is greater than or equal to 1 and less than N, and N is the total number of the specific time intervals;
a third processing unit for generating the first data based on the plurality of analysis results.
In other embodiments, the second processing unit is further configured to:
determining a void rate corresponding to the nth specific time interval in the void convolutional neural network model;
and performing correlation analysis between pixels of different images and between different pixels of the same image on the basis of the void rate corresponding to the nth specific time interval by using a void convolution neural network model to obtain an analysis result corresponding to the nth specific time interval.
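For illustration only, the following sketch, assuming PyTorch, maps each specific interval to its own hole rate and runs one branch of a hole convolutional neural network per interval to produce the corresponding analysis results; the interval-to-rate mapping and the channel sizes are assumptions of this example.

```python
import torch
import torch.nn as nn

dilation_rates = {1: 1, 2: 2, 3: 4}        # nth specific interval -> hole rate (assumed)

branches = nn.ModuleDict({
    str(n): nn.Conv3d(1, 16, kernel_size=3, padding=r, dilation=r)
    for n, r in dilation_rates.items()
})

sequence = torch.randn(1, 1, 8, 64, 64)    # toy image sequence (B, C, D, H, W)
analysis_results = [branches[str(n)](sequence) for n in dilation_rates]
print([r.shape for r in analysis_results]) # each torch.Size([1, 16, 8, 64, 64])
```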
In other embodiments, the first processing module comprises: a fourth processing unit and a fifth processing unit, wherein:
the fourth processing unit is used for performing correlation analysis between pixels of different images and between different pixels of the same image based on the n'th specific space interval through the same model to obtain an analysis result corresponding to the n'th specific space interval; the intervals of two adjacent specific space intervals are different, n' is a positive integer which is greater than or equal to 1 and smaller than N', and N' is the total number of the specific space intervals; wherein each image of the plurality of images is a two-dimensional image;
a fifth processing unit for generating the first data based on the plurality of analysis results.
In other embodiments, the fourth processing unit is further configured to:
determining the void rate corresponding to the nth specific space interval in the void convolutional neural network model;
and performing correlation analysis between pixels of different images and between different pixels of the same image on the basis of the void rate corresponding to the nth specific space interval by using a void convolution neural network model to obtain an analysis result corresponding to the nth specific space interval.
In other embodiments, the fifth processing unit is further configured to:
and weighting the plurality of analysis results through a connecting layer connected with the cavity convolutional neural network model to obtain first data.
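For illustration only, the following sketch, assuming PyTorch, realizes the connecting layer as a 1 × 1 × 1 convolution that weights the concatenated analysis results of several hole rates to produce the first data; the disclosure does not fix the exact operator, so this is one possible reading.

```python
import torch
import torch.nn as nn

branch_outputs = [torch.randn(1, 16, 8, 32, 32) for _ in range(3)]  # three hole-rate branches
concat = torch.cat(branch_outputs, dim=1)                           # (1, 48, 8, 32, 32)

connecting_layer = nn.Conv3d(48, 16, kernel_size=1)                 # learned per-channel weighting
first_data = connecting_layer(concat)
print(first_data.shape)                                             # torch.Size([1, 16, 8, 32, 32])
```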
In other embodiments, the second processing module comprises: a sixth processing unit, a first determining unit, and a seventh processing unit, wherein:
the sixth processing unit is used for processing the first data through the deep neural network model to obtain feature information of each image in the plurality of images, wherein the feature information is associated with the attribute information of the object to be identified; there are a plurality of pieces of attribute information, and the plurality of pieces of attribute information correspond to multi-scale feature information;
a first determination unit configured to determine a plurality of pieces of feature information having a position mapping relationship among the plurality of pieces of feature information;
and the seventh processing unit is used for fusing the plurality of characteristic information with the position mapping relation to obtain second data.
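For illustration only, the following sketch, assuming PyTorch, fuses two pieces of feature information that have a position mapping relationship by resampling the coarser map onto the finer map's grid and merging them; interpolation followed by addition is an assumption of this example, not the operator fixed by the disclosure.

```python
import torch
import torch.nn.functional as F

fine = torch.randn(1, 32, 16, 64, 64)   # feature information at a fine scale
coarse = torch.randn(1, 32, 8, 32, 32)  # feature information at a coarser scale

# Resize the coarse map so that its positions map onto the fine map, then merge.
coarse_up = F.interpolate(coarse, size=fine.shape[2:], mode="trilinear",
                          align_corners=False)
second_data = fine + coarse_up           # fused multi-scale feature (second data)
print(second_data.shape)                 # torch.Size([1, 32, 16, 64, 64])
```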
In other embodiments, the third processing module comprises: a second determining unit, a third determining unit, and an eighth processing unit, wherein:
a second determination unit, configured to determine, based on the first data, a first mark point of a target object in the object to be identified in the first direction and length information in the first direction;
a third determining unit, configured to determine, based on the second data, a second marker point of the target object in the second direction, first boundary information in the second direction, and second boundary information in the second direction;
the eighth processing unit is used for obtaining the positioning information of the target object based on the first mark point, the length information, the second mark point, the first boundary information and the second boundary information; the characteristics of the object to be identified comprise positioning information, and an included angle between the first direction and the second direction is 90 degrees.
In other embodiments, the eighth processing unit is further configured to:
determining the size of the target object in the first direction based on the first mark point and the length information;
determining the plane position of the target object in the second direction based on the second mark point, the first boundary information and the second boundary information;
based on the size and the plane position, positioning information is obtained.
In other embodiments, the eighth processing unit is further configured to:
determining a positioning area from the object to be recognized based on the size and the plane position through a deep neural network model;
obtaining the confidence degree that the positioning area comprises the target object through a deep neural network model;
the location area and confidence are determined as location information.
In other embodiments, the eighth processing unit is further configured to:
if the confidence coefficient accords with the threshold range, determining that the positioning area comprises the target object;
and if the confidence coefficient does not meet the threshold range, determining that the positioning area does not comprise the target object.
It should be noted that, for the descriptions of the same steps and the same contents in this embodiment as those in other embodiments, reference may be made to the descriptions in other embodiments, which are not described herein again.
Based on the foregoing embodiments, an embodiment of the present disclosure provides a terminal to which the information processing method provided in the embodiments corresponding to fig. 1 to fig. 2 may be applied. As shown in fig. 6, the terminal 5 includes: a processor 51, a memory 52 and a communication bus 53;
the communication bus 53 is used for realizing communication connection between the processor 51 and the memory 52;
the processor 51 is configured to execute the information processing program stored in the memory 52 to implement the following steps:
obtaining an image sequence consisting of a plurality of images; wherein at least one image of the plurality of images comprises an object to be identified;
performing correlation analysis between pixels of different images and between different pixels of the same image through the same model to obtain first data associated with the form of the object to be recognized;
determining second data associated with attribute information of the object to be identified based on the first data, the attribute information being characterized by attribute characteristic parameters of the object to be identified in the single image;
based on at least the first data and the second data, a feature of an object to be identified in the sequence of images is obtained.
In other embodiments, obtaining an image sequence of a plurality of images comprises:
obtaining video information containing an object to be identified;
extracting a plurality of images from the video information according to the time sequence to obtain an image sequence; wherein each image of the plurality of images is a two-dimensional image or a three-dimensional image.
In other embodiments, performing a correlation analysis between pixels of different images and between different pixels of the same image by the same model to obtain first data associated with a morphology of an object to be recognized includes:
performing correlation analysis between pixels of different images and between different pixels of the same image based on the nth specific time interval through the same model to obtain an analysis result corresponding to the nth specific time interval; the intervals of two adjacent specific time intervals are different, n is a positive integer which is greater than or equal to 1 and less than N, and N is the total number of the specific time intervals;
based on the plurality of analysis results, first data is generated.
In other embodiments, performing, by the same model, correlation analysis between pixels of different images and between different pixels of the same image based on the nth specific time interval to obtain an analysis result corresponding to the nth specific time interval includes:
determining a void rate corresponding to the nth specific time interval in the void convolutional neural network model;
and performing correlation analysis between pixels of different images and between different pixels of the same image on the basis of the void rate corresponding to the nth specific time interval by using a void convolution neural network model to obtain an analysis result corresponding to the nth specific time interval.
In other embodiments, performing a correlation analysis between pixels of different images and between different pixels of the same image by the same model to obtain first data associated with a morphology of an object to be recognized includes:
performing correlation analysis between pixels of different images and between different pixels of the same image based on the n'th specific space interval through the same model to obtain an analysis result corresponding to the n'th specific space interval; the intervals of two adjacent specific space intervals are different, n' is a positive integer which is greater than or equal to 1 and smaller than N', and N' is the total number of the specific space intervals; wherein each image of the plurality of images is a two-dimensional image;
based on the plurality of analysis results, first data is generated.
In other embodiments, performing, by the same model, correlation analysis between pixels of different images and between different pixels of the same image based on the nth specific spatial interval to obtain an analysis result corresponding to the nth specific spatial interval includes:
determining the void rate corresponding to the nth specific space interval in the void convolutional neural network model;
and performing correlation analysis between pixels of different images and between different pixels of the same image on the basis of the void rate corresponding to the nth specific space interval by using a void convolution neural network model to obtain an analysis result corresponding to the nth specific space interval.
In other embodiments, generating the first data based on the plurality of analysis results includes:
and weighting the plurality of analysis results through a connecting layer connected with the cavity convolutional neural network model to obtain first data.
In other embodiments, determining second data associated with attribute information of the object to be identified based on the first data comprises:
processing the first data through a deep neural network model to obtain feature information associated with attribute information of each image and the object to be identified in the plurality of images; the attribute information is multiple, and the multiple attribute information corresponds to multi-scale characteristic information;
determining a plurality of characteristic information with position mapping relation in the plurality of characteristic information;
and fusing the plurality of characteristic information with the position mapping relation to obtain second data.
In other embodiments, deriving features of the object to be identified in the sequence of images based on at least the first data and the second data comprises:
determining a first mark point of a target object in the object to be recognized in the first direction and length information in the first direction based on the first data;
determining a second marker point of the target object in a second direction, first boundary information in the second direction and second boundary information in the second direction based on the second data;
obtaining positioning information of the target object based on the first mark point, the length information, the second mark point, the first boundary information and the second boundary information; the characteristics of the object to be identified comprise positioning information, and an included angle between the first direction and the second direction is 90 degrees.
In other embodiments, obtaining the positioning information of the target object based on the first mark point, the length information, the second mark point, the first boundary information, and the second boundary information includes:
determining the size of the target object in the first direction based on the first mark point and the length information;
determining the plane position of the target object in the second direction based on the second mark point, the first boundary information and the second boundary information;
based on the size and the plane position, positioning information is obtained.
In other embodiments, based on the size and the planar position, positioning information is derived, including:
determining a positioning area from the object to be recognized based on the size and the plane position through a deep neural network model;
obtaining the confidence degree that the positioning area comprises the target object through a deep neural network model;
the location area and confidence are determined as location information.
It should be noted that, for the descriptions of the same steps and the same contents in this embodiment as those in other embodiments, reference may be made to the descriptions in other embodiments, which are not described herein again.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only for the preferred embodiment of the present disclosure, and is not intended to limit the scope of the present disclosure.

Claims (27)

1. An information processing method, the method comprising:
obtaining an image sequence consisting of a plurality of images; wherein at least one image of the plurality of images comprises an object to be identified;
analyzing the correlation among the plurality of images to obtain first data associated with the form of the object to be recognized;
determining second data associated with attribute information of the object to be identified based on the first data; the attribute information is characterized by the attribute characteristic parameters of the object to be identified in the single image;
obtaining features of the object to be identified in the image sequence based on at least the first data and the second data.
2. The method of claim 1, the obtaining an image sequence of a plurality of images, comprising:
obtaining video information containing the object to be identified;
extracting a plurality of images from the video information according to the time sequence to obtain the image sequence; wherein each image of the plurality of images is a two-dimensional image or a three-dimensional image.
3. The method of claim 2, said analyzing correlations between said plurality of images resulting in first data associated with a morphology of said object to be identified, comprising:
and carrying out correlation analysis on the pixels of different images and the pixels of the same image through the same model to obtain the first data.
4. The method of claim 3, wherein the analyzing the correlation between pixels of different images and between different pixels of the same image by the same model to obtain the first data comprises:
performing correlation analysis between pixels of different images and between different pixels of the same image based on an nth specific time interval through the same model to obtain an analysis result corresponding to the nth specific time interval; the intervals of two adjacent specific time intervals are different, n is a positive integer which is more than or equal to 1 and less than N, and N is the total number of the specific time intervals;
generating the first data based on a plurality of the analysis results.
5. The method of claim 4, wherein the performing, by the same model, correlation analysis between pixels of the different images and between different pixels of the same image based on an nth specific time interval to obtain an analysis result corresponding to the nth specific time interval comprises:
determining a void rate corresponding to the nth specific time interval in the void convolutional neural network model;
and performing correlation analysis between the pixels of different images and between different pixels of the same image by using the hole convolution neural network model based on the hole rate corresponding to the nth specific time interval to obtain an analysis result corresponding to the nth specific time interval.
6. The method of claim 3, wherein the analyzing the correlation between pixels of different images and between different pixels of the same image by the same model to obtain the first data comprises:
performing correlation analysis between pixels of different images and between different pixels of the same image based on the n'th specific space interval through the same model to obtain an analysis result corresponding to the n'th specific space interval; the intervals of two adjacent specific space intervals are different, n' is a positive integer which is greater than or equal to 1 and smaller than N', and N' is the total number of the specific space intervals; wherein each image of the plurality of images is a two-dimensional image;
generating the first data based on a plurality of the analysis results.
7. The method of claim 6, wherein the performing, by the same model, correlation analysis between pixels of the different images and between different pixels of the same image based on an nth specific spatial interval to obtain an analysis result corresponding to the nth specific spatial interval comprises:
determining the void rate corresponding to the nth specific space interval in the void convolutional neural network model;
and performing correlation analysis between pixels of different images and between different pixels of the same image on the basis of the void rate corresponding to the nth specific space interval by using the void convolution neural network model to obtain an analysis result corresponding to the nth specific space interval.
8. The method of claim 5 or 7, the generating the first data based on a plurality of the analysis results, comprising:
and weighting the analysis results through a connecting layer connected with a cavity convolution neural network model to obtain the first data.
9. The method of claim 8, the determining second data associated with attribute information of the object to be identified based on the first data, comprising:
processing the first data through a deep neural network model to obtain feature information associated with the attribute information of each image in the plurality of images and the object to be identified; the attribute information is multiple, and the multiple attribute information corresponds to multi-scale characteristic information; each attribute information is characterized by the attribute characteristic parameters of the object to be identified in the single image;
determining a plurality of characteristic information with position mapping relation in the plurality of characteristic information;
and fusing the plurality of characteristic information with the position mapping relation to obtain the second data.
10. The method of claim 9, said deriving features of the object to be identified in the sequence of images based on at least the first data and the second data, comprising:
determining a first mark point of a target object in the object to be identified in a first direction and length information in the first direction based on the first data;
determining a second marker point of the target object in a second direction, first boundary information in the second direction, and second boundary information in the second direction based on the second data;
obtaining positioning information of the target object based on the first mark point, the length information, the second mark point, the first boundary information and the second boundary information; wherein the characteristics of the object to be identified include the positioning information; the included angle between the first direction and the second direction is 90 degrees.
11. The method of claim 10, wherein the deriving positioning information of the target object based on the first marker point, the length information, the second marker point, the first boundary information, and the second boundary information comprises:
determining a size of the target object in the first direction based on the first marker point and the length information;
determining a plane position of the target object in the second direction based on the second marker point, the first boundary information and the second boundary information;
and obtaining the positioning information based on the size and the plane position.
12. The method of claim 11, said deriving the positioning information based on the size and the planar position, comprising:
determining a localization area from the object to be identified based on the size and the planar position by a deep neural network model;
obtaining a confidence that the positioning region comprises the target object through the deep neural network model;
determining the location area and the confidence as the location information.
13. An information processing apparatus, the information processing apparatus comprising: the device comprises a first acquisition module, a first processing module, a second processing module and a third processing module, wherein:
the first acquisition module is used for acquiring an image sequence formed by a plurality of images; wherein at least one image of the plurality of images comprises an object to be identified;
the first processing module is used for analyzing the correlation among the plurality of images to obtain first data associated with the form of the object to be recognized;
the second processing module is used for determining second data associated with attribute information of the object to be identified based on the first data, wherein the attribute information is characterized by attribute characteristic parameters of the object to be identified in a single image;
the third processing module is configured to obtain a feature of the object to be recognized in the image sequence based on at least the first data and the second data.
14. The apparatus of claim 13, the first acquisition module comprising: a first acquisition unit and a first processing unit, wherein:
the first obtaining unit is used for obtaining video information containing the object to be identified;
the first processing unit is used for extracting a plurality of images from the video information in a time sequence to obtain the image sequence; wherein each image of the plurality of images is a two-dimensional image or a three-dimensional image.
15. The apparatus of claim 14, wherein the first processing module is further configured to perform correlation analysis between pixels of different images and between different pixels of the same image through the same model to obtain the first data.
16. The apparatus of claim 15, the first processing module comprising: a second processing unit and a third processing unit, wherein:
the second processing unit is used for performing correlation analysis on the pixels of different images and the pixels of the same image based on an nth specific time interval through the same model to obtain an analysis result corresponding to the nth specific time interval; the intervals of two adjacent specific time intervals are different, n is a positive integer which is more than or equal to 1 and less than N, and N is the total number of the specific time intervals;
the third processing unit is configured to generate the first data based on a plurality of the analysis results.
17. The apparatus as recited in claim 16, said second processing unit to further:
determining a void rate corresponding to the nth specific time interval in the void convolutional neural network model;
and performing correlation analysis between the pixels of different images and between different pixels of the same image by using the hole convolution neural network model based on the hole rate corresponding to the nth specific time interval to obtain an analysis result corresponding to the nth specific time interval.
18. The apparatus of claim 15, the first processing module comprising: a fourth processing unit and a fifth processing unit, wherein:
the fourth processing unit is configured to perform correlation analysis between pixels of the different images and between different pixels of the same image based on the n'th specific spatial interval through the same model, so as to obtain an analysis result corresponding to the n'th specific spatial interval; the intervals of two adjacent specific space intervals are different, n' is a positive integer which is greater than or equal to 1 and smaller than N', and N' is the total number of the specific space intervals; wherein each image of the plurality of images is a two-dimensional image;
the fifth processing unit is configured to generate the first data based on a plurality of the analysis results.
19. The apparatus as recited in claim 18, said fourth processing unit to further:
determining the void rate corresponding to the nth specific space interval in the void convolutional neural network model;
and performing correlation analysis between pixels of different images and between different pixels of the same image on the basis of the void rate corresponding to the nth specific space interval by using the void convolution neural network model to obtain an analysis result corresponding to the nth specific space interval.
20. The apparatus of claim 17 or 19, wherein, for generating the first data based on a plurality of the analysis results, the apparatus is further configured to:
and weighting the analysis results through a connecting layer connected with a cavity convolution neural network model to obtain the first data.
21. The apparatus of claim 20, the second processing module comprising: a sixth processing unit, a first determining unit, and a seventh processing unit, wherein:
the sixth processing unit is configured to process the first data through a deep neural network model to obtain feature information associated with attribute information of each of the plurality of images and the object to be identified; the attribute information is multiple, and the multiple attribute information corresponds to multi-scale characteristic information; each attribute information is characterized by the attribute characteristic parameters of the object to be identified in the single image;
the first determining unit is configured to determine a plurality of pieces of feature information having a position mapping relationship among the plurality of pieces of feature information;
the seventh processing unit is configured to fuse the plurality of feature information having the position mapping relationship to obtain the second data.
22. The apparatus of claim 21, the third processing module comprising: a second determining unit, a third determining unit, and an eighth processing unit, wherein:
the second determining unit is used for determining a first mark point of a target object in the object to be identified in a first direction and length information in the first direction based on the first data;
the third determining unit is configured to determine, based on the second data, a second marker point of the target object in a second direction, first boundary information in the second direction, and second boundary information in the second direction;
the eighth processing unit is configured to obtain positioning information of the target object based on the first marker point, the length information, the second marker point, the first boundary information, and the second boundary information; the characteristic of the object to be identified comprises the positioning information, and an included angle between the first direction and the second direction is 90 degrees.
23. The apparatus as recited in claim 22, said eighth processing unit to further:
determining a size of the target object in the first direction based on the first marker point and the length information;
determining a plane position of the target object in the second direction based on the second marker point, the first boundary information and the second boundary information;
and obtaining the positioning information based on the size and the plane position.
24. The apparatus as recited in claim 23, said eighth processing unit to further:
determining a localization area from the object to be identified based on the size and the planar position by a deep neural network model;
obtaining a confidence that the positioning region comprises the target object through the deep neural network model;
determining the location area and the confidence as the location information.
25. The apparatus as recited in claim 24, said eighth processing unit to further:
if the confidence coefficient meets a threshold range, determining that the positioning area comprises the target object;
and if the confidence coefficient does not meet the threshold range, determining that the positioning area does not comprise the target object.
26. A terminal, the terminal comprising: a memory having computer-executable instructions stored thereon, and a processor operable to perform the method steps of any of claims 1 to 12 when executing the computer-executable instructions on the memory.
27. A storage medium having stored thereon computer-executable instructions capable, when executed, of carrying out the method steps of any one of claims 1 to 12.
CN201910189975.6A 2019-03-13 2019-03-13 Information processing method, device, terminal and storage medium Active CN109977816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910189975.6A CN109977816B (en) 2019-03-13 2019-03-13 Information processing method, device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910189975.6A CN109977816B (en) 2019-03-13 2019-03-13 Information processing method, device, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN109977816A CN109977816A (en) 2019-07-05
CN109977816B true CN109977816B (en) 2021-05-18

Family

ID=67078687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910189975.6A Active CN109977816B (en) 2019-03-13 2019-03-13 Information processing method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN109977816B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108648195B (en) * 2018-05-09 2022-06-28 联想(北京)有限公司 Image processing method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108600782A (en) * 2018-04-08 2018-09-28 深圳市零度智控科技有限公司 Video super-resolution method, device and computer readable storage medium
CN109447082A (en) * 2018-08-31 2019-03-08 武汉尺子科技有限公司 A kind of scene motion Target Segmentation method, system, storage medium and equipment

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9111215B2 (en) * 2012-07-03 2015-08-18 Brain Corporation Conditional plasticity spiking neuron network apparatus and methods
CN106354816B (en) * 2016-08-30 2019-12-13 东软集团股份有限公司 video image processing method and device
CN106447625A (en) * 2016-09-05 2017-02-22 北京中科奥森数据科技有限公司 Facial image series-based attribute identification method and device
US11308350B2 (en) * 2016-11-07 2022-04-19 Qualcomm Incorporated Deep cross-correlation learning for object tracking
CN107507176B (en) * 2017-08-28 2021-01-26 京东方科技集团股份有限公司 Image detection method and system
CN108062531B (en) * 2017-12-25 2021-10-19 南京信息工程大学 Video target detection method based on cascade regression convolutional neural network
CN108388876B (en) * 2018-03-13 2022-04-22 腾讯科技(深圳)有限公司 Image identification method and device and related equipment
CN109165561A (en) * 2018-07-27 2019-01-08 北京以萨技术股份有限公司 A kind of traffic congestion recognition methods based on video features

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108600782A (en) * 2018-04-08 2018-09-28 深圳市零度智控科技有限公司 Video super-resolution method, device and computer readable storage medium
CN109447082A (en) * 2018-08-31 2019-03-08 武汉尺子科技有限公司 A kind of scene motion Target Segmentation method, system, storage medium and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Multiregion segmentation of bladder cancer structures in MRI with progressive dilated convolutional networks; Dolz, Jose et al.; Medical Physics; 2018-12-31; Vol. 45, No. 12; pp. 5482-5493 *
Two-Column Serial Dilated Convolutional Neural Network for Dense Crowd Counting; Zhao Chuanqiang et al.; Computer Knowledge and Technology; 2018-12-31; Vol. 14, No. 34; pp. 164-167 *

Also Published As

Publication number Publication date
CN109977816A (en) 2019-07-05

Similar Documents

Publication Publication Date Title
EP3961484B1 (en) Medical image segmentation method and device, electronic device and storage medium
CN109685060B (en) Image processing method and device
CN111127466B (en) Medical image detection method, device, equipment and storage medium
CN108010021B (en) Medical image processing system and method
CN107742093B (en) Real-time detection method, server and system for infrared image power equipment components
CN110008962B (en) Weak supervision semantic segmentation method based on attention mechanism
WO2018089163A1 (en) Methods and systems of performing object pose estimation
JP2012243313A (en) Image processing method and image processing device
JP7156515B2 (en) Point cloud annotation device, method and program
CN112085714A (en) Pulmonary nodule detection method, model training method, device, equipment and medium
CN109583364A (en) Image-recognizing method and equipment
EP3973507B1 (en) Segmentation for holographic images
CN109671055B (en) Pulmonary nodule detection method and device
CN114870384A (en) Taijiquan training method and system based on dynamic recognition
CN109977816B (en) Information processing method, device, terminal and storage medium
CN110992310A (en) Method and device for determining partition where mediastinal lymph node is located
CN112703531A (en) Generating annotation data for tissue images
KR20140047331A (en) Object segmentation using block clustering based on automatic initial region of interest estimation
AU2019431568B2 (en) Method and product for processing of vrds 4d medical images
CN116862920A (en) Portrait segmentation method, device, equipment and medium
CN110751034B (en) Pedestrian behavior recognition method and terminal equipment
Zeng et al. A survey of deep learning-based methods for cryo-electron tomography data analysis
CN113705432A (en) Model training and three-dimensional target detection method, device, equipment and medium
KR20200005853A (en) Method and System for People Count based on Deep Learning
CN117854155B (en) Human skeleton action recognition method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant