CN108235116A - Feature propagation method and apparatus, electronic device, program and medium - Google Patents
Feature propagation method and apparatus, electronic device, program and medium
- Publication number
- CN108235116A CN108235116A CN201711455916.6A CN201711455916A CN108235116A CN 108235116 A CN108235116 A CN 108235116A CN 201711455916 A CN201711455916 A CN 201711455916A CN 108235116 A CN108235116 A CN 108235116A
- Authority
- CN
- China
- Prior art keywords
- frame
- feature
- present frame
- level
- present
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23418—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/433—Content storage operation, e.g. storage operation in response to a pause request, caching operations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
Abstract
Embodiments of the invention disclose a feature propagation method and apparatus, an electronic device, a program, and a medium. The method includes: judging whether the current frame is a key frame; and, in response to the current frame being a non-key frame of the video, obtaining the high-level feature of the current frame from the high-level feature of the adjacent preceding key frame, according to the low-level feature of that key frame and the low-level feature of the current frame. In the neural network, the network depth of the first network layer whose extraction yields the low-level feature of the preceding key frame is shallower than that of the second network layer whose extraction yields the high-level feature of the preceding key frame. The embodiments exploit the consistency between video frames, namely that the semantic labels of adjacent frames are close, and propagate semantic video features from the adjacent preceding key frame to the current frame, which reduces repeated computation time and improves the accuracy of semantic segmentation.
Description
Technical field
The present invention relates to computer vision technology, and in particular to a feature propagation method and apparatus, an electronic device, a program, and a medium.
Background technology
Video semantic segmentation is a major problem in computer vision and video understanding. Video semantic segmentation models have important applications in many fields, such as autonomous driving, video surveillance, and video object analysis.
At present, although the semantic segmentation of still images has been studied extensively, video semantic segmentation has received comparatively little attention. Video semantic segmentation demands high real-time performance while still guaranteeing sufficient accuracy.
Summary of the invention
Embodiments of the present invention provide a feature propagation solution for video.
According to one aspect of the embodiments of the present invention, a feature propagation method is provided, including:
judging whether the current frame is a key frame;
in response to the current frame being a non-key frame of the video, obtaining the high-level feature of the current frame from the high-level feature of the adjacent preceding key frame, according to the low-level feature of that key frame and the low-level feature of the current frame; wherein, in the neural network, the network depth of the first network layer whose extraction yields the low-level feature of the preceding key frame is shallower than that of the second network layer whose extraction yields the high-level feature of the preceding key frame.
Optionally, in any of the above method embodiments of the present invention, obtaining the high-level feature of the current frame from the high-level feature of the preceding key frame, according to the low-level features of the adjacent preceding key frame and of the current frame, includes:
obtaining, from the low-level feature of the adjacent preceding key frame and the low-level feature of the current frame, transform weights that map the low-level feature of the preceding key frame to the low-level feature of the current frame;
converting the high-level feature of the preceding key frame into the high-level feature of the current frame, according to the high-level feature of the preceding key frame and the transform weights.
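The two propagation steps just described can be sketched as follows. The patent does not specify how the transform weights are computed, so `predict_weights` below is a toy stand-in (a real implementation would use a small learned network), and the per-pixel scaling is a simplifying assumption for illustration.

```python
import numpy as np

def predict_weights(low_key, low_cur):
    """Toy stand-in for the learned weight prediction: map the pair of
    low-level feature maps (C, H, W) to per-pixel transform weights in
    (0, 1). Pixels whose low-level features agree get weights near 1."""
    diff = np.abs(low_key - low_cur).mean(axis=0)  # (H, W), mean over channels
    return 1.0 / (1.0 + diff)

def propagate_high_level(high_key, low_key, low_cur):
    """Transform the cached key-frame high-level feature into an estimate
    of the current frame's high-level feature using the weights."""
    w = predict_weights(low_key, low_cur)          # (H, W)
    return high_key * w[None, :, :]                # broadcast over channels

rng = np.random.default_rng(0)
low_key = rng.standard_normal((16, 8, 8))
high_key = rng.standard_normal((64, 8, 8))

# If the current frame's low-level feature equals the key frame's, the
# transform weights are all 1 and the high-level feature is copied as-is.
high_cur = propagate_high_level(high_key, low_key, low_key.copy())
assert np.allclose(high_cur, high_key)
```

The sketch only shows the data flow: low-level features decide the weights, and the weights transform the cached high-level feature without re-running the deep layers.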
Optionally, in any of the above method embodiments of the present invention, in response to the current frame being a non-key frame of the video, the method further includes:
performing semantic segmentation on the current frame based at least on its high-level feature, to obtain the semantic label of the current frame.
Optionally, in any of the above method embodiments of the present invention, performing semantic segmentation on the current frame based at least on its high-level feature includes:
performing semantic segmentation on the current frame based on both its low-level feature and its high-level feature, to obtain the semantic label of the current frame.
Optionally, in any of the above method embodiments of the present invention, performing semantic segmentation on the current frame based on its low-level and high-level features includes:
converting the low-level feature of the current frame to obtain a feature whose channel count matches that of the current frame's high-level feature;
splicing or fusing the converted feature with the high-level feature of the current frame to obtain the current-frame feature;
performing semantic segmentation on the current frame based on the current-frame feature.
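The channel conversion and splicing steps above can be sketched as follows. The learned 1x1 convolution that usually performs channel conversion is simulated here by a plain projection matrix `proj`, which, like the shapes, is an illustrative assumption.

```python
import numpy as np

def convert_channels(low_feat, proj):
    """1x1-convolution-style channel conversion: project the low-level
    feature's channels to the high-level channel count. `proj` stands in
    for learned 1x1 conv weights, shape (C_high, C_low)."""
    c_low, h, w = low_feat.shape
    return (proj @ low_feat.reshape(c_low, -1)).reshape(-1, h, w)

def fuse(low_feat, high_feat, proj):
    """Splice the converted low-level feature with the high-level feature
    along the channel axis to form the current-frame feature."""
    converted = convert_channels(low_feat, proj)
    assert converted.shape == high_feat.shape  # channel counts now match
    return np.concatenate([converted, high_feat], axis=0)

rng = np.random.default_rng(1)
low = rng.standard_normal((16, 8, 8))
high = rng.standard_normal((64, 8, 8))
proj = rng.standard_normal((64, 16))

fused = fuse(low, high, proj)
assert fused.shape == (128, 8, 8)  # channels doubled after splicing
```

A fusion variant would add the converted feature to the high-level feature elementwise instead of concatenating, which keeps the channel count unchanged.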
Optionally, in any of the above method embodiments of the present invention, judging whether the current frame is a key frame includes:
judging whether the current frame is a key frame using a key-frame scheduling strategy.
Optionally, in any of the above method embodiments of the present invention, judging whether the current frame is a key frame using a key-frame scheduling strategy includes: judging whether the current frame is a key frame using a fixed-length scheduling method;
in response to the current frame being a non-key frame of the video, the method further includes: performing feature extraction on the current frame to obtain its low-level feature.
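Fixed-length scheduling reduces to a modulus test on the frame index; the interval value below is an arbitrary choice for illustration.

```python
def is_key_frame(frame_index: int, interval: int = 5) -> bool:
    """Fixed-length scheduling: every `interval`-th frame is a key frame,
    starting from frame 0. All other frames are non-key frames."""
    return frame_index % interval == 0

# With an interval of 5, frames 0, 5, 10, ... run the full network;
# the rest receive propagated features.
assert [i for i in range(12) if is_key_frame(i, 5)] == [0, 5, 10]
```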
Optionally, in any of the above method embodiments of the present invention, judging whether the current frame is a key frame using a key-frame scheduling strategy includes:
performing feature extraction on the current frame to obtain its low-level feature;
obtaining, from the low-level feature of the preceding key frame and the low-level feature of the current frame, a scheduling probability value that the current frame should be scheduled as a key frame;
determining whether the current frame is scheduled as a key frame according to its scheduling probability value.
Optionally, in any of the above method embodiments of the present invention, obtaining the scheduling probability value that the current frame should be scheduled as a key frame, from the low-level features of the preceding key frame and of the current frame, includes:
splicing the low-level feature of the preceding key frame with the low-level feature of the current frame to obtain a spliced feature;
obtaining, via a key-frame scheduling network and based on the spliced feature, the scheduling probability value of whether the current frame should be scheduled as a key frame.
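The adaptive scheduling path above can be sketched as follows. The key-frame scheduling network is reduced here to global average pooling plus a single linear unit; `w` and `b` stand in for learned parameters and the 0.5 threshold is an illustrative assumption.

```python
import numpy as np

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + np.exp(-x))

def schedule_probability(low_key, low_cur, w, b):
    """Toy stand-in for the key-frame scheduling network: splice the two
    low-level features along channels, pool to a vector, and map the
    vector to a single probability via a linear unit and a sigmoid."""
    spliced = np.concatenate([low_key, low_cur], axis=0)  # channel splice
    pooled = spliced.mean(axis=(1, 2))                     # global average pool
    return sigmoid(float(w @ pooled + b))

def should_be_key(prob: float, threshold: float = 0.5) -> bool:
    """Threshold the scheduling probability to reach a key-frame decision."""
    return prob >= threshold

rng = np.random.default_rng(2)
low_key = rng.standard_normal((16, 8, 8))
low_cur = rng.standard_normal((16, 8, 8))
w = rng.standard_normal(32)  # 16 + 16 channels after splicing
b = 0.0

p = schedule_probability(low_key, low_cur, w, b)
assert 0.0 < p < 1.0
```

The intent of such a network is that the probability rises as the current frame drifts away from the cached key frame, triggering a fresh full-network pass.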
Optionally, any of the above method embodiments of the present invention further includes:
in response to the current frame being a key frame of the video, performing feature extraction on the current frame to obtain and cache its low-level feature;
performing feature extraction on the low-level feature of the current frame to obtain and cache its high-level feature.
Optionally, any of the above method embodiments of the present invention further includes:
in response to the current frame being a key frame of the video, performing semantic segmentation on the current frame based on its high-level feature, to obtain the semantic label of the current frame.
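Putting the key-frame and non-key-frame branches together, the per-frame loop can be sketched as below. The stand-in `shallow_net`, `deep_net`, and `segment` functions are hypothetical placeholders for the shallow backbone layers, the deep layers, and the segmentation head, and the propagation step is shown as a plain copy rather than the weighted transform.

```python
import numpy as np

def shallow_net(frame):
    """Stand-in for the shallow (low-level) part of the backbone."""
    return frame * 0.5

def deep_net(low_feat):
    """Stand-in for the deep (high-level) part of the backbone."""
    return low_feat.sum(axis=0, keepdims=True)

def segment(high_feat):
    """Stand-in for the segmentation head: per-pixel labels."""
    return (high_feat[0] > 0).astype(int)

def run_video(frames, interval=2):
    """Per-frame loop: key frames run the full network and cache their
    low- and high-level features; non-key frames run only the shallow
    layers and reuse the cached high-level feature."""
    cache = {}
    labels = []
    for i, frame in enumerate(frames):
        low = shallow_net(frame)               # always cheap to compute
        if i % interval == 0:                  # key frame: full pass + cache
            high = deep_net(low)
            cache['low'], cache['high'] = low, high
        else:                                  # non-key frame: propagate
            high = cache['high']               # real method: weighted transform
        labels.append(segment(high))
    return labels

frames = [np.full((3, 4, 4), v, dtype=float) for v in (1.0, -1.0, 2.0)]
labels = run_video(frames, interval=2)
assert len(labels) == 3
assert labels[1].tolist() == labels[0].tolist()  # frame 1 reused frame 0's feature
```

The deep layers run only on key frames, which is where the claimed saving in repeated computation comes from.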
According to another aspect of the embodiments of the present invention, a feature propagation apparatus is provided, including:
a judgment module for judging whether the current frame is a key frame;
a feature propagation module for, according to the judgment module's result and in response to the current frame being a non-key frame of the video, obtaining the high-level feature of the current frame from the high-level feature of the adjacent preceding key frame, according to the low-level features of that key frame and of the current frame; wherein, in the neural network, the network depth of the first network layer whose extraction yields the low-level feature of the preceding key frame is shallower than that of the second network layer whose extraction yields the high-level feature of the preceding key frame.
Optionally, in any of the above apparatus embodiments of the present invention, the feature propagation module is specifically configured to:
obtain, from the low-level feature of the preceding key frame and the low-level feature of the current frame, transform weights that map the low-level feature of the preceding key frame to the low-level feature of the current frame; and
convert the high-level feature of the preceding key frame into the high-level feature of the current frame, according to that high-level feature and the transform weights.
Optionally, any of the above apparatus embodiments of the present invention further includes:
a semantic segmentation module for, according to the judgment module's result and in response to the current frame being a non-key frame of the video, performing semantic segmentation on the current frame based at least on its high-level feature, to obtain the semantic label of the current frame.
Optionally, in any of the above apparatus embodiments of the present invention, when performing semantic segmentation on the current frame based at least on its high-level feature, the semantic segmentation module is specifically configured to: perform semantic segmentation on the current frame based on both its low-level and high-level features.
Optionally, in any of the above apparatus embodiments of the present invention, when performing semantic segmentation on the current frame based on its low-level and high-level features, the semantic segmentation module is specifically configured to:
convert the low-level feature of the current frame to obtain a feature whose channel count matches that of the current frame's high-level feature;
splice or fuse the converted feature with the high-level feature of the current frame to obtain the current-frame feature; and
perform semantic segmentation on the current frame based on the current-frame feature.
Optionally, in any of the above apparatus embodiments of the present invention, the judgment module is specifically configured to judge whether the current frame is a key frame using a key-frame scheduling strategy.
Optionally, in any of the above apparatus embodiments of the present invention, the judgment module is specifically configured to judge whether the current frame is a key frame using a fixed-length scheduling method;
the apparatus further includes:
a first feature extraction module for, according to the judgment module's result and in response to the current frame being a non-key frame of the video, performing feature extraction on the current frame to obtain its low-level feature.
Optionally, any of the above apparatus embodiments of the present invention further includes:
a first feature extraction module for performing feature extraction on the current frame to obtain its low-level feature;
an acquisition module for obtaining, from the low-level feature of the adjacent preceding key frame and the low-level feature of the current frame, the scheduling probability value that the current frame should be scheduled as a key frame;
the judgment module being specifically configured to determine whether the current frame is scheduled as a key frame according to its scheduling probability value.
Optionally, in any of the above apparatus embodiments of the present invention, the acquisition module includes:
a splicing unit for splicing the low-level feature of the preceding key frame with the low-level feature of the current frame to obtain a spliced feature;
a key-frame scheduling network for obtaining, based on the spliced feature, the scheduling probability value of whether the current frame should be scheduled as a key frame.
Optionally, in any of the above apparatus embodiments of the present invention, the first feature extraction module is further configured to, according to the judgment module's result and in response to the current frame being a key frame of the video, perform feature extraction on the current frame to obtain and cache its low-level feature;
the apparatus further includes:
a second feature extraction module for performing feature extraction on the low-level feature of the key frame to obtain and cache its high-level feature.
Optionally, in any of the above apparatus embodiments of the present invention, the semantic segmentation module is further configured to, according to the judgment module's result and in response to the current frame being a key frame of the video, perform semantic segmentation on the current frame based on its high-level feature, to obtain the semantic label of the current frame.
According to yet another aspect of the embodiments of the present invention, an electronic device is provided, including the feature propagation apparatus of any of the above embodiments of the present invention.
According to yet another aspect of the embodiments of the present invention, another electronic device is provided, including:
a processor and the feature propagation apparatus of any of the above embodiments of the present invention;
wherein, when the processor runs the feature propagation apparatus, the units in the feature propagation apparatus of any of the above embodiments of the present invention are run.
According to yet another aspect of the embodiments of the present invention, another electronic device is provided, including a processor and a memory;
the memory stores at least one executable instruction that causes the processor to perform the operations of each step of the feature propagation method of any of the above embodiments of the present invention.
According to yet another aspect of the embodiments of the present invention, a computer program is provided, including computer-readable code, wherein, when the computer-readable code runs on a device, a processor in the device executes instructions for implementing each step of the feature propagation method of any of the above embodiments of the present invention.
According to yet another aspect of the embodiments of the present invention, a computer-readable medium is provided for storing computer-readable instructions which, when executed, carry out the operations of each step of the feature propagation method of any of the above embodiments of the present invention.
With the feature propagation method and apparatus, electronic device, program, and medium provided by the above embodiments of the present invention, when the current frame is a non-key frame of the video, the high-level feature of the current frame is obtained from the high-level feature of the adjacent preceding key frame, according to the low-level features of that key frame and of the current frame, so that semantic segmentation of the non-key frame can be based on that high-level feature. The embodiments exploit the consistency between video frames, namely that the semantic labels of adjacent frames are close, and propagate the high-level feature used for video semantic segmentation from the adjacent preceding key frame to the current frame, so that the current frame is segmented based on its own high-level feature without extracting that feature from every consecutive frame of the video. Compared with frame-by-frame extraction of the high-level feature for semantic segmentation, this reduces repeated computation time. In addition, the embodiments propagate the high-level feature of the preceding key frame to the current frame for semantic segmentation rather than directly propagating semantic labels; compared with propagating key-frame semantic labels by optical flow, this improves the accuracy of semantic segmentation.
The technical solutions of the present invention are described in further detail below with reference to the drawings and embodiments.
Description of the drawings
The drawings, which form a part of the specification, illustrate embodiments of the present invention and, together with the description, explain the principles of the invention.
The present invention can be understood more clearly from the following detailed description with reference to the drawings, in which:
Fig. 1 is a flowchart of one embodiment of the feature propagation method of the present invention.
Fig. 2 is a flowchart of another embodiment of the feature propagation method of the present invention.
Fig. 3 is a flowchart of yet another embodiment of the feature propagation method of the present invention.
Fig. 4 is a structural diagram of one embodiment of the feature propagation apparatus of the present invention.
Fig. 5 is a structural diagram of another embodiment of the feature propagation apparatus of the present invention.
Fig. 6 is a structural diagram of yet another embodiment of the feature propagation apparatus of the present invention.
Fig. 7 is a structural diagram of one application embodiment of the electronic device of the present invention.
Detailed description of the embodiments
Various exemplary embodiments of the present invention are now described in detail with reference to the drawings. It should be noted that, unless otherwise specified, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the invention.
It should also be understood that, for ease of description, the sizes of the parts shown in the drawings are not drawn according to actual proportional relationships.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the invention, its application, or its uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and apparatus should be regarded as part of the specification.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
Embodiments of the present invention can be applied to computer systems/servers, which can operate together with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations suitable for use with computer systems/servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and distributed cloud computing environments including any of the above systems, and the like.
Computer systems/servers can be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. In general, program modules can include routines, programs, target programs, components, logic, data structures, and the like, which perform specific tasks or implement specific abstract data types. Computer systems/servers can be implemented in distributed cloud computing environments, where tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules can be located in local or remote computing system storage media that include storage devices.
In implementing the present invention, the inventors discovered through research that, in one existing video semantic segmentation method, a model designed for image semantic segmentation is applied directly to video; since consecutive video frames contain much redundant information, frame-by-frame processing fails to exploit this redundancy, resulting in high computational complexity. In another existing method, optical flow is used to propagate features from key frames to non-key frames: the semantic label of a key frame is computed with a deep neural network, the optical flow between the key frame and the current frame, i.e., the per-pixel displacement vectors between them, is then computed with a smaller network, and the semantic label is propagated from the key frame to the current frame through the optical flow, i.e., the key frame's semantic label is warped by the per-pixel motion vectors to obtain the semantic label of the current frame. However, object motion in the video may cause jitter or blur in the images, making the computed optical flow inaccurate and thereby reducing semantic segmentation accuracy.
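The flow-based label propagation described above, which the invention contrasts itself with, can be sketched as a backward warp of the key frame's label map by the per-pixel displacement field. Nearest-neighbor lookup with border clamping is used here purely for illustration; real systems interpolate.

```python
import numpy as np

def warp_labels(key_labels, flow):
    """Warp a key frame's per-pixel label map (H, W) to the current frame
    using a per-pixel displacement field `flow` (H, W, 2), where
    flow[..., 0] is the x displacement and flow[..., 1] the y displacement.
    Backward warp: each current pixel reads its label from the key-frame
    position it came from, with nearest-neighbor rounding and clamping."""
    h, w = key_labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys - flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs - flow[..., 0]).astype(int), 0, w - 1)
    return key_labels[src_y, src_x]

labels = np.array([[0, 0, 1, 1]] * 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0  # every pixel moved one column to the right

warped = warp_labels(labels, flow)
assert warped[:, 1:].tolist() == labels[:, :-1].tolist()  # labels shifted right
```

The drawback the specification points out is visible in this formulation: any error in `flow` directly misplaces the hard label boundaries, whereas propagating features and re-segmenting lets the segmentation head absorb small feature errors.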
Fig. 1 is a flowchart of one embodiment of the feature propagation method of the present invention. As shown in Fig. 1, the feature propagation method of this embodiment includes:
102: judge whether the current frame is a key frame.
For example, a key-frame scheduling strategy can be used to judge whether the current frame is a key frame.
104: in response to the current frame being a non-key frame of the video, obtain the high-level feature of the current frame from the high-level feature of the adjacent preceding key frame, according to the low-level features of that key frame and of the current frame.
Here, in the neural network, the network depth of the first network layer whose extraction yields the low-level features of the preceding key frame and of the current frame is shallower than the network depth of the second network layer that performs feature extraction on the low-level feature to obtain the high-level feature.
In the embodiments of the present invention, the neural network includes two or more network layers of different network depths. Among the network layers of the neural network, a layer that performs feature extraction may be called a feature layer. After the neural network receives a frame, the first feature layer performs feature extraction on the input frame and feeds the result to the second feature layer; from the second feature layer onward, each feature layer in turn performs feature extraction on the features it receives and passes the extracted features to the next layer, until the features used for semantic segmentation are obtained. The network depth of each feature layer follows the order of feature extraction in the neural network; according to network depth, from shallow to deep, the feature layers used for feature extraction may be divided into two parts, low-level feature layers and high-level feature layers, i.e. the first network layer and the second network layer mentioned above. The features finally output after the low-level feature layers perform feature extraction in turn are called low-level features, and the features finally output after the high-level feature layers perform feature extraction in turn are called high-level features. Compared with a shallower feature layer in the same neural network, a deeper feature layer has a larger receptive field and pays more attention to spatial structure information, so the features it extracts make semantic segmentation more accurate; however, the deeper the network, the higher the computational difficulty and complexity. In practical applications, the feature layers of the neural network may be divided into low-level feature layers and high-level feature layers according to a preset criterion, such as the amount of computation, and this criterion may be adjusted according to actual demand. For example, for a neural network comprising 101 sequentially connected feature layers, the 1st to 30th of the first 100 feature layers (other quantities are also possible) may be taken as the low-level feature layers according to the preset criterion, and the 31st to 100th, i.e. the rear 70 feature layers, as the high-level feature layers. As another example, a pyramid scene parsing network (Pyramid Scene Parsing Network, PSPN) may include four convolutional sub-networks (conv1 to conv4) and a classification layer, each sub-network in turn including multiple convolutional layers. According to the amount of computation, the convolutional layers from conv1 to conv4_3 in the PSPN, accounting for about 1/8 of its computation, may be taken as the low-level feature layers, and the convolutional layers from conv4_4 up to (but not including) the final classification layer, accounting for about 7/8 of its computation, as the high-level feature layers; the classification layer performs semantic segmentation on the high-level features output by the high-level feature layers.
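The split described above can be stated concretely. The following is a minimal sketch, assuming the feature layers can be represented as an ordered list of names and split at a preset point (the layer names and the conv4_3 split point follow the PSPN example in the text; the helper `split_feature_layers` is illustrative, not part of any real library):

```python
def split_feature_layers(layer_names, split_after):
    """Divide an ordered list of feature layers into the low-level part
    (up to and including `split_after`) and the high-level part (the rest)."""
    idx = layer_names.index(split_after) + 1
    return layer_names[:idx], layer_names[idx:]

# Toy stand-in for the PSPN layer ordering described in the text
# (the classification layer is kept apart from the feature layers).
layers = ["conv1", "conv2", "conv3", "conv4_1", "conv4_2", "conv4_3",
          "conv4_4", "conv4_5", "conv5"]

low, high = split_feature_layers(layers, "conv4_3")
print(low)   # ['conv1', 'conv2', 'conv3', 'conv4_1', 'conv4_2', 'conv4_3']
print(high)  # ['conv4_4', 'conv4_5', 'conv5']
```

The split point is a design knob: moving it deeper makes the shared low-level computation more expensive but gives the propagation step richer features to work with.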
Based on the feature propagation method provided by the above embodiment of the present invention, when the current frame is a non-key frame in the video, the high-level features of the current frame are obtained from the high-level features of the adjacent previous key frame according to the low-level features of that key frame and of the current frame, so that semantic segmentation may be performed on the non-key frame based on those high-level features. The embodiments of the present invention exploit the continuity between video frames and the fact that the semantic labels of adjacent frames are close: the high-level features used for video semantic segmentation are propagated from the adjacent previous key frame to the current frame, and semantic segmentation of the current frame is performed on the propagated high-level features, without extracting the high-level features for semantic segmentation frame by frame over consecutive video frames. Compared with extracting those high-level features frame by frame, this reduces repeated computation time. In addition, the embodiments of the present invention propagate the high-level features of the previous key frame to the current frame for semantic segmentation, rather than propagating semantic labels directly; compared with propagating the key frame's semantic labels by optical flow, this improves the accuracy of semantic segmentation.
In one embodiment of the embodiments of the present invention, in operation 104, obtaining the high-level features of the current frame from the high-level features of the previous key frame according to the low-level features of the adjacent previous key frame and the low-level features of the current frame may include:
obtaining, according to the low-level features of the adjacent previous key frame and the low-level features of the current frame, conversion weights that transform the low-level features of the previous key frame into the low-level features of the current frame; and
converting, according to the high-level features of the previous key frame and the conversion weights, the high-level features of the previous key frame into the high-level features of the current frame; this feature is the feature propagated from the previous key frame, and is also called the propagation feature.
In an optional example, the conversion weights that transform the low-level features of the previous key frame into the low-level features of the current frame may be obtained by multiple convolutional layers.
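One plausible reading of this conversion step is a spatially variant weighting: every position of the current frame gets its own set of weights over a small neighborhood of the key frame's high-level feature map. The NumPy sketch below shows only the application of such weights, under the assumption that they are per-pixel k×k kernels already normalized over the neighborhood (in the embodiment they would be predicted by several convolutional layers from the two frames' low-level features); the function name and weight layout are illustrative, not the patented implementation:

```python
import numpy as np

def propagate_features(high_key, weights, k=3):
    """high_key: (C, H, W) high-level features of the previous key frame.
    weights:  (k*k, H, W) per-pixel conversion weights, assumed to be
    normalized over the k*k neighborhood.
    Returns the (C, H, W) propagation features of the current frame."""
    C, H, W = high_key.shape
    pad = k // 2
    padded = np.pad(high_key, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    offsets = [(dy, dx) for dy in range(k) for dx in range(k)]
    out = np.zeros_like(high_key)
    for i, (dy, dx) in enumerate(offsets):
        # each output pixel is a weighted sum over a k x k window of the
        # key frame's high-level features, with position-specific weights
        out += weights[i] * padded[:, dy:dy + H, dx:dx + W]
    return out

# Sanity check: uniform weights on a constant map reproduce the constant.
high = np.full((2, 4, 4), 5.0)
w = np.full((9, 4, 4), 1.0 / 9.0)
print(np.allclose(propagate_features(high, w), 5.0))  # True
```

Because the weights depend on both frames' low-level features, the propagation can adapt to motion and appearance change, unlike a fixed copy of the key frame's features.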
In another embodiment of the feature propagation method of the present invention, the method may further include: in response to the current frame being a non-key frame in the video, performing semantic segmentation on the current frame based on at least the high-level features of the current frame, to obtain the semantic labels of the current frame.
In one embodiment, performing semantic segmentation on the current frame based on at least its high-level features may include: performing semantic segmentation on the current frame based on both its low-level features and its high-level features, to obtain the semantic labels of the current frame.
In practical applications, the number of channels of the high-level features output by the second network layer is typically greater than the number of channels of the low-level features output by the first network layer. In order to fuse the low-level features and the high-level features of the current frame, in an optional example, performing semantic segmentation on the current frame based on its low-level features and high-level features may include:
converting the low-level features of the current frame to obtain features whose number of channels is consistent with that of the high-level features of the current frame;
splicing or fusing the converted features with the high-level features of the current frame to obtain the current frame features; and
performing semantic segmentation on the current frame based on the current frame features.
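The channel conversion and splicing can be sketched as follows, modeling the conversion as a 1×1 convolution (a matrix over channels) — an assumed, illustrative choice; the patent does not fix the conversion operator, and `fuse_low_high` and `proj` are hypothetical names:

```python
import numpy as np

def fuse_low_high(low, high, proj):
    """low:  (C_low, H, W) low-level features of the current frame.
    high: (C_high, H, W) high-level features of the current frame.
    proj: (C_high, C_low) weights of a 1x1 convolution converting the
    low-level channel count to match the high-level one.
    Returns the (2*C_high, H, W) current-frame features via splicing."""
    converted = np.tensordot(proj, low, axes=([1], [0]))  # (C_high, H, W)
    return np.concatenate([converted, high], axis=0)

low = np.ones((3, 2, 2))
high = np.zeros((4, 2, 2))
proj = np.ones((4, 3))
print(fuse_low_high(low, high, proj).shape)  # (8, 2, 2)
```

Splicing (concatenation) doubles the channel count; the alternative mentioned in the text, fusion, could instead add the converted features to the high-level features element-wise, keeping the channel count unchanged.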
In the above embodiment of the present invention, the high-level features obtained from the high-level features of the previous key frame are fused with the features of the current frame for semantic segmentation, without using a single-frame model of large computational cost to obtain the features of the non-key frame, which ensures the accuracy of semantic segmentation while reducing the amount of computation.
In addition, in a further embodiment of the feature propagation method of the present invention, the high-level features of each non-key frame after the previous key frame may also be cached. When the current frame is a non-key frame, the converted features and high-level features of the current frame are spliced or fused with the high-level features of the previous key frame and of each non-key frame between the previous key frame and the current frame, to obtain the current frame features, and semantic segmentation is performed on the current frame based on the current frame features.
Based on this embodiment, all the cached high-level features between the previous key frame and the current frame can be propagated to the current frame and spliced or fused for semantic segmentation, so that a more robust semantic segmentation effect can be obtained at a minimal fusion cost.
In one embodiment of the embodiments of the present invention, the key frame scheduling strategy may be a fixed-length scheduling method, for example judging one key frame every 1 to 5 frames; that is, whether the current frame is a key frame may be judged using the fixed-length scheduling method.
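Fixed-length scheduling reduces to a modulus test on the frame index. A minimal sketch, assuming frame indices start at 0 and the first frame is always a key frame (the function name is illustrative):

```python
def is_key_frame(frame_index, interval):
    """Fixed-length scheduling: one key frame every `interval` frames
    (the text suggests an interval of roughly 1 to 5)."""
    return frame_index % interval == 0

# With an interval of 5, frames 0, 5, 10, ... are key frames.
print([i for i in range(12) if is_key_frame(i, 5)])  # [0, 5, 10]
```

An interval of 1 degenerates to running the full single-frame model on every frame; larger intervals trade accuracy for speed, which motivates the adaptive scheduling of Fig. 3 below.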
Fig. 2 is a flow chart of another embodiment of the feature propagation method of the present invention. As shown in Fig. 2, the feature propagation method of this embodiment includes:
202, judging whether the current frame is a key frame using the fixed-length scheduling method.
If the current frame is a key frame, operation 212 is performed; otherwise, if the current frame is a non-key frame in the video, operation 204 is performed.
204, performing feature extraction on the current frame (also called the current non-key frame) to obtain the low-level features of the current frame.
In one example of the embodiments of the present invention, feature extraction may be performed on the current frame by the low-level feature layers of the neural network (i.e., the first network layer) to obtain the low-level features of the current frame.
206, obtaining, according to the low-level features of the previous key frame adjacent to the current frame and the low-level features of the current frame, conversion weights that transform the low-level features of the previous key frame into the low-level features of the current frame.
The conversion weights may be a transformation matrix between the low-level features of the previous key frame and those of the current frame, including pixel-wise conversion units between corresponding feature points of the two low-level feature maps.
208, converting the high-level features of the previous key frame into the high-level features of the current frame, according to the high-level features of the previous key frame and the conversion weights.
210, performing semantic segmentation on the current frame based on its low-level features and high-level features, to obtain the semantic labels of the current frame.
The flow ends after the semantic segmentation of operation 210, and the subsequent operations of this embodiment are not performed.
212, performing feature extraction on the current frame (also called the current key frame) to obtain and cache its low-level features.
In one example, feature extraction may be performed on the current frame by the low-level feature layers of the neural network (i.e., the first network layer).
214, performing feature extraction on the low-level features of the current frame to obtain and cache its high-level features.
In one example, feature extraction may be performed on the low-level features of the current frame by the high-level feature layers of the neural network (i.e., the second network layer).
216, performing semantic segmentation on the current frame based on its high-level features, to obtain the semantic labels of the current frame.
In the embodiments of the present invention, key frames and non-key frames may share the low-level feature layers of the neural network for low-level feature extraction. A PSPN may be used as the neural network here, which may include four convolutional sub-networks (conv1 to conv4) and a classification layer, each sub-network in turn including multiple convolutional layers. The low-level feature layers of the neural network may include the convolutional layers from conv1 to conv4_3 in the PSPN, accounting for about 1/8 of its computation; the high-level feature layers of the neural network may include the convolutional layers from conv4_4 up to the final classification layer, accounting for about 7/8 of its computation, and are used to extract the high-level features of key frames. The classification layer identifies, based on the high-level features of a key frame or non-key frame, the category of at least one pixel in that frame, thereby realizing semantic segmentation of the key frame or non-key frame.
In the embodiments of the present invention, for a key frame, a single-frame model of large computational cost, such as a PSPN, may be invoked to perform semantic segmentation, so as to obtain a high-precision segmentation result. For a non-key frame, the high-level features of the key frame may be adaptively propagated to the current frame to obtain the high-level features of the current frame, making full use of the continuity between consecutive video frames and avoiding repeated computation time; semantic segmentation is then performed on the current frame based on its low-level features and high-level features, to obtain the semantic labels of the current frame. This embodiment ensures the semantic segmentation precision of key frames without requiring the costly single-frame model to perform semantic segmentation on non-key frames frame by frame, which reduces computational complexity and computation time and saves computing resources.
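The two-branch flow of Fig. 2 can be summarized in one step function. This is a structural sketch only: all the callables are illustrative stand-ins (`extract_low` and `extract_high` play the roles of the first and second network layers, `propagate` the conversion of operations 206-208, `segment` the classification layer), and none of them names a real API:

```python
def process_frame(frame, state, extract_low, extract_high, propagate, segment,
                  is_key):
    """One step of the Fig. 2 flow; `state` caches the previous key
    frame's low- and high-level features between calls."""
    low = extract_low(frame)                              # operations 204 / 212
    if is_key:
        high = extract_high(low)                          # operation 214
        state["key_low"], state["key_high"] = low, high   # cache key frame
    else:
        # operations 206-208: convert the cached key-frame high-level
        # features using both frames' low-level features
        high = propagate(state["key_low"], low, state["key_high"])
    return segment(low, high)                             # operations 210 / 216

# Toy numeric stand-ins: propagation adds the low-level feature difference.
state = {}
seg = process_frame(1, state, lambda f: 2 * f, lambda l: l + 1,
                    lambda kl, l, kh: kh + (l - kl), lambda l, h: (l, h),
                    is_key=True)
print(seg)  # (2, 3)
```

Note that both branches share `extract_low`, matching the statement above that key frames and non-key frames share the low-level feature layers; only the key-frame branch pays for the expensive high-level extraction.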
Fig. 3 is a flow chart of yet another embodiment of the feature propagation method of the present invention. As shown in Fig. 3, the feature propagation method of this embodiment includes:
302, performing feature extraction on the current frame to obtain its low-level features.
In one example of the embodiments of the present invention, feature extraction may be performed on the current frame by the low-level feature layers of the neural network (i.e., the first network layer) to obtain the low-level features of the current frame.
304, obtaining, according to the low-level features of the previous key frame adjacent to the current frame and the low-level features of the current frame, a scheduling probability value that the current frame is scheduled as a key frame.
In one example, the low-level features of the previous key frame and of the current frame may be spliced, and the resulting spliced features input to a key frame dispatch network, which obtains from the spliced features the scheduling probability value of whether the current frame should be scheduled as a key frame.
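The intuition is that the more the current frame's low-level features deviate from the key frame's, the more likely the cached high-level features are stale and the frame should be rescheduled as a key frame. The sketch below is an illustrative stand-in only: the embodiment uses a learned dispatch network on the spliced features, whereas here the probability is simply a sigmoid of the mean absolute feature deviation, with an arbitrary threshold of 1.0 (both are assumptions, not the patented design):

```python
import math

def scheduling_probability(low_key, low_cur):
    """Stand-in for the key frame dispatch network: maps the deviation
    between the two frames' (flattened) low-level features to a
    probability that the current frame should become a key frame."""
    dev = sum(abs(a - b) for a, b in zip(low_key, low_cur)) / len(low_cur)
    return 1.0 / (1.0 + math.exp(-(dev - 1.0)))  # threshold-like squashing

p = scheduling_probability([0.0, 0.0], [0.0, 0.0])
print(p < 0.5)  # True: identical frames -> low probability of rescheduling
```

A learned network can do strictly better than such a hand-set threshold, since it can weight which feature channels and image regions matter for segmentation drift.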
306, determining whether the current frame is scheduled as a key frame according to its scheduling probability value.
If the current frame is a key frame, operation 314 is performed; otherwise, if the current frame is a non-key frame in the video, operation 308 is performed.
308, obtaining, according to the low-level features of the previous key frame adjacent to the current frame (also called the current non-key frame) and the low-level features of the current frame, conversion weights that transform the low-level features of the previous key frame into the low-level features of the current frame.
310, converting the high-level features of the previous key frame into the high-level features of the current frame, according to the high-level features of the previous key frame and the conversion weights.
312, performing semantic segmentation on the current frame based on its low-level features and high-level features, to obtain the semantic labels of the current frame.
Afterwards, the subsequent operations of this embodiment are not performed.
314, performing feature extraction on the current frame (also called the current key frame) to obtain and cache its low-level features.
In one example, feature extraction may be performed on the current frame by the low-level feature layers of the neural network (i.e., the first network layer).
316, performing feature extraction on the low-level features of the current frame to obtain and cache its high-level features.
In one example, feature extraction may be performed on the low-level features of the current frame by the high-level feature layers of the neural network (i.e., the second network layer).
318, performing semantic segmentation on the current frame based on its high-level features, to obtain the semantic labels of the current frame.
The embodiments of the present invention can be applied to automatic driving scenarios, video surveillance scenarios, and Internet entertainment products such as portrait segmentation, for example:
1. in an automatic driving scenario, targets in the video, such as people and vehicles, can be quickly segmented using the embodiments of the present invention;
2. in a video surveillance scenario, people can be quickly segmented;
3. in Internet entertainment products such as portrait segmentation, people can be quickly segmented from video frames.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be completed by program instructions directing the relevant hardware; the aforementioned program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
Fig. 4 is a structural diagram of one embodiment of the feature propagation device of the present invention. The feature propagation device of the embodiments of the present invention can be used to implement the feature propagation methods of the above embodiments. As shown in Fig. 4, the feature propagation device of this embodiment includes a judgment module and a feature propagation module, wherein:
the judgment module is configured to judge whether the current frame is a key frame; and
the feature propagation module is configured to, according to the judgment result of the judgment module and in response to the current frame being a non-key frame in the video, obtain the high-level features of the current frame from the high-level features of the adjacent previous key frame, according to the low-level features of that key frame and the low-level features of the current frame.
Here, in the neural network, the network depth of the first network layer from which the low-level features of the previous key frame are extracted is shallower than the network depth of the second network layer from which the high-level features of the previous key frame are extracted.
Based on the feature propagation device provided by the above embodiment of the present invention, when the current frame is a non-key frame in the video, the high-level features of the current frame are obtained from the high-level features of the adjacent previous key frame according to the low-level features of that key frame and of the current frame, so that semantic segmentation may be performed on the non-key frame based on those high-level features. The embodiments of the present invention exploit the continuity between video frames and the fact that the semantic labels of adjacent frames are close: the high-level features used for video semantic segmentation are propagated from the adjacent previous key frame to the current frame, and semantic segmentation of the current frame is performed on the propagated high-level features, without extracting the high-level features for semantic segmentation frame by frame over consecutive video frames. Compared with extracting those high-level features frame by frame, this reduces repeated computation time. In addition, the embodiments of the present invention propagate the high-level features of the previous key frame to the current frame for semantic segmentation, rather than propagating semantic labels directly; compared with propagating the key frame's semantic labels by optical flow, this improves the accuracy of semantic segmentation.
In one embodiment, the feature propagation module is specifically configured to: obtain, according to the low-level features of the previous key frame and of the current frame, conversion weights that transform the low-level features of the previous key frame into the low-level features of the current frame; and convert, according to the high-level features of the previous key frame and the conversion weights, the high-level features of the previous key frame into the high-level features of the current frame.
Fig. 5 is a structural diagram of another embodiment of the feature propagation device of the present invention. As shown in Fig. 5, compared with the embodiment shown in Fig. 4, the feature propagation device of this embodiment further includes a semantic segmentation module configured to, according to the judgment result of the judgment module and in response to the current frame being a non-key frame in the video, perform semantic segmentation on the current frame based on at least its high-level features, to obtain the semantic labels of the current frame.
In one embodiment, when performing semantic segmentation on the current frame based on at least its high-level features, the semantic segmentation module is specifically configured to perform semantic segmentation on the current frame based on its low-level features and high-level features.
In an optional example, when performing semantic segmentation on the current frame based on its low-level features and high-level features, the semantic segmentation module is specifically configured to: convert the low-level features of the current frame to obtain features whose number of channels is consistent with that of the high-level features of the current frame; splice or fuse the converted features with the high-level features of the current frame to obtain the current frame features; and perform semantic segmentation on the current frame based on the current frame features.
In one embodiment of the above feature propagation device embodiments of the present invention, the judgment module is specifically configured to judge whether the current frame is a key frame using a key frame scheduling strategy.
In an optional example, the judgment module is specifically configured to judge whether the current frame is a key frame using the fixed-length scheduling method. Correspondingly, referring back to Fig. 5, the feature propagation device of another embodiment may further include a first feature extraction module configured to, according to the judgment result of the judgment module and in response to the current frame being a non-key frame in the video, perform feature extraction on the current frame to obtain its low-level features.
Alternatively, referring to Fig. 6, the feature propagation device of yet another embodiment may further include a first feature extraction module and an acquisition module, wherein: the first feature extraction module is configured to perform feature extraction on the current frame to obtain its low-level features; and the acquisition module is configured to obtain, according to the low-level features of the adjacent previous key frame and of the current frame, the scheduling probability value that the current frame is scheduled as a key frame. Correspondingly, in this embodiment, the judgment module is specifically configured to determine whether the current frame is scheduled as a key frame according to its scheduling probability value.
In one embodiment, the acquisition module may include: a splicing unit configured to splice the low-level features of the previous key frame with the low-level features of the current frame to obtain spliced features; and a key frame dispatch network configured to obtain, based on the spliced features, the scheduling probability value of whether the current frame should be scheduled as a key frame.
Illustratively, in the feature propagation devices of the above embodiments, the first feature extraction module may further be configured to, according to the judgment result of the judgment module and in response to the current frame being a key frame in the video, perform feature extraction on the current frame to obtain and cache its low-level features. Referring back to Fig. 5 or Fig. 6, the feature propagation device of a further embodiment may also include a second feature extraction module configured to, according to the judgment result of the judgment module, perform feature extraction on the low-level features of the key frame to obtain and cache its high-level features.
Optionally, in the feature propagation devices of the above embodiments, the semantic segmentation module may further be configured to, according to the judgment result of the judgment module and in response to the current frame being a key frame in the video, perform semantic segmentation on the current frame based on its high-level features, to obtain the semantic labels of the current frame.
In addition, an embodiment of the present invention further provides an electronic device including the feature propagation device of any of the above embodiments of the present invention.
In addition, an embodiment of the present invention further provides another electronic device, including:
a memory for storing executable instructions; and
one or more processors for communicating with the memory to execute the executable instructions, thereby completing the operations of the feature propagation method of any of the above embodiments of the present invention.
In addition, an embodiment of the present invention further provides yet another electronic device, including a processor and the feature propagation device of any of the above embodiments of the present invention; when the processor runs the feature propagation device, the units in the feature propagation device of any of the above embodiments are run.
Fig. 7 is a structural diagram of one application embodiment of the electronic device of the present invention. Referring to Fig. 7, it shows a structural diagram of an electronic device suitable for implementing a terminal device or server of the embodiments of the present application. As shown in Fig. 7, the electronic device includes one or more processors, a communication unit, and the like; the one or more processors are, for example, one or more central processing units (CPUs) and/or one or more graphics processing units (GPUs). The processor may perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) or loaded from a storage section into a random access memory (RAM). The communication unit may include, but is not limited to, a network card, which may include, but is not limited to, an IB (Infiniband) network card. The processor may communicate with the read-only memory and/or the random access memory to execute the executable instructions, connect to the communication unit through a bus, and communicate with other target devices through the communication unit, thereby completing the operations corresponding to any method provided by the embodiments of the present application, for example: judging whether the current frame is a key frame; and in response to the current frame being a non-key frame in the video, obtaining the high-level features of the current frame from the high-level features of the adjacent previous key frame, according to the low-level features of that key frame and the low-level features of the current frame; wherein, in the neural network, the network depth of the first network layer corresponding to the low-level features of the previous key frame is shallower than the network depth of the second network layer corresponding to the high-level features of the previous key frame.
In addition, the RAM may also store various programs and data required for the operation of the device. The CPU, the ROM, and the RAM are connected to each other through a bus. Where a RAM is present, the ROM is an optional module. The RAM stores executable instructions, or executable instructions are written into the ROM at runtime, and the executable instructions cause the processor to perform the operations corresponding to any of the above methods of the present invention. An input/output (I/O) interface is also connected to the bus. The communication unit may be integrated, or may be provided with multiple sub-modules (for example, multiple IB network cards) linked on the bus.
The I/O interface is connected to the following components: an input section including a keyboard, a mouse, and the like; an output section including a cathode ray tube (CRT), a liquid crystal display (LCD), a loudspeaker, and the like; a storage section including a hard disk and the like; and a communication section including a network interface card such as a LAN card or a modem. The communication section performs communication processing via a network such as the Internet. A driver is also connected to the I/O interface as needed. A removable medium, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is installed on the driver as needed, so that a computer program read therefrom can be installed into the storage section as needed.
It should be noted that the architecture shown in Fig. 7 is only an optional implementation. In specific practice, the number and types of the components in Fig. 7 may be selected, deleted, added, or replaced according to actual needs; for different functional components, separate or integrated implementations may also be adopted, for example, the GPU and the CPU may be provided separately or the GPU may be integrated on the CPU, and the communication unit may be provided separately or integrated on the CPU or the GPU. These interchangeable embodiments all fall within the protection scope of the present disclosure.
In addition, an embodiment of the present invention further provides a computer storage medium for storing computer-readable instructions which, when executed, implement the operations of the feature propagation method of any of the above embodiments of the present invention.
In addition, an embodiment of the present invention further provides a computer program including computer-readable instructions; when the computer-readable instructions are run in a device, a processor in the device executes executable instructions for implementing the steps of the feature propagation method of any of the above embodiments of the present invention.
In an optional embodiment, the computer program is specifically a software product, such as a software development kit (Software Development Kit, SDK).
In one or more optional embodiments, an embodiment of the present invention further provides a computer program product for storing computer-readable instructions which, when executed, cause a computer to perform the feature propagation method described in any of the above possible implementations.
The computer program product may be implemented by hardware, software, or a combination thereof. In an optional example, the computer program product is embodied as a computer storage medium; in another optional example, the computer program product is embodied as a software product, such as an SDK.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may refer to each other. As for the device embodiments, since they substantially correspond to the method embodiments, their description is relatively simple, and for relevant details reference may be made to the description of the method embodiments.
The methods and apparatuses of the present invention may be implemented in many ways, for example, by software, hardware, firmware, or any combination thereof. The above order of the steps of the methods is merely illustrative; unless otherwise specifically stated, the steps of the methods of the present invention are not limited to the order described above. In addition, in some embodiments, the present invention may also be embodied as programs recorded on a recording medium, the programs comprising machine-readable instructions for implementing the methods according to the present invention. Thus, the present invention also covers recording media storing programs for performing the methods according to the present invention.
The description of the present invention is provided for purposes of illustration and description, and is not intended to be exhaustive or to limit the invention to the disclosed forms. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were selected and described to better explain the principles of the invention and its practical applications, and to enable those of ordinary skill in the art to understand the invention and to design various embodiments, with various modifications, suited to particular uses.
Claims (10)
1. A feature propagation method, comprising:
determining whether a current frame is a key frame;
in response to the current frame being a non-key frame in a video, obtaining a high-level feature of the current frame from a high-level feature of a preceding key frame adjacent to the current frame, according to a low-level feature of the preceding key frame and a low-level feature of the current frame;
wherein, in a neural network, a network depth of a first network layer from which the low-level feature of the preceding key frame is extracted is shallower than a network depth of a second network layer from which the high-level feature of the preceding key frame is extracted.
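The key-frame dispatch in claim 1 can be sketched in NumPy. Everything here is a hypothetical stand-in: `shallow_layers`, `deep_layers`, the key-frame interval, and the scalar conversion weight are illustrative only (the patent derives the propagation from learned features; this sketch shows just the control flow of running the deep layers on key frames alone).

```python
import numpy as np

def shallow_layers(frame):
    # Hypothetical stand-in for the shallow network layers that
    # produce low-level features; these run on every frame.
    return frame * 0.5

def deep_layers(low_feat):
    # Hypothetical stand-in for the deeper layers that produce
    # high-level features; these run only on key frames.
    return np.tanh(low_feat).sum(axis=-1, keepdims=True)

def propagate(frames, key_interval=5):
    """Compute a high-level feature per frame, running the deep layers
    only on key frames and propagating their output in between."""
    outputs, key_low, key_high = [], None, None
    for i, frame in enumerate(frames):
        low = shallow_layers(frame)       # low-level feature, every frame
        if i % key_interval == 0:         # key frame: run the full network
            key_low, key_high = low, deep_layers(low)
            outputs.append(key_high)
        else:                             # non-key frame: propagate
            # A trivial scalar "conversion weight" derived from the two
            # low-level features (the patent uses learned weights instead).
            w = (np.abs(low).mean() + 1e-6) / (np.abs(key_low).mean() + 1e-6)
            outputs.append(key_high * w)
    return outputs
```

The saving comes from skipping `deep_layers` on non-key frames; only the cheap shallow layers run on every frame.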
2. The method according to claim 1, wherein obtaining the high-level feature of the current frame from the high-level feature of the preceding key frame according to the low-level feature of the adjacent preceding key frame and the low-level feature of the current frame comprises:
obtaining, according to the low-level feature of the adjacent preceding key frame and the low-level feature of the current frame, conversion weights for transforming the low-level feature of the preceding key frame into the low-level feature of the current frame;
converting the high-level feature of the preceding key frame into the high-level feature of the current frame according to the high-level feature of the preceding key frame and the conversion weights.
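The two steps of claim 2 can be sketched numerically, assuming a closed-form per-position least-squares scale as the conversion weight. This is purely an illustration: the patent predicts such weights from the two low-level features with a network, not with the ratio used here.

```python
import numpy as np

def conversion_weights(key_low, cur_low, eps=1e-6):
    """Step 1: per-position weights mapping the key frame's low-level
    feature (H x W x C) onto the current frame's -- here a least-squares
    scale per spatial location, used only as an illustrative stand-in
    for the learned conversion weights."""
    return (key_low * cur_low).sum(axis=-1) / ((key_low ** 2).sum(axis=-1) + eps)

def convert_high(key_high, weights):
    # Step 2: apply the low-level conversion weights to the key frame's
    # high-level feature to obtain the current frame's high-level feature.
    return key_high * weights[..., None]
```

The design point is that the weights are estimated from cheap low-level features but applied to the expensive high-level feature, which is what lets the deep layers be skipped on non-key frames.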
3. The method according to claim 1 or 2, wherein, in response to the current frame being a non-key frame in the video, the method further comprises:
performing semantic segmentation on the current frame based on at least the high-level feature of the current frame, to obtain a semantic label of the current frame.
4. The method according to claim 3, wherein performing semantic segmentation on the current frame based on at least the high-level feature of the current frame comprises:
performing semantic segmentation on the current frame based on the low-level feature and the high-level feature of the current frame, to obtain the semantic label of the current frame;
splicing or fusing the feature converted for the current frame with the high-level feature of the current frame, to obtain a current-frame feature;
performing semantic segmentation on the current frame based on the current-frame feature.
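The splice-and-segment step of claim 4 can be sketched as follows, assuming channel-wise concatenation for the "splicing" and a linear per-pixel classifier (the equivalent of a 1x1 convolution) as the segmentation head. Both choices are illustrative assumptions, not the patent's specific architecture.

```python
import numpy as np

def fuse_and_segment(converted, high, class_weights):
    """Splice (concatenate) the feature propagated to the current frame
    with the current frame's own high-level feature, score each pixel
    with a linear map, and take the argmax as its semantic label."""
    fused = np.concatenate([converted, high], axis=-1)  # H x W x 2C
    scores = fused @ class_weights                      # H x W x K classes
    return scores.argmax(axis=-1)                       # H x W label map
```

Fusing the propagated feature with the frame's own high-level feature gives the segmentation head both temporal context and current-frame detail.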
5. A feature propagation apparatus, comprising:
a judgment module, configured to determine whether a current frame is a key frame;
a feature propagation module, configured to, according to a judgment result of the judgment module, in response to the current frame being a non-key frame in a video, obtain a high-level feature of the current frame from a high-level feature of a preceding key frame adjacent to the current frame, according to a low-level feature of the preceding key frame and a low-level feature of the current frame; wherein, in a neural network, a network depth of a first network layer from which the low-level feature of the preceding key frame is extracted is shallower than a network depth of a second network layer from which the high-level feature of the preceding key frame is extracted.
6. An electronic device, comprising the feature propagation apparatus according to claim 5.
7. An electronic device, comprising:
a processor and the feature propagation apparatus according to claim 5;
wherein when the processor runs the feature propagation apparatus, the units in the feature propagation apparatus according to claim 5 are run.
8. An electronic device, comprising: a processor and a memory;
wherein the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the operations of the steps of the feature propagation method according to any one of claims 1-4.
9. A computer program, comprising computer-readable code, wherein when the computer-readable code is run on a device, a processor in the device executes instructions for implementing the steps of the feature propagation method according to any one of claims 1-4.
10. A computer-readable medium for storing computer-readable instructions, wherein when the instructions are executed, the operations of the steps of the feature propagation method according to any one of claims 1-4 are implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711455916.6A CN108235116B (en) | 2017-12-27 | 2017-12-27 | Feature propagation method and apparatus, electronic device, and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108235116A true CN108235116A (en) | 2018-06-29 |
CN108235116B CN108235116B (en) | 2020-06-16 |
Family
ID=62649228
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711455916.6A Active CN108235116B (en) | 2017-12-27 | 2017-12-27 | Feature propagation method and apparatus, electronic device, and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108235116B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101650728A (en) * | 2009-08-26 | 2010-02-17 | 北京邮电大学 | Video high-level characteristic retrieval system and realization thereof |
CN103065300A (en) * | 2012-12-24 | 2013-04-24 | 安科智慧城市技术(中国)有限公司 | Method for video labeling and device for video labeling |
US20160358628A1 (en) * | 2015-06-05 | 2016-12-08 | Apple Inc. | Hierarchical segmentation and quality measurement for video editing |
US20170124400A1 (en) * | 2015-10-28 | 2017-05-04 | Raanan Y. Yehezkel Rohekar | Automatic video summarization |
CN105677735A (en) * | 2015-12-30 | 2016-06-15 | 腾讯科技(深圳)有限公司 | Video search method and apparatus |
US20170337271A1 (en) * | 2016-05-17 | 2017-11-23 | Intel Corporation | Visual search and retrieval using semantic information |
CN106156747A (en) * | 2016-07-21 | 2016-11-23 | 四川师范大学 | The method of the monitor video extracting semantic objects of Behavior-based control feature |
CN106934352A (en) * | 2017-02-28 | 2017-07-07 | 华南理工大学 | A kind of video presentation method based on two-way fractal net work and LSTM |
Non-Patent Citations (1)
Title |
---|
XIZHOU ZHU; YUWEN XIONG; JIFENG DAI; LU YUAN; YICHEN WEI: "Deep Feature Flow for Video Recognition", IEEE Conference on Computer Vision and Pattern Recognition * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109151615A (en) * | 2018-11-02 | 2019-01-04 | 湖南双菱电子科技有限公司 | Method for processing video frequency, computer equipment and computer storage medium |
CN109151615B (en) * | 2018-11-02 | 2022-01-25 | 湖南双菱电子科技有限公司 | Video processing method, computer device, and computer storage medium |
CN111383245B (en) * | 2018-12-29 | 2023-09-22 | 北京地平线机器人技术研发有限公司 | Video detection method, video detection device and electronic equipment |
CN111383245A (en) * | 2018-12-29 | 2020-07-07 | 北京地平线机器人技术研发有限公司 | Video detection method, video detection device and electronic equipment |
CN109919044A (en) * | 2019-02-18 | 2019-06-21 | 清华大学 | The video semanteme dividing method and device of feature propagation are carried out based on prediction |
CN110060264A (en) * | 2019-04-30 | 2019-07-26 | 北京市商汤科技开发有限公司 | Neural network training method, video frame processing method, apparatus and system |
CN110060264B (en) * | 2019-04-30 | 2021-03-23 | 北京市商汤科技开发有限公司 | Neural network training method, video frame processing method, device and system |
CN112465826A (en) * | 2019-09-06 | 2021-03-09 | 上海高德威智能交通***有限公司 | Video semantic segmentation method and device |
CN112465826B (en) * | 2019-09-06 | 2023-05-16 | 上海高德威智能交通***有限公司 | Video semantic segmentation method and device |
CN110738108A (en) * | 2019-09-09 | 2020-01-31 | 北京地平线信息技术有限公司 | Target object detection method, target object detection device, storage medium and electronic equipment |
CN110929605A (en) * | 2019-11-11 | 2020-03-27 | 中国建设银行股份有限公司 | Video key frame storage method, device, equipment and storage medium |
CN111062395A (en) * | 2019-11-27 | 2020-04-24 | 北京理工大学 | Real-time video semantic segmentation method |
CN111654724A (en) * | 2020-06-08 | 2020-09-11 | 上海纽菲斯信息科技有限公司 | Low-bit-rate coding transmission method of video conference system |
CN112016513A (en) * | 2020-09-08 | 2020-12-01 | 北京达佳互联信息技术有限公司 | Video semantic segmentation method, model training method, related device and electronic equipment |
CN112016513B (en) * | 2020-09-08 | 2024-01-30 | 北京达佳互联信息技术有限公司 | Video semantic segmentation method, model training method, related device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN108235116B (en) | 2020-06-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108235116A (en) | Feature propagation method and device, electronic equipment, program and medium | |
CN106599789B (en) | The recognition methods of video classification and device, data processing equipment and electronic equipment | |
EP4009231A1 (en) | Video frame information labeling method, device and apparatus, and storage medium | |
CN109508681A (en) | The method and apparatus for generating human body critical point detection model | |
CN108229336A (en) | Video identification and training method and device, electronic equipment, program and medium | |
CN108229280A (en) | Time domain motion detection method and system, electronic equipment, computer storage media | |
CN109800821A (en) | Method, image processing method, device, equipment and the medium of training neural network | |
CN108229363A (en) | Key frame dispatching method and device, electronic equipment, program and medium | |
CN112966742A (en) | Model training method, target detection method and device and electronic equipment | |
CN112215171B (en) | Target detection method, device, equipment and computer readable storage medium | |
CN109300151A (en) | Image processing method and device, electronic equipment | |
CN115861462B (en) | Training method and device for image generation model, electronic equipment and storage medium | |
CN114972958B (en) | Key point detection method, neural network training method, device and equipment | |
CN112668492A (en) | Behavior identification method for self-supervised learning and skeletal information | |
CN114792359A (en) | Rendering network training and virtual object rendering method, device, equipment and medium | |
US20230082715A1 (en) | Method for training image processing model, image processing method, apparatus, electronic device, and computer program product | |
CN110807379A (en) | Semantic recognition method and device and computer storage medium | |
CN109886172A (en) | Video behavior recognition methods and device, electronic equipment, storage medium, product | |
CN115170819A (en) | Target identification method and device, electronic equipment and medium | |
CN115511779A (en) | Image detection method, device, electronic equipment and storage medium | |
US20230290132A1 (en) | Object recognition neural network training using multiple data sources | |
CN108509876A (en) | For the object detecting method of video, device, equipment, storage medium and program | |
CN114359892A (en) | Three-dimensional target detection method and device and computer readable storage medium | |
CN111768007B (en) | Method and device for mining data | |
CN116824686A (en) | Action recognition method and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||