CN110245564A - Pedestrian detection method, system and terminal device - Google Patents
Pedestrian detection method, system and terminal device
- Publication number
- CN110245564A (application number CN201910397924.2A)
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- target image
- sample
- positive sample
- scene
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/214—Pattern recognition; Analysing; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/24—Pattern recognition; Classification techniques
- G06V40/103—Recognition of human-related patterns in image or video data; Static body considered as a whole, e.g. static pedestrian or occupant recognition
Abstract
The present invention belongs to the technical field of image recognition and provides a pedestrian detection method, system and terminal device. The method includes: recognizing a pedestrian in a target image with a multi-task deep convolutional network model, and adding a first identification frame around the pedestrian in the target image, where the training tasks of the multi-task deep convolutional network model include semantic tasks and pedestrian detection; recognizing a specific object in the target image with the convolutional neural network VGG19, and adding a second identification frame around the specific object in the target image; and judging whether the first identification frame and the second identification frame overlap, and if they overlap, determining that the pedestrian carries the specific object and triggering a preset monitoring event. The invention achieves accurate pedestrian detection together with monitoring of various pedestrian attributes and scene attributes.
Description
Technical field
The present invention relates to the technical field of image recognition, and in particular to a pedestrian detection method, system and terminal device.
Background art
Pedestrian detection has long been a hot and difficult topic in computer vision research, and it has attracted considerable attention in recent years. The problem pedestrian detection must solve is to find all pedestrians in an image or video frame, including their positions and sizes, usually indicated with rectangular boxes; this is a typical object detection problem and the common approach to it.
In public-safety applications, however, monitoring pedestrians in public places requires more than detecting pedestrians and marking their positions: false identifications must also be avoided, so that other objects with a human-like outline are not mistakenly marked as people. In addition, the important belongings a pedestrian carries deserve close attention, and monitoring specific behaviors is of great practical significance; for example, in important venues, pedestrians carrying packages need special attention.
Summary of the invention
The main objective of the present invention is to propose a pedestrian detection method, system and terminal device, so as to solve the problems in the prior art that, when pedestrians in public places are monitored, the pedestrian detection error is large and no specific-object detection function is available.
To achieve the above objective, a first aspect of the embodiments of the present invention provides a pedestrian detection method, comprising:
recognizing a pedestrian in a target image with a multi-task deep convolutional network model, and adding a first identification frame around the pedestrian in the target image;
wherein the training tasks of the multi-task deep convolutional network model include semantic tasks and pedestrian detection;
recognizing a specific object in the target image with the convolutional neural network VGG19, and adding a second identification frame around the specific object in the target image;
judging whether the first identification frame and the second identification frame overlap, and if they overlap, determining that the pedestrian carries the specific object and triggering a preset monitoring event.
In combination with the first aspect, in a first embodiment of the embodiments of the present invention, recognizing the pedestrian in the target image with the multi-task deep convolutional network model comprises:
obtaining a pedestrian data set from the target image;
setting person attribute labels, collecting pedestrian samples from the pedestrian data set, and adding the person attribute labels to pedestrian positive samples;
obtaining a scene data set from the target image;
setting scene attribute labels, collecting scene samples from the scene data set, and adding the scene attribute labels to scene positive samples;
generating a semantic training set from the pedestrian positive samples and scene positive samples carrying attribute labels;
feeding the semantic training set and the pedestrian data set, in the form of different tasks, into the multi-task deep convolutional network model for training, so as to obtain the pedestrian in the target image.
In combination with the first embodiment of the first aspect, in a second embodiment of the embodiments of the present invention, setting the person attribute labels, collecting the pedestrian samples from the pedestrian data set, and adding the person attribute labels to the pedestrian positive samples comprises:
collecting positive and negative samples from the pedestrian data set, and organizing the positive and negative samples into a binary tree structure;
obtaining, from the child nodes output by the tree structure, two child nodes separated by a preset distance, thereby obtaining the pedestrian positive samples and pedestrian negative samples, and adding the person attribute labels to the pedestrian positive samples.
In combination with the first and second embodiments of the first aspect, in a third embodiment of the embodiments of the present invention, the multi-task deep convolutional network model comprises a TA-CNN framework;
the network structure of the TA-CNN framework includes four convolutional layers, four max-pooling layers and two fully connected layers.
In combination with the first aspect, in a fourth embodiment of the embodiments of the present invention, recognizing the specific object in the target image with the convolutional neural network VGG19 and adding the second identification frame around the specific object in the target image comprises:
collecting sample pictures of the specific object and annotating them;
cleaning and labeling the sample pictures;
identifying the specific object in the target image with the edited class labels and VGG19.
A second aspect of the embodiments of the present invention provides a pedestrian detection system, comprising:
a pedestrian recognition module, configured to recognize a pedestrian in a target image with a multi-task deep convolutional network model and to add a first identification frame around the pedestrian in the target image;
wherein the training tasks of the multi-task deep convolutional network model include semantic tasks and pedestrian detection;
a specific-object recognition module, configured to recognize a specific object in the target image with the convolutional neural network VGG19 and to add a second identification frame around the specific object in the target image;
a monitoring module, configured to judge whether the first identification frame and the second identification frame overlap, and if they overlap, to determine that the pedestrian carries the specific object and to trigger a preset monitoring event.
In combination with the second aspect, in a first embodiment of the embodiments of the present invention, the pedestrian recognition module comprises:
a pedestrian data set obtaining unit, configured to obtain a pedestrian data set from the target image;
a pedestrian positive-sample labeling unit, configured to set person attribute labels, collect pedestrian samples from the pedestrian data set, and add the person attribute labels to pedestrian positive samples;
a scene data set obtaining unit, configured to obtain a scene data set from the target image;
a scene positive-sample labeling unit, configured to set scene attribute labels, collect scene samples from the scene data set, and add the scene attribute labels to scene positive samples;
a semantic training set generation unit, configured to generate a semantic training set from the pedestrian positive samples and scene positive samples carrying attribute labels;
a model training unit, configured to feed the semantic training set and the pedestrian data set, in the form of different tasks, into the multi-task deep convolutional network model for training, so as to obtain the pedestrian in the target image.
In combination with the first embodiment of the second aspect, in a second embodiment of the embodiments of the present invention, the pedestrian positive-sample labeling unit is further configured to:
collect the positive and negative samples from the pedestrian data set and organize them into a binary tree structure;
obtain, from the child nodes output by the tree structure, two child nodes separated by a preset distance, thereby obtaining the pedestrian positive samples and pedestrian negative samples, and add the person attribute labels to the pedestrian positive samples.
A third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method provided by the first aspect above.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method provided by the first aspect above.
The embodiments of the present invention propose a pedestrian detection method in which a multi-task deep convolutional network model performs deep learning on semantic tasks and pedestrian detection, accurately recognizes the pedestrian in the target image, and marks the pedestrian with a first identification frame in the target image; the convolutional neural network VGG19 also recognizes a specific object in the target image and marks it with a second identification frame, expanding the detection objects of pedestrian detection. The positions of the first and second identification frames are then used to detect whether a carrying relationship exists between the pedestrian and the specific object; if the pedestrian carries the specific object, a preset monitoring event is triggered so that the monitoring side can handle it in time. The pedestrian detection method provided by the embodiments of the present invention is thus no longer confined to simple person-image recognition: through the multi-task deep convolutional network model and specific-object detection, it achieves accurate pedestrian detection as well as monitoring of various pedestrian attributes and scene attributes.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the pedestrian detection method provided by Embodiment 1 of the present invention;
Fig. 2 is a detailed flowchart of step S102 in Fig. 1;
Fig. 3 is a schematic flowchart of the pedestrian detection method provided by Embodiment 2 of the present invention;
Fig. 4 is a schematic diagram of the binary tree structure provided by Embodiment 2 of the present invention;
Fig. 5 shows the person attribute labels and scene attribute labels provided by Embodiment 3 of the present invention;
Fig. 6 is a schematic structural diagram of the pedestrian detection system provided by Embodiment 5 of the present invention.
The realization of the objectives, the functions and the advantages of the present invention are further described below with reference to the embodiments and the accompanying drawings.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
It should be noted that, in this document, the terms "include" and "comprise", and any other variants of them, are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements that are inherent to such a process, method, article or device. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes that element.
Herein, suffixes such as "module", "component" or "unit" used to denote elements serve only to facilitate the explanation of the invention and carry no specific meaning by themselves; therefore, "module" and "component" can be used interchangeably.
In the subsequent description, the serial numbers of the embodiments are for illustration only and do not represent the relative merits of the embodiments.
Embodiment one
As shown in Fig. 1, an embodiment of the present invention provides a pedestrian detection method that first improves the accuracy of pedestrian detection with a multi-task deep convolutional network model, and then, in combination with the convolutional neural network VGG19, realizes a specific-object detection function. The method includes, but is not limited to, the following steps:
S101: recognize the pedestrian in the target image with the multi-task deep convolutional network model, and add a first identification frame around the pedestrian in the target image.
In step S101, the multi-task deep convolutional network model is used to learn high-level features from multiple tasks and multiple data sources and to attach attribute labels to the effective features among them, so that attribute information is transferred from existing scene-segmentation data sets to the pedestrian data set.
In the embodiments of the present invention, the tasks carried out in the multi-task deep convolutional network model include semantic tasks and pedestrian detection. Semantic information, usually the attributes of the targets being trained, is obtained through semantic-task training, and this semantic information in turn assists pedestrian detection, thereby reducing the miss rate of deep learning methods in pedestrian detection.
S102: recognize the specific object in the target image with the convolutional neural network VGG19, and add a second identification frame around the specific object in the target image.
In step S102, the convolutional neural network VGG19 is used for picture recognition and classification.
In the embodiments of the present invention, to recognize whether a specific object is present in the target image, the specific object needs to be identified in advance and the recognition result stored.
In one embodiment, as shown in Fig. 2, step S102 may include:
S1021: collect sample pictures of the specific object and annotate them.
In step S1021, the image data may come from Kaggle competitions, Google or Baidu picture search, the ImageNet image library, and so on.
S1022: clean and label the sample pictures.
In step S1022, cleaning may specifically include, but is not limited to: removing unsuitable pictures and cropping the required parts, with the cropped size and clarity unified; class labels, such as backpack, hat, club and other suspected dangerous objects, are then attached to the cleaned images.
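The cleaning described in step S1022 can be sketched as a simple metadata filter. This is a minimal illustration that assumes per-picture records with `width`, `height` and `label` fields, an assumed minimum side length, and a fixed 224x224 target size (VGG19's customary input); none of these names or thresholds come from the patent itself.

```python
# Illustrative sketch of sample-picture cleaning: drop unsuitable pictures
# and unify the crop size. Field names and thresholds are assumptions.

MIN_SIDE = 64             # assumed minimum acceptable width/height in pixels
TARGET_SIZE = (224, 224)  # VGG19's usual input size

def clean_samples(records):
    """Remove unsuitable or unlabeled pictures and unify size metadata."""
    cleaned = []
    for rec in records:
        # Remove pictures that are too small or lack a class label.
        if rec["width"] < MIN_SIDE or rec["height"] < MIN_SIDE:
            continue
        if not rec.get("label"):
            continue
        # Record the unified size every kept sample will be resized to.
        cleaned.append({"path": rec["path"],
                        "label": rec["label"],
                        "size": TARGET_SIZE})
    return cleaned

samples = [
    {"path": "a.jpg", "width": 640, "height": 480, "label": "backpack"},
    {"path": "b.jpg", "width": 32,  "height": 32,  "label": "hat"},  # too small
    {"path": "c.jpg", "width": 800, "height": 600, "label": ""},     # unlabeled
]
print(clean_samples(samples))
```

Only the first record survives the filter; a real pipeline would then resize the kept files to the unified size before training.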
S1023: identify the specific object in the target image with the edited class labels and VGG19.
In step S1023, the specific object may be a dangerous or suspected dangerous article; it may also be an article such as a laptop, luggage, valuables or suspected valuables.
In important venues, by recognizing the specific object and adding the second identification frame around it, the behavior of the pedestrian can be judged from the carried article, or the scene situation can be judged from the specific object, which enriches the monitored content and improves the practical significance of pedestrian detection.
For example, when a baby stroller is recognized, it can be judged that the pedestrian has a child with them; when fire is recognized, it can be judged that a fire may be occurring in the current scene.
S103: judge whether the first identification frame and the second identification frame overlap; if they overlap, determine that the pedestrian carries the specific object, and trigger a preset monitoring event.
In step S103, the preset monitoring event may include, but is not limited to: sending an early-warning notification, and controlling the cameras to continuously track and record the person.
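The overlap judgment of step S103 can be sketched as an axis-aligned rectangle intersection test. The (x1, y1, x2, y2) box format, the `frames_overlap` helper and the callback are illustrative assumptions; the patent does not fix a box representation.

```python
# Illustrative sketch of the step-S103 overlap test: two axis-aligned
# identification frames overlap when their projections intersect on both axes.

def frames_overlap(a, b):
    """Return True if rectangles a and b, given as (x1, y1, x2, y2), share area."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2

def check(pedestrian_frame, object_frame, on_event=print):
    # Trigger the preset monitoring event when a carrying relationship is found.
    if frames_overlap(pedestrian_frame, object_frame):
        on_event("pedestrian carries the specific object")

pedestrian = (100, 50, 180, 250)   # first identification frame
backpack   = (150, 120, 200, 200)  # second identification frame
print(frames_overlap(pedestrian, backpack))  # True: the frames intersect
```

In practice the event handler would send the early-warning notification or steer the cameras, as described above.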
In the pedestrian detection method provided by this embodiment of the present invention, the multi-task deep convolutional network model performs deep learning on semantic tasks and pedestrian detection, accurately recognizes the pedestrian in the target image, and marks the pedestrian with the first identification frame; the convolutional neural network VGG19 also recognizes the specific object in the target image and marks it with the second identification frame, expanding the detection objects of pedestrian detection. The positions of the two frames are then used to detect whether a carrying relationship exists between the pedestrian and the specific object; if so, a preset monitoring event is triggered so that the monitoring side can handle it in time. Pedestrian detection is thus no longer confined to simple person-image recognition: through the multi-task deep convolutional network model and specific-object detection, accurate pedestrian detection and monitoring of various pedestrian attributes and scene attributes are achieved.
Embodiment two
As shown in Fig. 3, this embodiment of the present invention details the recognition process of the multi-task deep convolutional network model in step S101 of Embodiment 1. In this embodiment, step S101 includes, but is not limited to:
S1011: obtain the pedestrian data set from the target image.
In step S1011, the pedestrian data set applied in the multi-task deep convolutional network model is an image data set obtained from sub-image patches of the target image.
In this embodiment of the present invention, before the pedestrian data set is obtained, the target image is first divided into blocks, and person data are then obtained from the processed sub-image patches; the obtained pedestrian data set therefore contains only the sub-image patches that contain person data.
In practical applications, the size of each sub-image patch is related to the amount of person data it contains.
S1012: set person attribute labels, collect the pedestrian samples from the pedestrian data set, and add the person attribute labels to the pedestrian positive samples.
In step S1012, the pedestrian samples are feature samples extracted from the pedestrian data set; the feature samples related to the person attribute labels are the pedestrian positive samples.
The person attribute labels may include, but are not limited to, backpack, hat, white clothes, and so on.
S1013: obtain the scene data set from the target image.
S1014: set scene attribute labels, collect the scene samples from the scene data set, and add the scene attribute labels to the scene positive samples.
Similarly, through steps S1013 and S1014, the obtained scene data set contains only the sub-image patches that contain scene data, and the feature samples related to the scene attribute labels are the scene positive samples.
The scene attribute labels may include, but are not limited to, sky, tree, building, and so on.
S1015: generate a semantic training set from the pedestrian positive samples and scene positive samples carrying attribute labels.
In step S1015, the semantic training set expresses the pedestrian attributes of the target image through the pedestrian positive samples, and the scene attributes of the data set through the scene positive samples.
S1016: feed the semantic training set and the pedestrian data set, in the form of different tasks, into the multi-task deep convolutional network model for training, so as to obtain the pedestrian in the target image.
In practical applications, when a general deep network model performs pedestrian detection, the pedestrian detection task is treated as a single binary classification task, which may cause positive samples to be confused with a large number of negative samples.
In one embodiment, step S1012 may include:
setting the person attribute labels, collecting the positive and negative samples from the pedestrian data set, and organizing them into a binary tree structure;
obtaining, from the child nodes output by the tree structure, two child nodes separated by a preset distance, thereby obtaining the pedestrian positive samples and pedestrian negative samples, and adding the person attribute labels to the pedestrian positive samples.
As shown in Fig. 4, this embodiment of the present invention also proposes a schematic diagram of the binary tree structure. In the binary tree structure, each parent node extracts the HOG (Histogram of Oriented Gradients) features of the positive and negative samples and aggregates the data with the K-means clustering algorithm; each child node clusters its parent node, and the vector-space structure between nodes is obtained from the concatenated distances and the mean of each leaf node.
Through the binary tree structure, pedestrian detection is divided into two classification tasks according to the positive and negative samples: the features of the positive samples are highly correlated with the person attribute labels, while the features of the negative samples are weakly correlated with them, so the positive and negative samples are preliminarily separated. The pedestrian positive samples are then selected according to the distance between the two child nodes, which avoids the situation where a pedestrian positive sample among the positive samples is weakly correlated with the person attribute labels while a pedestrian negative sample among the negative samples is strongly correlated with them, that is, the problem of pedestrian positive and negative samples being confused.
Embodiment three
This embodiment of the present invention takes the Caltech (P) data set and the CamVid (Ba), Stanford Background (Bb) and LM+SUN (Bc) data sets as examples to illustrate the semantic-training-set obtaining process shown in steps S1011 to S1015 of Embodiment 2.
Pedestrian samples are first collected from Caltech (P), and the pedestrian positive samples are marked with 9 classes of person attribute labels, which are mainly provided by the UK police for monitoring analysis.
Then, scene samples of the environment are collected from the CamVid (Ba), Stanford Background (Bb) and LM+SUN (Bc) data sets, and the scene positive samples are marked with 8 classes of scene attribute labels. As shown in Fig. 5, this embodiment also shows the person attribute labels and scene attribute labels that mainly need to be considered in practical applications.
Then, the semantic training set is generated and output in label form, where the class labels are taken from the label classes shown in Fig. 5: p denotes a person attribute label, s and u denote environment attribute labels, n denotes the position information of a sub-image patch in the data set, and N denotes the number of sub-image patches in the data set.
The above semantic training set contains all the sub-image patches of the target image, and can implement a framework in which the multiple tasks of pedestrian detection and the semantic tasks are learned jointly.
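Following the symbols above, assembling the semantic training set can be sketched as building one labeled tuple (n, p, s, u) per sub-image patch, with N the number of patches. The record layout and the concrete label values below are invented examples, not data from the patent.

```python
# Illustrative sketch: build the semantic training set as labeled tuples,
# pairing each sub-image patch position n with its person attribute label p
# and environment attribute labels s and u.

def build_semantic_set(patches):
    """patches: list of dicts carrying patch position and attribute labels."""
    return [(rec["n"], rec["p"], rec["s"], rec["u"]) for rec in patches]

patches = [
    {"n": (0, 0), "p": "backpack", "s": "sky",      "u": "building"},
    {"n": (0, 1), "p": "hat",      "s": "tree",     "u": "building"},
    {"n": (1, 0), "p": None,       "s": "building", "u": "sky"},  # no pedestrian
]
semantic_set = build_semantic_set(patches)
print(len(semantic_set))  # N, the number of sub-image patches
```

Patches without a pedestrian keep their scene labels, which is what lets the semantic tasks and pedestrian detection share one training set.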
Embodiment four
This embodiment of the present invention explains the structure of the multi-task deep convolutional network model in Embodiments 1 and 2.
In this embodiment, the multi-task deep convolutional network model comprises a TA-CNN (task-assistant convolutional neural network) framework; the network structure of the TA-CNN framework includes four convolutional layers, four max-pooling layers and two fully connected layers.
In this embodiment, the TA-CNN framework is a simplified version of AlexNet: one convolutional layer and one fully connected layer are removed from the original AlexNet, and an SPV (structure projection vector) is added. In the TA-CNN framework of this embodiment, the four convolutional layers are conv1 to conv4, and the two fully connected layers are fc5 and fc6.
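The layer stack can be sanity-checked by tracking feature-map sizes through four conv + max-pool stages before the two fully connected layers. The input resolution (112x112), 3x3 kernels with padding 1, and 2x2 pooling are assumptions for illustration; the patent fixes only the layer counts.

```python
# Back-of-the-envelope sketch of the TA-CNN stack (conv1..conv4, each followed
# by 2x2 max pooling, then fc5 and fc6), tracking only feature-map sizes.

def conv2d(size, kernel=3, stride=1, pad=1):
    h, w = size
    return ((h + 2 * pad - kernel) // stride + 1,
            (w + 2 * pad - kernel) // stride + 1)

def maxpool(size, kernel=2, stride=2):
    h, w = size
    return ((h - kernel) // stride + 1, (w - kernel) // stride + 1)

size = (112, 112)  # assumed input patch size
for layer in range(1, 5):  # conv1..conv4, pool1..pool4
    size = maxpool(conv2d(size))
    print(f"after conv{layer}+pool{layer}: {size}")
print(size)  # (7, 7): the map fc5 and fc6 would flatten
```

Each padded 3x3 convolution preserves the spatial size, so the four pooling layers alone halve it four times, from 112 down to 7.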
It can be seen that the construction basis of the multi-task deep convolutional network model in Embodiments 1 and 2 is the TA-CNN framework. In practical applications, the TA-CNN framework also needs to be optimized, so this embodiment of the present invention takes the semantic data set of Embodiment 3 as an example to show the optimization process of the TA-CNN framework.
First, TA-CNN is formulated as the optimization of a log posterior probability.
To bridge the gaps between the Caltech (P), CamVid (Ba), Stanford Background (Bb) and LM+SUN (Bc) data sets, a structure projection vector z_n is computed for each sample x_n, and the loss function is adjusted accordingly, giving formula (2).
The network is then trained to learn the network parameters W, with formula (2) reformulated as a softmax loss function, namely formula (3).
It can be seen that formula (3) optimizes 8 loss functions together, but this leads to two problems:
1) different tasks converge at different rates, and training multiple tasks simultaneously can lead to over-fitting;
2) if the feature dimension is relatively high, the higher layers of the network will have a very large number of parameters.
In this embodiment of the present invention, to solve the above two problems, formula (3) is converted into a multivariate cross-entropy loss.
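The motivation for the conversion can be illustrated numerically: rather than summing several independent softmax losses, the task targets are concatenated into one multi-label vector and scored with a single cross-entropy. The toy targets and probabilities below are invented, and the loss form is a generic multivariate cross-entropy, not the patent's exact formula.

```python
# Illustrative sketch: one cross-entropy over a concatenated vector of task
# targets, instead of 8 separate softmax losses.

import math

def binary_cross_entropy(y, p):
    """Multivariate cross-entropy over a vector of task targets."""
    return -sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
                for yi, pi in zip(y, p)) / len(y)

# One pedestrian-detection target plus assumed attribute-task targets.
targets = [1, 1, 0, 0, 1, 0, 0, 0]
predictions = [0.9, 0.8, 0.1, 0.2, 0.7, 0.1, 0.1, 0.2]
loss = binary_cross_entropy(targets, predictions)
print(round(loss, 4))
```

A single loss lets all tasks share one gradient scale, which is one way to ease the unequal convergence rates noted above.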
Embodiment five
As shown in Fig. 6, an embodiment of the present invention provides a pedestrian detection system 60, comprising:
a pedestrian recognition module 61, configured to recognize a pedestrian in a target image with the multi-task deep convolutional network model and to add a first identification frame around the pedestrian in the target image;
wherein the training tasks of the multi-task deep convolutional network model include semantic tasks and pedestrian detection;
a specific-object recognition module 62, configured to recognize a specific object in the target image with the convolutional neural network VGG19 and to add a second identification frame around the specific object in the target image;
a monitoring module 63, configured to judge whether the first identification frame and the second identification frame overlap, and if they overlap, to determine that the pedestrian carries the specific object and to trigger a preset monitoring event.
In one embodiment, the pedestrian identification module 61 includes:
a pedestrian data set acquiring unit, configured to obtain a pedestrian data set in the target image;
a pedestrian positive sample marking unit, configured to set person attribute labels, collect the pedestrian samples in the pedestrian data set, and add the person attribute labels to the pedestrian positive samples;
a scene data set acquiring unit, configured to obtain a scene data set in the target image;
a scene positive sample marking unit, configured to set scene attribute labels, collect the scene samples in the scene data set, and add the scene attribute labels to the scene positive samples;
a semantic training set generation unit, configured to generate a semantic training set according to the pedestrian positive samples and the scene positive samples with the attribute labels;
a model training unit, configured to input the semantic training set and the pedestrian data set, in the form of different tasks, into the multi-object convolutional deep network model for training, to obtain the pedestrian in the target image.
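A minimal sketch of how the semantic training set could be assembled from the labeled pedestrian and scene positives described above (the function names and the dict-based sample format are illustrative assumptions, not from the patent):

```python
def label_positives(samples, attribute_labels):
    """Attach a copy of the attribute labels to each positive sample."""
    return [{"data": s, "labels": dict(attribute_labels)} for s in samples]

def build_semantic_training_set(pedestrian_positives, person_labels,
                                scene_positives, scene_labels):
    """Merge labeled pedestrian positives and scene positives into one
    semantic training set, tagging each entry with its source task so
    the model trainer can feed the tasks separately."""
    training_set = []
    for entry in label_positives(pedestrian_positives, person_labels):
        entry["task"] = "pedestrian"
        training_set.append(entry)
    for entry in label_positives(scene_positives, scene_labels):
        entry["task"] = "scene"
        training_set.append(entry)
    return training_set
```

The per-entry `task` tag corresponds to the patent's "input ... in the form of different tasks": the same training set carries both the pedestrian-attribute and scene-attribute supervision signals.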
In the embodiment of the present invention, the pedestrian positive sample marking unit is further configured to:
collect the positive and negative samples in the pedestrian data set, and organize the positive and negative samples into two tree structures;
select, from the child nodes output by the tree structures, two child nodes separated by a predetermined distance, to obtain the pedestrian positive samples and pedestrian negative samples, and add the person attribute labels to the pedestrian positive samples.
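Sketched below, under the assumption that each tree's child nodes are clusters represented by feature vectors, is one way such a distance-based selection could work; the data structure and threshold are illustrative, and the patent does not specify them:

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_sample_pairs(pos_children, neg_children, min_distance):
    """From the child nodes of the positive-sample tree and the
    negative-sample tree, keep pairs separated by at least min_distance;
    well-separated pairs yield cleaner positive/negative training samples."""
    pairs = []
    for p in pos_children:
        for n in neg_children:
            if distance(p["feature"], n["feature"]) >= min_distance:
                pairs.append((p, n))
    return pairs
```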
An embodiment of the present invention further provides a terminal device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the computer program, each step of the pedestrian detection method described in embodiment one is realized.
An embodiment of the present invention further provides a storage medium, the storage medium being a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, each step of the pedestrian detection method described in embodiment one is realized.
The embodiments described above are merely illustrative of the technical solutions of the present invention, rather than limiting them. Although the invention has been explained in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications or replacements do not make the essence of the corresponding technical solutions depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be included within the protection scope of the present invention.
Claims (10)
1. A pedestrian detection method, characterized by comprising:
identifying a pedestrian in a target image through a multi-object convolutional deep network model, and adding a first identification frame to the pedestrian in the target image;
wherein the training tasks of the multi-object convolutional deep network model include semantic tasks and pedestrian detection;
identifying a specific object in the target image through a convolutional neural network VGG19, and adding a second identification frame to the specific object in the target image;
judging whether the first identification frame and the second identification frame overlap; if they overlap, determining that the pedestrian carries the specific object, and triggering a preset monitoring event.
2. The pedestrian detection method according to claim 1, characterized in that the identifying a pedestrian in a target image through a multi-object convolutional deep network model comprises:
obtaining a pedestrian data set in the target image;
setting person attribute labels, collecting the pedestrian samples in the pedestrian data set, and adding the person attribute labels to pedestrian positive samples;
obtaining a scene data set in the target image;
setting scene attribute labels, collecting the scene samples in the scene data set, and adding the scene attribute labels to scene positive samples;
generating a semantic training set according to the pedestrian positive samples and the scene positive samples with the attribute labels;
inputting the semantic training set and the pedestrian data set, in the form of different tasks, into the multi-object convolutional deep network model for training, to obtain the pedestrian in the target image.
3. The pedestrian detection method according to claim 2, characterized in that the setting person attribute labels, collecting the pedestrian samples in the pedestrian data set, and adding the person attribute labels to pedestrian positive samples comprises:
collecting the positive and negative samples in the pedestrian data set, and organizing the positive and negative samples into two tree structures;
selecting, from the child nodes output by the tree structures, two child nodes separated by a predetermined distance, to obtain the pedestrian positive samples and pedestrian negative samples, and adding the person attribute labels to the pedestrian positive samples.
4. The pedestrian detection method according to any one of claims 1 to 3, characterized in that the multi-object convolutional deep network model comprises a task-assistant convolutional neural network TA-CNN framework;
the network structure of the TA-CNN framework includes four convolutional layers, four max-pooling layers, and two fully connected layers.
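Purely as an illustrative sketch of such a stack (the kernel sizes, strides, padding, and input resolution below are assumptions, not taken from the patent), the spatial size of the feature map after the four conv + max-pool stages can be traced as:

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output spatial size of a convolution (floor division)."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output spatial size of a max-pooling layer."""
    return (size - kernel) // stride + 1

def trace_backbone(size):
    """Trace an assumed 4x (conv 3x3 pad 1 -> max-pool 2x2) backbone;
    the two fully connected layers then operate on the flattened final map."""
    sizes = [size]
    for _ in range(4):
        size = conv_out(size, kernel=3, pad=1)  # 'same' conv keeps the size
        size = pool_out(size)                   # pooling halves it
        sizes.append(size)
    return sizes
```

With these assumed hyperparameters, each stage halves the spatial resolution, so a 64-pixel input shrinks to a 4-pixel feature map before the fully connected layers.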
5. The pedestrian detection method according to claim 1, characterized in that the identifying a specific object in the target image through a convolutional neural network VGG19, and adding a second identification frame to the specific object in the target image, comprises:
collecting sample pictures of the specific object;
cleaning and labeling the sample pictures;
identifying the specific object in the target image through the edited class labels and VGG19.
6. A pedestrian detection system, characterized by comprising:
a pedestrian identification module, configured to identify a pedestrian in a target image through a multi-object convolutional deep network model, and to add a first identification frame to the pedestrian in the target image;
wherein the training tasks of the multi-object convolutional deep network model include semantic tasks and pedestrian detection;
a specific-object identification module, configured to identify a specific object in the target image through a convolutional neural network VGG19, and to add a second identification frame to the specific object in the target image;
a monitoring module, configured to judge whether the first identification frame and the second identification frame overlap; if they overlap, to determine that the pedestrian carries the specific object, and to trigger a preset monitoring event.
7. The pedestrian detection system according to claim 6, characterized in that the pedestrian identification module comprises:
a pedestrian data set acquiring unit, configured to obtain a pedestrian data set in the target image;
a pedestrian positive sample marking unit, configured to set person attribute labels, collect the pedestrian samples in the pedestrian data set, and add the person attribute labels to pedestrian positive samples;
a scene data set acquiring unit, configured to obtain a scene data set in the target image;
a scene positive sample marking unit, configured to set scene attribute labels, collect the scene samples in the scene data set, and add the scene attribute labels to scene positive samples;
a semantic training set generation unit, configured to generate a semantic training set according to the pedestrian positive samples and the scene positive samples with the attribute labels;
a model training unit, configured to input the semantic training set and the pedestrian data set, in the form of different tasks, into the multi-object convolutional deep network model for training, to obtain the pedestrian in the target image.
8. The pedestrian detection system according to claim 7, characterized in that the pedestrian positive sample marking unit is further configured to:
collect the positive and negative samples in the pedestrian data set, and organize the positive and negative samples into two tree structures;
select, from the child nodes output by the tree structures, two child nodes separated by a predetermined distance, to obtain the pedestrian positive samples and pedestrian negative samples, and add the person attribute labels to the pedestrian positive samples.
9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, characterized in that, when the processor executes the computer program, each step of the pedestrian detection method according to any one of claims 1 to 5 is realized.
10. A storage medium, the storage medium being a computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, each step of the pedestrian detection method according to any one of claims 1 to 5 is realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910397924.2A CN110245564B (en) | 2019-05-14 | 2019-05-14 | Pedestrian detection method, system and terminal equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110245564A true CN110245564A (en) | 2019-09-17 |
CN110245564B CN110245564B (en) | 2024-07-09 |
Family
ID=67884440
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910397924.2A Active CN110245564B (en) | 2019-05-14 | 2019-05-14 | Pedestrian detection method, system and terminal equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110245564B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111178403A (en) * | 2019-12-16 | 2020-05-19 | 北京迈格威科技有限公司 | Method and device for training attribute recognition model, electronic equipment and storage medium |
CN111553228A (en) * | 2020-04-21 | 2020-08-18 | 佳都新太科技股份有限公司 | Method, device, equipment and storage medium for detecting personal bag relationship |
CN111881791A (en) * | 2020-07-16 | 2020-11-03 | 北京宙心科技有限公司 | Security identification method and system |
CN112418043A (en) * | 2020-11-16 | 2021-02-26 | 安徽农业大学 | Corn weed occlusion determination method and device, robot, equipment and storage medium |
CN112837454A (en) * | 2021-01-28 | 2021-05-25 | 深圳市商汤科技有限公司 | Passage detection method and device, electronic equipment and storage medium |
WO2022227772A1 (en) * | 2021-04-27 | 2022-11-03 | 北京百度网讯科技有限公司 | Method and apparatus for training human body attribute detection model, and electronic device and medium |
CN115527158A (en) * | 2022-08-11 | 2022-12-27 | 北京市燃气集团有限责任公司 | Method and device for detecting abnormal behaviors of personnel based on video monitoring |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160180195A1 (en) * | 2013-09-06 | 2016-06-23 | Toyota Jidosha Kabushiki Kaisha | Augmenting Layer-Based Object Detection With Deep Convolutional Neural Networks |
CN106485268A (en) * | 2016-09-27 | 2017-03-08 | 东软集团股份有限公司 | A kind of image-recognizing method and device |
CN108875501A (en) * | 2017-11-06 | 2018-11-23 | 北京旷视科技有限公司 | Human body attribute recognition approach, device, system and storage medium |
CN109359515A (en) * | 2018-08-30 | 2019-02-19 | 东软集团股份有限公司 | A kind of method and device that the attributive character for target object is identified |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110245564A (en) | A kind of pedestrian detection method, system and terminal device | |
Tian et al. | Multimodal deep representation learning for video classification | |
Xu et al. | Video structured description technology based intelligence analysis of surveillance videos for public security applications | |
Yuen et al. | A data-driven approach for event prediction | |
Malgireddy et al. | Language-motivated approaches to action recognition | |
US8842965B1 (en) | Large scale video event classification | |
EP3765995B1 (en) | Systems and methods for inter-camera recognition of individuals and their properties | |
Hu et al. | Video structural description technology for the new generation video surveillance systems | |
Kaliyar et al. | A Hybrid Model for Effective Fake News Detection with a Novel COVID-19 Dataset. | |
CN110972499A (en) | Labeling system of neural network | |
Ding et al. | Prior knowledge-based deep learning method for indoor object recognition and application | |
Tian et al. | MCA-NN: Multiple correspondence analysis based neural network for disaster information detection | |
Alonso-Bartolome et al. | Multimodal fake news detection | |
CN112200176A (en) | Method and system for detecting quality of face image and computer equipment | |
CN116975340A (en) | Information retrieval method, apparatus, device, program product, and storage medium | |
CN111027622A (en) | Picture label generation method and device, computer equipment and storage medium | |
Gautam et al. | Discrimination and detection of face and non-face using multilayer feedforward perceptron | |
Al-Jamal et al. | Image captioning techniques: a review | |
CN112507912B (en) | Method and device for identifying illegal pictures | |
Abdallah et al. | Multilevel deep learning-based processing for lifelog image retrieval enhancement | |
Wang et al. | A lightweight CNN model based on GhostNet | |
Liu et al. | Determining the best attributes for surveillance video keywords generation | |
Tao et al. | Florida international university-university of miami trecvid 2019 | |
Shi et al. | Uncertain and biased facial expression recognition based on depthwise separable convolutional neural network with embedded attention mechanism | |
CN113297934A (en) | Multi-mode video behavior analysis method for detecting internet violent harmful scene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |