CN112766361A - Target fruit detection method and detection system under homochromatic background - Google Patents

Publication number
CN112766361A
Authority
CN
China
Legal status: Pending
Application number
CN202110061551.9A
Other languages
Chinese (zh)
Inventor
贾伟宽
张中华
邵文静
刘杰
侯素娟
Current Assignee
Shandong Normal University
Original Assignee
Shandong Normal University
Application filed by Shandong Normal University
Priority to CN202110061551.9A
Publication of CN112766361A

Classifications

    • G06F18/2415 — Pattern recognition; classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio
    • G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/253 — Pattern recognition; fusion techniques of extracted features
    • G06N3/08 — Neural networks; learning methods
    • G06V10/22 — Image preprocessing by selection of a specific region containing or referencing a pattern
    • G06V20/68 — Scenes; type of objects; food, e.g. fruit or vegetables
    • G06V2201/07 — Indexing scheme relating to image or video recognition; target detection


Abstract

The present disclosure provides a method and a system for detecting target fruits against a homochromatic (same-color) background, comprising the following steps: acquiring image data of a target fruit against a homochromatic background, and preprocessing the images; extracting image features from the acquired image data with a deep convolutional network, and fusing the image features with a feature pyramid network to obtain fused prediction feature maps; and predicting on the feature map of each level of the feature pyramid network, generating predicted values for the target fruit in a fully convolutional manner through separate classification and regression branches. The method combines a deep convolutional network and a feature pyramid network to extract feature maps and predicts in a single-stage, fully convolutional manner; it identifies fruit efficiently in both accuracy and speed, is highly robust in homochromatic background environments, and meets the requirements of practical operation.

Description

Target fruit detection method and detection system under homochromatic background
Technical Field
The present disclosure relates to the technical field of intelligent agriculture, and in particular to a target fruit detection method and detection system for homochromatic backgrounds.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
In the traditional fruit and vegetable industry, every stage of the production cycle is currently dominated by manual labor, which is time-consuming, labor-intensive, costly, and inefficient, so automating production in complex orchard environments is an inevitable trend for the industry. The timing of variable-rate pesticide and fertilizer spraying, yield estimation, and intelligent picking is generally determined by detecting the actual condition of the fruit, so accurate and rapid detection of fruit targets is of great significance.
The inventors found that recognizing target fruits in a real orchard environment is usually hampered by interference such as branch occlusion, overlapping fruit, and illumination changes. For green fruits in particular, the fruit color is very similar to the background color of the leaves, so leaves and green fruits are easily confused with each other; this sharply increases the difficulty of fruit recognition and hinders the intelligent management of orchards.
A certain research foundation has been accumulated in this field, mostly based on machine learning or deep learning. Machine-learning-based recognition methods usually require preprocessing, feature selection, and similar operations, cannot realize an end-to-end detection process, and their recognition results are easily affected by the various interferences of the natural environment. Deep-learning-based recognition methods clearly improve accuracy and can realize an end-to-end detection process, but convolution operations and the models' dependence on anchor boxes consume large amounts of computing and storage resources, so the recognition speed cannot meet real-time requirements.
Disclosure of Invention
The present disclosure provides a target fruit detection method and detection system for homochromatic backgrounds that can meet the detection requirements of intelligent agricultural applications such as intelligent picking, variable-rate pesticide and fertilizer spraying, and yield estimation, while improving both detection speed and accuracy.
To achieve this purpose, the present disclosure adopts the following technical solution:
One or more embodiments provide a method for detecting a target fruit against a homochromatic background, comprising the following steps:
acquiring image data of a target fruit against a homochromatic background, and preprocessing the images;
extracting image features from the acquired image data with a deep convolutional network, and fusing the image features with a feature pyramid network to obtain fused prediction feature maps; and
predicting on the feature map of each level of the feature pyramid network, and generating predicted values for the target fruit in a fully convolutional manner through the two branches of classification and regression.
One or more embodiments provide a target fruit detection system for homochromatic backgrounds, comprising:
an image acquisition module, configured to acquire image data of a target fruit against a homochromatic background and preprocess the images;
a feature map extraction and fusion module, configured to extract image features from the acquired image data with a deep convolutional network and fuse them with a feature pyramid network to obtain fused prediction feature maps; and
a prediction module, configured to predict on the feature map of each level of the feature pyramid network and generate predicted values for the target fruit in a fully convolutional manner through the two branches of classification and regression.
An electronic device, comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the computer instructions, when executed by the processor, perform the steps of the above method.
A computer readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the above method.
Compared with the prior art, the beneficial effects of the present disclosure are:
(1) On the premise of guaranteeing accuracy, the method increases speed, balancing accuracy against efficiency, and, combined with the practical operating requirements of various automated applications in the orchard environment, provides a target fruit detection model with high accuracy, high speed, strong robustness, and good adaptability.
(2) The method can quickly and accurately locate target fruits against a homochromatic background. Predicting in a single-stage, fully convolutional manner, it identifies fruit efficiently in both accuracy and speed, is robust in homochromatic background environments, and meets the requirements of practical operation.
(3) The method eliminates the dependence of mainstream detection algorithms on anchor boxes, markedly improving algorithm complexity, detection speed, storage footprint, and adaptability, and effectively improving the stability and applicability of the model when deployed in various intelligent agriculture applications.
Advantages of additional aspects of the disclosure will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the disclosure.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and not to limit the disclosure.
FIG. 1 is a flowchart of the overall detection method of embodiment 1 of the present disclosure;
fig. 2 shows fruit images collected under different interference scenes in the same color system background according to embodiment 1 of the present disclosure;
FIG. 3 is a diagram of a structure of a result prediction stage of a detection model corresponding to the detection method in embodiment 1 of the present disclosure;
FIG. 4 is a diagram of the effect of fruits of different scales predicted by the detection method of embodiment 1 of the present disclosure;
Fig. 5 is a schematic diagram of the partition of mapping a forward sampling region onto an input target fruit picture according to embodiment 1 of the present disclosure;
FIG. 6 is an overall flowchart of a single iteration in the training process of the detection model according to embodiment 1 of the present disclosure;
fig. 7 shows the detection effect of embodiment 1 of the present disclosure on fruits against two homochromatic backgrounds.
Detailed Description
the present disclosure is further described with reference to the following drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments in the present disclosure may be combined with each other. The embodiments will be described in detail below with reference to the accompanying drawings.
Example 1
In one or more embodiments, as shown in fig. 1, a method for detecting a target fruit against a homochromatic background includes the following steps:
step 1, acquiring image data of a target fruit under the background of the same color system, and preprocessing the image;
step 2, extracting image features of the acquired image data by adopting a depth convolution network, and fusing the image features by using a feature pyramid network to obtain a fused prediction feature map;
and 3, respectively predicting the feature map of each level of the feature pyramid network, and generating a predicted value of the target fruit by a full convolution method through classification and regression of two branches.
In this embodiment, a deep convolutional network and a feature pyramid network are combined to extract feature maps, and prediction is carried out in a single-stage, fully convolutional manner; fruits can be identified efficiently in both accuracy and speed, robustness is strong in homochromatic background environments, and the requirements of practical operation are met.
In step 1, images of different kinds of fruit can be collected against homochromatic backgrounds with an image acquisition device such as a camera.
The preprocessing includes filling (padding) and cropping the images.
In step 2, the method of extracting image features from the acquired image data with a deep convolutional network and fusing them with a feature pyramid network is specifically as follows: image features are extracted with a convolutional neural network (ResNet) as the backbone; the feature maps output by each residual block in ResNet are fused in a top-down, laterally connected fashion so that deep and shallow feature maps carry the same level of semantics, gradually enriching the semantic representation of the lower-level feature maps; the resulting feature pyramid is the fused prediction feature map.
in this embodiment, the method for extracting image features specifically includes: outputting the image to a residual error network ResNet by taking batch as a unit, and performing convolution and pooling operation; feature expression capabilities contained in deep feature maps are gradually enriched by convolution and pooling operations.
In this embodiment, the method for fusing image features specifically includes: and fusing the feature maps with different sizes output by each residual block in ResNet according to a top-down and transverse connection mode, so that the deep feature map and the shallow feature map have the same level of semantic capacity, and a feature pyramid is obtained.
Optionally, the lateral connections change the number of channels to a fixed value, such as 256, by a 1 × 1 convolution;
optionally, the top-down path may upsample to the same size by nearest-neighbor interpolation, with the deep and shallow features finally fused by pixel-wise addition.
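The lateral-plus-top-down fusion just described can be sketched in NumPy. This is a toy illustration: the 1 × 1 lateral convolutions are assumed to have already produced fixed-channel (e.g. 256) maps, and the function names are illustrative:

```python
import numpy as np

def upsample_nearest(x, factor=2):
    # Nearest-neighbour interpolation: repeat each pixel along H and W.
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def fuse_top_down(laterals):
    """Fuse laterally-connected maps (deepest, lowest-resolution first)
    top-down by upsampling and pixel-wise addition, as in a feature
    pyramid. Each map is assumed to already have the fixed channel
    count produced by the 1x1 lateral convolutions.
    """
    fused = [laterals[0]]
    for lat in laterals[1:]:
        fused.append(lat + upsample_nearest(fused[-1]))
    return fused[::-1]  # shallowest (highest resolution) first
```

With each successive lateral map twice the spatial size of the previous one, the upsampled deep semantics are added into every shallower level.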
The prediction model constructed by combining the deep convolutional network with the feature pyramid network effectively improves detection of target fruits at different scales, particularly small-scale target fruits.
In step 3, prediction is performed separately on the feature map of each level of the feature pyramid network. Through the two branches of classification and regression, a class-sensitive semantic map is predicted in a fully convolutional manner as the probability that a fruit exists, together with the mapping from each positive sample to the center point and box coordinates in the original image; these constitute the generated predicted values of the target fruit. The steps are specifically as follows:
step 31, assigning each target fruit annotation box, according to its scale, to the feature map of the appropriate level for prediction; obtaining the positive sampling region of that annotation box on the feature map with a shrink factor; and deciding whether each spatial position on the feature map is a positive or negative sample;
The scale-based assignment is specifically: according to the area of the target fruit annotation box, the box is assigned to the feature map best suited to predicting it.
The positive/negative sample decision is specifically: denote the current feature map Pl; map the annotation box it is responsible for onto Pl according to the downsampling factor s to obtain the corresponding region, then shrink that region by the shrink factor σ to obtain the positive sampling region Rpos; spatial positions inside Rpos are positive samples, and all others are negative samples.
Step 32: for each positive sample, predict, through the classification and box regression branches, the confidence that it belongs to a fruit and the regularized offsets between the positive sample and the ground-truth annotation box.
Optionally, as shown in fig. 6, the procedure of steps 1-3 may be implemented in a fully convolutional neural network model fused with a feature pyramid network. The target fruit detection model of this embodiment is such a model, and its structure is comparatively simple: a backbone network responsible for extracting features, a feature pyramid responsible for fusing features, and prediction branch networks responsible for generating the prediction results, connected in sequence; the backbone network and the prediction branch networks each adopt convolutional neural networks.
Optionally, the method for training the full convolution neural network model fused with the pyramid network includes the following steps:
(1) Acquire fruit images containing different types of interference against homochromatic backgrounds, preprocess them, and annotate each image with the minimum enclosing rectangle of the target fruit to obtain the annotation information of the fruit image.
Optionally, green fruits are selected and photographed against homochromatic backgrounds so that the acquired images contain as many different types of interference as possible, representing the real orchard environment, as shown in fig. 2.
The collected images are resized to a uniform 600 × 400, and the minimum enclosing rectangle of the target fruit in each image is then annotated. Label files can be generated in the MS COCO dataset format with the labelme image annotation tool, which makes it convenient to generate the model's training targets later.
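An MS COCO detection label stores each object as a dict whose `bbox` is `[x, y, width, height]`, with `area` and `iscrowd` fields. A hedged sketch of converting one labelme rectangle (two corner points) into such an annotation entry — the helper name and id bookkeeping are illustrative, not part of the patent:

```python
def labelme_rect_to_coco(points, image_id, category_id, ann_id):
    """Convert a labelme rectangle (two corner points, in any order)
    into an MS COCO style annotation dict.

    COCO stores boxes as [x, y, width, height]; labelme rectangles
    store two opposite corners.
    """
    (xa, ya), (xb, yb) = points
    x1, y1 = min(xa, xb), min(ya, yb)
    w, h = abs(xb - xa), abs(yb - ya)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x1, y1, w, h],
        "area": w * h,
        "iscrowd": 0,
    }
```

A full converter would also emit the COCO `images` and `categories` lists, but the per-box entry above is the part the training targets depend on.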
(2) And extracting image features of the acquired image data by adopting a depth convolution network, and fusing the image features by using a feature pyramid network.
Image features are extracted with ResNet, and the outputs of the last three residual blocks, conv3, conv4, and conv5, are used for feature fusion to construct the feature pyramid network; they are denoted {C3, C4, C5}.
In this embodiment, the feature fusion steps are: C5 is passed through a 1 × 1 convolution that changes the number of channels to 256, giving a new feature map denoted P5; P5 is then downsampled by stride-2 convolutions to obtain P6 and P7 in turn; {C3, C4, C5} are then processed with lateral connections and a top-down structure to obtain the {P3, P4, P5} layers.
The lateral connections change the number of channels to a fixed value, which may be 256, with a 1 × 1 convolution.
The top-down path upsamples to the same size by nearest-neighbor interpolation, and finally the deep and shallow features are fused by pixel-wise addition, yielding the final fused feature pyramid {P3, P4, P5, P6, P7}.
(3) Predict separately on the feature map of each level of the feature pyramid network, and generate the predicted values of the target fruit in a fully convolutional manner through the two branches of classification and regression.
this step corresponds to the method of step 3 described above. Specifically, as shown in FIG. 3, for { P3,P4,P5,P6,P7Predicting each layer feature map separately, and assuming the current feature map is Pl∈RH×W×CRespectively inputting the semantic graph into a classification and regression subnet, and predicting a category-sensitive semantic graph P in a full convolution model cls∈RH×W×1As the probability of fruit existence, and the mapping relation P of the center point of the original image and the frame coordinate corresponding to a positive samplel reg∈RH×W×4. Taking the regression subnet as an example, it first generates the prediction value by 4 convolutions of 3 × 3, each convolution containing C convolution kernels and being activated by ReLU, and finally, by one convolution of 3 × 3 containing 4 convolution kernels.
(4) Determine the training target of each spatial position on the feature maps according to the annotation information (annotation boxes) from step (1), as follows:
(4-1) Scale assignment: according to the size of the target fruit annotation box in the image annotation information, assign the annotation box to the feature map of the appropriate level, as in step 31 above.
Specifically, in this step, for more stable model training, each layer feature map in {P3, P4, P5, P6, P7} is assigned a basic scale rl ∈ {32, 64, 128, 256, 512}; the scale range of the target fruit annotation boxes (i.e., ground-truth boxes) that the l-th layer feature map is responsible for predicting is:
[rl/η,rl·η] (1)
The value of the hyperparameter η controlling the scale range is tuned according to the dataset formed from the data acquired in step (1) and the evaluation performance on the validation set split from it. This approach increases the number of positive-sample spatial positions in each layer, relieves the positive/negative sample imbalance to a certain extent, and helps optimize the semantic expression of feature maps at adjacent levels.
When an image is predicted according to this feature-level assignment strategy, the prediction effect of all spatial positions in the positive sampling region on the feature map responsible for prediction is mapped back onto the original image, as shown in fig. 4. The figure contains two fruits, a persimmon and an apple; they have different scales and are assigned to feature maps of different levels for prediction, the dark-colored feature map layer in the figure being the assigned region.
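The scale assignment can be sketched as a small helper. The basic scales {32, 64, 128, 256, 512} come from the description; η = 2.0 is an assumed value (the patent tunes η on a validation set), and since adjacent ranges then overlap, resolving ties to the shallowest matching level is also an assumption:

```python
import math

def assign_level(box_w, box_h, base_scales=(32, 64, 128, 256, 512),
                 levels=(3, 4, 5, 6, 7), eta=2.0):
    """Assign a ground-truth box to the pyramid level whose valid scale
    range [r_l/eta, r_l*eta] contains sqrt(box area).

    eta=2.0 and the tie-breaking to the shallowest matching level are
    illustrative assumptions, not values fixed by the patent.
    """
    scale = math.sqrt(box_w * box_h)
    for r, lvl in zip(base_scales, levels):
        if r / eta <= scale <= r * eta:
            return lvl
    # Boxes outside every range fall back to the nearest extreme level.
    return levels[0] if scale < base_scales[0] / eta else levels[-1]
```

With η > sqrt(2) the ranges overlap, so one box can be valid on several levels — which is exactly how this strategy increases the number of positive positions per layer.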
(4-2) Positive/negative sample decision: the positive sample region is obtained through the downsampling factor and the shrink factor, in the same way as step 31.
In this embodiment, specifically, let a ground-truth box be G = (x1, y1, x2, y2), where the ground-truth box is an annotation box annotated in the target fruit image in step (1), and let it be predicted by the feature map Pl of the current level. Denoting the downsampling ratio between the l-th layer feature map and the original image as sl, G is mapped onto Pl to obtain the feature region G' = (x'1, y'1, x'2, y'2):
x'1 = x1/sl, y'1 = y1/sl, x'2 = x2/sl, y'2 = y2/sl    (2)
The center coordinates (c'x, c'y), width w', and height h' of G' are:
c'x = (x'1 + x'2)/2, c'y = (y'1 + y'2)/2, w' = x'2 - x'1, h' = y'2 - y'1    (3)
G' is then shrunk by the shrink factor σ to obtain the positive sampling region Rpos, whose coordinates are:
x''1 = c'x - 0.5·σ·w', y''1 = c'y - 0.5·σ·h'
x''2 = c'x + 0.5·σ·w', y''2 = c'y + 0.5·σ·h'    (4)
In the training phase, all spatial positions inside Rpos are treated as positive samples, whose class target is the annotation class of G; all spatial positions outside Rpos are treated as negative samples. The mapping onto the input picture is shown in fig. 5.
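The mapping and shrinking of the ground-truth box into the positive sampling region Rpos can be sketched directly; σ = 0.4 is an assumed value here (a common FoveaBox-style setting), since the patent leaves the shrink factor unspecified:

```python
def positive_region(box, s_l, sigma=0.4):
    """Map a ground-truth box (x1, y1, x2, y2) onto level-l feature map
    coordinates (downsampling ratio s_l) and shrink it by sigma to get
    the positive sampling region Rpos.

    sigma=0.4 is an assumed default, not a value stated in the patent.
    """
    x1, y1, x2, y2 = (v / s_l for v in box)
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w, h = x2 - x1, y2 - y1
    return (cx - 0.5 * sigma * w, cy - 0.5 * sigma * h,
            cx + 0.5 * sigma * w, cy + 0.5 * sigma * h)

def is_positive(x, y, region):
    # A spatial position (x, y) inside the shrunk region is a positive sample.
    rx1, ry1, rx2, ry2 = region
    return rx1 <= x <= rx2 and ry1 <= y <= ry2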
(4-3) Determine the training target of each spatial position on the feature map, comprising a classification target and a regression target; specifically, the classification target and the regularized box-offset target can be obtained from the annotation box information corresponding to each positive sampling point.
Because the scale of the target fruit varies widely across the target fruit images, directly regressing box coordinate values is unstable; this embodiment therefore instead regresses the regularized offsets between the predicted box and the ground-truth box. A positive sample point (x, y) is first mapped onto the input picture through the downsampling ratio sl, and its regularized offsets from the four edges of the ground-truth box G are then computed.
For each positive sample (x, y) in Rpos, the regularized distances to the four edges of the ground-truth box G are computed directly and taken as the regression target of that point, expressed as (tx1, ty1, tx2, ty2) and defined as:
tx1 = log((sl·(x + 0.5) - x1)/rl), ty1 = log((sl·(y + 0.5) - y1)/rl)
tx2 = log((x2 - sl·(x + 0.5))/rl), ty2 = log((y2 - sl·(y + 0.5))/rl)    (5)
where sl is the downsampling ratio and rl is the basic scale assigned to each layer feature map.
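The regularized offsets defined above can be computed with a small helper. The FoveaBox-style mapping sl·(coord + 0.5), division by the base scale rl, and log transform follow the description; the helper name is illustrative:

```python
import math

def regression_target(x, y, box, s_l, r_l):
    """Regularized offsets from a positive sample (x, y) on the level-l
    feature map to the four edges of the ground-truth box.

    The point is mapped back to the input image via s_l*(coord + 0.5),
    and each edge distance is normalized by the base scale r_l and
    log-transformed, in the FoveaBox style the description follows.
    """
    x1, y1, x2, y2 = box
    px, py = s_l * (x + 0.5), s_l * (y + 0.5)
    return (math.log((px - x1) / r_l), math.log((py - y1) / r_l),
            math.log((x2 - px) / r_l), math.log((y2 - py) / r_l))
```

Since positive samples lie strictly inside the shrunk region, all four distances are positive and the logarithms are well defined.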
(5) Compute the loss between the predicted values of the target fruit from step (3) and the training targets from step (4), update the network parameters by gradient backpropagation (SGD), and train and evaluate iteratively to obtain the optimal model.
In this embodiment, the loss between the predicted value of the target fruit and the training target includes classification loss and regression loss.
Optionally, for the classification loss: because the number of positive samples is relatively small, there is a certain positive/negative sample imbalance, so this embodiment adopts the Focal Loss function to compute the loss value.
Alternatively, for the regression Loss, a Smooth L1 Loss function may be used for the calculation.
The total loss can be expressed by formula (6):
L = (1/Ncls)·Σi Lcls(pi, p*i) + λ·(1/Nreg)·Σi p*i·Lreg(ti, t*i)    (6)
As shown in formula (6), Lcls and Lreg are the losses produced by the classification and regression branches of the network respectively; p*i and t*i are the classification and regression targets corresponding to the i-th spatial position on the feature map; pi and ti are the classification and regression predicted values corresponding to the i-th spatial position; and λ regulates the balance between the two losses. When Focal Loss is used to compute the classification loss, the α parameter regulates the imbalance between the numbers and importance of positive and negative samples, and the γ parameter regulates the imbalance between hard and easy samples, preventing the model degradation caused by simple negative samples dominating the loss during training. When Smooth L1 Loss is used to compute the regression loss, the β parameter switches between different loss forms according to the magnitude of the error, overcoming both the slow convergence of L1 loss and the sensitivity of L2 loss to outliers. In addition, the two losses are regularized by Ncls and Nreg respectively. Finally, the model parameters are updated by backpropagating the gradient of LFoveaBox, and training and evaluation are iterated repeatedly to obtain the optimal model; the training process of a single iteration of the network is shown in fig. 6. The final prediction effect of the model is shown in fig. 7, which contains persimmon and apple fruits against homochromatic backgrounds; it can be seen that the target fruits are accurately detected.
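A hedged NumPy sketch of the total loss described in formula (6) — Focal Loss for classification over all positions and Smooth L1 for regression over positives only, normalized by Ncls and Nreg. The defaults α = 0.25, γ = 2, β = 1 are the common settings from the respective loss papers, not values stated in the patent:

```python
import numpy as np

def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary focal loss; alpha/gamma defaults are the usual Focal Loss
    settings and are assumptions here."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    pt = np.where(target == 1, p, 1 - p)
    a = np.where(target == 1, alpha, 1 - alpha)
    return -(a * (1 - pt) ** gamma * np.log(pt))

def smooth_l1(t, target, beta=1.0):
    # Quadratic for small errors, linear for large ones.
    d = np.abs(t - target)
    return np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta)

def total_loss(p, p_star, t, t_star, lam=1.0):
    """Classification loss over all positions plus lambda times the
    regression loss over positive positions, each normalized by its
    own count as the description states."""
    n_cls = max(p.size, 1)
    pos = p_star == 1
    n_reg = max(int(pos.sum()), 1)
    l_cls = focal_loss(p, p_star).sum() / n_cls
    l_reg = smooth_l1(t[pos], t_star[pos]).sum() / n_reg
    return l_cls + lam * l_reg
```

Better classification confidences strictly lower the focal term, and only positive positions contribute to the regression term, matching the p*i factor in the sum.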
Example 2
Based on embodiment 1, this embodiment provides a target fruit detection system under the background of the same color system, including:
an image acquisition module: configured to acquire image data of a target fruit under the background of the same color system and to preprocess the images;
a feature map extraction and fusion module: configured to extract image features from the acquired image data using a deep convolutional network, and to fuse them through a feature pyramid network to obtain fused prediction feature maps;
a prediction module: configured to make predictions on the feature map of each level of the feature pyramid network respectively, generating predicted values of the target fruit through classification and regression branches in a fully convolutional manner.
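The three modules can be outlined as a minimal skeleton. Note the backbone and heads here are stand-ins (simple 2x downsampling and zero-filled outputs), not the trained ResNet/FPN network of embodiment 1; the class and method names are hypothetical.

```python
import numpy as np

class FruitDetector:
    """Skeleton mirroring the three modules of the system."""

    def __init__(self, num_levels=3):
        self.num_levels = num_levels

    def acquire(self, image):
        # image acquisition module: normalise pixel values to [0, 1]
        return image.astype(np.float32) / 255.0

    def extract_and_fuse(self, img):
        # feature extraction + fusion module (stand-in pyramid built
        # by successive 2x downsampling instead of ResNet + FPN)
        levels, x = [], img
        for _ in range(self.num_levels):
            levels.append(x)
            x = x[::2, ::2]
        return levels

    def predict(self, levels):
        # prediction module: one class score and a 4-d box offset per
        # spatial position of every pyramid level (stand-in: zeros)
        return [(np.zeros(lvl.shape), np.zeros(lvl.shape + (4,)))
                for lvl in levels]
```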
Example 3
The present embodiment provides an electronic device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor; when executed by the processor, the computer instructions perform the steps of the method of embodiment 1.
Example 4
The present embodiment provides a computer readable storage medium for storing computer instructions which, when executed by a processor, perform the steps of the method of embodiment 1.
The above description covers only preferred embodiments of the present disclosure and is not intended to limit it; those skilled in the art may make various modifications and changes to the present disclosure. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present disclosure shall fall within its protection scope.
Although the present disclosure has been described with reference to specific embodiments, it should be understood that the scope of the present disclosure is not limited thereto, and those skilled in the art will appreciate that various modifications and changes can be made without departing from the spirit and scope of the present disclosure.

Claims (10)

1. A method for detecting target fruits under the background of the same color system is characterized by comprising the following steps:
acquiring image data of a target fruit under a homochromatic background, and preprocessing the images;
extracting image features from the acquired image data using a deep convolutional network, and fusing the image features through a feature pyramid network to obtain a fused prediction feature map;
and making predictions on the feature map of each level of the feature pyramid network respectively, generating predicted values of the target fruit through classification and regression branches in a fully convolutional manner.
2. The method for detecting a target fruit under the background of the same color system according to claim 1, wherein extracting image features from the acquired image data using a deep convolutional network and fusing them through a feature pyramid network to obtain a fused prediction feature map comprises:
transmitting the target fruit image to a residual network (ResNet) and performing convolution and pooling operations;
and fusing the feature maps of different sizes output by the residual blocks of ResNet in a top-down, laterally connected manner, so that deep and shallow feature maps carry the same level of semantic information, yielding a feature pyramid.
3. The method for detecting a target fruit under the background of the same color system according to claim 2, wherein the fusion is performed top-down with lateral connections: the lateral connection changes the number of channels to a fixed value by 1 × 1 convolution; the top-down pathway upsamples the deeper map to the same size by nearest-neighbor interpolation; and finally the deep features and the shallow features are fused by pixel-wise addition.
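The top-down fusion with lateral connections can be sketched as follows. This is a minimal NumPy illustration under assumed tensor shapes; the 1 × 1 convolution weights are hypothetical stand-ins rather than trained parameters.

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W); w: (C_out, C_in). A 1x1 convolution is a
    # per-pixel linear map over channels, changing C_in to C_out,
    # which is how the lateral connection fixes the channel count.
    return np.tensordot(w, x, axes=([1], [0]))

def nearest_upsample_2x(x):
    # (C, H, W) -> (C, 2H, 2W): each pixel is repeated, which is
    # exactly nearest-neighbour interpolation for a factor of 2.
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse(top_down, lateral, w_lateral):
    # top_down: deeper, coarser map already at the fixed channel
    # count; lateral: shallower backbone map. The lateral branch is
    # projected by 1x1 convolution, the top-down branch is upsampled,
    # and the two are fused by pixel-wise addition.
    return nearest_upsample_2x(top_down) + conv1x1(lateral, w_lateral)
```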
4. The method for detecting a target fruit under the background of the same color system according to claim 1, wherein making predictions on the feature map of each level of the feature pyramid network and generating predicted values of the target fruit through classification and regression branches in a fully convolutional manner specifically comprises:
assigning each target fruit labeling box to a feature map of a different level according to its scale, then obtaining, with a shrink factor, the positive sampling area of the labeling box that each level is responsible for predicting on its feature map, and determining each spatial position on the feature map as a positive or negative sample;
and for each positive sample, predicting, through the classification and box-regression branches, the confidence that it belongs to a fruit and the regularized offset between it and the target fruit labeling box.
5. The method for detecting a target fruit under the background of the same color system according to claim 1, wherein the target fruit detection method is implemented in a fully convolutional neural network model incorporating a feature pyramid network; the model structure comprises a backbone network responsible for extracting features, a feature pyramid responsible for fusing the features, and prediction branch networks responsible for generating prediction results; the backbone network, the feature pyramid and the prediction branch networks are connected in sequence, and the backbone network and the prediction branch networks each adopt a convolutional neural network.
6. The method for detecting a target fruit under the background of the same color system according to claim 5, wherein the method for training the fully convolutional neural network model incorporating the feature pyramid network comprises:
acquiring fruit images containing different types of interference under the background of the same color system, preprocessing and labeling them to obtain labeling information of the fruit images;
extracting image features from the acquired image data using a deep convolutional network, and fusing the image features through a feature pyramid network;
making predictions on the feature map of each level of the feature pyramid network respectively, and generating predicted values of the target fruit through classification and regression branches in a fully convolutional manner;
determining the training target of each spatial position on the feature maps according to the labeling information;
and calculating the loss between the predicted values of the target fruit and the training targets, updating the network parameters through gradient back-propagation, and iteratively training and evaluating to obtain the optimal model.
7. The method for detecting a target fruit under the background of the same color system according to claim 1, wherein determining the training target of each spatial position on the feature map according to the labeling information specifically comprises:
assigning each target fruit labeling box of the image labeling information to a feature map of a different level according to its size;
obtaining, according to the downsampling multiple and the shrink coefficient, the positive sampling area of the labeling box that each level is responsible for predicting on its feature map, and determining each spatial position on the feature map as a positive or negative sample;
and obtaining, for each spatial position on the feature map, a training target consisting of a classification target and a regularized box-offset target according to the labeling-box information corresponding to the positive sampling points.
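The positive-sample assignment can be sketched as follows. This is an illustrative NumPy version only: `positive_region` and `label_positions` are hypothetical helper names, and the stride and shrink-factor values used below are examples, not the values of the embodiment.

```python
import numpy as np

def positive_region(box, stride, sigma=0.4):
    # box = (x1, y1, x2, y2) in image coordinates. Map it onto the
    # feature map by the downsampling multiple (stride), then shrink
    # it around its centre by the shrink factor sigma; feature-map
    # locations inside the shrunken box become positive samples.
    x1, y1, x2, y2 = (v / stride for v in box)
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    w, h = (x2 - x1) * sigma, (y2 - y1) * sigma
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def label_positions(shape, pos_box):
    # Classify every spatial position of a feature map of the given
    # (H, W) shape as positive (inside the shrunken box) or negative.
    x1, y1, x2, y2 = pos_box
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    return (xs >= x1) & (xs <= x2) & (ys >= y1) & (ys <= y2)
```

For example, a 64 × 64 box on a stride-8 level with sigma = 0.5 shrinks to a 4 × 4 region around the box centre on the 8 × 8 feature map; only positions inside it are trained as positives, everything else as negatives.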
8. A target fruit detection system under the background of the same color system, characterized by comprising:
an image acquisition module: configured to acquire image data of a target fruit under the background of the same color system and to preprocess the images;
a feature map extraction and fusion module: configured to extract image features from the acquired image data using a deep convolutional network, and to fuse them through a feature pyramid network to obtain fused prediction feature maps;
a prediction module: configured to make predictions on the feature map of each level of the feature pyramid network respectively, generating predicted values of the target fruit through classification and regression branches in a fully convolutional manner.
9. An electronic device comprising a memory and a processor and computer instructions stored on the memory and executable on the processor, the computer instructions when executed by the processor performing the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the method of any one of claims 1 to 7.
CN202110061551.9A 2021-01-18 2021-01-18 Target fruit detection method and detection system under homochromatic background Pending CN112766361A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110061551.9A CN112766361A (en) 2021-01-18 2021-01-18 Target fruit detection method and detection system under homochromatic background

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110061551.9A CN112766361A (en) 2021-01-18 2021-01-18 Target fruit detection method and detection system under homochromatic background

Publications (1)

Publication Number Publication Date
CN112766361A true CN112766361A (en) 2021-05-07

Family

ID=75702477

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110061551.9A Pending CN112766361A (en) 2021-01-18 2021-01-18 Target fruit detection method and detection system under homochromatic background

Country Status (1)

Country Link
CN (1) CN112766361A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019192397A1 (en) * 2018-04-04 2019-10-10 华中科技大学 End-to-end recognition method for scene text in any shape
CN109344821A (en) * 2018-08-30 2019-02-15 西安电子科技大学 Small target detecting method based on Fusion Features and deep learning
CN111797846A (en) * 2019-04-08 2020-10-20 四川大学 Feedback type target detection method based on characteristic pyramid network
CN110619632A (en) * 2019-09-18 2019-12-27 华南农业大学 Mango example confrontation segmentation method based on Mask R-CNN

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
CHENCHEN ZHU ET AL.: "Feature Selective Anchor-Free Module for Single-Shot Object Detection", 《2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 *
TAO KONG ET AL.: "FoveaBox: Beyound Anchor-Based Object Detection", 《IEEE TRANSACTIONS ON IMAGE PROCESSING》 *
ZHI TIAN ET AL.: "FCOS: Fully Convolutional One-Stage Object Detection", 《2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)》 *
刘树春 等,: "《深度实践OCR基于深度学习的文字识别》", 31 May 2020, 机械工业出版社 *
岳有军 等: "基于改进Mask RCNN的复杂环境下苹果检测研究", 《中国农机化学报》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177529A (en) * 2021-05-27 2021-07-27 腾讯音乐娱乐科技(深圳)有限公司 Method, device and equipment for identifying screen splash and storage medium
CN113177529B (en) * 2021-05-27 2024-04-23 腾讯音乐娱乐科技(深圳)有限公司 Method, device, equipment and storage medium for identifying screen
CN113536986A (en) * 2021-06-29 2021-10-22 南京逸智网络空间技术创新研究院有限公司 Representative feature-based dense target detection method in remote sensing image
CN114494151A (en) * 2021-12-30 2022-05-13 山东师范大学 Fruit detection method and system under complex orchard environment
CN114549970A (en) * 2022-01-13 2022-05-27 山东师范大学 Night small target fruit detection method and system fusing global fine-grained information

Similar Documents

Publication Publication Date Title
CN112766361A (en) Target fruit detection method and detection system under homochromatic background
CN109829399B (en) Vehicle-mounted road scene point cloud automatic classification method based on deep learning
CN109711288B (en) Remote sensing ship detection method based on characteristic pyramid and distance constraint FCN
CN111695482A (en) Pipeline defect identification method
Zhang et al. Hybrid region merging method for segmentation of high-resolution remote sensing images
CN109523520A (en) A kind of chromosome automatic counting method based on deep learning
CN111126472A (en) Improved target detection method based on SSD
CN109858569A (en) Multi-tag object detecting method, system, device based on target detection network
CN110991435A (en) Express waybill key information positioning method and device based on deep learning
CN109583483A (en) A kind of object detection method and system based on convolutional neural networks
CN111612002A (en) Multi-target object motion tracking method based on neural network
CN112884742A (en) Multi-algorithm fusion-based multi-target real-time detection, identification and tracking method
CN113160062B (en) Infrared image target detection method, device, equipment and storage medium
CN112651404A (en) Green fruit efficient segmentation method and system based on anchor-frame-free detector
CN111091101B (en) High-precision pedestrian detection method, system and device based on one-step method
CN109242826B (en) Mobile equipment end stick-shaped object root counting method and system based on target detection
CN110222215A (en) A kind of crop pest detection method based on F-SSD-IV3
CN109840559A (en) Method for screening images, device and electronic equipment
CN109492596A (en) A kind of pedestrian detection method and system based on K-means cluster and region recommendation network
CN113033516A (en) Object identification statistical method and device, electronic equipment and storage medium
US11978210B1 (en) Light regulation method, system, and apparatus for growth environment of leafy vegetables
CN111259808A (en) Detection and identification method of traffic identification based on improved SSD algorithm
CN111353440A (en) Target detection method
CN114359245A (en) Method for detecting surface defects of products in industrial scene
CN113435254A (en) Sentinel second image-based farmland deep learning extraction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210507