CN117789199A

CN117789199A - Identification method and device for small fruits, terminal equipment and storage medium

Info

Publication number: CN117789199A
Application number: CN202311863345.5A
Authority: CN
Inventors: 毛亮; 李悦; 陈婉姗; 覃嘉俊; 王林琳
Original assignee: Guangzhou National Modern Agricultural Industry Science And Technology Innovation Center; Shenzhen Vocational And Technical University
Current assignee: Guangzhou National Modern Agricultural Industry Science And Technology Innovation Center; Shenzhen Vocational And Technical University
Priority date: 2023-12-29
Filing date: 2023-12-29
Publication date: 2024-03-29

Abstract

The invention discloses a method, a device, terminal equipment and a storage medium for identifying small fruits. The method comprises the following steps: acquiring a fruit image; inputting the fruit image into a fruit recognition model so that the fruit recognition model recognizes the position of each fruit in the fruit image and determines whether each fruit is mature; wherein the fruit recognition model is constructed based on a YOLOv8 model and includes a small target detection head; the small target detection head is connected with the stage layer 1 module and the second-stage up-sampling module of the YOLOv8 model, receives the outputs of the stage layer 1 module and the second-stage up-sampling module when the fruit image is identified, and generates a feature map according to those outputs. The invention improves the detection effect on small fruits.

Description

Identification method and device for small fruits, terminal equipment and storage medium

Technical Field

The present invention relates to the field of image recognition technologies, and in particular, to a method and apparatus for recognizing small fruits, a terminal device, and a storage medium.

Background

Traditional litchi harvesting relies on manual operation and is labor-intensive; as the population ages and labor costs rise, harvesting efficiency declines. In order to improve litchi harvesting efficiency, reduce fruit damage and cope with problems such as labor shortage, picking robots based on deep learning have been put into use for litchi picking.

Deep-learning-based object detection methods perform well on common data sets, but existing conventional models rarely consider the case of small objects. When faced with small objects such as litchi, they often extract insufficient features, resulting in a poor recognition effect.

Disclosure of Invention

The invention provides a method, a device, terminal equipment and a storage medium for identifying small fruits, which are used for solving the technical problem that the small fruits are poor in identification effect in the prior art.

In order to solve the above technical problems, an embodiment of the present invention provides a method for identifying small fruits, including:

acquiring a fruit image;

inputting the fruit image into a fruit recognition model so that the fruit recognition model recognizes the position of each fruit in the fruit image and determines whether each fruit is mature or not;

wherein the fruit recognition model is constructed based on a YOLOv8 model; the fruit recognition model includes: a small target detection head; the small target detection head is connected with the stage layer 1 module and the second-stage up-sampling module of the YOLOv8 model, receives the outputs of the stage layer 1 module and the second-stage up-sampling module when the fruit image is identified, and generates a feature map according to the outputs of the stage layer 1 module and the second-stage up-sampling module.

Preferably, the fruit recognition model further comprises: a selective kernel attention module;

the selective kernel attention module is connected with the SPPF module, the first-stage up-sampling module and the first-stage detection head of the YOLOv8 model;

the selective kernel attention module is used for receiving the output of the SPPF module when the fruit image is identified;

obtaining, according to the output of the SPPF module, feature maps of different scales through a plurality of parallel convolution branches with different kernel sizes;

determining selection weights according to the information of all convolution branches; fusing the feature maps of different scales according to the selection weights to obtain a fused feature map;

and transmitting the fused feature map to the first-stage up-sampling module and the first-stage detection head.

Preferably, the fruit recognition model further comprises: a plurality of RepVGG modules;

the RepVGG modules are used for replacing the C2f modules of the YOLOv8 model; and

in the training process of the fruit recognition model, the RepVGG module comprises a 3×3 convolution branch, a 1×1 convolution branch, an identity mapping branch, a BN layer and a ReLU activation; wherein the 3×3 convolution branch, the 1×1 convolution branch and the identity mapping branch are parallel to each other; and

in the inference process of the fruit recognition model, the RepVGG module comprises: a 3×3 convolution layer; the 3×3 convolution layer is obtained by performing a structural re-parameterization operation on the 3×3 convolution branch, the 1×1 convolution branch and the identity mapping branch.

Wherein the structural reparameterization operation includes:

fusing the 3×3 convolution branch with the BN layer to generate a first 3×3 convolution BN branch; fusing the 1×1 convolution branch with the BN layer to generate a 1×1 convolution BN branch; fusing the identity mapping branch with the BN layer to generate an identity mapping BN branch;

converting the convolution size of the 1×1 convolution BN branch into 3×3 to obtain a second 3×3 convolution BN branch;

and fusing the first 3×3 convolution BN branch, the second 3×3 convolution BN branch and the identity mapping BN branch to obtain the 3×3 convolution layer.

Preferably, the training process of the fruit recognition model includes:

acquiring a plurality of sample fruit images; each target object in a sample fruit image is annotated with a corresponding rectangular frame and with a corresponding label indicating whether the fruit is ripe;

and training the fruit identification model according to the sample fruit image.

On the basis of the above embodiment, another embodiment of the present invention provides an identification device for small fruits, which is characterized by comprising: an image acquisition module and an identification module;

the image acquisition module is used for acquiring fruit images;

the identification module is used for inputting the fruit image into a fruit identification model so that the fruit identification model can identify the fruit image;

wherein the fruit recognition model is constructed based on a YOLOv8 model; the fruit recognition model includes: a small target detection head; the small target detection head is connected with the stage layer 1 module and the second-stage up-sampling module of the YOLOv8 model, receives the outputs of the stage layer 1 module and the second-stage up-sampling module when the fruit image is identified, and generates a fused feature map according to the outputs of the stage layer 1 module and the second-stage up-sampling module.

Wherein the structural reparameterization operation includes:

Preferably, the training process of the fruit recognition model includes:

On the basis of the above embodiments, a further embodiment of the present invention provides a terminal device, which includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor executes the computer program to implement the identifying method for small fruits according to the embodiment of the present invention.

On the basis of the foregoing embodiments, a further embodiment of the present invention provides a storage medium, where the storage medium includes a stored computer program, where the computer program controls a device where the computer readable storage medium is located to execute the method for identifying small fruits according to the foregoing embodiment of the present invention when the computer program runs.

Compared with the prior art, the embodiment of the invention has the following beneficial effects:

The method comprises: acquiring a fruit image; inputting the fruit image into a fruit recognition model so that the fruit recognition model recognizes the position of each fruit in the fruit image and determines whether each fruit is mature; wherein the fruit recognition model is constructed based on a YOLOv8 model; the fruit recognition model includes: a small target detection head; the small target detection head is connected with the stage layer 1 module and the second-stage up-sampling module of the YOLOv8 model, receives the outputs of the stage layer 1 module and the second-stage up-sampling module when the fruit image is identified, and generates a feature map according to those outputs. By adding a small target detection layer, the invention makes the network pay more attention to the detection of small targets, thereby improving the detection effect on small fruits.

Drawings

FIG. 1 is a schematic flow chart of a method for identifying small fruits according to an embodiment of the present invention;

FIG. 2 is a diagram of the original YOLOv8 network architecture;

FIG. 3 is a network frame diagram of a fruit recognition model of the present invention;

FIG. 4 is a block diagram of a selective kernel attention module of the present invention;

FIG. 5 is a block diagram of a RepVGG module of the invention;

FIG. 6 is a schematic diagram of the structural re-parameterization of the RepVGG module of the present invention;

FIG. 7 is a schematic structural diagram of a device for identifying small fruits according to an embodiment of the present invention.

Wherein, the reference numerals of the specification drawings are as follows: a small target detection head 1, a selective kernel attention module 2, and a RepVGG module 3.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Example 1

Referring to fig. 1, a flow chart of a method for identifying small fruits according to an embodiment of the invention includes:

S1, acquiring a fruit image.

The fruit image is an RGB image and/or a depth image, and is acquired by a RealSense D435 depth camera.

S2, inputting the fruit image into a fruit recognition model so that the fruit recognition model recognizes the position of each fruit in the fruit image and determines whether each fruit is mature;

wherein the fruit recognition model is constructed based on a YOLOv8 model; the fruit recognition model includes: a small target detection head 1; the small target detection head 1 is connected with the stage layer 1 module and the second-stage up-sampling module of the YOLOv8 model, receives the outputs of the stage layer 1 module and the second-stage up-sampling module when the fruit image is identified, and generates a feature map according to the outputs of the stage layer 1 module and the second-stage up-sampling module.
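As a rough illustration of the output described in step S2, the sketch below represents each recognized fruit as a rectangular box plus a maturity flag and then selects the ripe fruits. The `Detection` class, its field names, the confidence `score` and the `min_score` threshold are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One recognized fruit: a box in pixel coordinates plus a maturity flag."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    ripe: bool
    score: float  # hypothetical confidence value in [0, 1]

    @property
    def center(self):
        """Center of the box, e.g. as a candidate picking point."""
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def ripe_targets(detections, min_score=0.5):
    """Keep confident ripe detections, highest score first."""
    kept = [d for d in detections if d.ripe and d.score >= min_score]
    return sorted(kept, key=lambda d: d.score, reverse=True)


detections = [
    Detection(10, 10, 30, 30, ripe=True, score=0.9),
    Detection(50, 40, 70, 64, ripe=False, score=0.8),
    Detection(100, 20, 116, 36, ripe=True, score=0.4),
]
targets = ripe_targets(detections)
```

With these illustrative values, only the first detection is both ripe and above the assumed threshold, so it is the only element of `targets`.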

Referring to FIG. 2 and FIG. 3, FIG. 2 is the original YOLOv8 network architecture diagram, and FIG. 3 is the network frame diagram of the fruit recognition model of the present invention, which is an improved YOLOv8 network structure. The improved structure includes a small target detection head 1, which receives the outputs of the stage layer 1 module and the second-stage up-sampling module and generates a feature map with a size of 160×160×45 according to those outputs.

It should be noted that the downsampling factor of the original YOLOv8 network is relatively large. With each stride-2 downsampling step in the backbone, the network obtains more semantic information but loses a large amount of the detail information carried by the shallow layers. This detail information contains important features of small objects, which may be ignored during downsampling, so the deeper feature maps have difficulty learning the features of small targets. The small target detection head 1 is therefore added, and detection is performed after the shallow feature map and the deep feature map are concatenated. In this way the network pays more attention to the detection of small targets, and the detection effect is improved.
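The effect of the downsampling factor on small targets can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: the 640-pixel input size and the 16-pixel fruit width are hypothetical, and the stride of 4 for the added head is inferred from the 160×160 feature map size rather than stated in the text. The strides 8, 16 and 32 are those of the three standard YOLOv8 detection heads.

```python
def grid_size(input_size: int, stride: int) -> int:
    """Side length of the feature map produced at a given total stride."""
    if input_size % stride != 0:
        raise ValueError("input size must be divisible by the stride")
    return input_size // stride


def extent_in_cells(object_pixels: float, stride: int) -> float:
    """How many feature-map cells an object of the given pixel width spans."""
    return object_pixels / stride


INPUT_SIZE = 640                # assumption: input resolution is not stated in the text
ADDED_HEAD_STRIDE = 4           # assumption: stride that yields a 160x160 map at 640 px
ORIGINAL_STRIDES = (8, 16, 32)  # strides of the three standard YOLOv8 heads

for stride in (ADDED_HEAD_STRIDE, *ORIGINAL_STRIDES):
    side = grid_size(INPUT_SIZE, stride)
    cells = extent_in_cells(16, stride)  # a hypothetical 16-pixel-wide fruit
    print(f"stride {stride:2d}: {side}x{side} grid, fruit spans {cells:.2f} cells")
```

Under these assumptions, the stride-4 head works on a 160×160 grid in which a 16-pixel fruit spans four cells, whereas at stride 32 the same fruit covers only half a cell.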

In a preferred embodiment, the fruit identification model further comprises: a selective kernel attention module 2;

the selective kernel attention module 2 is connected with the SPPF module, the first-stage up-sampling module and the first-stage detection head of the YOLOv8 model;

the selective kernel attention module 2 is used for receiving the output of the SPPF module when the fruit image is identified;

Referring to FIG. 2, FIG. 3 and FIG. 4, FIG. 4 is a block diagram of the selective kernel attention module 2 according to the present invention. The network frame diagram of the fruit recognition model comprises a selective kernel attention module 2, which is connected with the SPPF module, the first-stage up-sampling module and the first-stage detection head of the YOLOv8 model. When the fruit image is identified, the selective kernel attention module 2 receives the output of the SPPF module; obtains, according to that output, feature maps of different scales through a plurality of parallel convolution branches with different kernel sizes; determines selection weights according to the information of all convolution branches; fuses the feature maps of different scales according to the selection weights to obtain a fused feature map; and transmits the fused feature map to the first-stage up-sampling module and the first-stage detection head.

It should be noted that the selective kernel attention module 2 introduces a selective kernel (SK) attention mechanism, which uses different kernel sizes in the convolutional neural network to capture multi-scale context information. In conventional convolutional neural networks, the receptive field size is fixed, which limits their ability to capture local and global context information effectively. The selective kernel attention mechanism addresses this limitation by introducing multiple parallel convolution branches, each using a different kernel size. These branches capture information at different spatial scales, allowing the model to better understand the input features. The key idea of selective kernel attention is to apply channel attention across the different kernel sizes: the attention mechanism learns the importance of each channel for each kernel size, allowing the network to focus selectively on the most informative kernel sizes. This adaptation allows the model to adjust its receptive field dynamically and to collect relevant information from different scales. By introducing selective kernel attention into the convolutional neural network architecture, the model can capture fine-grained local details and a wider global context at the same time, thereby improving performance in various computer vision tasks and shortening the model inference time.
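The select-and-fuse step described above can be sketched in plain Python. This is a minimal illustration, not the patented implementation: feature maps are flattened lists, the convolution branches themselves are omitted, and `branch_scales` is a hypothetical stand-in for the learned layers that turn the pooled descriptor into one logit per branch and channel.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def channel_descriptor(branches):
    """Sum all branches, then average each channel over its spatial positions."""
    descriptor = []
    for c in range(len(branches[0])):
        summed = [sum(vals) for vals in zip(*(branch[c] for branch in branches))]
        descriptor.append(sum(summed) / len(summed))
    return descriptor


def selection_weights(descriptor, branch_scales):
    """Per-channel softmax across branches; returns weights[branch][channel].

    branch_scales is a hypothetical replacement for the learned mapping from
    the descriptor to the per-branch logits.
    """
    weights = [[] for _ in branch_scales]
    for value in descriptor:
        probs = softmax([scale * value for scale in branch_scales])
        for k, p in enumerate(probs):
            weights[k].append(p)
    return weights


def fuse(branches, weights):
    """Weighted sum of the branch feature maps, channel by channel."""
    fused = []
    for c in range(len(branches[0])):
        columns = zip(*(branch[c] for branch in branches))
        fused.append([
            sum(weights[k][c] * v for k, v in enumerate(values))
            for values in columns
        ])
    return fused


# Two branches (standing in for two kernel sizes), two channels, four positions.
branch_a = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0]]
branch_b = [[4.0, 3.0, 2.0, 1.0], [2.0, 2.0, 2.0, 2.0]]
branches = [branch_a, branch_b]

weights = selection_weights(channel_descriptor(branches), branch_scales=[0.5, -0.5])
fused = fuse(branches, weights)
```

Because the softmax runs across branches, the weights of each channel sum to one, so the fused map is a per-channel convex combination of the branch outputs.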

In a preferred embodiment, the fruit identification model further comprises: a plurality of RepVGG modules 3;

the RepVGG modules 3 are used for replacing the C2f modules of the YOLOv8 model; and

in the training process of the fruit recognition model, the RepVGG module 3 comprises a 3×3 convolution branch, a 1×1 convolution branch, an identity mapping branch, a BN layer and a ReLU activation; wherein the 3×3 convolution branch, the 1×1 convolution branch and the identity mapping branch are parallel to each other; and

in the inference process of the fruit recognition model, the RepVGG module 3 includes: a 3×3 convolution layer; the 3×3 convolution layer is obtained by performing a structural re-parameterization operation on the 3×3 convolution branch, the 1×1 convolution branch and the identity mapping branch.

Wherein the structural reparameterization operation includes:

Referring to fig. 3 and 5, fig. 5 is a block diagram of the RepVGG module 3 according to the invention. The network frame diagram of the fruit recognition model comprises a plurality of RepVGG modules 3, wherein the RepVGG modules 3 are used for replacing a c2f module of the YOLOv8 model.

It should be noted that, during training, each layer of the RepVGG module 3 has three branches, namely an identity mapping branch, a 1×1 convolution branch and a 3×3 convolution branch, and outputs y = x + g(x) + f(x), where y denotes the output and x, g(x) and f(x) denote the identity mapping, the 1×1 convolution and the 3×3 convolution, respectively. Each layer needs 3 parameter blocks, so an n-layer network needs 3n parameter blocks. Re-parameterization is therefore required so that the model has fewer parameters at inference time. Structural re-parameterization uses different structures for training and inference while relying on the same set of learned parameters. RepVGG equivalently converts the three-branch network into a single-branch network.
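The equivalence between the three-branch form y = x + g(x) + f(x) and a single 3×3 convolution can be checked numerically. The sketch below is a simplified illustration that assumes a single channel, stride 1, zero padding and no BN layers; the helper names and the example values are hypothetical.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding and stride 1."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di][dj] * image[y][x]
            out[i][j] = acc
    return out


def merge_branches(k3, k1):
    """Fold the 1x1 kernel and the identity mapping into one 3x3 kernel."""
    merged = [row[:] for row in k3]
    # Both the 1x1 weight and the identity act only on the center pixel.
    merged[1][1] += k1[0][0] + 1.0
    return merged


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.1, 0.0, -0.1], [0.2, 0.5, -0.2], [0.1, 0.0, -0.1]]  # f: 3x3 branch
k1 = [[0.3]]                                                  # g: 1x1 branch

f = conv2d_same(image, k3)
g = conv2d_same(image, k1)
three_branch = [[image[i][j] + g[i][j] + f[i][j] for j in range(3)] for i in range(3)]
single_branch = conv2d_same(image, merge_branches(k3, k1))

max_diff = max(
    abs(three_branch[i][j] - single_branch[i][j]) for i in range(3) for j in range(3)
)
```

Under these assumptions, `max_diff` is zero up to floating-point rounding, which is why the merged layer can replace the three branches at inference time without changing the output.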

Referring to FIG. 6, a schematic diagram of the structural re-parameterization of the RepVGG module 3 according to the present invention is shown. The structural re-parameterization is mainly divided into three steps:

(1) Fusing the 3×3 convolution branch with the BN layer to generate a first 3×3 convolution BN branch; fusing the 1×1 convolution branch with the BN layer to generate a 1×1 convolution BN branch; fusing the identity mapping branch with the BN layer to generate an identity mapping BN branch;

(2) Converting the convolution size of the 1×1 convolution BN branch into 3×3 to obtain a second 3×3 convolution BN branch;

(3) Fusing the first 3×3 convolution BN branch, the second 3×3 convolution BN branch and the identity mapping BN branch to obtain the 3×3 convolution layer.
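The three steps above can be sketched as follows. This is a simplified single-channel illustration, not the patented implementation: BN statistics are plain scalars, the BN parameter values are made up, and the identity mapping is treated as a 1×1 convolution with weight 1 so that it can be padded and summed like the other branches (an assumption following the usual RepVGG formulation). Folding uses the standard BN identity, in which a scale of gamma / sqrt(var + eps) multiplies the kernel and the bias becomes beta - mean × scale.

```python
import math


def fold_bn(kernel, gamma, beta, mean, var, eps=1e-5):
    """Fold BN(conv(x)) into one kernel and one bias (single channel)."""
    scale = gamma / math.sqrt(var + eps)
    folded = [[w * scale for w in row] for row in kernel]
    return folded, beta - mean * scale


def pad_to_3x3(kernel_1x1):
    """Place a 1x1 weight at the center of an otherwise zero 3x3 kernel."""
    padded = [[0.0] * 3 for _ in range(3)]
    padded[1][1] = kernel_1x1[0][0]
    return padded


def reparameterize(k3, bn3, k1, bn1, bn_id):
    """Merge the three training-time branches into one 3x3 kernel and bias."""
    # Step (1): fuse every branch with its BN layer.
    w3, b3 = fold_bn(k3, **bn3)
    w1, b1 = fold_bn(k1, **bn1)
    wi, bi = fold_bn([[1.0]], **bn_id)  # identity viewed as a 1x1 conv, weight 1
    # Step (2): convert the 1x1 kernels to 3x3 by zero padding.
    w1, wi = pad_to_3x3(w1), pad_to_3x3(wi)
    # Step (3): add the three 3x3 kernels and the three biases.
    kernel = [[w3[i][j] + w1[i][j] + wi[i][j] for j in range(3)] for i in range(3)]
    return kernel, b3 + b1 + bi


# Hypothetical trained parameters for one channel.
k3 = [[0.1, 0.0, -0.1], [0.2, 0.5, -0.2], [0.1, 0.0, -0.1]]
k1 = [[0.3]]
bn3 = dict(gamma=1.2, beta=0.1, mean=0.3, var=0.9)
bn1 = dict(gamma=0.8, beta=-0.2, mean=0.1, var=1.5)
bn_id = dict(gamma=1.0, beta=0.05, mean=0.0, var=1.0)

kernel, bias = reparameterize(k3, bn3, k1, bn1, bn_id)
```

The resulting `kernel` and `bias` define a single 3×3 convolution that, under these assumptions, produces the same output as the three BN branches summed together.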

In a preferred embodiment, the training process of the fruit recognition model comprises:

It should be noted that the litchi target detection model is trained iteratively by back propagation to obtain model parameters suitable for litchi target detection.

Example 2

Referring to fig. 7, a schematic structural diagram of a device for identifying small fruits according to an embodiment of the present invention is provided, where the device includes: an image acquisition module and an identification module;

the image acquisition module is used for acquiring fruit images;

wherein the fruit recognition model is constructed based on a YOLOv8 model; the fruit recognition model includes: a small target detection head 1; the small target detection head 1 is connected with the stage layer 1 module and the second-stage up-sampling module of the YOLOv8 model, receives the outputs of the stage layer 1 module and the second-stage up-sampling module when the fruit image is identified, and generates a fused feature map according to the outputs of the stage layer 1 module and the second-stage up-sampling module.

Wherein the structural reparameterization operation includes:

Example 3

Accordingly, an embodiment of the present invention provides a terminal device, where the terminal device includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor executes the computer program to implement the method for identifying small fruits according to the embodiment of the present invention.

Example 4

Accordingly, an embodiment of the present invention provides a storage medium, where the storage medium includes a stored computer program, and where the computer program, when run, controls the device in which the computer readable storage medium is located to execute the method for identifying small fruits according to the embodiment of the present invention.

It should be noted that the above-described apparatus embodiments are merely illustrative, and the units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the embodiment of the device provided by the invention, the connection relation between the modules represents that the modules have communication connection, and can be specifically implemented as one or more communication buses or signal lines. Those of ordinary skill in the art will understand and implement the present invention without undue burden.

It will be clearly understood by those skilled in the art that, for convenience and brevity, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.

The device may be a computing device such as a desktop computer, a notebook, a palm computer, a cloud server, etc. The device may include, but is not limited to, a processor, a memory.

The processor may be a central processing unit (CPU), or may be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the device and connects the various parts of the overall device using various interfaces and lines.

The memory may be used to store the computer program, and the processor may implement various functions of the device by running or executing the computer program stored in the memory and invoking data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the device, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, flash card, at least one disk storage device, flash memory device, or other volatile solid-state storage device.

The storage medium is a computer readable storage medium in which the computer program is stored; when executed by a processor, the computer program can implement the steps of the above-mentioned method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium may be appropriately added to or reduced according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, in accordance with legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.

While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, and such changes and modifications are also intended to be within the scope of the invention.

The foregoing embodiments have been provided for the purpose of illustrating the general principles of the present invention, and are not to be construed as limiting the scope of the invention. It should be noted that any modifications, equivalent substitutions, improvements, etc. made by those skilled in the art without departing from the spirit and principles of the present invention are intended to be included in the scope of the present invention.

Claims

1. A method of identifying small fruits, comprising:

acquiring a fruit image;

2. The method of identifying small fruits according to claim 1, wherein said fruit identification model further comprises: a selective kernel attention module;

3. The method of identifying small fruits according to claim 1, wherein said fruit identification model further comprises: a plurality of RepVGG modules;

Wherein the structural reparameterization operation includes:

4. The method of claim 1, wherein the training process of the fruit recognition model comprises:

5. An identification device for small fruits, comprising: an image acquisition module and an identification module;

the image acquisition module is used for acquiring fruit images;

6. The apparatus for identifying small fruits according to claim 5, wherein said fruit identification model further comprises: a selective kernel attention module;

7. The apparatus for identifying small fruits according to claim 5, wherein said fruit identification model further comprises: a plurality of RepVGG modules;

Wherein the structural reparameterization operation includes:

8. The apparatus for identifying small fruits according to claim 5, wherein the training process of the fruit identification model comprises:

9. A terminal device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the identification method for small fruits according to any one of claims 1 to 4 when the computer program is executed.

10. A storage medium comprising a stored computer program, wherein the computer program, when run, controls a device in which the computer readable storage medium is located to perform the method of identifying small fruits according to any one of claims 1 to 4.