CN115995017A - Fruit identification and positioning method, device and medium - Google Patents

Fruit identification and positioning method, device and medium

Info

Publication number
CN115995017A
CN115995017A (application CN202211553660.3A)
Authority
CN
China
Prior art keywords
fruit
fruits
target detection
image
training
Prior art date
Legal status
Pending
Application number
CN202211553660.3A
Other languages
Chinese (zh)
Inventor
毛亮
梁志尚
吴惠粦
田鑫裕
张兴龙
朱文铭
刘昌乐
Current Assignee
Guangzhou National Modern Agricultural Industry Science And Technology Innovation Center
Shenzhen Polytechnic
Original Assignee
Guangzhou National Modern Agricultural Industry Science And Technology Innovation Center
Shenzhen Polytechnic
Priority date
Filing date
Publication date
Application filed by Guangzhou National Modern Agricultural Industry Science And Technology Innovation Center and Shenzhen Polytechnic
Priority to CN202211553660.3A
Publication of CN115995017A


Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a fruit identification and positioning method, which comprises the following steps: photographing fruits under different illumination conditions and classifying the results to obtain a training image dataset; labeling the images in the training image dataset and setting labels for the labeling results; training a fruit target detection model with the training image dataset and the labeling results; and acquiring images of a plurality of fruits to be detected and identifying and positioning the fruits in the images with the trained fruit target detection model to obtain the maturity and position information of the fruits to be detected. The invention effectively addresses the low accuracy, poor generality and high data acquisition cost of the prior art.

Description

Fruit identification and positioning method, device and medium
Technical Field
The invention relates to the technical field of fruit identification and positioning, and in particular to a fruit identification and positioning method, device and medium.
Background
The identification and positioning of fruits are the precondition and basis for automated picking. Existing fruit identification and positioning methods, such as the fruit positioning method and device proposed in patent document CN111126296A and the ripe-pomegranate positioning method based on Mask R-CNN and 3-dimensional sphere fitting proposed in patent document CN112529948A, identify fruit targets in an image by threshold segmentation or instance segmentation. Such methods rely on complex algorithms, are easily disturbed by the environment, must process large amounts of data, and cannot guarantee real-time performance.
Existing target recognition technology separates targets from the background using the color, shape, texture and other information of the fruits in a color image. This approach imposes strict requirements on the environment, is easily disturbed, suffers from missed and erroneous detections, and cannot meet the fruit identification requirements of an orchard. On the one hand, illumination conditions in an orchard differ greatly between weather conditions and between times of day; on the other hand, fruits grow on trees, where they hang close together and are occluded by leaves and branches, so the background of fruit images collected in an orchard is very complex. The prior art cannot avoid the interference caused by these factors well, so its recognition accuracy in the orchard environment is low and it lacks generality. When a dataset is annotated for instance segmentation, the contour points of each target must be marked, which is laborious and inefficient. Both kinds of recognition method must process very large amounts of data, and their slow processing cannot guarantee real-time performance. Positioning by acquiring point clouds requires point cloud data that is difficult and costly to obtain.
Disclosure of Invention
The embodiments of the invention provide a fruit identification and positioning method, device and medium, which effectively solve the problems of low accuracy, poor generality and high data acquisition cost in the prior art.
An embodiment of the invention provides a fruit identification and positioning method, which comprises the following steps:
shooting fruits under different illumination conditions, and classifying shooting results to obtain a training image dataset;
labeling the images in the training image data set, and setting labels for labeling results;
training the fruit target detection model by utilizing the training image data set and the labeling result;
and acquiring images of a plurality of fruits to be detected, and identifying and positioning the fruits in the images through the trained fruit target detection model to obtain the maturity and position information of the fruits to be detected.
Compared with the prior art, the fruit identification and positioning method disclosed by this embodiment of the invention shoots under different weather conditions to ensure diverse environmental conditions among the images in the dataset, so that the orchard litchi fruit target detection model can learn the characteristics of litchi fruit targets under various conditions during training, overcoming the difficulty brought by changing light and enabling the target detection model to accurately identify litchi fruit targets under different environmental conditions. By combining the target detection result with the depth image to locate the target, the method needs only a depth sensor for shooting; compared with positioning from point cloud data, its cost is low and its data acquisition is simple.
Further, the shooting of fruits under different illumination conditions classifies shooting results to obtain a training image dataset, which specifically comprises:
and respectively shooting a fixed number of fruit images under various illumination conditions, classifying all shot fruit images according to the illumination conditions, and combining the fruit images into a training image data set.
When the litchi fruit image dataset is made, shooting under different weather conditions ensures diverse environmental conditions among the images in the dataset, so that the orchard litchi fruit target detection model can learn the characteristics of litchi fruit targets under various conditions during training, overcoming the difficulty brought by changing light and enabling the target detection model to accurately identify litchi fruit targets under different environmental conditions.
Further, the labeling the images in the training image dataset and setting the label for the labeling result specifically includes:
labeling the fruits in the image dataset with a labeling tool, framing the fruit areas in each image with geometric frames, and setting labels according to the maturity of the fruits in the image, wherein the label types comprise ripe and unripe.
When the data is labeled, only a geometric frame surrounding each target needs to be drawn; the outline of the target does not need to be traced, so the labeling workload is smaller.
Further, the training of the fruit target detection model by using the training image dataset and the labeling result specifically includes:
loading the image dataset, inputting the training image dataset and the labeling results into the fruit target detection model, obtaining initial model parameters and computing an initial loss from a model pass, then continuously updating the model parameters and recomputing the loss by back-propagation iteration, and ending training when the model performance meets the requirement, yielding the finally trained fruit target detection model;
wherein the fruit target detection model comprises: a feature extraction network, a neck and a detection part. The feature extraction network is composed of a convolutional neural network and attention functions, where the attention function is a multi-head attention function formed by computing the scaled dot-product attention function several times in parallel and concatenating the results. The neck adopts two structures, a feature pyramid and a path aggregation network; the feature pyramid superimposes high-level feature maps onto low-level feature maps through up-sampling, and the path aggregation network passes positioning information from shallow layers to deep layers. The detection part outputs target detection output frames from the feature maps generated by the feature extraction network and the neck; the output comprises a number of prior frames and prediction frames, where prior frames of different sizes are distributed at each pixel of the feature map and each prediction frame is computed from the prior frames and the feature map.
As a preferred embodiment, the training is finished when the model performance meets the requirement, which specifically includes:
the model performance meets the requirement when the loss is less than a preset error value;
the loss, obtained by adding the positioning loss, the confidence loss and the classification loss, measures the error between the model prediction under the current parameters and the real situation; training ends when the loss is smaller than the preset error value.
Further, collecting images of a plurality of fruits to be detected and identifying and positioning the fruits in the images through the trained fruit target detection model to obtain the position information of the fruits to be detected specifically includes:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting the fruits in the color image with the fruit target detection model to obtain a plurality of target detection output frames, and recording the horizontal and vertical coordinates of the center point of each output frame in the color image; each target detection output frame also carries a label, ripe or unripe, identifying the ripeness of the fruit;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
Compared with a method for positioning by utilizing point cloud data, the method only needs to utilize a depth sensor to shoot, and has low cost and simple data acquisition method.
Another embodiment of the present invention correspondingly provides a fruit identifying and positioning device, including: the device comprises an image acquisition and labeling module, a model training module and a fruit identification and positioning module;
the image acquisition and labeling module is used for shooting fruits under different illumination conditions, classifying shooting results to obtain a training image data set, labeling images in the training image data set, and setting labels for labeling results;
the model training module is used for training a fruit target detection model by utilizing the training image data set and the labeling result;
the fruit recognition and positioning module is used for collecting images of a plurality of fruits to be detected, recognizing and positioning the fruits in the images through the trained fruit target detection model, and obtaining maturity and position information of the fruits to be detected.
Compared with the prior art, the fruit identification and positioning device disclosed by this embodiment of the invention shoots under different weather conditions to ensure diverse environmental conditions among the images in the dataset, so that the orchard litchi fruit target detection model can learn the characteristics of litchi fruit targets under various conditions during training, overcoming the difficulty brought by changing light and enabling the target detection model to accurately identify litchi fruit targets under different environmental conditions. By combining the target detection result with the depth image to locate the target, the device needs only a depth sensor for shooting; compared with positioning from point cloud data, its cost is low and its data acquisition is simple.
Further, the fruit identifying and positioning module is used for collecting images of a plurality of fruits to be detected, identifying and positioning the fruits in the images through a trained fruit target detection model to obtain maturity and position information of the fruits to be detected, and specifically comprises the following steps:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting the fruits in the color image with the fruit target detection model to obtain a plurality of target detection output frames, and recording the horizontal and vertical coordinates of the center point of each output frame in the color image; each target detection output frame also carries a label, ripe or unripe, identifying the ripeness of the fruit;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
Another embodiment of the present invention provides a fruit identifying and positioning device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor executes the computer program to implement the fruit identifying and positioning method according to the embodiment of the present invention.
Another embodiment of the present invention provides a storage medium, where the computer readable storage medium includes a stored computer program, where when the computer program runs, a device where the computer readable storage medium is located is controlled to execute the fruit identifying and positioning method according to the foregoing embodiment of the present invention.
Drawings
Fig. 1 is a flow chart of a fruit identifying and positioning method according to an embodiment of the invention.
Fig. 2 is a schematic diagram of a network structure of a fruit target detection model according to an embodiment of the present invention.
Fig. 3 is a schematic structural view of a fruit identifying and positioning device according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The embodiments described are obviously only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of the invention.
Referring to fig. 1, a flow chart of a fruit identifying and positioning method according to an embodiment of the invention includes:
s101: shooting fruits under different illumination conditions, and classifying shooting results to obtain a training image dataset;
s102: labeling the images in the training image data set, and setting labels for labeling results;
s103: training the fruit target detection model by utilizing the training image data set and the labeling result;
s104: and acquiring images of a plurality of fruits to be detected, and identifying and positioning the fruits in the images through a fruit target detection model after training to obtain the maturity and position information of the fruits to be detected.
According to the fruit identification and positioning method provided by this embodiment of the invention, shooting under different weather conditions ensures diverse environmental conditions among the images in the dataset, so that the orchard litchi fruit target detection model can learn the characteristics of litchi fruit targets under various conditions during training, overcoming the difficulty brought by changing light and enabling the target detection model to accurately identify litchi fruit targets under different environmental conditions. By combining the target detection result with the depth image to locate the target, the method needs only a depth sensor for shooting; compared with positioning from point cloud data, its cost is low and its data acquisition is simple.
For step S101, specifically, a fixed number of fruit images are shot under each of several illumination conditions, and all shot fruit images are classified according to illumination condition and then combined into a training image dataset.
In a preferred embodiment, direct-sunlight images are obtained by front-light shooting on sunny days, and side-light images by side-light shooting; shooting in the evening yields low-brightness images; and images under scattered-light conditions are shot on cloudy days. When the dataset is made, the numbers of images shot under the four environments of direct sunlight, side light, low brightness and scattered light are kept equal.
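As an illustration of this data preparation step, the following minimal Python sketch assembles a balanced training set from images sorted by illumination condition; the directory names, file extension and per-condition count are assumptions for illustration, not prescribed by this embodiment:

```python
import random
from pathlib import Path

# Illustrative directory layout (an assumption, not from this embodiment):
# raw/direct_sunlight/, raw/side_light/, raw/low_brightness/, raw/scattered_light/
CONDITIONS = ["direct_sunlight", "side_light", "low_brightness", "scattered_light"]

def build_training_set(raw_root: str, out_dir: str, per_condition: int) -> None:
    """Take an equal, fixed number of images from each illumination
    condition and merge them into a single training image dataset."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for cond in CONDITIONS:
        images = sorted(Path(raw_root, cond).glob("*.jpg"))
        chosen = random.sample(images, per_condition)  # equal count per condition
        for img in chosen:
            # Keep the condition in the file name so the class stays traceable
            (out / f"{cond}_{img.name}").write_bytes(img.read_bytes())

# build_training_set("raw", "dataset/images/train", per_condition=500)
```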
Specifically, in step S102, the fruits in the image dataset are marked with a labeling tool, the fruit areas in each image are framed with geometric frames, and the resulting frames are each given a label according to the maturity of the fruit, wherein the label types comprise ripe and unripe.
In a preferred embodiment, the fruits in the training dataset images are labeled manually; only a rectangular frame surrounding each target needs to be drawn, and the outline of the target does not need to be traced point by point. Fruit areas in the image are marked with rectangular frames using a labeling tool to obtain the real (ground-truth) frames, and corresponding labels are set, the label types comprising ripe and unripe so as to distinguish ripe fruits from unripe ones.
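For illustration, a short sketch of converting one such rectangular annotation into a YOLO-style label line follows; the normalized text format and class numbering are assumptions (the embodiment only requires a rectangle plus a ripe/unripe label), and the box coordinates are hypothetical:

```python
# Assumed YOLO-style convention: one label line per box,
#   <class_id> <x_center> <y_center> <width> <height>   (normalized to [0, 1])
# with class 0 = ripe and class 1 = unripe.

def to_yolo_line(class_id: int, box: tuple, img_w: int, img_h: int) -> str:
    """Convert a pixel-space rectangle (x_min, y_min, x_max, y_max)
    into one normalized YOLO label line."""
    x_min, y_min, x_max, y_max = box
    xc = (x_min + x_max) / 2 / img_w
    yc = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

print(to_yolo_line(0, (120, 80, 220, 200), 640, 480))
# -> 0 0.265625 0.291667 0.156250 0.250000
```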
For step S103, specifically, the image dataset is loaded, the training image dataset and the labeling results are input into the fruit target detection model, initial model parameters are obtained and an initial loss is computed from a model pass, then the model parameters are continuously updated and the loss recomputed by back-propagation iteration, and training ends once the model performance reaches the requirement, yielding the finally trained fruit target detection model;
wherein the fruit target detection model comprises: a feature extraction network, a neck and a detection part. The feature extraction network is composed of a convolutional neural network and attention functions, where the attention function is a multi-head attention function formed by computing the scaled dot-product attention function several times in parallel and concatenating the results. The neck adopts two structures, a feature pyramid and a path aggregation network; the feature pyramid superimposes high-level feature maps onto low-level feature maps through up-sampling, and the path aggregation network passes positioning information from shallow layers to deep layers. The detection part outputs target detection output frames from the feature maps generated by the feature extraction network and the neck; the output comprises a number of prior frames and prediction frames, where prior frames of different sizes are distributed at each pixel of the feature map and each prediction frame is computed from the prior frames and the feature map.
In a preferred embodiment, the model is trained by back-propagation iteration to obtain model parameters suitable for orchard litchi target detection. The training step comprises loading the data, building the model, updating the model parameters, computing the loss, evaluating the model, judging the condition for ending training, and storing the model parameters. The condition for ending training is that the model performance reaches the requirement or the number of training iterations exceeds a set value, where reaching the requirement means the change of the loss function is smaller than a set value.
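A minimal PyTorch sketch of this back-propagation loop with both stopping conditions is given below; the optimizer, learning rate, tolerance and file name are illustrative assumptions, and `loss_fn` stands for the loss function detailed next:

```python
import torch

def train(model, loader, loss_fn, max_iters=300, tol=1e-4, lr=1e-3):
    """Back-propagation training sketch: stop when the change of the loss
    falls below a set value or the iteration count exceeds a set value."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    prev_loss = float("inf")
    for _ in range(max_iters):
        total = 0.0
        for images, targets in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)
            loss.backward()           # back-propagate gradients
            optimizer.step()          # update model parameters
            total += loss.item()
        if abs(prev_loss - total) < tol:  # performance meets the requirement
            break
        prev_loss = total
    torch.save(model.state_dict(), "fruit_detector.pt")  # store parameters
```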
Specifically, the loss is computed with an improved target detection loss function comprising a positioning loss, a confidence loss and a classification loss, which together reflect the error between the model prediction under the current parameters and the real situation. The total loss is computed as:
$Loss = Loss_{cls} + Loss_{obj} + Loss_{box}$
The classification loss and the confidence loss adopt the binary cross-entropy loss function, computed as:

$$L = -\frac{1}{n}\sum_{x}\bigl[y\ln p + (1-y)\ln(1-p)\bigr]$$

where $p$ is the predicted value, $x$ a sample, $y$ the target value, $n$ the total number of samples, and $L$ the final binary cross-entropy loss.
The positioning loss adopts the α-CIoU loss $Loss_{\alpha\text{-}CIoU}$, computed as:

$$IoU = \frac{|A \cap B|}{|A \cup B|}$$

$$Loss_{\alpha\text{-}CIoU} = 1 - IoU^{\alpha} + \frac{\rho^{2\alpha}(b, b^{gt})}{c^{2\alpha}} + (\beta v)^{\alpha}$$

where $A$ and $B$ denote the output frame and the real frame respectively, $|A \cap B|$ is the area of their intersection, $|A \cup B|$ the area of their union, and $C$ the area of the smallest rectangle enclosing $A$ and $B$. α is an adjustable parameter whose value is determined by comparing detection results under different settings, which improves the flexibility of the target detection model. $b$ and $b^{gt}$ are the center points of the output frame and the real frame respectively, $\rho(\cdot)$ is the Euclidean distance, and $c$ is the diagonal length of the smallest box enclosing the two frames. β is a positive trade-off parameter and $v$ measures the consistency of aspect ratios; they are computed as:

$$\beta = \frac{v}{(1 - IoU) + v}$$

$$v = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2}$$

where $w^{gt}$ and $h^{gt}$ are the width and height of the real frame, and $w$ and $h$ the width and height of the output frame.
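Based on the formulas reconstructed above, a hedged PyTorch sketch of the α-CIoU positioning loss might look as follows; boxes are assumed to be (x_center, y_center, width, height) tensors, and α = 3 is an illustrative choice rather than a value fixed by this embodiment:

```python
import math
import torch

def alpha_ciou_loss(pred, gt, alpha=3.0, eps=1e-7):
    """alpha-CIoU sketch: 1 - IoU^a + rho^(2a)/c^(2a) + (beta*v)^a,
    with beta = v / ((1 - IoU) + v). Boxes are (xc, yc, w, h)."""
    # corners of the output frame and the real frame
    px1, py1 = pred[..., 0] - pred[..., 2] / 2, pred[..., 1] - pred[..., 3] / 2
    px2, py2 = pred[..., 0] + pred[..., 2] / 2, pred[..., 1] + pred[..., 3] / 2
    gx1, gy1 = gt[..., 0] - gt[..., 2] / 2, gt[..., 1] - gt[..., 3] / 2
    gx2, gy2 = gt[..., 0] + gt[..., 2] / 2, gt[..., 1] + gt[..., 3] / 2
    # intersection and union areas -> IoU
    iw = (torch.min(px2, gx2) - torch.max(px1, gx1)).clamp(min=0)
    ih = (torch.min(py2, gy2) - torch.max(py1, gy1)).clamp(min=0)
    inter = iw * ih
    union = pred[..., 2] * pred[..., 3] + gt[..., 2] * gt[..., 3] - inter + eps
    iou = inter / union
    # squared center distance rho^2 and enclosing-box diagonal c^2
    rho2 = (pred[..., 0] - gt[..., 0]) ** 2 + (pred[..., 1] - gt[..., 1]) ** 2
    cw = torch.max(px2, gx2) - torch.min(px1, gx1)
    ch = torch.max(py2, gy2) - torch.min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps
    # aspect-ratio consistency v and positive trade-off weight beta
    v = (4 / math.pi ** 2) * (torch.atan(gt[..., 2] / gt[..., 3])
                              - torch.atan(pred[..., 2] / pred[..., 3])) ** 2
    beta = v / ((1 - iou) + v + eps)
    return (1 - iou ** alpha + rho2 ** alpha / c2 ** alpha
            + (beta * v) ** alpha).mean()
```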
In a preferred embodiment, the fruit target detection model is an orchard target detection model based on modified YOLOv5, and comprises a feature extraction network, a neck part and a detection part, wherein the specific network structure is shown in fig. 2.
Specifically, the feature extraction network is a "convolution-attention" structure composed of a convolutional neural network and an attention function. The convolutional neural network forms the three structures of the feature extraction network: Conv (Convolution), SPP (Spatial Pyramid Pooling) and the CSP bottleneck layer. Conv comprises a convolution layer, a batch normalization layer and an activation function; the activation function is the leaky rectified linear unit (Leaky ReLU). The CSP bottleneck layer comprises convolution layers, batch normalization, activation functions and a residual network structure. The attention function uses the scaled dot-product attention function, computed as:
$$Attention(Q, K, V) = softmax\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

where $Q$, $K$ and $V$ are input feature maps, $K^{T}$ is the transpose of $K$, and $\sqrt{d_k}$ is the scale factor.
The scaled dot-product attention function is computed several times in parallel and the results are concatenated to form the multi-head attention function, computed as:
$$MultiHead(Q, K, V) = Concat(head_1, \ldots, head_h)\,W^{O}$$

where each $head_i$ is the output of one scaled dot-product attention function and $W^{O}$ is the output projection matrix.
The multi-head attention function and the convolutional neural network are combined to form a feature extraction network.
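To make the two attention formulas concrete, here is a compact PyTorch sketch of the scaled dot-product and multi-head attention functions; the linear projection layers and head layout follow standard Transformer practice and are assumptions rather than the exact structure of fig. 2:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # scale factor sqrt(d_k)
    return F.softmax(scores, dim=-1) @ v

class MultiHeadAttention(torch.nn.Module):
    """h parallel scaled dot-product heads, concatenated and projected by W^O."""

    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0
        self.h, self.d_k = num_heads, d_model // num_heads
        self.w_q = torch.nn.Linear(d_model, d_model)
        self.w_k = torch.nn.Linear(d_model, d_model)
        self.w_v = torch.nn.Linear(d_model, d_model)
        self.w_o = torch.nn.Linear(d_model, d_model)  # W^O

    def forward(self, x):
        # self-attention over a flattened feature map x of shape (b, n, d_model)
        b, n, _ = x.shape
        split = lambda t: t.view(b, n, self.h, self.d_k).transpose(1, 2)
        heads = scaled_dot_product_attention(
            split(self.w_q(x)), split(self.w_k(x)), split(self.w_v(x)))
        concat = heads.transpose(1, 2).reshape(b, n, self.h * self.d_k)
        return self.w_o(concat)  # Concat(head_1, ..., head_h) W^O
```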
The neck adopts two structures of a characteristic pyramid and a path aggregation network. The feature pyramid adopts a top-down mode, and the high-level feature mapping and the low-level feature mapping are overlapped through up-sampling. The path aggregation network adopts a bottom-up mode to transmit positioning information from a shallow layer to a deep layer.
The detection part outputs target detection output frames based on the feature maps generated by the feature extraction network and the neck. Several frames of different sizes, called prior frames, are generated at each pixel of the feature map; their sizes are 10×13, 16×30, 33×23, 30×61, 62×45, 59×119, 116×90, 156×198 and 373×326. The prediction frame is computed from the prior frames and the feature map as follows:
$$b_x = \sigma(t_x) + c_x$$
$$b_y = \sigma(t_y) + c_y$$
$$b_w = p_w e^{t_w}$$
$$b_h = p_h e^{t_h}$$

where $\sigma(t_x)$ and $\sigma(t_y)$ are the offsets of the center point relative to the top-left corner coordinates $(c_x, c_y)$ of its grid cell, $\sigma$ is the sigmoid function, $p_w$ and $p_h$ are the width and height of the prior frame, and $b_x$, $b_y$, $b_w$ and $b_h$ are respectively the center-point abscissa, center-point ordinate, width and height of the prediction frame.
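This decoding step can be sketched in a few lines of PyTorch; the tensor layout, with offsets, prior sizes and grid corners carried in a trailing dimension, is an assumption for illustration:

```python
import torch

def decode_boxes(t, priors, grid_xy):
    """Decode raw offsets t = (t_x, t_y, t_w, t_h) into prediction frames:
        b_x = sigmoid(t_x) + c_x        b_w = p_w * exp(t_w)
        b_y = sigmoid(t_y) + c_y        b_h = p_h * exp(t_h)
    grid_xy holds the top-left corners (c_x, c_y) of the grid cells and
    priors holds the prior-frame sizes (p_w, p_h)."""
    bx = torch.sigmoid(t[..., 0]) + grid_xy[..., 0]
    by = torch.sigmoid(t[..., 1]) + grid_xy[..., 1]
    bw = priors[..., 0] * torch.exp(t[..., 2])
    bh = priors[..., 1] * torch.exp(t[..., 3])
    return torch.stack((bx, by, bw, bh), dim=-1)
```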
For step S104, specifically, collecting images of a plurality of fruits to be detected and identifying and positioning the fruits in the images through the trained fruit target detection model to obtain the position information of the fruits to be detected includes:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting the fruits in the color image with the fruit target detection model to obtain a plurality of target detection output frames, and recording the horizontal and vertical coordinates of the center point of each output frame in the color image; each target detection output frame also carries a label, ripe or unripe, identifying the ripeness of the fruit;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
In a preferred embodiment, an Intel RealSense D435 depth sensor is used as the camera; the camera parameters are initialized and the resolution of the acquired color and depth images is set to 640×480. The depth sensor collects a color image and a depth image of the scene in front of the fruits, the target detection model detects the fruits in the color image to obtain target detection output frames, and the coordinates (x, z) of each output frame's center point in the color image are recorded. The depth value at point (x, z) in the depth image is then taken as the distance y between the fruit and the shooting point, so that (x, y, z) represents the position of the fruit in the spatial coordinate system.
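A minimal sketch of this acquisition-and-lookup step with the pyrealsense2 SDK follows; `detect()` is a hypothetical wrapper around the trained detector (not part of the SDK), and aligning the depth stream to the color image is an assumption this embodiment does not spell out:

```python
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)  # map depth pixels onto the color image

try:
    frames = align.process(pipeline.wait_for_frames())
    color = np.asanyarray(frames.get_color_frame().get_data())
    depth_frame = frames.get_depth_frame()

    # detect() is a hypothetical wrapper around the trained model; assume it
    # yields (x, z) box-center pixel coordinates plus a ripe/unripe label.
    for x, z, label in detect(color):
        y = depth_frame.get_distance(int(x), int(z))  # distance in metres
        print(f"{label} fruit at (x, y, z) = ({x}, {y:.3f}, {z})")
finally:
    pipeline.stop()
```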
Referring to fig. 3, a schematic structural diagram of a fruit identifying and positioning device according to an embodiment of the present invention includes: an image acquisition and labeling module 201, a model training module 202 and a fruit identification and positioning module 203;
the image acquisition and labeling module 201 is configured to shoot fruits under different illumination conditions, classify shooting results to obtain a training image dataset, label images in the training image dataset, and set labels for labeling results;
the model training module 202 is configured to train a fruit target detection model by using the training image dataset and the labeling result;
the fruit recognition and positioning module 203 is configured to collect images of a plurality of fruits to be detected, and recognize and position the fruits in the images through a trained fruit target detection model to obtain maturity and position information of the fruits to be detected.
According to the fruit identification and positioning device provided by the embodiment of the invention, shooting is performed under different weather conditions to ensure the diversity of environmental conditions of image acquisition in a data set, so that the characteristics of litchi fruit targets under various conditions can be learned when the litchi fruit target detection model in an orchard is trained, difficulties caused by light change are overcome, and the target detection model can accurately identify the litchi fruit targets under different environmental conditions. By combining the target detection result and the depth image to locate the target, compared with a method for locating by utilizing point cloud data, the device only needs to utilize the depth sensor to shoot, and has low cost and simple data acquisition method.
Further, the fruit identifying and positioning module 203 is configured to collect images of a plurality of fruits to be detected, identify and position the fruits in the images through a trained fruit target detection model, and obtain maturity and position information of the fruits to be detected, and specifically includes:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting the fruits in the color image with the fruit target detection model to obtain a plurality of target detection output frames, and recording the horizontal and vertical coordinates of the center point of each output frame in the color image; each target detection output frame also carries a label, ripe or unripe, identifying the ripeness of the fruit;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
The embodiment of the invention also provides a fruit identification and positioning device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the steps of the above fruit identification and positioning method embodiments, for example step S101 shown in fig. 1. Alternatively, when executing the computer program, the processor implements the functions of the modules in the device embodiments described above, such as the fruit identification and positioning module 203.
The computer program may be divided into one or more modules, which are stored in the memory and executed by the processor to carry out the invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, used to describe the execution of the computer program in the fruit identification and positioning device. For example, the computer program may be divided into the image acquisition and labeling module 201, the model training module 202 and the fruit identification and positioning module 203, whose specific functions are as follows:
the image acquisition and labeling module 201 is configured to shoot fruits under different illumination conditions, classify shooting results to obtain a training image dataset, label images in the training image dataset, and set labels for labeling results;
the model training module 202 is configured to train a fruit target detection model by using the training image dataset and the labeling result;
the fruit recognition and positioning module 203 is configured to collect images of a plurality of fruits to be detected, and recognize and position the fruits in the images through a trained fruit target detection model to obtain maturity and position information of the fruits to be detected.
The fruit identification and positioning device may be a desktop computer, a notebook computer, a palmtop computer, a cloud server or another computing device, and may include, but is not limited to, a processor and a memory. Those skilled in the art will appreciate that the schematic diagram is merely an example of a fruit identification and positioning device and does not limit it; the device may include more or fewer components than shown, combine certain components, or use different components. For example, it may also include input and output devices, network access devices, buses, and the like.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the fruit identification and positioning device, connecting all parts of the device through various interfaces and lines.
The memory may be used to store the computer program or modules, and the processor implements the various functions of the fruit identification and positioning device by running or executing the computer program or modules stored in the memory and invoking data stored in the memory. The memory may mainly comprise a program storage area and a data storage area: the program storage area may store an operating system, application programs required for at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created according to the use of the device (such as audio data or a phonebook). In addition, the memory may include high-speed random access memory, and may also include non-volatile memory such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or another solid-state storage device.
If the modules integrated in the device are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer readable storage medium. Based on this understanding, the present invention may implement all or part of the flow of the above method embodiments by instructing the relevant hardware through a computer program, which may be stored in a computer readable storage medium and which, when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in each jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer readable media do not include electrical carrier signals and telecommunication signals.
It should be noted that the above device embodiments are merely illustrative; units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units, i.e. they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of this embodiment. In addition, in the drawings of the device embodiments provided by the invention, the connections between modules indicate communication links between them, which may be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement the invention without undue effort.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, such changes and modifications are also intended to be within the scope of the invention.

Claims (10)

1. A fruit identification and positioning method, characterized by comprising the following steps:
shooting fruits under different illumination conditions, and classifying shooting results to obtain a training image dataset;
labeling the images in the training image data set, and setting labels for labeling results;
training the fruit target detection model by utilizing the training image data set and the labeling result;
and acquiring images of a plurality of fruits to be detected, and identifying and positioning the fruits in the images through the trained fruit target detection model to obtain the maturity and position information of the fruits to be detected.
2. The method for identifying and positioning fruits according to claim 1, wherein the steps of photographing fruits under different illumination conditions, classifying photographing results, and obtaining a training image dataset comprise:
and respectively shooting a fixed number of fruit images under various illumination conditions, classifying all shot fruit images according to the illumination conditions, and combining the fruit images into a training image data set.
3. The method for identifying and positioning fruits according to claim 1, wherein the steps of labeling the images in the training image dataset and labeling the labeling result comprise:
labeling the fruits in the image dataset with a labeling tool, framing the fruit areas in each image with geometric frames, and setting labels according to the maturity of the fruits in the image, wherein the label types comprise ripe and unripe.
4. The method for identifying and locating fruits according to claim 1, wherein training the fruit target detection model by using the training image dataset and the labeling result comprises:
loading the image dataset, inputting the training image dataset and the labeling results into the fruit target detection model, obtaining initial model parameters and computing an initial loss from a model pass, then continuously updating the model parameters and recomputing the loss by back-propagation iteration, and ending training when the model performance meets the requirement, yielding the finally trained fruit target detection model;
wherein the fruit target detection model comprises: a feature extraction network, a neck and a detection part. The feature extraction network is composed of a convolutional neural network and attention functions, where the attention function is a multi-head attention function formed by computing the scaled dot-product attention function several times in parallel and concatenating the results. The neck adopts two structures, a feature pyramid and a path aggregation network; the feature pyramid superimposes high-level feature maps onto low-level feature maps through up-sampling, and the path aggregation network passes positioning information from shallow layers to deep layers. The detection part outputs target detection output frames from the feature maps generated by the feature extraction network and the neck; the output comprises a number of prior frames and prediction frames, where prior frames of different sizes are distributed at each pixel of the feature map and each prediction frame is computed from the prior frames and the feature map.
5. The method for identifying and locating fruits according to claim 4, wherein said training is finished when the model performance meets the requirement, specifically comprising:
the model performance meets the requirements specifically as follows: the loss is less than a preset error value;
the loss, obtained by adding the positioning loss, the confidence loss and the classification loss, measures the error between the model prediction under the current parameters and the real situation; training ends when the loss is smaller than the preset error value.
6. The method for identifying and positioning fruits according to claim 1, wherein the collecting a plurality of images of fruits to be detected, identifying and positioning fruits in the images by a trained fruit target detection model, and obtaining position information of the fruits to be detected specifically comprises:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting the fruits in the color image with the fruit target detection model to obtain a plurality of target detection output frames, and recording the horizontal and vertical coordinates of the center point of each output frame in the color image; each target detection output frame also carries a label, ripe or unripe, identifying the ripeness of the fruit;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
7. A fruit identification and positioning device, comprising: the device comprises an image acquisition and labeling module, a model training module and a fruit identification and positioning module;
the image acquisition and labeling module is used for shooting fruits under different illumination conditions, classifying shooting results to obtain a training image data set, labeling images in the training image data set, and setting labels for labeling results;
the model training module is used for training a fruit target detection model by utilizing the training image data set and the labeling result;
the fruit recognition and positioning module is used for collecting images of a plurality of fruits to be detected, recognizing and positioning the fruits in the images through the trained fruit target detection model, and obtaining maturity and position information of the fruits to be detected.
8. The fruit identification and positioning device according to claim 7, wherein the fruit identification and positioning module is configured to collect images of a plurality of fruits to be detected, and identify and position the fruits in the images through a trained fruit target detection model, so as to obtain maturity and position information of the fruits to be detected, and the fruit identification and positioning device specifically comprises:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting the fruits in the color image with the fruit target detection model to obtain a plurality of target detection output frames, and recording the horizontal and vertical coordinates of the center point of each output frame in the color image; each target detection output frame also carries a label, ripe or unripe, identifying the ripeness of the fruit;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
9. A fruit identification and locating device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the fruit identification and locating method according to any one of claims 1 to 6 when executing the computer program.
10. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored computer program, wherein the computer program, when run, controls a device in which the computer readable storage medium is located to perform the fruit identification and localization method according to any one of claims 1 to 6.
CN202211553660.3A 2022-12-06 2022-12-06 Fruit identification and positioning method, device and medium Pending CN115995017A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211553660.3A CN115995017A (en) 2022-12-06 2022-12-06 Fruit identification and positioning method, device and medium


Publications (1)

Publication Number Publication Date
CN115995017A true CN115995017A (en) 2023-04-21

Family

ID=85989721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211553660.3A Pending CN115995017A (en) 2022-12-06 2022-12-06 Fruit identification and positioning method, device and medium

Country Status (1)

Country Link
CN (1) CN115995017A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116467596A (en) * 2023-04-11 2023-07-21 广州国家现代农业产业科技创新中心 Training method of rice grain length prediction model, morphology prediction method and apparatus
CN116467596B (en) * 2023-04-11 2024-03-26 广州国家现代农业产业科技创新中心 Training method of rice grain length prediction model, morphology prediction method and apparatus


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination