CN111339887B - Commodity identification method and intelligent container system

Info

Publication number
CN111339887B
CN111339887B
Authority
CN
China
Prior art keywords
image
acquisition device
commodity
data set
model
Prior art date
Legal status
Active
Application number
CN202010103439.2A
Other languages
Chinese (zh)
Other versions
CN111339887A
Inventor
朱晓雅
Current Assignee
Cloudminds Robotics Co Ltd
Original Assignee
Cloudminds Shanghai Robotics Co Ltd
Priority date
Filing date
Publication date
Application filed by Cloudminds Shanghai Robotics Co Ltd filed Critical Cloudminds Shanghai Robotics Co Ltd
Priority to CN202010103439.2A
Publication of CN111339887A
Application granted granted Critical
Publication of CN111339887B

Classifications

    • G06V 20/10 - Scenes; scene-specific elements: terrestrial scenes
    • G06F 18/214 - Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/253 - Pattern recognition: fusion techniques of extracted features
    • G06T 3/4038 - Geometric image transformations: image mosaicing, e.g. composing plane images from plane sub-images
    • G06T 5/80 - Image enhancement or restoration: geometric correction
    • G06V 10/751 - Image or video pattern matching: comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • Y02P 90/30 - Climate change mitigation in the production of goods: computing systems specially adapted for manufacturing

Abstract

The embodiment of the invention relates to the technical field of target detection, and discloses a commodity identification method and an intelligent container system. The method is applied to an intelligent container comprising a plurality of shelf layers, each of which is provided with a first image acquisition device and a second image acquisition device. The method comprises the following steps: acquiring a first image through the first image acquisition device and a second image through the second image acquisition device; stitching the first image and the second image to obtain a stitched image; training an original recognition model on the stitched image, and taking the trained original recognition model as a preset recognition model; and performing commodity identification through the preset recognition model. In this way, the embodiment of the invention can reduce the probability of missed detections during commodity identification.

Description

Commodity identification method and intelligent container system
Technical Field
The embodiment of the invention relates to the technical field of target detection, in particular to a commodity identification method and an intelligent container system.
Background
In current intelligent containers, only one camera is mounted at the top of each shelf layer, and commodity identification is performed on the images it captures. When the container volume increases, a single camera can no longer photograph all the commodities on each layer, which easily leads to missed detections during commodity identification.
Disclosure of Invention
An object of the embodiments of the present invention is to provide a commodity identification method and an intelligent container system that can reduce the probability of missed detections during commodity identification.
According to an aspect of an embodiment of the present invention, there is provided a commodity identification method applied to an intelligent container, the intelligent container including a plurality of shelf layers, each of the shelf layers being provided with a first image acquisition device and a second image acquisition device;
the method comprises the following steps: acquiring a first image by the first image acquisition device and acquiring a second image by the second image acquisition device; splicing the first image and the second image to obtain a spliced image; training an original recognition model according to the spliced image, and taking the trained original recognition model as a preset recognition model; and carrying out commodity identification through the preset identification model.
In an optional manner, the stitching the first image and the second image to obtain a stitched image specifically includes: extracting first characteristic points of the first image and second characteristic points of the second image respectively; acquiring a matched feature point set according to the first feature point and the second feature point; and according to the matched feature point set, converting the first image and the second image into the same coordinate system to obtain the spliced image.
In an optional manner, the stitching the first image and the second image to obtain a stitched image specifically further includes: after the first image and the second image are converted into the same coordinate system, carrying out weighted fusion processing on an overlapped area of the first image and the second image; and taking the processed image formed by the first image and the second image as the spliced image.
In an alternative manner, before the stitching the first image and the second image, the method further includes: and carrying out distortion correction processing on the first image and/or the second image.
In an optional manner, when the first image and/or the second image are/is fisheye images, the performing distortion correction processing on the first image and/or the second image specifically includes: according to the mapping relation between the fisheye image and the target image, determining the coordinates of each pixel point in the fisheye image on the target image; and carrying out bilinear interpolation on the target image according to the determined coordinates and the fisheye image to obtain a pixel value of each pixel point in the target image, and taking the target image obtained after interpolation as a fisheye image after distortion correction processing.
In an optional manner, training an original recognition model according to the stitched image, and taking the trained original recognition model as a preset recognition model, specifically including: acquiring an original training image; the original training image is subjected to the distortion correction processing, and the original training image after the distortion correction processing is recorded as a first data set; training the original recognition model according to the original training image and the first data set, and taking the trained original recognition model as a first intermediate model; inputting the spliced image into the first intermediate model to acquire a second data set output by the first intermediate model; training the first intermediate model according to the original training image, the first data set and the second data set, and taking the trained first intermediate model as a second intermediate model; inputting the second data set into the second intermediate model to obtain a third data set output by the second intermediate model; after screening or manually labeling the third data set according to the spliced image, training the second intermediate model according to the original training image, the first data set and the third data set, and taking the trained second intermediate model as the preset recognition model.
In an alternative way, the first image capturing device and/or the second image capturing device is a fisheye camera.
In an optional manner, the commodity identification through the preset identification model specifically includes: before purchasing goods, identifying the original goods in the intelligent container through the preset identification model; after the completion of commodity selection is confirmed, identifying the rest commodities in the intelligent container through the preset identification model; and comparing the original commodity with the residual commodity to obtain a settled commodity.
According to yet another aspect of the embodiment of the present invention, there is provided an intelligent container system including an intelligent container and a cloud server; the intelligent container comprises a plurality of shelf layers, wherein each shelf layer is provided with a first image acquisition device and a second image acquisition device, the first image acquisition device is used for acquiring a first image, and the second image acquisition device is used for acquiring a second image; the cloud server comprises a processor and a memory, wherein the memory is used for storing at least one executable instruction, and when the cloud server runs, the processor executes the executable instruction to enable the processor to execute the operation of the commodity identification method.
According to another aspect of embodiments of the present invention, there is provided a computer-readable storage medium having stored therein at least one executable instruction for causing a processor to perform steps according to the article identification method as described above.
According to the embodiment of the invention, a first image is acquired through the first image acquisition device and a second image through the second image acquisition device; the two images are stitched to obtain a stitched image; an original recognition model is trained on the stitched image, and the trained model is taken as a preset recognition model; and commodity identification is performed through the preset recognition model. When the volume of the intelligent container is increased, the two image acquisition devices can still photograph all the commodities on each shelf layer, so the probability of missed detections during commodity identification is reduced.
The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the specification, specific embodiments of the present invention are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 shows a schematic diagram of the structure of an intelligent container provided by an embodiment of the invention;
FIG. 2 shows a schematic diagram of the structure of an intelligent container according to an embodiment of the present invention;
fig. 3 is a schematic flow chart of a commodity identification method according to an embodiment of the present invention;
fig. 4 is a schematic flow chart of a commodity identification method according to another embodiment of the present invention;
FIG. 5 is a schematic flow chart of a commodity identification method according to another embodiment of the present invention;
FIG. 6 shows a schematic diagram of the structure of an intelligent container system provided by an embodiment of the invention;
fig. 7 shows a schematic structural diagram of the cloud server of fig. 6.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
In current intelligent containers, only one camera is arranged at the top of each layer, and commodity identification is performed on the images it captures. When customers demand larger quantities or more kinds of goods, the container needs to be modified to increase its volume and accommodate more goods. Once the container volume increases, a single camera can no longer photograph all the commodities on each layer (for example, the commodities at the left and right ends of each layer cannot be photographed), which causes missed shots and easily leads to missed detections during commodity identification.
Based on the above, the embodiment of the invention provides a commodity identification method and an intelligent container system, which can reduce the probability of missed detection in commodity identification.
It should be noted that the commodity identification method in the embodiment of the present invention is applied to an intelligent container. As shown in FIG. 1, the intelligent container 100 may include a number of shelf layers 101, each shelf layer 101 being provided with a first image acquisition device 102 and a second image acquisition device 103. The first image acquisition device 102 and the second image acquisition device 103 may be disposed at the left and right portions of the top of the shelf layer, respectively, and positioned such that together they photograph at least the entire shelf layer. For example, as shown in fig. 2, the first image acquisition device 102 and the second image acquisition device 103 are disposed at the first and third quarter points of the top of the shelf layer, respectively.
The first image capturing device 102 may be a normal camera or a fisheye camera, and the second image capturing device 103 may be a normal camera or a fisheye camera. For example, the first image acquisition device 102 and the second image acquisition device 103 are both normal cameras; alternatively, the first image acquisition device 102 and the second image acquisition device 103 are both fisheye cameras; alternatively, the first image acquisition device 102 is a normal camera, and the second image acquisition device 103 is a fisheye camera.
A fisheye camera is a camera fitted with a fisheye lens, i.e., a lens with a focal length of 16 mm or less and a viewing angle of 180 degrees or more. A normal camera may be a standard camera, or any wide-angle camera other than a fisheye camera. A standard camera is fitted with a standard lens, i.e., a lens with a focal length of 40 mm to 55 mm, whose rendering of perspective is relatively close to that of the human eye. Compared with a normal camera, a fisheye camera can photograph a wider range of commodities.
In particular, embodiments of the present invention are further described below with reference to the accompanying drawings.
It should be understood that the embodiments provided below may be combined with one another to form new embodiments, so long as they do not conflict.
Fig. 3 is a schematic flow chart of a commodity identification method according to an embodiment of the present invention. As shown in fig. 3, the commodity identification method includes:
step 210, acquiring a first image by a first image acquisition device and acquiring a second image by a second image acquisition device.
With the first image acquisition device and the second image acquisition device arranged on a shelf layer of the intelligent container and goods placed on that layer, the first image acquisition device captures a first image and the second image acquisition device captures a second image; the first image can then be obtained from the first image acquisition device and the second image from the second image acquisition device.
Step 220, stitching the first image and the second image to obtain a stitched image.
Because the first image acquisition device and the second image acquisition device are located at the two ends of the top of the shelf layer, the first image and the second image show the commodities on the left and right portions of the shelf layer, respectively. However, since the fields of view of the two devices may overlap, the same commodity may appear in both the first image and the second image.
In some embodiments, when there is no overlapping portion of the frames captured by the first image capturing device and the second image capturing device, the first image and the second image may be directly stitched.
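As a concrete illustration, the non-overlapping case reduces to a plain side-by-side concatenation. The sketch below assumes Python with OpenCV and hypothetical file names, and assumes both frames have already been brought to the same height.

```python
import cv2

# Minimal sketch of the non-overlapping case: place the two frames side
# by side. File names are hypothetical; both frames are assumed to
# share the same height and pixel type.
left = cv2.imread("left_shelf.jpg")     # frame from the first device
right = cv2.imread("right_shelf.jpg")   # frame from the second device
stitched = cv2.hconcat([left, right])
cv2.imwrite("stitched.jpg", stitched)
```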
In some other embodiments, when there is an overlapping portion of the frames captured by the first image capturing device and the second image capturing device, step 220 may include:
step 221, extracting a first feature point of the first image and a second feature point of the second image respectively;
step 222, obtaining a matched feature point set according to the first feature point and the second feature point;
and 223, converting the first image and the second image into the same coordinate system according to the matched feature point set to obtain a spliced image.
A feature point is a point in the image where the gray value changes drastically, or a point of large curvature on an image edge (i.e., the intersection of two edges). As long as the image contains enough detectable feature points that are distinctive, stable, and precisely localizable, the image can be analyzed locally on the basis of these feature points rather than by observing the whole image.
In step 221, the first feature points of the first image and the second feature points of the second image may each be extracted with the Speeded-Up Robust Features (SURF) algorithm. SURF is an improvement on the Scale-Invariant Feature Transform (SIFT) algorithm, which extracts features that are invariant to scale; compared with SIFT, SURF offers higher computational efficiency.
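A minimal sketch of step 221 follows, assuming Python with OpenCV. SURF ships with the opencv-contrib-python package (cv2.xfeatures2d); the Hessian threshold of 400 and the file names are illustrative, not values from the patent, and where SURF is unavailable, ORB is a patent-free stand-in for this sketch.

```python
import cv2

# Sketch of step 221: detect feature points and compute descriptors in
# both frames. File names and the Hessian threshold are illustrative.
first_image = cv2.imread("left_shelf.jpg")
second_image = cv2.imread("right_shelf.jpg")
gray1 = cv2.cvtColor(first_image, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(second_image, cv2.COLOR_BGR2GRAY)

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
kp1, des1 = surf.detectAndCompute(gray1, None)   # first feature points
kp2, des2 = surf.detectAndCompute(gray2, None)   # second feature points
```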
Optionally, after step 221, step 220 may further include: acquiring a feature point descriptor for each first feature point and each second feature point.
In step 222, the matched feature point set refers to the set of optimal pairings between first feature points and second feature points, and the matching may be performed on the feature point descriptors. To exclude feature points that have no matching counterpart because of image occlusion or background clutter, the SIFT matching method that compares the nearest-neighbor distance with the second-nearest-neighbor distance (i.e., Lowe's ratio test) may be used to extract the optimal pairings. Specifically: take a first feature point in the first image and find the two second feature points in the second image with the smallest Euclidean distance to it; if the ratio of the nearest distance to the second-nearest distance is below a preset threshold T, accept the first feature point and its nearest second feature point as an optimal pairing. Because the feature space is high-dimensional, many false matches have similar distances, so their distance ratio is relatively high. Lowering the preset threshold T reduces the number of SIFT matches but makes them more stable. The threshold T may be chosen as follows: T = 0.4 for matching with higher accuracy requirements; T = 0.6 when a relatively large number of matching points is required; T = 0.5 as a typical default.
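Continuing the sketch above, the nearest/next-nearest comparison maps directly onto a brute-force k-nearest-neighbour match followed by Lowe's ratio test; T = 0.5 is the typical value suggested above.

```python
import cv2

# Sketch of step 222: for each first feature point, find its two
# nearest second feature points by Euclidean (L2) distance and keep the
# pairing only if nearest < T * next-nearest. des1/des2 come from the
# previous sketch.
matcher = cv2.BFMatcher(cv2.NORM_L2)
T = 0.5                                           # typical threshold
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < T * n.distance]           # optimal pairings
```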
In step 223, once the matched feature point set is obtained, the first image and the second image are converted into the same coordinate system to produce the stitched image. Specifically: acquire the projection mapping matrix from the first image to the second image; calculate the four vertex coordinates of the image to be registered (e.g., the first image); register the second image to the first image, i.e., apply the projective transformation; calculate the size of the stitched image from the first image and the registered second image; and copy the first image to its position in the result to obtain the stitched image.
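The same sequence can be sketched with OpenCV's homography estimation; RANSAC and the simple canvas sizing below are illustrative choices rather than requirements of the patent.

```python
import cv2
import numpy as np

# Sketch of step 223: estimate the projection mapping matrix from the
# matched pairs (kp1, kp2, good from the previous sketches), warp the
# second image into the first image's coordinate system, then copy the
# first image into place.
src_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
dst_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)

h1, w1 = first_image.shape[:2]
h2, w2 = second_image.shape[:2]
stitched = cv2.warpPerspective(second_image, H, (w1 + w2, max(h1, h2)))
stitched[0:h1, 0:w1] = first_image    # first image at its own position
```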
In some embodiments, the stitched image tends to look unnatural: because of differences in illumination, color and the like, the transition at the seam between the two images may not be smooth. Step 223 may therefore further include: after converting the first image and the second image into the same coordinate system, performing weighted fusion processing on the overlapping area of the first image and the second image, and taking the image formed by the processed first image and second image as the stitched image. Weighted fusion, also called pixel weighted averaging (Weighted Averaging, WA), is simple, easy to implement and fast, and can improve the signal-to-noise ratio of the fused image. The seam is improved by adding the pixel values of the overlapping area together according to preset weights to synthesize a new image.
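A sketch of the weighted-fusion step is given below. The linear weight ramp is one common choice of preset weight, and the overlap bounds x0 and x1 are assumed to be known from the registration step; both are illustrative assumptions.

```python
import numpy as np

# Sketch of the weighted fusion (pixel weighted averaging) step: blend
# the overlap between two images already in the same coordinate system,
# using a linear ramp so the seam fades smoothly. x0 and x1 are the
# column bounds of the overlapping area.
def blend_overlap(img_a, img_b, x0, x1):
    out = img_a.copy()
    alpha = np.linspace(1.0, 0.0, x1 - x0)[None, :, None]  # weight of img_a
    mixed = alpha * img_a[:, x0:x1] + (1.0 - alpha) * img_b[:, x0:x1]
    out[:, x0:x1] = mixed.astype(img_a.dtype)
    out[:, x1:] = img_b[:, x1:]        # right of the overlap: img_b only
    return out
```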
Step 230, training an original recognition model according to the stitched image, and taking the trained original recognition model as a preset recognition model.
The original recognition model is obtained by training on original training images with professional manual labeling, where the original training images are commodity images acquired when only a single image acquisition device is arranged at the top of each shelf layer. When two image acquisition devices are arranged on the shelf layer, directly applying the original recognition model to commodity identification yields poor results. The original recognition model therefore needs to be retrained with the stitched images fused in, and the retrained model is taken as the preset recognition model, which improves the generalization ability of the recognition model and thus the recognition performance.
Step 240, performing commodity identification through the preset recognition model.
The preset recognition model can be applied to an intelligent container with two image acquisition devices, and achieves a good recognition effect even when the volume of the intelligent container is increased.
According to the embodiment of the invention, a first image is acquired through the first image acquisition device and a second image through the second image acquisition device; the two images are stitched to obtain a stitched image; an original recognition model is trained on the stitched image, and the trained model is taken as a preset recognition model; and commodity identification is performed through the preset recognition model. When the volume of the intelligent container is increased, the two image acquisition devices can still photograph all the commodities on each shelf layer, so the probability of missed detections during commodity identification is reduced.
Fig. 4 is a schematic flow chart of a commodity identification method according to another embodiment of the present invention.
As shown in fig. 4, the method includes:
step 310, acquiring a first image by a first image acquisition device and acquiring a second image by a second image acquisition device.
Step 320, performing distortion correction processing on the first image and/or the second image.
Step 330, stitching the first image and the second image to obtain a stitched image.
Step 340, training an original recognition model according to the stitched image, and taking the trained original recognition model as a preset recognition model.
Step 350, performing commodity identification through the preset recognition model.
The implementations of steps 310, 330 and 350 are the same as those of steps 210, 220 and 240 in the above embodiments and are not repeated here.
In step 320, since the first image acquisition device or the second image acquisition device may be either a normal camera or a fisheye camera, the captured first image or second image may be a fisheye image and thus require distortion correction. For example, when the first image is a fisheye image and the second image is a normal image, distortion correction is performed on the first image; when the first image is a normal image and the second image is a fisheye image, distortion correction is performed on the second image; and when both images are fisheye images, distortion correction is performed on both.
Specifically, performing the distortion correction processing on a fisheye image includes:
step 321, determining coordinates of each pixel point in the fisheye image on the target image according to the mapping relation between the fisheye image and the target image;
and step 322, performing bilinear interpolation on the target image according to the determined coordinates and the fisheye image to obtain a pixel value of each pixel point in the target image, and taking the target image obtained after interpolation as the fisheye image after distortion correction processing.
The target image is the image obtained by applying distortion correction to the fisheye image. The mapping relationship between the fisheye image and the target image may be preset according to the desired correction effect. For example, it may be a longitude-latitude mapping, implemented as follows: establish a longitude-latitude mapping plane whose abscissa is the longitude of a sphere and whose ordinate is the latitude; establish the correspondence between a point p on the sphere and a point p' on the mapping plane; and map all pixel points on the sphere onto the plane to generate a rectangular longitude-latitude image. In this embodiment, the radius r of the fisheye image may first be calculated; the pixel coordinates (x', y') of the fisheye image are mapped to spherical coordinates (x, y, z) of radius r; the longitude-latitude coordinates (w, q) of the spherical coordinates (x, y, z) on the sphere are computed; and the longitude-latitude coordinates are then expanded over [0, π] to obtain the pixel coordinates (u', v') of the target image, thereby determining the coordinates of each pixel point of the fisheye image on the target image.
In step 322, after the coordinates on the target image are determined, bilinear interpolation is performed on the target image according to the determined coordinates and the fisheye image to obtain the pixel value of each pixel point, and the interpolated target image is taken as the distortion-corrected fisheye image. Because some pixels of the mapped target image receive no value, the target image needs to be filled in so that it is complete and clear.
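The two steps can be sketched together with OpenCV's remap, whose INTER_LINEAR mode performs the bilinear interpolation described above. The sketch assumes an equidistant 180-degree fisheye model with a known image circle (cx, cy, r); the model choice and all names are illustrative assumptions, not fixed by the patent.

```python
import cv2
import numpy as np

# Sketch of steps 321-322 under an assumed equidistant 180-degree
# fisheye model. For every target pixel we compute its longitude and
# latitude, trace the ray back onto the fisheye image circle (cx, cy, r),
# and let cv2.remap fill the target image by bilinear interpolation.
def latlong_correct(fisheye, cx, cy, r, out_w, out_h):
    u, v = np.meshgrid(np.arange(out_w), np.arange(out_h))
    lon = (u / out_w - 0.5) * np.pi        # longitude in [-pi/2, pi/2]
    lat = (v / out_h - 0.5) * np.pi        # latitude  in [-pi/2, pi/2]
    # direction of each (lon, lat) on the unit hemisphere
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    theta = np.arccos(np.clip(z, -1.0, 1.0))    # angle off optical axis
    rho = r * theta / (np.pi / 2)               # equidistant projection
    psi = np.arctan2(y, x)
    map_x = (cx + rho * np.cos(psi)).astype(np.float32)
    map_y = (cy + rho * np.sin(psi)).astype(np.float32)
    # INTER_LINEAR performs the bilinear interpolation of step 322
    return cv2.remap(fisheye, map_x, map_y, cv2.INTER_LINEAR)
```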
Step 340 may include:
step 341, acquiring an original training image;
step 342, performing distortion correction processing on the original training image, and marking the original training image after the distortion correction processing as a first data set;
step 343, training an original recognition model according to the original training image and the first data set, and taking the trained original recognition model as a first intermediate model;
step 344, inputting the spliced image into the first intermediate model to obtain a second data set output by the first intermediate model;
step 345, training a first intermediate model according to the original training image, the first data set and the second data set, and taking the trained first intermediate model as a second intermediate model;
step 346, inputting the second data set into the second intermediate model, and obtaining a third data set output by the second intermediate model;
step 347, after screening or manually labeling the third data set according to the stitched image, training a second intermediate model according to the original training image, the first data set and the third data set, and taking the trained second intermediate model as a preset recognition model.
Here, the original training image is a collection of several images. When the original training images are fisheye images, distortion correction is applied to them (i.e., they are processed as in step 320), so that the fisheye images are corrected to planar images and the corresponding labeling information is converted to planar coordinates; the distortion-corrected original training images (together with the converted labeling information) are recorded as the first data set.
Step 343 may specifically be: inputting the original training images (with their original labeling information) and the first data set into the original recognition model, fine-tuning the parameters of the original recognition model to train it, and taking the trained original recognition model as the first intermediate model. The resulting first intermediate model can thus be applied both to uncorrected commodity images acquired by a single fisheye camera and to corrected commodity images acquired by a single fisheye camera.
In step 344, the stitched images are input into the first intermediate model, which labels them automatically, and the data output by the first intermediate model is recorded as the second data set. Because a stitched image may differ from a single image, the labels that the first intermediate model produces for stitched images are not necessarily accurate, so inaccurate data may exist in the second data set.
Step 345 may specifically be: inputting the original training images (with their original labeling information), the first data set and the second data set into the first intermediate model, fine-tuning its parameters to train it, and taking the trained first intermediate model as the second intermediate model. The resulting second intermediate model can thus be applied to uncorrected commodity images acquired by a single fisheye camera, corrected commodity images acquired by a single fisheye camera, and corrected-and-stitched commodity images acquired by two cameras (normal or fisheye).
In step 346, the second data set is input into the second intermediate model, which automatically labels the images in it, and the data output by the second intermediate model is recorded as the third data set. Since the second data set may contain inaccurate labels, re-labeling its images with the second intermediate model reduces the number of inaccurate labels and thus greatly reduces the workload of manual labeling.
In step 347, screening or manually labeling the third data set according to the stitched images means that inaccurate data in the third data set are manually screened out or manually re-labeled: after the stitched images are manually labeled, the manual labels are compared with the labels in the third data set to identify the inaccurate data. After the third data set has been processed, the original training images (with their original labeling information), the first data set and the third data set are input into the second intermediate model, its parameters are adjusted to train it, and the trained second intermediate model is taken as the preset recognition model. The resulting preset recognition model is thus applicable to uncorrected commodity images from a single fisheye camera, corrected commodity images from a single fisheye camera, and corrected-and-stitched commodity images from two cameras (normal or fisheye), with accurate recognition results.
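The data flow of steps 341-347 can be summarized in a sketch. The patent fixes neither the detector architecture nor the labeling tooling, so train, predict, undistort and screen_or_relabel below are injected placeholders rather than a concrete API; only the staging mirrors the patent.

```python
from typing import Any, Callable, List

# Placeholder-level sketch of the three-stage training loop of
# steps 341-347; each callable stands in for a real detector or
# labeling tool.
def build_preset_model(train: Callable, predict: Callable,
                       undistort: Callable, screen_or_relabel: Callable,
                       base_model: Any,
                       original_pairs: List,   # list of (image, label)
                       stitched: List) -> Any: # list of stitched images
    # Step 342: correct the originals (label coordinates would need the
    # same conversion; glossed over here) -> first data set
    first_set = [(undistort(img), lbl) for img, lbl in original_pairs]
    # Step 343: fine-tune on originals + first data set -> first model
    first_model = train(base_model, original_pairs + first_set)
    # Step 344: auto-label the stitched images -> second data set
    second_set = [(img, predict(first_model, img)) for img in stitched]
    # Step 345: fine-tune on everything so far -> second model
    second_model = train(first_model, original_pairs + first_set + second_set)
    # Step 346: re-label the second data set -> third data set
    third_set = [(img, predict(second_model, img)) for img, _ in second_set]
    # Step 347: screen / manually fix, then train the preset model
    third_set = screen_or_relabel(third_set, stitched)
    return train(second_model, original_pairs + first_set + third_set)
```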
In some other embodiments, step 340 may instead comprise: acquiring original training images; performing distortion correction on them and recording the corrected images as a first data set; training the original recognition model on the original training images and the first data set, and taking the trained model as a first intermediate model; inputting the directly stitched (uncorrected) images into the first intermediate model to obtain a second data set; training the first intermediate model on the original training images, the first data set and the second data set, and taking the result as a second intermediate model; inputting the second data set into the second intermediate model to obtain a third data set; after screening or manually labeling the third data set according to the stitched images, inputting the corrected-and-stitched images into the second intermediate model to obtain a fourth data set; training the second intermediate model on the original training images, the first data set, the third data set and the fourth data set, and taking the result as a third intermediate model; inputting the fourth data set into the third intermediate model to obtain a fifth data set; and, after screening or manually labeling the fifth data set according to the corrected-and-stitched images, training the third intermediate model on the original training images, the first data set, the third data set and the fifth data set, and taking the trained third intermediate model as the preset recognition model. By distinguishing directly stitched (uncorrected) images from corrected-and-stitched images and training on them separately, the trained preset recognition model achieves a better recognition effect.
According to this embodiment of the invention, a first image is acquired through the first image acquisition device and a second image through the second image acquisition device; distortion correction is applied to the first image and/or the second image; the two images are stitched to obtain a stitched image; the original recognition model is trained on the stitched image, and the trained model is taken as the preset recognition model; and commodity identification is performed through the preset recognition model. When the volume of the intelligent container increases, the two image acquisition devices can still photograph all the commodities on each shelf layer, reducing the probability of missed detections. Moreover, by continuously collecting data in the background and enlarging the data set, the diversity of the images increases, so the trained preset recognition model generalizes well: it can identify commodities for different types of cameras and containers of different sizes with a good recognition effect, which avoids retraining the model every time the camera or the container size is changed and thus saves manpower.
Fig. 5 is a schematic flow chart of a commodity identification method according to still another embodiment of the present invention. The difference from the above embodiments is that, as shown in fig. 5, performing commodity identification through the preset recognition model may specifically include:
step 241, before the commodity purchase, identifying the original commodities in the intelligent container through the preset recognition model;
step 242, after commodity selection is confirmed to be complete, identifying the remaining commodities in the intelligent container through the preset recognition model;
and 243, comparing the original commodity with the rest commodity to obtain the settlement commodity.
The moment "before the commodity purchase" may be when the door of the intelligent container is opened for the current customer, or when settlement for the previous customer has been completed. Identifying the original commodities in the intelligent container through the preset recognition model may specifically be: acquiring commodity images through the first image acquisition device and the second image acquisition device, correcting and stitching them, and using the preset recognition model to identify the commodity categories and quantities in the images, thereby determining the original commodities in the intelligent container.
Confirmation that commodity selection is complete may be triggered by receiving door-closing information from the intelligent container or preset timing information. Identifying the remaining commodities in the intelligent container through the preset recognition model may specifically be: after commodity selection is confirmed to be complete, acquiring commodity images through the first image acquisition device and the second image acquisition device, correcting and stitching them, and using the preset recognition model to identify the commodity categories and quantities in the images, thereby determining the remaining commodities in the intelligent container.
In step 243, the original commodities are compared with the remaining commodities, and the commodities missing from the original commodities are determined to be the commodities to be settled. Once these are obtained, the amount payable by the customer can be calculated from their categories and corresponding unit prices, so that settlement can be completed for the customer.
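As a small illustration of step 243, per-item counts before and after the purchase can be differenced directly; the goods and prices below are invented for the example.

```python
from collections import Counter

# Sketch of step 243: the commodities to be settled are whatever the
# "before" inventory contains that the "after" inventory no longer
# does. Counter subtraction keeps per-item quantities.
def settle(before, after, unit_price):
    taken = Counter(before) - Counter(after)     # missing items
    amount = sum(unit_price[item] * qty for item, qty in taken.items())
    return dict(taken), amount

# Invented example: one cola was taken.
items, due = settle({"cola": 5, "water": 3},
                    {"cola": 4, "water": 3},
                    {"cola": 3.0, "water": 2.0})
# items == {"cola": 1}, due == 3.0
```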
According to the embodiment of the invention, a first image is acquired through the first image acquisition device and a second image through the second image acquisition device; the two images are stitched to obtain a stitched image; an original recognition model is trained on the stitched image, and the trained model is taken as a preset recognition model; and commodity identification is performed through the preset recognition model. When the volume of the intelligent container is increased, the two image acquisition devices can still photograph all the commodities on each shelf layer, so the probability of missed detections during commodity identification is reduced.
FIG. 6 shows a schematic diagram of the structure of an intelligent container system according to an embodiment of the invention. As shown in fig. 6, the system 400 includes an intelligent container 100 and a cloud server 500, which may be connected wirelessly and are capable of communicating with each other.
The intelligent container 100 may be as shown in FIG. 1; the embodiment of the present invention does not limit the specific implementation of the intelligent container. The intelligent container 100 comprises a number of shelf layers 101, each shelf layer 101 being provided with a first image acquisition device 102 and a second image acquisition device 103; the first image acquisition device 102 is used for acquiring a first image, and the second image acquisition device 103 is used for acquiring a second image.
The cloud server 500 includes a processor and a memory, where the memory is configured to store at least one executable instruction, and when the cloud server runs, the processor executes the executable instruction to cause the processor to execute the steps of the commodity identification method according to any of the above method embodiments.
Optionally, as shown in fig. 7, the cloud server 500 may include: a processor 502, a communication interface (Communications Interface) 504, a memory 506, and a communication bus 508.
The processor 502, the communication interface 504, and the memory 506 communicate with each other via the communication bus 508. The communication interface 504 is used for communicating with network elements of other devices, such as clients or other servers. The processor 502 is configured to execute the program 510, and may specifically perform the commodity identification method in any of the above-described method embodiments.
In particular, program 510 may include program code including computer-operating instructions.
The processor 502 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the cloud server may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
The memory 506 is used for storing the program 510. The memory 506 may comprise high-speed RAM, and may further comprise non-volatile memory, such as at least one disk memory.
It should be noted that the intelligent container system provided by the embodiment of the invention is a system capable of executing the commodity identification method, so all the embodiments based on the commodity identification method are applicable to the system and achieve the same or similar beneficial effects.
According to the embodiment of the invention, a first image is acquired through the first image acquisition device and a second image through the second image acquisition device; the two images are stitched to obtain a stitched image; an original recognition model is trained on the stitched image, and the trained model is taken as a preset recognition model; and commodity identification is performed through the preset recognition model. When the volume of the intelligent container is increased, the two image acquisition devices can still photograph all the commodities on each shelf layer, so the probability of missed detections during commodity identification is reduced.
An embodiment of the present invention provides a computer readable storage medium having stored therein at least one executable instruction for causing a processor to perform the article identification method of any of the above-described method embodiments.
An embodiment of the present invention provides a computer program product comprising a computer program stored on a computer storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the article identification method of any of the method embodiments described above.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The structure required to construct such a system is apparent from the description above. Moreover, embodiments of the present invention are not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided to disclose enablement and best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specifically stated.

Claims (8)

1. The commodity identification method is characterized by being applied to an intelligent container, wherein the intelligent container comprises a plurality of shelf layers, and each shelf layer is provided with a first image acquisition device and a second image acquisition device;
the method comprises the following steps:
acquiring a first image by the first image acquisition device and acquiring a second image by the second image acquisition device;
performing distortion correction processing on the first image and/or the second image;
splicing the first image and the second image to obtain a spliced image;
acquiring an original training image according to the spliced image;
the original training image is subjected to the distortion correction processing, and the original training image after the distortion correction processing is recorded as a first data set;
training an original recognition model according to the original training image and the first data set, and taking the trained original recognition model as a first intermediate model;
inputting the spliced image into the first intermediate model to acquire a second data set output by the first intermediate model;
training the first intermediate model according to the original training image, the first data set and the second data set, and taking the trained first intermediate model as a second intermediate model;
inputting the second data set into the second intermediate model to obtain a third data set output by the second intermediate model;
after screening or manually labeling the third data set according to the spliced image, training the second intermediate model according to the original training image, the first data set and the third data set, and taking the trained second intermediate model as a preset recognition model;
and carrying out commodity identification through the preset identification model.
2. The method according to claim 1, wherein the stitching the first image and the second image to obtain a stitched image, specifically comprises:
extracting first characteristic points of the first image and second characteristic points of the second image respectively;
acquiring a matched feature point set according to the first feature point and the second feature point;
and according to the matched feature point set, converting the first image and the second image into the same coordinate system to obtain the spliced image.
3. The method according to claim 2, wherein the stitching the first image and the second image to obtain a stitched image, in particular further comprises:
after the first image and the second image are converted into the same coordinate system, carrying out weighted fusion processing on an overlapped area of the first image and the second image;
and taking the processed image formed by the first image and the second image as the spliced image.
4. The method according to claim 1, wherein when the first image and/or the second image is a fisheye image, the performing distortion correction processing on the first image and/or the second image specifically includes:
according to the mapping relation between the fisheye image and the target image, determining the coordinates of each pixel point in the fisheye image on the target image;
and carrying out bilinear interpolation on the target image according to the determined coordinates and the fisheye image to obtain a pixel value of each pixel point in the target image, and taking the target image obtained after interpolation as a fisheye image after distortion correction processing.
5. The method according to claim 1, wherein the first image acquisition device and/or the second image acquisition device is a fisheye camera.
6. The method according to any one of claims 1 to 5, wherein the identifying the commodity by the preset identification model specifically comprises:
before purchasing goods, identifying the original goods in the intelligent container through the preset identification model;
after the completion of commodity selection is confirmed, identifying the rest commodities in the intelligent container through the preset identification model;
and comparing the original commodity with the residual commodity to obtain a settled commodity.
7. An intelligent container system is characterized by comprising an intelligent container and a cloud server;
the intelligent container comprises a plurality of shelf layers, wherein each shelf layer is provided with a first image acquisition device and a second image acquisition device, the first image acquisition device is used for acquiring a first image, and the second image acquisition device is used for acquiring a second image;
the cloud server comprises a processor and a memory, wherein the memory is used for storing at least one executable instruction, and when the cloud server runs, the processor executes the executable instruction to enable the processor to execute the steps of the commodity identification method according to any one of claims 1-6.
8. A computer readable storage medium having stored therein at least one executable instruction for causing a processor to perform the steps of the article identification method according to any one of claims 1-6.
CN202010103439.2A 2020-02-20 2020-02-20 Commodity identification method and intelligent container system Active CN111339887B

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010103439.2A 2020-02-20 2020-02-20 Commodity identification method and intelligent container system (CN111339887B)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010103439.2A 2020-02-20 2020-02-20 Commodity identification method and intelligent container system (CN111339887B)

Publications (2)

Publication Number Publication Date
CN111339887A CN111339887A (en) 2020-06-26
CN111339887B true CN111339887B (en) 2023-07-21

Family

Family ID: 71181668

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010103439.2A Commodity identification method and intelligent container system 2020-02-20 2020-02-20 (Active, granted as CN111339887B)

Country Status (1)

Country Link
CN (1) CN111339887B

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381184B (en) * 2021-01-15 2021-05-25 北京每日优鲜电子商务有限公司 Image detection method, image detection device, electronic equipment and computer readable medium
CN113781371B (en) * 2021-08-23 2024-01-30 南京掌控网络科技有限公司 Method and equipment for de-duplication and splicing of shelf picture identification result
CN114372993B (en) * 2021-12-20 2022-10-28 广州市玄武无线科技股份有限公司 Layered detection method and system for oblique-shooting shelf based on image correction

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960318A (en) * 2018-06-28 2018-12-07 武汉市哈哈便利科技有限公司 A kind of commodity recognizer using binocular vision technology for self-service cabinet

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109035134B (en) * 2017-06-08 2021-09-28 株式会社理光 Panoramic image splicing method and device, electronic equipment and storage medium
CN108549851B (en) * 2018-03-27 2020-08-25 合肥美的智能科技有限公司 Method and device for identifying goods in intelligent container and intelligent container
CN109902636A (en) * 2019-03-05 2019-06-18 上海扩博智能技术有限公司 Commodity identification model training method, system, equipment and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960318A (en) * 2018-06-28 2018-12-07 武汉市哈哈便利科技有限公司 A kind of commodity recognizer using binocular vision technology for self-service cabinet

Also Published As

Publication number Publication date
CN111339887A 2020-06-26

Similar Documents

Publication Publication Date Title
CN111339887B (en) Commodity identification method and intelligent container system
JP5830546B2 (en) Determination of model parameters based on model transformation of objects
CN111723611A (en) Pedestrian re-identification method and device and storage medium
US8948533B2 (en) Increased quality of image objects based on depth in scene
CN110766002B (en) Ship name character region detection method based on deep learning
CN110189343B (en) Image labeling method, device and system
CN111161295B (en) Dish image background stripping method
CN112686220B (en) Commodity identification method and device, computing equipment and computer storage medium
CN109859104B (en) Method for generating picture by video, computer readable medium and conversion system
CN109919971A (en) Image processing method, device, electronic equipment and computer readable storage medium
CN111738036A (en) Image processing method, device, equipment and storage medium
CN111160395A (en) Image recognition method and device, electronic equipment and storage medium
CN111814754A (en) Single-frame image pedestrian detection method and device for night scene
WO2020259416A1 (en) Image collection control method and apparatus, electronic device, and storage medium
CN112163995A (en) Splicing generation method and device for oversized aerial photographing strip images
Xue et al. Fisheye distortion rectification from deep straight lines
Kannala et al. Object recognition and segmentation by non-rigid quasi-dense matching
CN118015190A (en) Autonomous construction method and device of digital twin model
CN111428743A (en) Commodity identification method, commodity processing device and electronic equipment
CN111242094B (en) Commodity identification method, intelligent container and intelligent container system
Xian et al. Neural lens modeling
CN117253022A (en) Object identification method, device and inspection equipment
CN112465702A (en) Synchronous self-adaptive splicing display processing method for multi-channel ultrahigh-definition video
CN112070077A (en) Deep learning-based food identification method and device
CN109598675B (en) Splicing method of multiple repeated texture images

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
TA01: Transfer of patent application right
  Effective date of registration: 20210127
  Address after: 200000 second floor, building 2, no.1508, Kunyang Road, Minhang District, Shanghai
  Applicant after: Dalu Robot Co.,Ltd.
  Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)
  Applicant before: CLOUDMINDS (SHENZHEN) ROBOTICS SYSTEMS Co.,Ltd.
CB02: Change of applicant information
  Address after: 201111 Building 8, No. 207, Zhongqing Road, Minhang District, Shanghai
  Applicant after: Dayu robot Co.,Ltd.
  Address before: 200000 second floor, building 2, no.1508, Kunyang Road, Minhang District, Shanghai
  Applicant before: Dalu Robot Co.,Ltd.
GR01: Patent grant