CN113536829A - Goods static identification method of unmanned retail container - Google Patents

Goods static identification method of unmanned retail container

Info

Publication number
CN113536829A
CN113536829A
Authority
CN
China
Prior art keywords
target detection
network
static
goods
detection model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010286329.4A
Other languages
Chinese (zh)
Other versions
CN113536829B (en)
Inventor
Zhang Haijun (张海军)
Li Donghai (李东海)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology (Shenzhen); Shenzhen Institute of Science and Technology Innovation, Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology (Shenzhen); Shenzhen Institute of Science and Technology Innovation, Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology (Shenzhen) and Shenzhen Institute of Science and Technology Innovation, Harbin Institute of Technology
Priority to CN202010286329.4A
Publication of CN113536829A
Application granted
Publication of CN113536829B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G07CHECKING-DEVICES
    • G07FCOIN-FREED OR LIKE APPARATUS
    • G07F9/00Details other than those peculiar to special kinds or types of apparatus
    • G07F9/006Details of the software used for the vending machines
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a static goods identification method for an unmanned retail container. The method constructs a static recognition dataset through manual image collection and manual annotation; builds a one-stage target detection model by introducing a deformable convolutional neural network and group normalization layers into the backbone network, selecting the focal loss function for classification in the sub-networks and the balanced L1 loss function for coordinate regression; trains the one-stage target detection model to obtain the network parameters; and inputs the network parameters into the unmanned retail container to identify the category and quantity of the goods. The static goods identification method provided by the invention solves the instability of edge-goods detection in traditional target detection models and, by improving the goods recognition rate, improves the user experience of unmanned vending.

Description

Goods static identification method of unmanned retail container
Technical Field
The invention belongs to the field of object recognition for unmanned retail, and in particular relates to a static goods identification method for an unmanned retail container.
Background
As a major class of unattended services, unmanned retail refers to retail consumption that takes place without on-site staff: a new retail service, realized with intelligent technology, that requires no shopping guide or cashier in attendance. Vending machines, which first appeared in the 1880s, are an early example of the unmanned retail model. Today, new technologies such as mobile payment and two-dimensional (QR) codes have produced a new generation of vending machines, and their use greatly improves selling efficiency and user experience compared with conventional vending machines. Typically, a consumer opens an application providing mobile payment services, such as Alipay or WeChat, and then enters the transaction settlement process by scanning a QR code. However, the business process still follows the traditional operational shopping flow: only one item can be selected at a time, so a user who wants to purchase several items must repeat the operation many times, which is inconvenient. In contrast, newly developed unmanned intelligent vending machines can greatly improve the shopping experience by employing advanced computer vision technology. At the 2018 Computer Vision Summit, Tencent YouTu Lab introduced an unmanned intelligent retail container equipped with artificial intelligence technology. It integrates deep learning, visual product recognition algorithms and WeChat online payment into a vision-based unmanned intelligent retail container, exploring a new shopping mode and greatly improving the purchasing experience over traditional vending machines. With the rapid development of computer vision, RFID, deep learning and Internet of Things technologies, the unmanned intelligent vending machine, as an important form of unmanned retail, is becoming increasingly popular in the e-commerce market.
The core of the static goods identification method for the unmanned retail container environment is the target detection algorithm. The development of target detection can be roughly divided into two main branches: two-stage detection methods and one-stage detection methods. In recent years, detection performance on multiple benchmark datasets has been continuously pushed forward by two-stage and one-stage algorithms based on convolutional neural networks. In 2014, Girshick et al. proposed the R-CNN detector, an important algorithm that introduced deep learning into the target detection field. In later work, Girshick et al. proposed the improved Fast R-CNN. Based on the idea of a multi-task loss function, Fast R-CNN combines the classification loss and the bounding-box regression loss into a unified end-to-end training framework. However, generating positive and negative candidate boxes still requires a selective-search algorithm to produce the object proposals, which separates this step from the training of the detector and is very time-consuming at test time. To address this problem, Ren et al. proposed Faster R-CNN, whose region proposal network module generates the candidate boxes. In addition, inspired by the regression-based one-stage OverFeat algorithm, Redmon et al. proposed a one-stage detection method named YOLO, which omits the proposal-extraction branch (the candidate-box suggestion stage) and integrates feature extraction, bounding-box position regression and classification into a single convolutional network.
When a traditional target detection algorithm is used to detect and analyze the goods in an unmanned retail container, items near the edge of the picture are detected unstably and their bounding boxes are frequently lost, which lowers the recall rate, degrades the user experience, and seriously hinders the market adoption of unmanned retail containers.
Disclosure of Invention
The invention aims to provide a static goods identification method for an unmanned retail container that introduces a deformable convolutional neural network into a one-stage target detection model, thereby improving the goods recognition rate and solving the problem of unstable detection of edge goods in the unmanned retail container.
To achieve this purpose, the invention adopts the following technical scheme. A static goods identification method for an unmanned retail container comprises: constructing a static dataset by manually collecting images and manually annotating each image's label, category and bounding-box coordinate information; constructing a one-stage target detection model comprising a backbone network and sub-networks, where a deformable convolutional neural network is introduced into the backbone network and a group normalization layer is selected as its normalization layer, and where, in the sub-networks, the focal loss function is selected for classification and the balanced L1 loss function is selected for coordinate regression on the bounding-box coordinates; training the one-stage target detection model, taking the pictures of the static dataset as input, extracting features through the backbone network, and taking the label, category and bounding-box coordinate information as output, to obtain the network parameters; and inputting the network parameters into the unmanned retail container to perform static goods identification.
Specifically, the backbone network employs a residual network.
Preferably, the deformable convolutional neural network is introduced into the last three convolutional layers of the backbone network.
Preferably, the one-stage target detection model is trained with a stochastic gradient descent algorithm with momentum.
Specifically, the one-stage target detection model is a DrtNet model constructed on the basis of the RetinaNet model.
The invention has the beneficial effects that a deformable convolutional neural network and group normalization layers are added on the basis of the RetinaNet model, and the focal loss and balanced L1 loss functions are selected for classification and regression. This improves the recall rate on the original dataset, prevents bounding boxes from being lost to a certain extent, and improves the goods recognition rate of the unmanned retail container.
Drawings
FIG. 1 is a flow chart of a method of static identification of goods for an unmanned retail container of the present invention;
FIG. 2 is a block diagram of a deformable convolutional neural network of the present invention;
FIG. 3 is a schematic view of an embodiment of an unmanned retail container of the present invention;
FIG. 4 is a flow chart of the operation of an unmanned retail container according to an embodiment of the present invention;
FIG. 5 shows the beverage goods detection results according to an embodiment of the present invention.
The reference numerals in the figures denote:
1. unmanned retail container; 2. cabinet door; 3. camera.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention are described in detail below with examples. It should be understood that the examples described herein are intended only to illustrate the present invention, not to limit its scope.
Referring to fig. 1, fig. 1 is a flow chart of the static goods identification method of the unmanned retail container of the present invention, which comprises the following steps.
Step S1: construct a static dataset, manually collecting images and manually annotating each image's label, category and bounding-box coordinate information.
In this embodiment, the goods categories are constructed by randomly selecting 10 beverages from the market: Jiaduobao (JDB), Mizone (MZ), Fanta (FT), Master Kong iced black tea (IBT), Nutri-Express (NE), Uni-President Assam milk green tea (JGMT), Minute Maid (MM), Baisuishan (GTEN), Uni-President Assam original milk tea (UAMT) and VVW. The entire static-recognition dataset is in the VOC2007 data format, with pictures mainly in two sizes, 1280 x 720 and 1920 x 1080. The dataset totals 34052 pictures, all labeled, covering 155153 beverage instances.
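Since the dataset uses the VOC2007 format, each picture is paired with an XML annotation file listing the object categories and bounding-box coordinates. The following is a minimal Python parsing sketch; the file path in the usage comment and the class abbreviations are illustrative assumptions, not values taken from the patent.

import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_path):
    """Parse one VOC2007-style XML file into (category, box) pairs."""
    root = ET.parse(xml_path).getroot()
    objects = []
    for obj in root.iter("object"):
        name = obj.find("name").text  # e.g. "JDB", "MZ", ...
        box = obj.find("bndbox")
        coords = tuple(int(box.find(k).text)
                       for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects

# Hypothetical usage with one annotation file of the static dataset:
# print(parse_voc_annotation("annotations/000001.xml"))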
Step S2: constructing a one-stage target detection model, wherein the one-stage target detection model comprises a backbone network and a sub-network; introducing a deformable convolution neural network into a backbone network, wherein a group normalization layer is selected as a normalization layer of the backbone network; in the sub-network, a focusing loss function is selected to classify the coordinate information of the bounding box, and a balance L1 loss function is selected to perform coordinate regression on the coordinate information of the bounding box.
Step S3: train the one-stage target detection model of step S2, taking the pictures of the static dataset of step S1 as input, performing feature extraction through the backbone network, and taking the label, category and bounding-box coordinate information as output, to obtain the network parameters. Training specifically uses mini-batch stochastic gradient descent with momentum.
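As a rough sketch of this training setup, the snippet below runs mini-batch stochastic gradient descent with momentum in PyTorch. The tiny stand-in model, the dummy data, the placeholder loss and the hyperparameter values (learning rate, momentum) are all assumptions for illustration; the patent does not specify them.

import torch
import torch.nn as nn

model = nn.Conv2d(3, 10, kernel_size=3, padding=1)  # stand-in for the detector
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

for step in range(3):                       # stand-in for the dataset loop
    images = torch.randn(2, 3, 224, 224)    # dummy mini-batch of pictures
    targets = torch.randn(2, 10, 224, 224)  # dummy training targets
    loss = nn.functional.mse_loss(model(images), targets)  # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

In the real model, the placeholder loss would be the sum of the focal classification loss and the balanced L1 regression loss described below.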
Step S4: and inputting the grid parameters into the unmanned retail container to perform static goods identification.
In step S2, the one-stage target detection model, named DrtNet, is constructed on the basis of the RetinaNet model; the backbone network adopts a residual network, and the deformable convolutional neural network is introduced into its last three convolutional layers. In this embodiment, please refer to fig. 2, which is a schematic diagram of the deformable convolutional neural network. The deformable convolution operation introduces adaptively learned offset variables without changing the rule of the conventional convolution-kernel operation. For each output y(p0), 9 positions are still sampled from the input feature map, spread around the center position x(p0); adding the offset Δpn, however, allows the sampling points to spread into a non-grid shape. See formula (1):
y(p0) = Σ_{pn ∈ R} w(pn) · x(p0 + pn + Δpn)   (1)

where R is the regular 3 x 3 sampling grid, w(pn) are the convolution weights, and Δpn is the learned offset of each sampling position.
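A sketch of formula (1) in PyTorch, using torchvision's deformable convolution: a small ordinary convolution predicts the offsets Δpn, which let the 3 x 3 = 9 sampling points drift off the regular grid. The channel sizes are illustrative assumptions.

import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # 2 offsets (dx, dy) for each of the 3*3 sampling points
        self.offset_conv = nn.Conv2d(in_ch, 2 * 3 * 3, kernel_size=3, padding=1)
        self.deform_conv = DeformConv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        offsets = self.offset_conv(x)  # learned, input-dependent offsets
        return self.deform_conv(x, offsets)

feat = torch.randn(1, 256, 32, 32)            # dummy feature map
print(DeformableBlock(256, 256)(feat).shape)  # torch.Size([1, 256, 32, 32])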
Group normalization (GN) divides the feature map into several groups along the channel dimension and normalizes each group, reshaping the feature map from [N, C, H, W] to [N, G, C//G, H, W] and normalizing over the [C//G, H, W] dimensions. Because its statistics do not involve the batch dimension, GN avoids the influence of batch size on the model and resolves the small-batch problem mentioned above, so the designed one-stage target detection network replaces every batch normalization layer with a group normalization layer.
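In PyTorch this replacement is a one-line change, as the sketch below shows; the group count G = 32 is a common default and an assumption here, not a value stated in the patent.

import torch
import torch.nn as nn

x = torch.randn(2, 256, 32, 32)                     # [N, C, H, W] feature map
gn = nn.GroupNorm(num_groups=32, num_channels=256)  # G groups of C // G channels
print(gn(x).shape)  # statistics are computed per sample and per group,
                    # so the output does not depend on the batch size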
The classification loss uses the focal loss function, an improved version of the cross-entropy loss whose adjusted formulation enables a one-stage target detection network to reach the same accuracy as Fast R-CNN. A modulating factor (1 - pt)^γ, with γ ≥ 0, is added in front of the cross entropy, as shown in formula (2):
FL(pt) = -(1 - pt)^γ · log(pt)   (2)
As γ grows, the loss function is almost zero on the easily classified part, while on the part where pt is small (the hard-to-distinguish samples) the loss value remains large. Thus, when the class imbalance is severe, accumulating the loss over all samples lets the hard samples contribute most of the loss value.
In practical use, a slight accuracy improvement can be obtained by adding an α balance factor on the basis of formula (2), see formula (3):
FL(pt) = -αt · (1 - pt)^γ · log(pt)   (3)
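A minimal sketch of formula (3) for binary (per-anchor, per-class) classification; the defaults α = 0.25 and γ = 2.0 are the commonly published values and are assumed here, since the patent does not state them.

import torch

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """FL(pt) = -alpha_t * (1 - pt)^gamma * log(pt), formula (3)."""
    p = torch.sigmoid(logits)
    # pt is the predicted probability of the true class (targets are 0/1 floats)
    p_t = p * targets + (1 - p) * (1 - targets)
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (-alpha_t * (1 - p_t) ** gamma * torch.log(p_t.clamp(min=1e-8))).mean()

print(focal_loss(torch.randn(8), torch.randint(0, 2, (8,)).float()))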
For coordinate regression, the balanced L1 loss function is used, in which samples with a loss of 1.0 or more are treated as outliers and samples with a loss below 1.0 as accurate (inlier) samples. The key idea of the balanced L1 loss is to promote the crucial regression gradients, i.e. the gradients from accurate samples, to rebalance the samples and tasks involved, thus enabling more balanced training across classification, coarse localization and accurate localization. The promoted gradient is designed as in formula (4):
∂Lb/∂x = α · ln(b|x| + 1),  if |x| < 1;  ∂Lb/∂x = γ,  otherwise   (4)
Here α controls the promotion of the gradient of accurate samples: setting a small α increases their gradient without affecting the values of the outlier samples, while γ controls and adjusts the upper bound of the regression error so that the different tasks become more balanced. α and γ control the balance from the sample level and the task level respectively, and these two factors, controlling different aspects, reinforce each other to achieve more balanced training. Integrating the gradient formula yields the balanced L1 loss function shown in formula (5):
Lb(x) = (α/b) · (b|x| + 1) · ln(b|x| + 1) - α|x|,  if |x| < 1;  Lb(x) = γ|x| + C,  otherwise   (5)

where b satisfies α · ln(b + 1) = γ, so that the two branches are continuous at |x| = 1, and C is the constant implied by that continuity.
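A sketch of formula (5); α = 0.5 and γ = 1.5 follow the published defaults of the balanced L1 loss and are assumptions here, with b derived from the continuity condition α · ln(b + 1) = γ.

import math
import torch

def balanced_l1_loss(pred, target, alpha=0.5, gamma=1.5):
    b = math.exp(gamma / alpha) - 1          # from alpha * ln(b + 1) = gamma
    x = (pred - target).abs()
    inlier = (alpha / b) * (b * x + 1) * torch.log(b * x + 1) - alpha * x
    outlier = gamma * x + gamma / b - alpha  # C = gamma/b - alpha keeps (5) continuous
    return torch.where(x < 1, inlier, outlier).mean()

print(balanced_l1_loss(torch.randn(4, 4), torch.randn(4, 4)))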
referring to fig. 3 and fig. 4, fig. 3 is a schematic structural diagram of an unmanned retail container according to an embodiment of the present invention, and fig. 4 is a flowchart of an operation of the unmanned retail container according to the embodiment of the present invention.
When the user opens the cabinet door 2 of the unmanned retail container 1, a sensor is triggered and the camera 3 takes a first picture. After the user selects the goods and withdraws their hand from the unmanned retail container 1, the infrared sensor detects the hand leaving and triggers the camera 3 to take a second picture. The category and quantity of the goods taken by the user are determined from these before-and-after pictures of the goods placed in the unmanned retail container 1.
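Once the detector returns the labels found in each picture, the purchased goods follow from a simple multiset difference of the two detection results. A small sketch, where the example label lists are hypothetical detector outputs:

from collections import Counter

def items_taken(before_labels, after_labels):
    """Difference the labels detected in the first and second pictures."""
    return dict(Counter(before_labels) - Counter(after_labels))

before = ["JDB", "JDB", "MZ", "IBT"]  # detections when the door opened
after = ["JDB", "IBT"]                # detections when the hand left
print(items_taken(before, after))     # {'JDB': 1, 'MZ': 1}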
Referring to fig. 5, the beverage goods recognition results of this embodiment are obtained with the static goods identification method of the unmanned retail container described above. As shown in fig. 5, the method of the present invention improves the recall rate on the original dataset, and in this embodiment no bounding box of a beverage item is lost.
Finally, it should be noted that the above embodiments are only intended to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and such modifications are intended to be covered by the claims of the present invention.

Claims (5)

1. A static goods identification method for an unmanned retail container is characterized by comprising the following steps:
constructing a static dataset by manually collecting images and manually annotating each image's label, category and bounding-box coordinate information;
constructing a one-stage target detection model, wherein the one-stage target detection model comprises a backbone network and a sub-network; introducing a deformable convolutional neural network into the backbone network, wherein a group normalization layer is selected as the normalization layer of the backbone network; in the sub-network, selecting a focal loss function for classification of the bounding-box coordinate information and a balanced L1 loss function for coordinate regression of the bounding-box coordinate information;
training the one-stage target detection model, taking the pictures of the static dataset as input, extracting features through the backbone network, and taking the label, category and bounding-box coordinate information as output, to obtain network parameters; and
inputting the network parameters into the unmanned retail container to perform static goods identification.
2. The static goods identification method of claim 1, wherein the backbone network employs a residual network.
3. The static goods identification method of claim 1, wherein the deformable convolutional neural network is introduced into the last three convolutional layers of the backbone network.
4. The static goods identification method of claim 1, wherein the one-stage target detection model is trained with a stochastic gradient descent algorithm with momentum.
5. The static goods identification method of claim 1, wherein the one-stage target detection model is a DrtNet model constructed on the basis of a RetinaNet model.
CN202010286329.4A 2020-04-13 2020-04-13 Goods static identification method for unmanned retail container Active CN113536829B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010286329.4A CN113536829B (en) 2020-04-13 2020-04-13 Goods static identification method for unmanned retail container

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010286329.4A CN113536829B (en) 2020-04-13 2020-04-13 Goods static identification method for unmanned retail container

Publications (2)

Publication Number Publication Date
CN113536829A (en) 2021-10-22
CN113536829B CN113536829B (en) 2024-06-11

Family

ID=78119881

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010286329.4A Active CN113536829B (en) 2020-04-13 2020-04-13 Goods static identification method for unmanned retail container

Country Status (1)

Country Link
CN (1) CN113536829B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116596012A (en) * 2023-05-09 2023-08-15 上海银满仓数字科技有限公司 Commodity information transmission method and system based on RFID

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170124415A1 (en) * 2015-11-04 2017-05-04 Nec Laboratories America, Inc. Subcategory-aware convolutional neural networks for object detection
CN108764164A (en) * 2018-05-30 2018-11-06 华中科技大学 A kind of method for detecting human face and system based on deformable convolutional network
CN109409443A (en) * 2018-11-28 2019-03-01 北方工业大学 Multi-scale deformable convolution network target detection method based on deep learning
CN109711427A (en) * 2018-11-19 2019-05-03 深圳市华尊科技股份有限公司 Object detection method and Related product
AU2019101133A4 (en) * 2019-09-30 2019-10-31 Bo, Yaxin MISS Fast vehicle detection using augmented dataset based on RetinaNet
CN110414559A (en) * 2019-06-26 2019-11-05 武汉大学 The construction method and commodity recognition method of intelligence retail cabinet commodity target detection Unified frame

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170124415A1 (en) * 2015-11-04 2017-05-04 Nec Laboratories America, Inc. Subcategory-aware convolutional neural networks for object detection
CN108764164A (en) * 2018-05-30 2018-11-06 华中科技大学 A kind of method for detecting human face and system based on deformable convolutional network
CN109711427A (en) * 2018-11-19 2019-05-03 深圳市华尊科技股份有限公司 Object detection method and Related product
CN109409443A (en) * 2018-11-28 2019-03-01 北方工业大学 Multi-scale deformable convolution network target detection method based on deep learning
CN110414559A (en) * 2019-06-26 2019-11-05 武汉大学 The construction method and commodity recognition method of intelligence retail cabinet commodity target detection Unified frame
AU2019101133A4 (en) * 2019-09-30 2019-10-31 Bo, Yaxin MISS Fast vehicle detection using augmented dataset based on RetinaNet

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG HAIJUN, LI HAIDONG: "Deep Learning-based Beverage Recognition for Unmanned Vending Machines: An Empirical Study", 2019 IEEE 17th International Conference on Industrial Informatics (INDIN), pages 1464-1467 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116596012A (en) * 2023-05-09 2023-08-15 上海银满仓数字科技有限公司 Commodity information transmission method and system based on RFID
CN116596012B (en) * 2023-05-09 2024-05-07 上海银满仓数字科技有限公司 Commodity information transmission method and system based on RFID

Also Published As

Publication number Publication date
CN113536829B (en) 2024-06-11

Similar Documents

Publication Publication Date Title
US20220375193A1 (en) Saliency-based object counting and localization
US10817749B2 (en) Dynamically identifying object attributes via image analysis
US11657602B2 (en) Font identification from imagery
CN108520285A (en) Article discrimination method, system, equipment and storage medium
US11887217B2 (en) Text editing of digital images
US20140324836A1 (en) Finding similar items using windows of computation
US20110314031A1 (en) Product category optimization for image similarity searching of image-based listings in a network-based publication system
Xu et al. Design of smart unstaffed retail shop based on IoT and artificial intelligence
CN107592839A (en) Fine grit classification
Wu et al. An intelligent self-checkout system for smart retail
CN107683469A (en) A kind of product classification method and device based on deep learning
KR20190095333A (en) Anchor search
CN107133854A (en) Information recommendation method and device
CN110363206B (en) Clustering of data objects, data processing and data identification method
CN111868709A (en) Automatic batch sorting
CN112651340A (en) Character recognition method, system, terminal device and storage medium for shopping receipt
CN106997350A (en) A kind of method and device of data processing
CN113536829B (en) Goods static identification method for unmanned retail container
CN114255377A (en) Differential commodity detection and classification method for intelligent container
KR102155905B1 (en) Smart mediation apparatus and methods for sharing travel goods
CN112883719A (en) Class word recognition method, model training method, device and system
CN112541055A (en) Method and device for determining text label
KR101498944B1 (en) Method and apparatus for deciding product seller related document
Wang et al. An IoT Based Fruit and Vegetable Sales System: A whole system including IoT based integrated intelligent scale and online shop
Fanca et al. Romanian coins recognition and sum counting system from image using TensorFlow and Keras

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant