CN117953415A - Article placement detection method, device, computer equipment and storage medium - Google Patents

Article placement detection method, device, computer equipment and storage medium

Info

Publication number
CN117953415A
CN117953415A (application CN202410030759.8A)
Authority
CN
China
Prior art keywords
placement
feature vector
feature
article
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410030759.8A
Other languages
Chinese (zh)
Inventor
王卓琛
罗羊
刘枢
吕江波
沈小勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Smartmore Technology Co Ltd
Original Assignee
Shenzhen Smartmore Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Smartmore Technology Co Ltd filed Critical Shenzhen Smartmore Technology Co Ltd
Priority to CN202410030759.8A
Publication of CN117953415A
Legal status: Pending


Landscapes

  • Image Analysis (AREA)

Abstract

The application relates to an article placement detection method and apparatus, a computer device, and a storage medium. The method comprises the following steps: determining at least one region to be detected in an article placement image acquired from an article placement platform; performing feature extraction on each region to be detected to obtain a region feature vector corresponding to each region to be detected; for each region feature vector, determining a target reference feature vector matching the region feature vector from a plurality of reference feature vectors, where the reference feature vectors are obtained based on a reference placement image and the reference placement image is acquired from a placement platform meeting preset placement requirements; and determining an article placement detection result based on the target reference feature vector corresponding to each region feature vector, the result characterizing whether the article placement platform meets the preset placement requirements. By adopting the application, the detection efficiency of article placement can be improved.

Description

Article placement detection method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of article placement detection technologies, and in particular, to a method and apparatus for article placement detection, a computer device, and a storage medium.
Background
To determine whether the articles on a placement platform (for example, a desk or a cabinet) are placed in accordance with the relevant placement regulations, a worker is usually scheduled to perform regular checks. However, inspecting articles one by one manually is time-consuming and error-prone, so approaches that detect article placement with deep learning models, for example target detection models, have emerged.
In the conventional technology, a large amount of labeled data usually has to be collected to train a target detection model, and the trained model then detects whether article placement is compliant. However, once the user updates the article placement rules or changes the type of article, data must be re-collected and the target detection model retrained, which takes a long time and lowers the detection efficiency of article placement.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a method, an apparatus, a computer device, a computer readable storage medium, and a computer program product for detecting placement of an article, which can improve the detection efficiency of placement of the article.
In a first aspect, the present application provides a method for detecting placement of an article, including:
determining at least one region to be detected in an article placement image acquired from an article placement platform;
Extracting the characteristics of each region to be detected to obtain region characteristic vectors corresponding to each region to be detected;
For each region feature vector, determining a target reference feature vector matching the region feature vector from a plurality of reference feature vectors; the reference feature vectors are obtained based on a reference placement image, and the reference placement image is acquired from a placement platform meeting preset placement requirements;
determining an article placement detection result based on the target reference feature vector corresponding to each region feature vector; the object placement detection result is used for representing whether the object placement platform meets preset placement requirements.
In a second aspect, the present application provides an article placement detection device, comprising:
the area determining module is used for determining at least one area to be detected from an article placement image acquired from the article placement platform;
The feature extraction module is used for respectively extracting features of each region to be detected to obtain region feature vectors corresponding to each region to be detected;
The feature matching module is used for determining, for each region feature vector, a target reference feature vector matched with the region feature vector from a plurality of reference feature vectors; the reference feature vectors are obtained based on a reference placement image, and the reference placement image is acquired from a placement platform meeting preset placement requirements;
The detection module is used for determining an article placement detection result based on the target reference feature vector corresponding to each region feature vector; the object placement detection result is used for representing whether the object placement platform meets preset placement requirements.
In a third aspect, the application provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the method described above when the processor executes the computer program.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method described above.
In a fifth aspect, the application provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method described above.
According to the article placement detection method and apparatus, computer device, computer-readable storage medium, and computer program product described above, feature extraction is performed on each region to be detected in the article placement image to obtain the corresponding region feature vectors. A target reference feature vector matching each region feature vector is determined from the reference feature vectors, and the article placement detection result is determined based on the target reference feature vectors, so as to detect whether the article placement platform corresponding to the article placement image meets the preset placement requirements. Article placement detection is thus realized through feature comparison and retrieval. When a new preset placement requirement is introduced, detection only requires updating the reference feature vectors based on a reference placement image that meets the new requirement, without retraining a detection model, thereby improving the detection efficiency of article placement.
Drawings
Fig. 1 is an application environment diagram of an article placement detection method according to an embodiment of the present application.
Fig. 2 is a flow chart of an article placement detection method according to an embodiment of the present application.
Fig. 3 is a schematic diagram of a registration and detection flow according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of a feature extraction model according to an embodiment of the present application.
Fig. 5 is a block diagram of an article placement detection device according to an embodiment of the present application.
Fig. 6 is an internal structure diagram of a computer device according to an embodiment of the present application.
Fig. 7 is an internal structure diagram of another computer device according to an embodiment of the present application.
Fig. 8 is an internal structural diagram of a computer-readable storage medium according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The article placement detection method provided by the embodiment of the application can be applied to the application environment shown in fig. 1, in which the terminal 102 communicates with the server 104 via a communication network. A data storage system may store the data that the server 104 needs to process; it may be integrated on the server 104 or located on a cloud or other network server. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, internet-of-things device, or portable wearable device. Internet-of-things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle devices, and the like; portable wearable devices may be smart watches, smart bracelets, headsets, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
Those skilled in the art will appreciate that the application environment shown in fig. 1 is only a partial scenario related to the present application, and does not constitute a limitation on the application environment of the present application.
As shown in fig. 2, an embodiment of the present application provides a method for detecting placement of an article, which may be performed by a terminal or a server, or may be performed by the terminal and the server together, and an example in which the method is applied to the server 104 in fig. 1 is described. The method comprises the following steps:
Step 202, determining at least one area to be detected from an object placement image acquired by the object placement platform.
The article placement platform can be an office desk, a cabinet, a commodity shelf, or the like, and is used for placing articles in the corresponding scene. For example, in an enterprise, a 6S management system is usually formulated to keep the office environment tidy and to restrict the objects placed on office desks; in that case the article placement platform is the office desk, and the article placement image may be a desktop image acquired from it. A region to be detected is a region of the article placement image in which an article is required to be placed. It will be appreciated that a region to be detected may contain an entire article, part of an article, or no article at all.
Specifically, the terminal may acquire an article placement image obtained by image acquisition of the article placement platform by using the image acquisition device, and send the article placement image to the server. The server receives an article placement image sent by the terminal, and determines at least one area to be detected from the article placement image.
In some embodiments, the reference placement image and the position of each article placement area in the reference placement image are stored in the server, and the server may determine the corresponding regions to be detected in the article placement image according to those positions. The reference placement image is acquired from a placement platform that meets the preset placement requirements; a plurality of reference articles are placed on that platform, and their placement positions meet the preset placement requirements. An article placement area is an image area of the reference placement image that includes a reference article. The preset placement requirements are the placement rules formulated for the article placement platform, specifying the types and placement positions of the articles to be placed on it.
In some embodiments, before article placement detection is performed, the terminal may capture an image of an article placement platform arranged according to the preset placement requirements through the image acquisition device to obtain the reference placement image, perform object detection on the reference placement image to obtain the article placement areas in it, and determine the position of each article placement area in the reference placement image; for example, the detected positions may be marked with labeling frames. The server may then store the reference placement image and the position of each article placement area in a database.
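As a concrete illustration of how the registered positions are reused at detection time, the sketch below crops the areas to be detected from a test image at the stored article placement area positions. The coordinates and function name are hypothetical, not taken from the patent:

```python
import numpy as np

# Hypothetical registration record: positions (x0, y0, x1, y1) of the article
# placement areas detected in the reference placement image.
reference_regions = [(10, 10, 50, 50), (60, 10, 100, 50)]

def regions_to_detect(placement_image, regions):
    """Crop the areas to be detected from a test image at the registered positions."""
    return [placement_image[y0:y1, x0:x1] for (x0, y0, x1, y1) in regions]

img = np.zeros((120, 120, 3), dtype=np.uint8)  # stand-in for a captured desktop image
crops = regions_to_detect(img, reference_regions)
```

Each crop is then passed to the feature extraction model in the steps that follow.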
And 204, respectively extracting the characteristics of each region to be detected to obtain region characteristic vectors corresponding to each region to be detected.
Wherein the region feature vector is a vector characterizing relevant features of the region to be detected, including at least one of color, texture, article size, etc.
Specifically, for each region to be detected, the server may cut out a region image corresponding to the region to be detected from the object placement image, and then input the region image into a trained feature extraction model to perform feature extraction, so as to obtain a region feature vector corresponding to the region to be detected.
Step 206, for each region feature vector, determining a target reference feature vector matched with the region feature vector from a plurality of reference feature vectors; the reference feature vector is obtained based on a reference placement image, and the reference placement image is obtained by collecting the reference placement image aiming at a placement platform meeting the preset placement requirement.
The reference feature vector is obtained by extracting features of an object placing area in the reference placing image, and represents relevant features of the object placing area, including features of reference objects in the object placing area. Each object placement area corresponds to a reference feature vector. It can be understood that the reference placement images may be multiple, for example, multiple images obtained by image acquisition at different angles for a placement platform meeting preset placement requirements.
Specifically, a plurality of reference feature vectors are stored in the server. For each region feature vector, the server may calculate the similarity between the region feature vector and each reference feature vector and determine the target reference feature vector in descending order of similarity; for example, the reference feature vector with the largest similarity may be determined as the target reference feature vector. The similarity may be determined based on the vector distance between the region feature vector and the reference feature vector, and the vector distance may be a Euclidean distance.
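The matching step can be sketched as follows, using Euclidean distance as the vector distance mentioned above; the exact implementation in the patent is not specified, so this is an illustrative sketch:

```python
import numpy as np

def match_reference(region_vec, reference_vecs):
    """Return the index of the target reference feature vector: the one with the
    highest similarity, i.e. the smallest Euclidean distance to the region vector."""
    dists = np.linalg.norm(reference_vecs - region_vec, axis=1)
    return int(np.argmin(dists))

# Toy reference feature vectors (illustrative values only).
refs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
idx = match_reference(np.array([0.9, 0.1]), refs)
```

The returned index identifies the target reference feature vector, whose stored article type becomes the predicted article type in the later steps.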
Step 208, determining an article placement detection result based on the target reference feature vector corresponding to each region feature vector; the object placement detection result is used for representing whether the object placement platform meets preset placement requirements.
Specifically, the server also stores the article type corresponding to each reference feature vector. For each region feature vector, the server may determine the article type corresponding to its target reference feature vector as the predicted article type of the region to be detected to which the region feature vector belongs, and determine the standard article type of that region according to the position of the region in the article placement image. The article placement detection result is then determined from the predicted and standard article types of each region to be detected; for example, when the predicted article type of a region is inconsistent with its standard article type, the detection result is that the article placement platform does not meet the preset placement requirement. The predicted article type is the article type of the article in the region to be detected as determined through feature matching, and the standard article type is the article type of the article that should be placed in that region according to the preset placement requirement.
In some embodiments, the process of determining the reference feature vectors from the reference placement image may be referred to as image registration, and the reference placement image may also be referred to as a registration image. Fig. 3 shows a schematic flow chart of registration and detection. When a user updates the preset placement requirement, only the three steps of image registration, feature extraction, and matching detection in the flow chart need to be repeated; the system does not need to be taken offline for iteration, and no engineer needs to participate in the update. First, in the image registration step, the user inputs a reference placement image meeting the updated preset placement requirement into the feature extraction model as the registration image, and stores the output reference feature vectors in a sample library as positive samples. Then, the user takes a real-time collected article placement image as the test image and inputs it into the feature extraction model, which outputs its feature vectors. Finally, feature matching retrieval is performed: the feature vectors of the test image are compared with the feature vectors in the sample library, and whether the article placement platform corresponding to the article placement image meets the preset placement requirement is determined from the comparison result.
Therefore, in the embodiment of the application, feature extraction is performed on each region to be detected in the article placement image to obtain the corresponding region feature vectors, a target reference feature vector matching each region feature vector is determined from the reference feature vectors, and the article placement detection result is determined based on these target reference feature vectors, so as to detect whether the article placement platform corresponding to the article placement image meets the preset placement requirement. Article placement detection is realized through feature comparison and retrieval; when a new preset placement requirement is introduced, it suffices to update the reference feature vectors based on a reference placement image that meets the new requirement, without retraining a detection model, thereby improving the detection efficiency of article placement.
In some embodiments, the regional feature vectors are derived using a trained feature extraction model; before extracting the features of each region to be detected to obtain the region feature vector corresponding to each region to be detected, the method further comprises:
inputting each sample image into a feature extraction model to be trained for processing, and outputting sample feature vectors corresponding to each sample image; each sample image includes items of any item type;
determining proxy feature vectors respectively corresponding to the article types;
and adjusting the feature extraction model to be trained based on the similarity between the sample feature vector corresponding to each sample image and the plurality of proxy feature vectors, to obtain a trained feature extraction model.
The sample image is used for training the feature extraction model, and one sample image comprises one article, for example, the sample image can be obtained by image acquisition of a single article placed on the article placing platform, or can be an area image comprising one article, which is obtained by cutting out from the article placing platform image. The types of the objects corresponding to the objects in the different sample images can be the same or different. The sample feature vector is extracted based on the sample image. Each item type corresponds to a proxy feature vector, which characterizes the relevant features of the item of the corresponding item type. It will be appreciated that the proxy feature vector is continually updated during the training process.
Specifically, the server may acquire a plurality of sample images from a local or other computer device, input each sample image to a feature extraction model to be trained to process, obtain sample feature vectors corresponding to each sample image, calculate similarity between the sample feature vectors corresponding to each sample image and each proxy feature vector, and determine a loss value based on the similarity, so as to adjust model parameters of the feature extraction model to be trained according to the loss value until a preset training condition is met, and obtain a trained feature extraction model. The preset training condition is preset, for example, may be that the model converges or reaches a preset iteration number.
In some embodiments, during model training, the server may cluster the sample feature vectors corresponding to the input sample images and obtain, from the clustering result, a proxy set for each article type, where each proxy set contains sample feature vectors of the same article type. For each proxy set, the server may determine the proxy feature vector from the sample feature vectors in the set; for example, the cluster center vector for the article type, or the mean of the sample feature vectors in the proxy set, may be used as the proxy feature vector.
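A minimal sketch of one option described above, deriving each proxy feature vector as the mean of the sample feature vectors of its article type; the function and type names are illustrative, not from the patent:

```python
import numpy as np

def proxy_vectors(sample_vecs, item_types):
    """Group sample feature vectors by article type and use the mean vector of
    each group as that type's proxy feature vector (one option in the text)."""
    types = np.array(item_types)
    return {t: sample_vecs[types == t].mean(axis=0) for t in sorted(set(item_types))}

# Toy sample feature vectors labeled with hypothetical article types.
vecs = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]])
p = proxy_vectors(vecs, ["cup", "cup", "pen"])
```

During training these proxies are further updated, as described in the following embodiments.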
Therefore, in this embodiment, each article type has its own proxy feature vector, and the feature extraction model to be trained is adjusted based on the similarity between the sample feature vector and each proxy feature vector. This increases the difference between feature vectors of different article types, improves the accuracy of the feature extraction model, and thus improves the detection accuracy of article placement.
In some embodiments, adjusting the feature extraction model to be trained based on the similarity between the sample feature vector and the plurality of proxy feature vectors corresponding to each sample image to obtain a trained feature extraction model, including:
For each sample image, determining a proxy feature vector corresponding to the article type to which the article belongs in the sample image as a positive proxy feature vector, and determining proxy feature vectors except the positive proxy feature vector as negative proxy feature vectors;
Determining a loss value based on a first similarity between a sample feature vector corresponding to the sample image and a positive proxy feature vector and a second similarity between a sample feature vector corresponding to the sample image and a negative proxy feature vector; the first similarity and the loss value form a negative correlation, and the second similarity and the loss value form a positive correlation;
And adjusting the feature extraction model to be trained based on the loss value to obtain a trained feature extraction model.
The article type of the article in each sample image is known, for example labeled in advance. The article type corresponding to the positive proxy feature vector is the same as the article type of the article in the sample image, and each sample feature vector corresponds to exactly one positive proxy feature vector. The article types corresponding to the negative proxy feature vectors differ from the article type of the article in the sample image, and each sample feature vector may correspond to multiple negative proxy feature vectors. A positive correlation means that, other conditions being unchanged, the two variables change in the same direction: when one decreases, the other also decreases. A negative correlation means that, other conditions being unchanged, the two variables change in opposite directions: when one decreases, the other increases.
Specifically, for each sample image, the server may acquire an item type to which the item in the sample image belongs, and determine a proxy feature vector corresponding to the item type to which the item in the sample image belongs as a positive proxy feature vector, and determine respective proxy feature vectors of other item types as negative proxy feature vectors. And then the server calculates a first similarity between the sample feature vector corresponding to the sample image and the positive proxy feature vector, calculates a second similarity between the sample feature vector corresponding to the sample image and each negative proxy feature vector, and determines a loss value based on the first similarity and each second similarity, so that model parameters of the feature extraction model to be trained are adjusted by using the loss value.
In some embodiments, the server may input a batch of sample images into the feature extraction model to be trained at a time, and calculate the loss value based on the sample feature vectors corresponding to the sample images. For example, the loss function of the feature extraction model may be expressed by the following formula:

l(X) = (1/|P+|) Σ_{p∈P+} log(1 + Σ_{x∈X_p+} exp(-α(s(x,p) - δ))) + (1/|P-|) Σ_{p∈P-} log(1 + Σ_{x∈X_p-} exp(α(s(x,p) + δ)))

where X denotes the set of sample feature vectors corresponding to the batch of sample images, with x ∈ X; l(X) denotes the loss value; P+ denotes the positive proxy set and P- the negative proxy set; X_p+ and X_p- denote the sample feature vectors for which proxy p is positive or negative, respectively; α and δ are hyperparameters; and s(x, p) is the similarity between the sample feature vector x and the proxy feature vector p.
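A runnable sketch of a loss of this shape is given below. Cosine similarity, the default values of α and δ, and the normalization choices are assumptions for illustration, not the patent's exact implementation:

```python
import numpy as np

def proxy_anchor_loss(X, labels, proxies, proxy_labels, alpha=32.0, delta=0.1):
    """Proxy-anchor-style loss sketch: similarity to each sample's positive proxy
    is pushed up, similarity to negative proxies is pushed down.
    Cosine similarity is an assumption here."""
    def s(x, p):  # cosine similarity between a sample vector and a proxy vector
        return float(x @ p / (np.linalg.norm(x) * np.linalg.norm(p) + 1e-12))

    pos_term, neg_term, n_pos = 0.0, 0.0, 0
    for j, p in enumerate(proxies):
        pos = [s(x, p) for x, y in zip(X, labels) if y == proxy_labels[j]]
        neg = [s(x, p) for x, y in zip(X, labels) if y != proxy_labels[j]]
        if pos:  # proxy p is positive for at least one sample in the batch
            n_pos += 1
            pos_term += np.log1p(np.sum(np.exp(-alpha * (np.array(pos) - delta))))
        if neg:  # proxy p acts as a negative proxy for the remaining samples
            neg_term += np.log1p(np.sum(np.exp(alpha * (np.array(neg) + delta))))
    return pos_term / max(n_pos, 1) + neg_term / len(proxies)
```

The loss is small when each sample feature vector is close to its own type's proxy and far from the others, which is the training direction described in the text.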
Therefore, in this embodiment, because the first similarity is negatively correlated with the loss value and the second similarity is positively correlated with it, training can be driven toward increasing the first similarity and decreasing the second similarity. This reduces the similarity between feature vectors of articles of different article types and widens the gap between different article types, so that the feature extraction model extracts feature vectors of different article types more accurately and discriminates better in regions whose features are not salient.
In some embodiments, adjusting the feature extraction model to be trained based on the loss value results in a trained feature extraction model comprising:
And adjusting each proxy feature vector and the model parameters of the feature extraction model to be trained until a preset training condition is met, to obtain the trained feature extraction model.
Specifically, the value of each proxy feature vector is learnable. Because the first similarity is negatively correlated with the loss value and the second similarity is positively correlated with it, each time the loss value is calculated, the proxy feature vectors and the model parameters of the feature extraction model to be trained can be adjusted in the direction that reduces the loss value, until a preset training condition is met and the trained feature extraction model is obtained.
Therefore, in this embodiment, adjusting the values of the proxy feature vectors in the direction that reduces the loss value optimizes them, making the loss value calculated from the sample feature vectors and proxy feature vectors more accurate and further improving the accuracy of the feature extraction model.
In some embodiments, feature extraction is performed on each region to be detected, to obtain a region feature vector corresponding to each region to be detected, including:
for each region to be detected, extracting features of the region to obtain a region feature map and at least one region attention map;
respectively carrying out weighted calculation on the regional feature map by utilizing each regional attention map to obtain a weighted feature map;
And determining the regional characteristic vector corresponding to the region to be detected based on the weighted characteristic map.
The region feature map is obtained by feature extraction on the region image corresponding to a region to be detected. A region attention map represents salient features in the region to be detected and can be used to enhance the features of the region feature map. It will be appreciated that a region to be detected may include at least one salient feature, so there is at least one region attention map.
Specifically, for each region to be detected, the server may cut out a region image corresponding to the region to be detected from the object placement image according to the position of the region to be detected in the object placement image, then input the region image into the trained feature extraction model, and perform feature extraction by the backbone network of the trained feature extraction model. The backbone network includes a first output branch that outputs a regional signature and a second output branch that outputs a regional signature. The server may perform a weighted calculation on the regional feature map by using each regional attention map, which may also be referred to as bilinear attention pooling, as shown in fig. 4, which illustrates a process of bilinear attention pooling on the regional feature map by using the regional attention map, so as to implement enhancement processing on the regional feature map, obtain a weighted feature map corresponding to each regional attention map, and then combine the weighted feature maps to obtain a regional feature vector.
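The bilinear attention pooling step above can be sketched as follows, assuming a (C, H, W) region feature map, (M, H, W) attention maps, and global average pooling of each weighted map; the pooling choice is an assumption:

```python
import numpy as np

def bilinear_attention_pool(feature_map, attention_maps):
    """Each attention map weights the region feature map elementwise, the weighted
    map is spatially pooled to a vector, and the per-attention vectors are
    concatenated into the region feature vector.
    feature_map: (C, H, W); attention_maps: (M, H, W); returns (M * C,)."""
    parts = []
    for att in attention_maps:
        weighted = feature_map * att[None, :, :]  # broadcast attention over channels
        parts.append(weighted.mean(axis=(1, 2)))  # global average pool -> (C,)
    return np.concatenate(parts)

# Toy inputs: a constant feature map and two attention maps (all-ones, all-zeros).
f = np.ones((4, 8, 8))
a = np.stack([np.ones((8, 8)), np.zeros((8, 8))])
vec = bilinear_attention_pool(f, a)
```

The concatenation makes the final vector length proportional to the number of attention maps, so each salient feature contributes its own slice of the region feature vector.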
In some embodiments, as shown in fig. 4, which shows a schematic structure of the feature extraction model, the attention maps can be used during model training to perform attention cropping or attention erasing on the input image, removing partial areas irrelevant to the attention maps and producing an enhanced image. The enhanced image is then input into the backbone network of the feature extraction model again for training, which strengthens the training of the attention maps.
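Attention cropping and attention erasing can be illustrated roughly as below. The thresholding rule for deciding which pixels count as "attended", and the function names, are assumptions; the patent does not specify how the irrelevant areas are selected.

```python
import numpy as np

def attention_crop(image, attention, threshold=0.5):
    """Attention cropping: keep only the bounding box of pixels whose
    attention weight exceeds a (assumed) fraction of the maximum,
    discarding areas irrelevant to the attention map."""
    mask = attention >= threshold * attention.max()
    ys, xs = np.where(mask)
    return image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

def attention_erase(image, attention, threshold=0.5):
    """Attention erasing: zero out the highly attended pixels, pushing the
    model to discover other salient parts in the next training pass."""
    out = image.copy()
    out[attention >= threshold * attention.max()] = 0
    return out

# Toy 6x6 "image" with attention concentrated on a 2x2 patch.
img = np.arange(36, dtype=float).reshape(6, 6)
att = np.zeros((6, 6))
att[2:4, 2:4] = 1.0
print(attention_crop(img, att).shape)  # (2, 2)
```

Either enhanced image would then be fed back into the backbone for a second training pass, as the embodiment describes.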
In this embodiment, therefore, extracting the region attention maps and using them to enhance the region feature map improves feature extraction for smaller items, so the feature vectors output by the feature extraction model are more accurate.
In some embodiments, determining the item placement detection result based on the target reference feature vector corresponding to each region feature vector includes:
for each region feature vector, determining the predicted article type of the region to be detected to which the feature vector belongs, according to the target reference feature vector corresponding to that feature vector;
determining the standard article type of the region to be detected according to the position of the region in the article placement image;
and, where the predicted article type is inconsistent with the standard article type, determining that the article placement detection result is that the placement platform does not meet the preset placement requirement.
The predicted article type is the article type of the article in the region to be detected as determined through feature matching; the standard article type is the article type that should be placed in that region according to the preset placement requirement.
Specifically, the server stores the article type corresponding to each reference feature vector. For each region feature vector, the server may take the article type of the matched target reference feature vector as the predicted article type of the region to be detected to which the region feature vector belongs. Because the position of each region to be detected is determined from the position of an article placement area in the reference placement image, and the server stores the article type corresponding to each such position, the standard article type can be determined from the position of the region to be detected.
In some embodiments, when the predicted article type is inconsistent with the standard article type, that is, the article in the region to be detected is not the article required by the preset placement requirement, or no article is placed there, the server determines that the article placement detection result is that the placement platform does not meet the preset placement requirement, and may send a placement error prompt to the terminal. When the predicted article type of every region to be detected is consistent with its standard article type, the server determines that the placement platform meets the preset placement requirement, and may send a placement correct prompt to the terminal.
It can be seen that, in this embodiment, the predicted article type of the article in the region to be detected is obtained from the target reference feature vector, and the standard article type from the position of the region in the article placement image, so whether the placement platform meets the preset placement requirement can be decided directly by checking whether the two types are consistent, which improves the efficiency of article placement detection.
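The comparison logic of this embodiment might look like the sketch below. The position keys, the article type names, and the return shape are all hypothetical, chosen only for illustration.

```python
# Hypothetical table: standard article type per region position,
# derived from the reference placement image.
reference_types = {
    (0, 0): "screwdriver",
    (0, 1): "wrench",
}

def check_placement(predicted_types):
    """Compare the predicted article type of every region against the
    standard type stored for that region's position.

    Returns (meets_requirement, list of mismatched positions); a region
    with no prediction (nothing placed) also counts as a mismatch."""
    mismatches = [pos for pos, std in reference_types.items()
                  if predicted_types.get(pos) != std]
    return len(mismatches) == 0, mismatches

ok, bad = check_placement({(0, 0): "screwdriver", (0, 1): "pliers"})
print(ok, bad)  # False [(0, 1)]
```

A mismatch at any position is enough to report that the platform does not meet the preset placement requirement, mirroring the per-region decision described above.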
In some embodiments, determining at least one region to be detected from the item placement image acquired from the placement platform includes:
determining the position of an article placement area corresponding to each reference article from the reference placement image;
for each article placement area, determining a corresponding area to be detected from the article placement image acquired by the article placement platform according to the position of the article placement area in the reference placement image.
A reference article is an article placed on a placement platform that meets the preset placement requirement. Each article placement area indicates where a reference article on such a platform appears in the reference placement image, and it can be understood that each article placement area corresponds to one region to be detected.
Specifically, the server stores the position of the article placement area of each reference article in the reference placement image. After receiving the article placement image sent by the terminal, the server locates, for each article placement area, the corresponding position in the article placement image according to that area's position in the reference placement image, thereby determining the region to be detected corresponding to each article placement area.
In this embodiment, therefore, determining the corresponding region to be detected from the acquired article placement image according to the position of each article placement area in the reference placement image provides a basis for determining the standard article type corresponding to each region to be detected.
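The position-based lookup described above can be sketched as follows. The (top, left, height, width) encoding of the stored positions and the area names are assumptions; the patent does not state how positions are represented.

```python
import numpy as np

# Hypothetical stored positions: (top, left, height, width) of each item
# placement area in the reference placement image.
area_positions = {
    "area_1": (10, 20, 32, 32),
    "area_2": (50, 60, 16, 48),
}

def crop_regions(placement_image, positions):
    """Cut the region to be detected for every placement area out of the
    newly acquired item placement image, reusing the positions recorded
    from the reference placement image."""
    regions = {}
    for name, (top, left, h, w) in positions.items():
        regions[name] = placement_image[top:top + h, left:left + w]
    return regions

# Toy 128x128 RGB placement image.
img = np.zeros((128, 128, 3), dtype=np.uint8)
regions = crop_regions(img, area_positions)
print({k: v.shape for k, v in regions.items()})
```

This presumes the acquired image is registered to the same viewpoint as the reference image, so the stored coordinates transfer directly.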
It should be understood that, although the steps in the flowcharts of the embodiments described above are shown sequentially as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in these flowcharts may include multiple sub-steps or stages, which need not be performed at the same moment but may be performed at different times, and whose order need not be sequential; they may instead be performed in turn or alternately with at least some of the other steps, sub-steps, or stages.
Based on the same inventive concept, an embodiment of the application also provides an article placement detection device. The implementation of the solution provided by the device is similar to that of the method described above, so for the specific limitations of the device embodiments below, reference may be made to the limitations of the article placement detection method above, which are not repeated here.
As shown in fig. 5, an embodiment of the present application provides an article placement detection device 500, including:
the area determining module 502 is configured to determine at least one area to be detected from the object placement image acquired by the object placement platform.
The feature extraction module 504 is configured to perform feature extraction on each region to be detected, so as to obtain a region feature vector corresponding to each region to be detected.
A feature matching module 506, configured to determine, for each region feature vector, a target reference feature vector matching that region feature vector from a plurality of reference feature vectors; the reference feature vectors are obtained based on a reference placement image, and the reference placement image is acquired from a placement platform that meets the preset placement requirement.
The detection module 508 is configured to determine an article placement detection result based on the target reference feature vectors corresponding to the feature vectors of each region; the object placement detection result is used for representing whether the object placement platform meets preset placement requirements.
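The matching performed by the feature matching module might be implemented as a nearest-neighbor lookup over the reference feature vectors. Cosine similarity is an assumption here; the patent does not name the similarity measure, and the function name is illustrative only.

```python
import numpy as np

def match_reference(region_vec, reference_vecs):
    """Return the index of the reference feature vector most similar to
    the region feature vector, using cosine similarity (assumed metric)."""
    r = region_vec / np.linalg.norm(region_vec)
    refs = reference_vecs / np.linalg.norm(reference_vecs, axis=1, keepdims=True)
    return int(np.argmax(refs @ r))  # highest cosine similarity wins

# Toy reference set: three 2-D reference feature vectors.
refs = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [0.7, 0.7]])
print(match_reference(np.array([0.9, 0.1]), refs))  # 0
```

The index returned would then be mapped to the stored article type of that reference feature vector, feeding the detection module's comparison.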
In some embodiments, the regional feature vectors are derived using a trained feature extraction model; the device also comprises a model training module which is specifically used for:
inputting each sample image into a feature extraction model to be trained for processing, and outputting sample feature vectors corresponding to each sample image; each sample image includes items of any item type;
determining agent feature vectors corresponding to the article types respectively;
and adjusting the feature extraction model to be trained based on the similarity between the sample feature vector corresponding to each sample image and the plurality of agent feature vectors to obtain a trained feature extraction model.
In some embodiments, in adjusting the feature extraction model to be trained based on the similarity between the sample feature vector corresponding to each sample image and the plurality of proxy feature vectors, to obtain a trained feature extraction model, the model training module is specifically configured to:
For each sample image, determining a proxy feature vector corresponding to the article type to which the article belongs in the sample image as a positive proxy feature vector, and determining proxy feature vectors except the positive proxy feature vector as negative proxy feature vectors;
Determining a loss value based on a first similarity between a sample feature vector corresponding to the sample image and a positive proxy feature vector and a second similarity between a sample feature vector corresponding to the sample image and a negative proxy feature vector; the first similarity and the loss value form a negative correlation, and the second similarity and the loss value form a positive correlation;
And adjusting the feature extraction model to be trained based on the loss value to obtain a trained feature extraction model.
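A loss with the stated correlations (decreasing in the first similarity, increasing in the second) can be sketched as a softmax cross-entropy over similarities to the proxies, in the spirit of proxy-based metric learning losses. The exact form, the scale parameter, and the use of cosine similarity are assumptions, not the patent's formula.

```python
import numpy as np

def proxy_loss(sample_vec, proxies, positive_idx, scale=1.0):
    """Softmax cross-entropy over cosine similarities between a sample
    feature vector and all proxy feature vectors.  The loss falls as the
    positive-proxy similarity rises (negative correlation) and rises as
    any negative-proxy similarity rises (positive correlation)."""
    v = sample_vec / np.linalg.norm(sample_vec)
    p = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    sims = scale * (p @ v)
    logits = sims - sims.max()                        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum()) # log-softmax
    return -log_probs[positive_idx]

# Two proxies (two article types); the positive proxy is index 0.
proxies = np.array([[1.0, 0.0], [0.0, 1.0]])
near = proxy_loss(np.array([0.9, 0.1]), proxies, positive_idx=0)  # close to positive proxy
far  = proxy_loss(np.array([0.1, 0.9]), proxies, positive_idx=0)  # close to a negative proxy
print(near < far)  # True
```

In training, the gradient of such a loss would be used to update both the model parameters and the proxy feature vectors, consistent with the joint adjustment described in the following embodiment.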
In some embodiments, in adjusting the feature extraction model to be trained based on the loss value to obtain a trained feature extraction model, the model training module is specifically configured to:
And adjusting model parameters of each agent feature vector and the feature extraction model to be trained until a preset training condition is met, so as to obtain a trained feature extraction model.
In some embodiments, in terms of extracting features of each to-be-detected region to obtain a region feature vector corresponding to each to-be-detected region, the feature extraction module 504 is specifically configured to:
Extracting features of the to-be-detected areas aiming at each to-be-detected area to obtain an area feature map and at least one area attention map;
respectively carrying out weighted calculation on the regional feature map by utilizing each regional attention map to obtain a weighted feature map;
And determining the regional characteristic vector corresponding to the region to be detected based on the weighted characteristic map.
In some embodiments, in determining the object placement detection result based on the target reference feature vector corresponding to each region feature vector, the detection module 508 is specifically configured to:
For each regional feature vector, determining a predicted article type corresponding to a region to be detected, to which the regional feature vector belongs, according to a target reference feature vector corresponding to the regional feature vector;
Determining the standard object type corresponding to the area to be detected according to the position of the area to be detected in the object placement image;
Under the condition that the predicted article type is inconsistent with the standard article type, determining that the article placement detection result is that the article placement platform does not meet the preset placement requirement.
In some embodiments, in determining at least one area to be detected in the item placement image acquired from the placement platform, the area determining module 502 is specifically configured to:
determining the position of an article placement area corresponding to each reference article from the reference placement image;
for each article placement area, determining a corresponding area to be detected from the article placement image acquired by the article placement platform according to the position of the article placement area in the reference placement image.
The modules in the article placement detection device may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware or independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In some embodiments, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 6. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing related data related to the object placement detection method. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by the processor to implement the steps in the article placement detection method described above.
In some embodiments, a computer device is provided, which may be a terminal, and the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit, and an input device. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface, the display unit and the input device are connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program is executed by the processor to implement the steps in the article placement detection method described above. The display unit of the computer device is used for forming a visual picture, and can be a display screen, a projection device or a virtual reality imaging device. The display screen can be a liquid crystal display screen or an electronic ink display screen; the input device of the computer equipment can be a touch layer covered on a display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the structures shown in fig. 6 or 7 are merely block diagrams of portions of structures associated with aspects of the application and are not intended to limit the computer device to which aspects of the application may be applied, and that a particular computer device may include more or fewer components than those shown, or may combine certain components, or may have a different arrangement of components.
In some embodiments, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the method embodiments described above when the computer program is executed.
In some embodiments, an internal structural diagram of a computer-readable storage medium is provided as shown in fig. 8, the computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the method embodiments described above.
In some embodiments, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by way of a computer program, which may be stored on a non-transitory computer-readable storage medium and which, when executed, may comprise the steps of the above method embodiments. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can take various forms such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided in the present application may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, data processing logic units based on quantum computing, or the like, but are not limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; nevertheless, as long as a combination of technical features contains no contradiction, it should be considered to be within the scope of this specification.
The foregoing examples illustrate only a few embodiments of the application, and their description is detailed but is not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of protection of the application should be determined by the appended claims.

Claims (10)

1. An article placement detection method, comprising:
determining at least one region to be detected in an article placement image acquired from an article placement platform;
Respectively extracting features of the areas to be detected to obtain area feature vectors corresponding to the areas to be detected;
For each of the regional feature vectors, determining a target reference feature vector matching the regional feature vector from a plurality of reference feature vectors; the reference feature vector is obtained based on a reference placement image, and the reference placement image is acquired from the object placement platform meeting the preset placement requirement;
Determining an object placement detection result based on the target reference feature vector corresponding to each regional feature vector; the article placement detection result is used for representing whether the article placement platform meets the preset placement requirement or not.
2. The method of claim 1, wherein the regional feature vector is derived using a trained feature extraction model; before the feature extraction is performed on each region to be detected to obtain the region feature vector corresponding to each region to be detected, the method further includes:
Inputting each sample image into a feature extraction model to be trained for processing, and outputting a sample feature vector corresponding to each sample image; each sample image comprises an item of any item type;
determining agent feature vectors corresponding to the article types respectively;
And adjusting the feature extraction model to be trained based on the similarity between the sample feature vector corresponding to each sample image and a plurality of agent feature vectors to obtain the trained feature extraction model.
3. The method according to claim 2, wherein the adjusting the feature extraction model to be trained based on the similarity between the sample feature vector corresponding to each of the sample images and the plurality of proxy feature vectors to obtain the trained feature extraction model includes:
for each sample image, determining a proxy feature vector corresponding to an article type to which an article in the sample image belongs as a positive proxy feature vector, and determining proxy feature vectors except the positive proxy feature vector as negative proxy feature vectors;
determining a loss value based on a first similarity between a sample feature vector corresponding to the sample image and the positive proxy feature vector, and a second similarity between a sample feature vector corresponding to the sample image and the negative proxy feature vector; the first similarity and the loss value form a negative correlation, and the second similarity and the loss value form a positive correlation;
and adjusting the feature extraction model to be trained based on the loss value to obtain the trained feature extraction model.
4. A method according to claim 3, wherein said adjusting the feature extraction model to be trained based on the loss value results in the trained feature extraction model, comprising:
And adjusting model parameters of each agent feature vector and the feature extraction model to be trained until a preset training condition is met, so as to obtain the trained feature extraction model.
5. The method of claim 1, wherein the performing feature extraction on each to-be-detected region to obtain a region feature vector corresponding to each to-be-detected region includes:
extracting features of the to-be-detected areas aiming at each to-be-detected area to obtain an area feature map and at least one area attention map;
respectively carrying out weighted calculation on the regional feature map by utilizing each regional attention map to obtain a weighted feature map;
And determining the regional characteristic vector corresponding to the region to be detected based on the weighted characteristic diagram.
6. The method of claim 1, wherein determining the object placement detection result based on the target reference feature vector corresponding to each of the regional feature vectors comprises:
for each regional feature vector, determining a predicted article type corresponding to a region to be detected, to which the regional feature vector belongs, according to a target reference feature vector corresponding to the regional feature vector;
Determining the standard object type corresponding to the to-be-detected area according to the position of the to-be-detected area in the object placement image;
And under the condition that the predicted article type is inconsistent with the standard article type, determining that the article placement detection result is that the article placement platform does not meet the preset placement requirement.
7. The method of claim 1, wherein determining at least one region to be detected in the item placement image acquired from the placement platform comprises:
determining the position of an article placement area corresponding to each reference article from the reference placement image;
And determining a corresponding region to be detected from the object placing images acquired by the object placing platform according to the position of the object placing region in the reference placing image aiming at each object placing region.
8. An article placement detection device, comprising:
the area determining module is used for determining at least one area to be detected from the object placing image acquired by the object placing platform;
the feature extraction module is used for respectively extracting features of the areas to be detected to obtain area feature vectors corresponding to the areas to be detected;
A feature matching module, configured to determine, for each of the regional feature vectors, a target reference feature vector that matches the regional feature vector from a plurality of reference feature vectors; the reference feature vector is obtained based on a reference placement image, and the reference placement image is acquired from the object placement platform meeting the preset placement requirement;
The detection module is used for determining object placement detection results based on the target reference feature vectors corresponding to the region feature vectors; the article placement detection result is used for representing whether the article placement platform meets the preset placement requirement or not.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202410030759.8A 2024-01-05 2024-01-05 Article placement detection method, device, computer equipment and storage medium Pending CN117953415A (en)

Publication Number: CN117953415A; Publication Date: 2024-04-30



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination