WO2022215195A1 - Object detection device, object detection system, object detection method, and recording medium - Google Patents
- Publication number
- WO2022215195A1 (PCT/JP2021/014768)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- object detection
- original
- feature amount
- encoded
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/776—Validation; Performance evaluation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Definitions
- This disclosure relates to the technical field of, for example, an object detection device, an object detection system, an object detection method, and a recording medium capable of detecting a detection target object in an image.
- Patent Document 1 describes an example of an object detection device that detects a detection target object in an image using a neural network.
- Patent Documents 2 to 4 are cited as prior art documents related to this disclosure.
- The object detection device may transmit an image to an information processing device external to the object detection device via a communication line, in parallel with object detection processing for detecting a detection target object in the image.
- For example, when the object detection device is installed in a mobile terminal having relatively low processing power, the image may be transmitted to an information processing device capable of performing information processing that requires relatively high processing power on the image.
- the object detection device may compress the image and transmit the compressed image to the information processing device in order to satisfy the bandwidth restrictions of the communication line.
- the object detection device needs to perform a compression operation for compressing the image independently of the object detection operation.
- However, the object detection device does not necessarily have processing power high enough to perform the object detection operation and the compression operation independently. Therefore, it is desirable to reduce the processing load of performing the object detection operation and the compression operation.
- the object of this disclosure is to provide an object detection device, an object detection system, an object detection method, and a recording medium that can solve the technical problems described above.
- An object of this disclosure is to provide an object detection apparatus, an object detection system, an object detection method, and a recording medium capable of compressing an image while reducing the processing load for detecting a detection target object in the image.
- The object detection device disclosed herein includes: a generation means for compression-encoding each of a first image acquired from an image generation device and a second image representing a detection target object, so as to extract a feature amount that enables object detection and to allow later decoding, thereby generating first encoded information that is the compression-encoded first image and can be used as a first feature amount (the feature amount of the first image), and second encoded information that is the compression-encoded second image and can be used as a second feature amount (the feature amount of the second image); and a detection means for detecting the detection target object in the first image using the first and second feature amounts.
- The object detection system disclosed herein includes an object detection device and an information processing device. The object detection device includes: a generation means for compression-encoding each of a first image acquired from an image generation device and a second image representing a detection target object, so as to extract a feature amount that enables object detection and to allow later decoding, thereby generating first encoded information that is the compression-encoded first image and can be used as a first feature amount (the feature amount of the first image), and second encoded information that is the compression-encoded second image and can be used as a second feature amount (the feature amount of the second image); a detection means for detecting the detection target object in the first image using the first and second feature amounts; and a transmitting means for transmitting the first encoded information to the information processing device. The information processing device performs a predetermined operation using the first encoded information.
- The object detection method disclosed herein includes: compression-encoding each of a first image acquired from an image generation device and a second image representing a detection target object, so as to extract a feature amount that enables object detection and to allow later decoding, thereby generating first encoded information that is the compression-encoded first image and can be used as a first feature amount (the feature amount of the first image), and second encoded information that is the compression-encoded second image and can be used as a second feature amount (the feature amount of the second image); and detecting the detection target object in the first image using the first and second feature amounts.
- The recording medium of this disclosure records a computer program that causes a computer to execute an object detection method including: compression-encoding each of a first image acquired from an image generation device and a second image representing a detection target object, so as to extract a feature amount that enables object detection and to allow later decoding, thereby generating first encoded information that is the compression-encoded first image and can be used as a first feature amount (the feature amount of the first image), and second encoded information that is the compression-encoded second image and can be used as a second feature amount (the feature amount of the second image); and detecting the detection target object in the first image using the first and second feature amounts.
- the processing load for compressing the first image and detecting the detection target object in the first image can be reduced.
- FIG. 1 is a block diagram showing the overall configuration of the object detection system of this embodiment.
- FIG. 2 is a block diagram showing the configuration of the object detection device of this embodiment.
- FIG. 3 schematically shows the structure of a neural network used by the object detection device of this embodiment.
- FIG. 4 is a block diagram showing the configuration of the information processing apparatus of this embodiment.
- FIG. 5 is a flow chart showing the operation flow of the object detection system of this embodiment.
- FIG. 6 conceptually shows machine learning for generating a computational model used by the object detection device.
- FIG. 7 schematically shows the structure of a neural network used by the object detection device of the comparative example.
- FIG. 8 is a block diagram showing the configuration of an object detection device in a modification.
- Embodiments of an object detection device, an object detection system, an object detection method, and a recording medium will be described below with reference to the drawings.
- Specifically, an embodiment of the object detection device, the object detection system, the object detection method, and the recording medium will be described using an object detection system SYS to which the embodiment is applied. However, the present invention is not limited to the embodiments described below.
- FIG. 1 is a block diagram showing the overall configuration of the object detection system SYS of this embodiment.
- the object detection system SYS includes an object detection device 1 and an information processing device 2.
- Object detection device 1 and information processing device 2 can communicate with each other via communication line 3 .
- The communication line 3 may include a wired communication line.
- The communication line 3 may include a wireless communication line.
- the object detection device 1 can detect a detection target object within the original image IMG_original. That is, the object detection device 1 can perform object detection.
- The original image IMG_original is an image from which a detection target object is to be detected.
- the object detection device 1 may acquire such an original image IMG_original from an image generation device such as a camera.
- the object detection device 1 uses a detection target image IMG_target indicating the detection target object in order to detect the detection target object in the original image IMG_original. That is, the object detection device 1 uses the original image IMG_original and the detection target image IMG_target to detect the detection target object indicated by the detection target image IMG_target in the original image IMG_original.
- Based on the original image IMG_original, the object detection device 1 generates a feature amount CM_original of the original image IMG_original as a feature amount that enables object detection. Further, the object detection device 1 generates a feature amount CM_target of the detection target image IMG_target as a feature amount that enables object detection, based on the detection target image IMG_target. After that, the object detection device 1 detects the detection target object in the original image IMG_original based on the feature amount CM_original and the feature amount CM_target.
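The flow just described can be sketched as follows. This is a minimal illustration only: `encode` and `detect` are hypothetical stand-ins (a fixed random projection and a cosine-similarity test) for the learned network parts NN1 and NN2 described later, not the patent's actual models.

```python
import numpy as np

def encode(image: np.ndarray) -> np.ndarray:
    """Placeholder compression-encoder standing in for network part NN1.
    Its output plays both roles described in the text: it is the
    compression-encoded image and the feature amount used for detection."""
    rng = np.random.default_rng(0)            # fixed weights for reproducibility
    w = rng.standard_normal((image.size, 16))
    return image.reshape(-1) @ w              # encoded information / feature amount

def detect(cm_original: np.ndarray, cm_target: np.ndarray) -> bool:
    """Placeholder detector standing in for network part NN2: declares a
    detection when the two feature amounts are similar enough (cosine)."""
    sim = cm_original @ cm_target / (
        np.linalg.norm(cm_original) * np.linalg.norm(cm_target))
    return bool(sim > 0.5)

img_original = np.ones((8, 8))     # image to search in (IMG_original)
img_target = np.ones((8, 8))       # image showing the target (IMG_target)
ei_original = encode(img_original) # doubles as feature amount CM_original
ei_target = encode(img_target)     # doubles as feature amount CM_target
found = detect(ei_original, ei_target)
```

The point of the sketch is that a single `encode` call produces the quantity that is both transmitted and fed to the detector, so no separate compression pass is needed.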
- the object detection device 1 further compresses and encodes the original image IMG_original so that it can be decoded later.
- In this specification, applying a desired compression-encoding process to an input image so as to generate data in a data structure (information format or information form) that can be decoded later is expressed as "compression-encoding the input image so that it can be decoded later."
- The term "input image" used here is replaced with an appropriately named image depending on where the expression appears.
- the object detection device 1 generates encoded information EI_original, which is the compression-encoded original image IMG_original.
- the object detection device 1 transmits the generated encoded information EI_original to the information processing device 2 via the communication line 3 .
- Since the encoded information EI_original is smaller than the original image IMG_original, transmitting it increases the possibility of satisfying the bandwidth constraint of the communication line 3.
- Here, the object detection device 1 uses the encoded information EI_original as the feature amount CM_original of the original image IMG_original (that is, as the feature amount CM_original for detecting the detection target object). That is, the object detection device 1 compression-encodes the original image IMG_original to generate encoded information EI_original that can be used as the feature amount CM_original. More specifically, the object detection device 1 compression-encodes the original image IMG_original so as to extract a feature amount that enables object detection and to allow later decoding, thereby generating encoded information EI_original that can be used as the feature amount CM_original (in other words, generating a feature amount CM_original that can be used as the encoded information EI_original).
- As described above, the object detection device 1 uses the detection target image IMG_target in addition to the original image IMG_original to detect the detection target object. For this reason, in addition to the feature amount CM_original, the object detection device 1 generates the encoded information EI_target, which is the compression-encoded detection target image IMG_target, as the feature amount CM_target of the detection target image IMG_target (that is, as the feature amount CM_target for detecting the detection target object). That is, the object detection device 1 compression-encodes the detection target image IMG_target in the same manner as the original image IMG_original, thereby generating encoded information EI_target that can be used as the feature amount CM_target (in other words, generating a feature amount CM_target that can be used as the encoded information EI_target).
- the object detection device 1 may or may not transmit the generated encoded information EI_target to the information processing device 2 via the communication line 3 .
- the information processing device 2 receives (that is, acquires) the encoded information EI_original from the object detection device 1 via the communication line 3 .
- the information processing device 2 performs a predetermined operation using the received encoded information EI_original.
- an example in which the information processing apparatus 2 performs a decoding operation for generating a restored image IMG_dec by decoding encoded information EI_original will be described as an example of the predetermined operation.
- a specific example of such an object detection system SYS is an augmented reality (AR) system.
- Augmented reality is a technology that detects a real object existing in the real space and arranges a virtual object where the real object exists within an image showing the real space.
- the object detection device 1 may be applied to mobile terminals such as smartphones.
- For example, the object detection device 1 may detect the detection target object (that is, the real object) in the original image IMG_original generated by capturing the real space with the camera of the mobile terminal, and a virtual object may be placed where the detected object exists within the original image IMG_original.
- the information processing device 2 may generate the restored image IMG_dec by performing the above-described decoding operation, and may further perform an image analysis operation for analyzing the restored image IMG_dec.
- the results of the image analysis operation may be sent to the mobile terminal.
- the mobile terminal may arrange the virtual object based on the result of the image analysis operation by the information processing device 2 in addition to the detection result of the detection target object by the object detection device 1 .
- An example of the image analysis operation by the information processing device 2 is an operation of estimating the orientation of the mobile terminal based on the restored image IMG_dec.
- the mobile terminal may arrange the virtual object based on the orientation of the mobile terminal estimated by the image analysis operation by the information processing device 2 .
- FIG. 2 is a block diagram showing the configuration of the object detection device 1.
- the object detection device 1 includes an arithmetic device 11, a storage device 12, and a communication device 13. Furthermore, the object detection device 1 may include an input device 14 and an output device 15 . However, the object detection device 1 does not have to include at least one of the input device 14 and the output device 15 . Arithmetic device 11 , storage device 12 , communication device 13 , input device 14 , and output device 15 may be connected via data bus 16 .
- the computing device 11 includes, for example, at least one of a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and an FPGA (Field Programmable Gate Array).
- Arithmetic device 11 reads a computer program.
- arithmetic device 11 may read a computer program stored in storage device 12 .
- The computing device 11 may read a computer program stored in a computer-readable non-transitory recording medium, using a recording medium reading device (not shown) included in the object detection device 1.
- The computing device 11 may acquire (that is, download or read) a computer program from a device (not shown) arranged outside the object detection device 1, via the communication device 13 (or another communication device).
- Arithmetic device 11 executes the read computer program.
- the arithmetic unit 11 implements logical functional blocks for executing the operations (in other words, processing) that the object detection apparatus 1 should perform.
- the arithmetic device 11 can function as a controller for realizing logical functional blocks for executing the operations that the object detection device 1 should perform.
- FIG. 2 shows an example of logical functional blocks implemented within the arithmetic unit 11.
- In the arithmetic unit 11, an encoding unit 111 (a specific example of the "generating means"), an object detection unit 112 (a specific example of the "detecting means"), and a transmission control unit 113 (a specific example of the "transmitting means") are realized.
- the encoding unit 111 compresses and encodes the original image IMG_original so that it can be decoded later, thereby generating encoded information EI_original that can be used as the feature amount CM_original of the original image IMG_original. Furthermore, the encoding unit 111 compresses and encodes the detection target image IMG_target so that it can be decoded later, thereby generating encoded information EI_target that can be used as the feature amount CM_target of the detection target image IMG_target.
- The object detection unit 112 detects the detection target object in the original image IMG_original based on the feature amount CM_original and the feature amount CM_target generated by the encoding unit 111.
- the encoding unit 111 uses an arithmetic model generated by machine learning to generate encoded information EI_original and EI_target (that is, feature quantities CM_original and CM_target). Furthermore, the object detection unit 112 detects a detection target object in the original image IMG_original using an arithmetic model generated by machine learning.
- the computing model may include a compression encoding model and an object detection model.
- The compression encoding model may be a model mainly for generating the encoded information EI_original and EI_target (that is, the feature amounts CM_original and CM_target).
- The object detection model may be a model for detecting the detection target object in the original image IMG_original mainly based on the feature amounts CM_original and CM_target (that is, the encoded information EI_original and EI_target).
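Since both model parts are generated by machine learning (cf. FIG. 6), one plausible training objective — an assumption for illustration, not something the publication specifies — combines a detection term with a reconstruction term so that the encoder's output both supports detection and remains decodable:

```python
import numpy as np

def joint_loss(reconstructed: np.ndarray, original: np.ndarray,
               det_score: float, det_label: int, lam: float = 0.5) -> float:
    """Hypothetical joint objective: binary cross-entropy for the detection
    head plus lam-weighted mean-squared reconstruction error, so the encoded
    information stays usable both as a feature amount and for later decoding.
    The weighting and loss choices here are assumptions, not from the patent."""
    recon = float(np.mean((reconstructed - original) ** 2))  # decodability term
    p = 1.0 / (1.0 + np.exp(-det_score))                     # sigmoid of detector score
    det = -(det_label * np.log(p) + (1 - det_label) * np.log(1 - p))
    return det + lam * recon
```

With a perfect reconstruction and a confident correct detection both terms are near zero; degrading either the reconstruction or the detection raises the loss, which is what would push the learned encoder toward the dual-purpose encoding described above.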
- a neural network NN is an example of a computational model generated by machine learning.
- An example of the neural network NN used by the encoding unit 111 and the object detection unit 112 is schematically shown in FIG.
- the neural network NN includes a network portion NN1 that is an example of a "first model portion” and a network portion NN2 that is an example of a "second model portion”.
- the network part NN1 is mainly used by the encoding unit 111 to generate encoded information EI_original and EI_target (that is, features CM_original and CM_target).
- the network portion NN1 is a neural network for realizing the compression coding model described above.
- When an image is input, the network portion NN1 can output encoded information that is the compression-encoded input image (decodable later) and that can be used as a feature amount of the input image. Therefore, when the original image IMG_original is input to the network portion NN1, the network portion NN1 outputs the encoded information EI_original (that is, the feature amount CM_original).
- Similarly, when the detection target image IMG_target is input to the network portion NN1, the network portion NN1 outputs the encoded information EI_target (that is, the feature amount CM_target).
- the network part NN1 may contain a neural network conforming to the desired compression encoding method.
- an encoder portion of an autoencoder may be used as network portion NN1.
- the information processing device 2 may generate the restored image IMG_dec from the encoded information EI_original using the decoding part of the autoencoder.
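The encoder/decoder split between the two devices can be illustrated with a toy linear "autoencoder". A Householder matrix is used purely so the example is exact and deterministic; a learned autoencoder, as in the embodiment, would instead perform lossy compression and an approximate reconstruction.

```python
import numpy as np

# Toy stand-in for the autoencoder: a Householder matrix is orthogonal and
# involutory, so the "decoder" (its transpose) reconstructs the input exactly.
v = np.array([1.0, 2.0, 3.0, 4.0])
W_enc = np.eye(4) - 2.0 * np.outer(v, v) / (v @ v)  # encoder side (network part NN1)
W_dec = W_enc.T                                      # decoder side (held by device 2)

img = np.array([0.1, 0.5, 0.3, 0.9])   # flattened original image IMG_original
ei_original = img @ W_enc               # encoded information EI_original (sent over line 3)
img_dec = ei_original @ W_dec           # restored image IMG_dec at device 2
```

Only `W_dec` needs to live on the information processing device 2; the object detection device 1 never has to run the decoding half, which is the division of labor the text describes.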
- the network part NN2 is mainly used by the object detection unit 112 to detect detection target objects in the original image IMG_original. That is, the network portion NN2 is a neural network for realizing the object detection model described above.
- When the feature amounts of two images are input, the network portion NN2 outputs the detection result, within one image, of the object indicated by the other image.
- the feature values CM_original and CM_target, which are the outputs of the network portion NN1, are input to the network portion NN2. In this case, the network part NN2 outputs the detection result of the detection target object indicated by the detection target image IMG_target in the original image IMG_original.
- the network part NN2 may output information regarding the presence/absence of the detection target object in the original image IMG_original as the detection result of the detection target object.
- the network part NN2 may output information about the position of the detection target image IMG_target in the original image IMG_original (for example, the position of the bounding box) as the detection result of the detection target object.
- the network part NN2 may include a neural network conforming to the desired object detection method for detecting objects using the two images.
- For example, the network portion NN2 may include a SiamRPN (Siamese Region Proposal Network).
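The core step of such Siamese detectors — correlating the target's feature map against the search image's feature map and reading off the peak response — can be sketched as follows. For simplicity raw pixels stand in for the learned feature amounts CM_original and CM_target, and the RPN's box-regression head is omitted.

```python
import numpy as np

def cross_correlate(search: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Slide the template feature map over the search feature map and record
    the correlation score at every offset, as Siamese trackers do."""
    sh, sw = search.shape
    th, tw = template.shape
    out = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(search[i:i + th, j:j + tw] * template)
    return out

search = np.zeros((6, 6))
template = np.ones((2, 2))
search[3:5, 1:3] = 1.0   # place the target object at offset (3, 1)
response = cross_correlate(search, template)
pos = np.unravel_index(np.argmax(response), response.shape)  # detected position
```

The peak of the response map gives the position of the detection target within the original image, matching the bounding-box-style output described above.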
- the transmission control unit 113 uses the communication device 13 to transmit the encoded information EI_original generated by the encoding unit 111 to the information processing device 2 . More specifically, as shown in FIG. 3 , the transmission control unit 113 uses the communication device 13 to transmit the encoded information EI_original output from the network part NN1 to the information processing device 2 . Further, the transmission control unit 113 may use the communication device 13 to transmit the encoded information EI_target generated by the encoding unit 111 to the information processing device 2 . More specifically, as shown in FIG. 3 , the transmission control unit 113 may use the communication device 13 to transmit the encoded information EI_target output by the network part NN1 to the information processing device 2 .
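The publication does not specify how EI_original is framed on the wire; as a hypothetical illustration, the transmission control unit 113 could serialize it as a length-prefixed float32 payload before handing it to the communication device 13:

```python
import struct

import numpy as np

def pack_encoded_info(ei: np.ndarray) -> bytes:
    """Hypothetical wire format: 4-byte big-endian length header followed by a
    float32 payload. This only illustrates that the encoded information is a
    compact byte stream rather than raw pixels; the actual format is not
    specified by the publication."""
    payload = ei.astype(np.float32).tobytes()
    return struct.pack("!I", len(payload)) + payload

def unpack_encoded_info(buf: bytes) -> np.ndarray:
    """Inverse of pack_encoded_info, as run on the information processing side."""
    (n,) = struct.unpack("!I", buf[:4])
    return np.frombuffer(buf[4:4 + n], dtype=np.float32)

ei_original = np.array([0.25, -1.5, 3.0], dtype=np.float32)
wire = pack_encoded_info(ei_original)       # sent over communication line 3
restored = unpack_encoded_info(wire)        # recovered by information acquisition unit 211
```

A 3-element feature vector travels as 16 bytes here, which is the kind of reduction that makes the bandwidth constraint of communication line 3 easier to satisfy.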
- the storage device 12 can store desired data.
- the storage device 12 may temporarily store computer programs executed by the arithmetic device 11 .
- the storage device 12 may temporarily store data temporarily used by the arithmetic device 11 while the arithmetic device 11 is executing a computer program.
- the storage device 12 may store data that the object detection device 1 saves for a long period of time.
- The storage device 12 may include at least one of a RAM (Random Access Memory), a ROM (Read Only Memory), a hard disk device, a magneto-optical disk device, an SSD (Solid State Drive), and a disk array device. That is, the storage device 12 may include a non-transitory recording medium.
- the communication device 13 can communicate with the information processing device 2 via the communication line 3.
- the communication device 13 transmits the encoded information EI_original to the information processing device 2 via the communication line 3 under the control of the transmission control section 113 .
- the communication device 13 may transmit the encoded information EI_target to the information processing device 2 via the communication line 3 under the control of the transmission control section 113 .
- the input device 14 is a device that accepts input of information to the object detection device 1 from the outside of the object detection device 1 .
- the input device 14 may include an operation device (for example, at least one of a keyboard, a mouse, and a touch panel) that can be operated by an operator of the object detection device 1 .
- the input device 14 may include a reading device capable of reading information recorded as data on a recording medium that can be externally attached to the object detection device 1 .
- the output device 15 is a device that outputs information to the outside of the object detection device 1 .
- the output device 15 may output information as an image. That is, the output device 15 may include a display device (so-called display) capable of displaying an image showing information to be output.
- the output device 15 may output information as voice.
- the output device 15 may include an audio device capable of outputting audio (so-called speaker).
- the output device 15 may output information on paper.
- the output device 15 may include a printing device (so-called printer) capable of printing desired information on paper.
- FIG. 4 is a block diagram showing the configuration of the information processing device 2.
- the information processing device 2 includes an arithmetic device 21, a storage device 22, and a communication device 23. Furthermore, the information processing device 2 may include an input device 24 and an output device 25 . However, the information processing device 2 does not have to include at least one of the input device 24 and the output device 25 . Arithmetic device 21 , storage device 22 , communication device 23 , input device 24 and output device 25 may be connected via data bus 26 .
- the computing device 21 includes, for example, at least one of a CPU, GPU and FPGA.
- Arithmetic device 21 reads a computer program.
- arithmetic device 21 may read a computer program stored in storage device 22 .
- The computing device 21 may read a computer program stored in a computer-readable non-transitory recording medium, using a recording medium reading device (not shown) included in the information processing device 2.
- The computing device 21 may acquire (that is, download or read) a computer program from a device (not shown) arranged outside the information processing device 2, via the communication device 23 (or another communication device).
- Arithmetic device 21 executes the read computer program.
- arithmetic device 21 can function as a controller for implementing logical functional blocks for executing operations that the information processing device 2 should perform.
- FIG. 4 shows an example of logical functional blocks implemented within the arithmetic unit 21.
- an information acquisition unit 211 and a processing unit 212 are implemented in the computing device 21 .
- the information acquisition unit 211 uses the communication device 23 to receive (that is, acquire) the encoded information EI_original transmitted from the object detection device 1 .
- the processing unit 212 performs a predetermined operation using the encoded information EI_original.
- the processing unit 212 performs a decoding operation to generate the restored image IMG_dec by decoding the encoded information EI_original acquired by the information acquisition unit 211 .
- the processing unit 212 may perform an image analysis operation of analyzing the restored image IMG_dec.
- the storage device 22 can store desired data.
- the storage device 22 may temporarily store computer programs executed by the arithmetic device 21 .
- the storage device 22 may temporarily store data temporarily used by the arithmetic device 21 while the arithmetic device 21 is executing a computer program.
- the storage device 22 may store data that the information processing device 2 saves for a long period of time.
- the storage device 22 may include at least one of RAM, ROM, hard disk device, magneto-optical disk device, SSD and disk array device. That is, the storage device 22 may include non-transitory recording media.
- the communication device 23 can communicate with the object detection device 1 via the communication line 3.
- the communication device 23 may receive (that is, acquire) the encoded information EI_original from the object detection device 1 via the communication line 3 under the control of the information acquisition section 211 .
- the input device 24 is a device that receives input of information to the information processing device 2 from outside the information processing device 2 .
- the input device 24 may include an operation device (for example, at least one of a keyboard, a mouse, and a touch panel) that can be operated by an operator of the information processing device 2 .
- the input device 24 may include a reading device capable of reading information recorded as data on a recording medium that can be externally attached to the information processing device 2 .
- the output device 25 is a device that outputs information to the outside of the information processing device 2 .
- the output device 25 may output information as an image.
- the output device 25 may include a display device (so-called display) capable of displaying an image showing information to be output.
- the output device 25 may output information as voice. That is, the output device 25 may include an audio device capable of outputting audio (so-called speaker).
- the output device 25 may output information on paper. That is, the output device 25 may include a printing device (so-called printer) capable of printing desired information on paper.
- FIG. 5 is a flow chart showing the flow of operations performed by the object detection system SYS.
- the object detection device 1 acquires the original image IMG_original (step S11).
- the object detection device 1 may acquire the original image IMG_original from a camera, which is a specific example of the image generation device.
- the object detection device 1 may acquire the original image IMG_original from the camera each time the camera generates the original image IMG_original.
- the object detection device 1 may acquire a plurality of original images IMG_original as time-series data from the camera. In this case, the operation shown in FIG. 5 may be performed using each original image IMG_original.
- the object detection device 1 acquires the detection target image IMG_target (step S11). For example, when the detection target image IMG_target is stored in the storage device 12, the object detection device 1 may acquire the detection target image IMG_target from the storage device 12. For example, when the detection target image IMG_target is recorded on a recording medium that can be externally attached to the object detection device 1, the object detection device 1 may acquire the detection target image IMG_target from the recording medium using a recording medium reading device (for example, the input device 14). For example, when the detection target image IMG_target is recorded in a device (for example, a server) external to the object detection device 1, the object detection device 1 may acquire the detection target image IMG_target from the external device using the communication device 13.
- once the object detection device 1 has acquired the detection target image IMG_target, it does not need to acquire it again. However, the object detection device 1 may acquire a new detection target image IMG_target when the detection target object changes.
- the object detection device 1 (in particular, the encoding unit 111) compresses and encodes the original image IMG_original so that it can be decoded later, thereby generating encoded information EI_original that can be used as the feature amount CM_original of the original image IMG_original (step S12). Furthermore, the object detection device 1 (in particular, the encoding unit 111) compresses and encodes the detection target image IMG_target so that it can be decoded later, thereby generating encoded information EI_target that can be used as the feature amount CM_target of the detection target image IMG_target (step S12).
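As a rough, self-contained sketch of the core idea of step S12 — a single compression-encoding pass whose output is both decodable later and usable directly as a feature amount — the toy example below substitutes simple 2x2 block averaging for the patent's learned encoder. The function names `encode`/`decode` and the block size are illustrative assumptions, not part of the disclosure:

```python
def encode(image, block=2):
    # Toy "compression encoding": average each non-overlapping block.
    # The result is smaller than the input, can be (approximately)
    # decoded later, and can double as a coarse feature map.
    h, w = len(image), len(image[0])
    return [[sum(image[r + dr][c + dc]
                 for dr in range(block) for dc in range(block)) / block ** 2
             for c in range(0, w, block)]
            for r in range(0, h, block)]

def decode(code, block=2):
    # Approximate decoding: nearest-neighbour upsampling of the code.
    return [[code[r // block][c // block]
             for c in range(len(code[0]) * block)]
            for r in range(len(code) * block)]

img = [[0, 0, 8, 8],
       [0, 0, 8, 8],
       [4, 4, 0, 0],
       [4, 4, 0, 0]]
ei = encode(img)    # plays the role of EI_original (and of CM_original)
rec = decode(ei)    # plays the role of the restored image IMG_dec
print(ei)   # → [[0.0, 8.0], [4.0, 0.0]]
```

In this contrived case the blocks are constant, so `decode(encode(img))` reproduces the input exactly; a real codec would only approximate it.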
- the object detection device 1 detects the detection target object in the original image IMG_original based on the feature values CM_original and CM_target generated in step S12 (step S13).
- the operation of detecting the detection target object may include an operation of detecting an area of a desired shape (for example, a rectangular area, a so-called bounding box) containing the detection target object in the original image IMG_original.
- the operation of detecting the detection target object may include an operation of detecting the position (coordinate values, for example) of a region of a desired shape containing the detection target object in the original image IMG_original.
- the operation of detecting the detection target object may include an operation of detecting characteristics (for example, at least one of color, shape, size and orientation) of the detection target object in the original image IMG_original.
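To make the role of the two feature amounts concrete, here is a deliberately simple stand-in for the detection step: sliding the target's feature patch CM_target over the original image's feature map CM_original and returning the best-matching position. The actual device uses a learned network part (NN2) rather than this sum-of-squared-errors template match; the `detect` helper and the toy feature maps are assumptions for illustration only:

```python
def detect(cm_original, cm_target):
    # Slide the target feature patch over the original feature map and
    # return the top-left position of the region with the smallest
    # squared error (a crude stand-in for a learned detector).
    H, W = len(cm_original), len(cm_original[0])
    h, w = len(cm_target), len(cm_target[0])
    best_err, best_pos = float("inf"), None
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            err = sum((cm_original[r + dr][c + dc] - cm_target[dr][dc]) ** 2
                      for dr in range(h) for dc in range(w))
            if err < best_err:
                best_err, best_pos = err, (r, c)
    return best_pos

cm_o = [[0, 0, 0, 0],
        [0, 9, 8, 0],
        [0, 7, 9, 0],
        [0, 0, 0, 0]]
cm_t = [[9, 8],
        [7, 9]]
print(detect(cm_o, cm_t))  # → (1, 1): top-left corner of the matched region
```

The returned position corresponds to the bounding-box corner mentioned above; width and height of the box follow from the patch size.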
- the detection result of the detection target object in step S13 may be used for desired purposes.
- the detection result of the detection target object in step S13 may be used for augmented reality (AR) applications. That is, the detection result of the detection target object in step S13 may be used for placing a virtual object at the position of the detection target object.
- the object detection device 1 transmits the encoded information EI_original generated in step S12 to the information processing device 2 using the communication device 13 (step S14).
- since the encoded information EI_original is the compression-encoded original image IMG_original, its data size is smaller than the data size of the original image IMG_original. Therefore, compared with the case where the original image IMG_original itself is transmitted to the information processing device 2 via the communication line 3, the band limitation of the communication line 3 is more likely to be satisfied. That is, even when the bandwidth of the communication line 3 is relatively narrow (that is, when the amount of data that can be transmitted per unit time is relatively small), the object detection device 1 can transmit the encoded information EI_original to the information processing device 2.
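A tiny numeric check of the data-size argument: block-averaging an 8x8 image into a 4x4 code cuts the number of values to transmit by a factor of four. The averaging scheme is an illustrative stand-in; the actual reduction achieved by the patent's encoder depends on the codec and bit depth:

```python
# An 8x8 "original image" versus its 4x4 block-averaged code.
img = [[float((r * c) % 7) for c in range(8)] for r in range(8)]

code = [[sum(img[r + dr][c + dc] for dr in range(2) for dc in range(2)) / 4
         for c in range(0, 8, 2)]
        for r in range(0, 8, 2)]

n_original = sum(len(row) for row in img)    # 64 values to transmit
n_encoded = sum(len(row) for row in code)    # 16 values to transmit
print(n_original, n_encoded)  # → 64 16
```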
- the information processing device 2 uses the communication device 23 to receive the encoded information EI_original transmitted from the object detection device 1 (step S21). After that, the information processing device 2 (particularly, the processing unit 212) performs a predetermined operation using the encoded information EI_original (step S22). For example, the processing unit 212 may perform a decoding operation to generate the restored image IMG_dec by decoding the encoded information EI_original acquired by the information acquisition unit 211 . The processing unit 212 may perform an image analysis operation of analyzing the restored image IMG_dec.
- FIG. 6 conceptually shows machine learning for generating a computational model used by the object detection device 1 .
- the machine learning performed when the computation model is the neural network NN of FIG. 3 will be explained.
- the computational model may be generated by machine learning described below.
- the neural network NN is generated by machine learning using a learning data set that includes a plurality of learning data, in each of which a learning image (hereinafter referred to as "learning image IMG_learn_original") is associated with a correct label y_learn for the detection result of the detection target object in the learning image IMG_learn_original. Furthermore, even after the neural network NN has once been generated, it may be updated as appropriate by machine learning using a learning data set including new learning data.
- the learning image IMG_learn_original included in the learning data is input to the network portion NN1 (that is, the compression-encoding model) included in the initial or already-generated neural network NN.
- the network part NN1 compression-encodes the learning image IMG_learn_original so that it can be decoded later, and outputs encoded learning information EI_learn_original, which can be used as the feature amount CM_learn_original of the learning image IMG_learn_original.
- a learning detection target image (hereinafter referred to as "detection target image IMG_learn_target") representing a learning detection target object is likewise input to the network portion NN1 included in the initial or already-generated neural network NN.
- the network part NN1 compression-encodes the detection target image IMG_learn_target so that it can be decoded later, and outputs encoded information that is the compression-encoded detection target image IMG_learn_target and can be used as its feature amount CM_learn_target.
- the output of the network part NN1 (that is, the feature values CM_learn_original and CM_learn_target) is input to the network part NN2 (that is, the object detection model) included in the initial or generated neural network NN.
- the network part NN2 outputs the actual detection result y of the detection target object in the learning image IMG_learn_original.
- the encoded information EI_learn_original output by the network part NN1 is decoded. As a result, a restored image IMG_learn_dec is generated.
- the above operations are repeated for the plurality of learning data items (or a subset of them) included in the learning data set. Furthermore, the operations performed on the plurality of learning data items (or a subset of them) may be repeated for a plurality of detection target images IMG_learn_target.
- the neural network NN is generated or updated using loss functions Loss including a loss function Loss1 regarding detection of the detection target object and a loss function Loss2 regarding compression encoding and decoding.
- the loss function Loss1 is a loss function relating to the error between the output y of the network part NN2 (that is, the actual detection result of the detection target object in the learning image IMG_learn_original by the network part NN2) and the correct label y_learn.
- the loss function Loss1 may be a loss function that decreases as the error between the output y of the network part NN2 and the correct label y_learn decreases.
- the loss function Loss2 is a loss function relating to the error between the restored image IMG_learn_dec and the learning image IMG_learn_original.
- the loss function Loss2 may be a loss function that decreases as the error between the restored image IMG_learn_dec and the learning image IMG_learn_original decreases.
- the neural network NN may be generated or updated so that the loss function Loss is minimized.
- the neural network NN may be generated or updated using existing algorithms for machine learning so that the loss function Loss is minimized.
- the neural network NN may be generated or updated using error backpropagation to minimize the loss function Loss. As a result, a neural network NN is generated or updated.
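The combined objective described above — a detection term Loss1 plus a reconstruction term Loss2 — might be sketched as follows, using mean squared error for both terms and an assumed weighting factor `alpha`. The disclosure does not specify the concrete form of either loss or how the two terms are weighted, so all of this is illustrative:

```python
def loss1(y, y_label):
    # Detection loss: mean squared error between the detector's output y
    # and the correct label y_learn (smaller when detection is better).
    return sum((a - b) ** 2 for a, b in zip(y, y_label)) / len(y)

def loss2(img_dec, img_learn):
    # Reconstruction loss: mean squared error between the restored image
    # IMG_learn_dec and the learning image IMG_learn_original.
    flat = lambda m: [v for row in m for v in row]
    a, b = flat(img_dec), flat(img_learn)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def total_loss(y, y_label, img_dec, img_learn, alpha=1.0):
    # Loss = Loss1 + alpha * Loss2; minimizing it trades off detection
    # accuracy against the decodability of the encoded information.
    return loss1(y, y_label) + alpha * loss2(img_dec, img_learn)

val = total_loss([1.0, 0.0], [0.0, 0.0], [[1, 1], [1, 1]], [[1, 1], [1, 3]])
print(val)  # → 1.5  (Loss1 = 0.5, Loss2 = 1.0)
```

In an actual training loop, `total_loss` would be computed on each minibatch and minimized by error backpropagation, as described above.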
- as described above, in this embodiment the object detection device 1 compresses and encodes the original image IMG_original, thereby generating encoded information EI_original that can be used as the feature amount CM_original of the original image IMG_original.
- that is, the object detection device 1 need not perform the operation of generating the feature amount CM_original and the operation of generating the encoded information EI_original separately and independently.
- the object detection device 1 need not perform the operation of generating the feature amount CM_original independently of the encoded information EI_original.
- the object detection device 1 need not perform the operation of generating the encoded information EI_original independently of the feature amount CM_original. Therefore, the processing load for compressing the original image IMG_original and for detecting the detection target object in the original image IMG_original can be reduced.
- in contrast, the object detection apparatus of the comparative example, which does not generate encoded information EI_original usable as the feature amount CM_original, must perform the operation of generating the feature amount CM_original and the operation of generating the encoded information EI_original separately and independently, as shown in FIG.
- specifically, the object detection device of the comparative example includes a network part NN3 for generating the feature amount CM_original independently of the encoded information EI_original, and a network part NN4 for generating the encoded information EI_original independently of the feature amount CM_original.
- the object detection device 1 of this embodiment need not include the network portions NN3 and NN4. Therefore, the structure of the neural network NN used by the object detection device 1 is simpler than the structure of the neural network NN' used by the object detection device of the comparative example. That is, the structure of the computational model used by the object detection device 1 is simpler than the structure of the computational model used by the object detection device of the comparative example.
- the processing load for compressing the original image IMG_original and generating the feature value CM_original of the original image IMG_original can be reduced compared to the comparative example. That is, in this embodiment, the processing load for compressing the original image IMG_original and detecting the detection target object in the original image IMG_original can be reduced as compared with the comparative example.
- the neural network NN (that is, the computational model) is generated by machine learning using a loss function Loss including a loss function Loss1 related to detection of a detection target object and a loss function Loss2 related to compression encoding and decoding. Therefore, a computation model is generated that can appropriately generate the encoded information EI_original that is the compressed original image IMG_original and that can be used as the feature quantity CM_original of the original image IMG_original.
- the object detection device 1 compresses and encodes the original image IMG_original using the computational model generated in this way, and can therefore appropriately generate the encoded information EI_original, which is the compression-encoded original image IMG_original and is usable as the feature amount CM_original of the original image IMG_original.
- the object detection device 1 transmits the encoded information EI_original to the information processing device 2 .
- the object detection device 1 does not have to transmit the encoded information EI_original to the information processing device 2 .
- the object detection device 1 may store the encoded information EI_original in the storage device 12 . In this case, as shown in FIG. 8, the object detection device 1 does not have to include the transmission control section 113 .
- the object detection device 1 uses the original image IMG_original and the detection target image IMG_target to detect the detection target object indicated by the detection target image IMG_target in the original image IMG_original.
- the object detection device 1 may detect the detection target object within the original image IMG_original without using the detection target image IMG_target.
- the object detection device 1 may detect the detection target object using a computational model that conforms to a desired object detection method, which detects an object using only the image from which the object should be detected.
- an example of such a computational model is one conforming to YOLO (You Only Look Once).
- even in this case, the object detection device 1 may compress and encode the original image IMG_original so that it can be decoded later, thereby generating encoded information EI_original that is the compression-encoded original image IMG_original and can be used as the feature amount CM_original of the original image IMG_original. As a result, the object detection device 1 can enjoy the effects described above.
- for example, machine learning of the YOLO-compliant computational model may be performed so that the output of an intermediate layer of that model can be decoded.
- in other words, machine learning may be performed so as to generate a computational model that conforms to YOLO but extends it to include an intermediate layer whose output can be decoded later.
- the intermediate layer of the YOLO-compliant computational model can output encoded information that can be used as feature quantities for object detection and that can be decoded later. Therefore, even the object detection device 1 that performs object detection using a computation model conforming to YOLO can enjoy the above-described effects.
- the information processing device 2 performs the decoding operation of decoding the encoded information EI_original to generate the restored image IMG_dec and the image analysis operation of analyzing the restored image IMG_dec.
- the information processing device 2 may perform operations other than the decoding operation and the image analysis operation.
- the information processing device 2 may store the encoded information EI_original received from the object detection device 1 in the storage device 22 .
- the information processing device 2 may store the restored image IMG_dec generated from the encoded information EI_original in the storage device 22 .
- the object detection apparatus according to appendix 1, further comprising transmission means for transmitting, via a communication line, the first encoded information to an information processing apparatus that performs a predetermined operation using the first encoded information.
- the object detection apparatus according to appendix 2, wherein the predetermined operation includes at least one of a first operation of generating a third image by decoding the first encoded information, a second operation of analyzing the third image, a third operation of storing the first encoded information in a storage device, and a fourth operation of storing the third image in a storage device.
- the object detection apparatus according to any one of appendices 1 to 3, wherein the generating means generates the first and second encoded information, usable as the first and second feature amounts respectively, using a first model portion of a computational model generated by machine learning, the first model portion outputting the first and second encoded information when the first and second images are input; the detection means detects the detection target object using a second model portion of the computational model, the second model portion outputting a detection result of the detection target object in the first image when the first and second feature amounts are input; and the computational model is generated by machine learning using a first loss function based on an error between the detection result of the detection target object output by the second model portion of the computational model to which a fourth image for learning is input and a correct label for the detection result of the detection target object in the fourth image, and a second loss function based on an error between the fourth image and a third image generated by decoding the first encoded information output by the first model portion of the computational model to which the fourth image is input.
- the object detection apparatus according to appendix 4, wherein the computational model includes a neural network and the first model portion includes an encoder portion of an autoencoder.
- an object detection system comprising an object detection apparatus and an information processing apparatus, wherein the object detection apparatus comprises: generating means for compression-encoding each of a first image acquired from an image generation device and a second image showing a detection target object, such that a feature amount enabling object detection is extracted and such that decoding is possible later, thereby generating first encoded information that is the compression-encoded first image and is usable as a first feature amount, which is the feature amount of the first image, and second encoded information that is the compression-encoded second image and is usable as a second feature amount, which is the feature amount of the second image; detection means for detecting the detection target object in the first image using the first and second feature amounts; and transmission means for transmitting the first encoded information to the information processing apparatus via a communication line.
- SYS Object detection system 1 Object detection device 11 Arithmetic device 111 Encoding unit 112 Object detection unit 113 Transmission control unit 2 Information processing device IMG_original Original image IMG_target Detection target image EI_original, EI_target Encoded information CM_original, CM_target Feature amounts NN Neural network NN1, NN2 Network portions
Description
First, the configuration of the object detection system SYS of this embodiment will be described.
First, the overall configuration of the object detection system SYS of this embodiment will be described with reference to FIG. 1. FIG. 1 is a block diagram showing the overall configuration of the object detection system SYS of this embodiment.
Next, the configuration of the object detection device 1 will be described with reference to FIG. 2. FIG. 2 is a block diagram showing the configuration of the object detection device 1.
Next, the configuration of the information processing device 2 will be described with reference to FIG. 4. FIG. 4 is a block diagram showing the configuration of the information processing device 2.
Next, the operations performed by the object detection system SYS will be described with reference to FIG. 5. FIG. 5 is a flow chart showing the flow of the operations performed by the object detection system SYS.
Next, machine learning for generating the computational model used by the object detection device 1 will be described with reference to FIG. 6. FIG. 6 conceptually shows machine learning for generating the computational model used by the object detection device 1. In the following description, for convenience, the machine learning performed when the computational model is the neural network NN of FIG. 3 will be described. However, even when the computational model differs from the neural network NN of FIG. 3, the computational model may be generated by the machine learning described below.
As described above, in this embodiment, the object detection device 1 generates encoded information EI_original usable as the feature amount CM_original of the original image IMG_original by compression-encoding the original image IMG_original. That is, the object detection device 1 need not perform the operation of generating the feature amount CM_original and the operation of generating the encoded information EI_original separately and independently. The object detection device 1 need not perform the operation of generating the feature amount CM_original independently of the encoded information EI_original, and need not perform the operation of generating the encoded information EI_original independently of the feature amount CM_original. Therefore, the processing load for compressing the original image IMG_original and for detecting the detection target object in the original image IMG_original can be reduced.
In the above description, the object detection device 1 transmits the encoded information EI_original to the information processing device 2. However, the object detection device 1 need not transmit the encoded information EI_original to the information processing device 2. For example, the object detection device 1 may store the encoded information EI_original in the storage device 12. In this case, as shown in FIG. 8, the object detection device 1 need not include the transmission control unit 113.
Regarding the embodiments described above, the following supplementary notes are further disclosed.
[Appendix 1]
An object detection apparatus comprising: generating means for compression-encoding each of a first image acquired from an image generation device and a second image showing a detection target object, such that a feature amount enabling object detection is extracted and such that decoding is possible later, thereby generating first encoded information that is the compression-encoded first image and is usable as a first feature amount, which is the feature amount of the first image, and second encoded information that is the compression-encoded second image and is usable as a second feature amount, which is the feature amount of the second image; and
detection means for detecting the detection target object in the first image using the first and second feature amounts.
[Appendix 2]
The object detection apparatus according to appendix 1, further comprising transmission means for transmitting, via a communication line, the first encoded information to an information processing apparatus that performs a predetermined operation using the first encoded information.
[Appendix 3]
The object detection apparatus according to appendix 2, wherein the predetermined operation includes at least one of a first operation of generating a third image by decoding the first encoded information, a second operation of analyzing the third image, a third operation of storing the first encoded information in a storage device, and a fourth operation of storing the third image in a storage device.
[Appendix 4]
The object detection apparatus according to any one of appendices 1 to 3, wherein the generating means generates the first and second encoded information, usable as the first and second feature amounts respectively, using a first model portion of a computational model generated by machine learning, the first model portion outputting the first and second encoded information when the first and second images are input; the detection means detects the detection target object using a second model portion of the computational model, the second model portion outputting a detection result of the detection target object in the first image when the first and second feature amounts are input; and the computational model is generated by machine learning using a first loss function based on an error between the detection result of the detection target object output by the second model portion of the computational model to which a fourth image for learning is input and a correct label for the detection result of the detection target object in the fourth image, and a second loss function based on an error between the fourth image and a third image generated by decoding the first encoded information output by the first model portion of the computational model to which the fourth image is input.
[Appendix 5]
The object detection apparatus according to appendix 4, wherein the computational model includes a neural network, and the first model portion includes an encoder portion of an autoencoder.
[Appendix 6]
An object detection system comprising an object detection apparatus and an information processing apparatus, wherein the object detection apparatus comprises: generating means for compression-encoding each of a first image acquired from an image generation device and a second image showing a detection target object, such that a feature amount enabling object detection is extracted and such that decoding is possible later, thereby generating first encoded information that is the compression-encoded first image and is usable as a first feature amount, which is the feature amount of the first image, and second encoded information that is the compression-encoded second image and is usable as a second feature amount, which is the feature amount of the second image; detection means for detecting the detection target object in the first image using the first and second feature amounts; and transmission means for transmitting the first encoded information to the information processing apparatus via a communication line; and wherein the information processing apparatus performs a predetermined operation using the first encoded information.
[Appendix 7]
An object detection method comprising: compression-encoding each of a first image acquired from an image generation device and a second image showing a detection target object, such that a feature amount enabling object detection is extracted and such that decoding is possible later, thereby generating first encoded information that is the compression-encoded first image and is usable as a first feature amount, which is the feature amount of the first image, and second encoded information that is the compression-encoded second image and is usable as a second feature amount, which is the feature amount of the second image; and detecting the detection target object in the first image using the first and second feature amounts.
[Appendix 8]
A recording medium on which is recorded a computer program that causes a computer to execute an object detection method comprising: compression-encoding each of a first image acquired from an image generation device and a second image showing a detection target object, such that a feature amount enabling object detection is extracted and such that decoding is possible later, thereby generating first encoded information that is the compression-encoded first image and is usable as a first feature amount, which is the feature amount of the first image, and second encoded information that is the compression-encoded second image and is usable as a second feature amount, which is the feature amount of the second image; and detecting the detection target object in the first image using the first and second feature amounts.
1 Object detection device
11 Arithmetic device
111 Encoding unit
112 Object detection unit
113 Transmission control unit
2 Information processing device
IMG_original Original image
IMG_target Detection target image
EI_original, EI_target Encoded information
CM_original, CM_target Feature amounts
NN Neural network
NN1, NN2 Network portions
Claims (8)
- 1. An object detection apparatus comprising: generating means for compression-encoding each of a first image acquired from an image generation device and a second image showing a detection target object, such that a feature amount enabling object detection is extracted and such that decoding is possible later, thereby generating first encoded information that is the compression-encoded first image and is usable as a first feature amount, which is the feature amount of the first image, and second encoded information that is the compression-encoded second image and is usable as a second feature amount, which is the feature amount of the second image; and detection means for detecting the detection target object in the first image using the first and second feature amounts.
- 2. The object detection apparatus according to claim 1, further comprising transmission means for transmitting, via a communication line, the first encoded information to an information processing apparatus that performs a predetermined operation using the first encoded information.
- 3. The object detection apparatus according to claim 2, wherein the predetermined operation includes at least one of a first operation of generating a third image by decoding the first encoded information, a second operation of analyzing the third image, a third operation of storing the first encoded information in a storage device, and a fourth operation of storing the third image in a storage device.
- 4. The object detection apparatus according to any one of claims 1 to 3, wherein the generating means generates the first and second encoded information, usable as the first and second feature amounts respectively, using a first model portion of a computational model generated by machine learning, the first model portion outputting the first and second encoded information when the first and second images are input; the detection means detects the detection target object using a second model portion of the computational model, the second model portion outputting a detection result of the detection target object in the first image when the first and second feature amounts are input; and the computational model is generated by machine learning using a first loss function based on an error between the detection result of the detection target object output by the second model portion of the computational model to which a fourth image for learning is input and a correct label for the detection result of the detection target object in the fourth image, and a second loss function based on an error between the fourth image and a third image generated by decoding the first encoded information output by the first model portion of the computational model to which the fourth image is input.
- 5. The object detection apparatus according to claim 4, wherein the computational model includes a neural network, and the first model portion includes an encoder portion of an autoencoder.
- 6. An object detection system comprising an object detection apparatus and an information processing apparatus, wherein the object detection apparatus comprises: generating means for compression-encoding each of a first image acquired from an image generation device and a second image showing a detection target object, such that a feature amount enabling object detection is extracted and such that decoding is possible later, thereby generating first encoded information that is the compression-encoded first image and is usable as a first feature amount, which is the feature amount of the first image, and second encoded information that is the compression-encoded second image and is usable as a second feature amount, which is the feature amount of the second image; detection means for detecting the detection target object in the first image using the first and second feature amounts; and transmission means for transmitting the first encoded information to the information processing apparatus via a communication line; and wherein the information processing apparatus performs a predetermined operation using the first encoded information.
- 7. An object detection method comprising: compression-encoding each of a first image acquired from an image generation device and a second image showing a detection target object, such that a feature amount enabling object detection is extracted and such that decoding is possible later, thereby generating first encoded information that is the compression-encoded first image and is usable as a first feature amount, which is the feature amount of the first image, and second encoded information that is the compression-encoded second image and is usable as a second feature amount, which is the feature amount of the second image; and detecting the detection target object in the first image using the first and second feature amounts.
- 8. A recording medium on which is recorded a computer program that causes a computer to execute an object detection method comprising: compression-encoding each of a first image acquired from an image generation device and a second image showing a detection target object, such that a feature amount enabling object detection is extracted and such that decoding is possible later, thereby generating first encoded information that is the compression-encoded first image and is usable as a first feature amount, which is the feature amount of the first image, and second encoded information that is the compression-encoded second image and is usable as a second feature amount, which is the feature amount of the second image; and detecting the detection target object in the first image using the first and second feature amounts.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/014768 WO2022215195A1 (ja) | 2021-04-07 | 2021-04-07 | 物体検出装置、物体検出システム、物体検出方法、及び、記録媒体 |
US18/284,610 US20240161445A1 (en) | 2021-04-07 | 2021-04-07 | Object detection apparatus, object detection system, object detection method, and recording medium |
JP2023512574A JPWO2022215195A1 (ja) | 2021-04-07 | 2021-04-07 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022215195A1 (ja) | 2022-10-13 |
Family
ID=83545303
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/014768 WO2022215195A1 (ja) | 2021-04-07 | 2021-04-07 | 物体検出装置、物体検出システム、物体検出方法、及び、記録媒体 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20240161445A1 (ja) |
JP (1) | JPWO2022215195A1 (ja) |
WO (1) | WO2022215195A1 (ja) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013182323A (ja) * | 2012-02-29 | 2013-09-12 | Toshiba Tec Corp | 情報処理装置、店舗システム及びプログラム |
JP2018205920A (ja) * | 2017-05-31 | 2018-12-27 | 富士通株式会社 | 学習プログラム、学習方法および物体検知装置 |
JP2019200697A (ja) * | 2018-05-18 | 2019-11-21 | 東芝テック株式会社 | 棚管理システムおよびプログラム |
Also Published As
Publication number | Publication date |
---|---|
US20240161445A1 (en) | 2024-05-16 |
JPWO2022215195A1 (ja) | 2022-10-13 |
Legal Events
Code | Title | Description
---|---|---
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21935999; Country of ref document: EP; Kind code of ref document: A1
WWE | Wipo information: entry into national phase | Ref document number: 18284610; Country of ref document: US
WWE | Wipo information: entry into national phase | Ref document number: 2023512574; Country of ref document: JP
NENP | Non-entry into the national phase | Ref country code: DE
122 | Ep: pct application non-entry in european phase | Ref document number: 21935999; Country of ref document: EP; Kind code of ref document: A1