WO2018016655A1 - Instructing device, method of controlling instructing device, remote operation support system, and information processing program - Google Patents

Instructing device, method of controlling instructing device, remote operation support system, and information processing program Download PDF

Info

Publication number
WO2018016655A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
work
video
information
instruction
Prior art date
Application number
PCT/JP2017/026726
Other languages
French (fr)
Japanese (ja)
Inventor
太一 三宅 (Taichi Miyake)
大津 誠 (Makoto Otsu)
拓人 市川 (Takuto Ichikawa)
Original Assignee
シャープ株式会社 (Sharp Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by シャープ株式会社 (Sharp Corporation)
Publication of WO2018016655A1 publication Critical patent/WO2018016655A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048: Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481: Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G: ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00: Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G: ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00: Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36: Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G: ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00: Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36: Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G09G5/37: Details of the operation on graphic patterns
    • G09G5/377: Details of the operation on graphic patterns for mixing or overlaying two or more graphic patterns
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Definitions

  • The present invention relates to a remote communication system that performs communication using remote communication technology.
  • More specifically, it relates to an instruction device that constitutes the remote communication system and issues work instructions to a work terminal, a method of controlling the instruction device, a remote operation support system, and an information processing program.
  • AR (augmented reality)
  • CG (computer graphics)
  • The terminal of the instructor receives the video captured by the operator's terminal, and the instructor operates on that video.
  • a mark such as a circle or an arrow can be drawn at a target location on the video.
  • the mark drawn in this way is also displayed on the operator's terminal.
  • Even if the instructor uses an ambiguous expression such as "here" or "that", referring to the mark makes a recognition discrepancy between the two less likely.
  • It is possible to obtain benefits such as improved work efficiency, execution of highly specialized work regardless of skill proficiency, and reductions in work period and cost.
  • the present invention has been made in view of the above problems, and an object of the present invention is to provide a remote communication system including an instruction device that is less likely to cause a discrepancy in the recognition of instruction contents between an instructor and an operator.
  • To solve the above problem, an instruction device according to one aspect of the present invention includes a receiving unit that receives a captured image of a work target imaged by a work terminal, a transmission unit that transmits instruction information to the work terminal, and a video composition unit that superimposes information indicating the imaging range of the captured image on an image including the whole image of the work target or a partial image thereof and displays it on a display unit.
  • Similarly, a control method for an instruction device according to one aspect of the present invention includes a reception step of receiving a captured image of a work target imaged by a work terminal, a transmission step of transmitting instruction information to the work terminal, and a video composition step of superimposing information indicating the imaging range of the captured image on an image including the whole image of the work target or a partial image thereof and displaying it on a display unit.
  • FIG. 1 is a diagram schematically illustrating an example of a usage scene of the remote communication system 1 according to the present embodiment.
  • The left side of FIG. 1 shows the work site 100 and the right side shows the instruction room 106; the two are located away from each other.
  • The remote communication system 1 is a system that implements work support by transmitting information about the work between an operator and an instructor who are located apart from each other.
  • FIG. 1 shows a scene in which a worker 101 at a work site 100 performs work while receiving, via a work terminal 103 (first terminal), a work instruction regarding a work target 102 from an instructor 107 in an instruction room 106. More specifically, in this example, the worker 101, who is repairing the work target 102, receives an instruction regarding the repair from the instructor 107, who supervises the work.
  • the remote communication system 1 includes a work terminal 103 and an instruction device 108 that can communicate with each other.
  • The remote communication system 1 can issue a work instruction by sharing, between the work terminal 103 and the instruction device 108, an instruction mark representing the instruction content that is input and displayed in the video.
  • the remote communication system 1 may have any configuration as long as it can perform work instructions.
  • the remote communication system 1 may be configured to perform work instructions using only voice.
  • The instructor 107 can give a work instruction to the worker 101 while confirming the three-dimensional shape of the work target 102 (the overall image of the work target 102).
  • the work terminal 103 is a tablet computer and includes a camera 103a provided on the back side and a display unit 103b provided on the front side.
  • The work terminal 103 can photograph the work target 102 with the camera 103a, display the video obtained by the photographing on the display unit 103b, and transmit the video to the remote instruction device (second terminal) 108.
  • the instruction device 108 installed in the instruction room 106 is in the form of a desktop personal computer as shown in FIG. 1, but is not limited to this form.
  • For example, a tablet computer like the one used by the worker 101 may be used.
  • the instruction device 108 can receive the video sent from the remote work terminal 103 and display the video on the display device 109 (display unit).
  • the instructor 107 gives a work instruction to the worker 101 using the display device 109 and the instruction device 108 while viewing the video 110 displayed on the display device 109.
  • The video 110 is an entire image including an image for the instructor 107 (user) to grasp the three-dimensional shape of the work target 102 (hereinafter, this abstracted data may be referred to as abstract data).
  • an entire image including an image for the instructor 107 (user) to grasp the three-dimensional shape of the work object 102 is used as the video 110 displayed on the display device 109.
  • However, the present embodiment is not limited to this.
  • the image for the instructor 107 (user) to grasp may be a partial image instead of the entire image of the work target 102.
  • what the instructor 107 (user) grasps is not limited to the three-dimensional shape of the work target 102.
  • It suffices that the video 110 displayed on the display device 109 is an image indicating, as will be described later, which region of the work target 102 corresponds to the shooting range of the video sent from the remote work terminal 103.
  • When the instructor 107 performs an input instruction using a touch panel or a mouse, the instruction device 108 generates instruction mark information indicating the instruction mark 111 based on the input instruction.
  • The instruction mark 111 is a mark that is displayed superimposed, at the designated position specified by the instructor 107, on the video displayed on the display device 109, and is used by the instructor 107 to give instructions to the operator.
  • The instruction mark may be a simple mark, a pointer, a marker, text, a picture, or a combination of two or more of these. That is, the instruction mark is an image indicating information related to the work, displayed superimposed on the captured video.
  • the instruction mark information is information necessary when generating the instruction mark.
  • the instruction device 108 displays the instruction mark 111 on the display device 109 based on the mark information, and transmits the instruction mark information (instruction information) to the work terminal 103.
  • the work terminal 103 also displays the instruction mark 105 on the display unit 103b based on the mark information.
  • the work terminal 103 and the instruction device 108 share instruction mark information, and based on the instruction mark information, each superimposes the same instruction mark on the video.
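  • As an illustration only (the patent does not specify a wire format for the instruction mark information), the data shared between the two terminals could be serialized as a small message; every field name in the sketch below is a hypothetical example, not part of the disclosure.

```python
import json

# Hypothetical instruction-mark-information message; field names are
# illustrative only -- the patent does not define a concrete format.
def make_instruction_mark_info(mark_type: str, x: float, y: float,
                               label: str = "") -> str:
    """Serialize the data needed to redraw the same mark on both terminals."""
    return json.dumps({
        "mark_type": mark_type,        # e.g. "arrow", "circle", "text"
        "position": {"x": x, "y": y},  # designated position in the shared video
        "label": label,                # optional text accompanying the mark
    })

# Both terminals parse the same message and superimpose the same mark.
info = json.loads(make_instruction_mark_info("arrow", 120.0, 88.5, "tighten"))
```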
  • The instruction device 108 accepts an input operation for superimposing the instruction mark 111 on the work target video 102a (the video corresponding to the work target 102) shown in the video 110 of the display device 109 in FIG. 1.
  • Upon receiving the input operation, the instruction device 108 generates instruction mark information indicating that the instruction mark is to be superimposed on the work target video 102a, and transmits it to the work terminal 103.
  • In this way, the instruction mark information is shared with the work terminal 103.
  • Based on the instruction mark information, the instruction device 108 superimposes the instruction mark 111 on the work target video 102a shown in the video 110 of the display device 109.
  • the work terminal 103 superimposes the instruction mark 105 on the work target image 102a shown in the image 104 of the display unit 103b in FIG. 1 based on the instruction mark information.
  • the worker 101 can grasp the instruction mark 105 and the location (target position) designated by the instruction mark 105 by the display unit 103b. Thereby, the worker 101 can visually grasp the work instruction from the instruction room 106 at a remote place.
  • the work terminal 103 can also set instruction mark information based on the input instruction of the worker 101, and can superimpose and display the instruction mark 105 on the video 104 of the display unit 103b based on the instruction mark information.
  • the work terminal 103 transmits the instruction mark information to the instruction device 108, and the work terminal 103 and the instruction device 108 can share the mark information.
  • the instruction device 108 can superimpose the instruction mark 111 on the video 110 of the display device 109 based on the mark information transmitted from the work terminal 103.
  • both the instructor 107 and the worker 101 can recognize the instruction mark set by one of the instructor 107 and the worker 101.
  • the instruction information transmitted from the instruction device 108 to the work terminal 103 is not limited to the instruction mark information.
  • As the instruction information, information that specifies various images to be superimposed on the video 104 of the display unit 103b of the work terminal 103 can be used.
  • the work terminal 103 and the instruction device 108 included in the remote communication system 1 are connected to each other according to a protocol such as TCP / IP or UDP via a public communication network (for example, the Internet) 201. Can communicate.
  • a protocol such as TCP / IP or UDP
  • a public communication network for example, the Internet
  • the remote communication system 1 is provided with a management server 200 for collectively managing instruction mark information, and the management server 200 is also connected to the public communication network 201.
  • the work terminal 103 and the public communication network 201 may be connected by wireless communication.
  • The wireless communication is, for example, a Wi-Fi (Wireless Fidelity; registered trademark) connection conforming to the international standard IEEE 802.11 defined by the Wi-Fi Alliance (a US industry group).
  • Although FIG. 2 shows a configuration using the management server 200, there is no problem even if the work terminal 103 and the instruction device 108 communicate with each other directly.
  • In the following, the form in which the work terminal 103 and the instruction device 108 communicate directly is described.
  • Descriptions of general audio communication processing and of video communication processing other than the additional screen information used in a normal video conference system are omitted where this causes no problem.
  • the remote communication system 1 includes the instruction device 108 of the instructor 107 and the work terminal 103 of the worker 101, which will be described in turn.
  • The instruction device 108 includes: a communication unit 301 (reception unit, transmission unit) that receives the video and instruction mark information sent from the outside and transmits instruction mark information generated inside to the outside; a video composition unit 302 that composites the instruction mark information and the imaging range of the work terminal 103 onto the abstract data held by the instructor 107; a display unit 303 that displays the result of the video composition and the abstract data described later; an external input/output unit 304 that receives input from the user; a storage unit 305 that stores various data used for video composition; a shooting range calculation unit 306 that calculates the position, on the abstract data, of the range that can be captured by the camera 103a of the work terminal 103 (hereinafter referred to as the "shooting range"); a control unit 307 that controls the entire instruction device 108; and a data bus 300 for exchanging data between the blocks.
  • the video composition unit 302 further includes an entire image display unit 311 (video composition unit) and a shooting range display unit 312 (video composition unit).
  • The abstract data is an abstraction of the work object 102 and is an entire image including an image for the instructor 107 (user) to grasp the three-dimensional shape of the work object 102. A specific configuration example of the abstract data will be described later.
  • The communication unit 301 includes an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit) and is a processing block that transmits and receives data to and from the outside. Specifically, it receives the video code and instruction mark information sent from the work terminal 103, described later, and transmits instruction mark information created internally.
  • the video code is data on which an encoding process suitable for encoding a moving image has been executed.
  • H.264 encoding is one of the standards for compression encoding of moving image data, and is a method standardized by the ISO (International Organization for Standardization).
  • When performing video transmission, the communication unit 301 encodes the video in accordance with the above-described encoding method and transmits it; when receiving video, it performs video decoding, the inverse of the encoding process.
  • The encoding process and the decoding process may also be executed by the control unit 307 described later.
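  • As a rough sketch of the encode/decode round trip: the text specifies H.264 but no particular library. The snippet below uses OpenCV's VideoWriter/VideoCapture, whose H.264 ('avc1') support depends on the codecs installed on the system; in a real streaming system a dedicated encoder would replace the file-based round trip shown here.

```python
import cv2

# Minimal sketch of the encode side, assuming OpenCV with an H.264-capable
# backend ('avc1' fourcc availability depends on the installed codecs).
cap = cv2.VideoCapture(0)                      # camera 103a stand-in
fourcc = cv2.VideoWriter_fourcc(*"avc1")       # H.264
writer = cv2.VideoWriter("work_video.mp4", fourcc, 30.0, (1280, 720))

for _ in range(300):                           # encode roughly 10 s at 30 fps
    ok, frame = cap.read()
    if not ok:
        break
    writer.write(cv2.resize(frame, (1280, 720)))  # encode one frame

cap.release()
writer.release()

# The receive side decodes by reading the stream back frame by frame.
decoded = cv2.VideoCapture("work_video.mp4")
ok, first_frame = decoded.read()
decoded.release()
```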
  • The video composition unit 302 includes an FPGA, an ASIC, or a GPU (Graphics Processing Unit).
  • the video composition unit 302 performs composition processing on the input video so that the instruction mark and the area captured by the work terminal 103 can be visually represented.
  • the instruction mark information is information necessary for generating instruction contents that can be expressed visually, such as an instruction mark and a pointer as described above.
  • the video composition unit 302 further includes an entire image display unit 311 and a shooting range display unit 312.
  • The entire image display unit 311 displays an entire image including an image for the user to grasp the three-dimensional shape of the work object. Specifically, the entire image including the abstract data of the work target 102 is displayed to the instructor 107 via the display unit 303 described later.
  • The imaging range display unit 312 displays, superimposed on the entire image displayed by the entire image display unit 311, information indicating which part of the work object is captured in the captured video. Specifically, for the video captured by the work terminal 103, it displays to the instructor 107, via the display unit 303 described later, information indicating to which area of the work target 102 the imaging range of the video corresponds.
  • The shooting range display unit 312 may have any configuration as long as it shows the shooting range of the video; for example, only the boundary line of the shooting range may be displayed, or the video itself may be superimposed on the entire image.
  • The display unit 303 includes an LCD (Liquid Crystal Display), an organic EL display (OELD: Organic Electro-Luminescence Display), or the like.
  • the display unit 303 displays a composite video output from the video composition unit 302, a video processing result, an image stored in the storage unit 305, a UI (User Interface) for controlling the apparatus, and the like.
  • The display unit 303 can be provided with a touch panel function that allows the terminal to be operated by pressing the display surface; using this function, the location of the instruction mark described above can be specified.
  • The display unit 303 may be installed externally to the instruction device 108 and connected via the external input/output unit 304, as with the display device 109 of FIG. 1.
  • The external input/output unit 304 has input/output ports such as USB (Universal Serial Bus) and HDMI (High-Definition Multimedia Interface), and operates as an interface with external devices such as storage.
  • the storage unit 305 includes, for example, a main storage device such as a RAM (Random Access Memory) or an auxiliary storage device such as a hard disk.
  • the main storage device is used to temporarily hold image data and image processing results.
  • The auxiliary storage device stores data for long-term retention, such as captured image data and image processing results.
  • The shooting range calculation unit 306 calculates, based on the decoded and restored video signal, the information necessary to visualize the shooting range of the video shot by the worker 101 on the abstract data possessed by the instructor 107, and outputs it to the video composition unit 302. That is, the shooting range calculation unit 306 calculates the coordinate position corresponding to the shooting range of the video in the coordinate space containing the abstract data of the work object 102. A specific calculation method will be described later.
  • the control unit 307 controls the entire processing of the instruction device 108 and controls the exchange of information between the units. In the present embodiment, the control unit 307 performs encoding and decoding processing on the video signal as necessary for the data transmitted and received by the communication unit 301.
  • the data bus 300 is a bus for exchanging data between each unit.
  • In addition to the conventional scheme of sharing instruction mark information between the worker 101 and the instructor 107, the shooting range calculation unit 306 makes it possible to present to the instructor 107 the range that the worker 101 is shooting.
  • the abstracted data possessed by the instructor 107 refers to abstracted data having information representing the overall image of the work object 102 as a parameter.
  • the abstract data can be expressed as three-dimensional information or two-dimensional information.
  • The three-dimensional or two-dimensional information is data based on an expression method that can be described by a combination of at least one parameter, or data converted from information described by such parameters into another expression method.
  • the three-dimensional information may be, for example, three-dimensional model data having information such as the size and shape of the work object 102, surface color information, and coordinate information.
  • The three-dimensional model data may be obtained by the instructor scanning the work target in advance, may exist as a three-dimensional design drawing from the stage at which the work target 102 was designed, or may be three-dimensional information acquired by the worker 101 at the work site 100 using a distance measuring device and then held by the instructor 107.
  • The two-dimensional information includes data represented on a two-dimensional plane (hereinafter referred to as two-dimensional data), such as an overhead view (overhead video), a development view obtained by developing or projecting the three-dimensional data, or a three-view drawing.
  • The bird's-eye view (overhead view) is a two-dimensional image showing the whole work site 100, photographed so as to include the work object 102 and its surroundings with an imaging device installed at an arbitrary location on the work site 100.
  • The instructor 107 can select whether the abstract data is handled as three-dimensional model data, two-dimensional data, or an overhead view, and can switch freely according to the work situation. The abstract data is stored in the storage unit 305 of the instruction device 108 and can be referred to by the instructor 107 at any time, not only during the work.
  • The entire image display unit 311 displays the entire image of the work object 102 (the abstract data, e.g., the 3D model data 404, the 2D data 406, or the overhead view 407) using a plurality of parameters that specify the three-dimensional shape of the work object 102 (work object).
  • The entire image display unit 311 can display the whole image using data indicating a three-dimensional model of the work object 102 (the three-dimensional model data 404) or data obtained by developing the three-dimensional model on a two-dimensional plane (the two-dimensional data 406).
  • the entire image of the work object displayed by the entire image display unit 311 using the two-dimensional data 406 can include an image (overhead view 407) showing a situation around the work object.
  • FIG. 4A schematically represents a live-action image 400 when the worker 101 captures the work target 102.
  • FIGS. 4B to 4D respectively show the result of drawing the imaging range 405a on the three-dimensional model space 401, the result of drawing the imaging range 405b on the development view, which is the two-dimensional data 406, and the result of drawing the imaging range 405c on the above-described overhead view 407, which is also two-dimensional data.
  • Reference numeral 403 represents an AR marker physically attached to the work target.
  • In a system that implements AR using an image recognition technique, the AR marker refers to a pattern that serves as a mark designating the position at which additional information (here, instruction mark information) is displayed. Typical examples include simple, clear black-and-white figures and QR (Quick Response) codes.
  • the AR marker is represented by a black framed square such as 403.
  • the shooting range calculation unit 306 calculates the shooting position and tilt of the work terminal 103 and the position of the shooting range using the information of the AR marker 403. The calculation method will be described later.
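  • The patent names ARToolKit (below) as one tool for this marker-based pose calculation; as a substitute illustration only, the sketch below uses OpenCV's ArUco module (opencv-contrib-python) together with cv2.solvePnP. The dictionary, marker size, and intrinsics file names are assumptions, and the ArUco function API varies between OpenCV versions.

```python
import cv2
import numpy as np

# A sketch, not the patent's implementation: detect an AR marker and estimate
# the terminal's pose relative to it. Assumes opencv-contrib-python; the
# legacy cv2.aruco function API is used (it differs in OpenCV >= 4.7).
MARKER_SIZE = 0.05  # assumed marker side length in meters
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

frame = cv2.imread("work_frame.png")            # frame from camera 103a
corners, ids, _ = cv2.aruco.detectMarkers(frame, dictionary)

if ids is not None:
    # 3D corner coordinates of the marker in its own coordinate system.
    half = MARKER_SIZE / 2.0
    object_pts = np.array([[-half,  half, 0], [ half,  half, 0],
                           [ half, -half, 0], [-half, -half, 0]],
                          dtype=np.float32)
    camera_matrix = np.load("intrinsics.npy")   # internal parameter A (assumed file)
    dist_coeffs = np.load("dist_coeffs.npy")    # lens distortion (assumed file)
    # rvec/tvec are the rotation and translation of the marker relative to the
    # camera: the "posture information" the external parameter unit derives.
    ok, rvec, tvec = cv2.solvePnP(object_pts, corners[0].reshape(4, 2),
                                  camera_matrix, dist_coeffs)
```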
  • The position information is transmitted to the instruction device 108 using a method such as that of the conventional technique, and the instruction mark is drawn superimposed at the appropriate position on the abstract data.
  • The means for synchronizing the position information of the instruction mark 105 on the work terminal 103 with that of the instruction mark 111 on the instruction device 108 is not related to the essence of one aspect of the present invention, and its explanation is therefore omitted.
  • reference numeral 408 indicates that the video sent from the work terminal 103 is synthesized and drawn on the abstract data.
  • The instructor 107 can freely adjust and select the composition position and size of the video 408 and whether it is displayed, and uses it for reference.
  • The instruction device 108 shares with the work terminal 103 the captured video of the work target imaged by the work terminal 103 (the video 408 on the instruction device 108 and the live-action video 400 on the work terminal 103) and the image indicating work-related information that is displayed superimposed on that video (the instruction mark 111 on the instruction device 108 and the instruction mark 105 on the work terminal 103).
  • the image 408 and the instruction mark 111 on the instruction device 108 correspond to the live-action image 400 and the instruction mark 105 on the work terminal 103, respectively.
  • the instruction mark displayed at a specific location of the work target 102 on the image including the work target 102 is displayed on both the instruction device 108 and the work terminal 103.
  • The instruction device 108 displays an entire image (for example, the 3D model data 404, the 2D data 406, or the overhead view 407) including an image for the user to grasp the three-dimensional shape of the work object 102, and displays, superimposed on that entire image, information indicating which part of the work object 102 is captured in the live-action video 400 captured by the work terminal 103.
  • The "information indicating which part of the work object 102 is captured in the live-action video 400 captured by the work terminal 103" is, for example, the imaging ranges 405a, 405b, and 405c illustrated in FIG. 4.
  • the shooting range calculation unit 306 receives a video signal from the data bus 300 as an input.
  • the imaging range calculation unit 306 includes an AR marker detection unit 501 that searches for the position of the AR marker 403 from the input video signal.
  • The shooting range calculation unit 306 also includes an external parameter calculation unit 502 that calculates position and tilt information of the work terminal 103 using geometric information (vertex coordinates, side lengths, area, diagonals, center coordinates, etc.) of the detected AR marker 403 and the internal parameters of the camera 103a of the work terminal 103.
  • The imaging range calculation unit 306 further includes an AR marker reference position input unit 503 that inputs, from the external input/output unit 304 to the imaging range coordinate calculation unit 504, the position on the three-dimensional model corresponding to the position of the AR marker 403 attached to the work target 102.
  • The shooting range calculation unit 306 also includes a shooting range coordinate calculation unit 504 that receives the inputs from the external parameter calculation unit 502 and the AR marker reference position input unit 503 and calculates the shooting range coordinates on the abstract data for drawing the shooting range.
  • the coordinate information output from the shooting range coordinate calculation unit 504 is output to the video composition unit 302 via the data bus 300, and is used to draw the shooting range at the corresponding coordinates on the abstract data.
  • The shooting range coordinate calculation unit 504 includes a drawing position calculation unit 601, a two-dimensional data conversion unit 603, a feature point matching processing unit 605, a shooting range projection conversion unit 606, and a calculation result output unit 607.
  • the shooting range coordinate calculation unit 504 further includes a three-dimensional model data determination unit 602 and an overhead view determination unit 604.
  • The drawing position calculation unit 601 uses the position and inclination information of the work terminal 103 (hereinafter referred to as posture information) input from the external parameter calculation unit 502 and the pre-calculated internal parameter information of the camera 103a of the work terminal 103 to calculate, by means described later, the coordinate information on the three-dimensional model space 401 in which the photographing range 405a is drawn, and sends it to the three-dimensional model data determination unit 602.
  • Alternatively, the position and tilt of the work terminal 103, detected by the work terminal 103 using a known positioning system and sensors, may be sent from the work terminal 103 to the instruction device 108.
  • In that case, the drawing position calculation unit 601 may calculate the coordinate information on the three-dimensional model space 401 where the shooting range 405a is drawn using the posture information input from the communication unit 301 instead of from the external parameter calculation unit 502.
  • the 3D model data determination unit 602 has a role of changing a destination to which the result of the drawing position calculation unit 601 is transferred depending on whether the abstract data viewed by the instructor 107 is 3D model data or 2D data.
  • When the abstract data is the three-dimensional model data, the 3D model data determination unit 602 passes the result of the drawing position calculation unit 601 directly to the calculation result output unit 607.
  • When the abstract data is two-dimensional data, the three-dimensional model data determination unit 602 passes the output result of the drawing position calculation unit 601 to the two-dimensional data conversion unit 603, where it is converted into two-dimensional data form.
  • the two-dimensional data conversion unit 603 passes the result of the conversion process to the calculation result output unit 607.
  • the overhead view determination unit 604 determines whether or not the abstract data referred to by the instructor 107 is the overhead view 407. When the abstract data referred to by the instructor 107 is the overhead view 407, the overhead view determination unit 604 passes the posture information and the internal parameters of the camera 103a to the feature point matching processing unit 605 described later.
  • the feature point matching processing unit 605 performs feature point detection processing and association processing on the video image sent from the work terminal 103 and the overhead view 407 referred to by the instructor 107. The result obtained at this time is sent to the shooting range projection conversion unit 606.
  • The feature point extraction process extracts feature points indicating appearance features of the work object 102 (for example, pixels where a plurality of edges meet), and the association process associates those feature points with the feature points in the overhead view 407.
  • The shooting range projection conversion unit 606 performs projective conversion to the coordinates on the overhead view 407 corresponding to the range visible in the video sent from the work terminal 103, and passes the result to the calculation result output unit 607.
  • The calculation result output unit 607 determines, according to the form of the abstract data that the instructor 107 is browsing, which of the result of the drawing position calculation unit 601, the result of the two-dimensional data conversion unit 603, and the result of the shooting range projection conversion unit 606 is output to the video composition unit 302.
  • FIG. 7 is a diagram showing a flow of processing in the instruction device 108.
  • the input data is stored in the storage unit 305 and is referred to in the process of step 8. Once the reference position has been input, this processing step can be skipped.
  • the drawing position calculation unit 601 calculates a coordinate value on which the imaging range 405a is to be drawn on the three-dimensional model space 401 by a method described later.
  • the drawing position calculation unit 601 passes the calculated coordinate value to the calculation result output unit 607 and proceeds to step 9.
  • Step 10 (S10): When it is determined in step 9 that the abstract data being browsed by the instructor 107 is not the 3D model data 404, the shooting range coordinate calculation unit 504 passes the coordinate information to the 2D data conversion unit 603, which converts the imaging range into a two-dimensional data format by a method described later. The converted two-dimensional data is transferred to the calculation result output unit 607 and then output to the video composition unit 302, and the control unit 307 advances the process to step 12.
  • The shooting range coordinate calculation unit 504 uses the feature point matching processing unit 605 to perform feature point extraction and feature point association for both the received video and the overhead view 407 stored in the storage unit 305. Next, the shooting range coordinate calculation unit 504 inputs the set of feature points associated by the feature point matching processing unit 605 to the shooting range projection conversion unit 606, which performs projection conversion of the coordinates of the shooting range by a method described later.
  • the shooting range coordinate calculation unit 504 passes the conversion result by the shooting range projection conversion unit 606 to the calculation result output unit 607, and then outputs it to the video composition unit 302.
  • the control unit 307 advances the process to step 12.
  • When transition is made from step 3 directly to this step (when the AR marker is not detected and the process moves on without calculating the shooting range), the process of compositing the visualization information is not performed. Thereafter, the control unit 307 advances the process to step 13.
  • Step 131 (S131): Whole image display step (video composition step):
  • the video composition unit 302 uses the entire image display unit 311 to display the abstract data currently referred to by the instructor 107 via the display unit 303. Thereafter, the process proceeds to step 132.
  • Step 132 (S132): Shooting range display step (video composition step):
  • The video composition unit 302 uses the photographing range display unit 312 to display, via the display unit 303, the visualization information indicating the shooting range composited in step 12, superimposed on the abstract data displayed in step 131. Thereafter, the process proceeds to step 14.
  • the control unit 307 may be configured to determine that the process is not continued when communication from the work terminal 103 is interrupted, and to continue the process otherwise.
  • The processing contents of the instruction device 108 described so far can be organized as follows. That is, the processing executed by the instruction device 108 includes a whole image display step (step 131) of displaying a whole image (for example, the 3D model data 404, the 2D data 406, or the overhead view 407) including an image for the user to grasp the three-dimensional shape of the work object 102 (work object), and a shooting range display step (step 132) of displaying, superimposed on the whole image, information (for example, the shooting ranges 405a, 405b, and 405c) indicating which part of the work object 102 is captured in the live-action video 400 (captured video).
  • the external parameter calculation unit 502 calculates the posture information of the work terminal 103.
  • The external parameter calculation unit 502 can calculate external parameters (the posture information) representing the positional relationship and tilt from the camera 103a to the work target 102, from internal parameters, which are information unique to the camera interior, such as the characteristics of the imaging sensor of the camera 103a and lens distortion information.
  • The internal parameters of the camera 103a of the work terminal 103 may be calculated using a general camera calibration method provided in a general-purpose tool such as ARToolKit or a general-purpose library such as OpenCV (Open Source Computer Vision Library).
  • the parameters of the camera 103a of the work terminal 103 may be calibrated in advance, or the worker 101 may perform calibration processing before work.
  • The internal parameters obtained by the above-described means are held in the work terminal 103 and in the storage unit 305 of the instruction device 108.
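  • As one concrete possibility (the text only requires "a general camera calibration method"), the internal parameters can be estimated with OpenCV's chessboard calibration; the board size and file names below are assumptions, not taken from the patent.

```python
import cv2
import numpy as np
import glob

# Sketch of internal-parameter calibration with OpenCV, assuming a 9x6
# chessboard and hypothetical capture files named calib_*.png.
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("calib_*.png"):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# camera_matrix holds f, cx, cy (the internal parameter A of Equation 1);
# dist_coeffs holds the lens distortion information mentioned in the text.
ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
np.save("intrinsics.npy", camera_matrix)
np.save("dist_coeffs.npy", dist_coeffs)
```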
  • The external parameter calculation unit 502 refers to the internal parameters of the camera 103a of the work terminal 103 held in the storage unit 305 and calculates the posture information of the work terminal 103.
  • The posture information of the work terminal 103 calculated by the external parameter calculation unit 502 is described by a rotation matrix and a translation vector that convert a point on the coordinate axes originating at the center of the AR marker 403 to a point on the coordinate axes originating at the optical axis origin of the camera 103a.
  • The rotation matrix and the translation vector can be calculated using ARToolKit.
  • ARToolKit also has a function of calculating the coordinates of the center of the AR marker 403 in the coordinate system centered on the optical axis origin of the work terminal 103; it is therefore convenient to obtain these together with the rotation matrix and the translation vector for use in the imaging range coordinate calculation unit 504 described later.
  • In this way, the positional relationship between the work terminal 103 and the AR marker 403 is obtained and can be synchronized. The following calculation is described on the assumption that the positional relationships among the work terminal 103, the work object 102, the viewpoint of the instructor 107 in the 3D model space, and the 3D model data 404 are synchronized.
  • FIG. 8 schematically shows a state in which the camera 103a is placed in front of the work object 102 and is photographed.
  • Reference numeral 801 denotes a video shot including the work target 102, and 104 corresponds to the work target 102 in the video.
  • The drawing position of the shooting range 802 can be derived using the posture information of the work terminal 103 and the internal parameter information of the camera 103a. Specifically, when a specific pixel on the image of the camera 103a is expressed as P_i(u_i, v_i) (i is a pixel index number), the coordinates in the three-dimensional model space corresponding to P_i can be obtained by perspective projection conversion using the internal parameter A of the camera, as in the following (Equation 1).
  • m in (Equation 1) represents a scale variable, and is uniquely determined according to the distance from the optical axis origin of the camera 103a along a straight line perpendicular to the projection plane of the camera 103a. In the internal parameter A, f is the focal length, cx and cy are the optical center, and s is the distortion (skew) coefficient of the lens.
  • s may be set to 0 and cx and cy may be set to the center of the image for simplification of the calculation.
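  • (Equation 1) itself is not reproduced in this text. A plausible reconstruction, assuming the standard perspective projection model implied by the parameters listed above (scale m, internal parameter A with f, cx, cy, s, and the posture information given by the rotation matrix R and translation vector t), is:

```latex
% Plausible reconstruction of (Equation 1): standard perspective projection,
% with m the scale variable and A the internal (intrinsic) parameter matrix.
m \begin{pmatrix} u_i \\ v_i \\ 1 \end{pmatrix}
  = A \, [\, R \mid t \,]
    \begin{pmatrix} X_i \\ Y_i \\ Z_i \\ 1 \end{pmatrix},
\qquad
A = \begin{pmatrix} f & s & c_x \\ 0 & f & c_y \\ 0 & 0 & 1 \end{pmatrix}
```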
  • Using the drawing position calculation unit 601, the shooting range coordinate calculation unit 504 can thus calculate, based on the video 701 received from the work terminal 103, the shooting range 405a on the three-dimensional model data 404 of the work target 102.
  • The drawing of the shooting range on the two-dimensional data is performed after the drawing position calculation unit 601 has once calculated the coordinates of the shooting range on the three-dimensional model space 401.
  • The viewpoint is placed at the position 901 and the work target 102 is photographed; the shooting range at that time is visualized three-dimensionally as in 902.
  • The shooting range 902 at this time is pasted as a texture on the 3D model data 404 and then converted into a two-dimensional data format using a method described later, so that the instructor 107 can refer to the shooting range 902 as two-dimensional data.
  • For converting the data obtained by projecting the imaging range 902 onto the surface of the 3D model data 404, a CAD (Computer-Aided Design) drafting tool such as AutoCAD can be used.
  • Some CAD drafting tools have a function of converting the three-dimensional model data 404 into various two-dimensional drawings while associating the positional relationship with the coordinates on the original three-dimensional model data 404. Therefore, after the 3D model data 404 that has undergone the projection conversion of the imaging range 902 is converted into a format readable by the CAD drafting tool and the coordinate data of the imaging range 902 is handed over, a shooting range on the two-dimensional data can be generated.
  • the CAD drafting tool can generate two-dimensional data in which the shooting range is drawn from the three-dimensional model data 404 having the shooting range 902.
  • That is, the three-dimensional model data 404 is input to the CAD drafting tool for conversion processing, and the coordinate values of the imaging range on the two-dimensional data are received.
  • The result obtained by inputting the 3D model data 404 with the imaging range 902 pasted as a texture to the two-dimensional data conversion unit 603 is, for example, the imaging range 907 drawn on the development view 905 of the video 903.
  • the shooting range drawn on the three-view diagram 906 of the video 904 is as shown in 909.
  • the shooting range coordinate calculation unit 504 uses the feature point matching processing unit 605 to perform feature point matching processing between an image showing a partial portion of the work object 102 shot by the camera 103a and the overhead view 407. By performing this, it becomes possible to specify the shooting range.
  • A feature point is, for example, a pixel where a plurality of edges meet, and the feature point information can be calculated using, for example, SURF (Speeded Up Robust Features).
  • the feature point information is positional information of the detected feature points in the image coordinates and description information (feature amount) that can identify the feature points.
  • the feature point extraction processing executed by the feature point matching processing unit 605 is not limited to the method using SURF.
  • In FIG. 10, there is an overhead view 1001 showing the entire image of the work object 102, and the state when the worker 101 photographs the work object 102 during work is assumed to be the video 1002. At this time, it is assumed that N feature points are detected from the overhead view 1001 and M feature points from the video 1002.
  • the feature point matching processing unit 605 executes a feature point association process, and associates the detected feature points with each other, thereby associating the shooting range of the video 1002 with the corresponding range of the overhead view 1001. It can be carried out.
  • The feature point association process may, for example, associate feature points for which the feature point information and the combination of relative change amounts with respect to adjacent feature points are the same.
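  • As an illustrative sketch of the extraction and association processes: SURF lives in OpenCV's non-free contrib module, so ORB is substituted here as a freely available detector, and the image file names are assumptions.

```python
import cv2

# Sketch of feature point extraction and association, substituting ORB for
# the SURF detector named in the text (SURF requires OpenCV's non-free module).
overhead = cv2.imread("overhead_1001.png", cv2.IMREAD_GRAYSCALE)  # overhead view 1001
frame = cv2.imread("video_1002.png", cv2.IMREAD_GRAYSCALE)        # video 1002

orb = cv2.ORB_create(nfeatures=1000)
kp_overhead, des_overhead = orb.detectAndCompute(overhead, None)  # N feature points
kp_frame, des_frame = orb.detectAndCompute(frame, None)           # M feature points

# Association process: match descriptors between the two images.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_frame, des_overhead), key=lambda m: m.distance)

# Corresponding pixel pairs (frame -> overhead) used for the homography later.
src_pts = [kp_frame[m.queryIdx].pt for m in matches]
dst_pts = [kp_overhead[m.trainIdx].pt for m in matches]
```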
  • The conversion can be performed using the following conversion formula (Equation 2): the pixel (m_i, n_i) on the video 1002 photographed by the worker 101 is converted to the pixel (m_i', n_i') on the overhead view 1001 (homography conversion 1004).
  • H* in this homography transformation is a 3 × 3 matrix and is called a homography matrix.
  • a homography matrix is a matrix that can projectively transform two images.
  • the homography matrix H * can be expressed, for example, as shown in (Equation 3) below.
  • The imaging range coordinate calculation unit 504 uses the imaging range projection conversion unit 606 to obtain the value of each of the 3 × 3 elements so as to minimize the coordinate conversion error according to (Equation 3).
  • Each element is calculated so as to minimize the value of the following (Equation 4).
  • argmin() is a function that returns the parameter written under argmin that minimizes the value in parentheses. Specifically, the combination of the elements h_11 to h_33 of the homography matrix H* is calculated so that the value in parentheses in (Equation 4) is minimized.
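  • (Equation 2) through (Equation 4) did not survive extraction; a plausible reconstruction consistent with the surrounding description is:

```latex
% Plausible reconstruction of (Equation 2): homography transformation of a
% pixel of video 1002 to overhead view 1001 (lambda is the projective scale).
\lambda \begin{pmatrix} m_i' \\ n_i' \\ 1 \end{pmatrix}
  = H^{*} \begin{pmatrix} m_i \\ n_i \\ 1 \end{pmatrix}
% (Equation 3): the 3x3 homography matrix.
H^{*} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\
                        h_{21} & h_{22} & h_{23} \\
                        h_{31} & h_{32} & h_{33} \end{pmatrix}
% (Equation 4): least-squares estimate over the associated feature point
% pairs, where pi() denotes the perspective division (x, y, w) -> (x/w, y/w).
\hat{H}^{*} = \operatorname*{argmin}_{H^{*}}
  \sum_{i} \left\| \begin{pmatrix} m_i' \\ n_i' \end{pmatrix}
  - \pi\!\left( H^{*} \begin{pmatrix} m_i \\ n_i \\ 1 \end{pmatrix} \right)
  \right\|^{2}
```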
  • the shooting range coordinate calculation unit 504 can convert each pixel on the video 1002 captured by the worker 101 into a pixel on the overhead view 1001 using the shooting range projection conversion unit 606.
  • To reduce the processing load, the pixels (m, n) converted by the shooting range projection conversion unit 606 may be limited to the outermost pixels of the video 1002.
  • Alternatively, all pixel positions may be converted and visualized by replacing the luminance value of the pixel (m', n') on the overhead view 1001 with that of the pixel (m, n) on the video 1002.
  • For example, suppose that, as a result of performing the homography conversion 1004 by the calculation of (Equation 2), the upper-left coordinate 1003 of the video 1002 is projectively transformed to the position of the coordinate 1005 on the overhead view 1001.
  • the lower left, lower right, and upper right pixels in the image 1002 are also subjected to homography conversion and associated with the pixels on the overhead view 1001.
  • the shooting range of the work terminal 103 can be visualized on the overhead view 1001.
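  • Continuing the feature-matching sketch above, one way to realize this corner mapping is OpenCV's homography estimation; RANSAC and the drawing step are illustrative choices, not taken from the patent.

```python
import cv2
import numpy as np

# Estimate the homography H* from the associated feature point pairs of the
# earlier sketch (RANSAC rejects bad associations), then project the four
# corners of video 1002 onto overhead view 1001.
H, inliers = cv2.findHomography(np.float32(src_pts), np.float32(dst_pts),
                                cv2.RANSAC, 5.0)

h, w = frame.shape[:2]
corners = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]])
projected = cv2.perspectiveTransform(corners.reshape(-1, 1, 2), H)

# Draw the projected shooting range as a quadrilateral on the overhead view.
overhead_vis = cv2.cvtColor(overhead, cv2.COLOR_GRAY2BGR)
cv2.polylines(overhead_vis, [np.int32(projected)], isClosed=True,
              color=(0, 0, 255), thickness=2)
```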
  • This processing is performed every time the video 1002 is received and the posture of the camera 103a in the work terminal 103 has changed, and the position of the shooting range visualized on the overhead view 1001 changes accordingly. The change in the position of the shooting range may be handled in the same manner when the instruction device 108 refers to the three-dimensional model data 404 or the two-dimensional data as when it refers to the overhead view 1001.
  • As described above, the instruction device 108 displays an entire image including an image for the user to grasp the three-dimensional shape of the work object photographed by the work terminal 103, and can display, superimposed on the entire image, information indicating which part of the work object is captured in the captured video. Accordingly, the instructor 107 can give a work instruction after grasping the whole of the work object 102 by using the instruction device 108, so that a discrepancy in recognition between the instructor 107 and the worker 101 is unlikely to occur.
  • The work terminal 103 includes a communication unit 1101, a video composition unit 1102, a display unit 1103, an external input/output unit 1104, a storage unit 1105, a video acquisition unit 1106, a control unit 1107, and a data bus 1100.
  • The work terminal 103 differs from the block configuration of the instruction device 108 in FIG. 3 in that the work terminal 103 includes the video acquisition unit 1106 and has no shooting range calculation unit 306. The other components are the same in the work terminal 103 and the instruction device 108: the communication unit 1101 corresponds to the communication unit 301, the video composition unit 1102 to the video composition unit 302, the display unit 1103 to the display unit 303, the external input/output unit 1104 to the external input/output unit 304, and the storage unit 1105 to the storage unit 305.
  • the control unit 1107 and the data bus 1100 have the same functions as the control unit 307 and the data bus 300, respectively.
  • By executing the series of processes, the work terminal 103 according to the present embodiment shoots a video including at least a part of the work object 102 and transmits it to the instruction device 108. The work terminal 103 can then receive the instruction mark information indicating the instruction content input on the instruction device 108, and can visualize and display it as the instruction mark 105. Accordingly, the worker 101 can check the instruction mark 105 displayed in the video obtained by photographing the work target 102 with the work terminal 103, and can therefore perform the work with little risk of a recognition discrepancy between the instructor 107 and the worker 101.
  • As described above, the remote communication system 1 visualizes and displays the shooting range of the video shot by the work terminal 103 on the abstract data possessed by the instruction device 108. The instructor 107 can then give a work instruction using an instruction mark after grasping the whole of the work object 102. Therefore, it is possible to provide the remote communication system 1 in which a discrepancy in the recognition of the instruction content between the instructor 107 and the worker 101 is unlikely to occur.
  • In the present embodiment, an appropriate display is performed even when the work object 102 appears in only a part of the range photographed by the worker 101. In the method described in the first embodiment, the imaging range is drawn according to the shape of the work object 102 using the distance value from the optical axis origin of the camera 103a to the work object 102.
  • When the shooting range includes the background, however, the coordinate position (Z value) at which to render the imaging range is not fixed, and rendering becomes difficult. This is because the coordinate position (Z value) corresponding to the depth from the optical axis origin of the camera 103a to the background is not constant.
  • Therefore, the remote communication system 1 according to the present embodiment does not draw the imaging range in accordance with the shape using the distance value to the work target, but instead draws a plane representing the imaging range at the position of the distance L from the optical axis origin of the camera 103a to the AR marker 403, thereby addressing the above-described problem.
  • That is, the present embodiment differs from the above-described embodiment in that the shooting range display unit 312 displays a plane (the shooting range 1303) indicating the shooting range of the captured video (the video 408 on the instruction device 108 and the live-action video 400 on the work terminal 103) superimposed on the whole image (for example, the three-dimensional model data 404, the two-dimensional data 406, or the overhead view 407).
  • the imaging range determined in this way is a plane that is parallel to the projection surface of the camera 103a and changes in size according to the distance to the AR marker.
  • FIG. 13 schematically shows the result of drawing the photographing range based on the coordinate values obtained by the projective transformation formula of (Equation 1').
  • The shooting range of the video 1301 shot by the work terminal 103 is drawn, in the three-dimensional model space 401, at the position of the distance L from the viewpoint 1302 of the instructor 107 to the AR marker 403' on the three-dimensional model data 404 corresponding to the AR marker 403 on the work target 102. In the illustrated example, the imaging range is drawn as 1303 in parallel with the projection plane of the work terminal 103.
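  • (Equation 1') is likewise not reproduced in this text. One consistent reading, given that the plane is parallel to the camera's projection plane at the marker distance L, is that each image corner is back-projected to the fixed depth L in the camera coordinate system; this is an interpretation, not a formula copied from the patent.

```latex
% Plausible reconstruction of (Equation 1'): back-project each image corner
% P_i(u_i, v_i) to the fixed depth L (distance to the AR marker), yielding a
% plane parallel to the projection plane of the camera 103a; the posture
% information (R, t) then places this plane in the three-dimensional model space.
\begin{pmatrix} X_i^{c} \\ Y_i^{c} \\ Z_i^{c} \end{pmatrix}
  = L \, A^{-1} \begin{pmatrix} u_i \\ v_i \\ 1 \end{pmatrix}
```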
  • The information used to calculate the drawing position of the shooting range 1303 is not limited to the distance value L from the viewpoint 1302 to the position where the AR marker 403' is applied; any distance value that represents a specific surface may be used.
  • For example, a representative surface of the work object 102, or the surface of the work object 102 closest to the camera 103a, may be extracted and the distance to it used.
  • In this way, a plane representing the shooting range is uniquely obtained, so that the instruction device 108 can provide the instructor 107 with an image visualizing the shooting range regardless of the presence or absence of a background.
  • As described above, the instruction device 108 can display a plane indicating the shooting range of the captured video captured by the work terminal 103 superimposed on the entire image. Therefore, even if the video captured by the work terminal 103 includes a background, the instruction device 108 can provide the remote communication system 1 in which a discrepancy in the recognition of the instruction content between the instructor 107 and the worker 101 is unlikely to occur.
  • the remote communication system 1 according to the present embodiment is different from the above embodiments in that the imaging range drawing range is extended using a plurality of AR markers.
  • In the present embodiment, the work terminal 103 photographs the work object 102 in a state where a plurality of AR markers are pasted on the work object 102 and the corresponding marker positions are designated on the three-dimensional model data 404.
  • When a plurality of AR markers are used, the posture of the work terminal 103 is calculated for each marker ID using (Equation 1) and (Equation 1').
  • In the first and second embodiments, it is necessary to shoot so that the one AR marker is always visible on the screen, which greatly limits the movement range of the work terminal 103.
  • In the present embodiment, the worker 101 only needs to shoot so that at least one of the plurality of AR markers appears in the video of the work terminal 103, so the degree of freedom of the work improves.
  • In FIG. 14, AR markers 1402 and 1403 are pasted on the work target 1401, and 1402' and 1403' are input as the corresponding reference positions on the three-dimensional model data 404.
  • When the video 1404 is shot, the photographing range is drawn at the position 1406 in the three-dimensional model space; when the video 1405 is shot, the shooting range is drawn at the position 1407.
  • When a plurality of AR markers appear in the video, the imaging range may be calculated using the calculation result of the attitude of the work terminal 103 with respect to any one of the AR markers.
  • if the video (video 1404, video 1405) taken by the work terminal 103 includes at least one of the plurality of AR markers (AR marker 1402, AR marker 1403) pasted on the work target 1401, the instruction device 108 can display a whole image including an image for the user to grasp the three-dimensional shape of the work target 1401 (for example, the three-dimensional model data 404, the two-dimensional data 406, or the overhead view 407), and can display, superimposed on the displayed whole image, information indicating which part of the work target 1401 is captured in the video taken by the work terminal 103.
  • rendering is performed assuming that there is one virtual AR marker, using the center of gravity of the coordinates of the individual AR markers and the average of the distances to them.
  • that is, the position P_v(X_v, Y_v, Z_v) at which the virtual AR marker exists is taken to be the centroid of the marker coordinates, and the distance used for drawing is the average of the per-marker distances.
  • in this way, when there are a plurality of AR markers, the pointing device 108 can calculate the plane of the shooting range from the coordinate position of the virtual AR marker, and the range over which work can be carried out is expanded.
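A minimal sketch of the virtual-marker computation described in the two items above, assuming the marker positions and distances are already expressed in the model space:

```python
import numpy as np

def virtual_marker(marker_positions, marker_distances):
    """Collapse several AR markers into one virtual marker.

    marker_positions : list of (X, Y, Z) model-space coordinates of each marker
    marker_distances : list of viewpoint-to-marker distance values
    Returns the centroid P_v and the average distance, as described above.
    """
    P_v = np.mean(np.asarray(marker_positions), axis=0)  # center of gravity of the coordinates
    L_v = float(np.mean(marker_distances))               # average of the distances to the markers
    return P_v, L_v
```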
  • as described above, in the present embodiment, for a work target 1401 bearing a plurality of AR markers (AR marker 1402 and AR marker 1403), the remote communication system 1 visualizes the shooting range of video (video 1404, video 1405) that includes at least one of the AR markers captured by the work terminal 103, and displays it on the abstract data (for example, as positions 1406 and 1407 illustrated in FIG. 14).
  • the instructor 107 can therefore give work instructions using instruction marks after grasping the whole of the work target 1401. Accordingly, there is an effect that it is possible to provide a remote communication system 1 in which the video need only include at least one AR marker and discrepancies in the recognition of the instruction content are less likely to arise between the instructor 107 and the worker 101.
  • FIG. 15 shows drawing examples of the shooting range and auxiliary information on the 3D model data 404 in the 3D model space 401, the 3D model data being one example of abstract data.
  • the drawing example 1501 of the orientation of the work terminal 103 shows a state in which visual information indicating the orientation of the work terminal 103 is composited and displayed in accordance with the rotation of the shooting range 405a.
  • orientation information 1505 is drawn in the shooting range on the three-dimensional model space 401 as auxiliary information indicating the upward direction of the work terminal 103. That is, in the present embodiment, the imaging range display unit 312 can display information (orientation information 1505) indicating the orientation of the work terminal 103 at the time when the work terminal 103 images the work target 102.
  • for example, even if the work terminal 103 is held vertically, without the orientation information 1505 the instructor 107 cannot tell whether the resulting view is rotated to the left or to the right.
  • this orientation information 1505 is effective when instructing in which direction the work terminal 103 should be moved.
  • the rotation of the work terminal 103 can be calculated using a tilt sensor provided in the work terminal 103.
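As an illustration only (the patent states only that a tilt sensor in the work terminal 103 is used), the roll of the terminal can be derived from the gravity vector reported by an accelerometer; the axis convention below is an assumption:

```python
import math

def terminal_roll_degrees(ax, ay):
    """Estimate the screen roll of the work terminal from gravity.

    ax, ay : accelerometer readings along the screen's x and y axes
             (a hypothetical tilt-sensor interface).
    Returns roughly 0 when the terminal is upright; the sign of the result
    distinguishes left from right rotation, which is what the orientation
    information 1505 conveys to the instructor.
    """
    return math.degrees(math.atan2(-ax, ay))
```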
  • as the orientation information 1505 for the imaging range 405a, for example, a method of changing the color of the upper boundary line to a color different from that of the other three boundary lines may be used.
  • a drawing example also shows a state in which the captured video 1506 is pasted onto the shooting range.
  • the abstract data possessed by the instructor 107 and the work target 102 actually photographed by the worker 101 do not necessarily match in characteristics such as shape, color, and size. For example, differences may arise from various factors such as the environment in which the work target 102 is placed and the passage of time.
  • with this display, the instructor 107 can confirm the difference between the abstract data and the actual object (work target 102).
  • the remote communication system 1 otherwise has no means of determining where the worker 101 is currently looking.
  • when only the captured video is superimposed on the shooting range 405a, the instructor 107 may continue giving instructions without noticing that the worker 101, while working, is not checking the screen of the work terminal 103.
  • that is, even if it is understood that the worker 101 is holding the work terminal 103 and listening to the instructor 107's explanation, it does not necessarily follow that the worker is watching the work terminal 103. Therefore, the instruction mark placed by the instructor 107 may be missed.
  • when the line-of-sight information 1507 is within the imaging range 405a, it can be determined that the worker 101 is gazing at the screen, which makes it easy to grasp the timing at which the instruction mark placed by the instructor 107 becomes visible to the worker.
  • here the line-of-sight information 1507 is expressed as a circle, but its shape and color are not particularly limited. When the worker gazes at a point for a long time, the size of the circle may be gradually increased, or the instruction mark may be made to follow the line of sight.
  • Gaze information can be obtained using general-purpose tools such as OpenGazer, an open-source gaze-tracking library, or the EyeX dev kit, an eye-tracker tool provided by Tobii Technology. The devices necessary for this can be incorporated into the configuration of the present invention.
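Once a gaze point is available from such a tool, the gazing decision of the preceding items reduces to a point-in-rectangle test; the interfaces below are hypothetical stand-ins:

```python
def worker_is_gazing(gaze_point, range_rect):
    """Decide whether the gaze point 1507 lies inside the imaging range 405a.

    gaze_point : (x, y) screen coordinates from a gaze tracker
                 (e.g. obtained via OpenGazer or a Tobii SDK; interface assumed)
    range_rect : (x_min, y_min, x_max, y_max) of the drawn imaging range
    """
    x, y = gaze_point
    x0, y0, x1, y1 = range_rect
    return x0 <= x <= x1 and y0 <= y <= y1
```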
  • the instructor 107 can designate a specific position on the shooting range 405a so that auxiliary electronic object information 1508a is superimposed on part of the shooting range 405a or on its entire surface.
  • the electronic object information 1508a is used as a guide when the instructor 107 gives instructions, or for the purpose of presenting to the worker 101 materials that supplement the explanation of the instruction content.
  • the electronic object information 1508b may be drawn not only on the three-dimensional model space 401 but also on the screen of the work terminal 103, or on both.
  • the electronic object information may be text information, an image appropriate to the content of the instruction, a moving image, or any form that can be expressed in two dimensions, such as a design drawing or a map.
  • three-dimensional information may also be drawn, such as a reduced view of the three-dimensional model data or model data of the work location or the part to be worked on.
  • as described above, in the present embodiment, the remote communication system 1 visualizes the shooting range of the video captured by the work terminal 103 (the video 408 on the pointing device 108 and the live-action video 400 on the work terminal 103) together with auxiliary information, so that the instructor 107 can give work instructions with instruction marks (instruction information; the instruction mark 111 on the instruction device 108 and the instruction mark 105 on the work terminal 103) after grasping the whole of the work target 102 (for example, via the three-dimensional model data 404, the two-dimensional data 406, or the overhead view 407). Therefore, there is an effect that it is possible to provide a remote communication system 1 in which discrepancies in the recognition of the instruction content are less likely to arise between the instructor 107 and the worker 101.
  • the work terminal may also be of an automatically moving type (for example, an automatically traveling type or an automatically flying type); here, a telecommunications system including an automatically traveling robot as the work terminal together with the pointing device 108 of the first embodiment described above is illustrated.
  • the robot differs from the work terminal 103 of the first embodiment described above in that it includes a working unit in addition to the components of the work terminal 103 of the first embodiment.
  • the robot does not necessarily have a configuration that allows the instruction information transmitted from the instruction device side to be grasped visually.
  • the robot may be configured such that, when instruction information is received, the working unit operates based on the content of the instruction information.
  • the working unit includes a traveling control unit that controls the traveling of the robot, and may further include, for example, a working arm and a control unit for the working arm.
  • the instruction information transmitted from the instruction device 108 to the robot includes the movement position, posture, work content, and the like of the working unit included in the robot.
  • that is, information necessary for the work-terminal side to perform the work is transmitted from the instruction device 108 to the robot as instruction information for the working unit.
  • also in this case, the shooting range of the video shot by the robot is visualized and displayed on the abstract data possessed by the pointing device 108, and the instructor 107 can set instruction information with instruction marks after grasping the whole of the work target 102. Therefore, there is an effect that it is possible to provide a telecommunications system in which discrepancies in the recognition of the instruction content are less likely to arise between the instructor 107 and the robot.
  • any method can be used as long as it can associate the captured video with the coordinate system of the abstract data and calculate the attitude.
  • for example, SFM (Structure From Motion) may be used to obtain the relative attitude of the work terminal 103 while simultaneously performing a 3D reconstruction of the surroundings, and the obtained 3D reconstruction result is then aligned with the abstract data.
  • for this alignment, for example, ICP (Iterative Closest Point) technology can be used.
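As one possible concrete form of this alignment step (an assumption, not the patent's specified implementation), the ICP registration offered by the Open3D library could be applied to the SFM point cloud and the abstract 3D model data:

```python
import open3d as o3d

def align_reconstruction_to_model(sfm_points, model_points, voxel=0.01):
    """Align an SFM reconstruction with the abstract 3D model data via ICP.

    sfm_points, model_points : open3d.geometry.PointCloud instances
    Returns the 4x4 rigid transform mapping the reconstruction onto the model.
    """
    src = sfm_points.voxel_down_sample(voxel)   # downsample for a faster, stabler fit
    dst = model_points.voxel_down_sample(voxel)
    result = o3d.pipelines.registration.registration_icp(
        src, dst, voxel * 5,  # maximum correspondence distance
        estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation
```

A coarse initial alignment (for example from the marker-based pose) would normally be passed as the `init` argument, since plain ICP only converges locally.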
  • the information that the instructor 107 can view in each embodiment may be configured so that content other than the abstract data can be displayed at the same time.
  • the video captured by the work terminal 103 and the abstract data of the work target 102 may be displayed side by side.
  • the display form may be a form in which the abstract data is superimposed on the video.
  • the instruction mark information may be displayed in a synthesized form with one or both of the captured video and abstract data.
  • that is, the imaging range display unit 312 can display the captured video in the portion of the whole image of the work target 102 (for example, the 3D model data 404, the 2D data 406, or the overhead view 407) that corresponds to the portion captured in the video of the work terminal 103 (the video 408 on the pointing device 108 and the live-action video 400 on the work terminal 103).
  • in the above embodiments, each component for realizing the functions is described as a distinct part, but an actual implementation need not have parts that can be clearly separated and recognized in this way.
  • for example, the remote communication system 1 realizing the functions of the above embodiments may be configured with a separate part for each functional component.
  • alternatively, all the components may be mounted on a single LSI (Large Scale Integration, large-scale integrated circuit); that is, whatever the mounting form, it suffices that each component is included as a function.
  • each constituent element of one embodiment of the present invention can be arbitrarily selected, and an invention having a selected configuration is also included in one embodiment of the present invention.
  • a program for realizing the functions described in the above embodiments may be recorded on a computer-readable recording medium, and the processing may be performed by reading the program recorded on the recording medium into a computer system and executing it.
  • the “computer system” here includes an OS (Operating System) and hardware such as peripheral devices.
  • the “computer system” includes a homepage providing environment (or display environment) if a WWW (World Wide Web) system is used.
  • the “computer-readable recording medium” means a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk incorporated in a computer system. Furthermore, the “computer-readable recording medium” includes media that dynamically hold a program for a short time, such as a communication line when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and media that hold the program for a certain period of time in that case, such as a volatile memory inside a computer system serving as a server or a client. The program may realize only part of the functions described above, or may realize the functions described above in combination with a program already recorded in the computer system.
  • The pointing device (108) according to aspect 1 of the present invention includes: a receiving unit that receives a captured image (the live-action video 400, the video 408) of the work target (102) captured by the work terminal (103); a transmitting unit that transmits to the work terminal (103) instruction information, that is, an image showing information related to the work; an overall image display unit (311) for displaying a whole image (video 110) including an image for the user to grasp the three-dimensional shape of the work target (102); and an imaging range display unit (312) for displaying, superimposed on the whole image, information (imaging range 405a, imaging range 405b, imaging range 405c) indicating which part of the work target (102) is captured in the captured image.
  • According to the above configuration, the pointing device displays a whole image including an image for the user to grasp the three-dimensional shape of the work target, and can display, superimposed on that whole image, information indicating which part of the work target is captured in the image taken by the work terminal. Since the shooting range of the video taken by the work terminal is displayed superimposed on the whole image of the work target, the instructor can give work instructions after grasping the whole of the work target, and there is an effect that it is possible to provide a remote communication system in which discrepancies in the recognition of the instruction content are less likely to arise between the instructor and the worker.
  • In the pointing device (108) according to aspect 2 of the present invention, the shooting range display unit (312) displays a plane (shooting range 1303) indicating the shooting range of the captured video superimposed on the whole image.
  • According to the above configuration, the pointing device displays a plane indicating the shooting range of the video captured by the work terminal superimposed on the whole image of the work target. Therefore, there is an effect that it is possible to provide a remote communication system in which, even if the video captured by the work terminal includes a background, discrepancies in the recognition of the instruction content are less likely to arise between the instructor and the worker.
  • In the pointing device (108) according to aspect 3 of the present invention, the shooting range display unit (312) displays the captured image in the portion of the whole image corresponding to the portion captured in the captured video.
  • According to the above configuration, the pointing device can display the captured image in the portion of the whole image of the work target corresponding to the portion captured in the video taken by the work terminal. Therefore, the instructor can give work instructions while comparing the actual video captured by the work terminal with the whole image of the work target, and there is an effect that it is possible to provide a remote communication system in which discrepancies in the recognition of the instruction content are less likely to arise between the instructor and the worker.
  • In the pointing device (108) according to aspect 4 of the present invention, the imaging range display unit (312) displays information (orientation information 1505) indicating the orientation of the work terminal (103) at the time when the work terminal (103) images the work target (102).
  • According to the above configuration, in addition to the shooting range of the video taken by the work terminal, the pointing device can display on the whole image of the work target information indicating the orientation of the work terminal when it images the work target. Therefore, there is an effect that it is possible to provide a remote communication system that uses the information indicating the shooting range and the orientation of the work terminal and in which discrepancies in the recognition of the instruction content are less likely to arise between the instructor and the worker.
  • In the pointing device (108) according to aspect 5 of the present invention, the whole image display unit (311) displays the whole image using a plurality of parameters that specify the three-dimensional shape of the work target (102).
  • According to the above configuration, the pointing device can display the whole image of the work target using a plurality of parameters specifying its three-dimensional shape, so the instructor can give work instructions after grasping the whole image displayed from those parameters. Therefore, there is an effect that it is possible to provide a remote communication system in which discrepancies in the recognition of the instruction content are less likely to arise between the instructor and the worker.
  • In the pointing device (108) according to aspect 6 of the present invention, the whole image display unit (311) displays the whole image using data representing a three-dimensional model of the work target (102) (three-dimensional model data 404) or data obtained by developing the three-dimensional model onto a two-dimensional plane (development diagram 905, three-view diagram 906).
  • According to the above configuration, the pointing device displays the whole image of the work target using data representing its three-dimensional model or data obtained by developing that model onto a two-dimensional plane. Therefore, the instructor can give work instructions after grasping the whole image of the work target, and there is an effect that it is possible to provide a remote communication system in which discrepancies in the recognition of the instruction content are less likely to arise between the instructor and the worker.
  • In the pointing device (108) according to aspect 7 of the present invention, the whole image includes an image (overhead view 407) showing the situation around the work target.
  • According to the above configuration, the pointing device can display the shooting range of the video captured by the work terminal superimposed on the image showing the situation around the work target. Therefore, the instructor can give work instructions after grasping the whole image of the work target, and there is an effect that it is possible to provide a remote communication system in which discrepancies in the recognition of the instruction content are less likely to arise between the instructor and the worker.
  • The method of controlling the pointing device (108) according to aspect 8 of the present invention includes: a receiving step of receiving a captured image of the work target (102) captured by the work terminal (103); a transmitting step of transmitting to the work terminal instruction information, that is, an image showing information related to the work; and a shooting range display step (S132) of displaying, superimposed on the whole image, information indicating which part of the work target is captured in the captured image. According to the above configuration, there is an effect similar to that of aspect 1.
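A minimal, hypothetical sketch of how the steps of aspect 8 might be sequenced in software; every interface name below is an assumption, not part of the patent:

```python
class InstructionDeviceController:
    """Sketch of the aspect-8 control method. The receiver, transmitter,
    display, and range_calculator objects are hypothetical stand-ins for the
    communication unit 301, display unit 303, and shooting range calculation
    unit 306 of the instruction device 108."""

    def __init__(self, receiver, transmitter, display, range_calculator):
        self.receiver = receiver
        self.transmitter = transmitter
        self.display = display
        self.range_calculator = range_calculator

    def step(self):
        frame = self.receiver.receive_video()        # receiving step
        whole = self.display.show_whole_image()      # display of the whole image
        rect = self.range_calculator(frame)          # shooting range display step (S132)
        self.display.overlay(whole, rect)
        mark = self.display.poll_instruction_input()
        if mark is not None:
            self.transmitter.send_instruction(mark)  # transmitting step
```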
  • The pointing device (108) according to each aspect of the present invention may be realized by a computer. In this case, a control program for the pointing device (108) that realizes the pointing device (108) by a computer by causing the computer to operate as each unit (software element) of the pointing device (108), and a computer-readable recording medium on which that control program is recorded, also fall within the scope of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Provided is a remote communication system with which discrepancies in the understanding of instruction details between an instructor and an operator do not readily occur. An instructing device (108) is provided with: a communication unit (301) which receives a captured image of an operation target object (102) captured by an operating terminal (103), and transmits instruction information to the operating terminal (103); and an image synthesizing unit (302) which causes a display unit to display information indicating an imaging zone of the captured image, superimposed on an image including an image of the entire operation target object (102) or a partial image thereof.

Description

Instruction device, instruction device control method, remote work support system, and information processing program
The present invention relates to a remote communication system that performs communication using remote communication technology. In particular, the present invention relates to an instruction device that constitutes the remote communication system and issues work instructions to a work terminal, a control method for the instruction device, a remote work support system, and an information processing program.
In work where the knowledge of experts or skilled workers is important, there are scenes in which an instructor such as an expert gives instructions (guidance) to a person not trained in the work or to a young worker, for example to convey know-how such as work procedures, judgment criteria, and countermeasures for problems. At that time, if the instructor and the worker (the person receiving the instructions) are in the same place and can communicate face-to-face, the contents of the instructions can be conveyed efficiently.
However, there are cases where the worker and the instructor are not in the same place, and in such cases it is difficult for the worker to immediately ask for instructions or receive explanations. For example, the work site may be so small that it is physically difficult for multiple people to work together, or the instructor may be far away from the work site and unable to go to the worker immediately. Even in these cases, if there is a manual showing the judgment criteria and procedures for the work, the work can proceed according to the manual. In reality, however, situations not described in the manual (sudden problems, or situations that must be judged empirically according to circumstances) may occur, and inexperienced workers often cannot handle them.
In response, there is a solution that provides work support from a remote location using a videophone. In this method, the worker shoots the work location and the state of the work and transmits the captured video to the instructor, and the instructor conveys instructions, mainly through voice communication, based on the received video.
However, with this method, the instructor cannot explain while pointing at the part being explained, and must explain by voice alone.
In voice-only explanation, if ambiguous expressions such as "here" and "that" are used, the intention cannot be conveyed accurately to the worker on site. It is therefore necessary to identify the part being explained with an unambiguous expression, such as "the n-th from the right edge and the m-th from the top".
However, in work where the worker is constantly moving, even a place that is "third" for the instructor may by then have become "fourth" or something else for the worker. Even when the instructor identifies the part being explained with such an unambiguous expression, the intention often fails to be conveyed accurately to the worker on site.
If the intention is not conveyed accurately to the worker on site, a discrepancy in recognition arises between the two, and work efficiency drops further. Moreover, even if the part being explained can be conveyed accurately by an unambiguous expression (such as "the n-th from the right edge and the m-th from the top"), a more detailed explanation is required than in normal work (where the worker and the instructor are in the same place), and a reduction in work efficiency is inevitable.
In recent years, research on augmented reality (AR) technology, in which images created by computer graphics (CG) are superimposed and drawn on captured video, has become active. Using AR technology, marks created by CG (pictures, symbols, characters, and the like) can be drawn in real time on the captured video. This makes it possible to show a mark that does not actually exist at the work site as if it existed in the video showing the work site. In other words, if AR technology is applied to the solution of providing work support from a remote location using a videophone, the problems described above can be solved.
For example, according to the techniques disclosed in Patent Document 1 and Non-Patent Document 1 below, the terminal of the instructor (operator) receives the video shot by the worker's terminal, and by the instructor's operation a mark such as a circle or an arrow can be drawn at a target location in the video. The mark drawn in this way is also displayed on the worker's terminal (and even if the shooting range changes because the orientation or position of the worker's terminal changes, the mark attached to the target location is displayed following the target location). Even if the instructor uses ambiguous expressions such as "here" and "that", referring to the mark makes discrepancies in recognition between the two less likely. According to these published techniques, merits such as more efficient work, realization of highly specialized work independent of skill proficiency, and reductions in work time and cost can be obtained.
Japanese Patent Publication: Japanese Unexamined Patent Application Publication No. 2015-135641 (published July 27, 2015)
However, in the methods described in Patent Document 1 and Non-Patent Document 1, the video that the instructor can see is limited to the range of the video shot by the worker, so the information that can be obtained is extremely limited. For example, it is very difficult for the instructor to obtain information such as which part of the entire work target the currently visible video corresponds to, or in which direction from the currently visible location the part to be explained next lies. The difference from the information that the worker can grasp is large, and discrepancies in the recognition of the instruction content easily arise between the instructor and the worker.
Furthermore, since the worker works while holding the work terminal in hand, the shooting range can be expected to change frequently. It is also a problem that it is difficult for the instructor to identify from such video the location and direction the worker is viewing.
The present invention has been made in view of the above problems, and an object of the present invention is to provide a remote communication system including an instruction device in which discrepancies in the recognition of instruction content are less likely to arise between an instructor and a worker.
In order to solve the above problems, an instruction device according to one aspect of the present invention includes: a receiving unit that receives a captured image of a work target imaged by a work terminal; a transmitting unit that transmits instruction information to the work terminal; and a video composition unit that superimposes information indicating the shooting range of the captured image on an image including the whole image of the work target or a partial image thereof, and displays the result on a display unit.
In order to solve the above problems, a method of controlling an instruction device according to one aspect of the present invention includes: a receiving step of receiving a captured image of a work target imaged by a work terminal; a transmitting step of transmitting instruction information to the work terminal; and a video composition step of superimposing information indicating the shooting range of the captured image on an image including the whole image of the work target or a partial image thereof, and displaying the result on a display unit.
According to one aspect of the present invention, work instructions can be given after grasping the whole image of the work target or a partial image thereof, so there is an effect that it is possible to provide a remote communication system including an instruction device in which discrepancies in the recognition of instruction content are less likely to arise between the instructor and the worker.
FIG. 1 is a diagram showing a usage image of the remote communication system according to Embodiment 1, which is one aspect of the present invention.
FIG. 2 is a diagram showing the connection relationships of the devices in the communication network according to Embodiment 1.
FIG. 3 is a block diagram showing an example of the main configuration of the instruction device according to Embodiment 1.
FIG. 4 is a diagram showing a preferred example of shooting range drawing according to Embodiment 1.
FIG. 5 is a block diagram showing an example of the main configuration of the shooting range calculation unit of the instruction device according to Embodiment 1.
FIG. 6 is a block diagram showing an example of the main configuration of the shooting range coordinate calculation unit of the instruction device according to Embodiment 1.
FIG. 7 is a flowchart showing an example of processing executed by the instruction device according to Embodiment 1.
FIG. 8 is a diagram showing how the shooting range appears on the three-dimensional model data according to Embodiment 1.
FIG. 9 is a diagram showing how the shooting range appears on the two-dimensional data according to Embodiment 1.
FIG. 10 is a diagram showing how the shooting range appears on the overhead view according to Embodiment 1.
FIG. 11 is a block diagram showing an example of the main configuration of the work terminal according to Embodiment 1.
FIG. 12 is a flowchart showing an example of processing executed by the work terminal according to Embodiment 1.
FIG. 13 is a diagram showing a drawing example of the shooting range according to Embodiment 2, which is one aspect of the present invention.
FIG. 14 is a diagram showing a drawing example of the shooting range according to Embodiment 3, which is one aspect of the present invention.
FIG. 15 is a diagram showing a drawing method that can be taken in Embodiment 1, which is one aspect of the present invention.
(First Embodiment)
A remote communication system (remote work support system) according to the first embodiment, which is one aspect of the present invention, will be described with reference to FIGS. 1 to 12.
<Usage scene of the remote communication system>
FIG. 1 is a diagram schematically showing an example of a usage scene of the remote communication system 1 according to the present embodiment. The left side of FIG. 1 shows the work site 100, and the right side shows the instruction room 106; the two are located away from each other. The remote communication system 1 is a system that realizes work support by transmitting information about work between a worker and an instructor who are located apart from each other.
FIG. 1 shows a scene in which a worker 101 at the work site 100 performs work while receiving, via a work terminal 103 (first terminal), work instructions regarding the work target 102 from an instructor 107 in the instruction room 106. More specifically, it is an example in which the worker 101 repairing the work target 102 receives instructions regarding the repair from the supervising instructor 107.
According to the illustrated example, the remote communication system 1 includes a work terminal 103 and an instruction device 108 that can communicate with each other. The remote communication system 1 can give work instructions by sharing, between the work terminal 103 and the instruction device 108, instruction marks, which are instruction contents input into the video and displayed visually. Note that the remote communication system 1 may have any configuration capable of giving work instructions; for example, it may be configured to give work instructions using voice alone.
In particular, as will be described in detail later, in the remote communication system 1 the instructor 107 can give work instructions to the worker 101 while confirming the three-dimensional shape of the work target 102 (the whole image of the work target 102).
The work terminal 103 is a tablet computer and includes a camera 103a provided on its back side and a display unit 103b provided on its front side. The work terminal 103 can shoot the work target 102 with the camera 103a, display the video obtained by the shooting on the display unit 103b, and transmit the video to the instruction device (second terminal) 108 at the remote location.
The instruction device 108 installed in the instruction room 106 takes the form of a desktop personal computer as shown in FIG. 1, but is not limited to this form. For example, it may be a tablet computer like the one used by the worker 101.
The instruction device 108 can receive the video sent from the remote work terminal 103 and display it on the display device 109 (display unit). The instructor 107 gives work instructions to the worker 101 using the display device 109 and the instruction device 108 while viewing the video 110 displayed on the display device 109. In the present embodiment, the video 110 may be a whole image (abstracted data; hereinafter referred to as abstract data) including an image for the instructor 107 (user) to grasp the three-dimensional shape of the work target 102 (work object). In the present embodiment, a whole image including an image for the instructor 107 to grasp the three-dimensional shape of the work target 102 is used as the video 110 displayed on the display device 109, but this aspect of the present invention is not limited to this. For example, the image for the instructor 107 to grasp may be a partial image of the work target 102 rather than its whole image, and what the instructor 107 grasps is not limited to the three-dimensional shape of the work target 102. In short, the video 110 displayed on the display device 109 may be any image indicating which region of the work target 102 the shooting range of the video sent from the remote work terminal 103 corresponds to, as described later.
Specifically, when the instructor 107 gives an input instruction using a touch panel or mouse, the instruction device 108 generates instruction mark information indicating the instruction mark 111 based on the input instruction. The instruction mark 111 is a mark displayed superimposed at the position designated by the instructor 107 in the video displayed on the display device 109, and is used by the instructor 107 to give instructions to the worker. In the present embodiment, the instruction mark may be, in addition to a simple mark, a pointer, a marker, text, or a picture, or a combination of two or more of these. That is, the instruction mark is an image indicating information related to the work, displayed superimposed on the captured video. The instruction mark information is the information necessary to generate the instruction mark.
When the instruction device 108 generates instruction mark information, it displays the instruction mark 111 on the display device 109 based on the mark information and transmits the instruction mark information (instruction information) to the work terminal 103. The work terminal 103 also displays the instruction mark 105 on the display unit 103b based on the mark information. That is, the work terminal 103 and the instruction device 108 share the instruction mark information, and each superimposes the same instruction mark on its video based on the instruction mark information.
That is, the instruction device 108 receives from the instructor 107 an input operation for superimposing the instruction mark 111 on the work target video 102a (the video corresponding to the work target 102) shown in the video 110 of the display device 109 in FIG. 1. Upon receiving the input operation, the instruction device 108 generates instruction mark information indicating that the instruction mark 105 is to be superimposed on the work target video 102a, transmits the instruction mark information to the work terminal 103, and shares the instruction mark information with the work terminal 103.
The instruction device 108 then superimposes the instruction mark 111 on the work target video 102a shown in the video 110 of the display device 109 in FIG. 1, based on the instruction mark information. Meanwhile, the work terminal 103 superimposes the instruction mark 105 on the work target video 102a shown in the video 104 of the display unit 103b in FIG. 1, based on the instruction mark information.
Through the display unit 103b, the worker 101 can grasp the instruction mark 105 and the location (target position) designated by the instruction mark 105. The worker 101 can thereby visually grasp the work instruction from the remote instruction room 106.
Note that the work terminal 103 can also set instruction mark information based on an input instruction from the worker 101 and superimpose the instruction mark 105 on the video 104 of the display unit 103b based on that instruction mark information. In this case, of course, the work terminal 103 transmits the instruction mark information to the instruction device 108 so that the two can share the mark information, and the instruction device 108 can superimpose the instruction mark 111 on the video 110 of the display device 109 based on the mark information sent from the work terminal 103. Thus an instruction mark set by either the instructor 107 or the worker 101 can be recognized by both. Also, the instruction information transmitted from the instruction device 108 to the work terminal 103 is not limited to instruction mark information; for example, instruction information specifying various images to be superimposed on the video 104 of the display unit 103b of the work terminal 103 can be used.
<Configuration of the remote communication system>
As shown in FIG. 2, the work terminal 103 and the instruction device 108 included in the remote communication system 1 can communicate with each other via a public communication network (for example, the Internet) 201 according to protocols such as TCP/IP and UDP.
As shown in FIG. 2, the remote communication system 1 is provided with a management server 200 for collectively managing instruction mark information, and the management server 200 is also connected to the public communication network 201. The work terminal 103 and the public communication network 201 may be connected by wireless communication. In this case, the wireless communication can be realized by, for example, a Wi-Fi (Wireless Fidelity, registered trademark) connection conforming to the international standard (IEEE 802.11) defined by the Wi-Fi Alliance (a US industry group). Although a public communication network such as the Internet is given as an example of the communication network above, it is also possible to use, for example, a LAN (Local Area Network) as used in companies, or a mixed configuration of the two. Although FIG. 2 shows a configuration using the management server 200, there is no problem even if the work terminal 103 and the instruction device 108 communicate directly, and the following description assumes the form in which they communicate directly. Descriptions of general voice communication processing and of video communication processing other than additional screen information, as used in ordinary video conference systems, are omitted where this causes no difficulty.
<Configuration example of the instruction device>
Next, a configuration example of the remote communication system 1 according to the present embodiment will be described. As described above, the remote communication system 1 includes the instruction device 108 of the instructor 107 and the work terminal 103 of the worker 101, and each will be described in turn.
First, the processing blocks of the instruction device 108 will be described with reference to FIG. 3. As shown in FIG. 3, the instruction device 108 has: a communication unit 301 (receiving unit, transmitting unit) for receiving video and instruction mark information sent from outside and for transmitting internally generated instruction mark information to the outside; a video composition unit 302 that composites the instruction mark information and the shooting range of the work terminal 103 onto the abstract data held by the instructor 107; a display unit 303 for displaying the result of the video composition and the abstract data described later; an external input/output unit 304 that receives input from the user; a storage unit 305 that stores various data used for video composition; a shooting range calculation unit 306 that calculates the position of the shooting range (hereinafter, the range that can be captured by the camera 103a of the work terminal 103 is referred to as the shooting range) for compositing it onto the abstract data; a control unit 307 for controlling the entire instruction device 108; and a data bus 300 for exchanging data between the blocks. The video composition unit 302 further has an overall image display unit 311 (video composition unit) and a shooting range display unit 312 (video composition unit). The abstract data is abstracted data of the work target 102, and is a whole image including an image for the instructor 107 (user) to grasp the three-dimensional shape of the work target 102 (work object). A specific configuration example of the abstract data will be described later.
The communication unit 301 is configured by an FPGA (Field Programmable Gate Array), an ASIC (Application Specific Integrated Circuit), or the like, and is a processing block that transmits and receives data to and from the outside. Specifically, it receives the video code and instruction mark information sent from the work terminal 103, described later, and transmits the internally created instruction mark information. The video code is data that has undergone encoding processing suitable for encoding moving images, for example data encoded by H.264. H.264 encoding is one of the standard compression encoding methods for moving image data, standardized by the ISO (International Organization for Standardization). When the communication unit 301 performs video transmission processing, it encodes according to the above encoding method and transmits; when receiving video, it decodes the video, performing the reverse of the encoding processing.
Note that the encoding and decoding processing is preferably executed by the control unit 307 described later.
The video composition unit 302 is configured by an FPGA, an ASIC, or a GPU (Graphics Processing Unit), for example. The video composition unit 302 performs composition processing on the input video so that the instruction marks and the area shot by the work terminal 103 can be represented visually. As described above, the instruction mark information is the information necessary to generate visually expressible instruction content such as instruction marks and pointers. In the present embodiment, the video composition unit 302 further has an overall image display unit 311 and a shooting range display unit 312.
The overall image display unit 311 displays a whole image including an image for the user to grasp the three-dimensional shape of the work object. Specifically, it displays the whole image including the abstract data of the work target 102 to the instructor 107 via the display unit 303 described later.
The shooting range display unit 312 displays, superimposed on the whole image displayed by the overall image display unit 311, information indicating which part of the work object is captured in the captured video. Specifically, for the video shot by the work terminal 103, it displays to the instructor 107, via the display unit 303 described later, information indicating which region of the work target 102 the shooting range of the video corresponds to. The shooting range display unit 312 may have any configuration that indicates the shooting range of the video; for example, it may display only the boundary lines of the shooting range, or may superimpose the video itself on the whole image.
The display unit 303 is configured by an LCD (Liquid Crystal Display), an organic EL display (OELD: Organic ElectroLuminescence Display), or the like. The display unit 303 displays the composite video output from the video composition unit 302, video processing results, images stored in the storage unit 305, a UI (User Interface) for controlling the device, and the like. The display unit 303 may also be equipped with a touch panel function that allows the terminal to be operated by pressing its display surface; using this function, the place where the above-described instruction mark is to be set, and the like, can be designated. Note that the display unit 303 may be installed externally to the instruction device 108 via the external input/output unit 304, like the display device 109 in FIG. 1.
The external input/output unit 304 has input/output ports such as USB (Universal Serial Bus) and HDMI (High-Definition Multimedia Interface, registered trademark), and operates as an interface to external storage.
The storage unit 305 consists of, for example, a main storage device such as a RAM (Random Access Memory) or an auxiliary storage device such as a hard disk. The main storage device is used to temporarily hold image data and image processing results. The auxiliary storage device stores data to be kept long-term as storage, such as captured image data and image processing results.
 撮影範囲算出部306は、前記デコードして復元された映像信号をもとに、指示者107が所持している抽象化データ上に、作業者101が撮影した映像の撮影範囲を視覚化するために必要な情報を算出し、映像合成部302に出力する。すなわち、撮影範囲算出部306は、作業対象102の抽象化データを含む座標空間上の、映像の撮影範囲に対応する座標位置を算出する。具体的な算出方法については後述する。 The shooting range calculation unit 306 visualizes the shooting range of the video shot by the worker 101 on the abstract data possessed by the instructor 107 based on the decoded and restored video signal. Necessary information is calculated and output to the video composition unit 302. That is, the shooting range calculation unit 306 calculates the coordinate position corresponding to the shooting range of the video on the coordinate space including the abstract data of the work object 102. A specific calculation method will be described later.
 制御部307は、指示装置108の処理全体の制御を行い、かつ、各部間の情報のやり取りの制御を行うものである。本実施形態において、制御部307は、通信部301が送受信するデータについて、必要に応じて映像信号へのエンコードおよびデコード処理を行う。 The control unit 307 controls the entire processing of the instruction device 108 and controls the exchange of information between the units. In the present embodiment, the control unit 307 performs encoding and decoding processing on the video signal as necessary for the data transmitted and received by the communication unit 301.
 データバス300は、各々のユニット間でのデータのやり取りを行うためのバスである。 The data bus 300 is a bus for exchanging data between each unit.
 <About abstracted data>
 Next, a configuration example of the abstracted data (an image including the whole image of the work object or a partial image thereof) on which the shooting range calculation unit 306 according to one aspect of the present invention visualizes the shooting range will be described. In one aspect of the present invention, in addition to the conventional method of sharing instruction mark information between the worker 101 and the instructor 107, the shooting range calculation unit 306 has means for calculating the position, in the three-dimensional model space, at which the range being shot by the worker 101 is drawn on the abstracted data possessed by the instructor 107. The abstracted data possessed by the instructor 107 refers to abstracted data that has, as parameters, information expressing the whole image of the work object 102. For example, the abstracted data may take a form that can be expressed as three-dimensional information or as two-dimensional information. The three-dimensional or two-dimensional information is data based on an expression method that can be described by a combination of at least one parameter, or data converted into yet another expression method based on the information described by those parameters. The three-dimensional information may be, for example, three-dimensional model data having information such as the size and shape of the work object 102, surface color information, and coordinate information. The three-dimensional model data may exist because the instructor has scanned the work object in advance and holds it as data, because it is held as a three-dimensional design drawing from the stage at which the work object 102 was designed, or because the instructor 107 possesses three-dimensional information that the worker 101 acquired at the work site 100 using a ranging device. The two-dimensional information includes data expressed on a two-dimensional plane (hereinafter referred to as two-dimensional data), such as an overhead view (overhead video), or a development view or three-view drawing that can be obtained by developing or projecting the three-dimensional data.
 The overhead view is a video in which the entire work site 100, including the work object 102 and its surroundings, can be seen; it refers to a two-dimensional captured video shot with an imaging device installed somewhere at the work site 100. The instructor 107 can select whether to handle the abstracted data in the form of three-dimensional model data, two-dimensional data, or an overhead view, and can switch freely among them according to the work situation. The abstracted data is stored in the storage unit 305 of the instruction device 108 and can be referred to at any timing the instructor 107 likes, not only during the work.
 That is, in the present embodiment, the whole image display unit 311 can display a whole image of the work object 102 (the abstracted data, for example, the three-dimensional model data 404, the two-dimensional data 406, the overhead view 407, and the like) using a plurality of parameters that specify the three-dimensional shape of the work object 102. Further, the whole image display unit 311 can display the whole image using data indicating a three-dimensional model of the work object 102 (the three-dimensional model data 404) or data obtained by developing the three-dimensional model on a two-dimensional plane (the two-dimensional data 406). In addition, the whole image of the work object that the whole image display unit 311 displays using the two-dimensional data 406 can include an image showing the situation around the work object (the overhead view 407).
 <Example of compositing abstracted data and the shooting range>
 A preferred example of visualizing the range shot by the work terminal 103 on each of the three-dimensional model data, the two-dimensional data, and the overhead view will be described with reference to FIG. 4. FIG. 4(a) schematically represents a live-action video 400 taken when the worker 101 shoots the work object 102. FIGS. 4(b) to 4(d) show, respectively, the result of drawing a shooting range 405a on the three-dimensional model space 401, the result of drawing a shooting range 405b on the development view that is the two-dimensional data 406, and the result of drawing a shooting range 405c on the above-described overhead view 407, which is two-dimensional data.
 Reference numerals 402a and 402b represent the instruction marks 105 set in the remote communication system 1. Reference numeral 403 represents an AR marker physically attached to the work object. An AR marker is a pattern that serves as a sign for designating the position at which additional information (here, instruction mark information) is displayed in a system that realizes AR using image recognition technology. Typical examples include simple, clear black-and-white figures and QR (Quick Response) codes. In the present embodiment, the AR marker is represented by a black-framed square, as indicated by 403.
 <About composite drawing of the live-action video>
 The shooting range calculation unit 306 calculates the shooting position and tilt of the work terminal 103 and the position of the shooting range using the information of the AR marker 403. The calculation method will be described later. When an instruction mark has been set on the display unit 103b of the work terminal 103, its position information is transmitted to the instruction device 108 using a method such as in the conventional technique, and the instruction mark is drawn superimposed at the appropriate position on the abstracted data. The means for synchronizing the position information of the instruction mark 105 on the work terminal 103 and the instruction mark 111 on the instruction device 108 is unrelated to the essence of one aspect of the present invention, so it follows the conventional method and its detailed description is omitted.
 Reference numeral 408 in the figure shows the video sent from the work terminal 103 being composited and drawn on the abstracted data. The instructor 107 can freely adjust and select the composition position and size of the video 408 and whether it is displayed. The video 408 is used for the instructor 107 to refer to when giving instructions only by browsing the abstracted data is difficult, for example, when a difference between the actually shot work object 102 and the abstracted data is to be incorporated into the instruction content.
 That is, the instruction device 108 according to the present embodiment shares with the work terminal 103 the captured video of the work object 102 taken by the work terminal 103 (the video 408 on the instruction device 108, and the live-action video 400 on the work terminal 103) and an image that indicates information related to the work and is displayed superimposed on the video 408 (the instruction mark 111 on the instruction device 108, and the instruction mark 105 on the work terminal 103). The video 408 and the instruction mark 111 on the instruction device 108 correspond, respectively, to the live-action video 400 and the instruction mark 105 on the work terminal 103. In short, the instruction device 108 sharing an image with the work terminal 103 means that an instruction mark placed at a specific location of the work object 102 on an image including the work object 102 is displayed on both the instruction device 108 and the work terminal 103.
 Further, the instruction device 108 displays a whole image including an image that allows the user to grasp the three-dimensional shape of the work object 102 (for example, the three-dimensional model data 404, the two-dimensional data 406, the overhead view 407, and the like), and displays, superimposed on the displayed whole image, information indicating which part of the work object 102 is captured in the live-action video 400 taken by the work terminal 103. The "information indicating which part of the work object 102 is captured in the live-action video 400 taken by the work terminal 103" is, for example, the shooting ranges 405a, 405b, and 405c illustrated in FIG. 4.
 <Block configuration of the shooting range calculation unit>
 Next, the functional block configuration of the shooting range calculation unit 306 will be described with reference to FIG. 5. The shooting range calculation unit 306 receives a video signal as input from the data bus 300. The shooting range calculation unit 306 has an AR marker detection unit 501 that searches for the position of the AR marker 403 in the input video signal. The shooting range calculation unit 306 also has an external parameter calculation unit 502 that calculates position information and tilt information of the work terminal 103 using the geometric information of the detected AR marker 403 (vertex coordinates, side lengths, area, diagonals, center coordinates, and so on) and the internal parameters of the camera 103a provided in the work terminal 103. The shooting range calculation unit 306 further has an AR marker reference position input unit 503 for inputting, from the external input/output unit 304 to the shooting range coordinate calculation unit 504, the position on the three-dimensional model corresponding to the position of the AR marker 403 attached to the work object 102. Finally, the shooting range calculation unit 306 has the shooting range coordinate calculation unit 504, which receives the inputs from the external parameter calculation unit 502 and the AR marker reference position input unit 503 and calculates the coordinates, on the abstracted data, at which the shooting range is drawn. The coordinate information output from the shooting range coordinate calculation unit 504 is output to the video composition unit 302 via the data bus 300 and is used to draw the shooting range at the corresponding coordinates on the abstracted data.
 <Block configuration of the shooting range coordinate calculation unit>
 Next, the elements constituting the shooting range coordinate calculation unit 504 will be described with reference to FIG. 6. The shooting range coordinate calculation unit 504 has a drawing position calculation unit 601, a two-dimensional data conversion unit 603, a feature point matching processing unit 605, a shooting range projective transformation unit 606, and a calculation result output unit 607. The shooting range coordinate calculation unit 504 further has a three-dimensional model data determination unit 602 and an overhead view determination unit 604.
 The drawing position calculation unit 601 uses the position and tilt information of the work terminal 103 input from the external parameter calculation unit 502 (hereinafter referred to as posture information), together with the internal parameter information, calculated in advance, of the camera 103a of the work terminal 103, to calculate, by means described later, the coordinate information on the three-dimensional model space 401 at which the shooting range 405a is drawn, and sends it to the three-dimensional model data determination unit 602. In another embodiment, the work terminal 103 may detect its own position and tilt using a known positioning system or sensors and transmit the position and tilt information from the work terminal 103 to the instruction device 108. In this case, the drawing position calculation unit 601 may calculate the coordinate information on the three-dimensional model space 401 at which the shooting range 405a is drawn using posture information input from the communication unit 301 instead of from the external parameter calculation unit 502.
 The three-dimensional model data determination unit 602 has the role of changing the destination to which the result of the drawing position calculation unit 601 is passed, depending on whether the abstracted data browsed by the instructor 107 is three-dimensional model data or two-dimensional data. When the instructor is browsing three-dimensional model data, the three-dimensional model data determination unit 602 passes the result of the drawing position calculation unit 601 to the calculation result output unit 607 as it is. When the abstracted data is two-dimensional data, the three-dimensional model data determination unit 602 passes the result to the two-dimensional data conversion unit 603, which performs conversion into the two-dimensional data form based on the output of the drawing position calculation unit 601.
 The two-dimensional data conversion unit 603 passes the result of the conversion processing to the calculation result output unit 607.
 The overhead view determination unit 604 determines whether the abstracted data referred to by the instructor 107 is the overhead view 407. When the abstracted data referred to by the instructor 107 is the overhead view 407, the overhead view determination unit 604 passes the posture information and the internal parameters of the camera 103a to the feature point matching processing unit 605 described later.
 The feature point matching processing unit 605 performs feature point detection processing and association processing on the video sent from the work terminal 103 and on the overhead view 407 referred to by the instructor 107. The result obtained at this time is sent to the shooting range projective transformation unit 606. The feature point extraction processing extracts feature points indicating features of the appearance of the work object 102, for example pixels where a plurality of edges join; the feature point association processing associates the feature points of the video with the feature points in the overhead view 407.
 The shooting range projective transformation unit 606 performs projective transformation to the coordinates on the overhead view 407 corresponding to the range visible in the video sent from the work terminal 103, and passes the result to the calculation result output unit 607.
 The calculation result output unit 607 determines which of the result of the drawing position calculation unit 601, the result of the two-dimensional data conversion unit 603, and the result of the shooting range projective transformation unit 606 to output to the video composition unit 302, according to the form of the abstracted data that the instructor 107 is browsing.
 <Processing flowchart of the instruction device>
 FIG. 7 is a diagram showing the flow of processing in the instruction device 108.
 Step 1 (S1):
 Upon receiving the video code sent from the work terminal 103, the control unit 307 performs decoding processing on the video code and outputs the result to the shooting range calculation unit 306. The control unit 307 then advances the processing to Step 2.
 Step 2 (S2):
 Upon receiving the video decoded by the control unit 307 as input, the shooting range calculation unit 306 passes the video to the AR marker detection unit 501 and performs AR marker detection processing.
 Step 3 (S3):
 The AR marker detection unit 501 determines whether an AR marker has been detected in the video. If an AR marker has been detected, the result is passed to the external parameter calculation unit 502 and the processing proceeds to Step 4. If no AR marker has been detected, the shooting range cannot be calculated in the following steps, so the processing proceeds to Step 13 without doing anything.
 Step 4 (S4):
 The external parameter calculation unit 502 calculates the posture information of the work terminal 103 using the internal parameter information of the camera 103a stored in the storage unit 305 and the detection result of the AR marker. The calculated result is output to the shooting range coordinate calculation unit 504, and the processing proceeds to Step 5.
 Step 5 (S5):
 Upon receiving the posture information, the shooting range coordinate calculation unit 504 passes it to the overhead view determination unit 604 and determines whether the form of the abstracted data referred to by the instructor 107 is the overhead view 407. If the form of the abstracted data referred to by the instructor 107 is not the overhead view 407 (if it is the three-dimensional model data 404 or the two-dimensional data 406), the processing proceeds directly to Step 6. If the overhead view 407 is being referred to, the processing proceeds to Step 11.
 Step 6 (S6):
 In this step, the shooting range coordinate calculation unit 504, using the drawing position calculation unit 601, determines whether a reference position of the AR marker has been set at the position on the three-dimensional model data 404 corresponding to the position of the AR marker attached to the work object 102 when the posture information of the work terminal 103 was calculated in S4. If the reference position has been set, the shooting range coordinate calculation unit 504 proceeds directly to Step 8; if the reference position has not been set, it proceeds to Step 7.
 Step 7 (S7):
 The instructor 107 inputs the reference position on the three-dimensional model data 404 from the external input/output unit 304 to the AR marker reference position input unit 503. The input data is stored in the storage unit 305 and referred to in the processing of Step 8. Once the reference position has been input, this processing step can be skipped.
 Step 8 (S8):
 The shooting range coordinate calculation unit 504 passes the inputs from the external parameter calculation unit 502 and the AR marker reference position input unit 503 to the drawing position calculation unit 601. The drawing position calculation unit 601 calculates, by the method described later, the coordinate values at which the shooting range 405a is to be drawn in the three-dimensional model space 401. The drawing position calculation unit 601 passes the calculated coordinate values to the calculation result output unit 607, and the processing proceeds to Step 9.
 Step 9 (S9):
 The shooting range coordinate calculation unit 504 uses the three-dimensional model data determination unit 602 to determine whether the form of the abstracted data browsed by the instructor 107 is the three-dimensional model data 404. If it is the three-dimensional model data 404, the calculation result output unit 607 outputs the coordinate information of the shooting range obtained in Step 8 to the video composition unit 302, and the processing proceeds to Step 12. If it is determined not to be the three-dimensional model data 404, the processing proceeds to Step 10.
 Step 10 (S10):
 If it was determined in Step 9 that the abstracted data browsed by the instructor 107 is not the three-dimensional model data 404, the shooting range coordinate calculation unit 504 passes the coordinate information to the two-dimensional data conversion unit 603, which converts the shooting range into the two-dimensional data form by the method described later. The converted two-dimensional data is passed to the calculation result output unit 607 and then output to the video composition unit 302, and the control unit 307 advances the processing to Step 12.
 Step 11 (S11):
 If, in Step 5, the abstracted data referred to by the instructor 107 was in the form of the overhead view 407, the processing proceeds to this step. The shooting range coordinate calculation unit 504 uses the feature point matching processing unit 605 to perform feature point extraction processing and feature point association processing on both the received video described above and the overhead view 407 stored in the storage unit 305. Next, the shooting range coordinate calculation unit 504 inputs the pairs of feature points associated by the feature point matching processing unit 605 to the shooting range projective transformation unit 606 and performs projective transformation of the coordinates of the shooting range by the method described later. The shooting range coordinate calculation unit 504 passes the transformation result from the shooting range projective transformation unit 606 to the calculation result output unit 607, which then outputs it to the video composition unit 302. The control unit 307 advances the processing to Step 12.
 Step 12 (S12):
 The video composition unit 302 receives the coordinate values from Step 9, Step 10, or Step 11 as input and composites the visualization information at the corresponding coordinates of the abstracted data that the instructor 107 is currently referring to. Instead of compositing the visualization information, the video 408 sent from the work terminal 103 may be composited on the abstracted data, or both processes may be performed.
 When this step is reached from Step 3 (when no AR marker was detected and the processing moved here directly without calculating the shooting range), the processing of compositing the visualization information is not performed. Thereafter, the control unit 307 advances the processing to Step 13.
 Step 13 (S13):
 When the composition is completed, the video composition unit 302 performs processing to output the composite video of the shooting range to the screen 110 via the display unit 303. Specifically, it executes the processing of Steps 131 and 132 below. Thereafter, the processing proceeds to Step 14.
 Step 131 (S131: whole image display step (video composition step)):
 The video composition unit 302 uses the whole image display unit 311 to display, via the display unit 303, the abstracted data that the instructor 107 is currently referring to. Thereafter, the processing proceeds to Step 132.
 Step 132 (S132: shooting range display step (video composition step)):
 The video composition unit 302 uses the shooting range display unit 312 to display, via the display unit 303, the visualization information indicating the shooting range composited in Step 12, superimposed on the abstracted data displayed in Step 131. Thereafter, the processing proceeds to Step 14.
 Step 14 (S14):
 The control unit 307 determines whether to continue the processing of the instruction device 108. If it determines that the processing is to be continued, the processing returns to Step 1 and the above-described processing is repeated. If it determines that the processing is not to be continued, all processing is terminated. For example, the control unit 307 may be configured to determine that the processing is not to be continued when communication from the work terminal 103 is interrupted, and to determine that the processing is to be continued otherwise.
 The processing of the instruction device 108 described so far can be organized as follows. That is, the processing executed by the instruction device 108 includes a whole image display step (Step 131) of displaying a whole image including an image that allows the user to grasp the three-dimensional shape of the work object 102 (for example, the three-dimensional model data 404, the two-dimensional data 406, the overhead view 407, and the like), and a shooting range display step (Step 132) of displaying, superimposed on the whole image, information indicating which part of the work object 102 is captured in the live-action video 400 (the captured video) (for example, the shooting ranges 405a, 405b, and 405c).
 <About the method of calculating the posture information>
 The method by which the external parameter calculation unit 502 calculates the posture information of the work terminal 103 is described below. From the internal parameters, which are information specific to the inside of the camera such as the characteristics of the imaging sensor of the camera 103a and lens distortion information, the external parameter calculation unit 502 can calculate the posture information, i.e., the external parameters expressing the positional relationship and tilt from the camera 103a to the work object 102.
 The internal parameters of the camera 103a of the work terminal 103 may be calculated using a general camera calibration method provided in a general-purpose tool such as ARToolKit or a general-purpose library such as OpenCV (Open Source Computer Vision Library). The parameters of the camera 103a of the work terminal 103 may be calibrated in advance, or the worker 101 may perform the calibration processing before the work. The internal parameters obtained by the above means are held in the storage units 305 of the work terminal 103 and the instruction device 108. The external parameter calculation unit 502 refers to the internal parameters of the camera 103a of the work terminal 103 held in the storage unit 305 and calculates the posture information of the work terminal 103.
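 A minimal calibration sketch, assuming OpenCV's standard chessboard routine; the image file names and board geometry are hypothetical.

```python
import cv2
import numpy as np
import glob

# Chessboard calibration (assumed 9x6 inner corners, 25 mm squares).
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 25.0

obj_points, img_points = [], []
for path in glob.glob("calib_*.png"):            # hypothetical calibration shots
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# A is the 3x3 internal parameter matrix; dist holds the distortion coefficients.
rms, A, dist, _, _ = cv2.calibrateCamera(obj_points, img_points,
                                         gray.shape[::-1], None, None)
print("reprojection RMS:", rms, "\nA =\n", A)
```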
 The posture information of the work terminal 103 calculated by the external parameter calculation unit 502 is described by a rotation matrix and a translation vector that convert a point on the coordinate axes originating at the center of the AR marker 403 into a point on the coordinate axes originating at the optical axis origin of the camera 103a. The rotation matrix and the translation vector can be calculated using ARToolKit. ARToolKit also has a function for calculating the coordinates of the center of the AR marker 403 in the coordinate system centered on the optical axis origin of the work terminal 103; it is therefore preferable to obtain these coordinates together with the rotation matrix and the translation vector, for use in the shooting range coordinate calculation unit 504 described later.
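 Under the assumption that the four marker vertices and the calibrated internal parameters are available, an equivalent rotation matrix and translation vector can also be recovered with OpenCV's solvePnP, as sketched below; the marker side length and the corner ordering are assumptions of the sketch.

```python
import cv2
import numpy as np

def marker_pose(marker_corners_2d, marker_len, A, dist):
    """Rotation matrix R and translation vector t that map points in the
    marker-centered coordinate system to the camera (optical axis) system."""
    half = marker_len / 2.0
    # Assumed 3D corner positions on the marker plane, clockwise from top-left.
    obj = np.array([[-half,  half, 0], [ half,  half, 0],
                    [ half, -half, 0], [-half, -half, 0]], np.float32)
    ok, rvec, tvec = cv2.solvePnP(obj, marker_corners_2d.astype(np.float32),
                                  A, dist)
    R, _ = cv2.Rodrigues(rvec)      # rotation vector -> 3x3 rotation matrix
    return R, tvec                  # the marker center in camera coords is tvec
```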
 Here, by applying the position of the AR marker in the three-dimensional model space input from the AR marker reference position input unit 503 together with the above-described posture information of the work terminal 103, the positional relationship between the work terminal 103 and the AR marker 403 can be synchronized. The following calculations are described on the assumption that the positional relationships between the work terminal 103 and the work object 102, and between the viewpoint of the instructor 107 in the three-dimensional model space and the three-dimensional model data 404, have been synchronized.
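 A small sketch of this synchronization (all names assumed): composing the marker-to-camera pose from the external parameter calculation unit 502 with the marker's reference pose in the model space yields the camera viewpoint in model coordinates.

```python
import numpy as np

def camera_pose_in_model(R_cm, t_cm, R_mm, p_mm):
    """R_cm, t_cm: marker-to-camera pose (external parameters).
    R_mm, p_mm: marker pose in the 3D model space (reference position input).
    Returns the camera orientation and position in model coordinates."""
    # A marker-space point X maps to the camera as R_cm @ X + t_cm, and to the
    # model as R_mm @ X + p_mm; eliminating X gives the camera pose below.
    R_model_cam = R_mm @ R_cm.T                      # camera orientation in model space
    cam_pos = p_mm - R_model_cam @ t_cm.reshape(3)   # camera origin in model space
    return R_model_cam, cam_pos
```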
 <About the method of calculating the shooting range drawing position on the three-dimensional model data>
 The method, executed by the shooting range coordinate calculation unit 504 using the drawing position calculation unit 601, of calculating the position at which the shooting range 405a is drawn on the three-dimensional model data 404 of the work object 102 will be described with reference to FIG. 8. FIG. 8 schematically shows the camera 103a placed in front of the work object 102 and shooting. Reference numeral 801 denotes a video shot including the work object 102, and 104 corresponds to the work object 102 in the video. When the shooting range of this video is drawn on the three-dimensional model space 401, it becomes as shown at 802. The drawing position of the shooting range 802 can be derived using the posture information of the work terminal 103 and the internal parameter information of the camera 103a. Specifically, if a specific pixel on the image of the camera 103a is expressed as P_i(u_i, v_i) (where i is the pixel index number), the coordinates in the three-dimensional model space corresponding to P_i can be obtained, using the internal parameter matrix A of the camera, by a perspective projection transformation such as the following (Equation 1).
 (Equation 1)

 m \begin{pmatrix} u_i \\ v_i \\ 1 \end{pmatrix} = A \begin{pmatrix} X_i \\ Y_i \\ Z_i \end{pmatrix}, \qquad A = \begin{pmatrix} f & s & c_x \\ 0 & f & c_y \\ 0 & 0 & 1 \end{pmatrix}

 where (X_i, Y_i, Z_i) denotes the point in the three-dimensional model space corresponding to P_i.
 Here, m in (Equation 1) represents a scale variable, uniquely determined by the distance from the optical axis origin of the camera 103a along the straight line perpendicular to the projection plane of the camera 103a; f represents the focal length, c_x and c_y the optical center, and s the distortion (skew) coefficient of the lens. In the above calculation, s may be set to 0 and c_x and c_y may be placed at the center of the image for simplification.
 Using the above method, the shooting range coordinate calculation unit 504 can use the drawing position calculation unit 601 to calculate, based on the video received from the work terminal 103, the position at which the shooting range 405a is displayed superimposed on the three-dimensional model data 404 of the work object 102.
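 For illustration, (Equation 1) can be inverted per pixel once the scale m is fixed by a known surface. The sketch below intersects the viewing rays of the four image corners with the marker plane Z = 0, a simplifying assumption rather than the full model-surface intersection described above; all names are hypothetical.

```python
import numpy as np

def backproject_to_plane(pixels, A, R, t):
    """Back-project image pixels through (Equation 1) and intersect the viewing
    rays with the marker plane Z = 0 in marker (model) coordinates.

    pixels: list of (u, v); A: 3x3 internal matrix;
    R, t: pose mapping marker coordinates to camera coordinates."""
    A_inv = np.linalg.inv(A)
    cam_center = -R.T @ t.reshape(3)           # camera origin in marker coords
    pts = []
    for u, v in pixels:
        d_cam = A_inv @ np.array([u, v, 1.0])  # ray direction, camera coords
        d = R.T @ d_cam                        # ray direction, marker coords
        s = -cam_center[2] / d[2]              # scale m fixing the plane Z = 0
        pts.append(cam_center + s * d)
    return np.array(pts)

# Usage: corners of a 1920x1080 frame -> shooting-range quadrilateral on the model.
corners = [(0, 0), (1919, 0), (1919, 1079), (0, 1079)]
# quad = backproject_to_plane(corners, A, R, tvec)
```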
 <About the method of calculating the shooting range drawing position on two-dimensional data>
 The method by which the shooting range coordinate calculation unit 504 uses the two-dimensional data conversion unit 603 to calculate the shooting range drawing position on two-dimensional data, when the instructor 107 has selected two-dimensional data among the forms of abstracted data expressing the work object 102, will be described with reference to FIG. 9.
 The drawing of the shooting range on two-dimensional data according to the present invention is performed after the drawing position calculation unit 601 has first calculated the coordinates of the shooting range on the three-dimensional model space 401.
 Suppose that, as a result of applying the posture information of the work terminal 103 calculated using the AR marker 403 in FIG. 9 to the viewpoint of the instructor 107 in the three-dimensional model space 401, the viewpoint is placed at the position 901 and the shooting range when the work object 102 is shot is visualized three-dimensionally as shown at 902. By pasting this shooting range 902 onto the three-dimensional model data 404 as a texture and then converting it into the two-dimensional data format using the method described later, the instructor 107 can refer to the shooting range 902 as two-dimensional data.
 As an example of means for converting three-dimensional model data into two-dimensional data, a CAD (Computer-Aided Design) drafting tool such as AutoCAD can be used on the data obtained by projecting the shooting range 902 onto the surface of the three-dimensional model data 404. Some CAD drafting tools incorporate a function of converting the three-dimensional model data 404 into two-dimensional drawings of various formats and associating their positional relationships with the coordinates on the original three-dimensional model data 404. Therefore, after the three-dimensional model data 404 that has undergone the projection transformation of the shooting range 902 is converted into a format readable by the CAD drafting tool and the coordinate data of the shooting range 902 is handed over, the shooting range on the two-dimensional data can be generated from that coordinate data. Accordingly, the CAD drafting tool can generate, from the three-dimensional model data 404 carrying the shooting range 902, two-dimensional data on which the shooting range is drawn. The two-dimensional data conversion unit 603 performs the processing of inputting the three-dimensional model data 404 to the CAD drafting tool, having it perform the conversion, and receiving the coordinate values of the shooting range on the two-dimensional data.
 Examples of two-dimensional data to which the above method has been applied are shown at 903 and 904. As a result of inputting the three-dimensional model data 404, with the shooting range 902 pasted as a texture, into the two-dimensional data conversion unit 603, the shooting range drawn on the development view 905 of the image 903 becomes, for example, as shown at 907, and the shooting range drawn on the three-view drawing 906 of the image 904 becomes as shown at 909.
 <About the method of calculating the shooting range drawing position on the overhead view>
 The method of calculating the shooting range drawing position on the overhead view 407, when the two-dimensional data has been acquired by a method that does not rely on the CAD drafting tool, for example data such as the overhead view 407 shot in advance so that the whole image of the work object 102 is captured, is described below. The shooting range coordinate calculation unit 504 can specify the shooting range by using the feature point matching processing unit 605 to perform feature point matching between the video shot by the camera 103a, in which a partial portion of the work object 102 appears, and the overhead view 407. A feature point is, for example, a pixel where a plurality of edges join, and feature point information can be calculated using, for example, SURF (Speeded-Up Robust Features). The feature point information consists of the position of the detected feature point in image coordinates and description information (a feature descriptor) by which that feature point can be identified. The feature point extraction processing executed by the feature point matching processing unit 605 is not limited to the method using SURF. For example, a configuration using any one, or several, of various kinds of feature point data, such as the Prewitt filter, the Laplacian filter, the Canny filter, or SIFT (Scale-Invariant Feature Transform), is also possible.
 As shown in FIG. 10, suppose there is an overhead view 1001 in which the whole image of the work object 102 appears, and that 1002 is the video taken when the worker 101 shoots the work object 102 during work. Suppose that N feature points are detected from the overhead view 1001 and M feature points from the video 1002. The feature point matching processing unit 605 executes the feature point association processing and associates the detected feature points with one another, whereby the shooting range of the video 1002 can be associated with the corresponding range of the overhead view 1001.
 The feature point association processing may, for example, associate feature points for which the combination of the feature point information and the relative amounts of change with respect to the information of neighbouring feature points is the same.
 Specifically, when there are four or more pairs of correspondences between the feature points detected in the overhead view 1001 and in the video 1002, the conversion can be performed using the following conversion formula, by which a pixel (m_i, n_i) on the video 1002 shot by the worker 101 is converted into a pixel (m_i', n_i') on the overhead view 1001 (homography transformation 1004).
 (Equation 2)

 \begin{pmatrix} m_i' \\ n_i' \\ 1 \end{pmatrix} \sim H^* \begin{pmatrix} m_i \\ n_i \\ 1 \end{pmatrix}

 where \sim denotes equality up to the homogeneous scale factor.
 H^* in this homography transformation (Equation 2) is a 3x3 matrix called a homography matrix. A homography matrix is a matrix that can projectively transform one of two images into the other. The homography matrix H^* can be expressed, for example, as in the following (Equation 3).
 (Equation 3)

 H^* = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}
 Here, with each element of the homography matrix defined as in (Equation 3), the shooting range coordinate calculation unit 504 uses the shooting range projective transformation unit 606 to obtain the values of the nine elements so as to minimize the coordinate transformation error under (Equation 3). Specifically, each element is calculated so as to minimize the value of the following (Equation 4).
 (Equation 4)

 H^* = \operatorname*{argmin}_{h_{11},\ldots,h_{33}} \sum_{i} \left\{ \left( m_i' - \frac{h_{11} m_i + h_{12} n_i + h_{13}}{h_{31} m_i + h_{32} n_i + h_{33}} \right)^{2} + \left( n_i' - \frac{h_{21} m_i + h_{22} n_i + h_{23}}{h_{31} m_i + h_{32} n_i + h_{33}} \right)^{2} \right\}

 where the sum runs over the four or more associated feature point pairs.
 Here, argmin() is a function that calculates the parameters written beneath argmin that minimize the value in the parentheses. Specifically, the combination of the elements h_11 to h_33 of the homography matrix H^* that minimizes the value in the parentheses of (Equation 4) is calculated.
 Through the above processing, the shooting range coordinate calculation unit 504 can use the shooting range projective transformation unit 606 to convert each pixel on the video 1002 shot by the worker 101 into a pixel on the overhead view 1001.
 The pixels (m, n) converted by the shooting range projective transformation unit 606 may be limited to only the outermost pixels of the video 1002 in order to lighten the processing. Alternatively, all pixel positions may be converted and visualized by replacing the luminance value of each pixel (m', n') on the overhead view 1001 with that of the corresponding pixel (m, n) on the video 1002. For example, suppose that the upper-left coordinate 1003 of the video 1002 undergoes the homography transformation 1004 by the calculation of (Equation 2) and is projectively transformed to the position of the coordinate 1005 on the overhead view 1001. By the same procedure, the lower-left, lower-right, and upper-right pixels of the video 1002 are also homography-transformed and associated with pixels on the overhead view 1001. When the four points thus converted on the overhead view 1001 are connected with straight lines, the shooting range of the work terminal 103 can be visualized on the overhead view 1001.
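 Continuing the matching sketch above, the minimization of (Equation 4) and the transformation of the four outermost corners can be delegated to OpenCV; the RANSAC option and its threshold are assumptions, used here to reject mismatched feature pairs.

```python
import cv2
import numpy as np

# Coordinates of the associated feature points (from the matching sketch above).
src = np.float32([kp_f[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp_o[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# Estimate H* by minimizing the reprojection error of (Equation 4); RANSAC
# additionally discards outlier correspondences.
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Transform only the outermost pixels: the four corners of the frame.
h, w = frame.shape[:2]
frame_corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
range_on_overhead = cv2.perspectiveTransform(frame_corners, H)

# Connecting the four transformed points visualizes the shooting range.
vis = cv2.cvtColor(overhead, cv2.COLOR_GRAY2BGR)
cv2.polylines(vis, [range_on_overhead.astype(np.int32)], True, (0, 0, 255), 2)
```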
 This processing is performed each time the posture of the camera 103a of the work terminal 103 changes as the video 1002 is received, and the position of the shooting range visualized on the overhead view 1001 changes accordingly each time. The position of the shooting range may likewise be updated when the instruction device 108 is referring to the three-dimensional model data 404 or to the two-dimensional data, in the same way as when it is referring to the overhead view 1001.
 The above is the description of the processing performed by the instruction device 108.
 The instruction device 108 can display a whole image including an image that allows the user to grasp the three-dimensional shape of the work object shot by the work terminal 103, and can display, superimposed on the whole image, information indicating which part of the work object is captured in the captured video. Accordingly, the instructor 107 can use the instruction device 108 to give work instructions after grasping the whole of the work object 102, so that discrepancies in recognition are less likely to arise between the instructor 107 and the worker 101.
 Next, the configuration of the work terminal 103 and its specific processing contents will be described.
 <Configuration of the work terminal>
 The block configuration of the work terminal 103 will be described with reference to FIG. 11. The work terminal 103 has a communication unit 1101, a video composition unit 1102, a display unit 1103, an external input/output unit 1104, a storage unit 1105, a video acquisition unit 1106, a control unit 1107, and a data bus 1100.
 The work terminal 103 differs from the block configuration of the instruction device 108 in FIG. 3 in that it includes the video acquisition unit 1106 and in that there is no shooting range calculation unit 306. The other configurations are the same in both the work terminal 103 and the instruction device 108. That is, the communication unit 1101 has functions equivalent to the communication unit 301, the video composition unit 1102 to the video composition unit 302, the display unit 1103 to the display unit 303, the external input/output unit 1104 to the external input/output unit 304, the storage unit 1105 to the storage unit 305, the control unit 1107 to the control unit 307, and the data bus 1100 to the data bus 300.
 <Processing flow of the work terminal>
 The processing flow executed by the work terminal 103 will be described with reference to FIG. 12.
 Step 21 (S21):
 The video acquisition unit 1106 acquires a video of the work object 102 using the camera 103a described above and outputs the acquired video signal to the control unit 1107 and the video composition unit 1102. The control unit 1107 then advances the processing to Step 22.
 Step 22 (S22):
 Upon receiving the video signal from the video acquisition unit 1106, the control unit 1107 shapes the data so that it can be communicated, applies encoding suited to moving-image communication, and transmits the video signal to the instruction device 108 via the communication unit 1101. The control unit 1107 then advances the processing to Step 23.
 Step 23 (S23):
 Upon receiving instruction mark information from the instruction device 108 via the communication unit 1101, the control unit 1107 outputs the instruction mark information to the video composition unit 1102. The control unit 1107 then advances the processing to Step 24.
 Step 24 (S24):
 Based on the instruction mark information received from the control unit 1107, the video composition unit 1102 composites the instruction mark 105 onto the video shot in Step 21 and outputs the composite result to the display unit 1103. The control unit 1107 then advances the processing to Step 25.
 Step 25 (S25):
 The display unit 1103 displays the composite result with the instruction mark 105 received from the video composition unit 1102. The control unit 1107 then advances the processing to Step 26.
 Step 26 (S26):
 The control unit 1107 determines whether to continue the processing of the work terminal 103. If it determines that the processing is to be continued, the processing returns to Step 21 and the same processing is executed repeatedly. If the control unit 1107 determines that the processing is not to be continued, it terminates all processing.
 以上が、作業端末103で行われる処理に関する説明となる。 The above is the description regarding the processing performed on the work terminal 103.
 一連の処理を実行することにより、本実施形態に係る作業端末103は、作業対象102の少なくとも一部を含む映像を撮影して指示装置108へ送信する。そして、作業端末103は、指示装置108にて入力された指示内容を示す指示マーク情報を受信し、該指示マーク情報を指示マーク105として視覚化して表示することができる。従って、作業者101は作業端末103を用いて、作業対象102を撮影した映像内に表示された指示マーク105を確認することが可能となるため、指示者107と作業者101との間で認識に齟齬が生じにくい作業を行うことができる。 By executing a series of processes, the work terminal 103 according to the present embodiment shoots an image including at least a part of the work object 102 and transmits the image to the instruction device 108. Then, the work terminal 103 can receive the instruction mark information indicating the instruction content input by the instruction device 108, and can visualize and display the instruction mark information as the instruction mark 105. Accordingly, the worker 101 can check the instruction mark 105 displayed in the video obtained by photographing the work target 102 using the work terminal 103, and therefore, the worker 101 can recognize the instruction mark 107 and the worker 101. It is possible to perform work that is difficult to cause wrinkles.
Through the above processing, the remote communication system 1 according to the present embodiment visualizes and displays the shooting range of the video captured by the work terminal 103 on the abstract data held by the instruction device 108. This allows the instructor 107 to grasp the whole of the work target 102 before giving work instructions with instruction marks. The system thus has the effect that discrepancies in the understanding of instruction content are unlikely to arise between the instructor 107 and the worker 101.
(Second Embodiment)
Another embodiment of one aspect of the present invention is described below. This embodiment describes a method by which the remote communication system 1 draws the shooting range as a three-dimensional surface in the three-dimensional model space.
The first embodiment is a method that produces an appropriate display when part of the work target 102 fills the entire range captured by the worker 101. That is, the method described in the first embodiment draws the shooting range to match the shape of the work target 102 using distance values from the optical axis origin of the camera 103a to the work target 102. However, when background information appears in the captured image in addition to the work target 102, the coordinate position (Z value) for drawing the shooting range is not uniquely determined, which makes drawing difficult. This is because the coordinate position (Z value) corresponding to the depth from the optical axis origin of the camera 103a to the background is not constant.
Moreover, when the boundary of the shooting range falls on the background, the shooting-range information (the drawn line) is split between the shooting range on the work target 102 and the shooting range on the background. As a result, it becomes difficult for the instructor 107 to view the shooting range as a continuous region on the abstract data.
The remote communication system 1 according to this embodiment therefore addresses the above problem by drawing a surface representing the shooting range at the distance L from the optical axis origin of the camera 103a to the AR marker 403, rather than drawing the shooting range to match the target's shape using distance values to the work target. That is, this embodiment differs from the previous embodiment in that the shooting range display unit 312 displays a plane (shooting range 1303) indicating the shooting range of the captured video (the video 408 on the instruction device 108 and the live video 400 on the work terminal 103) superimposed on the whole image (for example, the three-dimensional model data 404, the two-dimensional data 406, or the overhead view 407).
Specifically, the Z-value part of (Equation 1) is fixed at L, giving (Equation 1') below.
(Equation 1') [equation image JPOXMLDOC01-appb-M000005: the projective transformation of Equation 1 with its Z value fixed at the distance L; not reproduced as text here]
The shooting range obtained in this way is a plane parallel to the projection plane of the camera 103a whose size changes according to the distance to the AR marker.
FIG. 13 schematically shows the result of drawing the shooting range based on the coordinate values obtained with the projective transformation of (Equation 1'). In the three-dimensional model space 401, the shooting range of the video 1301 captured by the work terminal 103 is drawn at the distance L from the viewpoint 1302 of the instructor 107 to the AR marker 403' on the three-dimensional model data 404, which corresponds to the AR marker 403 on the work target 102. In the illustrated example, the shooting range is drawn as 1303, parallel to the projection plane of the work terminal 103. The information used to calculate the drawing position of the shooting range 1303 is not limited to the distance value L from the viewpoint 1302 to the position where the AR marker 403' is attached; any distance value representing a specific surface may be used. For example, a representative surface of the work target 102, or the surface of the work target 102 closest to the camera 103a, may be extracted and the distance to it used instead.
With the above method, the surface representing the shooting range is uniquely determined, so the instruction device 108 can present the instructor 107 with video in which the shooting range is visualized regardless of whether a background is present.
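As a minimal sketch of the fixed-depth plane, assuming a pinhole camera model with intrinsic matrix K (the function and variable names are illustrative; the disclosure's actual Equation 1' is an image not reproduced here), the four corners of the shooting-range plane can be obtained by back-projecting the image corners to depth Z = L:

```python
import numpy as np

def shooting_range_plane(K, L, width, height):
    """Back-project the image corners to depth Z = L (hedged sketch of the Equation 1' idea).

    K: 3x3 camera intrinsic matrix; L: distance to the AR marker;
    width, height: image size in pixels. Returns a (4, 3) array of plane corners.
    """
    K_inv = np.linalg.inv(K)
    corners_px = np.array([[0, 0, 1],
                           [width, 0, 1],
                           [width, height, 1],
                           [0, height, 1]], dtype=float)
    # Each ray K^-1 * pixel has Z = 1; scaling every ray so its Z component
    # equals L yields a plane parallel to the camera's projection plane.
    rays = (K_inv @ corners_px.T).T
    return rays * (L / rays[:, 2:3])
```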
Through the above processing, the instruction device 108 according to this embodiment can display a plane indicating the shooting range of the video captured by the work terminal 103 superimposed on the whole image. The instruction device 108 can therefore provide a remote communication system 1 in which discrepancies in the understanding of instruction content are unlikely to arise between the instructor 107 and the worker 101 even when the video captured by the work terminal 103 includes background.
(Third Embodiment)
A third embodiment of one aspect of the present invention is described below. The remote communication system 1 according to this embodiment differs from the preceding embodiments in that it uses multiple AR markers to extend the range in which the shooting range can be drawn.
First, in the third embodiment, multiple AR markers are attached to the work target 102, and the work terminal 103 captures the work target 102 with the corresponding marker positions designated at multiple points on the three-dimensional model data 404.
Next, using the ID (identifier) assigned to each AR marker to distinguish individual markers, the pose of the work terminal 103 is calculated for each ID with (Equation 1) and (Equation 1'). In the first and second embodiments, shooting had to be done so that the single AR marker always appeared on screen, which greatly restricted the range over which the work terminal 103 could move. With the method of this embodiment, the worker 101 only needs to shoot so that at least one of the multiple AR markers appears in the video of the work terminal 103, which improves the freedom of work.
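A hedged sketch of per-ID pose estimation using OpenCV's ArUco module follows; this is one possible implementation, not the one prescribed by the disclosure. The API shown is the pre-4.7 contrib interface, and the marker dictionary and physical marker size are assumptions.

```python
import cv2
import numpy as np

def poses_by_marker_id(frame, camera_matrix, dist_coeffs, marker_length=0.05):
    """Estimate the terminal's pose relative to every visible AR marker, keyed by ID."""
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    corners, ids, _ = cv2.aruco.detectMarkers(frame, dictionary)
    poses = {}
    if ids is not None:
        rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
            corners, marker_length, camera_matrix, dist_coeffs)
        for marker_id, rvec, tvec in zip(ids.flatten(), rvecs, tvecs):
            poses[int(marker_id)] = (rvec, tvec)  # rotation / translation per marker ID
    return poses
```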
An example of applying the method of this embodiment to the drawing described in the first embodiment is explained with reference to FIG. 14. Assume that AR markers 1402 and 1403 are attached to the work target 1401, and that 1402' and 1403' have been input as the corresponding reference positions on the three-dimensional model data 404. If the video 1404 is captured, the shooting range is drawn at position 1406 in the three-dimensional model space; if the video 1405 is captured, the shooting range is drawn at position 1407. When multiple AR markers appear at the same time, the shooting range may be calculated from the pose of the work terminal 103 computed with respect to any one of them. That is, in this embodiment, as long as the video captured by the work terminal 103 (video 1404, video 1405) includes at least one of the multiple AR markers (AR marker 1402, AR marker 1403) attached to the work target 1401, the instruction device 108 can display a whole image containing an image for the user to grasp the three-dimensional shape of the work target 1401 (for example, the three-dimensional model data 404, the two-dimensional data 406, or the overhead view 407), and can display, superimposed on that whole image, information indicating which part of the work target 1401 is captured in the video imaged by the work terminal 103.
When multiple AR markers are captured in the second embodiment, the drawing position is calculated as if there were a single virtual AR marker whose coordinates are the centroid of the markers' coordinates and whose distance is the average of the distances to the markers. Specifically, let the coordinates of the AR markers be P_pi(X_pi, Y_pi, Z_pi) (where pi is the marker index, pi = 1 to N) and the distance from the viewpoint to each AR marker be L_pi; then the position P_v(X_v, Y_v, Z_v) of the virtual AR marker is given by
P_v(X_v, Y_v, Z_v) = ((1/N) Σ X_pi, (1/N) Σ Y_pi, (1/N) Σ Z_pi), with L_v = (1/N) Σ L_pi, where each sum runs over pi = 1 to N ... (Equation 5)
[reconstructed from the surrounding description; the original equation image is JPOXMLDOC01-appb-M000006]
Using the virtual marker position obtained from (Equation 5) together with (Equation 1'), the instruction device 108 can calculate the shooting plane from the coordinate position of the virtual AR marker when multiple AR markers are present, which widens the range over which work can be done.
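A minimal sketch of the Equation 5 computation (the function and parameter names are illustrative):

```python
import numpy as np

def virtual_marker(marker_positions, marker_distances):
    """Centroid of the marker coordinates and mean viewpoint distance (Equation 5 sketch).

    marker_positions: (N, 3) array of P_pi; marker_distances: length-N array of L_pi.
    """
    P_v = np.mean(np.asarray(marker_positions), axis=0)
    L_v = float(np.mean(marker_distances))
    return P_v, L_v
```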
Through the above processing, for a work target 1401 provided with multiple AR markers (AR marker 1402, AR marker 1403), the remote communication system 1 according to this embodiment visualizes and displays on the abstract data (for example, as the positions 1406 and 1407 illustrated in FIG. 14) the shooting range of video captured by the work terminal 103 that contains at least one AR marker (video 1404, video 1405). This allows the instructor 107 to grasp the whole of the work target 1401 before giving work instructions with instruction marks. The system thus has the effect that, anywhere at least one AR marker can be kept in the video, discrepancies in the understanding of instruction content are unlikely to arise between the instructor 107 and the worker 101.
(Fourth Embodiment)
This embodiment describes a mode in which auxiliary information indicating, for example, the orientation of the work terminal 103 is drawn in addition to the shooting range. FIG. 15 shows drawing examples of the shooting range and auxiliary information on the three-dimensional model data 404 in the three-dimensional model space 401, which is one example of abstract data.
Drawing example 1501, which concerns the orientation of the work terminal 103, shows visual information indicating the orientation of the work terminal 103 composited and displayed in accordance with the rotation of the shooting range 405a. In this example, orientation information 1505 is drawn within the shooting range in the three-dimensional model space 401 as auxiliary information representing the upward direction of the work terminal 103. That is, in this embodiment, the shooting range display unit 312 can display information (orientation information 1505) indicating the orientation of the work terminal 103 when it imaged the work target 102 (work object).
In drawing example 1501, it is clear that the work terminal 103 is being held vertically, but without the orientation information 1505 the instructor 107 cannot tell whether the result reflects a rotation to the left or to the right. This orientation information 1505 is useful when instructing in which direction the work terminal 103 should be moved. The rotation of the work terminal 103 can be calculated using a tilt sensor provided in the work terminal 103. Instead of the orientation information 1505, another method may be used for the shooting range 405a, such as giving the upper boundary line a color different from that of the other three boundary lines.
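As one hedged way to obtain such orientation information, the roll of a handheld terminal can be approximated from the gravity components reported by its tilt (acceleration) sensor; the axis convention below is an assumption, not something specified by the disclosure.

```python
import math

def roll_from_tilt_sensor(ax, ay):
    """Approximate the terminal's rotation about its viewing axis, in degrees.

    ax, ay: gravity components along the screen's x and y axes, as reported by
    the terminal's tilt sensor (axis convention assumed for illustration).
    """
    return math.degrees(math.atan2(ax, ay))
```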
Drawing example 1502 shows a case in which, when the three-dimensional model data 404 lies within the shooting range, a texture 1506 from the live video 400 captured by the work terminal 103 is pasted as auxiliary information along the shape of the model, or onto the surface of the shooting range 405a. The abstract data held by the instructor 107 and the work target 102 actually being captured by the worker 101 do not necessarily match in characteristics such as shape, color, and size; for example, they may differ because of various factors such as the environment in which the work target 102 is placed or the passage of time. In such cases, superimposing the information of the work target 102 captured by the worker 101 onto the three-dimensional model space 401 lets the instructor 107 confirm the difference between the abstract data and the real object (work target 102) and give instructions that reflect changes in the shape of the work target 102.
The remote communication system 1 according to the preceding embodiments has no means of determining where the worker 101 is currently looking. For example, only when the captured video is superimposed on the shooting range 405a and the worker 101 is seen working does the instructor 107 realize that the worker 101 may be continuing to work without checking the screen of the work terminal 103. Conversely, when the worker 101 is not seen working, the instructor 107 may know that the worker is holding the work terminal 103 and listening to the explanation, but cannot be sure the worker is watching the work terminal 103. The instruction mark information placed by the instructor 107 may therefore be missed. One solution, shown in drawing example 1503, is to visualize the gaze information 1507 of the worker 101 on the three-dimensional model space 401 as auxiliary information, so that the instructor can place instruction marks at an appropriate time. When the gaze information 1507 lies within the shooting range 405a, the worker 101 may be judged to be watching the screen, making it easier to know when an instruction mark placed by the instructor 107 will be visible. In this example the gaze information 1507 is rendered as a circle, but the shape and color are not limited; the size may be gradually increased while one point is watched for a long time, or the instruction mark may follow the gaze position.
As a method of acquiring the gaze information of the worker 101, another camera that photographs the worker 101, provided on the display unit 103b side of the work terminal 103 (not shown), can be used. Gaze information can be obtained by applying general-purpose tools to that camera's video, such as OpenGazer, an open-source gaze-tracking library, or the EyeX dev kit, an eye-tracker tool provided by Tobii Technology. The equipment required for this can also be incorporated into the configuration of the present invention.
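A small sketch of the gaze check described above (purely illustrative; the gaze coordinates are assumed to come from an external tracker such as the tools named above, and the rectangle representation of the shooting range is an assumption):

```python
def gaze_within_range(gaze_xy, range_rect):
    """Return True when the worker's gaze point lies inside the shooting range.

    gaze_xy: (x, y) from a gaze tracker; range_rect: (x_min, y_min, x_max, y_max).
    """
    x, y = gaze_xy
    x_min, y_min, x_max, y_max = range_rect
    return x_min <= x <= x_max and y_min <= y <= y_max
```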
In drawing example 1504a, the instructor 107 designates a specific position on the shooting range 405a so that electronic object information 1508a, which plays an auxiliary role, is superimposed as auxiliary information on part or all of the shooting range 405a. The electronic object information 1508a can be used as a guide for the instructor 107 when giving instructions, or to present the worker 101 with material that supplements the instruction content. As in drawing example 1504b, the electronic object information 1508b may also be drawn not only on the three-dimensional model space 401 but also on the screen of the work terminal 103, or on both. This electronic object information may take any form expressible in two dimensions, such as text, an image appropriate to the instruction content, a video, a blueprint, or a map. Three-dimensional information may also be drawn, such as a reduced view of the three-dimensional model data or model data of the work location or of the part being worked on.
Through the above processing, the remote communication system 1 according to this embodiment visualizes and displays, together with auxiliary information, the shooting range of the video captured by the work terminal 103 (the video 408 on the instruction device 108 and the live video 400 on the work terminal 103) on the abstract data held by the instruction device 108. This allows the instructor 107 to grasp the whole of the work target 102 (for example, the three-dimensional model data 404, the two-dimensional data 406, or the overhead view 407) together with the auxiliary information before giving work instructions with instruction marks (instruction information; the instruction mark 111 on the instruction device 108 and the instruction mark 105 on the work terminal 103). The system thus has the effect that discrepancies in the understanding of instruction content are unlikely to arise between the instructor 107 and the worker 101.
[Fifth Embodiment]
In each of the embodiments described above, a worker is present on the work terminal side and performs work based on instruction information from the instructor displayed on the work terminal. The present invention is not limited to this mode, however. In this embodiment, the work terminal photographs the work object, and the work terminal itself also performs the work based on the instruction information.
In this embodiment, an autonomously moving robot (for example, a self-driving or self-flying robot) can serve as the work terminal. That is, this embodiment illustrates a remote communication system that includes a self-driving robot and the instruction device 108 of the first embodiment described above.
The robot differs from the work terminal 103 of the first embodiment in that it includes a working unit in addition to the components of the work terminal 103 of the first embodiment.
The robot also differs from the work terminals of the embodiments described above in that it does not require a configuration for visually presenting the instruction information transmitted from the instruction device. It suffices that, upon receiving instruction information, the robot's working unit performs work based on the content of that information.
The working unit includes a travel control unit that controls travel of the robot, and may further include, for example, a work arm and a control unit for the work arm.
The instruction information transmitted from the instruction device 108 to the robot includes, for example, the movement position, posture, and work content of the robot's working unit. In short, the information that the worker on the work terminal side would need to perform the work in the first embodiment is transmitted from the instruction device 108 to the robot as instruction information for the working unit.
According to the remote communication system of this embodiment, the shooting range of the video captured by the robot is visualized and displayed on the abstract data held by the instruction device 108, allowing the instructor 107 to grasp the whole of the work target 102 before issuing instruction information with instruction marks. This has the effect of providing a remote communication system in which discrepancies in the understanding of instruction content are unlikely to arise between the instructor 107 and the robot.
[Modifications]
In each of the embodiments above, the configurations illustrated in the accompanying drawings are merely examples; they are not limiting, and may be changed as appropriate within a range that achieves the effects of one aspect of the present invention. The embodiments may otherwise be modified as appropriate without departing from the scope of the object of one aspect of the present invention.
Any method may be used to calculate the pose of the work terminal 103 in the embodiments above, as long as it can be associated with the coordinate system of the abstract data and can compute the pose. For example, SFM (Structure from Motion) can be used to obtain the relative pose of the work terminal 103 while simultaneously reconstructing the surroundings in three dimensions, and the resulting 3D reconstruction can then be aligned with the abstract data. For the alignment, for example, ICP (Iterative Closest Point) can be used.
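As a hedged illustration of the alignment step, the following sketch uses the open-source Open3D library's point-to-point ICP; this is one possible tool, not one prescribed by the disclosure, and the point clouds and correspondence threshold are assumptions.

```python
import numpy as np
import open3d as o3d

def align_reconstruction(sfm_points, model_points, threshold=0.05):
    """Align an SfM reconstruction to the abstract model data with ICP (illustrative)."""
    source = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(np.asarray(sfm_points)))
    target = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(np.asarray(model_points)))
    result = o3d.pipelines.registration.registration_icp(
        source, target, threshold, np.eye(4),
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation  # 4x4 rigid transform into the abstract data's frame
```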
The information visible to the instructor 107 in each embodiment may also be presented in a form that displays more than the abstract data at the same time. For example, the video captured by the work terminal 103 and the abstract data of the work target 102 may be displayed side by side, or the abstract data may be superimposed on the video. Further, the instruction mark information may be composited onto the captured video, the abstract data, or both. In this case, the shooting range display unit 312 can display the captured video (the video 408 on the instruction device 108 and the live video 400 on the work terminal 103) at the part of the whole image of the work target 102 (for example, the three-dimensional model data 404, the two-dimensional data 406, or the overhead view 407) corresponding to the part captured in that video.
In the description of the embodiments above, the components that realize the functions are described as distinct parts, but a device need not actually have parts that can be separated and recognized this clearly. The remote communication system 1 that realizes the functions of the embodiments above may implement the components using actually distinct parts, or all components may be mounted on a single LSI (Large Scale Integration) circuit. That is, any implementation form suffices as long as the components are present as functions. The components of one aspect of the present invention may also be selected or discarded arbitrarily, and inventions having such selected configurations are also included in one aspect of the present invention.
The processing of each unit may also be performed by recording a program for realizing the functions described in the embodiments above on a computer-readable recording medium, loading the program recorded on that medium into a computer system, and executing it. The "computer system" here includes an OS (Operating System) and hardware such as peripheral devices.
The "computer system" also includes a website-providing environment (or display environment) when a WWW (World Wide Web) system is used.
A "computer-readable recording medium" means a portable medium such as a flexible disk, magneto-optical disk, ROM, or CD-ROM, or a storage device such as a hard disk built into a computer system. A "computer-readable recording medium" further includes media that hold a program dynamically for a short time, such as a communication line when a program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold a program for a fixed time, such as volatile memory inside a computer system serving as a server or client in that case. The program may realize only part of the functions described above, or may realize those functions in combination with a program already recorded in the computer system.
[Summary]
An instruction device (108) according to aspect 1 of the present invention is an instruction device (108) that shares with a work terminal (103) a captured video (live video 400 and video 408) of a work object (102) imaged by the work terminal (103) and an image showing information related to the work, displayed superimposed on the captured video (instruction mark 105 and instruction mark 111). The instruction device includes a whole image display unit (311) that displays a whole image (video 110) containing an image for the user to grasp the three-dimensional shape of the work object (102), and a shooting range display unit (312) that displays, superimposed on the whole image, information (shooting range 405a, shooting range 405b, shooting range 405c) indicating which part of the work object (102) is captured in the captured video.
With the above configuration, the instruction device displays a whole image containing an image for the user to grasp the three-dimensional shape of the work object, and can display, superimposed on that whole image, information indicating which part of the work object is captured in the video imaged by the work terminal. By superimposing the shooting range of the work terminal's video on the whole image of the work object, the instructor can grasp the whole work target before giving work instructions. This has the effect of providing a remote communication system in which discrepancies in the understanding of instruction content are unlikely to arise between the instructor and the worker.
In the instruction device (108) according to aspect 2 of the present invention, in aspect 1, the shooting range display unit (312) displays a plane (shooting range 1303) indicating the shooting range of the captured video superimposed on the whole image.
With the above configuration, the instruction device displays a plane indicating the shooting range of the video captured by the work terminal superimposed on a specific plane of the whole image of the work object. This has the effect of providing a remote communication system in which discrepancies in the understanding of instruction content are unlikely to arise between the instructor and the worker even when the video captured by the work terminal includes background.
In the instruction device (108) according to aspect 3 of the present invention, in aspect 1 or 2, the shooting range display unit (312) displays the captured video at the part of the whole image corresponding to the part captured in the captured video.
With the above configuration, the instruction device can display the captured video at the part of the whole image of the work object corresponding to the part captured in the video imaged by the work terminal. The instructor can therefore give work instructions while comparing the live video captured by the work terminal with the whole image of the work object, which has the effect of providing a remote communication system in which discrepancies in the understanding of instruction content are unlikely to arise between the instructor and the worker.
In the instruction device (108) according to aspect 4 of the present invention, in any one of aspects 1 to 3, the shooting range display unit (312) displays information (orientation information 1505) indicating the orientation of the work terminal (103) when the work terminal (103) imaged the work object (102).
With the above configuration, the instruction device can display on the whole image of the work object, in addition to the shooting range of the video captured by the work terminal, information indicating the orientation of the work terminal when it imaged the work object. Using the shooting range together with the orientation information has the effect of providing a remote communication system in which discrepancies in the understanding of instruction content are unlikely to arise between the instructor and the worker.
In the instruction device (108) according to aspect 5 of the present invention, in any one of aspects 1 to 4, the whole image display unit (311) displays the whole image using multiple parameters that specify the three-dimensional shape of the work object (102).
With the above configuration, the instruction device can display the whole image of the work target using multiple parameters that specify the three-dimensional shape of the work object. The instructor can therefore grasp the whole image displayed using those parameters before giving work instructions, which has the effect of providing a remote communication system in which discrepancies in the understanding of instruction content are unlikely to arise between the instructor and the worker.
In the instruction device (108) according to aspect 6 of the present invention, in any one of aspects 1 to 5, the whole image display unit (311) displays the whole image using data representing a three-dimensional model of the work object (102) (three-dimensional model data 404) or data obtained by developing the three-dimensional model onto a two-dimensional plane (development view 905, three-view drawing 906).
With the above configuration, the instruction device displays the whole image of the work object using data representing its three-dimensional model or data obtained by developing that model onto a two-dimensional plane. The instructor can therefore grasp the whole image of the work object before giving work instructions, which has the effect of providing a remote communication system in which discrepancies in the understanding of instruction content are unlikely to arise between the instructor and the worker.
In the instruction device (108) according to aspect 7 of the present invention, in any one of aspects 1 to 5, the whole image includes an image (overhead view 407) showing the situation around the work object.
With the above configuration, the instruction device can display the shooting range of the video captured by the work terminal superimposed on an image showing the situation around the work object. The instructor can therefore grasp the whole image of the work object before giving work instructions, which has the effect of providing a remote communication system in which discrepancies in the understanding of instruction content are unlikely to arise between the instructor and the worker.
A method of controlling an instruction device (108) according to aspect 8 of the present invention is a method of controlling an instruction device (108) that shares with a work terminal (103) a captured video of a work object (102) imaged by the work terminal (103) and an image showing information related to the work, displayed superimposed on the captured video. The method includes a whole image display step (S131) of displaying a whole image containing an image for the user to grasp the three-dimensional shape of the work object (102), and a shooting range display step (S132) of displaying, superimposed on the whole image, information indicating which part of the work object (102) is captured in the captured video. This configuration provides the same effects as aspect 1.
The instruction device (108) according to each aspect of the present invention may be realized by a computer. In that case, a control program for the instruction device (108) that realizes the instruction device (108) on a computer by causing the computer to operate as the units (software elements) of the instruction device (108), and a computer-readable recording medium on which that program is recorded, also fall within the scope of the present invention.
One aspect of the present invention is not limited to the embodiments described above; various modifications are possible within the scope of the claims, and embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of one aspect of the present invention. Furthermore, new technical features can be formed by combining the technical means disclosed in the embodiments.
(Cross-Reference to Related Applications)
This application claims the benefit of priority to Japanese Patent Application No. 2016-144912 filed on July 22, 2016, the entire contents of which are incorporated herein by reference.
1 Remote communication system
102 Work target (work object)
103 Work terminal (first terminal)
105 Instruction mark (an image showing information related to the work, displayed superimposed on the captured video)
108 Instruction device (second terminal)
110 Video (an image containing the whole image of the work object or a partial image thereof)
111 Instruction mark (an image showing information related to the work, displayed superimposed on the captured video)
302 Video synthesis unit
306 Shooting range calculation unit
311 Whole image display unit (video synthesis unit)
312 Shooting range display unit (video synthesis unit)
400 Live video (captured video)
404 Three-dimensional model data
405a, 405b, 405c Shooting ranges
406 Two-dimensional data
407 Overhead view
408 Video (captured video)
905 Development view
906 Three-view drawing
1303 Shooting range (plane indicating the shooting range of the captured video)
1401 Work target
1505 Orientation information (auxiliary information)

Claims (12)

1. An instruction device comprising:
a receiving unit that receives a captured video of a work object imaged by a work terminal;
a transmitting unit that transmits instruction information to the work terminal; and
a video synthesis unit that causes a display unit to display information indicating the shooting range of the captured video superimposed on an image containing the whole image of the work object or a partial image thereof.
2. The instruction device according to claim 1, wherein the video synthesis unit causes the display unit to display the information indicating the shooting range of the captured video superimposed at a position, in the image containing the whole image of the work object or a partial image thereof, corresponding to that shooting range.
3. The instruction device according to claim 1 or 2, wherein the video synthesis unit causes the display unit to display a plane indicating the shooting range of the captured video superimposed on the image containing the whole image or a partial image thereof.
4. The instruction device according to claim 2, wherein the video synthesis unit causes the display unit to display the captured video superimposed at said position in the image containing the whole image or a partial image thereof.
5. The instruction device according to any one of claims 1 to 4, wherein the video synthesis unit displays information indicating the orientation of the work terminal.
6. The instruction device according to claim 1, wherein the video synthesis unit generates the image containing the whole image or a partial image thereof using multiple parameters that specify the three-dimensional shape of the work object.
7. The instruction device according to claim 1, wherein the image containing the whole image or a partial image thereof is an image showing a three-dimensional model of the work object, or an image showing a development view of the three-dimensional model developed onto a two-dimensional plane.
8. The instruction device according to any one of claims 1 to 6, wherein the image containing the whole image or a partial image thereof is an image showing the situation around the work object.
9. The instruction device according to any one of claims 1 to 8, further comprising the display unit.
10. A remote work support system comprising:
a work terminal that photographs a work object; and
an instruction device having a receiving unit that receives a captured video of the work object imaged by the work terminal and a transmitting unit that transmits instruction information to the work terminal,
the system comprising a video synthesis unit that causes a display unit on the instruction device side to display information indicating the shooting range of the captured video superimposed on an image containing the whole image of the work object or a partial image thereof.
11. A method of controlling an instruction device, comprising:
a receiving step of receiving a captured video of a work object imaged by a work terminal;
a transmitting step of transmitting instruction information to the work terminal; and
a video synthesis step of causing a display unit to display information indicating the shooting range of the captured video superimposed on an image containing the whole image of the work object or a partial image thereof.
12. An information processing program for causing a computer to function as the instruction device according to any one of claims 1 to 9, the information processing program causing the computer to function as the video synthesis unit.

PCT/JP2017/026726 2016-07-22 2017-07-24 Instructing device, method of controlling instructing device, remote operation support system, and information processing program WO2018016655A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016-144912 2016-07-22
JP2016144912 2016-07-22

Publications (1)

Publication Number Publication Date
WO2018016655A1 true WO2018016655A1 (en) 2018-01-25

Family

ID=60993026

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/026726 WO2018016655A1 (en) 2016-07-22 2017-07-24 Instructing device, method of controlling instructing device, remote operation support system, and information processing program

Country Status (1)

Country Link
WO (1) WO2018016655A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008141280A (en) * 2006-11-30 2008-06-19 Fuji Xerox Co Ltd Remote instruction system
JP2014079824A (en) * 2012-10-15 2014-05-08 Toshiba Corp Work screen display method and work screen display device
JP2014106888A (en) * 2012-11-29 2014-06-09 Brother Ind Ltd Work assistance system, and program


Similar Documents

Publication Publication Date Title
JP6230113B2 (en) Video instruction synchronization method, system, terminal, and program for synchronously superimposing instruction images on captured moving images
WO2018235163A1 (en) Calibration device, calibration chart, chart pattern generation device, and calibration method
JP6846661B2 (en) Projection methods and devices for three-dimensional projection objects
JP2016171463A (en) Image processing system, image processing method, and program
WO2017013986A1 (en) Information processing device, terminal, and remote communication system
JPWO2021076757A5 (en)
JP6192107B2 (en) Video instruction method, system, terminal, and program capable of superimposing instruction image on photographing moving image
JP6521352B2 (en) Information presentation system and terminal
TWI615808B (en) Image processing method for immediately producing panoramic images
JP6359333B2 (en) Telecommunications system
US11043019B2 (en) Method of displaying a wide-format augmented reality object
JP2018033107A (en) Video distribution device and distribution method
JP2006318015A (en) Image processing device, image processing method, image display system, and program
JP6146869B2 (en) Video instruction display method, system, terminal, and program for superimposing instruction image on photographing moving image synchronously
WO2018016655A1 (en) Instructing device, method of controlling instructing device, remote operation support system, and information processing program
JP5864371B2 (en) Still image automatic generation system, worker information processing terminal, instructor information processing terminal, and determination device in still image automatic generation system
US11758101B2 (en) Restoration of the FOV of images for stereoscopic rendering
JP2005142765A (en) Apparatus and method for imaging
JP5326816B2 (en) Remote conference system, information processing apparatus, and program
JP7225016B2 (en) AR Spatial Image Projection System, AR Spatial Image Projection Method, and User Terminal
JP2018032991A (en) Image display unit, image display method and computer program for image display
JP6156930B2 (en) Video instruction method, system, terminal, and program capable of superimposing instruction image on photographing moving image
JP6242009B2 (en) Video transfer system, terminal, program, and method for displaying a shooting area frame superimposed on a wide area image
JP2016071496A (en) Information terminal device, method, and program
JP7417827B2 (en) Image editing method, image display method, image editing system, and image editing program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17831171

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

NENP Non-entry into the national phase

Ref country code: JP

122 Ep: pct application non-entry in european phase

Ref document number: 17831171

Country of ref document: EP

Kind code of ref document: A1