WO2021114824A1 - Presentation generation method, apparatus, and device, and medium - Google Patents

Presentation generation method, apparatus, and device, and medium Download PDF

Info

Publication number
WO2021114824A1
WO2021114824A1 (PCT/CN2020/118118)
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
file
feature information
sub
Prior art date
Application number
PCT/CN2020/118118
Other languages
French (fr)
Chinese (zh)
Inventor
谢静文
阮晓雯
徐亮
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021114824A1 publication Critical patent/WO2021114824A1/en

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/438 Presentation of query results
    • G06F16/4387 Presentation of query results by the use of playlists
    • G06F16/4393 Multimedia presentations, e.g. slide shows, multimedia albums
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30176 Document

Definitions

  • This application relates to the field of computer technology, and in particular to a method, device, equipment and medium for generating a presentation.
  • Presentations are widely used in many aspects of social life; for example, in daily work, study, and technical communication, people usually give speeches with the aid of a presentation.
  • To keep the speech synchronized with the slides, a user can record the presentation, for example by photographing each slide as it is shown.
  • However, the slide images obtained this way must be manually edited by the user to restore them to an editable presentation, so the efficiency of restoring the presentation is relatively low.
  • The embodiments of the present application provide a method for generating a presentation that obtains the text feature information and image feature information in a target image through layered processing, and generates the target presentation corresponding to the target image from that information.
  • The restoration process requires no manual participation, which avoids the loss of efficiency caused by manual editing.
  • In a first aspect, an embodiment of the present application provides a presentation generation method. The method includes: obtaining a file to be processed, where the file to be processed includes a target image and the target image is generated from a target presentation in a target presentation file; performing text recognition on the target image to obtain text feature information corresponding to the target image; removing the text content from the target image according to the text feature information to obtain a first sub-image corresponding to the target image; performing image recognition on the first sub-image to obtain image feature information corresponding to the first sub-image; and generating a target presentation corresponding to the target image according to the text feature information and the image feature information.
  • In a second aspect, an embodiment of the present application provides a presentation generation device. The device includes: a first acquisition module for acquiring a file to be processed, where the file to be processed is generated from a target presentation in a target presentation file and includes a target image; a first recognition module for performing text recognition on the target image to obtain text feature information corresponding to the target image; a removal module for removing the text content from the target image according to the text feature information to obtain a first sub-image corresponding to the target image; a second recognition module for performing image recognition on the first sub-image to obtain image feature information corresponding to the first sub-image; and a first generation module for generating a target presentation corresponding to the target image according to the text feature information and the image feature information.
  • In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor adapted to implement one or more instructions, and a computer storage medium storing one or more instructions. The one or more instructions are adapted to be loaded by the processor to execute the following steps: obtain a file to be processed, where the file to be processed includes a target image and the target image is generated from a target presentation in a target presentation file; perform text recognition on the target image to obtain text feature information corresponding to the target image; remove the text content from the target image according to the text feature information to obtain a first sub-image corresponding to the target image; perform image recognition on the first sub-image to obtain image feature information corresponding to the first sub-image; and generate a target presentation corresponding to the target image according to the text feature information and the image feature information.
  • In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium that stores one or more instructions. The one or more instructions are suitable for being loaded by a processor to execute the following steps: obtain a file to be processed, where the file to be processed includes a target image and the target image is generated from a target presentation in a target presentation file; perform text recognition on the target image to obtain text feature information corresponding to the target image; remove the text content from the target image according to the text feature information to obtain a first sub-image corresponding to the target image; perform image recognition on the first sub-image to obtain image feature information corresponding to the first sub-image; and generate a target presentation corresponding to the target image according to the text feature information and the image feature information.
  • In the embodiments of the present application, text recognition is performed on the target image to obtain its text feature information;
  • the text content is then removed from the target image to obtain the first sub-image, which prevents the text from interfering with subsequent processing.
  • Image recognition is then performed on the first sub-image to obtain its image feature information, and the target presentation corresponding to the target image is generated from the text feature information of the target image together with the image feature information of the first sub-image. In other words, the text and image feature information in the target image are obtained through layered processing, and the target presentation is generated from them.
  • Because restoring the presentation requires no manual participation, both the efficiency and the accuracy of the restoration can be improved.
  • FIG. 1 is a schematic flowchart of a method for generating a presentation provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a method for obtaining image feature information corresponding to a first sub-image provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of another method for generating a presentation provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a presentation generating device provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an electronic device provided by another embodiment of the present application.
  • Artificial intelligence technology is a comprehensive discipline, covering a wide range of fields, including both hardware-level technology and software-level technology.
  • Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • Computer vision is a science that studies how to make machines "see": it uses cameras and computers in place of human eyes to identify, track, and measure targets, and applies further graphics processing so that the resulting images are better suited for human observation or for transmission to instruments for detection.
  • Computer vision research tries to establish artificial intelligence systems that can obtain information from images or multi-dimensional data.
  • Computer vision technology usually includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, and also includes common biometric technologies such as face recognition and fingerprint recognition.
  • This application relates to image recognition technology in artificial intelligence.
  • In this application, image recognition technology is used to automatically convert images into presentations without manual participation, which can improve the efficiency and accuracy of restoring presentations; the application can be used in fields such as smart government affairs and smart education, helping to promote the construction of smart cities.
  • FIG. 1 is a schematic flowchart of a method for generating a presentation provided by an embodiment of the present application; the method is executed by an electronic device.
  • the method for generating a presentation includes the following steps S101-S105.
  • S101 Acquire a file to be processed, where the file to be processed includes a target image, and the target image is generated according to a target presentation in the target presentation file.
  • In this embodiment, the file to be processed includes a target image generated from the target presentation in a target presentation file. The target presentation may be, for example, the slides corresponding to courseware content that the user views while attending a class or lecture, and the target presentation file includes at least one target presentation.
  • The target image is an image of the target presentation captured by the user during the class or lecture.
  • The file to be processed can be generated from at least one image captured by the user, and the target image is any one of those images.
  • Specifically, the user can upload, in order, the single-page picture file or single-page portable document file corresponding to each target presentation in the target presentation file to the electronic device; the content of each single-page file corresponds to one page of the target presentation, and the upload order of the files corresponds to the order of the target presentations in the target presentation file.
  • The electronic device can obtain and store the upload order, extract the content of the target image corresponding to each single-page file in that order, and restore the target presentation corresponding to each target image.
  • Optionally, a candidate file can be obtained, where the candidate file is generated from the target presentation file.
  • Topic information is obtained from the text content of the candidate file, and the candidate file is segmented according to the topic information to obtain multiple candidate sub-files; image conversion is performed on the candidate sub-files to obtain a multi-frame image, which is used as the file to be processed, and the target image is any frame in it.
  • The candidate file may be a multi-page picture file or multi-page portable document file uploaded by the user and generated from the target presentation file, for example a long image formed by stitching together the frames corresponding to the pages of the target presentation.
  • Text recognition can be performed on the candidate file to obtain its text content, and the topic information in the candidate file is obtained from that text content, for example using an existing topic extraction model such as an LDA topic model.
  • The candidate file is segmented according to the topic information to obtain multiple candidate sub-files; one topic corresponds to one candidate sub-file, and one candidate sub-file contains the content of one page of the target presentation. Image conversion is then performed on the candidate sub-files to obtain multiple frames of images.
  • Each candidate sub-file is converted into one frame of image, and each frame corresponds to the content of one page of the target presentation.
  • The resulting multi-frame image is used as the aforementioned file to be processed, and the target image is any one frame in it.
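  • The topic-based segmentation above can be sketched in Python. The patent names an LDA-style topic extraction model; the heading heuristic below (a line starts a new topic if it is numbered or all-caps) is a deliberately simplified stand-in, and all names are illustrative.

```python
# Toy stand-in for the topic-based segmentation step: a new candidate
# sub-file is started whenever a line looks like a new topic heading
# (numbered, e.g. "2. Method", or a short all-caps title).
import re

def segment_by_topic(lines):
    """Split recognized text lines into candidate sub-files, one per topic."""
    subfiles, current = [], []
    for line in lines:
        is_heading = bool(re.match(r"^(\d+[.)]\s+|[A-Z][A-Z ]{3,}$)", line.strip()))
        if is_heading and current:
            subfiles.append(current)   # close the previous topic's sub-file
            current = []
        current.append(line)
    if current:
        subfiles.append(current)
    return subfiles

pages = segment_by_topic([
    "1. Introduction", "text about the intro",
    "2. Method", "text about the method", "more method text",
])
print(len(pages))  # 2 candidate sub-files, one per topic heading
```

Each returned group plays the role of one candidate sub-file, i.e. one page of the target presentation.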
  • Optionally, the electronic device can obtain a candidate file generated from the target presentation file, and receive a segmentation instruction for the candidate file that includes a size cropping ratio; the candidate file is segmented according to the cropping ratio to obtain a multi-frame image, which is used as the file to be processed, and the target image is any frame in it.
  • Here, too, the candidate file may be a multi-page picture file or multi-page portable document file uploaded by the user,
  • that is, a long image formed by stitching together the frames corresponding to the pages of the target presentation in the target presentation file.
  • Specifically, the size cropping ratio that the user entered for the long image in the candidate file can be obtained from the segmentation instruction, and the long image in the candidate file is segmented according to that ratio.
  • That is, the long image formed by stitching the frames of the target presentation pages is cut back into multiple frames of images, and those frames are used as the file to be processed;
  • the target image is any frame in the file to be processed. After obtaining the entered cropping ratio, the electronic device can also obtain the order in which the user arranged the frames of the file to be processed, sort the frames accordingly, and store them in that order.
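  • The crop-ratio segmentation can be sketched with NumPy. The exact semantics of the "size cropping ratio" are not specified here; the sketch assumes it gives one page's height as a fraction of the long image's height.

```python
# Sketch of the size-cropping-ratio segmentation: the stitched "long image"
# is cut back into equal-height page frames.
import numpy as np

def split_long_image(long_img: np.ndarray, crop_ratio: float):
    """Split a stitched long image (H x W x C) into per-page frames."""
    page_h = max(1, round(long_img.shape[0] * crop_ratio))
    return [long_img[y:y + page_h] for y in range(0, long_img.shape[0], page_h)]

# A fake 3-page long image: 300 rows tall, so each page is 100 rows.
long_img = np.zeros((300, 160, 3), dtype=np.uint8)
frames = split_long_image(long_img, crop_ratio=1 / 3)
print(len(frames), frames[0].shape)  # 3 frames of shape (100, 160, 3)
```

The resulting list of frames plays the role of the file to be processed; any one frame is a target image.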
  • S102 Perform text recognition on the target image to obtain text feature information corresponding to the target image.
  • text recognition is performed on the target image to obtain the text feature information in the target image.
  • Specifically, a general OCR text recognition algorithm can be used to identify the text content in the target image and the location of the block in which the text sits; algorithms such as SIFT and SURF can then be used to recognize the font and font size of that text,
  • so as to obtain the text feature information corresponding to the target image.
  • The text feature information of the target image includes at least one of: the text content in the target image, the location of the text content in the target image, the font size of the text content, the font type of the text content, and the color of the text content.
  • S103 Remove the text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image.
  • In this embodiment, the text content is removed from the target image according to the text feature information, that is, the text layer corresponding to the text content is removed, so that the text layer does not subsequently interfere with the image recognition of the target image; this helps restore the target presentation corresponding to the target image accurately.
  • Specifically, the area of the target image from which the text content has been removed is taken as the target area. The color information corresponding to the text content is then determined from the text feature information, the color at each position of the target area in the first sub-image is determined from that color information, and each position of the target area in the first sub-image is filled accordingly. In this way, the vacancies left in the first sub-image by removing the text content are filled, which makes it easier to accurately obtain the image feature information corresponding to the first sub-image later.
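  • A minimal sketch of this fill step, assuming the OCR stage supplies rectangular text boxes: each box is overwritten with the median color of a small ring of surrounding pixels. A production system might instead use an inpainting routine such as OpenCV's `cv2.inpaint`.

```python
# Sketch of step S103: blank out each recognized text box and fill the
# vacancy with the median color of the pixels bordering the box.
import numpy as np

def remove_text(img: np.ndarray, boxes):
    """boxes: (x, y, w, h) rectangles from OCR; returns the first sub-image."""
    out = img.copy()
    for x, y, w, h in boxes:
        # Take a 2-pixel ring around the box as the local background sample.
        x0, y0 = max(x - 2, 0), max(y - 2, 0)
        x1, y1 = min(x + w + 2, img.shape[1]), min(y + h + 2, img.shape[0])
        ring = np.concatenate([
            img[y0:y, x0:x1].reshape(-1, 3), img[y + h:y1, x0:x1].reshape(-1, 3),
            img[y:y + h, x0:x].reshape(-1, 3), img[y:y + h, x + w:x1].reshape(-1, 3),
        ])
        fill = np.median(ring, axis=0).astype(img.dtype)
        out[y:y + h, x:x + w] = fill  # fill the vacancy left by the text
    return out

img = np.full((40, 60, 3), 200, dtype=np.uint8)  # light-grey "slide"
img[10:20, 5:35] = 0                             # black "text" block
clean = remove_text(img, [(5, 10, 30, 10)])
print(int(clean[15, 20, 0]))  # 200: the text area now matches the background
```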
  • S104 Perform image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image corresponding to the target image.
  • After the text feature information corresponding to the target image and the first sub-image corresponding to the target image have been obtained, an image recognition algorithm can be used to perform image recognition on the first sub-image to obtain the image feature information corresponding to it.
  • The image recognition algorithm may be, for example, the SIFT algorithm, the SURF algorithm, a fully convolutional neural network, or a HOG feature extraction algorithm.
  • FIG. 2 is a schematic diagram of a method, provided by an embodiment of the present application, for obtaining the image feature information corresponding to the first sub-image.
  • The method includes steps S21-S26.
  • S21 Acquire pixel feature information of the first sub-image corresponding to the target image.
  • The aforementioned image feature information includes foreground image feature information and background image feature information.
  • The pixel feature information of the first sub-image corresponding to the target image may be obtained first; since the text in the first sub-image has already been removed, there is no interference from text content.
  • An image recognition algorithm can be used to extract the pixel feature information of the first sub-image and the location of the layer in which each pixel feature sits.
  • S22 Determine the foreground image area of the first sub-image according to the pixel feature information.
  • S23 Perform image recognition on the foreground image area to obtain foreground image feature information of the first sub-image.
  • Specifically, after the pixel feature information of the first sub-image corresponding to the target image has been obtained, the foreground image area of the first sub-image is determined from it, and image recognition is performed on that area to obtain the foreground image feature information of the first sub-image.
  • For example, the grayscale image corresponding to the first sub-image can be determined and then binarized, which separates the foreground image area and the background image area of the first sub-image.
  • The GrabCut algorithm can then be used to perform image recognition on the foreground area of the first sub-image to obtain foreground image feature information such as the foreground image corresponding to the first sub-image, the layer in which the foreground image sits, and the location of that layer.
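  • The grayscale-and-binarization step can be sketched as follows. A simple midpoint threshold stands in for the binarization; the GrabCut refinement named above (available in OpenCV as `cv2.grabCut`) is omitted here.

```python
# Sketch of the grayscale + binarization step that separates the first
# sub-image into foreground and background areas.
import numpy as np

def foreground_mask(img: np.ndarray) -> np.ndarray:
    """Return a boolean mask that is True on (dark) foreground pixels."""
    gray = img.mean(axis=2)                 # grayscale image
    thresh = (gray.min() + gray.max()) / 2  # midpoint threshold
    return gray < thresh                    # binarized foreground area

img = np.full((30, 30, 3), 230, dtype=np.uint8)  # bright background
img[5:15, 5:15] = 30                             # dark foreground object
mask = foreground_mask(img)
print(int(mask.sum()))  # 100 foreground pixels (the 10x10 object)
```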
  • S24 Remove the foreground image area from the first sub-image to obtain a second sub-image.
  • S25 Perform image recognition on the second sub-image to obtain background image feature information of the second sub-image.
  • S26 Use the background image feature information of the second sub-image as the background image feature information of the first sub-image.
  • Specifically, the foreground image area in the first sub-image can be removed to obtain the second sub-image; image recognition is then performed on the second sub-image to obtain
  • its background image feature information, which is used as the background image feature information of the first sub-image.
  • The background image feature information includes, for example, the background picture, page layout, and decorations of the second sub-image,
  • and this background feature information of the second sub-image is determined to be the background feature information of the first sub-image.
  • S105 Generate a target presentation corresponding to the target image according to the above-mentioned text feature information and the above-mentioned image feature information.
  • After text recognition has produced the text feature information corresponding to the target image,
  • and image recognition on the first sub-image has produced the image feature information corresponding to the first sub-image,
  • a blank presentation page is created and the target presentation corresponding to the target image is restored from the image feature information of the first sub-image and the text feature information of the target image.
  • Specifically, the background image feature information of the first sub-image is used to generate the background image area of the target presentation;
  • the foreground image feature information of the first sub-image is used to generate the foreground image area of the target presentation;
  • and the text feature information corresponding to the target image is used to generate the text content of the target presentation. The background image area, foreground image area, and text content are then combined to obtain the target presentation corresponding to the target image.
  • That is, the background image feature information of the first sub-image (the page layout, background image, and decoration information) is used to restore the background image area of the target presentation.
  • Then, using the foreground image feature information of the first sub-image (the foreground image, the layer in which it sits, and the location of that layer), the layer and foreground image corresponding to the first sub-image
  • are added to the presentation whose background image area has been restored, restoring the foreground image area of the target presentation.
  • Finally, the text feature information corresponding to the target image is used, on the presentation whose background and foreground image areas have been restored, to generate the text content of the target image,
  • and adding that text content yields the target presentation corresponding to the target image.
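  • The layering described above can be sketched as raster compositing with NumPy; a real implementation would instead emit an editable file, e.g. with a library such as python-pptx. The positions and patch names below are illustrative.

```python
# Layer-stacking sketch of step S105: the restored page is built from the
# background image, then the foreground image pasted at its recorded layer
# location, then the text content on top.
import numpy as np

def compose_page(background, foreground, fg_pos, text_patch, text_pos):
    page = background.copy()                  # background image area
    for patch, (x, y) in ((foreground, fg_pos), (text_patch, text_pos)):
        h, w = patch.shape[:2]
        page[y:y + h, x:x + w] = patch        # paste layer at its location
    return page

bg = np.full((40, 60, 3), 240, dtype=np.uint8)  # page background
fg = np.zeros((10, 10, 3), dtype=np.uint8)      # foreground image layer
txt = np.full((4, 20, 3), 120, dtype=np.uint8)  # rendered text line
page = compose_page(bg, fg, (5, 5), txt, (30, 2))
print(page.shape, int(page[7, 7, 0]), int(page[3, 35, 0]))
```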
  • Optionally, the electronic device in this application may be any node device in a blockchain.
  • A blockchain is a new application model of computer technologies such as distributed data storage, peer-to-peer (P2P) transmission, consensus mechanisms, and encryption algorithms;
  • it is essentially a decentralized database. A blockchain can be composed of multiple serial transaction records (also called blocks) that are connected and protected by cryptography;
  • the connected distributed ledger allows multiple parties to record transactions effectively, and the transactions can be verified permanently and cannot be tampered with.
  • A consensus mechanism is the mathematical algorithm by which different nodes in a blockchain network establish trust and acquire rights and interests; in other words, it is a mathematical algorithm recognized by all network nodes of the blockchain.
  • This application can use the consensus mechanism of the blockchain to realize the restoration of the target image to the target presentation, which can improve the accuracy of the restoration of the target presentation.
  • For example, each node device in the blockchain performs consensus verification on the execution results of the above steps S101-S105. If the execution result of every step passes consensus verification, it can be determined that the accuracy of the generated target presentation is relatively high; if the execution result of any step fails consensus verification, it can be determined that the accuracy is relatively low, and the node device may execute steps S101-S105 again to regenerate the target presentation.
  • Alternatively, each node device in the blockchain can perform consensus verification only on the target presentation (that is, only on the execution result of step S105),
  • and if that verification fails, the node device may execute steps S101-S105 again to regenerate the target presentation.
  • To summarize, in the embodiments of the present application, text recognition is performed on the target image to obtain its text feature information;
  • the text content is then removed from the target image to obtain the first sub-image, which prevents the text from interfering with subsequent processing.
  • Image recognition is then performed on the first sub-image to obtain its image feature information, and the target presentation corresponding to the target image is generated from the text feature information of the target image together with the image feature information of the first sub-image. In other words, the text and image feature information in the target image are obtained through layered processing, and the target presentation is generated from them.
  • Because restoring the presentation requires no manual participation, both the efficiency and the accuracy of the restoration can be improved.
  • FIG. 3 is a schematic flowchart of another method for generating a presentation provided by an embodiment of the present application, which is executed by the electronic device in the embodiment of the present application.
  • the other method for generating a presentation includes the following steps S201-S209.
  • S201 Acquire a file to be processed, where the file to be processed includes a target image, and the target image is generated according to a target presentation in the target presentation file.
  • S202 Perform text recognition on the target image to obtain text feature information corresponding to the target image.
  • S203 Remove the text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image.
  • S204 Perform image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image corresponding to the target image.
  • S205 Generate a target presentation corresponding to the target image according to the above-mentioned text feature information and the above-mentioned image feature information.
  • for the content of steps S201-S205, reference can be made to the embodiment shown in FIG. 1, which will not be repeated in the embodiments of the present application.
  • step S205 may include: correcting the text content included in the text feature information to obtain corrected text feature information, and generating the target presentation corresponding to the target image according to the corrected text feature information and the above-mentioned image feature information.
  • the electronic device can perform correction processing on the text content included in the text feature information to obtain corrected text feature information.
  • the correction processing here includes correcting typos in the text content, and normalizing at least one of the size, color, and font of text content located at the same position. Further, a target presentation corresponding to the target image is generated according to the corrected text feature information and the above-mentioned image feature information.
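The normalization part of this correction step can be sketched as below: text fragments recognized at the same position (e.g. on the same title line) are given one common size, color, and font. Taking the majority value across the group is an assumed policy, not the one specified by the disclosure:

```python
from collections import Counter

def normalize_fragments(fragments):
    """Give every fragment at the same position the majority size,
    colour and font observed across the group."""
    majority = {attr: Counter(f[attr] for f in fragments).most_common(1)[0][0]
                for attr in ("size", "color", "font")}
    return [{**f, **majority} for f in fragments]

title_line = [  # fragments OCR'd from the same title line
    {"text": "Annual", "size": 24, "color": "black", "font": "Arial"},
    {"text": "Work",   "size": 24, "color": "black", "font": "Arial"},
    {"text": "Report", "size": 23, "color": "gray",  "font": "Arial"},  # OCR jitter
]
fixed = normalize_fragments(title_line)  # all fragments now 24pt, black, Arial
```

Typo correction would run alongside this, e.g. with a spell-checking model; it is omitted here.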
  • the above-mentioned file to be processed includes at least one frame of image, and each frame of image corresponds to one page of presentation in the target presentation file.
  • S206 Perform text recognition and image recognition on each frame of image in the file to be processed to obtain the text feature information and image feature information corresponding to each frame of image, and generate a presentation corresponding to each frame of image in the file to be processed according to that information.
  • S207 Determine the sequence of each frame of images in the file to be processed according to the above-mentioned candidate file.
  • after the presentation corresponding to each frame of image is generated, since each frame of image in the file to be processed is obtained by segmenting the candidate file, the arrangement order of each frame of image in the file to be processed can be determined according to the candidate file, that is, according to the position of each frame of image within the candidate file.
  • alternatively, the order in which the at least one single-page picture file or single-page portable file was uploaded may determine the order of each frame of image in the file to be processed.
  • S208 Arrange the presentations corresponding to each frame of image in the file to be processed according to the arrangement order. S209 Use the arranged presentations to generate a target presentation file.
  • after the arrangement order of each frame of image in the file to be processed is obtained, the presentations corresponding to each frame of image are arranged in that order, and the arranged presentations are used to generate the target presentation file.
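The position-based ordering can be sketched as below. Each frame remembers where it was cut from the candidate file, and sorting by that position restores the slide order; the `position` field (a page index or vertical offset) is an illustrative assumption:

```python
# Frames carry the position at which they were segmented from the
# candidate file; sorting by it recovers the original slide order.
frames = [
    {"id": "frame-c", "position": 2},
    {"id": "frame-a", "position": 0},
    {"id": "frame-b", "position": 1},
]
slide_order = [f["id"] for f in sorted(frames, key=lambda f: f["position"])]
```

For the single-page-upload case mentioned above, `position` would instead record the upload sequence number.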
  • optionally, the theme information of each presentation and the association relationships among the theme information can be obtained; the arrangement order of the presentations is determined according to those association relationships, the presentations are sorted according to that order, and the sorted presentations are used to generate the target presentation file.
  • the electronic device may determine the theme information of the presentation according to the font size of the text content in each presentation, or may determine the theme information of each presentation according to the location information of the text content in each presentation. Further, the association relationship between the various topic information can be obtained.
  • the association relationship among the theme information here may refer to a containment relationship between topics, and it can be used to determine the order of the presentations.
  • For example, suppose the content of the presentation file is a company's annual work summary: the theme of presentation 1 is the annual work summary, and the theme of presentation 2 is the work completion status. Since the annual work summary contains the work completion status, presentation 1 is ordered before presentation 2. The presentations are sorted according to this arrangement order, and the sorted presentations are used to generate the target presentation file. Sorting the presentations according to the association relationships between them improves the accuracy of restoring the presentation file.
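The containment-based ordering in the example above can be sketched as follows. Modelling the containment relation as a topic-to-subtopics map is an illustrative assumption; a slide whose topic contains another topic's subject matter is placed first:

```python
def order_by_containment(slides, contains):
    """Place a slide whose topic contains another topic before the slide
    for the contained topic. `contains` maps a topic to the topics it
    includes, so a topic's depth below the root determines its rank."""
    def depth(topic):
        d, current, seen = 0, topic, set()
        while current not in seen:
            seen.add(current)
            parents = [p for p, kids in contains.items() if current in kids]
            if not parents:
                return d
            current = parents[0]
            d += 1
        return d
    return sorted(slides, key=lambda s: depth(s["topic"]))

slides = [{"topic": "work completion status"}, {"topic": "annual work summary"}]
contains = {"annual work summary": ["work completion status"]}
ordered = order_by_containment(slides, contains)
# the annual summary contains the completion status, so it comes first
```

This reproduces the example: presentation 1 (annual work summary) is ordered before presentation 2 (work completion status).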
  • the text feature information corresponding to the target image is obtained by performing text recognition on the target image
  • the text content in the target image is then removed to obtain the first sub-image, which prevents the text content from interfering with subsequent processing;
  • after the first sub-image is obtained, image recognition is performed on it to obtain the image feature information of the first sub-image, and the target presentation corresponding to the target image is generated according to the text feature information of the target image and the image feature information of the first sub-image.
  • a presentation corresponding to each frame of image in the file to be processed is generated, and the candidate file is used to determine the arrangement order of each frame of image in the file to be processed;
  • the presentations corresponding to each frame of image in the file to be processed are arranged in that order, and the arranged presentations are used to generate the target presentation file. That is, the text feature information and image feature information in the target image are obtained through layered processing, and the target presentation corresponding to the target image is generated according to that information.
  • the process of restoring the presentation requires no manual participation, which improves the efficiency of restoring the presentation.
  • further, the restored presentations are arranged according to the sort order of each frame of image in the file to be processed, which improves the accuracy of the restored presentation file.
  • FIG. 4 is a schematic structural diagram of a presentation generating apparatus provided by an embodiment of the present application.
  • the presentation generating apparatus of the embodiment of the present application may be in the above-mentioned electronic device.
  • the presentation generating apparatus includes: a first acquisition module 11, configured to acquire a file to be processed, where the file to be processed is generated according to a target presentation in the target presentation file and includes a target image;
  • the above-mentioned first acquisition module includes: a second acquisition unit, a third acquisition unit, a first segmentation unit, and an image conversion unit.
  • the second obtaining unit is configured to obtain a candidate file, the candidate file being generated according to the target presentation file.
  • the third obtaining unit is configured to obtain the topic information in the candidate file from the text content in the candidate file.
  • the first segmentation unit is used to segment the candidate file according to the topic information to obtain multiple candidate subfiles.
  • the image conversion unit is configured to perform image conversion processing on the multiple candidate sub-files to obtain a multi-frame image and use the multi-frame image as the file to be processed, where the target image is any frame of image in the file to be processed.
  • the above-mentioned acquisition module further includes: a fourth acquisition unit, a receiving unit, and a second segmentation unit.
  • the fourth obtaining unit is configured to obtain a candidate file, the candidate file being generated according to the target presentation file.
  • the receiving unit is configured to receive a segmentation instruction for the candidate file, where the segmentation instruction includes a size trimming ratio.
  • the second segmentation unit is configured to segment the candidate file according to the size trimming ratio to obtain a multi-frame image and use the multi-frame image as the file to be processed, where the target image is any frame of image in the file to be processed.
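Segmentation by a size trimming ratio can be sketched as cutting the candidate-file image into equal-height frames. Representing the image as a list of pixel rows is an illustrative assumption:

```python
def segment_by_ratio(rows, trim_ratio):
    """Cut a candidate-file image (a list of pixel rows) into frames whose
    height is `trim_ratio` of the whole; a ratio of 0.5 yields two frames."""
    frame_height = max(1, int(len(rows) * trim_ratio))
    return [rows[i:i + frame_height] for i in range(0, len(rows), frame_height)]

candidate = [[0] * 4 for _ in range(8)]     # 8-row stand-in for the file
frames = segment_by_ratio(candidate, 0.5)   # two 4-row frame images
target_image = frames[0]                    # any frame may serve as the target
```

The multi-frame result plays the role of the file to be processed, with each frame corresponding to one page of the presentation.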
  • the first recognition module 12 is configured to perform text recognition on the target image to obtain text feature information corresponding to the target image.
  • the removing module 13 is configured to remove the text content in the target image according to the text feature information to obtain the first sub-image corresponding to the target image.
  • the second recognition module 14 is configured to perform image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image corresponding to the target image.
  • the above-mentioned second identification module includes: a first acquisition unit, a determination unit, a first identification unit, a removal unit, and a second identification unit.
  • the first acquiring unit is configured to acquire the pixel feature information of the first sub-image corresponding to the target image.
  • the determining unit is configured to determine the foreground image area of the first sub-image according to the pixel characteristic information.
  • the first recognition unit is configured to perform image recognition on the foreground image area to obtain the foreground image feature information of the first sub-image.
  • the removing unit is used to remove the foreground image area in the first sub-image to obtain the second sub-image.
  • the second recognition unit is configured to perform image recognition on the second sub-image to obtain the background image feature information of the second sub-image, and to use the background image feature information of the second sub-image as the background image feature information of the first sub-image.
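The split performed by these units can be sketched as below. Treating the dominant pixel value as background is an assumed criterion for illustration only, not the determination method of the disclosure:

```python
from collections import Counter

def split_foreground_background(pixels):
    """Treat the dominant pixel value as background; everything else is
    foreground. Returns the foreground mask, the second sub-image (the
    foreground removed and filled with the background value), and that
    background value."""
    flat = [p for row in pixels for p in row]
    bg_value = Counter(flat).most_common(1)[0][0]
    mask = [[p != bg_value for p in row] for row in pixels]
    second_sub_image = [[bg_value] * len(row) for row in pixels]
    return mask, second_sub_image, bg_value

first_sub_image = [
    [255, 255, 255],
    [255,   0, 255],   # one dark foreground pixel
    [255, 255, 255],
]
mask, second_sub_image, bg_value = split_foreground_background(first_sub_image)
```

Image recognition would then run on the masked foreground region and on the second sub-image separately, yielding the foreground and background image feature information respectively.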
  • the generating module 15 is configured to generate a target presentation corresponding to the target image according to the text feature information and the image feature information.
  • the above-mentioned generating module includes: a first generating unit, a second generating unit, and a splicing unit.
  • the first generating unit is configured to use the background image feature information of the first sub-image to generate the background image area of the target image.
  • the second generating unit is configured to use the foreground image feature information of the first sub-image to generate the foreground image area of the target image.
  • the second generating unit is configured to use the text feature information corresponding to the target image to generate the text content of the target image.
  • the splicing unit is used for splicing the background image area, the foreground image area, and the text content of the target image to obtain a target presentation corresponding to the target image.
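The splicing unit's operation can be sketched as stacking the three regenerated parts back-to-front into one slide. The layer dictionary structure is an illustrative assumption:

```python
def splice(background_area, foreground_area, text_content):
    """Stack the regenerated background, foreground and text back-to-front
    into one slide; text is drawn last so it sits on top."""
    return {"layers": [
        {"kind": "background", "content": background_area},
        {"kind": "foreground", "content": foreground_area},
        {"kind": "text", "content": text_content},
    ]}

slide = splice("white gradient", ["bar chart", "logo"], [{"text": "Q4 results"}])
top_layer = slide["layers"][-1]   # the text layer
```

Rendering the layers in this order reproduces the visual stacking of the original target image.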
  • the device further includes: a second obtaining module, configured to obtain the area in the target image from which the text content has been removed as the target area; a first determining module, configured to determine the color information corresponding to the text content in the target image according to the text feature information; and a filling module, configured to perform padding processing on the target area using the color information corresponding to the text content.
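The filling module's padding step can be sketched as below. Using the recorded background color behind the text run as the fill color is an assumption for illustration:

```python
def fill_region(image, box, color):
    """Pad the rectangular region `box` = (x0, y0, x1, y1) with `color`."""
    x0, y0, x1, y1 = box
    for y in range(y0, y1):
        for x in range(x0, x1):
            image[y][x] = color
    return image

image = [["white"] * 4 for _ in range(3)]
image[1][1] = image[1][2] = "hole"          # pixels vacated by removed text
fill_region(image, (1, 1, 3, 2), "white")   # pad the target area back
```

After padding, the vacated region blends with its surroundings, so the subsequent image recognition on the first sub-image is not disturbed by holes left behind by the removed text.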
  • the device further includes: a second generation module, configured to generate the presentation corresponding to each frame of image in the file to be processed according to the text feature information and image feature information of each frame of image; a second determining module, configured to determine the arrangement order of each frame of image in the file to be processed according to the candidate file; an arrangement module, configured to arrange the presentations corresponding to each frame of image in the file to be processed according to the arrangement order; and a third generation module, configured to generate the target presentation file using the arranged presentations.
  • the text feature information corresponding to the target image is obtained by performing text recognition on the target image
  • the text content in the target image is then removed to obtain the first sub-image, which prevents the text content from interfering with subsequent processing;
  • after the first sub-image is obtained, image recognition is performed on it to obtain the image feature information of the first sub-image, and the target presentation corresponding to the target image is generated according to the text feature information of the target image and the image feature information of the first sub-image. That is, the text feature information and image feature information in the target image are obtained through layered processing, and the target presentation corresponding to the target image is generated according to that information.
  • the process of restoring the presentation requires no manual participation, which improves the efficiency and accuracy of restoring the presentation.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • the electronic device in this embodiment may include: one or more processors 21, one or more input devices 22, one or more output devices 23, and a memory 24.
  • the aforementioned processor 21, input device 22, output device 23, and memory 24 are connected by a bus 25.
  • the processor 21 may be a central processing unit (Central Processing Unit, CPU); the processor may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the input device 22 may include a touch panel, a fingerprint sensor (used to collect user fingerprint information and fingerprint orientation information), a microphone, etc.
  • the output device 23 may include a display (e.g., an LCD) and a speaker, and the output device 23 may output the processed data.
  • the memory 24 may include a read-only memory and a random access memory, and provides instructions and data to the processor 21. A part of the memory 24 may also include a non-volatile random access memory.
  • the memory 24 is used to store a computer program.
  • the computer program includes program instructions.
  • the processor 21 is configured to execute the program instructions stored in the memory 24 to perform a presentation generation method, which includes the following operations: obtain a file to be processed, where the file to be processed includes a target image and the target image is generated according to the target presentation in the target presentation file; perform text recognition on the target image to obtain the text feature information corresponding to the target image; remove the text content in the target image according to the text feature information to obtain the first sub-image corresponding to the target image; perform image recognition on the first sub-image to obtain the image feature information corresponding to the first sub-image corresponding to the target image; and generate a target presentation corresponding to the target image according to the text feature information and the image feature information.
  • the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations, where the image feature information includes foreground image feature information and background image feature information: obtain the pixel feature information of the first sub-image corresponding to the target image; determine the foreground image area of the first sub-image according to the pixel feature information; perform image recognition on the foreground image area to obtain the foreground image feature information of the first sub-image; remove the foreground image area from the first sub-image to obtain the second sub-image; perform image recognition on the second sub-image to obtain the background image feature information of the second sub-image; and use the background image feature information of the second sub-image as the background image feature information of the first sub-image.
  • the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: use the background image feature information of the first sub-image to generate the background image area of the target image; use the foreground image feature information of the first sub-image to generate the foreground image area of the target image; use the text feature information corresponding to the target image to generate the text content of the target image; and splice the background image area, the foreground image area, and the text content of the target image to obtain a target presentation corresponding to the target image.
  • the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: obtain the area in the target image from which the text content has been removed as the target area; determine the color information corresponding to the text content in the target image according to the text feature information; and use the color information corresponding to the text content to perform padding processing on the target area.
  • the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: obtain a candidate file, which is generated according to the target presentation file; obtain the topic information in the candidate file from the text content in the candidate file; segment the candidate file according to the topic information to obtain a plurality of candidate subfiles; perform image conversion processing on the plurality of candidate subfiles to obtain a multi-frame image; and use the multi-frame image as the file to be processed, where the target image is any frame of image in the file to be processed.
  • the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: obtain a candidate file, which is generated according to the target presentation file; receive a segmentation instruction for the candidate file, where the segmentation instruction includes a size trimming ratio; segment the candidate file according to the size trimming ratio to obtain a multi-frame image; and use the multi-frame image as the file to be processed, where the target image is any frame of image in the file to be processed.
  • the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: generate the presentation corresponding to each frame of image in the file to be processed according to the text feature information and image feature information of each frame of image; determine the arrangement order of each frame of image in the file to be processed according to the candidate file; arrange the presentations corresponding to each frame of image according to that order; and use the arranged presentations to generate the target presentation file.
  • the processor 21, the input device 22, and the output device 23 described in the embodiments of this application can execute the implementations described in the first and second embodiments of the presentation generation method provided in the embodiments of this application, as well as the implementation of the electronic device described in the embodiments of this application, which will not be repeated here.
  • the text feature information corresponding to the target image is obtained by performing text recognition on the target image
  • the text content in the target image is then removed to obtain the first sub-image, which prevents the text content from interfering with subsequent processing;
  • after the first sub-image is obtained, image recognition is performed on it to obtain the image feature information of the first sub-image, and the target presentation corresponding to the target image is generated according to the text feature information of the target image and the image feature information of the first sub-image. That is, the text feature information and image feature information in the target image are obtained through layered processing, and the target presentation corresponding to the target image is generated according to that information.
  • the process of restoring the presentation requires no manual participation, which improves the efficiency and accuracy of restoring the presentation.
  • An embodiment of the present application also provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and the computer program includes program instructions.
  • when the program instructions are executed by a processor, the presentation generation method shown in the embodiments of FIG. 1 and FIG. 3 is implemented.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable storage medium may be an internal storage unit of the electronic device described in any of the foregoing embodiments, such as a hard disk or a memory of a control device.
  • the computer-readable storage medium may also be an external storage device of the control device, such as a plug-in hard disk equipped on the control device, a smart memory card (Smart Media Card, SMC), or a secure digital (Secure Digital, SD) card, flash card (Flash Card), etc.
  • the computer-readable storage medium may also include both an internal storage unit of the control device and an external storage device.
  • the computer-readable storage medium is used to store the computer program and other programs and data required by the control device.
  • the computer-readable storage medium can also be used to temporarily store data that has been output or will be output.
  • the above-mentioned computer-readable storage medium may be deployed and executed on one computer device, or deployed on multiple computer devices located at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network; the multiple computer devices distributed across multiple sites and interconnected by a communication network can form a blockchain network.


Abstract

A presentation generation method, apparatus, and device, and a medium, wherein a first data processing method comprises: acquiring a file to be processed, the file to be processed comprising a target image (S101); performing text recognition on the target image to obtain text feature information corresponding to the target image (S102); on the basis of the text feature information, removing the text content in the target image to obtain a first sub-image corresponding to the target image (S103); performing image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image corresponding to the target image (S104); and, on the basis of the text feature information and the image feature information, generating a target presentation corresponding to the target image (S105). Using the present method, a target presentation corresponding to the target image can be rapidly and accurately generated. The present method, apparatus, device, and medium relate to image recognition technology in artificial intelligence, and are also suitable for fields such as smart government affairs and smart education, being conducive to promoting the construction of smart cities.

Description

Presentation generation method, apparatus, device and medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on June 28, 2020, with application number 202010598196.4 and the title "Presentation generation method, apparatus, device and medium", the entire content of which is incorporated by reference in this application.
Technical Field
This application relates to the field of computer technology, and in particular to a presentation generation method, apparatus, device, and medium.
Background
With the widespread adoption of office software, presentations are used in all aspects of social life; for example, in daily work, study, and technical exchanges, speeches are usually delivered with presentations. To enable remote meetings or remote teaching, a user records the presentation while speaking so that the speech stays synchronized with the presentation. However, the inventor found that recording a presentation yields only images of the presentation, and the user must manually edit those images to restore them to a presentation file, so the efficiency of restoring the presentation is relatively low.
Technical Problem
The embodiments of the present application provide a presentation generation method that can obtain the text feature information and image feature information in a target image through layered processing, and generate the target presentation corresponding to the target image according to that information. The restoration process requires no manual participation, avoiding the loss of efficiency that manual restoration would cause.
Technical Solutions
In a first aspect, an embodiment of the present application provides a presentation generation method, including: obtaining a file to be processed, where the file to be processed includes a target image and the target image is generated according to a target presentation in a target presentation file; performing text recognition on the target image to obtain text feature information corresponding to the target image; removing the text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image; performing image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image; and generating a target presentation corresponding to the target image according to the text feature information and the image feature information.
In a second aspect, an embodiment of the present application provides a presentation generation apparatus, including: a first acquisition module, configured to acquire a file to be processed, where the file to be processed is generated according to a target presentation in a target presentation file and includes a target image; a first recognition module, configured to perform text recognition on the target image to obtain text feature information corresponding to the target image; a removal module, configured to remove the text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image; a second recognition module, configured to perform image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image; and a first generation module, configured to generate a target presentation corresponding to the target image according to the text feature information and the image feature information.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, adapted to implement one or more instructions; and a computer storage medium storing one or more instructions, where the one or more instructions are adapted to be loaded by the processor to execute the following steps: obtain a file to be processed, where the file to be processed includes a target image and the target image is generated according to a target presentation in a target presentation file; perform text recognition on the target image to obtain text feature information corresponding to the target image; remove the text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image; perform image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image; and generate a target presentation corresponding to the target image according to the text feature information and the image feature information.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer storage medium stores one or more instructions, and the one or more instructions are adapted to be loaded by a processor to perform the following steps: acquiring a file to be processed, where the file to be processed includes a target image, and the target image is generated according to a target presentation in a target presentation file; performing text recognition on the target image to obtain text feature information corresponding to the target image; removing the text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image; performing image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image; and generating a target presentation corresponding to the target image according to the text feature information and the image feature information.
Beneficial Effects
In the embodiments of the present application, text recognition is first performed on the target image to obtain the text feature information corresponding to the target image, and the text content in the target image is then removed to obtain the first sub-image, which prevents the text content from interfering with subsequent processing. After the first sub-image is obtained, image recognition is performed on it to obtain its image feature information, and the target presentation corresponding to the target image is generated according to the text feature information of the target image and the image feature information of the first sub-image. In other words, the text feature information and the image feature information of the target image are obtained through layered processing, and the target presentation is generated from both. This process of restoring a presentation requires no manual participation and can improve both the efficiency and the accuracy of the restoration.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a presentation generation method provided by an embodiment of the present application.
FIG. 2 is a schematic diagram of a method for obtaining image feature information corresponding to a first sub-image provided by an embodiment of the present application.
FIG. 3 is a schematic flowchart of another presentation generation method provided by an embodiment of the present application.
FIG. 4 is a schematic structural diagram of a presentation generation apparatus provided by an embodiment of the present application.
FIG. 5 is a schematic structural diagram of an electronic device provided by another embodiment of the present application.
Embodiments of the Present Invention
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Computer vision (CV) is a science that studies how to make machines "see". More specifically, it refers to using cameras and computers instead of human eyes to identify, track, and measure targets, and to further process the resulting graphics so that the computer output is better suited for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies and attempts to build artificial intelligence systems that can obtain information from images or multi-dimensional data. Computer vision technology usually includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric recognition technologies such as face recognition and fingerprint recognition.
The present application relates to image recognition technology in artificial intelligence. Image recognition is used to automatically convert images into presentations without manual participation, which can improve the efficiency and accuracy of restoring presentations. The present application is applicable to fields such as smart government affairs and smart education, and is conducive to promoting the construction of smart cities.
Referring to FIG. 1, which is a schematic flowchart of a presentation generation method provided by an embodiment of the present application, the method is executed by an electronic device and includes the following steps S101-S105.
S101: Acquire a file to be processed, where the file to be processed includes a target image, and the target image is generated according to a target presentation in a target presentation file.
In this embodiment of the application, the file to be processed includes a target image generated according to a target presentation in a target presentation file. The target presentation may be, for example, the presentation corresponding to the courseware shown while a user attends a class or a lecture, and the target presentation file includes at least one target presentation. The target image is an image of the target presentation captured by the user during the class or lecture; the file to be processed may be generated from at least one image captured by the user, and the target image is any one of the at least one image. When a user wants to restore a presentation shown in a meeting, class, or lecture, the user can upload the target image corresponding to the target presentation to the electronic device.
Optionally, the user may upload, in order, single-page picture files or single-page portable files corresponding to the target presentations in the target presentation file to the electronic device, where the content of each single-page picture file or single-page portable file corresponds to one page of the target presentation, and the upload order of the files corresponds to the order of the at least one target presentation in the target presentation file. Image conversion processing is performed on the at least one single-page picture file or single-page portable file to obtain multiple frames of images, where one single-page file corresponds to one frame of image; the multiple frames of images serve as the file to be processed, and the target image is any frame in the file to be processed. The electronic device can obtain and store the order in which the user uploads the files and, according to that order, extract the content of the target image corresponding to each file to restore the target presentation corresponding to each target image.
Optionally, a candidate file generated according to the target presentation file may be acquired; topic information of the candidate file is obtained from the text content of the candidate file; the candidate file is segmented according to the topic information to obtain multiple candidate sub-files; image conversion processing is performed on the multiple candidate sub-files to obtain multiple frames of images; the multiple frames of images serve as the file to be processed, and the target image is any frame in the file to be processed.
The candidate file may be a multi-page picture file or multi-page portable file uploaded by the user and generated according to the target presentation file; it includes a long image stitched from at least one frame of image corresponding to at least one page of the target presentation. Text recognition can be performed on the candidate file to obtain its text content, and the topic information of the candidate file is obtained from that text content. The topic information may be obtained with an existing topic extraction model, such as an LDA topic extraction model. After the topic information is obtained, the candidate file is segmented according to it to obtain multiple candidate sub-files, where one piece of topic information corresponds to one candidate sub-file and one candidate sub-file contains the content of one page of the target presentation. Image conversion processing is then performed on the candidate sub-files to obtain multiple frames of images, where one candidate sub-file is converted into one frame of image and one frame of image corresponds to the content of one page of the target presentation. The multiple frames of images serve as the file to be processed, and the target image is any frame in the file to be processed.
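The grouping step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: in practice the per-paragraph topic labels would come from a topic model such as LDA (for example, gensim's `LdaModel`), while here they are supplied directly so that only the segmentation logic is shown; all names are hypothetical.

```python
# Group consecutive paragraphs that share a topic label into one candidate
# sub-file (one sub-file per presentation page). The topic labels stand in
# for the output of a topic extraction model such as LDA.
def split_by_topic(paragraphs, topics):
    sub_files, current, current_topic = [], [], None
    for text, topic in zip(paragraphs, topics):
        if topic != current_topic and current:
            sub_files.append("\n".join(current))  # close the previous sub-file
            current = []
        current.append(text)
        current_topic = topic
    if current:
        sub_files.append("\n".join(current))      # close the last sub-file
    return sub_files

paragraphs = ["Intro A", "More A", "Intro B", "Intro C", "More C"]
topics     = ["A",       "A",      "B",       "C",       "C"]
print(split_by_topic(paragraphs, topics))  # three sub-files, one per topic
```

Each returned sub-file would then be rendered to one frame of image by the image conversion step.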
Optionally, a candidate file generated according to the target presentation file is acquired; a segmentation instruction for the candidate file, including a size cropping ratio, is received; the candidate file is segmented according to the size cropping ratio to obtain multiple frames of images; the multiple frames of images serve as the file to be processed, and the target image is any frame in the file to be processed.
The candidate file may be a multi-page picture file or multi-page portable file uploaded by the user, formed by stitching at least one frame of image corresponding to at least one page of the target presentation in the target presentation file into a long image. The size cropping ratio for the long image can be entered by the user; when the electronic device receives a segmentation instruction for the candidate file that includes the size cropping ratio, it segments the long image in the candidate file according to that ratio, that is, it cuts the stitched long image into multiple frames of images, which serve as the file to be processed, the target image being any frame among them. After obtaining the input size cropping ratio, the electronic device may also obtain from the user the arrangement order of each frame of image in the file to be processed, sort the frames accordingly, and store that order.
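The slicing step can be sketched as follows. This is only an illustration under simplifying assumptions: the candidate file is represented as a NumPy array already decoded from the uploaded picture or portable file, and the size cropping ratio is expressed as a page count that divides the image height evenly.

```python
# Cut a stitched "long image" into equal-height per-page frames.
import numpy as np

def split_long_image(long_img: np.ndarray, num_pages: int):
    """Split an (H, W, C) long image into num_pages equal-height frames."""
    h = long_img.shape[0]
    if h % num_pages != 0:
        raise ValueError("height is not divisible by the requested page count")
    page_h = h // num_pages
    return [long_img[i * page_h:(i + 1) * page_h] for i in range(num_pages)]

long_img = np.arange(6 * 4 * 3).reshape(6, 4, 3)  # stand-in for a 3-page scan
frames = split_long_image(long_img, 3)
print(len(frames), frames[0].shape)  # 3 (2, 4, 3)
```

Each frame then corresponds to one page of the target presentation and becomes one image in the file to be processed.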
S102: Perform text recognition on the target image to obtain text feature information corresponding to the target image.
In this embodiment, after the target image is acquired, text recognition is performed on it to obtain its text feature information. A general OCR algorithm can be used to recognize the text content in the target image and the position information of the blocks in which the text is located, and the SIFT or SURF algorithm can then be used to recognize the font and font size of the text, thereby obtaining the text feature information corresponding to the target image. The text feature information of the target image includes at least one of the text content in the target image, the position information of the text content within the target image, the font size of the text content, the font type of the text content, and the color of the text content.
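One possible shape of the resulting text feature information is sketched below. This is a hedged illustration, not the patent's data model: the raw boxes would in practice come from an OCR engine (for example, pytesseract's `image_to_data`), while here they are hard-coded, and every field name is a hypothetical choice.

```python
# A hypothetical record for one recognized text block of the target image.
from dataclasses import dataclass

@dataclass
class TextFeature:
    content: str    # recognized text content
    bbox: tuple     # (x, y, w, h) position of the text block in the image
    font_size: int  # here crudely estimated from the block height
    color: tuple    # dominant RGB color of the glyph pixels

def to_features(ocr_boxes):
    """Assemble text feature records from (text, x, y, w, h, color) tuples."""
    return [TextFeature(t, (x, y, w, h), font_size=h, color=c)
            for t, x, y, w, h, c in ocr_boxes]

boxes = [("Title", 40, 20, 300, 48, (255, 255, 255)),
         ("Body text", 40, 100, 500, 24, (30, 30, 30))]
features = to_features(boxes)
print(features[0].font_size)  # 48
```

Records of this kind carry everything the later steps need: the bounding boxes drive the removal in S103, and the content, size, and color drive the text regeneration in S105.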
S103: Remove the text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image.
In this embodiment of the application, after the text feature information is obtained from the target image, the text content in the target image is removed according to that information, that is, the text layer corresponding to the text content is removed from the target image. This prevents the text layer from interfering with subsequent image recognition of the target image and helps to restore the target presentation corresponding to the target image accurately.
Optionally, the region of the target image left after the text content is removed is taken as a target region; color information corresponding to the text content in the target image is determined according to the text feature information; and the target region is filled using the color information corresponding to the text content.
After the text content in the target image is removed according to the text feature information, the region left behind is taken as the target region. The color information corresponding to the text content is then determined according to the text feature information, the color corresponding to each position of the target region in the first sub-image is determined from that color information, and each position of the target region in the first sub-image is filled accordingly. In this way, the gaps produced in the first sub-image by removing the text content can be filled, which makes it easier to obtain accurate image feature information for the first sub-image later.
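The fill step can be sketched as follows. This is a deliberately minimal illustration under stated assumptions: the removed-text region is given as a boolean mask and is overwritten with a single fill color derived from the text feature information; a production system might instead use inpainting (for example, OpenCV's `cv2.inpaint`) to reconstruct the background texture.

```python
# Fill the removed-text region (given as a boolean mask) with a fixed color.
import numpy as np

def fill_target_region(image: np.ndarray, mask: np.ndarray, color) -> np.ndarray:
    """Return a copy of image with masked pixels set to the fill color."""
    out = image.copy()
    out[mask] = color  # broadcast the color over every masked pixel
    return out

img = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                        # where the text used to be
filled = fill_target_region(img, mask, (200, 200, 200))
print(filled[1, 1], filled[0, 0])  # [200 200 200] [0 0 0]
```

The filled result is the first sub-image passed to the image recognition of step S104.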
S104: Perform image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image.
In this embodiment, after the text feature information of the target image is obtained and the first sub-image is produced, an image recognition algorithm can be applied to the first sub-image to obtain its image feature information. The image recognition algorithm may be the SIFT algorithm, the SURF algorithm, a fully convolutional neural network, a HOG feature extraction algorithm, and so on.
FIG. 2 is a schematic diagram of a method for obtaining the image feature information corresponding to the first sub-image provided by an embodiment of the present application. As shown in FIG. 2, the method includes steps S21-S26.
S21: Acquire pixel feature information of the first sub-image corresponding to the target image.
In this embodiment of the application, the image feature information includes foreground image feature information and background image feature information. After the first sub-image corresponding to the target image is obtained, the pixel feature information of the first sub-image can be acquired first. Since the text information in the first sub-image has been removed, there is no interference from text content, and an image recognition algorithm can be used to extract the pixel feature information of the first sub-image and the position information of the layer in which each pixel feature is located.
S22: Determine a foreground image region of the first sub-image according to the pixel feature information.
S23: Perform image recognition on the foreground image region to obtain foreground image feature information of the first sub-image.
Specifically, the pixel feature information of the first sub-image corresponding to the target image can be acquired, the foreground image region of the first sub-image is determined according to that information, and image recognition is performed on the foreground image region to obtain the foreground image feature information of the first sub-image. A grayscale image corresponding to the first sub-image can be determined from its pixel feature information, a corresponding binarized image is then determined, and the foreground and background image regions of the first sub-image are separated. After the foreground image region is obtained, the GrabCut algorithm can be used to perform image recognition on it, yielding foreground image feature information such as the foreground image corresponding to the first sub-image, the layer in which the foreground image is located, and the position information of that layer.
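The grayscale-binarize-locate chain of steps S21-S23 can be sketched as follows. This is an illustration under simplifying assumptions: a fixed threshold and a bounding box stand in for the binarization and the GrabCut refinement (which OpenCV provides as `cv2.grabCut`); it is not the patent's actual segmentation.

```python
# Locate a bright foreground region via grayscale conversion and thresholding.
import numpy as np

def foreground_region(rgb: np.ndarray, threshold: int = 128):
    """Return the bounding box (top, left, bottom, right) of pixels whose
    grayscale value exceeds the threshold, or None if there are none."""
    gray = rgb.mean(axis=2)   # simple grayscale conversion
    mask = gray > threshold   # binarized image: True = foreground candidate
    if not mask.any():
        return None
    rows, cols = np.where(mask)
    return int(rows.min()), int(cols.min()), int(rows.max()) + 1, int(cols.max()) + 1

img = np.zeros((6, 6, 3), dtype=np.uint8)
img[2:4, 1:5] = 255           # a bright foreground block
print(foreground_region(img))  # (2, 1, 4, 5)
```

In the patent's flow, the region found here would then be handed to GrabCut for pixel-accurate foreground extraction.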
S24: Remove the foreground image region from the first sub-image to obtain a second sub-image.
S25: Perform image recognition on the second sub-image to obtain background image feature information of the second sub-image.
S26: Use the background image feature information of the second sub-image as the background image feature information of the first sub-image.
In this embodiment, after the foreground feature information of the first sub-image is obtained, the foreground image region can be removed from the first sub-image to obtain a second sub-image; image recognition is then performed on the second sub-image to obtain its background image feature information, which is used as the background image feature information of the first sub-image. Specifically, according to the position information of the layer in which the foreground image content is located, the foreground image content and its layer are removed from the first sub-image to obtain the second sub-image. Only the background image region is thus left in the second sub-image, which prevents the foreground image region from interfering with the subsequent acquisition of background image feature information and helps to restore the target presentation corresponding to the target image accurately. Image recognition is then performed on the second sub-image according to an image recognition algorithm to obtain its background image feature information, which includes the background picture, page layout, decorations, and so on of the second sub-image; this background feature information is determined as the background feature information of the first sub-image.
S105: Generate the target presentation corresponding to the target image according to the text feature information and the image feature information.
In this embodiment of the application, after text recognition is performed on the target image to obtain its text feature information, and image recognition is performed on the first sub-image to obtain the image feature information corresponding to the first sub-image, a blank presentation page can be created, and the target presentation corresponding to the target image is restored according to the image feature information of the first sub-image and the text feature information of the target image.
Specifically, the background image feature information of the first sub-image is used to generate the background image region of the target image; the foreground image feature information of the first sub-image is used to generate the foreground image region of the target image; the text feature information corresponding to the target image is used to generate the text content of the target image; and the background image region, the foreground image region, and the text content of the target image are composited to obtain the target presentation corresponding to the target image.
Using the background image feature information of the first sub-image, that is, information such as the page layout, background picture, and decorations, the background image region of the target presentation corresponding to the target image is restored on the newly created blank presentation. Using the foreground feature information of the first sub-image, that is, information such as the foreground image, the layer in which it is located, and the position of that layer, the layer and foreground image of the first sub-image are added to the presentation on which the background image region has been restored, restoring the foreground image region of the target presentation. Finally, according to the text feature information corresponding to the target image, the text content of the target image is generated and added to the presentation on which the background and foreground image regions have been restored, yielding the target presentation corresponding to the target image.
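The layering order described above can be sketched as follows. This is only an illustration under stated assumptions: the slide is modeled as a NumPy composite (background laid down first, foreground patch pasted at its recorded layer position, text drawn last); an actual presentation file would be built with a library such as python-pptx.

```python
# Compose restored layers in order: background first, then foreground patch.
import numpy as np

def compose_slide(background: np.ndarray, foreground: np.ndarray, pos):
    """Paste the foreground patch onto a copy of the background at (row, col)."""
    slide = background.copy()
    r, c = pos
    h, w = foreground.shape[:2]
    slide[r:r + h, c:c + w] = foreground
    return slide

bg = np.full((4, 6, 3), 10, dtype=np.uint8)   # restored background region
fg = np.full((2, 2, 3), 250, dtype=np.uint8)  # restored foreground region
slide = compose_slide(bg, fg, (1, 2))
print(slide[1, 2], slide[0, 0])  # [250 250 250] [10 10 10]
```

The text content from the text feature information would be rendered onto this composite in a final pass, completing the restored page.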
Optionally, the electronic device in the present application may be any node device in a blockchain. A blockchain is a new application mode of computer technologies such as distributed data storage, peer-to-peer (P2P) transmission, consensus mechanisms, and encryption algorithms; in essence, it is a decentralized database. A blockchain can consist of multiple chained transaction records (also called blocks) linked and protected by cryptography, and the distributed ledger they form allows multiple parties to record transactions effectively and to verify them permanently (they cannot be tampered with). A consensus mechanism is a mathematical algorithm by which different nodes in a blockchain network establish trust and acquire rights and interests; that is, it is a mathematical algorithm jointly recognized by the network nodes of the blockchain. The present application can use the consensus mechanism of the blockchain to restore the target image to the target presentation, which can improve the accuracy of the restoration.
For example, each node device in the blockchain performs consensus verification on the execution results of steps S101-S105. If the execution result of every step passes consensus verification, it can be determined that the accuracy of the generated target presentation is relatively high; if the execution result of any step fails consensus verification, it can be determined that the accuracy is relatively low, and the node device can perform steps S101-S105 again to obtain the target presentation anew. Alternatively, the node devices in the blockchain can perform consensus verification on the target presentation alone (that is, only on the execution result of step S105): if the verification passes, the accuracy of the target presentation is determined to be relatively high; if it fails, the accuracy is determined to be relatively low, and the node device can perform steps S101-S105 again to obtain the target presentation anew.
In the embodiments of the present application, text recognition is first performed on the target image to obtain the text feature information corresponding to the target image, and the text content in the target image is then removed to obtain the first sub-image, which prevents the text content from interfering with subsequent processing. After the first sub-image is obtained, image recognition is performed on it to obtain its image feature information, and the target presentation corresponding to the target image is generated according to the text feature information of the target image and the image feature information of the first sub-image. In other words, the text feature information and the image feature information of the target image are obtained through layered processing, and the target presentation is generated from both. This process of restoring a presentation requires no manual participation and can improve both the efficiency and the accuracy of the restoration.
Refer to FIG. 3, which is a schematic flowchart of another presentation generation method provided by an embodiment of the present application and executed by the electronic device of the embodiment. This method includes the following steps S201-S209.
S201: Acquire a file to be processed, where the file to be processed includes a target image generated according to a target presentation in a target presentation file.
S202: Perform text recognition on the target image to obtain text feature information corresponding to the target image.
S203: Remove the text content from the target image according to the text feature information to obtain a first sub-image corresponding to the target image.
S204: Perform image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to that first sub-image.
S205: Generate a target presentation corresponding to the target image according to the text feature information and the image feature information.
In this embodiment, for the content of steps S201-S205 of this method, reference may be made to the content shown in FIG. 1, which will not be repeated here.
Optionally, step S205 may include: performing correction processing on the text content included in the text feature information to obtain corrected text feature information, and generating the target presentation corresponding to the target image according to the corrected text feature information and the image feature information.
The electronic device may perform correction processing on the text content included in the text feature information to obtain corrected text feature information. The correction processing here includes correcting typos in the text content and normalizing at least one of the size, color, and font of text content located at the same position. The target presentation corresponding to the target image is then generated according to the corrected text feature information and the image feature information. Correcting the text feature information improves its accuracy and, in turn, the accuracy of the restored presentation.
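The normalization step above can be sketched as follows: text fragments in the same region have each attribute (size, color, font) replaced by that attribute's most common value within the region. The record layout (`region`, `size`, `color`, `font` keys) is an illustrative assumption; the embodiment does not specify how the text feature information is structured.

```python
from collections import Counter


def normalize_text_features(text_items):
    # Group recognized text fragments by the region they occupy, then
    # normalize size, color, and font within each group to the most
    # frequent value, per the correction processing described above.
    by_region = {}
    for item in text_items:
        by_region.setdefault(item["region"], []).append(dict(item))
    normalized = []
    for items in by_region.values():
        for attr in ("size", "color", "font"):
            mode = Counter(i[attr] for i in items).most_common(1)[0][0]
            for i in items:
                i[attr] = mode  # overwrite outliers with the mode
        normalized.extend(items)
    return normalized
```

A typo-correction pass (for example against a dictionary) would run before or after this attribute normalization; it is omitted here.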
S206: Generate, according to the text feature information and image feature information of each frame of image in the file to be processed, a presentation corresponding to each frame of image in the file to be processed.
Here, the file to be processed includes at least one frame of image, and each frame of image corresponds to one page of the target presentation file. Text recognition and image recognition may be performed on the at least one frame of image in the file to be processed to obtain the text feature information and image feature information corresponding to each frame, from which the presentation corresponding to each frame of image is generated.
S207: Determine the arrangement order of each frame of image in the file to be processed according to the candidate file.
After the presentation corresponding to each frame of image has been generated from that frame's text feature information and image feature information, and since each frame of image in the file to be processed is obtained by segmenting the candidate file, the arrangement order of the frames can be determined from the candidate file, that is, from the position of each frame within the candidate file. Alternatively, if at least one frame of image in the file to be processed was generated from single-page picture files or single-page portable files uploaded by the user in order, where the content of each such file corresponds to one page of the target presentation file, the upload order of those files may determine the arrangement order of the frames in the file to be processed.
S208: Arrange the presentations corresponding to each frame of image in the file to be processed according to the arrangement order.
S209: Generate the target presentation file from the arranged presentations.
The arrangement order of each frame of image in the file to be processed is obtained, the presentations corresponding to the frames are arranged according to that order, and the arranged presentations are used to generate the target presentation file.
Optionally, the theme information of each presentation and the association relationships between the theme information are obtained, the arrangement order of the presentations is determined according to those relationships, the presentations are sorted in that order, and the sorted presentations are used to generate the target presentation file.
The electronic device may determine the theme information of each presentation from the font size of its text content, or from the position information of its text content. The association relationships between the theme information can then be obtained; here, an association relationship may refer to a containment relationship between themes, from which the arrangement order of the presentations is determined. For example, suppose the presentation file is a company's annual work summary, the theme of presentation 1 is the annual work overview, and the theme of presentation 2 is the work completion status. Since the annual work overview contains the work completion status, presentation 1 is placed before presentation 2. The presentations are sorted in this order, and the sorted presentations are used to generate the target presentation file. Sorting the presentations according to the relationships between them improves the accuracy of the restored presentation.
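The containment-based ordering in the annual-summary example can be sketched as a comparison sort. The `themes` mapping and the `contains` predicate are assumptions standing in for however the embodiment extracts theme information and its containment relationships.

```python
from functools import cmp_to_key


def order_by_theme(themes, contains):
    # Sort slide ids so that a slide whose theme contains another
    # slide's theme comes before it, as in the example above where the
    # annual work overview precedes the work completion status.
    # `themes` maps slide id -> theme; `contains(a, b)` is an assumed
    # predicate meaning theme `a` subsumes theme `b`.
    def cmp(x, y):
        if contains(themes[x], themes[y]):
            return -1  # x's theme contains y's, so x comes first
        if contains(themes[y], themes[x]):
            return 1
        return 0       # unrelated themes keep their relative order
    return sorted(themes, key=cmp_to_key(cmp))
```

Note that a pairwise containment relation is not guaranteed to be a total order; this sketch simply leaves unrelated slides in their original relative order.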
In the embodiments of the present application, text recognition is performed on the target image to obtain its corresponding text feature information, and the text content is then removed from the target image to obtain the first sub-image, which prevents the text content from interfering with subsequent processing. Image recognition is then performed on the first sub-image to obtain its image feature information, and the target presentation corresponding to the target image is generated from the text feature information of the target image and the image feature information of the first sub-image. A presentation is generated for each frame of image in the file to be processed from that frame's text feature information and image feature information, the arrangement order of the frames is determined from the candidate file, the per-frame presentations are arranged in that order, and the arranged presentations are used to generate the target presentation file. That is, the text feature information and image feature information in the target image are obtained through layered processing and used to generate the target presentation corresponding to the target image; this restoration process requires no manual participation, which improves the efficiency of restoring the presentation. In addition, the restored presentations are arranged according to the order of the frames in the file to be processed, which improves the accuracy of the restoration.
Refer to FIG. 4, which is a schematic structural diagram of a presentation generation apparatus provided by an embodiment of the present application; the apparatus may reside in the electronic device mentioned above. In this embodiment, the apparatus includes: a first acquisition module 11, configured to acquire a file to be processed, where the file to be processed is generated according to a target presentation in a target presentation file and includes a target image. The first acquisition module includes a second acquisition unit, a third acquisition unit, a first segmentation unit, and an image conversion unit.
The second acquisition unit is configured to acquire a candidate file generated according to the target presentation file.
The third acquisition unit is configured to obtain the theme information in the candidate file from the text content of the candidate file.
The first segmentation unit is configured to segment the candidate file according to the theme information to obtain multiple candidate sub-files.
The image conversion unit is configured to perform image conversion processing on the multiple candidate sub-files to obtain multiple frames of images and use the multiple frames of images as the file to be processed, where the target image is any frame of image in the file to be processed.
The first acquisition module further includes a fourth acquisition unit, a receiving unit, and a second segmentation unit.
The fourth acquisition unit is configured to acquire a candidate file generated according to the target presentation file.
The receiving unit is configured to receive a segmentation instruction for the candidate file, where the segmentation instruction includes a size trimming ratio.
The second segmentation unit is configured to segment the candidate file according to the size trimming ratio to obtain multiple frames of images and use the multiple frames of images as the file to be processed, where the target image is any frame of image in the file to be processed.
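Segmentation by a size trimming ratio can be sketched as slicing a tall page image into equal-height frames. Treating the ratio as frame-height over page-width is an assumption; the embodiment does not define the ratio's exact semantics.

```python
import numpy as np


def segment_by_ratio(page, ratio):
    # Cut a tall candidate-file image into frames whose height equals
    # `ratio` times the page width, per the size-trimming instruction.
    # Each resulting frame is assumed to correspond to one page of the
    # target presentation file.
    h, w = page.shape[:2]
    frame_h = max(1, int(w * ratio))
    return [page[top:top + frame_h] for top in range(0, h, frame_h)]
```

The final slice may be shorter than `frame_h` when the page height is not an exact multiple; a real system would pad or discard it.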
The first recognition module 12 is configured to perform text recognition on the target image to obtain the text feature information corresponding to the target image.
The removal module 13 is configured to remove the text content from the target image according to the text feature information to obtain the first sub-image corresponding to the target image.
The second recognition module 14 is configured to perform image recognition on the first sub-image corresponding to the target image to obtain the image feature information corresponding to that first sub-image.
The second recognition module includes a first acquisition unit, a determination unit, a first recognition unit, a removal unit, and a second recognition unit.
The first acquisition unit is configured to acquire the pixel feature information of the first sub-image corresponding to the target image.
The determination unit is configured to determine the foreground image area of the first sub-image according to the pixel feature information.
The first recognition unit is configured to perform image recognition on the foreground image area to obtain the foreground image feature information of the first sub-image.
The removal unit is configured to remove the foreground image area from the first sub-image to obtain a second sub-image.
The second recognition unit is configured to perform image recognition on the second sub-image to obtain the background image feature information of the second sub-image, and to use the background image feature information of the second sub-image as the background image feature information of the first sub-image.
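The pixel-feature-based split performed by the determination unit can be sketched with a deliberately simple rule: treat the image's dominant color as background and mark pixels that differ from it as foreground. This is only a stand-in for whatever pixel-feature analysis the embodiment actually uses; the `tol` threshold is an assumption.

```python
import numpy as np


def split_foreground_background(img, tol=30):
    # Take the most frequent pixel color as the slide background, then
    # mark every pixel whose summed per-channel difference from that
    # color exceeds `tol` as foreground.
    pixels = img.reshape(-1, img.shape[-1])
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    background = colors[counts.argmax()]          # dominant color
    diff = np.abs(img.astype(int) - background.astype(int)).sum(axis=-1)
    return diff > tol, background                 # foreground mask, bg color
```

The foreground mask then delimits the region handled by the first recognition unit, while the remainder (the second sub-image) is handled by the second recognition unit.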
The generation module 15 is configured to generate the target presentation corresponding to the target image according to the text feature information and the image feature information.
The generation module includes a first generation unit, a second generation unit, and a splicing unit.
The first generation unit is configured to generate the background image area of the target image using the background image feature information of the first sub-image.
The second generation unit is configured to generate the foreground image area of the target image using the foreground image feature information of the first sub-image.
The second generation unit is further configured to generate the text content of the target image using the text feature information corresponding to the target image.
The splicing unit is configured to splice the background image area, the foreground image area, and the text content of the target image to obtain the target presentation corresponding to the target image.
After the text content is removed from the target image according to the text feature information to obtain the first sub-image corresponding to the target image, the apparatus further includes: a second acquisition module, configured to take the area of the target image from which the text content has been removed as a target area; a first determination module, configured to determine, according to the text feature information, the color information corresponding to the text content in the target image; and a filling module, configured to fill the target area using the color information corresponding to the text content.
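The filling module's operation can be sketched as writing the derived color into the masked region. The mask/color representation is an assumption, and the color-derivation step (the first determination module) is taken as already done upstream.

```python
import numpy as np


def fill_text_region(img, text_mask, fill_color):
    # Fill the region left after text removal (True in `text_mask`)
    # with the color derived from the text feature information, as the
    # filling module above describes.
    out = img.copy()
    out[text_mask] = fill_color
    return out
```

A production system might instead inpaint the hole from its surrounding background pixels; the embodiment only specifies a color fill.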
The apparatus further includes: a second generation module, configured to generate, according to the text feature information and image feature information of each frame of image in the file to be processed, the presentation corresponding to each frame of image in the file to be processed; a second determination module, configured to determine the arrangement order of each frame of image in the file to be processed according to the candidate file; an arrangement module, configured to arrange the presentations corresponding to the frames according to the arrangement order; and a third generation module, configured to generate the target presentation file from the arranged presentations.
In the embodiments of the present application, text recognition is performed on the target image to obtain its corresponding text feature information, and the text content is then removed from the target image to obtain the first sub-image, which prevents the text content from interfering with subsequent processing. After the first sub-image is obtained, image recognition is performed on it to obtain its image feature information, and the target presentation corresponding to the target image is generated from the text feature information of the target image and the image feature information of the first sub-image. That is, the text feature information and image feature information in the target image are obtained through layered processing and used to generate the target presentation corresponding to the target image; this process of restoring a presentation requires no manual participation, which improves both the efficiency and the accuracy of the restoration.
Refer to FIG. 5, which is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 5, the electronic device in this embodiment may include one or more processors 21, one or more input devices 22, one or more output devices 23, and a memory 24. The processor 21, the input device 22, the output device 23, and the memory 24 are connected by a bus 25.
The processor 21 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The input device 22 may include a touch panel, a fingerprint sensor (for collecting the user's fingerprint information and fingerprint orientation information), a microphone, and the like; the output device 23 may include a display (such as an LCD), a speaker, and the like, and may output the corrected data table.
The memory 24 may include a read-only memory and a random access memory, and provides instructions and data to the processor 21. A part of the memory 24 may also include a non-volatile random access memory. The memory 24 is configured to store a computer program comprising program instructions, and the processor 21 is configured to execute the program instructions stored in the memory 24 so as to perform a presentation generation method, that is, to perform the following operations: acquiring a file to be processed, where the file to be processed includes a target image generated according to a target presentation in a target presentation file; performing text recognition on the target image to obtain text feature information corresponding to the target image; removing the text content from the target image according to the text feature information to obtain a first sub-image corresponding to the target image; performing image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image; and generating the target presentation corresponding to the target image according to the text feature information and the image feature information.
Optionally, the processor 21 is configured to execute the program instructions stored in the memory 24 to perform the following operations, where the image feature information includes foreground image feature information and background image feature information: acquiring the pixel feature information of the first sub-image corresponding to the target image; determining the foreground image area of the first sub-image according to the pixel feature information; performing image recognition on the foreground image area to obtain the foreground image feature information of the first sub-image; removing the foreground image area from the first sub-image to obtain a second sub-image; performing image recognition on the second sub-image to obtain the background image feature information of the second sub-image; and using the background image feature information of the second sub-image as the background image feature information of the first sub-image.
Optionally, the processor 21 is configured to execute the program instructions stored in the memory 24 to perform the following operations: generating the background image area of the target image using the background image feature information of the first sub-image; generating the foreground image area of the target image using the foreground image feature information of the first sub-image; generating the text content of the target image using the text feature information corresponding to the target image; and splicing the background image area, the foreground image area, and the text content of the target image to obtain the target presentation corresponding to the target image.
Optionally, the processor 21 is configured to execute the program instructions stored in the memory 24 to perform the following operations: taking the area of the target image from which the text content has been removed as a target area; determining, according to the text feature information, the color information corresponding to the text content in the target image; and filling the target area using the color information corresponding to the text content.
Optionally, the processor 21 is configured to execute the program instructions stored in the memory 24 to perform the following operations: acquiring a candidate file generated according to the target presentation file; obtaining the theme information in the candidate file from the text content of the candidate file; segmenting the candidate file according to the theme information to obtain multiple candidate sub-files; and performing image conversion processing on the multiple candidate sub-files to obtain multiple frames of images, which are used as the file to be processed, where the target image is any frame of image in the file to be processed.
Optionally, the processor 21 is configured to execute the program instructions stored in the memory 24 to perform the following operations: acquiring a candidate file generated according to the target presentation file; receiving a segmentation instruction for the candidate file, where the segmentation instruction includes a size trimming ratio; and segmenting the candidate file according to the size trimming ratio to obtain multiple frames of images, which are used as the file to be processed, where the target image is any frame of image in the file to be processed.
Optionally, the processor 21 is configured to execute the program instructions stored in the memory 24 to perform the following operations: generating, according to the text feature information and image feature information of each frame of image in the file to be processed, the presentation corresponding to each frame of image; determining the arrangement order of each frame of image in the file to be processed according to the candidate file; arranging the presentations corresponding to the frames according to the arrangement order; and generating the target presentation file from the arranged presentations.
The processor 21, the input device 22, and the output device 23 described in the embodiments of the present application may carry out the implementations described in the first and second embodiments of the presentation generation method provided by the embodiments of the present application, and may also carry out the implementation of the electronic device described in the embodiments of the present application, which will not be repeated here.
In the embodiments of the present application, text recognition is performed on the target image to obtain its corresponding text feature information, and the text content is then removed from the target image to obtain the first sub-image, which prevents the text content from interfering with subsequent processing. After the first sub-image is obtained, image recognition is performed on it to obtain its image feature information, and the target presentation corresponding to the target image is generated from the text feature information of the target image and the image feature information of the first sub-image. That is, the text feature information and image feature information in the target image are obtained through layered processing and used to generate the target presentation corresponding to the target image; this process of restoring a presentation requires no manual participation, which improves both the efficiency and the accuracy of the restoration.
本申请实施例中还提供一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序包括程序指令,所述程序指令被处理器执行时实现如图1及图3实施例中所示的演示文稿生成方法。所述计算机可读存储介质可以是非易失性,也可以是易失性的。An embodiment of the present application also provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and the computer program includes program instructions. When the program instructions are executed by a processor, the implementation is shown in FIG. 1 and FIG. 3 The presentation generation method shown in the embodiment. The computer-readable storage medium may be non-volatile or volatile.
所述计算机可读存储介质可以是前述任一实施例所述的电子设备的内部存储单元,例如控制设备的硬盘或内存。所述计算机可读存储介质也可以是所述控制设备的外部存储设备,例如所述控制设备上配备的插接式硬盘,智能存储卡(Smart Media Card, SMC),安全数字(Secure Digital, SD)卡,闪存卡(Flash Card)等。进一步地,所述计算机可读存储介质还可以既包括所述控制设备的内部存储单元也包括外部存储设备。所述计算机可读存储介质用于存储所述计算机程序以及所述控制设备所需的其他程序和数据。所述计算机可读存储介质还可以用于暂时地存储已经输出或者将要输出的数据。The computer-readable storage medium may be an internal storage unit of the electronic device described in any of the foregoing embodiments, such as a hard disk or a memory of a control device. The computer-readable storage medium may also be an external storage device of the control device, such as a plug-in hard disk equipped on the control device, a smart memory card (Smart Media Card, SMC), or a secure digital (Secure Digital, SD) card, flash card (Flash Card), etc. Further, the computer-readable storage medium may also include both an internal storage unit of the control device and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the control device. The computer-readable storage medium can also be used to temporarily store data that has been output or will be output.
As an example, the above computer-readable storage medium may be deployed and executed on one computer device, on multiple computer devices located at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network; multiple computer devices distributed across multiple sites and interconnected by a communication network may form a blockchain network.
The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any changes or substitutions that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

  1. A presentation generation method, comprising:
    acquiring a file to be processed, the file to be processed comprising a target image, the target image being generated from a target presentation in a target presentation file;
    performing text recognition on the target image to obtain text feature information corresponding to the target image;
    removing text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image;
    performing image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image; and
    generating a target presentation corresponding to the target image according to the text feature information and the image feature information.
  2. The method according to claim 1, wherein the image feature information comprises foreground image feature information and background image feature information; and
    the performing image recognition on the first sub-image corresponding to the target image to obtain the image feature information corresponding to the first sub-image comprises:
    acquiring pixel feature information of the first sub-image corresponding to the target image;
    determining a foreground image area of the first sub-image according to the pixel feature information;
    performing image recognition on the foreground image area to obtain the foreground image feature information of the first sub-image;
    removing the foreground image area from the first sub-image to obtain a second sub-image;
    performing image recognition on the second sub-image to obtain background image feature information of the second sub-image; and
    using the background image feature information of the second sub-image as the background image feature information of the first sub-image.
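An illustrative sketch (not part of the claims) of the separation recited in claim 2, under the assumption that pixel intensity serves as the pixel feature information and a fixed threshold marks the foreground image area:

```python
import numpy as np

def split_foreground_background(first_sub):
    # Pixel feature: intensity. Pixels darker than the threshold form the
    # foreground image area (an assumption; any pixel feature could be used).
    foreground_mask = first_sub < 128
    # "Remove" the foreground area to obtain the second sub-image by
    # replacing it with the median background value.
    background_value = int(np.median(first_sub[~foreground_mask]))
    second_sub = first_sub.copy()
    second_sub[foreground_mask] = background_value
    foreground_info = {"area": int(foreground_mask.sum())}
    # The second sub-image's background feature is taken as the first
    # sub-image's background feature, matching the last step of the claim.
    background_info = {"value": background_value}
    return foreground_info, background_info, second_sub

img = np.full((8, 8), 200, dtype=np.uint8)
img[1:4, 1:4] = 10  # a dark foreground shape
fg, bg, second = split_foreground_background(img)
print(fg["area"], bg["value"])  # 9 200
```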
  3. The method according to claim 2, wherein the generating the target presentation corresponding to the target image according to the text feature information and the image feature information comprises:
    generating a background image area of the target image using the background image feature information of the first sub-image;
    generating a foreground image area of the target image using the foreground image feature information of the first sub-image;
    generating text content of the target image using the text feature information corresponding to the target image; and
    splicing the background image area, the foreground image area, and the text content of the target image to obtain the target presentation corresponding to the target image.
  4. The method according to any one of claims 1 to 3, wherein after the removing the text content in the target image according to the text feature information to obtain the first sub-image corresponding to the target image, the method further comprises:
    acquiring the area of the target image from which the text content has been removed as a target area;
    determining, according to the text feature information, color information corresponding to the text content in the target image; and
    filling the target area using the color information corresponding to the text content.
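The fill step of claim 4 could be sketched as follows (illustrative only, not part of the claims); the `box` and `color` fields stand in for the color information carried by the text feature information, and both field names are assumptions:

```python
import numpy as np

def fill_target_area(image, text_feature):
    # Fill the target area (the region left after text removal) with the
    # color taken from the text feature information, per claim 4.
    filled = image.copy()
    top, left, bottom, right = text_feature["box"]
    filled[top:bottom, left:right] = text_feature["color"]
    return filled

img = np.zeros((6, 6), dtype=np.uint8)
feature = {"box": (1, 1, 3, 5), "color": 200}  # hypothetical feature record
result = fill_target_area(img, feature)
```

A production system might instead inpaint from surrounding pixels; a constant fill is the simplest reading of the claim.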
  5. The method according to claim 1, wherein the acquiring the file to be processed comprises:
    acquiring a candidate file, the candidate file being generated from the target presentation file;
    acquiring subject information of the candidate file from text content in the candidate file;
    segmenting the candidate file according to the subject information to obtain a plurality of candidate sub-files; and
    performing image conversion processing on the plurality of candidate sub-files to obtain multiple frames of images, and using the multiple frames of images as the file to be processed, the target image being any one frame of image in the file to be processed.
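The topic-based segmentation of claim 5 could be sketched as follows (illustrative only, not part of the claims); treating prefixed lines as subject information and describing each "frame" with a plain dictionary are assumptions, since real topic extraction and rasterization would be far richer:

```python
def split_by_topic(candidate_lines, topic_prefix="# "):
    # Segment the candidate file into sub-files at each topic heading.
    sub_files, current = [], []
    for line in candidate_lines:
        if line.startswith(topic_prefix) and current:
            sub_files.append(current)
            current = []
        current.append(line)
    if current:
        sub_files.append(current)
    return sub_files

def render_to_frames(sub_files):
    # Stub image conversion: each sub-file becomes one frame descriptor;
    # a real implementation would rasterize each sub-file to a bitmap.
    return [{"frame": i, "lines": len(sub)} for i, sub in enumerate(sub_files)]

doc = ["# Intro", "text a", "# Results", "text b", "text c"]
frames = render_to_frames(split_by_topic(doc))
print(len(frames))  # 2
```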
  6. The method according to claim 1, wherein the acquiring the file to be processed comprises:
    acquiring a candidate file, the candidate file being generated from the target presentation file;
    receiving a segmentation instruction for the candidate file, the segmentation instruction including a size cropping ratio; and
    segmenting the candidate file according to the size cropping ratio to obtain multiple frames of images, and using the multiple frames of images as the file to be processed, the target image being any one frame of image in the file to be processed.
  7. The method according to claim 6, further comprising:
    generating, according to text feature information and image feature information of each frame of image in the file to be processed, a presentation corresponding to that frame of image;
    determining an arrangement order of the frames of images in the file to be processed according to the candidate file;
    arranging the presentations corresponding to the frames of images in the file to be processed in the arrangement order; and
    generating the target presentation file from the arranged presentations.
  8. A presentation generation apparatus, comprising:
    a first acquisition module, configured to acquire a file to be processed, the file to be processed being generated from a target presentation in a target presentation file and comprising a target image;
    a first recognition module, configured to perform text recognition on the target image to obtain text feature information corresponding to the target image;
    a removal module, configured to remove text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image;
    a second recognition module, configured to perform image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image; and
    a generation module, configured to generate a target presentation corresponding to the target image according to the text feature information and the image feature information.
  9. An electronic device, comprising:
    a processor adapted to implement one or more instructions; and
    a computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by the processor to perform the following steps:
    acquiring a file to be processed, the file to be processed comprising a target image, the target image being generated from a target presentation in a target presentation file;
    performing text recognition on the target image to obtain text feature information corresponding to the target image;
    removing text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image;
    performing image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image; and
    generating a target presentation corresponding to the target image according to the text feature information and the image feature information.
  10. The electronic device according to claim 9, wherein the image feature information comprises foreground image feature information and background image feature information; and
    the processor is configured to:
    acquire pixel feature information of the first sub-image corresponding to the target image;
    determine a foreground image area of the first sub-image according to the pixel feature information;
    perform image recognition on the foreground image area to obtain the foreground image feature information of the first sub-image;
    remove the foreground image area from the first sub-image to obtain a second sub-image;
    perform image recognition on the second sub-image to obtain background image feature information of the second sub-image; and
    use the background image feature information of the second sub-image as the background image feature information of the first sub-image.
  11. The electronic device according to claim 10, wherein the processor is configured to:
    generate a background image area of the target image using the background image feature information of the first sub-image;
    generate a foreground image area of the target image using the foreground image feature information of the first sub-image;
    generate text content of the target image using the text feature information corresponding to the target image; and
    splice the background image area, the foreground image area, and the text content of the target image to obtain the target presentation corresponding to the target image.
  12. The electronic device according to any one of claims 9 to 11, wherein the processor is further configured to:
    acquire the area of the target image from which the text content has been removed as a target area;
    determine, according to the text feature information, color information corresponding to the text content in the target image; and
    fill the target area using the color information corresponding to the text content.
  13. The electronic device according to claim 9, wherein the processor is configured to:
    acquire a candidate file, the candidate file being generated from the target presentation file;
    acquire subject information of the candidate file from text content in the candidate file;
    segment the candidate file according to the subject information to obtain a plurality of candidate sub-files; and
    perform image conversion processing on the plurality of candidate sub-files to obtain multiple frames of images, and use the multiple frames of images as the file to be processed, the target image being any one frame of image in the file to be processed.
  14. The electronic device according to claim 9, wherein the processor is configured to:
    acquire a candidate file, the candidate file being generated from the target presentation file;
    receive a segmentation instruction for the candidate file, the segmentation instruction including a size cropping ratio; and
    segment the candidate file according to the size cropping ratio to obtain multiple frames of images, and use the multiple frames of images as the file to be processed, the target image being any one frame of image in the file to be processed.
  15. The electronic device according to claim 14, wherein the processor is further configured to:
    generate, according to text feature information and image feature information of each frame of image in the file to be processed, a presentation corresponding to that frame of image;
    determine an arrangement order of the frames of images in the file to be processed according to the candidate file;
    arrange the presentations corresponding to the frames of images in the file to be processed in the arrangement order; and
    generate the target presentation file from the arranged presentations.
  16. A computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by a processor to perform the following steps:
    acquiring a file to be processed, the file to be processed comprising a target image, the target image being generated from a target presentation in a target presentation file;
    performing text recognition on the target image to obtain text feature information corresponding to the target image;
    removing text content in the target image according to the text feature information to obtain a first sub-image corresponding to the target image;
    performing image recognition on the first sub-image corresponding to the target image to obtain image feature information corresponding to the first sub-image; and
    generating a target presentation corresponding to the target image according to the text feature information and the image feature information.
  17. The computer-readable storage medium according to claim 16, wherein the image feature information comprises foreground image feature information and background image feature information; and
    the one or more instructions are adapted to be loaded by the processor to further perform the following steps:
    acquiring pixel feature information of the first sub-image corresponding to the target image;
    determining a foreground image area of the first sub-image according to the pixel feature information;
    performing image recognition on the foreground image area to obtain the foreground image feature information of the first sub-image;
    removing the foreground image area from the first sub-image to obtain a second sub-image;
    performing image recognition on the second sub-image to obtain background image feature information of the second sub-image; and
    using the background image feature information of the second sub-image as the background image feature information of the first sub-image.
  18. The computer-readable storage medium according to claim 17, wherein the one or more instructions are adapted to be loaded by the processor to further perform the following steps:
    generating a background image area of the target image using the background image feature information of the first sub-image;
    generating a foreground image area of the target image using the foreground image feature information of the first sub-image;
    generating text content of the target image using the text feature information corresponding to the target image; and
    splicing the background image area, the foreground image area, and the text content of the target image to obtain the target presentation corresponding to the target image.
  19. The computer-readable storage medium according to any one of claims 16 to 18, wherein the one or more instructions are adapted to be loaded by the processor to further perform the following steps:
    acquiring the area of the target image from which the text content has been removed as a target area;
    determining, according to the text feature information, color information corresponding to the text content in the target image; and
    filling the target area using the color information corresponding to the text content.
  20. The computer-readable storage medium according to claim 16, wherein the one or more instructions are adapted to be loaded by the processor to further perform the following steps:
    acquiring a candidate file, the candidate file being generated from the target presentation file;
    acquiring subject information of the candidate file from text content in the candidate file;
    segmenting the candidate file according to the subject information to obtain a plurality of candidate sub-files; and
    performing image conversion processing on the plurality of candidate sub-files to obtain multiple frames of images, and using the multiple frames of images as the file to be processed, the target image being any one frame of image in the file to be processed.
PCT/CN2020/118118 2020-06-28 2020-09-27 Presentation generation method, apparatus, and device, and medium WO2021114824A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010598196.4A CN111753108B (en) 2020-06-28 2020-06-28 Presentation generation method, device, equipment and medium
CN202010598196.4 2020-06-28

Publications (1)

Publication Number Publication Date
WO2021114824A1 true WO2021114824A1 (en) 2021-06-17

Family

ID=72677626

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118118 WO2021114824A1 (en) 2020-06-28 2020-09-27 Presentation generation method, apparatus, and device, and medium

Country Status (2)

Country Link
CN (1) CN111753108B (en)
WO (1) WO2021114824A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114327170A (en) * 2021-12-31 2022-04-12 北京安博盛赢教育科技有限责任公司 Method, device, medium and electronic equipment for generating communication group

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060072830A1 (en) * 2004-02-26 2006-04-06 Xerox Corporation Method for automated image indexing and retrieval
CN109313695A (en) * 2016-05-18 2019-02-05 诺基亚技术有限公司 For restoring the apparatus, method, and computer program product of editable lantern slide
CN110493640A (en) * 2019-08-01 2019-11-22 东莞理工学院 A kind of system and method that the Video Quality Metric based on video processing is PPT
CN111126301A (en) * 2019-12-26 2020-05-08 腾讯科技(深圳)有限公司 Image processing method and device, computer equipment and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060153448A1 (en) * 2005-01-13 2006-07-13 International Business Machines Corporation System and method for adaptively separating foreground from arbitrary background in presentations
CN101533474B (en) * 2008-03-12 2014-06-04 三星电子株式会社 Character and image recognition system based on video image and method thereof
CN102831106A (en) * 2012-08-27 2012-12-19 腾讯科技(深圳)有限公司 Electronic document generation method of mobile terminal and mobile terminal
CN105791950A (en) * 2014-12-24 2016-07-20 珠海金山办公软件有限公司 Power Point video recording method and device
CN111444361A (en) * 2017-06-07 2020-07-24 邹时月 Chat group information management method based on shooting time and position information
US10528807B2 (en) * 2018-05-01 2020-01-07 Scribe Fusion, LLC System and method for processing and identifying content in form documents
CN110838105B (en) * 2019-10-30 2023-09-15 南京大学 Business process model image recognition and reconstruction method
CN111275048B (en) * 2020-01-15 2023-04-18 山东浪潮科学研究院有限公司 PPT reproduction method based on OCR character recognition technology


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114327170A (en) * 2021-12-31 2022-04-12 北京安博盛赢教育科技有限责任公司 Method, device, medium and electronic equipment for generating communication group
CN114327170B (en) * 2021-12-31 2023-12-05 北京安博盛赢教育科技有限责任公司 Alternating current group generation method and device, medium and electronic equipment

Also Published As

Publication number Publication date
CN111753108A (en) 2020-10-09
CN111753108B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
CN111062871B (en) Image processing method and device, computer equipment and readable storage medium
CN111950424B (en) Video data processing method and device, computer and readable storage medium
CN111553267B (en) Image processing method, image processing model training method and device
EP3631750B1 (en) Image resolution enhancement
CN111340077B (en) Attention mechanism-based disparity map acquisition method and device
CN111930976B (en) Presentation generation method, device, equipment and storage medium
US20150154806A1 (en) Aligning Digital 3D Models Using Synthetic Images
CN113505848B (en) Model training method and device
CN109712082B (en) Method and device for collaboratively repairing picture
JP2022133378A (en) Face biological detection method, device, electronic apparatus, and storage medium
WO2022227218A1 (en) Drug name recognition method and apparatus, and computer device and storage medium
JP2021005164A (en) Character recognition device, imaging device, character recognition method, and character recognition program
US20140029854A1 (en) Metadata supersets for matching images
JP2023526899A (en) Methods, devices, media and program products for generating image inpainting models
CN111881904A (en) Blackboard writing recording method and system
WO2021114824A1 (en) Presentation generation method, apparatus, and device, and medium
CN112528978B (en) Face key point detection method and device, electronic equipment and storage medium
CN114359159A (en) Video generation method, system, electronic device and storage medium
US10991085B2 (en) Classifying panoramic images
CN112750065B (en) Carrier object processing and watermark embedding method, device and electronic equipment
CN110598785B (en) Training sample image generation method and device
CN112102145B (en) Image processing method and device
CN111860486B (en) Card identification method, device and equipment
US20220217321A1 (en) Method of training a neural network configured for converting 2d images into 3d models
CN114329050A (en) Visual media data deduplication processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20900466; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20900466; Country of ref document: EP; Kind code of ref document: A1)