WO2021134229A1 - Text identification method, device, storage medium, and electronic apparatus - Google Patents

Info

Publication number
WO2021134229A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
video
electronic device
images
cpu
Prior art date
Application number
PCT/CN2019/129963
Other languages
French (fr)
Chinese (zh)
Inventor
郭子亮
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Application filed by 深圳市欢太科技有限公司, Oppo广东移动通信有限公司 filed Critical 深圳市欢太科技有限公司
Priority to PCT/CN2019/129963 priority Critical patent/WO2021134229A1/en
Priority to CN201980100391.5A priority patent/CN114391260A/en
Publication of WO2021134229A1 publication Critical patent/WO2021134229A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects

Definitions

  • This application belongs to the field of electronic technology, and in particular relates to a character recognition method, device, storage medium and electronic equipment.
  • Video images in electronic devices such as smart phones often contain a large amount of information content.
  • the video image may also contain text information, which is usually the display of important information of the video playback content. Compared with the ever-changing image information, it is usually easier to understand which type of video the electronic device is playing by recognizing and analyzing text information.
  • character recognition of video images usually requires the collaborative work of a central processing unit (CPU) and a graphics processing unit (GPU).
  • the central processor cuts out the area containing the text from the video image
  • the graphics processor recognizes the text in the area containing the text.
  • the embodiments of the present application provide a text recognition method, device, storage medium, and electronic equipment, which can improve the resource utilization rate of the GPU.
  • an embodiment of the present application provides a text recognition method, including:
  • an embodiment of the present application provides a text recognition device, including:
  • the acquisition module is used to acquire multiple video frames
  • the decoding module is used to create multiple CPU processes, and use each CPU process to decode each video frame to obtain multiple first images;
  • a saving module configured to save the plurality of first images in a database
  • the determining module is used to create a first GPU process, use the first GPU process to sequentially obtain the first images from the database, and sequentially determine the position information of the text in each first image, to obtain the position information corresponding to each first image;
  • the cropping module is configured to perform cropping processing on each first image according to the position information corresponding to each first image to obtain multiple second images;
  • the recognition module is used to perform character recognition processing on each second image to obtain a character recognition result.
  • an embodiment of the present application provides a storage medium on which a computer program is stored, wherein when the computer program is executed on a computer, the computer is caused to execute the character recognition method provided in this embodiment.
  • an embodiment of the present application provides an electronic device, including a memory and a processor, the memory stores a computer program, and the processor invokes the computer program stored in the memory to execute:
  • FIG. 1 is a schematic diagram of the first flow of a character recognition method provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a second flow of a character recognition method provided by an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a character recognition device provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a first structure of an electronic device provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a second structure of an electronic device provided by an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of the first type of character recognition method provided by an embodiment of the present application.
  • the process of the text recognition method may include:
  • the electronic device can obtain a video and then decompose the video into multiple video frames, so that the electronic device can obtain multiple video frames.
  • each CPU process is used to decode each video frame to obtain multiple first images.
  • the electronic device may create multiple CPU processes, and use each CPU process to decode each video frame to obtain multiple first images.
  • the number of CPU processes may be less than or equal to the number of video frames.
  • for example, given 10 video frames and 5 CPU processes, the electronic device can use the 5 CPU processes to obtain the first 5 of the 10 video frames and decode them to obtain multiple first images; or, the electronic device can use the 5 CPU processes to obtain any 5 of the 10 video frames and decode them to obtain multiple first images.
  • a CPU process is created to decode the video frame to obtain multiple first images.
  • a plurality of first images are stored in a database.
  • the electronic device can use the CPU process to store the obtained first image in the database. That is, whenever a CPU process finishes decoding the video frame it acquired, the electronic device can use that CPU process to store the resulting first image in the database, so that multiple first images are stored in the database.
  • create a first GPU process, use the first GPU process to sequentially obtain the first images from the database, and sequentially determine the position information of the text in each first image to obtain the position information corresponding to each first image.
  • the electronic device may create a first GPU process and use it to sequentially obtain the first images from the database. Then, each time the first GPU process acquires a first image, the electronic device can use the first GPU process to perform position detection processing on that first image, thereby determining the position information of the text in the first image and obtaining the position information corresponding to that first image.
  • performing position detection processing on the image may be: detecting areas in the image where text exists to confirm which areas in the image have text.
  • the electronic device may use the first GPU process to sequentially obtain the first image from the database in a first-in first-out order. That is, the first image stored in the database first will be acquired by the first GPU process.
  • the electronic device may use the first GPU process to apply a pre-trained position detection model to each first image it acquires, so as to obtain the position information corresponding to that first image.
  • the electronic device can first use each of the multiple CPU processes to decode each of the multiple video frames and then preprocess the result to obtain multiple first images, and store the multiple first images in the database. Subsequently, the electronic device may use the first GPU process to sequentially obtain the first images from the database.
  • preprocessing the image can be: converting the format and size of the image into a format and size supported by the position detection model.
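  • as a minimal sketch of the size-conversion part of preprocessing, the following resizes an image to a target size with nearest-neighbour sampling. The list-of-rows image representation, the function name `resize_nearest`, and the choice of nearest-neighbour sampling are all assumptions made for illustration; the text does not specify how the conversion is performed.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values (assumed representation)


def resize_nearest(image: Image, new_width: int, new_height: int) -> Image:
    """Resize a non-empty image by picking, for each output pixel,
    the nearest source pixel (nearest-neighbour sampling)."""
    old_height = len(image)
    old_width = len(image[0])
    return [
        [
            image[y * old_height // new_height][x * old_width // new_width]
            for x in range(new_width)
        ]
        for y in range(new_height)
    ]


small = [[1, 2], [3, 4]]
# Scale the 2x2 image up to the 4x4 input size of a hypothetical detection model.
model_input = resize_nearest(small, 4, 4)
```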
  • a cropping process is performed on each first image to obtain a plurality of second images.
  • each time the position information corresponding to a first image is obtained, the electronic device may crop that first image according to the position information to obtain a second image. That is, the electronic device crops each first image using the position information obtained from that same first image. Here, cropping the image according to the position information may be: cropping out the area of the image where the text exists.
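  • the cropping step can be sketched as follows, assuming the position information is a rectangle given as (left, top, width, height) and the image is a list of pixel rows. Both representations and the name `crop_text_region` are hypothetical; the text only says that the area where the text exists is cut out.

```python
from typing import List, Tuple

Image = List[List[int]]          # rows of pixel values (assumed representation)
Box = Tuple[int, int, int, int]  # (left, top, width, height), an assumed layout


def crop_text_region(image: Image, box: Box) -> Image:
    """Return the part of the image covered by box, clamped to the image bounds."""
    left, top, width, height = box
    rows = len(image)
    cols = len(image[0]) if rows else 0
    x0, y0 = max(0, left), max(0, top)
    x1, y1 = min(cols, left + width), min(rows, top + height)
    return [row[x0:x1] for row in image[y0:y1]]


# A made-up 6x4 first image whose pixel value encodes (row, column).
first_image = [[r * 10 + c for c in range(6)] for r in range(4)]
second_image = crop_text_region(first_image, (1, 1, 3, 2))
```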
  • a character recognition process is performed on each second image to obtain a character recognition result.
  • the electronic device may perform character recognition processing on each second image to obtain a character recognition result.
  • the character recognition result includes the character recognition result corresponding to each second image.
  • the electronic device can perform character recognition processing on each second image to obtain the character recognition result of that second image, so as to finally obtain the character recognition results of the plurality of second images. Subsequently, the electronic device can save the text recognition result and use it for video classification, video push, etc.
  • multiple CPU processes are used to decode each video frame to obtain multiple first images; the multiple first images are stored in the database, so that the first GPU process can obtain first images from the database without interruption and obtain the position information corresponding to each first image in turn, so as to finally obtain the character recognition result.
  • the text recognition method provided in this embodiment can prevent the first GPU process from being in a long waiting process, thereby improving the resource utilization of the GPU.
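  • the producer/consumer arrangement described above can be sketched as follows. To keep the example self-contained, threads stand in for the CPU processes and the first GPU process, a `queue.Queue` stands in for the database, and `decode` and `detect_text_position` are placeholders; all of these are assumptions for illustration, not the method of the embodiment.

```python
import queue
import threading

NUM_DECODERS = 3  # assumed number of decoder workers
STOP = None       # sentinel telling the detector that no more images will arrive


def decode(frame):
    # Placeholder for real video-frame decoding.
    return f"image-of-{frame}"


def detect_text_position(image):
    # Placeholder for the position detection model.
    return (0, 0, len(image), 1)


def decoder_worker(frames, store):
    # Each decoder stores an image as soon as it finishes decoding a frame.
    for frame in frames:
        store.put(decode(frame))


def detector_worker(store, results):
    # The single detector takes images in first-in first-out order.
    while True:
        image = store.get()
        if image is STOP:
            break
        results.append((image, detect_text_position(image)))


def run_pipeline(frames):
    store = queue.Queue()  # FIFO buffer standing in for the database
    results = []
    detector = threading.Thread(target=detector_worker, args=(store, results))
    detector.start()
    decoders = [
        threading.Thread(target=decoder_worker, args=(frames[i::NUM_DECODERS], store))
        for i in range(NUM_DECODERS)
    ]
    for worker in decoders:
        worker.start()
    for worker in decoders:
        worker.join()
    store.put(STOP)  # all frames decoded: let the detector finish
    detector.join()
    return results


results = run_pipeline(list(range(10)))
```

  • because several decoders fill the buffer in parallel, the detector rarely finds it empty, which is the effect the embodiment relies on to keep the GPU busy.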
  • the multiple video frames are video frames corresponding to the video to be classified. After the process 106, it may further include:
  • the text recognition result obtained by the electronic device only extracts the text in the image, and does not perform word segmentation.
  • the text recognition result obtained by the electronic device can be: Let children explore beautiful and magical Chinese characters while listening to stories and playing games. Then, the electronic device can segment the text recognition result, and the resulting word segments can be: let, children, zai, listening, story, play, game, middle, exploration, beautiful, magical, Chinese, text.
  • the electronic device can determine the category of the video to be classified according to the multiple word segments. For example, when words such as songs, lyrics, and singing appear multiple times among the word segments obtained by the electronic device, the electronic device may determine the category of the video to be classified as the song category. For another example, if the electronic device determines by analysis that the word segments it obtained belong to the lyrics of a certain song, the electronic device can determine the category of the video to be classified as the song category; or, the electronic device can also determine the genre to which the lyrics belong, such as ancient style, pop, rock and roll, etc. Assuming the electronic device determines that the genre of the lyrics is ancient style, it may determine the category of the video to be classified as ancient style under the song category.
  • determining the category of the video to be classified according to multiple word segmentation may include:
  • determining the category of the video to be classified according to the target keyword.
  • the electronic device can determine the target keyword from the multiple word segments. Subsequently, the electronic device can determine the category of the video to be classified according to the target keyword. For example, the electronic device can identify the identical word segments among the multiple word segments, count the occurrences of each, and determine the word segment with the most occurrences as the target keyword. For example, suppose the electronic device obtains 10 word segments, in which the word song occurs 7 times, the word style occurs 2 times, and the word grace occurs once. Then, the electronic device can determine "song" as the target keyword, so that the electronic device can determine the category of the video to be classified as the song category.
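  • the counting step in this example can be sketched with `collections.Counter` from the Python standard library. The function name `most_frequent_segment` is hypothetical, and the word list simply reproduces the 7/2/1 example above.

```python
from collections import Counter


def most_frequent_segment(segments):
    """Return (segment, count) for the most frequent segment, or None if empty."""
    if not segments:
        return None
    counts = Counter(segments)
    return counts.most_common(1)[0]


# The 10 word segments from the example: "song" x7, "style" x2, "grace" x1.
segments = ["song"] * 7 + ["style"] * 2 + ["grace"]
target_keyword, occurrences = most_frequent_segment(segments)
```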
  • determining the category of the video to be classified according to the target keyword may include:
  • the category corresponding to the target keyword is determined as the category of the video to be classified.
  • the electronic device may preset a mapping relationship R1 between keywords and categories. For example, keyword K1 corresponds to category C1, keyword K2 corresponds to category C2, keyword K3 corresponds to category C3, and so on. Assuming that the target keyword is K1, its corresponding category is C1. Therefore, the category of the video to be classified is C1.
  • alternatively, the electronic device may preset a mapping relationship R2 between keywords and categories, in which several keywords correspond to one category.
  • keywords K1, K2, and K3 correspond to category C1
  • keywords K4, K5, and K6 correspond to category C2
  • keywords K7, K8, and K9 correspond to category C3, and so on.
  • assuming that the target keyword is K3, its corresponding category is C1, so the category of the video to be classified is C1.
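  • both mapping relationships can be sketched as plain dictionaries. The dictionary layout and the helper `category_for` are assumptions for illustration; the keyword and category names are the placeholders K1 to K9 and C1 to C3 used in the text.

```python
# R1: one keyword per category.
R1 = {"K1": "C1", "K2": "C2", "K3": "C3"}

# R2: several keywords per category, written per category and then
# inverted so that lookup is still keyword -> category.
R2_GROUPS = {
    "C1": ["K1", "K2", "K3"],
    "C2": ["K4", "K5", "K6"],
    "C3": ["K7", "K8", "K9"],
}
R2 = {keyword: category
      for category, keywords in R2_GROUPS.items()
      for keyword in keywords}


def category_for(target_keyword, mapping):
    """Return the category mapped to the keyword, or None if it is unmapped."""
    return mapping.get(target_keyword)
```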
  • determining the target keyword from multiple word segmentation may include:
  • determining as the target keyword each word segment whose number of occurrences is greater than a preset number.
  • the electronic device can identify the identical word segments among the multiple word segments. Subsequently, the electronic device can count the occurrences of each word segment and determine a word segment whose number of occurrences is greater than the preset number as the target keyword. For example, suppose the electronic device obtains 10 word segments, in which the word song occurs 7 times, the word style occurs 2 times, and the word beautiful occurs once, and the preset number is 5. Then, the electronic device can determine "song" as the target keyword.
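  • the threshold variant can be sketched in the same way; the function name `keywords_above` is hypothetical, and the data reproduces the example with a preset number of 5.

```python
from collections import Counter


def keywords_above(segments, preset_number):
    """Return the segments that occur more than preset_number times."""
    counts = Counter(segments)
    return [word for word, count in counts.items() if count > preset_number]


segments = ["song"] * 7 + ["style"] * 2 + ["beautiful"]
target_keywords = keywords_above(segments, 5)
```

  • unlike the most-frequent rule, this rule can yield several target keywords, or none, depending on the preset number.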
  • the text recognition method may further include:
  • the video to be classified is pushed to the user.
  • the electronic device can obtain the user portrait of the user.
  • the user portrait refers to the abstraction of each user's specific information into tags, and the use of these tags to concretize the user's image, thereby providing users with targeted services.
  • the user portrait of a user can describe which types of articles the user frequently browses, which types of videos the user frequently watches, which types of items the user frequently buys, etc.
  • after obtaining the user portrait of a user, the electronic device can determine which types of videos the user frequently watches. Then, the electronic device can determine whether the category of the video to be classified is one of the categories of videos frequently watched by the user. If so, the electronic device may push the video to be classified to the user for the user to watch.
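  • the push decision reduces to a membership test. In the sketch below the user portrait is modelled as a dictionary with a `frequently_watched_categories` entry; that structure and the function name `should_push` are assumptions, since the text does not define how a user portrait is stored.

```python
def should_push(video_category, user_portrait):
    """Push only if the video's category is one the user frequently watches."""
    watched = user_portrait.get("frequently_watched_categories", set())
    return video_category in watched


# A made-up user portrait for illustration.
portrait = {"frequently_watched_categories": {"song", "cooking"}}
push_song_video = should_push("song", portrait)
push_news_video = should_push("news", portrait)
```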
  • FIG. 2 is a schematic diagram of the second flow of the character recognition method provided by an embodiment of this application.
  • the text recognition method may include:
  • the electronic device acquires multiple video frames.
  • the electronic device can obtain a video and then decompose the video into multiple video frames, so that the electronic device can obtain multiple video frames.
  • the electronic device can enter the video recording mode, and use the camera to continuously shoot the shooting scene to continuously output multiple video frames to form a video stream. The electronic device can then obtain the continuously output multiple video frames.
  • the shooting scene refers to the scene that the user wants to shoot through the camera, that is, the scene that the camera is aimed at is the shooting scene. It should be noted that the shooting scene in the embodiment of the present application does not specifically refer to a specific scene, but a scene that is aligned in real time following the direction of the camera. Text can be included in the shooting scene.
  • the electronic device creates multiple CPU processes, and uses each CPU process to decode each video frame to obtain multiple first images.
  • the electronic device may create multiple CPU processes, and use each CPU process to decode each video frame to obtain multiple first images.
  • the electronic device may create multiple CPU processes, and use each CPU process to decode and preprocess each video frame to obtain multiple first images.
  • the preprocessing of the image may include: converting the format, size, etc. of the image into a corresponding format, size, etc., such as converting the format, size, etc. of the image into a format, size, etc. supported by the position detection model.
  • the number of CPU processes may be less than or equal to the number of video frames.
  • for example, given 10 video frames and 5 CPU processes, the electronic device can use the 5 CPU processes to obtain the first 5 of the 10 video frames and decode them to obtain multiple first images; or, the electronic device can use the 5 CPU processes to obtain any 5 of the 10 video frames and decode them to obtain multiple first images.
  • a CPU process is created to decode the video frame to obtain multiple first images. It can be understood that the number of CPU processes may be less than or equal to the number of video frames in the video stream.
  • when a process, such as a CPU process or a GPU process, is created, the necessary inter-process communication initialization work is also required, for example, allocating shared memory for each process and establishing queues and pipes for inter-process communication.
  • the shared memory can be used to implement data transfer between processes, for example, one process can obtain data from the shared memory of another process.
  • the electronic device uses each CPU process to store the obtained first image in its shared memory, and stores its identification information in the first queue.
  • the electronic device can use the CPU process to store the first image it obtains in its shared memory, and store its identification information in the first queue.
  • the identification information of the CPU process may be the process ID of the CPU process.
  • the first queue may be a first-in first-out queue, that is, the data that enters the queue first is processed first, and the data that enters the queue later is processed later.
  • when using the CPU process to store its identification information in the first queue, the electronic device can also use the CPU process to store the image size and image format of the first image it obtained in the first queue.
  • after the first image obtained by the CPU process is stored in its shared memory and its identification information is stored in the first queue, the electronic device can suspend the CPU process; when the first GPU process sends the position information it obtains to the CPU process, the electronic device makes the CPU process enter the ready state.
  • each CPU process has a shared memory
  • the database may include the shared memory of each CPU process.
  • shared memory, rather than a queue or pipe, is used to store image data such as the first image or the second image because image data is generally relatively large. If the image data were stored directly in the queue for transmission, it could cause unnecessary copying and other operations, which would seriously slow down the processes in the text recognition flow.
  • with shared memory, the communication speed between processes can be greatly improved, so that the communication time between processes is almost negligible compared with the time required for the entire character recognition process.
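  • the division of labour between shared memory and the queue can be sketched with Python's `multiprocessing.shared_memory` module: the pixels go into a shared block, and only a small record travels through the queue. For brevity both sides run in one process and a `queue.Queue` stands in for the first queue; in a real deployment `publish_image` and `fetch_image` would run in different processes with a `multiprocessing.Queue`. The record layout, the use of the block name alongside the process ID, and the function names are assumptions for illustration.

```python
import os
import queue
from multiprocessing import shared_memory


def publish_image(pixels: bytes, width: int, height: int, first_queue):
    """Producer side: copy pixels into shared memory, enqueue only metadata."""
    shm = shared_memory.SharedMemory(create=True, size=len(pixels))
    shm.buf[:len(pixels)] = pixels
    first_queue.put({
        "pid": os.getpid(),    # identification information of the producer
        "name": shm.name,      # assumed way to locate the producer's block
        "width": width,
        "height": height,
        "nbytes": len(pixels),
    })
    return shm  # the producer keeps the handle so it can release the block later


def fetch_image(first_queue) -> bytes:
    """Consumer side: read the metadata, attach to the block by name, copy out."""
    meta = first_queue.get()
    shm = shared_memory.SharedMemory(name=meta["name"])
    try:
        # The block may be larger than requested, so slice to the real size.
        return bytes(shm.buf[:meta["nbytes"]])
    finally:
        shm.close()


first_queue = queue.Queue()  # FIFO stand-in for the first queue
pixels = bytes(range(12))    # a made-up 4x3 single-channel image
owner = publish_image(pixels, 4, 3, first_queue)
received = fetch_image(first_queue)
owner.close()
owner.unlink()               # free the shared block once it is no longer needed
```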
  • the electronic device creates a first GPU process, and uses the first GPU process to sequentially obtain identification information from the first queue.
  • the electronic device uses the first GPU process to sequentially obtain the first image from the shared memory of the corresponding CPU process according to the identification information, and sequentially determines the position information of the text in each first image, to obtain the position information corresponding to each first image.
  • the electronic device may create a first GPU process, and use the first GPU process to sequentially obtain identification information from the first queue.
  • the identification information first stored in the first queue may be first acquired by the first GPU process.
  • the electronic device can use the first GPU process to obtain the first image from the shared memory of the CPU process corresponding to the identification information, and perform position detection processing on each obtained first image in turn, so as to determine the position information of the text in that first image and obtain the position information corresponding to it.
  • the electronic device may use the first GPU process to obtain the first image from the shared memory of the CPU process P1.
  • the electronic device using the first GPU process to sequentially perform position detection processing on the first images it acquires, so as to determine the position information of the text in each acquired first image and obtain the corresponding position information, may include: the electronic device uses the first GPU process to apply a pre-trained position detection model to each first image it acquires in turn, so as to determine the position information of the text in that first image and obtain the position information corresponding to it.
  • the position detection model may be a deep neural network model.
  • the electronic device uses the first GPU process to sequentially send the location information corresponding to each first image to the corresponding CPU process.
  • each time the electronic device obtains the position information corresponding to a first image by using the first GPU process, it can use the first GPU process to send that position information to the corresponding CPU process through the pipe.
  • for example, the electronic device can use the first GPU process to obtain the first image from the shared memory of the CPU process P1 and perform position detection processing on the first image to obtain the position information corresponding to the first image. Subsequently, the electronic device can use the first GPU process to send the position information to the CPU process P1.
  • the electronic device uses multiple CPU processes to perform decoding processing in parallel, stores each obtained first image in the shared memory of the CPU process that produced it, and stores that process's identification information in the first queue. Then, the first GPU process can continuously obtain identification information from the first queue and, according to the obtained identification information, obtain the first image from the shared memory of the corresponding CPU process for position detection processing, thereby reducing the waiting time of the first GPU process, which in turn improves the resource utilization of the GPU.
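  • the suspend-until-notified behaviour of a CPU process can be sketched with `multiprocessing.Pipe`: a blocking `recv()` keeps the caller waiting until the other end sends the position information. To keep the sketch self-contained a thread plays the role of the first GPU process; the function names and the example position tuple are assumptions for illustration.

```python
import threading
from multiprocessing import Pipe


def gpu_side(conn, position):
    # Stand-in for the first GPU process: send the detected position back.
    conn.send(position)
    conn.close()


def cpu_side_wait_for_position(conn):
    # recv() blocks, so the caller effectively stays suspended until data arrives.
    return conn.recv()


cpu_conn, gpu_conn = Pipe()
worker = threading.Thread(target=gpu_side, args=(gpu_conn, (10, 20, 100, 30)))
worker.start()
position = cpu_side_wait_for_position(cpu_conn)  # resumes once the position is sent
worker.join()
```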
  • the electronic device uses each CPU process to perform cropping processing on the corresponding first image according to the received position information, so as to obtain a plurality of second images in sequence.
  • the electronic device can use the CPU process to perform cropping processing on the corresponding first image according to the received location information, so as to obtain multiple second images.
  • for example, the electronic device can use the CPU process P1 to crop the first image that the CPU process P1 previously obtained, according to the received position information, so as to obtain the second image.
  • the cropping process on the image may be: cropping out the area where the text exists in the image. That is, the electronic device crops each first image using the position information obtained from that same first image. For example, assuming that the electronic device obtains the position information from the first image G1, the electronic device may crop the first image G1 according to that position information.
  • the electronic device may use the CPU process to perform post-processing on the corresponding first image according to the location information.
  • the location information may include the size and location of the area.
  • the electronic device can use the CPU process to filter the corresponding first image according to the size and location of the area, so as to determine the area where the text is located from the corresponding first image.
  • the electronic device can use the CPU process to cut out the area where the text is located.
  • the electronic device can use the CPU process to preprocess the area where the text is located, thereby obtaining the second image.
  • the preprocessing of the area where the text is located may convert the format, size, etc. of the resulting second image into a corresponding format, size, etc., such as a format, size, etc. supported by the text recognition model.
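  • the filtering part of this post-processing can be sketched as follows. The text does not state the filtering criteria, so the two rules used here (the region must lie inside the image and must reach a minimum area) as well as the names `filter_regions` and `min_area` are assumptions for illustration only.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height), an assumed layout


def filter_regions(boxes: List[Box], image_width: int, image_height: int,
                   min_area: int = 16) -> List[Box]:
    """Keep regions that lie fully inside the image and are not too small."""
    kept = []
    for left, top, width, height in boxes:
        inside = (left >= 0 and top >= 0
                  and left + width <= image_width
                  and top + height <= image_height)
        if inside and width * height >= min_area:
            kept.append((left, top, width, height))
    return kept


candidates = [(0, 0, 10, 4), (5, 5, 2, 2), (95, 0, 10, 10)]
text_regions = filter_regions(candidates, image_width=100, image_height=50)
```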
  • the electronic device uses each CPU process to store the obtained second image in its shared memory, and stores its identification information in the second queue.
  • the electronic device can use the CPU process to store the obtained second image in its shared memory, and store its identification information in the second queue.
  • the identification information of the CPU process may be the process ID of the CPU process.
  • the second queue may be a first-in first-out queue, that is, data that enters the queue first is processed first, and data that enters the queue later is processed later.
  • when using the CPU process to store its identification information in the second queue, the electronic device can also use the CPU process to store the image size, image format, etc. of the second image it obtained in the second queue.
  • the electronic device can then suspend the CPU process; when the second GPU process sends the text recognition result it obtains to the CPU process, the electronic device makes the CPU process enter the ready state.
  • the electronic device creates a second GPU process, and uses the second GPU process to sequentially obtain identification information from the second queue.
  • the electronic device uses the second GPU process to sequentially obtain the second image from the shared memory of the corresponding CPU process according to the identification information obtained from the second queue.
  • the electronic device uses the second GPU process to perform character recognition processing on the second image to obtain a character recognition result.
  • the electronic device may create a second GPU process, and use the second GPU process to sequentially obtain identification information from the second queue.
  • the identification information first stored in the second queue can be first acquired by the second GPU process.
  • the electronic device can use the second GPU process to obtain a second image from the shared memory of the CPU process corresponding to the identification information, and perform text recognition processing on each obtained second image in turn to obtain a text recognition result.
  • the electronic device may use the second GPU process to obtain the second image from the shared memory of the CPU process P1.
  • the electronic device using the second GPU process to sequentially perform text recognition processing on the acquired second images to obtain a text recognition result may include: the electronic device uses the second GPU process to apply a pre-trained text recognition model to each second image it acquires in turn, to obtain the text recognition result.
  • the character recognition model may be a deep neural network model.
  • when the electronic device uses the second GPU process to perform character recognition processing on the second image obtained by the CPU process P1 and obtains the character recognition result, the electronic device can also use the second GPU process to send the character recognition result to the CPU process P1 through the pipe. Subsequently, the electronic device can use the CPU process P1 to perform post-processing on the character recognition result, such as code conversion, so as to convert the character recognition result into a format required by the user.
  • the multiple second images are obtained based on multiple first images
  • the multiple first images are obtained based on multiple video frames.
  • the multiple video frames belong to the video to be classified. Therefore, the electronic device can classify the video to be classified according to the text recognition result.
  • the electronic device can perform word segmentation processing on the character recognition result to obtain multiple word segments. After obtaining the multiple word segments, the electronic device can determine the category of the video to be classified according to them. For example, when words such as "songs", "lyrics", and "singing" appear multiple times among the word segments obtained by the electronic device, the electronic device may determine the category of the video to be classified as the song category. For another example, if the electronic device determines by analysis that the word segments belong to the lyrics of a certain song, the electronic device may determine the category of the video to be classified as the song category.
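A minimal sketch of this keyword-based classification follows, assuming the word segments are already available as a list of strings. The mapping table, the threshold value, and the function name `classify_video` are illustrative assumptions; the source only states that repeated words and a preset keyword-to-category mapping are used.

```python
from collections import Counter

# Hypothetical keyword-to-category mapping; the source only says a preset mapping exists.
KEYWORD_TO_CATEGORY = {"lyrics": "song", "singing": "song", "goal": "sports"}


def classify_video(word_segments, preset_number=2):
    """Return the category of the most frequent mapped word that occurs
    more than preset_number times, or None if no word qualifies."""
    counts = Counter(word_segments)
    for word, count in counts.most_common():
        if count > preset_number and word in KEYWORD_TO_CATEGORY:
            return KEYWORD_TO_CATEGORY[word]
    return None


segments = ["lyrics", "singing", "lyrics", "the", "lyrics", "singing"]
category = classify_video(segments)
```

Here "lyrics" occurs three times, which exceeds the assumed preset number of two, so the video is mapped to the "song" category.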
  • the electronic device can use the CPU process to continue to obtain a video frame and decode it to obtain a first image. Then, through processes 203 to 211, the character recognition result corresponding to that first image is obtained. It is understandable that, when executing the above processes, the electronic device does not need to create a new CPU process or GPU process; it can continue to use the previously created CPU processes and GPU processes.
  • the numbers of CPU processes, first GPU processes, and second GPU processes can also be set so that the running times of the three parts are as close as possible, thereby maximizing resource utilization.
  • decoding takes a long time, so relatively more CPU processes can be set, and relatively fewer first GPU processes and second GPU processes, so that the GPU processes can continuously obtain computing tasks from the shared memory.
  • the first GPU process can continuously obtain first images from the shared memory for position detection processing, and the second GPU process can continuously obtain second images from the shared memory for character recognition processing, thereby maximizing the resource utilization of the GPU and improving the operating speed of the entire system.
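One simple way to reason about these process counts is to give each stage a number of workers proportional to its per-image time, so that all stages deliver roughly the same throughput. The sketch below assumes per-image timings in milliseconds; the numbers and the function name `balance_workers` are hypothetical, not measurements or names from the source.

```python
import math


def balance_workers(stage_ms):
    """Give each stage a worker count proportional to its per-image time,
    with the fastest stage receiving exactly one worker."""
    fastest = min(stage_ms.values())
    return {stage: math.ceil(ms / fastest) for stage, ms in stage_ms.items()}


# Hypothetical per-image timings in milliseconds.
timings = {"decode_cpu": 30, "detect_gpu": 10, "recognize_gpu": 10}
workers = balance_workers(timings)
```

With these assumed timings, three CPU decoding processes keep one detection process and one recognition process busy, which matches the idea of using more CPU processes than GPU processes.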
  • the electronic device can also use the idle time of the CPU process to perform decoding, so as to improve the resource utilization of the CPU and shorten the running time of the entire process. For example, after a CPU process has stored the first image it obtained in its shared memory and stored its identification information in the first queue, the electronic device can use the CPU process again to obtain a video frame and decode it to obtain another first image. Subsequently, the electronic device may store the newly obtained first image in the shared memory of the CPU process and store the identification information of the CPU process in the first queue, and so on until the first GPU process sends the obtained position information to the CPU process.
  • the electronic device can make the CPU process pause the decoding of video frames and start to perform cropping on the corresponding first image according to the received position information, or to perform post-processing on the received character recognition result. If, once the cropping or post-processing by the CPU process is completed, no further position information or character recognition result has been received, the electronic device can use the CPU process to continue decoding video frames.
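The scheduling rule for a single CPU process can be sketched as a small decision function. The relative priority between cropping and post-processing, the inbox structures, and the name `next_action` are assumptions made for illustration; the source only says that replies from the GPU processes interrupt decoding and that decoding resumes when no reply is pending.

```python
from collections import deque


def next_action(position_inbox, result_inbox, pending_frames):
    """Pick the next job for one CPU process: replies from the GPU
    processes are handled before any new frame is decoded."""
    if position_inbox:
        return ("crop", position_inbox.popleft())
    if result_inbox:
        return ("post_process", result_inbox.popleft())
    if pending_frames:
        return ("decode", pending_frames.popleft())
    return ("idle", None)


positions = deque([(10, 20, 110, 60)])  # hypothetical (left, top, right, bottom) box
results = deque()
frames = deque(["frame-7", "frame-8"])

schedule = [next_action(positions, results, frames) for _ in range(4)]
```

The resulting schedule crops first, then falls back to decoding the two waiting frames, and finally reports that nothing is left to do.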
  • when the CPU process P1 is used to decode a video frame to obtain the first image G1, the electronic device can use the CPU process P1 to associate the first image G1 with the identification information of the CPU process P1 and store it in the database. After the first GPU process obtains the first image G1 from the database and obtains the position information of the text in the first image G1, the electronic device can send the position information to the CPU process P1 according to the identification information associated with the first image G1.
  • the text recognition device 300 may include: an obtaining module 301, a decoding module 302, a saving module 303, a determining module 304, a cropping module 305, and a recognition module 306.
  • the obtaining module 301 is used to obtain multiple video frames;
  • the decoding module 302 is used to create multiple CPU processes, and use each CPU process to decode each video frame to obtain multiple first images;
  • the saving module 303 is configured to store the multiple first images in a database;
  • the determining module 304 is configured to create a first GPU process, use the first GPU process to sequentially obtain the first images from the database, and sequentially determine the position information of the text in each first image to obtain the position information corresponding to each first image;
  • the cropping module 305 is configured to perform cropping processing on each first image according to the position information corresponding to each first image to obtain multiple second images;
  • the recognition module 306 is configured to perform character recognition processing on each second image to obtain a character recognition result.
  • the database includes the shared memory of each CPU process, and the saving module 303 may be used to: use each CPU process to store the first image it obtains in its shared memory, and store its identification information in the first queue;
  • the determining module 304 may be configured to: use the first GPU process to sequentially obtain the identification information from the first queue; and use the first GPU process to sequentially obtain the first images from the shared memory of the corresponding CPU process according to the identification information.
  • the determining module 304 may be used to: use the first GPU process to send the location information corresponding to each first image to the corresponding CPU process in turn;
  • the cropping module 305 may be used for: using each CPU process to perform cropping processing on the corresponding first image according to the received position information, so as to obtain multiple second images in sequence.
  • the cropping module 305 can be used to store the second image obtained by each CPU process in its shared memory, and store its identification information in the second queue;
  • the recognition module 306 may be used to: create a second GPU process, and use the second GPU process to sequentially obtain identification information from the second queue; use the second GPU process to sequentially obtain the second images from the shared memory of the corresponding CPU process according to the identification information obtained from the second queue; and use the second GPU process to perform character recognition processing on the second images to obtain a character recognition result.
  • the multiple video frames are video frames corresponding to the video to be classified, and the recognition module 306 may be used to: perform word segmentation processing on the text recognition result to obtain multiple word segments; and determine the category of the video to be classified according to the multiple word segments.
  • the recognition module 306 may be used to: determine a target keyword from the multiple word segments; and determine the category of the video to be classified according to the target keyword.
  • the recognition module 306 may be used to: determine the category corresponding to the target keyword according to a preset mapping relationship between keywords and categories; and determine that category as the category of the video to be classified.
  • the recognition module 306 may be used to: determine identical word segments among the multiple word segments; determine the number of occurrences of each identical word segment; and determine the word segments whose number of occurrences is greater than a preset number as target keywords.
  • the recognition module 306 may be used to: obtain a user portrait of a user; determine whether to push the video to be classified to the user according to the user portrait and the category of the video to be classified; and if so, push the video to be classified to the user.
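The push decision can be illustrated with a very small sketch. The source does not say what a user portrait contains, so representing it as a dictionary with a set of preferred categories is purely an assumption, as is the function name `should_push`.

```python
def should_push(user_portrait, video_category):
    """Return True when the video's category is among the categories
    the (hypothetical) user portrait lists as preferred."""
    return video_category in user_portrait.get("preferred_categories", set())


# Hypothetical user portrait.
portrait = {"user_id": "u1", "preferred_categories": {"song", "sports"}}
push_song = should_push(portrait, "song")
push_news = should_push(portrait, "news")
```

Under these assumptions a video classified as "song" would be pushed to this user, while a "news" video would not.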
  • the embodiment of the present application provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed on a computer, the computer is caused to execute the process in the character recognition method provided in this embodiment.
  • An embodiment of the present application also provides an electronic device, including a memory, a processor, and a computer program stored in the memory. The processor is configured to execute the process in the text recognition method by calling the computer program stored in the memory.
  • the above-mentioned electronic device may be a mobile terminal such as a tablet computer or a smart phone.
  • FIG. 4 is a schematic diagram of a first structure of an electronic device provided by an embodiment of this application.
  • the electronic device 400 may include components such as a memory 401, a central processing unit 402, and a graphics processor 403. Those skilled in the art can understand that the structure shown in FIG. 4 does not constitute a limitation on the electronic device, which may include more or fewer components than shown in the figure, combine certain components, or use a different arrangement of components.
  • the memory 401 can be used to store application programs and data.
  • the application program stored in the memory 401 contains executable code.
  • Application programs can be composed of various functional modules.
  • the central processing unit 402 executes various functional applications and data processing by running application programs stored in the memory 401.
  • the central processing unit 402 is the control center of the electronic device. It uses various interfaces and lines to connect the various parts of the entire electronic device, and executes the various functions of the electronic device and processes data by running or executing the application programs stored in the memory 401 and calling the data stored in the memory 401, so as to monitor the electronic device as a whole.
  • the graphics processor 403 can be used to perform image and graphics related operations.
  • the central processing unit 402 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 401, and the central processing unit 402 runs the application programs stored in the memory 401, thereby implementing the process of the text recognition method.
  • FIG. 5 is a schematic diagram of a second structure of an electronic device provided by an embodiment of this application.
  • the electronic device 400 may include a memory 401, a central processing unit 402, a graphics processor 403, an input unit 404, an output unit 405, a display screen 406 and other components.
  • the memory 401 can be used to store application programs and data.
  • the application program stored in the memory 401 contains executable code.
  • Application programs can be composed of various functional modules.
  • the central processing unit 402 executes various functional applications and data processing by running application programs stored in the memory 401.
  • the central processing unit 402 is the control center of the electronic device. It uses various interfaces and lines to connect the various parts of the entire electronic device, and executes the various functions of the electronic device and processes data by running or executing the application programs stored in the memory 401 and calling the data stored in the memory 401, so as to monitor the electronic device as a whole; for example, it performs the decoding and the cropping of the first image.
  • the graphics processor 403 can be used to perform image and graphics-related operations, such as performing position detection processing on the first image, and performing character recognition processing on the second image.
  • the input unit 404 can be used to receive input numbers, character information, or user characteristic information (such as fingerprints), and generate keyboard, mouse, joystick, optical or trackball signal input related to user settings and function control.
  • the output unit 405 may be used to display information input by the user or information provided to the user and various graphical user interfaces of the electronic device. These graphical user interfaces may be composed of graphics, text, icons, videos, and any combination thereof.
  • the output unit may include a display panel.
  • the display screen 406 can be used to display information such as text and pictures.
  • the central processing unit 402 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 401, and the central processing unit 402 runs the application programs stored in the memory 401, thereby implementing the process of the text recognition method.
  • the database includes the shared memory of each CPU process.
  • when the central processing unit 402 executes the storing of the multiple first images in the database, it may execute: using each CPU process to store the first image it obtains in its shared memory, and to store its identification information in the first queue; when the central processing unit 402 executes the using of the first GPU process to sequentially obtain the first images from the database, it may execute: using the first GPU process to sequentially obtain the identification information from the first queue; and using the first GPU process to sequentially obtain the first images from the shared memory of the corresponding CPU process according to the identification information.
  • after the central processing unit 402 executes the sequential determining of the position information of the text in each first image to obtain the position information corresponding to each first image, it may also execute: using the first GPU process to send the position information corresponding to each first image to the corresponding CPU process in turn; when the central processing unit 402 executes the cropping of each first image according to the position information corresponding to each first image to obtain multiple second images, it may execute: using each CPU process to crop the corresponding first image according to the received position information, so as to obtain multiple second images in sequence.
  • after the central processing unit 402 executes the using of each CPU process to crop the corresponding first image according to the received position information to obtain multiple second images in sequence, it may also execute: using each CPU process to store the second image it obtains in its shared memory, and to store its identification information in the second queue; when the central processing unit 402 executes the character recognition processing on each second image to obtain the character recognition result, it may execute: creating a second GPU process, and using the second GPU process to sequentially obtain identification information from the second queue; using the second GPU process to sequentially obtain the second images from the shared memory of the corresponding CPU process according to the identification information obtained from the second queue; and using the second GPU process to perform character recognition processing on the second images to obtain a character recognition result.
  • the plurality of video frames are video frames corresponding to the video to be classified. After the central processing unit 402 executes the character recognition processing on each second image to obtain the character recognition result, it may also execute: performing word segmentation processing on the character recognition result to obtain a plurality of word segments; and determining the category of the video to be classified according to the plurality of word segments.
  • when the central processing unit 402 executes the determining of the category of the video to be classified according to the plurality of word segments, it may execute: determining a target keyword from the plurality of word segments; and determining the category of the video to be classified according to the target keyword.
  • when the central processing unit 402 executes the determining of the category of the video to be classified according to the target keyword, it may execute: determining the category corresponding to the target keyword according to a preset mapping relationship between keywords and categories; and determining that category as the category of the video to be classified.
  • when the central processing unit 402 executes the determining of the target keyword from the plurality of word segments, it may execute: determining identical word segments among the plurality of word segments; determining the number of occurrences of each identical word segment; and determining the word segments whose number of occurrences is greater than a preset number as target keywords.
  • the central processing unit 402 may also execute: obtaining a user portrait of a user; determining whether to push the video to be classified to the user according to the user portrait and the category of the video to be classified; and if so, pushing the video to be classified to the user.
  • the text recognition device provided in the embodiment of the application belongs to the same concept as the text recognition method in the above embodiment, and any method provided in the text recognition method embodiment can be run on the text recognition device. For details of the specific implementation process, refer to the embodiment of the text recognition method, which will not be repeated here.
  • the computer program may be stored in a computer readable storage medium, such as stored in a memory, and executed by at least one processor.
  • the execution process may include the process of the embodiment of the character recognition method.
  • the storage medium may be a magnetic disk, an optical disc, a read only memory (ROM, Read Only Memory), a random access memory (RAM, Random Access Memory), etc.
  • for the character recognition device of the embodiment of the present application, its functional modules may be integrated in one processing chip, or each module may exist alone physically, or two or more modules may be integrated in one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software functional modules. If an integrated module is implemented in the form of a software functional module and sold or used as an independent product, it can also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

A text identification method, a device, a storage medium, and an electronic apparatus. The method comprises: acquiring multiple video frames (101); creating multiple CPU threads, and performing, by using the respective CPU threads, a decoding process on the video frames to obtain multiple first images (102); storing the multiple first images in a database (103); creating a first GPU thread, using the first GPU thread to acquire the first images sequentially from the database, and sequentially determining position information of text from each of the first images so as to obtain position information corresponding to each of the first images (104); cropping, according to the position information corresponding to each of the first images, each of the first images to obtain multiple second images (105); and performing text identification processing on each of the second images to obtain a text identification result (106).

Description

Character recognition method, device, storage medium and electronic device
Technical field
This application belongs to the field of electronic technology, and in particular relates to a character recognition method, device, storage medium and electronic device.
Background
Video images in electronic devices such as smartphones often contain a large amount of information. In addition to the picture itself, a video image may also contain text, which usually displays important information about the content being played. Compared with the ever-changing image information, recognizing and analyzing the text usually makes it easier to determine which type of video the electronic device is playing.
In related technologies, character recognition on video images usually requires a central processing unit (CPU) and a graphics processing unit (GPU) to work together. For example, the central processing unit crops the regions containing text out of the video image, and the graphics processing unit recognizes the text in those regions.
Summary of the invention
The embodiments of the present application provide a character recognition method, device, storage medium, and electronic device, which can improve the resource utilization of the GPU.
In a first aspect, an embodiment of the present application provides a character recognition method, including:
acquiring multiple video frames;
creating multiple CPU processes, and using each CPU process to decode each video frame to obtain multiple first images;
storing the multiple first images in a database;
creating a first GPU process, using the first GPU process to sequentially obtain the first images from the database, and sequentially determining the position information of the text in each first image to obtain the position information corresponding to each first image;
cropping each first image according to the position information corresponding to each first image to obtain multiple second images;
performing character recognition processing on each second image to obtain a character recognition result.
In a second aspect, an embodiment of the present application provides a character recognition device, including:
an obtaining module, used to obtain multiple video frames;
a decoding module, used to create multiple CPU processes and use each CPU process to decode each video frame to obtain multiple first images;
a saving module, used to store the multiple first images in a database;
a determining module, used to create a first GPU process, use the first GPU process to sequentially obtain the first images from the database, and sequentially determine the position information of the text in each first image to obtain the position information corresponding to each first image;
a cropping module, used to crop each first image according to the position information corresponding to each first image to obtain multiple second images;
a recognition module, used to perform character recognition processing on each second image to obtain a character recognition result.
In a third aspect, an embodiment of the present application provides a storage medium on which a computer program is stored, wherein, when the computer program is executed on a computer, the computer is caused to execute the character recognition method provided in this embodiment.
In a fourth aspect, an embodiment of the present application provides an electronic device, including a memory and a processor, wherein the memory stores a computer program, and the processor, by calling the computer program stored in the memory, is used to execute:
acquiring multiple video frames;
creating multiple CPU processes, and using each CPU process to decode each video frame to obtain multiple first images;
storing the multiple first images in a database;
creating a first GPU process, using the first GPU process to sequentially obtain the first images from the database, and sequentially determining the position information of the text in each first image to obtain the position information corresponding to each first image;
cropping each first image according to the position information corresponding to each first image to obtain multiple second images;
performing character recognition processing on each second image to obtain a character recognition result.
Description of the drawings
The technical solutions of the present application and their beneficial effects will become apparent from the following detailed description of specific implementations of the present application in conjunction with the accompanying drawings.
FIG. 1 is a first schematic flowchart of the character recognition method provided by an embodiment of the present application.
FIG. 2 is a second schematic flowchart of the character recognition method provided by an embodiment of the present application.
FIG. 3 is a schematic structural diagram of the character recognition device provided by an embodiment of the present application.
FIG. 4 is a schematic diagram of a first structure of the electronic device provided by an embodiment of the present application.
FIG. 5 is a schematic diagram of a second structure of the electronic device provided by an embodiment of the present application.
Detailed description
Please refer to the drawings, in which the same reference symbols represent the same components. The principle of the present application is illustrated by its implementation in an appropriate computing environment. The following description is based on the illustrated specific embodiments of the present application, and should not be regarded as limiting other specific embodiments that are not described in detail herein.
Please refer to FIG. 1, which is a first schematic flowchart of the character recognition method provided by an embodiment of the present application. The process of the character recognition method may include:
In 101, multiple video frames are acquired.
For example, the electronic device can obtain a video and then decompose the video into multiple video frames, thereby obtaining multiple video frames.
In 102, multiple CPU processes are created, and each CPU process is used to decode each video frame to obtain multiple first images.
For example, after obtaining multiple video frames, the electronic device may create multiple CPU processes, and use each CPU process to decode each video frame to obtain multiple first images.
It can be understood that the number of CPU processes may be less than or equal to the number of video frames. When the number of CPU processes is less than the number of video frames, for example 5 CPU processes and 10 video frames, the electronic device can use the 5 CPU processes to obtain the first 5 of the 10 video frames for decoding to obtain multiple first images; alternatively, the electronic device can use the 5 CPU processes to obtain any 5 of the 10 video frames for decoding to obtain multiple first images.
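The 5-process, 10-frame case can be sketched as follows. The process labels, the frame names, and the function `assign_frames` are hypothetical, and the sketch takes the "first 5 frames" variant described above.

```python
def assign_frames(frames, num_processes):
    """Pair each CPU process with one frame; leftover frames wait
    until a process becomes free again."""
    batch = frames[:num_processes]
    remaining = frames[num_processes:]
    assignment = {f"P{i + 1}": frame for i, frame in enumerate(batch)}
    return assignment, remaining


frames = [f"frame-{n}" for n in range(1, 11)]
assignment, remaining = assign_frames(frames, 5)
```

The five remaining frames are decoded later, when the same processes are reused rather than new ones being created.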
In some embodiments, for a video whose frames the electronic device outputs in real time, such as a video stream, each time the electronic device outputs a video frame it creates a CPU process to decode that video frame, thereby obtaining multiple first images.
In 103, the multiple first images are stored in a database.
For example, when a CPU process finishes decoding the video frame it acquired, the electronic device can use that CPU process to store the resulting first image in the database. That is, whenever a CPU process finishes decoding the video frame it acquired, the electronic device can use that CPU process to store the resulting first image in the database, so that the multiple first images are stored in the database.
In 104, a first GPU process is created, the first GPU process is used to sequentially obtain the first images from the database, and the position information of the text in each first image is determined in turn to obtain the position information corresponding to each first image.
For example, when at least one first image exists in the database, the electronic device may create a first GPU process and use it to sequentially obtain the first images from the database. Then, each time the first GPU process obtains a first image, the electronic device can use the first GPU process to perform position detection processing on that first image, thereby determining the position information of the text in the first image and obtaining the position information corresponding to the first image. Performing position detection processing on an image may mean detecting the regions of the image where text exists, so as to confirm which regions of the image contain text.
It is understandable that, when using the first GPU process to sequentially obtain the first images from the database, the electronic device may have the first GPU process obtain them in first-in, first-out order. That is, a first image stored in the database earlier is obtained by the first GPU process earlier.
在一些实施例中,电子设备可利用第一GPU进程采用预先训练好的位置检测模型对其所获取到的第一图像进行位置检测处理,以得到其所获取到的第一图像对应的位置信息。In some embodiments, the electronic device may use the first GPU process to use a pre-trained position detection model to perform position detection processing on the first image it has acquired, so as to obtain the position information corresponding to the first image it has acquired. .
需要说明的是,电子设备可利用第一GPU进程采用预先训练好的位置检测模型对其所获取到的第一图像进行位置检测处理,以得到其所获取到的第一图像对应的位置信息之前,电子设备可先利用多个CPU进程中的每个CPU进程对多个视频帧中的每个视频帧进行解码处理,再进行预处理,从而得到多个第一图像,并将该多个第一图像存入数据库中。随后,电子设备可采用第一GPU进程从数据库中依次获取第一图像。其中,对图像进行预处理可以为:将图像的格式、大小等转换成位置检测模型所支持的格式、大小等。It should be noted that the electronic device can use the first GPU process to use a pre-trained position detection model to perform position detection processing on the first image it obtains, so as to obtain the position information corresponding to the first image it obtains. , The electronic device can first use each of the multiple CPU processes to decode each of the multiple video frames, and then perform pre-processing to obtain multiple first images, and combine the multiple first images An image is stored in the database. Subsequently, the electronic device may use the first GPU process to sequentially obtain the first image from the database. Among them, preprocessing the image can be: converting the format and size of the image into a format and size supported by the position detection model.
在105中,根据每个第一图像对应的位置信息,对每个第一图像进行裁切处理,得到多个第二图像。In 105, according to the position information corresponding to each first image, a cropping process is performed on each first image to obtain a plurality of second images.
在本申请实施例中,每得到一个第一图像对应的位置信息,电子设备可根据该位置信息,对相应的第一图像进行裁切处理,得到第二图像。可以理解,电子设备根据哪个第一图像得到的位置信息,便根据该位置信息对哪个第一图像进行裁切处理。其中,根据位置信息,对图像进行裁切处理可以为:裁切出图像中存在文字的区域。In the embodiment of the present application, each time position information corresponding to a first image is obtained, the electronic device may perform crop processing on the corresponding first image according to the position information to obtain a second image. It can be understood that the electronic device performs the cropping process on which first image according to the position information obtained from which first image. Among them, according to the position information, the cropping process on the image may be: cropping out the area where the text exists in the image.
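The cropping in 105 can be sketched as follows. This is a minimal illustration, assuming the position information is a list of axis-aligned boxes given as (left, top, width, height) and the image is a row-major list of pixel rows; the description does not fix either representation, and `crop_regions` is a hypothetical helper name.

```python
from typing import List, Tuple

# Assumed box layout: (left, top, width, height) in pixels.
Box = Tuple[int, int, int, int]
Image = List[List[int]]  # row-major list of pixel rows (assumption)


def crop_regions(image: Image, boxes: List[Box]) -> List[Image]:
    """Return one cropped sub-image (a "second image") per text box."""
    crops = []
    for left, top, width, height in boxes:
        rows = image[top:top + height]
        crops.append([row[left:left + width] for row in rows])
    return crops
```

Each returned sub-image corresponds to one text region of the first image the boxes were detected in.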
In 106, text recognition is performed on each second image to obtain a text recognition result.
For example, after the multiple second images are obtained, the electronic device may perform text recognition on each second image to obtain the text recognition result. The text recognition result includes the text recognition result corresponding to each second image.
It can be understood that the multiple second images are not necessarily obtained at the same time. Therefore, each time a second image is obtained, the electronic device may perform text recognition on it to obtain its text recognition result, and in this way finally obtain the text recognition results of all of the second images. The electronic device may then save the text recognition result and use it for video classification, video pushing, and the like.
In this embodiment, multiple CPU processes are used to decode the video frames to obtain multiple first images, and the multiple first images are stored in the database, so that the first GPU process can obtain first images from the database without interruption and obtain the position information corresponding to each first image in turn, from which the text recognition result is finally obtained. The text recognition method provided by this embodiment therefore keeps the first GPU process from waiting for long periods, which improves the resource utilization of the GPU.
In some embodiments, the multiple video frames are video frames of a video to be classified, and after flow 106 the method may further include:
performing word segmentation on the text recognition result to obtain multiple word segments;
determining the category of the video to be classified according to the multiple word segments.
It can be understood that the text recognition result obtained by the electronic device merely extracts the text from the images and has not been segmented into words. For example, the text recognition result may be: 让孩子们在听故事玩游戏中探索优美神奇的中国文字 ("Let children explore the beautiful and magical Chinese script while listening to stories and playing games"). The electronic device may then segment this result, and the resulting word segments may be: 让 (let), 孩子们 (children), 在 (while), 听 (listen), 故事 (story), 玩 (play), 游戏 (game), 中 (during), 探索 (explore), 优美 (beautiful), 神奇 (magical), 的 (particle), 中国 (China), 文字 (script).
For example, after obtaining the multiple word segments, the electronic device may determine the category of the video to be classified according to them. For instance, when words such as "song", "lyrics", and "sing" appear many times among the word segments, the electronic device may determine that the video to be classified belongs to the song category. As another example, if the electronic device determines from the word segments that they are the lyrics of a particular song, it may determine that the video to be classified belongs to the song category; alternatively, the electronic device may also determine the genre the lyrics belong to, such as ancient-style, pop, or rock. If the electronic device determines that the lyrics belong to the ancient-style genre, it may determine the category of the video to be classified as the ancient-style subcategory of the song category.
In some embodiments, determining the category of the video to be classified according to the multiple word segments may include:
determining a target keyword from the multiple word segments;
determining the category of the video to be classified according to the target keyword.
For example, after obtaining the multiple word segments, the electronic device may determine a target keyword from them and then determine the category of the video to be classified according to that target keyword. For instance, the electronic device may identify identical word segments among the multiple word segments, count the occurrences of each, and determine the word segment with the most occurrences as the target keyword. Suppose the electronic device obtains 10 word segments, in which "song" occurs 7 times, "style" occurs 2 times, and "beautiful" occurs once. The electronic device may then determine "song" as the target keyword and accordingly determine that the video to be classified belongs to the song category.
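The selection of the most frequent word segment can be sketched with a counter. This is only an illustration of the example above; `most_frequent_segment` is a hypothetical name, and the tie-breaking behavior (the segment encountered first wins) is an assumption the description does not address.

```python
from collections import Counter
from typing import List, Optional


def most_frequent_segment(segments: List[str]) -> Optional[str]:
    """Return the word segment that occurs most often, or None if empty."""
    if not segments:
        return None
    # most_common(1) yields [(segment, count)] for the highest count.
    word, _count = Counter(segments).most_common(1)[0]
    return word
```

With the counts from the example (7, 2, and 1), the function returns "song".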
In some embodiments, determining the category of the video to be classified according to the target keyword may include:
determining the category corresponding to the target keyword according to a preset mapping between keywords and categories;
determining the category corresponding to the target keyword as the category of the video to be classified.
For example, the electronic device may set in advance a preset mapping R1 between keywords and categories, in which keyword K1 corresponds to category C1, keyword K2 corresponds to category C2, keyword K3 corresponds to category C3, and so on. If the target keyword is K1, its corresponding category is C1, so the category of the video to be classified is C1.
As another example, the electronic device may set in advance a preset mapping R2 between keywords and categories, in which keywords K1, K2, and K3 correspond to category C1, keywords K4, K5, and K6 correspond to category C2, keywords K7, K8, and K9 correspond to category C3, and so on. If the target keyword is K3, its corresponding category is C1, so the category of the video to be classified is C1.
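A many-to-one mapping such as R2 can be sketched as a dictionary lookup. The table below reuses the placeholder names K1 to K9 and C1 to C3 from the example; the inverted index and the `category_for` helper are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Mapping R2 from the example: several keywords share one category.
CATEGORY_KEYWORDS: Dict[str, List[str]] = {
    "C1": ["K1", "K2", "K3"],
    "C2": ["K4", "K5", "K6"],
    "C3": ["K7", "K8", "K9"],
}

# Invert once so that each lookup is a single dictionary access.
KEYWORD_TO_CATEGORY: Dict[str, str] = {
    keyword: category
    for category, keywords in CATEGORY_KEYWORDS.items()
    for keyword in keywords
}


def category_for(keyword: str) -> Optional[str]:
    """Return the preset category for a keyword, or None if unmapped."""
    return KEYWORD_TO_CATEGORY.get(keyword)
```

Mapping R1 is the special case in which every category lists exactly one keyword.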
In some embodiments, determining the target keyword from the multiple word segments may include:
identifying identical word segments among the multiple word segments;
determining the number of occurrences of each identical word segment;
determining as the target keyword each word segment whose number of occurrences is greater than a preset number.
For example, after obtaining the multiple word segments, the electronic device may identify identical word segments among them, count the occurrences of each, and determine as the target keyword the word segment whose count is greater than the preset number. Suppose the electronic device obtains 10 word segments, in which "song" occurs 7 times, "style" occurs 2 times, and "beautiful" occurs once, and the preset number is 5. The electronic device may then determine "song" as the target keyword.
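The threshold variant above can be sketched in the same way. The function name `keywords_above_threshold` is hypothetical, and returning a list (since more than one segment may exceed the preset number) is an assumption.

```python
from collections import Counter
from typing import List


def keywords_above_threshold(segments: List[str], preset_count: int) -> List[str]:
    """Return every word segment that occurs more than preset_count times."""
    counts = Counter(segments)
    # Counter keeps first-seen order, so the result order is deterministic.
    return [word for word, count in counts.items() if count > preset_count]
```

With the counts from the example and a preset number of 5, only "song" qualifies.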
In some embodiments, the text recognition method may further include:
obtaining a user portrait of a user;
determining, according to the user portrait and the category of the video to be classified, whether to push the video to be classified to the user;
if so, pushing the video to be classified to the user.
For example, after determining the category of the video to be classified, the electronic device may obtain the user portrait of the user. A user portrait abstracts each piece of specific information about a user into tags and uses these tags to give the user a concrete profile, so that targeted services can be provided to the user. Put simply, the user portrait of a user may describe which categories of articles the user frequently browses, which categories of videos the user frequently watches, which categories of items the user frequently buys, and so on.
After obtaining the user portrait of a user, the electronic device may determine which categories of videos the user frequently watches. The electronic device may then determine whether the category of the video to be classified is one of those categories. If it is, the electronic device may push the video to be classified to the user for viewing.
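The push decision can be sketched as a membership test. The structure of the user portrait is not specified in the description; the dictionary layout and the key `frequent_video_categories` below are assumptions made for illustration only.

```python
from typing import Dict, Set


def should_push(user_portrait: Dict[str, Set[str]], video_category: str) -> bool:
    """Push only if the video's category is one the user frequently watches."""
    frequent = user_portrait.get("frequent_video_categories", set())
    return video_category in frequent
```

A portrait without any recorded viewing categories never triggers a push under this sketch.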
Refer to FIG. 2, which is a schematic diagram of a second flow of the text recognition method provided by an embodiment of this application. The text recognition method may include:
In 201, the electronic device acquires multiple video frames.
For example, the electronic device may obtain a video and then decompose it into multiple video frames, thereby obtaining the multiple video frames.
As another example, the electronic device may enter a video recording mode and use a camera to shoot the shooting scene continuously, so as to output multiple video frames in succession and form a video stream. The electronic device can then acquire the continuously output video frames.
Here, the shooting scene is the scene the user wants to capture with the camera, that is, the scene at which the camera is pointed. It should be noted that the shooting scene in the embodiments of this application does not refer to one particular scene, but to whatever scene the camera is pointed at in real time. The shooting scene may include text.
In 202, the electronic device creates multiple CPU processes and uses each CPU process to decode each video frame to obtain multiple first images.
For example, after obtaining the multiple video frames, the electronic device may create multiple CPU processes and use each CPU process to decode each video frame to obtain the multiple first images.
As another example, after obtaining the multiple video frames, the electronic device may create multiple CPU processes and use each CPU process to decode and preprocess each video frame to obtain the multiple first images. Here, preprocessing an image may mean converting the format, size, and other properties of the image into a required format and size, for example the format and size supported by the position detection model.
It can be understood that the number of CPU processes may be less than or equal to the number of video frames. When there are fewer CPU processes than video frames, for example 5 CPU processes and 10 video frames, the electronic device may use the 5 CPU processes to take the first 5 of the 10 video frames and decode them to obtain multiple first images; alternatively, the electronic device may use the 5 CPU processes to take any 5 of the 10 video frames and decode them to obtain multiple first images.
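The two selection options described above (the first frames, or any frames) can be sketched as follows. The function name, the `strategy` parameter, and the use of random sampling for the "any" case are assumptions; the description only says that any 5 of the 10 frames may be taken.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_frames(frames: Sequence[T], num_processes: int,
                  strategy: str = "first") -> List[T]:
    """Pick one frame per CPU process when processes are fewer than frames."""
    count = min(num_processes, len(frames))
    if strategy == "first":
        return list(frames[:count])
    if strategy == "any":
        # Random choice is one possible reading of "any"; order is preserved.
        indices = sorted(random.sample(range(len(frames)), count))
        return [frames[i] for i in indices]
    raise ValueError(f"unknown strategy: {strategy}")
```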
In some embodiments, for video whose frames the electronic device outputs in real time, such as a video stream, the electronic device creates a CPU process each time it outputs a video frame and uses that process to decode the frame, thereby obtaining multiple first images. It can be understood that the number of CPU processes may be less than or equal to the number of video frames in the video stream.
It should be noted that, when a process such as a CPU process or a GPU process is created and started, the necessary inter-process communication initialization must also be carried out, for example allocating shared memory for each process and setting up the queues, pipes, and the like used for inter-process communication. The shared memory can be used to pass data between processes; for example, one process can obtain data from the shared memory of another process.
In 203, the electronic device uses each CPU process to store the first image it obtained in its shared memory and to store its identification information in a first queue.
For example, each time a CPU process finishes decoding the video frame it acquired, the electronic device may use that CPU process to store the resulting first image in its shared memory and to store its identification information in the first queue. The identification information of a CPU process may be its process ID. The first queue may be a first-in, first-out queue, that is, data that enters the queue earlier is processed earlier and data that enters the queue later is processed later.
In some embodiments, when using a CPU process to store its identification information in the first queue, the electronic device may also use that CPU process to store the image size, image format, and other properties of the first image it obtained in the first queue.
In other embodiments, after using a CPU process to store its first image in its shared memory and its identification information in the first queue, the electronic device may suspend that CPU process until the first GPU process sends the position information it obtained to that CPU process, at which point the electronic device puts the CPU process into the ready state.
It can be understood that each CPU process has its own shared memory, and the database may include the shared memory of every CPU process.
It should be noted that the embodiments of this application use shared memory to hold image data, such as the first images or second images, rather than placing the image data in a queue or pipe, because image data is generally large. Transmitting image data directly through a queue could cause unnecessary copying and similar operations and seriously slow down the processes involved in text recognition as a whole. By placing the image data in shared memory and storing only the identification information and similar metadata in the queue, inter-process communication becomes much faster, so that the time spent on it is almost negligible compared with the time required for the entire text recognition process.
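The division of labor described above (pixels in shared memory, only small metadata in the queue) can be sketched with Python's `multiprocessing.shared_memory` module (Python 3.8 or later). For brevity the sketch runs both sides in one process and uses `queue.Queue` as a stand-in for the inter-process queue; the message fields and the helper names `publish_image` and `fetch_image` are assumptions, and the shared-memory block name is carried alongside the process ID so the reader can locate the block.

```python
import os
import queue
from multiprocessing import shared_memory


def publish_image(pixels: bytes, width: int, height: int, meta_queue):
    """CPU-process side: copy pixels into shared memory, enqueue metadata only."""
    shm = shared_memory.SharedMemory(create=True, size=len(pixels))
    shm.buf[:len(pixels)] = pixels
    meta_queue.put({
        "process_id": os.getpid(),   # identification information
        "shm_name": shm.name,        # where the pixels live (assumed field)
        "nbytes": len(pixels),       # block may be rounded up to a page
        "width": width,
        "height": height,
    })
    return shm  # the owner must later call close() and unlink()


def fetch_image(meta_queue) -> bytes:
    """GPU-process side: read metadata in FIFO order, then attach to the block."""
    meta = meta_queue.get()
    shm = shared_memory.SharedMemory(name=meta["shm_name"])
    try:
        return bytes(shm.buf[:meta["nbytes"]])
    finally:
        shm.close()
```

Only a few dozen bytes of metadata pass through the queue per image, regardless of the image size.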
In 204, the electronic device creates a first GPU process and uses the first GPU process to obtain identification information from the first queue in sequence.
In 205, the electronic device uses the first GPU process to obtain, according to the identification information, the first images from the shared memory of the corresponding CPU processes in sequence, and to determine from each first image in turn the position information of the text it contains, obtaining the position information corresponding to each first image.
For example, when at least one piece of identification information exists in the first queue, the electronic device may create the first GPU process and use it to obtain identification information from the first queue in sequence. Identification information stored in the first queue earlier is obtained by the first GPU process earlier.
Each time a piece of identification information is obtained, the electronic device may use the first GPU process to obtain the first image from the shared memory of the CPU process corresponding to that identification information, and to perform position detection on each obtained first image in turn, so as to determine the position information of the text in it and obtain the position information corresponding to that first image. For example, if the obtained identification information is the process ID of CPU process P1, the electronic device may use the first GPU process to obtain the first image from the shared memory of CPU process P1.
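The consumer loop of 204 and 205 can be sketched as follows. To keep the example self-contained, a dictionary keyed by process ID stands in for the per-process shared memory, `queue.Queue` stands in for the first queue, and `detect` is a placeholder for the position detection model; the `STOP` sentinel used to end the loop is also an assumption.

```python
import queue
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]
STOP = None  # sentinel that ends the loop (assumption, not in the description)


def run_detection_loop(
    id_queue,
    shared_images: Dict[int, bytes],
    detect: Callable[[bytes], List[Box]],
) -> List[Tuple[int, List[Box]]]:
    """Consume process IDs in FIFO order and detect text in each image."""
    results = []
    while True:
        process_id = id_queue.get()
        if process_id is STOP:
            break
        image = shared_images[process_id]  # stand-in for shared memory access
        results.append((process_id, detect(image)))
    return results
```

Because the queue is first-in, first-out, results come back in the order in which the CPU processes enqueued their IDs.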
In some embodiments, the electronic device using the first GPU process to perform position detection on the obtained first images in turn, so as to determine the position information of the text in each obtained first image and obtain the corresponding position information, may include: the electronic device using the first GPU process, with a pre-trained position detection model, to perform position detection on the obtained first images in turn, so as to determine the position information of the text in each obtained first image and obtain the corresponding position information. The position detection model may be a deep neural network model.
In 206, the electronic device uses the first GPU process to send the position information corresponding to each first image to the corresponding CPU process in sequence.
In the embodiments of this application, each time the electronic device uses the first GPU process to obtain the position information corresponding to a first image, it uses the first GPU process to send that position information to the corresponding CPU process through a pipe.
For example, if the identification information obtained by the first GPU process is the process ID of CPU process P1, the electronic device may use the first GPU process to obtain the first image from the shared memory of CPU process P1 and perform position detection on it to obtain the position information corresponding to that first image. The electronic device may then use the first GPU process to send that position information to CPU process P1.
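Returning position information to the process that produced the image can be sketched with `multiprocessing.Pipe`. The sketch assumes the GPU side keeps one connection per CPU process, indexed by process ID; that lookup table and the helper name `send_positions` are assumptions, and both pipe ends are used in a single process here only to keep the example short.

```python
from multiprocessing import Pipe
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]


def send_positions(pipes: Dict[int, object], process_id: int,
                   boxes: List[Box]) -> None:
    """GPU-process side: send position information to the matching CPU process."""
    pipes[process_id].send(boxes)


# One pipe per CPU process; 101 is a made-up process ID for illustration.
gpu_end, cpu_end = Pipe()
send_positions({101: gpu_end}, 101, [(4, 8, 120, 32)])
received = cpu_end.recv()  # CPU process P1 receives its position information
```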
It can be understood that, in the embodiments of this application, the electronic device uses multiple CPU processes to decode in parallel, with each CPU process storing the first image it obtains in its shared memory and its identification information in the first queue. The first GPU process can therefore obtain identification information from the first queue without interruption and, according to that identification information, obtain first images from the shared memory of the corresponding CPU processes for position detection. This reduces the waiting time of the first GPU process and thereby improves the resource utilization of the GPU.
In 207, the electronic device uses each CPU process to crop the corresponding first image according to the position information it received, so as to obtain multiple second images in sequence.
For example, each time a CPU process receives position information, the electronic device may use that CPU process to crop the corresponding first image according to the received position information, thereby obtaining the multiple second images.
For example, if CPU process P1 receives position information, the electronic device may use CPU process P1 to crop, according to the received position information, the first image that CPU process P1 obtained earlier, thereby obtaining a second image. Here, cropping an image according to position information may mean cropping out the regions of the image in which text is present. It can be understood that position information is applied to the same first image from which it was derived. For example, if the electronic device obtained the position information from first image G1, it crops first image G1 according to that position information.
In some embodiments, after the position information is obtained, the electronic device may use the CPU process to post-process the corresponding first image according to the position information. For example, the position information may include the size and position of each region. The electronic device may then use the CPU process to filter the regions of the corresponding first image according to their size and position, thereby determining the region of the first image where the text is located. The electronic device may then use the CPU process to crop out that region and to preprocess it, obtaining the second image. Preprocessing the region where the text is located converts the format and size of the resulting second image into a required format and size, for example the format and size supported by the text recognition model.
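The filtering of regions by size and position can be sketched as follows. The description does not state the filtering criteria; the rules below (the box must lie fully inside the image and meet a minimum width and height) are assumptions chosen for illustration, as are the function and parameter names.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height), assumed layout


def filter_boxes(boxes: List[Box], image_width: int, image_height: int,
                 min_width: int = 1, min_height: int = 1) -> List[Box]:
    """Keep boxes that lie inside the image and are large enough to hold text."""
    kept = []
    for left, top, width, height in boxes:
        inside = (left >= 0 and top >= 0
                  and left + width <= image_width
                  and top + height <= image_height)
        if inside and width >= min_width and height >= min_height:
            kept.append((left, top, width, height))
    return kept
```

The surviving boxes are the ones that would then be cropped and preprocessed into second images.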
在208中,电子设备利用每个CPU进程将其得到的第二图像存入其共享内存中,并将其标识信息存入第二队列中。In 208, the electronic device uses each CPU process to store the obtained second image in its shared memory, and stores its identification information in the second queue.
比如,每当CPU进程根据其所接收到的位置信息对相应的第一图像裁切处理完毕时,电子设备均可利用该CPU进程将其所得到的第二图像存入其共享内存中,并将其标识信息存入第二队列中。其中,CPU进程的标识信息可以为CPU进程的进程ID。第二队列可为先进先出队列,即先进入队列中的数据先被处理,后进入队列中的数据后被处理。For example, whenever the CPU process finishes cutting the corresponding first image according to the received position information, the electronic device can use the CPU process to store the obtained second image in its shared memory, and Store its identification information in the second queue. The identification information of the CPU process may be the process ID of the CPU process. The second queue may be a first-in first-out queue, that is, data that enters the queue first is processed first, and data that enters the queue later is processed.
在一些实施例中,当利用CPU进程将其标识信息存入第二队列中时,电子设备还可利用该CPU进程将其得到的第二图像的图像大小、图像格式等存入该第二队列中。In some embodiments, when the CPU process is used to store its identification information in the second queue, the electronic device can also use the CPU process to store the image size, image format, etc. of the second image it obtains in the second queue. in.
在另一些实施例中,当利用CPU进程将其得到的第二图像存入其共享内存中,并将其标识信息存入第二队列中之后,电子设备可将该CPU进程挂起,直到第二GPU进程将其得到的文字识别结果发送给该CPU进程时,电子设备再使得该CPU进程进入就绪状态。In other embodiments, after the CPU process is used to store the second image obtained by it in its shared memory and its identification information is stored in the second queue, the electronic device can suspend the CPU process until the first Second, when the GPU process sends the text recognition result it obtains to the CPU process, the electronic device makes the CPU process enter a ready state.
In 209, the electronic device creates a second GPU process and uses the second GPU process to obtain identification information from the second queue in sequence.
In 210, the electronic device uses the second GPU process to obtain second images in sequence from the shared memory of the corresponding CPU processes, according to the identification information obtained from the second queue.
In 211, the electronic device uses the second GPU process to perform text recognition on the second image to obtain a text recognition result.
For example, when the second queue contains at least one piece of identification information, the electronic device may create the second GPU process and use it to obtain identification information from the second queue in sequence. Identification information that was stored in the second queue earlier is obtained by the second GPU process earlier.
Each time a piece of identification information is obtained, the electronic device may use the second GPU process to obtain the second image from the shared memory of the CPU process corresponding to that identification information, and perform text recognition on the obtained second images in sequence to obtain text recognition results. For example, if the obtained identification information is the process ID of CPU process P1, the electronic device may use the second GPU process to obtain the second image from the shared memory of CPU process P1.
In some embodiments, using the second GPU process to perform text recognition on the obtained second images in sequence may include: the electronic device uses the second GPU process to apply a pre-trained text recognition model to the obtained second images in sequence to obtain the text recognition result. The text recognition model may be a deep neural network model.
In other embodiments, after the electronic device has used the second GPU process to perform text recognition on the second image obtained by CPU process P1 and obtained a text recognition result, the electronic device may also use the second GPU process to send that result to CPU process P1 through a pipe. The electronic device may then use CPU process P1 to post-process the text recognition result, for example by converting its encoding so that the result is in the format required by the user.
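The flow of 209 to 211, together with the return of results through a pipe, can be illustrated with a minimal single-process sketch. It is not the implementation of this application: the shared memory, the second queue, and the pipes are replaced by ordinary Python containers, and `recognize_text`, `run_second_gpu_process`, and the sample data are hypothetical names introduced only for illustration.

```python
from collections import deque


def recognize_text(second_image):
    """Hypothetical stand-in for the pre-trained text recognition model."""
    return "".join(second_image)  # here an "image" is just a list of characters


def run_second_gpu_process(second_queue, shared_memory, pipes):
    """Drain the second queue in FIFO order and return each result to its CPU process."""
    while second_queue:
        process_id = second_queue.popleft()        # earliest stored ID is taken first
        second_image = shared_memory[process_id]   # shared memory of that CPU process
        result = recognize_text(second_image)
        pipes[process_id].append(result)           # stand-in for sending through a pipe


# Two CPU processes, P1 and P2, have each stored a second image; P2 enqueued first.
shared_memory = {"P1": list("song"), "P2": list("lyric")}
second_queue = deque(["P2", "P1"])
pipes = {"P1": [], "P2": []}
run_second_gpu_process(second_queue, shared_memory, pipes)
```

Because the queue carries only process IDs, the image data itself stays in the shared memory of the CPU process that produced it, and the same ID tells the GPU side where to send the result.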
It can be understood that the multiple second images are obtained from the multiple first images, and the multiple first images are obtained from the multiple video frames, which belong to the video to be classified. The electronic device can therefore classify the video to be classified according to the text recognition result.
For example, the electronic device may perform word segmentation on the text recognition result to obtain multiple word segments, and then determine the category of the video to be classified according to those word segments. For example, when words such as "song", "lyrics", and "sing" appear many times among the word segments, the electronic device may determine that the video to be classified belongs to the song category. As another example, if the electronic device determines from the word segments that they belong to the lyrics of a particular song, it may likewise determine that the video to be classified belongs to the song category.
In some embodiments, after a CPU process has finished post-processing a text recognition result, if the multiple video frames of step 201 have not all been processed, the electronic device may use that CPU process to obtain another video frame and decode it to obtain a first image. Steps 203 to 211 are then carried out to obtain the text recognition result corresponding to that first image. It can be understood that, when carrying out these steps, the electronic device does not need to create new CPU processes or GPU processes; it continues to use those created earlier.
In other embodiments, after a CPU process has finished cropping the corresponding first image, if the multiple video frames of step 201 have not all been processed, the electronic device may use that CPU process to obtain another video frame and decode it to obtain a first image. Steps 203 to 211 are then carried out to obtain the text recognition result corresponding to that first image. It can be understood that, when carrying out these steps, the electronic device does not need to create new CPU processes or GPU processes; it continues to use those created earlier.
In some embodiments, the numbers of CPU processes, first GPU processes, and second GPU processes may be chosen so that the running times of the three parts are as close as possible, thereby maximizing resource utilization. For example, in this text recognition method decoding takes a long time, so the number of CPU processes can be set somewhat higher and the numbers of first and second GPU processes somewhat lower. The GPU processes can then fetch computing tasks from shared memory without interruption: the first GPU process can continuously obtain first images from shared memory for position detection, and the second GPU process can continuously obtain second images from shared memory for text recognition. This maximizes GPU resource utilization and increases the running speed of the entire system.
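One simple way to pick such process counts, offered here only as an illustrative heuristic and not as a rule stated in this application, is to make each stage's count proportional to its per-image processing time. The function name `balance_process_counts` and the timings below are hypothetical.

```python
def balance_process_counts(stage_ms):
    """Give each stage a process count proportional to its per-image time."""
    fastest = min(stage_ms.values())
    # Ceiling division so that a slower stage never gets fewer processes than it needs.
    return {stage: -(-ms // fastest) for stage, ms in stage_ms.items()}


# Hypothetical per-image timings in milliseconds; decoding dominates the CPU stage.
timings = {"cpu": 40, "first_gpu": 10, "second_gpu": 10}
counts = balance_process_counts(timings)
```

With these assumed timings the sketch assigns four CPU processes for each first and second GPU process, so each stage can complete roughly one image per 10 ms and no stage waits on another.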
In other embodiments, because decoding takes a long time, the electronic device may also use the idle time of a CPU process for decoding, to improve CPU resource utilization and shorten the running time of the whole procedure. For example, after a CPU process has stored the first image it obtained in its shared memory and stored its identification information in the first queue, the electronic device may use that CPU process again to obtain a video frame and decode it, obtaining another first image. The electronic device may then store the newly obtained first image in the shared memory of that CPU process and store the identification information of that CPU process in the first queue. This continues until the first GPU process sends the position information it obtained to the CPU process, or the second GPU process sends the text recognition result it obtained to the CPU process. If the CPU process is decoding a video frame at that moment, the electronic device may have it pause decoding and start cropping the corresponding first image according to the received position information, or post-processing the received text recognition result. If, once the cropping or post-processing is finished, no position information or text recognition result has been received, the electronic device may use the CPU process to continue decoding video frames.
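The priority rule described above (received position information or recognition results are handled first, and decoding fills the idle time) can be sketched as a small scheduling step. This is a simplification under stated assumptions: the check happens between work items rather than by pausing a decode that is already in progress, and `cpu_process_step`, the message kinds, and the frame names are hypothetical.

```python
from collections import deque


def cpu_process_step(inbox, pending_frames, log):
    """Run one step: handle a received message first, otherwise decode a frame."""
    if inbox:
        kind, payload = inbox.popleft()
        if kind == "position":            # sent by the first GPU process
            log.append(f"crop {payload}")
        elif kind == "text":              # sent by the second GPU process
            log.append(f"postprocess {payload}")
    elif pending_frames:
        log.append(f"decode {pending_frames.popleft()}")  # use idle time for decoding
    else:
        return False                      # nothing left to do
    return True


inbox = deque([("position", "G1")])
frames = deque(["f2", "f3"])
log = []
cpu_process_step(inbox, frames, log)      # position info waiting: crop first
cpu_process_step(inbox, frames, log)      # idle: decode the next frame
inbox.append(("text", "T1"))              # a recognition result arrives
cpu_process_step(inbox, frames, log)      # result takes priority over decoding
cpu_process_step(inbox, frames, log)      # idle again: resume decoding
```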
In some embodiments, after CPU process P1 has decoded a video frame to obtain first image G1, the electronic device may use CPU process P1 to store the first image G1 in the database in association with the identification information of CPU process P1. After the first GPU process has obtained the first image G1 from the database and determined the position information of the text in it, the electronic device may send that position information to CPU process P1 according to the identification information associated with the first image G1.
Please refer to FIG. 3, which is a schematic structural diagram of the text recognition apparatus provided by an embodiment of this application. The text recognition apparatus 300 may include an acquisition module 301, a decoding module 302, a saving module 303, a determining module 304, a cropping module 305, and a recognition module 306.
The acquisition module 301 is configured to obtain multiple video frames;
the decoding module 302 is configured to create multiple CPU processes and use each CPU process to decode each video frame to obtain multiple first images;
the saving module 303 is configured to store the multiple first images in a database;
the determining module 304 is configured to create a first GPU process, use the first GPU process to obtain the first images from the database in sequence, and determine, in sequence, the position information of the text in each first image, to obtain the position information corresponding to each first image;
the cropping module 305 is configured to crop each first image according to the position information corresponding to that first image to obtain multiple second images;
the recognition module 306 is configured to perform text recognition on each second image to obtain a text recognition result.
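The division of work among modules 301 to 306 can be illustrated with a toy, single-process sketch. Everything in it is an assumption made for illustration: a "frame" is a newline-joined grid of characters, "." marks background, and the function names `decode`, `detect_text_box`, `crop`, `recognize`, and `run_pipeline` are hypothetical stand-ins for the real decoder, detection model, and recognition model. The input is assumed to contain at least one text cell.

```python
def decode(frame):
    """Stand-in for video decoding: split a frame into rows (the first image)."""
    return frame.split("\n")


def detect_text_box(image):
    """Return (top, bottom, left, right) of the non-'.' cells, end-exclusive."""
    rows = [r for r, line in enumerate(image) if line.strip(".")]
    cols = [c for line in image for c, ch in enumerate(line) if ch != "."]
    return min(rows), max(rows) + 1, min(cols), max(cols) + 1


def crop(image, box):
    """Cut the first image down to the detected box (the second image)."""
    top, bottom, left, right = box
    return [line[left:right] for line in image[top:bottom]]


def recognize(second_image):
    """Stand-in for text recognition on the cropped image."""
    return " ".join(second_image)


def run_pipeline(frames):
    """Decode, locate text, crop, and recognize each frame in order."""
    results = []
    for frame in frames:
        first_image = decode(frame)
        box = detect_text_box(first_image)
        results.append(recognize(crop(first_image, box)))
    return results


results = run_pipeline(["......\n..HI..\n..YO..\n......"])
```

Cropping before recognition means the recognition stage only sees the region that contains text, which is the purpose of the second image in the modules above.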
In some embodiments, the database includes the shared memory of each CPU process, and the saving module 303 may be configured to: use each CPU process to store the first image it obtained in its shared memory and store its identification information in a first queue;
the determining module 304 may be configured to: use the first GPU process to obtain the identification information from the first queue in sequence; and use the first GPU process to obtain the first images in sequence from the shared memory of the corresponding CPU processes according to the identification information.
In some embodiments, the determining module 304 may be configured to: use the first GPU process to send the position information corresponding to each first image to the corresponding CPU process in sequence;
the cropping module 305 may be configured to: use each CPU process to crop the corresponding first image according to the received position information, so as to obtain multiple second images in sequence.
In some embodiments, the cropping module 305 may be configured to: use each CPU process to store the second image it obtained in its shared memory and store its identification information in a second queue;
the recognition module 306 may be configured to: create a second GPU process and use the second GPU process to obtain identification information from the second queue in sequence; use the second GPU process to obtain second images in sequence from the shared memory of the corresponding CPU processes according to the identification information obtained from the second queue; and use the second GPU process to perform text recognition on the second images to obtain a text recognition result.
In some embodiments, the multiple video frames are video frames of a video to be classified, and the recognition module 306 may be configured to: perform word segmentation on the text recognition result to obtain multiple word segments; and determine the category of the video to be classified according to the multiple word segments.
In some embodiments, the recognition module 306 may be configured to: determine a target keyword from the multiple word segments; and determine the category of the video to be classified according to the target keyword.
In some embodiments, the recognition module 306 may be configured to: determine the category corresponding to the target keyword according to a preset mapping between keywords and categories; and determine that category as the category of the video to be classified.
In some embodiments, the recognition module 306 may be configured to: determine identical word segments among the multiple word segments; determine the number of occurrences of each identical word segment; and determine as target keywords the identical word segments whose number of occurrences is greater than a preset number.
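The keyword selection and category lookup described for the recognition module 306 can be sketched as follows. The mapping `CATEGORY_BY_KEYWORD`, the function names, the sample word segments, and the threshold value are all hypothetical; the word segmentation step itself is assumed to have already produced the list of segments.

```python
from collections import Counter

# Hypothetical preset mapping between keywords and categories.
CATEGORY_BY_KEYWORD = {"song": "music", "lyrics": "music", "goal": "sports"}


def target_keywords(segments, preset_count):
    """Return the word segments that occur more than preset_count times."""
    counts = Counter(segments)
    return [word for word, n in counts.items() if n > preset_count]


def classify(segments, preset_count):
    """Map the first target keyword found in the preset mapping to its category."""
    for word in target_keywords(segments, preset_count):
        if word in CATEGORY_BY_KEYWORD:
            return CATEGORY_BY_KEYWORD[word]
    return None  # no target keyword has a known category


segments = ["song", "the", "song", "lyrics", "song", "the"]
keywords = target_keywords(segments, preset_count=2)
category = classify(segments, preset_count=2)
```

Here "song" occurs three times and is the only segment above the assumed threshold of two, so it becomes the target keyword and the video is assigned the category that the mapping gives for it.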
In some embodiments, the recognition module 306 may be configured to: obtain a user profile of a user; determine, according to the user profile and the category of the video to be classified, whether to push the video to be classified to the user; and if so, push the video to be classified to the user.
An embodiment of this application provides a computer-readable storage medium storing a computer program which, when executed on a computer, causes the computer to carry out the flow of the text recognition method provided in this embodiment.
An embodiment of this application also provides an electronic device including a memory and a processor. The memory stores a computer program, and the processor carries out the flow of the text recognition method provided in this embodiment by calling the computer program stored in the memory.
For example, the electronic device may be a mobile terminal such as a tablet computer or a smartphone. Please refer to FIG. 4, which is a first schematic structural diagram of the electronic device provided by an embodiment of this application.
The electronic device 400 may include components such as a memory 401, a central processing unit 402, and a graphics processor 403. Those skilled in the art will understand that the structure shown in FIG. 4 does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The memory 401 may be used to store application programs and data. The application programs stored in the memory 401 contain executable code and may form various functional modules. The central processing unit 402 executes various functional applications and data processing by running the application programs stored in the memory 401.
The central processing unit 402 is the control center of the electronic device. It connects the various parts of the electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing the application programs stored in the memory 401 and calling the data stored in the memory 401, thereby monitoring the electronic device as a whole.
The graphics processor 403 may be used for image-related and graphics-related computation.
In this embodiment, the central processing unit 402 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 401 according to the following instructions, and the central processing unit 402 runs the application programs stored in the memory 401, thereby implementing the following flow:
obtaining multiple video frames;
creating multiple CPU processes, and using each CPU process to decode each video frame to obtain multiple first images;
storing the multiple first images in a database;
creating a first GPU process, using the first GPU process to obtain the first images from the database in sequence, and determining, in sequence, the position information of the text in each first image, to obtain the position information corresponding to each first image;
cropping each first image according to the position information corresponding to that first image to obtain multiple second images;
performing text recognition on each second image to obtain a text recognition result.
Please refer to FIG. 5, which is a second schematic structural diagram of the electronic device provided by an embodiment of this application.
The electronic device 400 may include components such as a memory 401, a central processing unit 402, a graphics processor 403, an input unit 404, an output unit 405, and a display screen 406.
The memory 401 may be used to store application programs and data. The application programs stored in the memory 401 contain executable code and may form various functional modules. The central processing unit 402 executes various functional applications and data processing by running the application programs stored in the memory 401.
The central processing unit 402 is the control center of the electronic device. It connects the various parts of the electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing the application programs stored in the memory 401 and calling the data stored in the memory 401, thereby monitoring the electronic device as a whole; examples include the decoding and cropping operations for the first image.
The graphics processor 403 may be used for image-related and graphics-related computation, such as position detection on the first image and text recognition on the second image.
The input unit 404 may be used to receive input digits, character information, or user characteristic information (such as a fingerprint), and to generate keyboard, mouse, joystick, optical, or trackball signal input related to user settings and function control.
The output unit 405 may be used to display information entered by the user or provided to the user, as well as the various graphical user interfaces of the electronic device, which may be composed of graphics, text, icons, video, and any combination thereof. The output unit may include a display panel.
The display screen 406 may be used to display information such as text and pictures.
In this embodiment, the central processing unit 402 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 401 according to the following instructions, and the central processing unit 402 runs the application programs stored in the memory 401, thereby implementing the following flow:
obtaining multiple video frames;
creating multiple CPU processes, and using each CPU process to decode each video frame to obtain multiple first images;
storing the multiple first images in a database;
creating a first GPU process, using the first GPU process to obtain the first images from the database in sequence, and determining, in sequence, the position information of the text in each first image, to obtain the position information corresponding to each first image;
cropping each first image according to the position information corresponding to that first image to obtain multiple second images;
performing text recognition on each second image to obtain a text recognition result.
In some implementations, the database includes the shared memory of each CPU process. When storing the multiple first images in the database, the central processing unit 402 may: use each CPU process to store the first image it obtained in its shared memory and store its identification information in a first queue. When using the first GPU process to obtain the first images from the database in sequence, the central processing unit 402 may: use the first GPU process to obtain the identification information from the first queue in sequence; and use the first GPU process to obtain the first images in sequence from the shared memory of the corresponding CPU processes according to the identification information.
In some implementations, after determining, in sequence, the position information of the text in each first image to obtain the position information corresponding to each first image, the central processing unit 402 may also: use the first GPU process to send the position information corresponding to each first image to the corresponding CPU process in sequence. When cropping each first image according to its corresponding position information to obtain multiple second images, the central processing unit 402 may: use each CPU process to crop the corresponding first image according to the received position information, so as to obtain multiple second images in sequence.
In some implementations, after using each CPU process to crop the corresponding first image according to the received position information to obtain multiple second images in sequence, the central processing unit 402 may also: use each CPU process to store the second image it obtained in its shared memory and store its identification information in a second queue. When performing text recognition on each second image to obtain a text recognition result, the central processing unit 402 may: create a second GPU process and use the second GPU process to obtain identification information from the second queue in sequence; use the second GPU process to obtain second images in sequence from the shared memory of the corresponding CPU processes according to the identification information obtained from the second queue; and use the second GPU process to perform text recognition on the second images to obtain the text recognition result.
In some implementations, the multiple video frames are video frames of a video to be classified. After performing text recognition on each second image to obtain a text recognition result, the central processing unit 402 may also: perform word segmentation on the text recognition result to obtain multiple word segments; and determine the category of the video to be classified according to the multiple word segments.
In some implementations, when determining the category of the video to be classified according to the multiple word segments, the central processing unit 402 may: determine a target keyword from the multiple word segments; and determine the category of the video to be classified according to the target keyword.
In some implementations, when determining the category of the video to be classified according to the target keyword, the central processing unit 402 may: determine the category corresponding to the target keyword according to a preset mapping between keywords and categories; and determine that category as the category of the video to be classified.
In some implementations, when determining a target keyword from the multiple word segments, the central processing unit 402 may: determine identical word segments among the multiple word segments; determine the number of occurrences of each identical word segment; and determine as target keywords the identical word segments whose number of occurrences is greater than a preset number.
In some implementations, the central processing unit 402 may also: obtain a user profile of a user; determine, according to the user profile and the category of the video to be classified, whether to push the video to be classified to the user; and if so, push the video to be classified to the user.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in a given embodiment, refer to the detailed description of the text recognition method above, which is not repeated here.
The text recognition apparatus provided in the embodiments of this application is based on the same concept as the text recognition method in the embodiments above. Any method provided in the text recognition method embodiments can run on the text recognition apparatus; for the specific implementation, see the text recognition method embodiments, which are not repeated here.
It should be noted that, for the text recognition method described in the embodiments of this application, a person of ordinary skill in the art will understand that all or part of the flow of the method can be carried out by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, for example in a memory, and executed by at least one processor; its execution may include the flow of the embodiments of the text recognition method. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
For the text recognition apparatus of the embodiments of this application, the functional modules may be integrated in one processing chip, each module may exist physically on its own, or two or more modules may be integrated in one module. The integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.
The text recognition method, apparatus, storage medium, and electronic device provided by the embodiments of this application have been described in detail above. Specific examples are used herein to explain the principles and implementations of this application, and the description of the above embodiments is intended only to help in understanding the method of this application and its core idea. A person skilled in the art may, following the idea of this application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting this application.

Claims (20)

  1. A text recognition method, comprising:
    obtaining a plurality of video frames;
    creating a plurality of CPU processes, and using each CPU process to decode each video frame, to obtain a plurality of first images;
    storing the plurality of first images in a database;
    creating a first GPU process, using the first GPU process to obtain the first images from the database in sequence, and determining, in sequence, position information of text in each first image, to obtain the position information corresponding to each first image;
    cropping each first image according to the position information corresponding to that first image, to obtain a plurality of second images; and
    performing text recognition on each second image to obtain a text recognition result.
  2. The text recognition method according to claim 1, wherein the database comprises a shared memory of each CPU process, and storing the plurality of first images in the database comprises:
    using each CPU process to store the first image it obtains in its shared memory, and to store its identification information in a first queue;
    and wherein using the first GPU process to obtain the first images from the database in sequence comprises:
    using the first GPU process to obtain the identification information from the first queue in sequence; and
    using the first GPU process to obtain, in sequence and according to the identification information, the first images from the shared memory of the corresponding CPU processes.
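As an illustration only, the hand-off described in claim 2 might be sketched in Python as follows. The sketch assumes that the identification information is the name of a shared-memory block, uses `queue.Queue` as a single-process stand-in for a cross-process queue, and introduces the hypothetical helpers `publish_image` and `fetch_image`; none of these names come from the application.

```python
from multiprocessing import shared_memory
from queue import Queue  # stand-in for a cross-process queue (assumption)


def publish_image(pixels: bytes, first_queue: Queue) -> shared_memory.SharedMemory:
    """Producer side (a CPU decoding process in the claim).

    Stores the decoded image in a shared-memory block and enqueues the
    block's name and payload size as the identification information.
    """
    block = shared_memory.SharedMemory(create=True, size=len(pixels))
    block.buf[:len(pixels)] = pixels
    first_queue.put((block.name, len(pixels)))
    return block  # the producer keeps the handle so it can release the block


def fetch_image(first_queue: Queue) -> bytes:
    """Consumer side (the first GPU process in the claim).

    Takes the next identifier from the queue, attaches to the matching
    shared-memory block, and copies the image bytes out of it.
    """
    name, size = first_queue.get()
    block = shared_memory.SharedMemory(name=name)
    try:
        # The size is passed explicitly because a block may be rounded up
        # to a whole number of memory pages.
        return bytes(block.buf[:size])
    finally:
        block.close()
```

Only the small identifier travels through the queue, while the image data stays in shared memory; the queue also preserves the order in which the images were published.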
  3. The text recognition method according to claim 2, wherein, after determining in sequence the position information of text in each first image to obtain the position information corresponding to each first image, the method further comprises:
    using the first GPU process to send the position information corresponding to each first image, in sequence, to the corresponding CPU process;
    and wherein cropping each first image according to the position information corresponding to that first image, to obtain a plurality of second images, comprises:
    using each CPU process to crop the corresponding first image according to the received position information, to obtain the plurality of second images in sequence.
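The cropping step of claim 3 could look roughly like the Python sketch below. It assumes the position information is an axis-aligned rectangle `(left, top, right, bottom)` with exclusive right and bottom edges, and that an image is a row-major grid of pixel values; both the format and the helper name `crop` are assumptions made for illustration, since the claim does not fix them.

```python
from typing import List, Sequence, Tuple

# Assumed position format: (left, top, right, bottom), right/bottom exclusive.
Box = Tuple[int, int, int, int]


def crop(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Return the sub-image (a "second image") covered by the text box."""
    left, top, right, bottom = box
    # Select the rows inside the box, then the columns inside each row.
    return [list(row[left:right]) for row in image[top:bottom]]


# A 4x4 "first image" whose pixel values are 0..15 in row-major order.
first_image = [[r * 4 + c for c in range(4)] for r in range(4)]
second_image = crop(first_image, (1, 1, 3, 3))
```

Here `second_image` is the 2x2 centre of the grid, `[[5, 6], [9, 10]]`.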
  4. The text recognition method according to claim 3, wherein, after using each CPU process to crop the corresponding first image according to the received position information to obtain the plurality of second images in sequence, the method further comprises:
    using each CPU process to store the second image it obtains in its shared memory, and to store its identification information in a second queue;
    and wherein performing text recognition on each second image to obtain a text recognition result comprises:
    creating a second GPU process, and using the second GPU process to obtain the identification information from the second queue in sequence;
    using the second GPU process to obtain, in sequence and according to the identification information obtained from the second queue, the second images from the shared memory of the corresponding CPU processes; and
    using the second GPU process to perform text recognition on the second images to obtain the text recognition result.
  5. The text recognition method according to claim 1, wherein the plurality of video frames are video frames corresponding to a video to be classified, and, after performing text recognition on each second image to obtain the text recognition result, the method further comprises:
    performing word segmentation on the text recognition result to obtain a plurality of word segments; and
    determining the category of the video to be classified according to the plurality of word segments.
  6. The text recognition method according to claim 5, wherein determining the category of the video to be classified according to the plurality of word segments comprises:
    determining a target keyword from the plurality of word segments; and
    determining the category of the video to be classified according to the target keyword.
  7. The text recognition method according to claim 6, wherein determining the category of the video to be classified according to the target keyword comprises:
    determining the category corresponding to the target keyword according to a preset mapping relationship between keywords and categories; and
    determining that category as the category of the video to be classified.
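A preset mapping of the kind named in claim 7 can be pictured as a simple lookup table. The Python sketch below is hypothetical: the keywords, the categories, and the function name `classify_video` are invented for illustration and do not appear in the application.

```python
from typing import Dict, Optional

# Hypothetical preset mapping; the claim does not list concrete entries.
KEYWORD_TO_CATEGORY: Dict[str, str] = {
    "goal": "sports",
    "recipe": "food",
    "guitar": "music",
}


def classify_video(target_keyword: str,
                   mapping: Dict[str, str] = KEYWORD_TO_CATEGORY) -> Optional[str]:
    """Look up the category for a target keyword.

    Returns None when the keyword has no entry in the preset mapping.
    """
    return mapping.get(target_keyword)
```

How a keyword without a mapping entry should be handled is not stated in the claim; returning `None` is one possible choice.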
  8. The text recognition method according to claim 6, wherein determining the target keyword from the plurality of word segments comprises:
    identifying identical word segments among the plurality of word segments;
    determining the number of occurrences of each identical word segment; and
    determining, as the target keyword, the identical word segment whose number of occurrences is greater than a preset number.
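The frequency rule of claim 8 can be sketched in a few lines of Python. The function name `target_keywords` and the sample word segments are assumptions made for illustration; the claim itself only requires that a word segment occur more often than a preset number.

```python
from collections import Counter
from typing import Iterable, List


def target_keywords(segments: Iterable[str], preset_count: int) -> List[str]:
    """Return the word segments that occur more than `preset_count` times."""
    counts = Counter(segments)  # number of occurrences of each identical segment
    return [word for word, n in counts.items() if n > preset_count]


# Hypothetical segmentation output for one video.
segments = ["goal", "match", "goal", "team", "goal", "match"]
keywords = target_keywords(segments, preset_count=1)
```

With these sample segments, "goal" occurs three times and "match" twice, so both exceed the preset number of 1, while "team" does not.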
  9. The text recognition method according to claim 6, wherein the method further comprises:
    obtaining a user portrait of a user;
    determining, according to the user portrait and the category of the video to be classified, whether to push the video to be classified to the user; and
    if so, pushing the video to be classified to the user.
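One possible reading of the push decision in claim 9 is sketched below in Python. It assumes that the user portrait reduces to a set of categories the user is interested in and that a video is pushed when its category is in that set; the claim does not specify the decision rule, and the name `should_push` is hypothetical.

```python
from typing import Iterable


def should_push(interest_categories: Iterable[str], video_category: str) -> bool:
    """Decide whether to push a video, under an assumed membership rule.

    `interest_categories` stands in for the user portrait.
    """
    return video_category in set(interest_categories)
```

For example, a user whose portrait lists "sports" and "music" would receive a video classified as "sports" but not one classified as "food".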
  10. A text recognition device, comprising:
    an obtaining module, configured to obtain a plurality of video frames;
    a decoding module, configured to create a plurality of CPU processes and to use each CPU process to decode each video frame, to obtain a plurality of first images;
    a saving module, configured to store the plurality of first images in a database;
    a determining module, configured to create a first GPU process, to use the first GPU process to obtain the first images from the database in sequence, and to determine, in sequence, position information of text in each first image, to obtain the position information corresponding to each first image;
    a cropping module, configured to crop each first image according to the position information corresponding to that first image, to obtain a plurality of second images; and
    a recognition module, configured to perform text recognition on each second image to obtain a text recognition result.
  11. A storage medium, wherein a computer program is stored in the storage medium, and when the computer program runs on a computer, the computer is caused to perform the text recognition method according to any one of claims 1 to 9.
  12. An electronic device, comprising a processor and a memory, wherein a computer program is stored in the memory, and the processor, by calling the computer program stored in the memory, is configured to perform:
    obtaining a plurality of video frames;
    creating a plurality of CPU processes, and using each CPU process to decode each video frame, to obtain a plurality of first images;
    storing the plurality of first images in a database;
    creating a first GPU process, using the first GPU process to obtain the first images from the database in sequence, and determining, in sequence, position information of text in each first image, to obtain the position information corresponding to each first image;
    cropping each first image according to the position information corresponding to that first image, to obtain a plurality of second images; and
    performing text recognition on each second image to obtain a text recognition result.
  13. The electronic device according to claim 12, wherein the database comprises a shared memory of each CPU process, and the processor is configured to perform:
    using each CPU process to store the first image it obtains in its shared memory, and to store its identification information in a first queue;
    using the first GPU process to obtain the identification information from the first queue in sequence; and
    using the first GPU process to obtain, in sequence and according to the identification information, the first images from the shared memory of the corresponding CPU processes.
  14. The electronic device according to claim 13, wherein the processor is configured to perform:
    using the first GPU process to send the position information corresponding to each first image, in sequence, to the corresponding CPU process; and
    using each CPU process to crop the corresponding first image according to the received position information, to obtain the plurality of second images in sequence.
  15. The electronic device according to claim 14, wherein the processor is configured to perform:
    using each CPU process to store the second image it obtains in its shared memory, and to store its identification information in a second queue;
    creating a second GPU process, and using the second GPU process to obtain the identification information from the second queue in sequence;
    using the second GPU process to obtain, in sequence and according to the identification information obtained from the second queue, the second images from the shared memory of the corresponding CPU processes; and
    using the second GPU process to perform text recognition on the second images to obtain the text recognition result.
  16. The electronic device according to claim 12, wherein the processor is configured to perform:
    performing word segmentation on the text recognition result to obtain a plurality of word segments; and
    determining the category of the video to be classified according to the plurality of word segments.
  17. The electronic device according to claim 16, wherein the processor is configured to perform:
    determining a target keyword from the plurality of word segments; and
    determining the category of the video to be classified according to the target keyword.
  18. The electronic device according to claim 17, wherein the processor is configured to perform:
    determining the category corresponding to the target keyword according to a preset mapping relationship between keywords and categories; and
    determining that category as the category of the video to be classified.
  19. The electronic device according to claim 17, wherein the processor is configured to perform:
    identifying identical word segments among the plurality of word segments;
    determining the number of occurrences of each identical word segment; and
    determining, as the target keyword, the identical word segment whose number of occurrences is greater than a preset number.
  20. The electronic device according to claim 17, wherein the processor is configured to perform:
    obtaining a user portrait of a user;
    determining, according to the user portrait and the category of the video to be classified, whether to push the video to be classified to the user; and
    if so, pushing the video to be classified to the user.
PCT/CN2019/129963 2019-12-30 2019-12-30 Text identification method, device, storage medium, and electronic apparatus WO2021134229A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/129963 WO2021134229A1 (en) 2019-12-30 2019-12-30 Text identification method, device, storage medium, and electronic apparatus
CN201980100391.5A CN114391260A (en) 2019-12-30 2019-12-30 Character recognition method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/129963 WO2021134229A1 (en) 2019-12-30 2019-12-30 Text identification method, device, storage medium, and electronic apparatus

Publications (1)

Publication Number Publication Date
WO2021134229A1 true WO2021134229A1 (en) 2021-07-08

Family

ID=76687485

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/129963 WO2021134229A1 (en) 2019-12-30 2019-12-30 Text identification method, device, storage medium, and electronic apparatus

Country Status (2)

Country Link
CN (1) CN114391260A (en)
WO (1) WO2021134229A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023050673A1 (en) * 2021-09-28 2023-04-06 上海商汤智能科技有限公司 Image caching method and apparatus, and electronic device, storage medium and computer program product

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116168396A (en) * 2022-10-27 2023-05-26 深圳市超时代软件有限公司 Character recognition device and character recognition method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103108186A (en) * 2013-02-21 2013-05-15 中国对外翻译出版有限公司 Method of achieving high-definition transmission of videos
CN105898495A (en) * 2016-05-26 2016-08-24 维沃移动通信有限公司 Method for pushing mobile terminal recommended information and mobile terminal
CN106874443A (en) * 2017-02-09 2017-06-20 北京百家互联科技有限公司 Based on information query method and device that video text message is extracted
CN110197177A (en) * 2019-04-22 2019-09-03 平安科技(深圳)有限公司 Extract method, apparatus, computer equipment and the storage medium of video caption
CN110598622A (en) * 2019-09-06 2019-12-20 广州华多网络科技有限公司 Video subtitle positioning method, electronic device, and computer storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105427236A (en) * 2015-12-18 2016-03-23 魅族科技(中国)有限公司 Method and device for image rendering
CN106446898A (en) * 2016-09-14 2017-02-22 宇龙计算机通信科技(深圳)有限公司 Extraction method and extraction device of character information in image
CN109922319B (en) * 2019-03-26 2020-10-09 重庆英卡电子有限公司 RTSP (real time streaming protocol) multi-video-stream parallel preprocessing method based on multi-core CPU (central processing unit)
CN110414517B (en) * 2019-04-18 2023-04-07 河北神玥软件科技股份有限公司 Rapid high-precision identity card text recognition algorithm used for being matched with photographing scene

Also Published As

Publication number Publication date
CN114391260A (en) 2022-04-22

Similar Documents

Publication Publication Date Title
WO2019100738A1 (en) Multi-participant human-machine interaction method and device
WO2021143624A1 (en) Video tag determination method, device, terminal, and storage medium
WO2020119569A1 (en) Voice interaction method, device and system
WO2021097750A1 (en) Human body posture recognition method and apparatus, storage medium, and electronic device
US20190251961A1 (en) Transcription of audio communication to identify command to device
US11516550B2 (en) Generating an interactive digital video content item
WO2021159896A1 (en) Video processing method, video processing device, and storage medium
CN112235635B (en) Animation display method, animation display device, electronic equipment and storage medium
WO2021134229A1 (en) Text identification method, device, storage medium, and electronic apparatus
US20140267011A1 (en) Mobile device event control with digital images
WO2023036133A1 (en) Image detection and rendering method and apparatus, device, storage medium, and computer program product
TW202046082A (en) Thread of conversation displaying method, computer readable recording medium and computer device
JP2015521454A (en) Video transmission and reconfiguration
CN109286848B (en) Terminal video information interaction method and device and storage medium
KR20210003259A (en) Selective detection of visual cues for automatic assistant
CN109656655A (en) It is a kind of for executing the method, equipment and storage medium of interactive instruction
WO2019085625A1 (en) Emotion picture recommendation method and apparatus
US11069019B2 (en) Multi-threaded asynchronous frame processing
CN110727629A (en) Playing method of audio electronic book, electronic equipment and computer storage medium
CN112843681B (en) Virtual scene control method and device, electronic equipment and storage medium
WO2020042442A1 (en) Expression package generating method and device
WO2021218535A1 (en) Ui control generation and trigger methods, and terminal
CN103701854A (en) Network real-time audio transmission method based on application virtualization
CN108932142A (en) A kind of picture catching method and terminal
CN112417197B (en) Sorting method, sorting device, machine readable medium and equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19958308

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 06-09-2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19958308

Country of ref document: EP

Kind code of ref document: A1