CN111241319B - Image-text conversion method and system - Google Patents


Info

Publication number
CN111241319B
CN111241319B (application CN202010074440.7A)
Authority
CN
China
Prior art keywords
sentences
sentence
threshold value
user
reaching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010074440.7A
Other languages
Chinese (zh)
Other versions
CN111241319A (en)
Inventor
郑文然
文友枥
吕金旺
任浩男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sohu New Media Information Technology Co Ltd
Original Assignee
Beijing Sohu New Media Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sohu New Media Information Technology Co Ltd filed Critical Beijing Sohu New Media Information Technology Co Ltd
Priority to CN202010074440.7A priority Critical patent/CN111241319B/en
Publication of CN111241319A publication Critical patent/CN111241319A/en
Application granted granted Critical
Publication of CN111241319B publication Critical patent/CN111241319B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5846Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53Querying
    • G06F16/535Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53Querying
    • G06F16/538Presentation of query results
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an image-text conversion method and system. The method comprises: acquiring a picture to be converted; acquiring keywords corresponding to the picture to be converted based on an image recognition tool; inputting the keywords into a search engine and obtaining the relevance between the keywords and each sentence in a preset corpus; and displaying the sentences whose relevance reaches a threshold value to the user. In this scheme, the image recognition tool is used to acquire the keywords corresponding to the picture to be converted, and the search engine is used to obtain the relevance between the keywords and each sentence in the corpus. The sentences whose relevance reaches the threshold are displayed to the user so that the user can select a description sentence that fits the picture to be converted, thereby avoiding the situation in which the user cannot find a suitable description sentence after uploading a picture.

Description

Image-text conversion method and system
Technical Field
The application relates to the technical field of image recognition, and in particular to an image-text conversion method and system.
Background
With the development of internet technology, various kinds of social software have emerged, and users frequently use the picture upload function of social software.
After uploading a picture, a user usually needs to add a suitable description sentence for it, but for various reasons the user may not be able to find a description sentence that accurately matches the uploaded picture.
Disclosure of Invention
In view of this, the embodiments of the present application provide an image-text conversion method and system, so as to solve the problem that a user cannot accurately find a description sentence matching an uploaded picture.
In order to achieve the above object, the embodiment of the present application provides the following technical solutions:
the first aspect of the embodiment of the application discloses a method for converting graphics and texts, which comprises the following steps:
acquiring a picture to be converted;
acquiring keywords corresponding to the picture to be converted based on an image recognition tool;
inputting the keywords into a search engine, and obtaining the relevance between the keywords and each sentence in a preset corpus;
and displaying the sentences whose relevance reaches a threshold value to a user.
Preferably, the obtaining, based on the image recognition tool, the keyword corresponding to the to-be-converted picture includes:
identifying the picture to be converted based on an image identification tool to obtain elements forming the picture to be converted;
and acquiring keywords corresponding to each element according to the characteristic information of each element.
Preferably, the process of constructing the corpus includes:
acquiring a plurality of sentences;
performing word segmentation processing on each sentence to obtain a word segmentation result and weight of each sentence;
and storing the word segmentation result and the weight corresponding to each sentence into a corpus corresponding to the search engine.
Preferably, the displaying the sentences whose relevance reaches the threshold value to the user includes:
sorting the sentences whose relevance reaches the threshold value in descending order of relevance;
and displaying the sorted sentences whose relevance reaches the threshold value to a user.
Preferably, the displaying the sentences whose relevance reaches the threshold value to the user includes:
scoring the sentences whose relevance reaches the threshold value by using the weight of each sentence, to obtain a score for each such sentence;
sorting the sentences whose relevance reaches the threshold value in descending order of score;
and displaying the sorted sentences whose relevance reaches the threshold value to a user.
Preferably, before the obtaining of the relevance between the keywords and each sentence in the preset corpus, the method further includes:
if a keyword is in English, translating the keyword into Chinese.
The second aspect of the embodiment of the application discloses a system for converting graphics and texts, which comprises:
the first acquisition unit is used for acquiring a picture to be converted;
the second acquisition unit is used for acquiring keywords corresponding to the picture to be converted based on an image recognition tool;
the third acquisition unit is used for inputting the keywords into a search engine and obtaining the relevance between the keywords and each sentence in a preset corpus;
and the display unit is used for displaying the sentences whose relevance reaches a threshold value to a user.
Preferably, the second acquisition unit includes:
the identification module is used for identifying the picture to be converted based on an image identification tool to obtain elements forming the picture to be converted;
and the acquisition module is used for acquiring keywords corresponding to each element according to the characteristic information of each element.
Preferably, the third acquisition unit includes:
the acquisition module is used for acquiring a plurality of sentences;
the word segmentation module is used for carrying out word segmentation processing on each sentence to obtain a word segmentation result and weight of each sentence;
and the storage module is used for storing the word segmentation result and the weight corresponding to each sentence into a corpus corresponding to the search engine.
Preferably, the display unit includes:
the processing module is used for sorting the sentences whose relevance reaches a threshold value in descending order of relevance;
and the display module is used for displaying the sorted sentences whose relevance reaches the threshold value to a user.
Based on the image-text conversion method and system provided by the embodiments of the present application, the method comprises: acquiring a picture to be converted; acquiring keywords corresponding to the picture to be converted based on an image recognition tool; inputting the keywords into a search engine and obtaining the relevance between the keywords and each sentence in a preset corpus; and displaying the sentences whose relevance reaches a threshold value to the user. In this scheme, the image recognition tool is used to acquire the keywords corresponding to the picture to be converted, and the search engine is used to obtain the relevance between the keywords and each sentence in the corpus. The sentences whose relevance reaches the threshold are displayed to the user so that the user can select a description sentence that fits the picture to be converted, thereby avoiding the situation in which the user cannot find a suitable description sentence after uploading a picture.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings required by the embodiments or the description of the prior art are briefly introduced below. It is apparent that the drawings in the following description show only embodiments of the present application; a person skilled in the art can obtain other drawings from the provided drawings without inventive effort.
FIG. 1 is a flowchart of an image-text conversion method according to an embodiment of the present application;
FIG. 2 is a flowchart of constructing a corpus provided by an embodiment of the present application;
FIG. 3 is a flowchart of displaying sentences whose relevance reaches a threshold value to a user according to an embodiment of the present application;
fig. 4 is a block diagram of an image-text conversion system according to an embodiment of the present application;
FIG. 5 is a block diagram of another image-text conversion system according to an embodiment of the present application;
FIG. 6 is a block diagram of an image-text conversion system according to an embodiment of the present application;
fig. 7 is a block diagram of an image-text conversion system according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort fall within the protection scope of the present application.
In the present disclosure, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As described in the background, after a user uploads a picture, a suitable description sentence generally needs to be added for it, but for various reasons the user may not be able to find a description sentence that accurately matches the uploaded picture.
Therefore, the embodiments of the present application provide an image-text conversion method and system, which use an image recognition tool to obtain the keywords corresponding to the picture to be converted, and use a search engine to obtain the relevance between the keywords and each sentence in a corpus. The sentences whose relevance reaches a threshold value are displayed to the user, which avoids the situation in which the user cannot find a suitable description sentence after uploading a picture.
It should be noted that the image-text conversion method and system of the embodiments of the present application are applicable not only to the social-networking field, but also to text-creation fields such as advertising, copywriting, and posters. That is, in the field of text creation, after a user uploads a picture, the image-text conversion method of the embodiments of the present application can likewise be used to match corresponding sentences for the uploaded picture and display them to the user for selection.
Referring to fig. 1, a flowchart of an image-text conversion method provided by an embodiment of the present application is shown, where the method includes the following steps:
step S101: and obtaining a picture to be converted.
It can be understood that the picture to be converted is a picture uploaded by the user. The user can upload the picture to be converted in the following ways: selecting and uploading a picture from the album on the device terminal, or taking a photo with the device terminal and uploading it.
It should be noted that the user may also obtain and upload the picture to be converted in other ways; the two ways above (selecting from the album and photographing) are not limiting, and the other ways are not described in detail here.
Step S102: and acquiring keywords corresponding to the picture to be converted based on the image recognition tool.
In the specific implementation of step S102, the image recognition tool is used to recognize the picture to be converted and obtain the elements constituting it, and the keyword corresponding to each element is obtained according to the feature information of that element.
It can be understood that the image recognition tool is any tool with an image recognition function, for example the Google Cloud Vision API.
Note that the Google Cloud Vision API is built on the machine learning system TensorFlow; it can classify pictures into thousands of categories, detect the emotion of faces in a picture, and detect information such as text and other elements in the picture.
In the process of recognizing the picture to be converted, the image recognition tool identifies elements such as the theme, colors, and features of the picture, and the keyword of each element is obtained from its feature information. For example: assuming the picture to be converted uploaded by the user is a photo of the sky, the image recognition tool recognizes that the elements constituting the picture are cloud, sky, blue, sunny day, and so on; that is, the keywords obtained by recognizing the picture are cloud, sky, blue, and sunny day.
It will be appreciated that the above example is only illustrative. When the picture to be converted includes other kinds of information, such as face information or text information, the image recognition tool can also recognize the emotion in the face information and the text in the text information, and obtain the keywords corresponding to the recognized emotion and text. For example: assuming one of the elements contained in the picture to be converted is the face of a laughing person, the image recognition tool can recognize that the corresponding keyword is happy.
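The element-to-keyword step above can be sketched as follows. This is an illustrative sketch only: the label dictionaries, their `description`/`score` field names, and the 0.7 confidence threshold are assumptions modeled on typical image-recognition output, not details given by the patent.

```python
# Hypothetical sketch of turning image-recognition labels into keywords.
# The label data imitates typical image-recognition output; the field names
# ("description", "score") and the 0.7 threshold are illustrative
# assumptions, not details specified by the patent.

def extract_keywords(labels, min_score=0.7):
    """Keep the description of every label whose confidence reaches min_score."""
    return [lab["description"] for lab in labels if lab["score"] >= min_score]

# Example labels for the sky photo described in the text above.
labels = [
    {"description": "sky", "score": 0.98},
    {"description": "cloud", "score": 0.95},
    {"description": "blue", "score": 0.90},
    {"description": "tree", "score": 0.40},  # below threshold, dropped
]
print(extract_keywords(labels))  # -> ['sky', 'cloud', 'blue']
```

In a real deployment the `labels` list would come from the image recognition tool's response rather than being written by hand.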
It should be noted that image recognition tools for different countries may use different languages; that is, the keywords obtained by the image recognition tool for the picture to be converted do not necessarily match the language currently used by the user. For example: if the language currently used by the user is Chinese and the keywords obtained by the image recognition tool are in English, the keywords are translated into Chinese.
As another example: if the language currently used by the user is French and the keywords obtained by the image recognition tool are in Chinese, the keywords are translated into French.
It can be understood that if the keywords obtained by the image recognition tool already match the language currently used by the user, no translation is needed.
If the keywords need to be translated, they may be translated into the corresponding language by using an open-source translation word stock.
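The translation fallback described above can be sketched minimally. The four-entry table below is a stand-in for the open-source translation word stock mentioned in the text; its entries are illustrative, and a real system would use a full bilingual dictionary.

```python
# Illustrative only: a tiny lookup table stands in for the open-source
# translation word stock mentioned above; a real system would use a full
# bilingual dictionary or translation service.
TRANSLATIONS = {"cloud": "云", "sky": "天空", "blue": "蓝色", "sunny": "晴天"}

def translate_keywords(keywords, table=TRANSLATIONS):
    # Fall back to the original keyword when no translation entry exists.
    return [table.get(k, k) for k in keywords]

print(translate_keywords(["cloud", "sky", "rainbow"]))  # -> ['云', '天空', 'rainbow']
```

Keeping the untranslated keyword as a fallback means an incomplete dictionary degrades gracefully instead of dropping keywords.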
Step S103: inputting the keywords into a search engine, and obtaining the relevance between the keywords and each sentence in a preset corpus.
It should be noted that a large number of sentences are collected in advance, and the corpus is constructed from them.
It can be appreciated that sentences are collected separately for each application field. For example, in the social-networking field, a user uploading a picture usually wants to describe it with elegant, well-crafted sentences; when the corpus is constructed, a large number of elegant sentences therefore need to be collected in advance and used to build the corresponding corpus.
Similarly, in the poster field, a user uploading a picture needs to describe the subject of the poster with suitable sentences; when the corpus is constructed, a large number of poster expressions need to be collected in advance and used to build the corresponding corpus.
In the specific implementation of step S103, the keywords obtained in the preceding steps are input into the search engine, and the search engine searches the pre-built corpus to obtain the relevance (which may also be called the degree of correlation) between the keywords and each sentence in the corpus.
The indicators of the relevance between a keyword and a sentence are word frequency and density. In general, the density of a keyword is positively correlated with the number of times it appears in a sentence: the more often the keyword appears, the higher its density, and the higher the relevance between the keyword and the sentence.
It should be further noted that the search engines mentioned above include, but are not limited to, Elasticsearch; other search engines with similar functions are equally applicable to the solutions of the embodiments of the present application.
When the search engine is used, all keywords are input into it; the search engine retrieves the sentences in the corpus and ranks them in descending order of their relevance to the keywords.
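The frequency-and-density idea above can be sketched without a real search engine. The formula below (keyword hits divided by sentence length) is an assumption chosen to mirror the description; Elasticsearch's actual relevance scoring (BM25) is considerably more elaborate.

```python
# Minimal relevance sketch matching the frequency/density description above.
# This is NOT Elasticsearch's real scoring (BM25); dividing keyword hits by
# sentence length is an illustrative stand-in for "density".

def relevance(keywords, tokens):
    hits = sum(tokens.count(k) for k in keywords)
    return hits / len(tokens) if tokens else 0.0

keywords = ["sky", "cloud", "blue"]
corpus = [
    "the blue sky is dotted with white cloud",
    "a busy street at night",
]
# Rank corpus sentences in descending order of relevance to the keywords.
ranked = sorted(corpus, key=lambda s: relevance(keywords, s.split()), reverse=True)
print(ranked[0])  # the sky sentence ranks first
```

The whitespace `split()` is a placeholder; Chinese text would need a real word-segmentation step, as the corpus-construction section below describes.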
Step S104: displaying the sentences whose relevance reaches the threshold value to the user.
In the specific implementation of step S104, the sentences whose relevance reaches the threshold value are sorted in descending order of relevance, and the sorted sentences are displayed to the user for selection.
It is to be understood that the sorting may also be in ascending order of relevance, which is not limited here.
That is, the n sentences most relevant to the keywords are displayed to the user, so that the user can select a sentence as needed, where n is an integer greater than 0.
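Step S104 can be sketched as a threshold filter followed by a top-n cut. The threshold value 0.2 and n = 2 below are illustrative; the patent fixes neither.

```python
# Sketch of step S104: keep sentences whose relevance reaches the threshold,
# sort them in descending order of relevance, and show the top n.
# The threshold (0.2) and n (2) are illustrative assumptions.

def select_sentences(scored, threshold, n):
    kept = [(rel, sent) for rel, sent in scored if rel >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [sent for _, sent in kept[:n]]

scored = [
    (0.375, "the blue sky ..."),
    (0.0, "a busy street ..."),
    (0.25, "clouds drift by"),
]
print(select_sentences(scored, threshold=0.2, n=2))
# -> ['the blue sky ...', 'clouds drift by']
```

The `(relevance, sentence)` pairs would come from the search engine's ranked results in practice.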
In the embodiments of the present application, the image recognition tool is used to acquire the keywords corresponding to the picture to be converted, and the search engine is used to obtain the relevance between the keywords and each sentence in the corpus. The sentences whose relevance reaches the threshold are displayed to the user so that the user can select a description sentence that fits the picture to be converted, thereby avoiding the situation in which the user cannot find a suitable description sentence after uploading a picture.
With regard to the process of constructing the corpus in step S103 of fig. 1, referring to fig. 2, a flowchart of constructing a corpus provided by an embodiment of the present application is shown, which includes the following steps:
step S201: a plurality of sentences is obtained.
In the specific implementation of step S201, sentences are collected for each application field; for details, refer to step S103 of fig. 1 in the above embodiment, which is not repeated here.
After the plurality of sentences is acquired, they are submitted to the search engine for corresponding processing.
Step S202: and performing word segmentation processing on each sentence to obtain a word segmentation result and weight of each sentence.
In the specific implementation of step S202, after the plurality of sentences is submitted to the search engine, the tokenizer (word segmentation controller) is used to segment each sentence, obtaining the word segmentation result and the weight corresponding to each sentence.
Step S203: and storing the word segmentation result and the weight corresponding to each sentence into a corpus corresponding to the search engine.
In the specific implementation of step S203, after each sentence is segmented, the word segmentation result and the weight corresponding to each sentence are stored in the corpus corresponding to the search engine.
In the embodiments of the present application, the acquired sentences are submitted to the search engine for word segmentation, and the word segmentation result and weight of each sentence are stored in the search engine's corpus. After the keywords in the picture to be converted are recognized, the search engine retrieves them, and the sentences whose relevance to the keywords reaches the threshold are displayed to the user so that the user can select a description sentence that fits the picture to be converted, thereby avoiding the situation in which the user cannot find a suitable description sentence after uploading a picture.
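The corpus-construction flow of steps S201-S203 can be sketched as follows. A real deployment would use the search engine's Chinese word-segmentation analyzer; here a whitespace split stands in for segmentation, and the uniform weight of 1.0 is an assumption, since the patent does not specify how weights are derived.

```python
# Sketch of corpus construction (steps S201-S203). The whitespace split is a
# placeholder for real Chinese word segmentation, and the uniform weight of
# 1.0 is an assumption (the patent does not specify the weighting scheme).

def build_corpus(sentences):
    corpus = {}
    for sent in sentences:
        tokens = sent.split()                      # placeholder segmentation
        corpus[sent] = {"tokens": tokens, "weight": 1.0}
    return corpus

corpus = build_corpus(["the blue sky", "a quiet night"])
print(corpus["the blue sky"]["tokens"])  # -> ['the', 'blue', 'sky']
```

In Elasticsearch the equivalent step would be indexing each sentence through a Chinese analyzer rather than building a dictionary in application code.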
With regard to the process of displaying the sentences whose relevance reaches the threshold value to the user in step S104 of the above embodiment, in combination with the content of fig. 2 and referring to fig. 3, a flowchart of displaying such sentences to the user provided by an embodiment of the present application is shown, which includes the following steps:
step S301: and scoring the sentences with the relevance reaching the threshold value by using the weight of each sentence to obtain the score of the sentences with the relevance reaching the threshold value.
According to the content in fig. 2, the word segmentation processing is performed on each sentence obtained to obtain a corresponding word segmentation result and weight. In the specific implementation process of step S301, for the sentence whose correlation reaches the threshold value, the score of the sentence whose correlation reaches the threshold value is obtained by scoring with the weight corresponding to the sentence.
Step S302: and sorting sentences with the correlation degree reaching a threshold value according to the order of the scores from high to low.
In the specific implementation process of step S302, after scoring the sentences with the correlation degree reaching the threshold value, sorting the sentences with the correlation degree reaching the threshold value according to the order of the scores.
It is to be understood that the ordering may be from low to high, and is not specifically limited herein.
Step S303: and displaying the sentences with the sorted relevance reaching the threshold value to a user.
In the specific implementation process of step S303, after the sentences with the correlation degree reaching the threshold value are ranked, the ranked sentences with the correlation degree reaching the threshold value are displayed to the user, so that the user can select according to the own requirement.
In the embodiment of the application, the sentences with the correlation degree reaching the threshold value are scored by using the weight of each sentence. And sorting sentences with the relevance reaching a threshold value according to the score order, and displaying the sorting result to a user so as to enable the user to select the description sentences conforming to the pictures to be converted, thereby avoiding the situation that the user cannot find proper description sentences after uploading the pictures.
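The weighted ranking of steps S301-S303 can be sketched as follows. The patent does not specify the scoring formula, so score = relevance × weight is assumed here purely for illustration.

```python
# Sketch of steps S301-S303. The scoring formula (relevance * weight) is an
# illustrative assumption; the patent only says sentences are scored using
# their weights and then sorted.

def rank_by_score(candidates):
    """candidates: list of (sentence, relevance, weight) tuples."""
    scored = [(rel * weight, sent) for sent, rel, weight in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sent for _, sent in scored]

candidates = [
    ("the blue sky", 0.375, 1.0),
    ("clouds drift by", 0.25, 2.0),  # higher weight lifts its final score
]
print(rank_by_score(candidates))  # -> ['clouds drift by', 'the blue sky']
```

The example shows how a sentence with lower raw relevance can outrank a more relevant one once weights are applied, which is the point of the weighted variant.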
Corresponding to the image-text conversion method provided in the above embodiments, referring to fig. 4, an embodiment of the present application further provides a block diagram of an image-text conversion system, which includes: a first acquisition unit 401, a second acquisition unit 402, a third acquisition unit 403, and a display unit 404;
a first obtaining unit 401 is configured to obtain a picture to be converted.
The second obtaining unit 402 is configured to obtain, based on the image recognition tool, a keyword corresponding to the picture to be converted.
Preferably, the second obtaining unit 402 is further configured to translate a keyword into Chinese if the keyword is in English.
The third obtaining unit 403 is configured to input the keywords into a search engine and obtain the relevance between the keywords and each sentence in a preset corpus.
And the display unit 404 is configured to display the sentences whose relevance reaches the threshold value to the user.
Preferably, referring to fig. 5 in conjunction with fig. 4, a block diagram of an image-text conversion system provided by an embodiment of the present application is shown, in which the second obtaining unit 402 includes:
the identifying module 4021 is configured to identify a picture to be converted based on the image identifying tool, to obtain elements that constitute the picture to be converted.
The obtaining module 4022 is configured to obtain, according to the feature information of each element, a keyword corresponding to each element.
In the embodiments of the present application, the image recognition tool is used to acquire the keywords corresponding to the picture to be converted, and the search engine is used to obtain the relevance between the keywords and each sentence in the corpus. The sentences whose relevance reaches the threshold are displayed to the user so that the user can select a description sentence that fits the picture to be converted, thereby avoiding the situation in which the user cannot find a suitable description sentence after uploading a picture.
Preferably, referring to fig. 6 in conjunction with fig. 4, a block diagram of an image-text conversion system provided by an embodiment of the present application is shown, in which the third obtaining unit 403 includes: an acquisition module 4031, a word segmentation module 4032, and a storage module 4033;
the obtaining module 4031 is configured to obtain a plurality of sentences.
The word segmentation module 4032 is configured to perform word segmentation on each sentence to obtain a word segmentation result and a weight of each sentence.
And the storage module 4033 is configured to store the word segmentation result and the weight corresponding to each sentence into a corpus corresponding to the search engine.
In the embodiments of the present application, the acquired sentences are submitted to the search engine for word segmentation, and the word segmentation result and weight of each sentence are stored in the search engine's corpus. After the keywords in the picture to be converted are recognized, the search engine retrieves them, and the sentences whose relevance to the keywords reaches the threshold are displayed to the user so that the user can select a description sentence that fits the picture to be converted, thereby avoiding the situation in which the user cannot find a suitable description sentence after uploading a picture.
Preferably, referring to fig. 7 in conjunction with fig. 4, which shows a block diagram of an image-text conversion system provided by an embodiment of the present application, the display unit 404 includes:
a processing module 4041, configured to sort the sentences whose correlation reaches a threshold in descending order of correlation;
and a display module 4042, configured to display the sorted sentences whose correlation reaches the threshold to the user.
Preferably, in combination with the content shown in fig. 7, in another specific implementation, the processing module 4041 is configured to score the sentences whose correlation reaches the threshold using the weight of each sentence, obtain a score for each such sentence, and sort the sentences in descending order of score.
The display module 4042 is configured to display the sorted sentences whose correlation reaches the threshold to the user.
In this embodiment of the application, the sentences whose correlation reaches the threshold are scored using the weight of each sentence. The sentences are then sorted by score, and the sorted result is displayed to the user so that the user can select a description sentence that matches the picture to be converted, which avoids the situation where the user cannot find a suitable description sentence after uploading a picture.
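The scoring-and-sorting step above can be sketched as a small ranking function. The patent leaves the exact scoring formula unspecified; multiplying relevance by the stored weight is one plausible assumption used purely for illustration.

```python
# Sketch of the ranking step: combine each above-threshold sentence's
# relevance with its stored weight, sort descending, and return the
# ordered list that would be presented to the user.

def rank(candidates):
    # candidates: list of (sentence, relevance, weight) triples.
    # relevance * weight is an assumed scoring function; the patent
    # only requires that weights influence the score.
    scored = [(sent, rel * weight) for sent, rel, weight in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [sent for sent, _ in scored]

candidates = [
    ("A quiet beach at sunset", 1.0, 0.9),   # score 0.9
    ("Sunset over the harbour", 0.5, 0.8),   # score 0.4
]
print(rank(candidates))
# ['A quiet beach at sunset', 'Sunset over the harbour']
```

Sorting by a combined score rather than raw relevance lets editorially weighted sentences outrank marginally more relevant but lower-quality ones.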
In summary, the embodiments of the application provide an image-text conversion method and system, where the method comprises: acquiring a picture to be converted; acquiring keywords corresponding to the picture to be converted based on an image recognition tool; inputting the keywords into a search engine and obtaining the degree of correlation between the keywords and each sentence in a preset corpus; and displaying the sentences whose correlation reaches a threshold to the user. In this scheme, an image recognition tool is used to obtain the keywords corresponding to the picture to be converted, and a search engine is used to obtain the correlation between the keywords and each sentence in the corpus. The sentences whose correlation reaches the threshold are displayed to the user so that the user can select a description sentence that matches the picture to be converted, which avoids the situation where the user cannot find a suitable description sentence after uploading a picture.
In this specification, each embodiment is described in a progressive manner; identical and similar parts of the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The systems and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of a given embodiment. Those of ordinary skill in the art can understand and implement the application without undue effort.
Those skilled in the art will further appreciate that the various illustrative units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative units and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as a departure from the scope of the present application.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (7)

1. An image-text conversion method, the method comprising:
acquiring a picture to be converted;
acquiring keywords corresponding to the picture to be converted based on an image recognition tool;
inputting the keywords into a search engine, and obtaining the degree of correlation between the keywords and each sentence in a preset corpus, wherein the corpus comprises the word segmentation result and weight of each sentence;
displaying the sentences whose correlation reaches a threshold to a user;
the obtaining, based on the image recognition tool, the keyword corresponding to the picture to be converted includes:
identifying the picture to be converted based on an image identification tool to obtain elements forming the picture to be converted;
acquiring keywords corresponding to each element according to the characteristic information of each element;
wherein the displaying, to the user, of the sentences whose correlation reaches the threshold comprises:
scoring the sentences whose correlation reaches the threshold using the weight of each sentence, to obtain a score for each such sentence;
sorting the sentences whose correlation reaches the threshold in descending order of score;
and displaying the sorted sentences whose correlation reaches the threshold to the user.
2. The method of claim 1, wherein constructing the corpus comprises:
acquiring a plurality of sentences;
performing word segmentation processing on each sentence to obtain a word segmentation result and weight of each sentence;
and storing the word segmentation result and weight corresponding to each sentence into the corpus corresponding to the search engine.
3. The method of claim 1, wherein the displaying, to the user, of the sentences whose correlation reaches the threshold comprises:
sorting the sentences whose correlation reaches the threshold in descending order of correlation;
and displaying the sorted sentences whose correlation reaches the threshold to the user.
4. The method of claim 1, further comprising, before the obtaining of the correlation between the keywords and each sentence in the preset corpus:
translating the keywords into Chinese if the keywords are in English.
5. An image-text conversion system, the system comprising:
a first acquisition unit, configured to acquire a picture to be converted;
a second acquisition unit, configured to acquire keywords corresponding to the picture to be converted based on an image recognition tool;
a third acquisition unit, configured to input the keywords into a search engine and obtain the degree of correlation between the keywords and each sentence in a preset corpus, wherein the corpus comprises the word segmentation result and weight of each sentence;
a display unit, configured to display the sentences whose correlation reaches a threshold to a user;
wherein the second acquisition unit includes:
the identification module is used for identifying the picture to be converted based on an image identification tool to obtain elements forming the picture to be converted;
the acquisition module is used for acquiring keywords corresponding to each element according to the characteristic information of each element;
the display unit includes:
a processing module, configured to score the sentences whose correlation reaches the threshold using the weight of each sentence to obtain a score for each such sentence, and to sort the sentences whose correlation reaches the threshold in descending order of score;
and a display module, configured to display the sorted sentences whose correlation reaches the threshold to the user.
6. The system of claim 5, wherein the third acquisition unit comprises:
the acquisition module is used for acquiring a plurality of sentences;
the word segmentation module is used for carrying out word segmentation processing on each sentence to obtain a word segmentation result and weight of each sentence;
and the storage module is used for storing the word segmentation result and the weight corresponding to each sentence into a corpus corresponding to the search engine.
7. The system of claim 5, wherein the display unit comprises:
a processing module, configured to sort the sentences whose correlation reaches a threshold in descending order of correlation;
and a display module, configured to display the sorted sentences whose correlation reaches the threshold to the user.
CN202010074440.7A 2020-01-22 2020-01-22 Image-text conversion method and system Active CN111241319B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010074440.7A CN111241319B (en) 2020-01-22 2020-01-22 Image-text conversion method and system


Publications (2)

Publication Number Publication Date
CN111241319A (en) 2020-06-05
CN111241319B (en) 2023-10-03

Family

ID=70874905

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010074440.7A Active CN111241319B (en) 2020-01-22 2020-01-22 Image-text conversion method and system

Country Status (1)

Country Link
CN (1) CN111241319B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106021532A (en) * 2016-05-25 2016-10-12 东软集团股份有限公司 Display method and device for keywords
WO2017063538A1 (en) * 2015-10-12 2017-04-20 广州神马移动信息科技有限公司 Method for mining related words, search method, search system
CN109063076A (en) * 2018-07-24 2018-12-21 维沃移动通信有限公司 A kind of Picture Generation Method and mobile terminal
CN109582852A (en) * 2018-12-05 2019-04-05 中国银行股份有限公司 A kind of sort method and system of full-text search result

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080027917A1 (en) * 2006-07-31 2008-01-31 Siemens Corporate Research, Inc. Scalable Semantic Image Search


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Query translation and conversion algorithm in cross-language information retrieval; Zhang Xiaofei; Huang Heyan; Chen Zhaoxiong; Dai Liuling; Computer Engineering (11); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant