CN107742315B

CN107742315B - Method and device for generating character word cloud portrait

Info

Publication number: CN107742315B
Application number: CN201710934962.8A
Authority: CN
Inventors: 周松文; 何金虎
Original assignee: Taikang Insurance Group Co Ltd
Current assignee: Taikang Insurance Group Co Ltd
Priority date: 2017-10-10
Filing date: 2017-10-10
Publication date: 2020-12-08
Anticipated expiration: 2037-10-10
Also published as: CN107742315A

Abstract

The embodiments of the invention provide a method and a device for generating a character word cloud portrait, an electronic device, and a computer-readable medium, and relate to the technical field of text processing. The method for generating the character word cloud portrait comprises the following steps: processing an input picture in a color-lead (colored-pencil) manner to obtain a color-lead effect picture; processing an input text using a preset lexicon to generate a sorted document of words, wherein the preset lexicon comprises a plurality of words; and filling the words into the color-lead effect picture according to the sorted document of the words to obtain a character word cloud portrait. The method can quickly generate a character word cloud portrait from the input picture and text, can reflect changes in the data quickly and in real time, and can ensure the accuracy of the portrait, thereby producing a portrait that is attractive and rich in color.

Description

Method and device for generating character word cloud portrait

Technical Field

The embodiment of the invention relates to the technical field of text processing, in particular to a method and a device for generating a character word cloud portrait.

Background

A word cloud visually highlights the keywords that occur most frequently in network text, forming a keyword cloud layer or keyword rendering. A large amount of text information is thereby filtered, so that a person browsing a webpage can grasp the gist of the text at a glance.

In the data analysis process, the character word cloud portrait is a description form which is friendly and easy to understand to character characteristic information. In the prior art, the following two modes are mainly adopted in the process of generating the character word cloud portrait:

(1) The data are processed in an early stage, the picture is beautified and whitened in a later stage, and the processed data are then filled in one by one. The advantages are that the generated portrait is accurate and aesthetically pleasing; the disadvantages are that the period for generating the portrait is too long and that data changes cannot be responded to quickly and in real time.

(2) The character word cloud portrait is generated directly from pictures and words. The original picture before the character word cloud portrait is directly generated is shown in Fig. 1, and the effect of the directly generated character word cloud portrait is shown in Fig. 2.

Neither of the two prior-art processing methods can both shorten the period for generating a portrait and ensure the accuracy of the portrait, so improvement is needed.

The above information disclosed in this background section is only for enhancement of understanding of the background of the embodiments of the present invention and therefore it may contain information that does not form the prior art that is already known to a person of ordinary skill in the art.

Disclosure of Invention

The embodiment of the invention provides a method, a device, electronic equipment and a computer readable medium for generating a character word cloud portrait, and solves the technical problem that the existing technical scheme can not generate the character word cloud portrait quickly and accurately.

Additional features and advantages of embodiments of the invention will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of embodiments of the invention.

According to a first aspect of the embodiments of the present invention, there is provided a method for generating a character word cloud portrait, including:

processing an input picture in a color-lead manner to obtain a color-lead effect picture;

processing an input text using a preset lexicon to generate a sorted document of words, wherein the preset lexicon comprises a plurality of words;

and filling the words into the color-lead effect picture according to the sorted document of the words to obtain a character word cloud portrait.

In some embodiments of the present invention, before processing the input picture, the method further includes:

judging whether the format of the input picture meets the preset format requirement or not, if so, continuing to process the input picture in a colored lead mode; and if the format of the input picture does not meet the preset format requirement, re-inputting a new picture, wherein the preset format requirement comprises PNG and JPG.

In some embodiments of the present invention, processing the input picture in a color-leaded manner includes:

converting the input picture into a black and white picture;

and carrying out color lead treatment on the black and white picture to obtain the color lead effect picture.

In some embodiments of the present invention, processing the input text using the preset lexicon to generate the sorted document of words includes:

removing stop words from the input text;

calculating the word frequency and the weight of each vocabulary in the preset word bank by using the preset word bank;

calculating to obtain a sorting reference value of the vocabulary according to the word frequency and the weight of the vocabulary;

and sequencing the vocabulary according to the sequencing reference value of the vocabulary to obtain a sequencing document of the vocabulary.

In some embodiments of the present invention, the input text is derived from a document containing words in the predetermined lexicon, and calculating the weight of the words comprises:

counting the number of files containing the vocabulary;

and calculating the weight of the vocabulary according to the number of the files containing the vocabulary and the number of the total files.

In some embodiments of the present invention, the ranking reference value of the vocabulary is a product of a word frequency of the vocabulary and a weight of the vocabulary.

In some embodiments of the present invention, filling the vocabulary into the color-lead effect picture according to the sorted document of the vocabulary further comprises:

determining the font size of each filled-in vocabulary according to the magnitude of its sorting reference value.

According to a second aspect of the embodiments of the present invention, there is provided an apparatus for generating a character word cloud portrait, including:

the picture processing module is configured to process the input picture in a color lead mode to obtain a color lead effect picture;

the text processing module is configured to process an input text by utilizing a preset word bank to generate a sequencing document of words, wherein the preset word bank comprises a plurality of words;

and the fill-in module is configured to fill the words into the color-lead effect picture according to the sorted document of the words to obtain the character word cloud portrait.

In some embodiments of the invention, the apparatus further comprises:

the format judging module is configured to judge whether the format of the input picture meets a preset format requirement before the input picture is processed, and if the format of the input picture meets the preset format requirement, the input picture is continuously processed in a colored lead mode; and if the format of the input picture does not meet the preset format requirement, re-inputting a new picture, wherein the preset format requirement comprises PNG and JPG.

In some embodiments of the invention, the picture processing module comprises:

a black and white conversion sub-module configured to convert the input picture into a black and white picture;

and the color lead sub-module is configured to perform color lead processing on the black and white picture to obtain the color lead effect picture.

In some embodiments of the invention, the text processing module comprises:

a stop word submodule configured to remove stop words from the input text;

the first calculation submodule is configured to calculate the word frequency and the weight of each vocabulary in the preset word bank by using the preset word bank;

the second calculation submodule is configured to calculate a ranking reference value of the vocabulary according to the word frequency and the weight of the vocabulary;

and the sequencing submodule is configured to sequence the vocabulary according to the sequencing reference value of the vocabulary to obtain a sequencing document of the vocabulary.

In some embodiments of the present invention, the input text is derived from a file containing words in the predetermined lexicon, and the first calculation sub-module is configured to count the number of files containing the words and calculate the weight of the words according to the number of files containing the words and the number of total files.

In some embodiments of the present invention, the second calculation submodule obtains the ranking reference value of the vocabulary according to a product of the word frequency of the vocabulary and the weight of the vocabulary.

In some embodiments of the invention, the fill-in module determines the font size of the filled-in vocabulary according to the size of the sorted reference value of the vocabulary.

According to a third aspect of embodiments of the present invention, there is provided an electronic apparatus, including: a memory; a processor and a computer program stored on the memory and executable on the processor, the program implementing the method steps described above when executed by the processor.

According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable medium having stored thereon computer-executable instructions which, when executed by a processor, implement the above-described method steps.

According to the method, the device, the electronic equipment and the computer readable medium for generating the character word cloud portrait, the character word cloud portrait can be generated rapidly from the input picture and text, changes in the data can be reflected quickly and in real time, and the accuracy of the portrait can be ensured, so that a portrait that is attractive and rich in color is obtained.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of embodiments of the invention.

Drawings

The above and other objects, features and advantages of the embodiments of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.

Fig. 1 shows the original picture before a character word cloud portrait is directly generated in a prior art scheme.

Fig. 2 shows the effect of directly generating a character word cloud portrait in a prior art scheme.

Fig. 3 is a flowchart of a method for generating a character word cloud portrait according to an embodiment of the present invention.

Fig. 4 shows a flowchart of step S31 in the embodiment of the present invention.

Fig. 5 shows the effect of converting the picture shown in Fig. 1 into a black-and-white picture according to an embodiment of the present invention.

Fig. 6 shows the effect of the color-lead processing on the black-and-white picture shown in Fig. 5 in the embodiment of the invention.

Fig. 7 shows a flowchart of step S32 in the embodiment of the present invention.

Fig. 8 is a schematic diagram of a sorted document of vocabulary in an embodiment of the present invention.

Fig. 9 shows the character word cloud portrait finally obtained in the embodiment of the present invention.

Fig. 10 is a schematic diagram of an apparatus for generating a character word cloud portrait according to an embodiment of the present invention.

Fig. 11 is a schematic diagram of a picture processing module in an embodiment of the present invention.

Fig. 12 is a schematic diagram of a text processing module in an embodiment of the invention.

Fig. 13 is a schematic diagram of another apparatus for generating a character word cloud portrait according to an embodiment of the present invention.

Fig. 14 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to another embodiment of the present invention.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The drawings are merely schematic representations of embodiments of the invention, which are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted.

Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail to avoid obscuring aspects of embodiments of the invention.

Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

As shown in fig. 3, in step S31, the input picture is processed by the color-lead method to obtain a color-lead effect picture.

As shown in fig. 3, in step S32, the input text is processed using a preset lexicon, which includes a plurality of words, to generate a sorted document of words.

As shown in fig. 3, in step S33, words are filled into the color-lead effect picture according to the sorted document of the words, so as to obtain a character word cloud portrait.

The method performs color-lead processing on the input picture to turn it into a color-lead effect picture, and then fills vocabulary into the color-lead effect picture according to the sorted document of the vocabulary, finally obtaining a character word cloud portrait with high accuracy and at high generation speed.

In the embodiment of the present invention, before processing the input picture, the method further includes:

judging whether the format of the input picture meets the preset format requirement. If it does, the method continues to step S31, that is, the input picture is processed in the color-lead manner; if it does not, a new picture is input again. The preset format requirement includes, but is not limited to, bitmap formats such as PNG (Portable Network Graphics) and JPG (JPEG, Joint Photographic Experts Group), and the preset format requirement for the picture can be changed as needed in actual use.
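The format check described above can be sketched as follows. This is only a minimal illustration: the function name `meets_format_requirement`, the suffix-based test, and the inclusion of `.jpeg` alongside `.jpg` are assumptions, not details given in the text.

```python
from pathlib import Path

# Assumed set of accepted suffixes. The text names PNG and JPG and notes that
# the requirement may be changed; ".jpeg" is added here as an assumption.
ALLOWED_SUFFIXES = frozenset({".png", ".jpg", ".jpeg"})


def meets_format_requirement(picture_path: str, allowed=ALLOWED_SUFFIXES) -> bool:
    """Return True if the file suffix is one of the accepted picture formats."""
    return Path(picture_path).suffix.lower() in allowed
```

A caller would continue to step S31 when the function returns True and ask for a new picture otherwise. Checking only the suffix is a simplification; inspecting the file header would be more robust.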

In the embodiment of the present invention, fig. 4 shows a flowchart of step S31 in the embodiment of the present invention, that is, the step S31 of processing the input picture in the color-lead manner includes the following steps:

as shown in fig. 4, in step S41, the input picture is converted into a black and white picture, the input picture is as shown in fig. 1, and fig. 5 shows an effect of converting the picture shown in fig. 1 into a black and white picture in the embodiment of the present invention.

As shown in fig. 4, in step S42, a color-lead effect picture is obtained by performing a color-lead process on the black-and-white picture, and fig. 6 shows an effect picture after the color-lead process is performed on the black-and-white picture shown in fig. 5 in the embodiment of the present invention.

The black-and-white processing in step S41 makes the contrast between black and white more pronounced, after which the color-lead processing in step S42 is applied. It should be noted that in step S42 only the non-white regions are colored; the white regions remain white.
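Steps S41 and S42 can be illustrated on a tiny in-memory picture. The text does not specify the thresholding rule or how colors are chosen in the color-lead step, so this sketch assumes a fixed luminance threshold and assumes that non-white pixels take their color from the original picture; both function names are hypothetical.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
WHITE: RGB = (255, 255, 255)
BLACK: RGB = (0, 0, 0)


def to_black_and_white(pixels: List[List[RGB]], threshold: int = 128) -> List[List[RGB]]:
    """Step S41 (sketch): map each pixel to pure black or pure white."""
    result = []
    for row in pixels:
        new_row = []
        for r, g, b in row:
            # ITU-R BT.601 luma weights; the threshold value is an assumption.
            luminance = 0.299 * r + 0.587 * g + 0.114 * b
            new_row.append(WHITE if luminance >= threshold else BLACK)
        result.append(new_row)
    return result


def apply_color_lead(bw: List[List[RGB]], original: List[List[RGB]]) -> List[List[RGB]]:
    """Step S42 (sketch): color only the non-white pixels; white stays white.

    Assumption: the color of a non-white pixel is taken from the original picture.
    """
    return [
        [WHITE if bw_px == WHITE else orig_px for bw_px, orig_px in zip(bw_row, orig_row)]
        for bw_row, orig_row in zip(bw, original)
    ]
```

The only property taken directly from the text is that white regions are left untouched by the coloring step.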

In the embodiment of the present invention, fig. 7 shows a flowchart of step S32 in the embodiment of the present invention, that is, step S32 processes the input text with a preset lexicon, and generating a ranked document of vocabularies includes the following steps:

As shown in fig. 7, in step S71, stop words, including punctuation, emoticons, prepositions, and conjunctions, are removed from the input text. The input text is derived from files containing words in the preset lexicon. The content of the files is generally a description of the person in the picture, and the source of the files is not limited: it may include articles screened from a network, introductory articles within an enterprise, and the like. The preset lexicon is generally formed by adding keywords that meet requirements, such as common industry terms, to a general-purpose vocabulary as needed, so that the preset lexicon is richer than the general-purpose vocabulary, better reflects industry characteristics, and describes the person in the picture more aptly.
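A minimal sketch of step S71 for space-delimited text is given below. The stop-word set and the function name are hypothetical examples; Chinese input would additionally need a word segmenter, which the text does not specify.

```python
import re
from typing import FrozenSet, List

# Hypothetical stop-word list: a few prepositions, conjunctions and articles.
STOP_WORDS: FrozenSet[str] = frozenset({"a", "an", "and", "but", "in", "of", "on", "or", "the"})


def remove_stop_words(text: str, stop_words: FrozenSet[str] = STOP_WORDS) -> List[str]:
    """Tokenize the text and drop stop words.

    The pattern keeps only runs of word characters, so punctuation and
    emoticon symbols never become tokens.
    """
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in stop_words]
```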

As shown in fig. 7, in step S72, the word frequency and the weight of each vocabulary in the preset lexicon are calculated by using the preset lexicon. The word frequency of a vocabulary is used for representing the number of times of the occurrence of the vocabulary, and the weight of the vocabulary is used for representing the weight of the vocabulary in the file dimension, specifically, the weight calculation of the vocabulary can adopt the following method:

First, the number of files containing the vocabulary is counted. Second, the weight of the vocabulary is calculated from that number and the total number of files: the total number of files may be divided by the number of files containing the vocabulary, and the logarithm of the quotient is taken as the weight of the vocabulary.
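The weight calculation just described corresponds to an inverse-document-frequency style value. The sketch below assumes each file is represented as a set of words and uses the natural logarithm; the text does not state the logarithm base, and the zero return for a word found in no file is an added assumption to avoid division by zero.

```python
import math
from typing import List, Set


def vocabulary_weight(word: str, files: List[Set[str]]) -> float:
    """Weight = log(total number of files / number of files containing the word)."""
    containing = sum(1 for file_words in files if word in file_words)
    if containing == 0:
        # Assumption: a word that appears in no file gets weight 0.
        return 0.0
    return math.log(len(files) / containing)
```

A word that appears in every file gets weight 0, while a word confined to a few files gets a larger weight.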

As shown in fig. 7, in step S73, a ranking reference value of the vocabulary is calculated according to the word frequency and the weight of the vocabulary, and specifically, the ranking reference value of the vocabulary may be the product of the word frequency and the weight of the vocabulary.

As shown in fig. 7, in step S74, the vocabulary is sorted according to the sorting reference value of the vocabulary to obtain the sorted documents of the vocabulary, and fig. 8 is a schematic diagram of the sorted documents of the vocabulary in the embodiment of the present invention, wherein the number behind the vocabulary is the sorting reference value of the vocabulary.
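Steps S72 to S74 can be combined into one short sketch: count the word frequency of each lexicon word, multiply it by the weight, and sort in descending order. The function name, the data layout (token list, lexicon set, files as word sets), the natural logarithm, and the alphabetical tie-break are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List, Set, Tuple


def rank_vocabulary(tokens: List[str], lexicon: Set[str], files: List[Set[str]]) -> List[Tuple[str, float]]:
    """Return (word, sorting reference value) pairs, largest value first."""
    # Word frequency: number of occurrences of each lexicon word in the input text.
    frequency = Counter(token for token in tokens if token in lexicon)
    ranked = []
    for word, count in frequency.items():
        containing = sum(1 for file_words in files if word in file_words)
        # Weight: log(total files / files containing the word); 0 if no file contains it.
        weight = math.log(len(files) / containing) if containing else 0.0
        ranked.append((word, count * weight))
    # Sort by descending reference value; ties are broken alphabetically (assumption).
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked
```

The returned list plays the role of the sorted document of Fig. 8, where each word is followed by its sorting reference value.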

In the embodiment of the present invention, step S33 fills the vocabulary into the color-lead effect picture according to the sorted document of the vocabulary, and the font size of each filled-in vocabulary is determined according to the magnitude of its sorting reference value.
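The text states only that the font size depends on the sorting reference value; it does not give the mapping. One possible choice, shown here purely as an assumption, is a linear interpolation between a minimum and a maximum font size (the function name and the default sizes are hypothetical).

```python
def font_size_for(value: float, min_value: float, max_value: float,
                  min_size: int = 12, max_size: int = 72) -> int:
    """Map a sorting reference value linearly onto a font size range."""
    if max_value == min_value:
        # All words have the same value; use the largest size for all of them.
        return max_size
    ratio = (value - min_value) / (max_value - min_value)
    return round(min_size + ratio * (max_size - min_size))
```

With this mapping, the word with the largest reference value is drawn at `max_size` and the word with the smallest at `min_size`.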

In the prior art, the word frequency alone is often used as the basis for the font size when words are filled into a picture to generate a portrait. In the embodiment of the invention, the product of the word frequency and the word weight is used as the basis for the font size, so that the degree of association between the words and the person in the picture is reflected in multiple dimensions, and the resulting character word cloud portrait is more accurate.

Fig. 9 shows the character word cloud portrait finally obtained in the embodiment of the present invention. As can be seen from fig. 9, because the picture has undergone black-and-white conversion and color-lead processing, the white areas are not filled when the words are filled into the color-lead effect picture, and the resulting character word cloud portrait is therefore attractive and rich in color.
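The rule that white areas are left unfilled can be illustrated with a greedy placement on a coarse grid. This is a simplified sketch, not the layout procedure of the embodiment: it assumes a boolean mask in which True marks a non-white (fillable) cell, assumes each character occupies one cell, ignores font size, and skips words that do not fit. The function names are hypothetical.

```python
from typing import List, Optional, Tuple


def _find_run(free: List[List[bool]], length: int) -> Optional[Tuple[int, int]]:
    """Find the first horizontal run of `length` free cells; return (row, start column)."""
    for r, row in enumerate(free):
        run = 0
        for c, is_free in enumerate(row):
            run = run + 1 if is_free else 0
            if run == length:
                return r, c - length + 1
    return None


def fill_words(mask: List[List[bool]], words: List[str]) -> List[Tuple[str, int, int]]:
    """Place words, in ranked order, only on non-white cells of the mask."""
    free = [row[:] for row in mask]  # copy so the caller's mask is not modified
    placements = []
    for word in words:
        if not word:
            continue
        spot = _find_run(free, len(word))
        if spot is None:
            continue  # no room left for this word
        r, c = spot
        for k in range(len(word)):
            free[r][c + k] = False  # mark the cells as occupied
        placements.append((word, r, c))
    return placements
```

Because the words are processed in ranked order, higher-ranked words get the first choice of position.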

In summary, in the method for generating the character word cloud portrait provided in the embodiment of the present invention, the input picture is processed in a color-lead manner and the white areas are not filled, so that the resulting character word cloud portrait is attractive and rich in color. The input text is processed using the preset lexicon, so that the filled-in words describe the person in the picture more aptly and are more closely associated with that person, which ensures the accuracy of the portrait. The automatic filling process is fast, which shortens the period for generating the character word cloud portrait; the portrait can be produced rapidly in response to changes in the input text and the input picture, so that changes in the data are reflected quickly and in real time.

Fig. 10 is a schematic diagram of an apparatus for generating a character word cloud portrait according to an embodiment of the present invention. As shown in fig. 10, the apparatus 1000 includes: a picture processing module 1010, a text processing module 1020, and a fill-in module 1030.

The picture processing module 1010 is configured to process the input picture in a color-lead manner to obtain a color-lead effect picture; the text processing module 1020 is configured to process the input text by using a preset lexicon and generate a sorted document of vocabularies, wherein the preset lexicon comprises a plurality of vocabularies; the fill-in module 1030 is configured to fill the words into the color-lead effect picture according to the sorted document of the words to obtain a character word cloud portrait.
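The three-module structure of apparatus 1000 can be sketched as a small composition of callables. The class name, field names, and `generate` method are hypothetical; the modules are passed in as plain functions so that any implementation of each step can be plugged in.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class WordCloudPortraitApparatus:
    """Sketch of apparatus 1000: three modules applied in sequence."""
    picture_processing: Callable[[Any], Any]   # role of module 1010
    text_processing: Callable[[str], List]     # role of module 1020
    fill_in: Callable[[Any, List], Any]        # role of module 1030

    def generate(self, picture: Any, text: str) -> Any:
        effect_picture = self.picture_processing(picture)  # color-lead effect picture
        sorted_words = self.text_processing(text)          # sorted document of words
        return self.fill_in(effect_picture, sorted_words)  # character word cloud portrait
```

For example, with stub modules, `WordCloudPortraitApparatus(lambda p: p, str.split, lambda e, w: (e, w)).generate("pic", "a b")` returns the pair of the picture and the word list.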

Fig. 11 is a schematic diagram of a picture processing module according to an embodiment of the present invention, where the picture processing module 1010 includes: a black-white conversion sub-module 1011 and a color leaded sub-module 1012, the black-white conversion sub-module 1011 being configured to convert an input picture into a black-white picture; the color-leaded sub-module 1012 is configured to perform color-leaded processing on the black-and-white picture to obtain a color-leaded effect picture.

Fig. 12 is a schematic diagram of a text processing module in an embodiment of the present invention, where the text processing module 1020 includes: a stop word sub-module 1021, a first calculation sub-module 1022, a second calculation sub-module 1023, and a sorting sub-module 1024.

The stop word sub-module 1021 is configured to remove stop words from the input text. The first calculation sub-module 1022 is configured to calculate the word frequency and the weight of each vocabulary in the preset lexicon by using the preset lexicon, where calculating the weight of a vocabulary specifically includes: first, counting the number of files containing the vocabulary; second, calculating the weight of the vocabulary according to the number of files containing the vocabulary and the total number of files. The second calculation sub-module 1023 is configured to calculate a sorting reference value of the vocabulary according to the word frequency and the weight of the vocabulary; specifically, the sorting reference value is obtained as the product of the word frequency and the weight of the vocabulary. The sorting sub-module 1024 is configured to sort the vocabulary according to the sorting reference value to obtain a sorted document of the vocabulary. In an embodiment of the present invention, the input text is derived from files containing words in the preset lexicon.

In an embodiment of the present invention, the fill-in module 1030 determines the font size of the filled-in vocabulary according to the size of the sorted reference value of the vocabulary.

Fig. 13 is a schematic diagram of another apparatus for generating a character word cloud portrait according to an embodiment of the present invention. As shown in fig. 13, in addition to the picture processing module 1310, the text processing module 1320, and the fill-in module 1330, the apparatus 1300 further includes a format judging module 1340. The format judging module 1340 is configured to judge, before the input picture is processed, whether the format of the input picture meets a preset format requirement. If the format meets the preset format requirement, the input picture goes on to be processed in the color-lead manner; if it does not, a new picture is input again. The preset format requirement comprises PNG and JPG.

For the composition and function of the picture processing module 1310, the text processing module 1320, and the fill-in module 1330 in the apparatus shown in fig. 13, refer to the description of fig. 11 and 12; they are not described again here.

In addition, the functions of each module in the apparatus shown in fig. 10 and 13 refer to the related description in the above method embodiment, and are not described again here.

The device for generating the character word cloud portrait can achieve the same technical effects as the method for generating the character word cloud portrait, which are not repeated here.

In another aspect, the present invention also provides an electronic device, including a processor and a memory, where the memory stores operating instructions that cause the processor to perform the following method:

processing an input picture in a color-lead manner to obtain a color-lead effect picture; processing an input text using a preset lexicon to generate a sorted document of words, wherein the preset lexicon comprises a plurality of words; and filling the words into the color-lead effect picture according to the sorted document of the words to obtain the character word cloud portrait.

Referring now to FIG. 14, shown is a block diagram of a computer system 1400 suitable for use with the electronic device implementing an embodiment of the present invention. The electronic device shown in fig. 14 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.

As shown in fig. 14, the computer system 1400 includes a Central Processing Unit (CPU) 1401, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 1402 or a program loaded from a storage portion 1408 into a Random Access Memory (RAM) 1403. In the RAM 1403, various programs and data necessary for the operation of the system 1400 are also stored. The CPU 1401, ROM 1402, and RAM 1403 are connected to each other via a bus 1404. An input/output (I/O) interface 1405 is also connected to bus 1404.

The following components are connected to the I/O interface 1405: an input portion 1406 including a keyboard, a mouse, and the like; an output portion 1407 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage portion 1408 including a hard disk and the like; and a communication portion 1409 including a network interface card such as a LAN card, a modem, or the like. The communication portion 1409 performs communication processing via a network such as the internet. A drive 1410 is also connected to the I/O interface 1405 as necessary. A removable medium 1411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1410 as necessary, so that a computer program read out therefrom is installed into the storage portion 1408 as necessary.

In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 1409 and/or installed from the removable medium 1411. The above-described functions defined in the system of the present application are executed when the computer program is executed by a Central Processing Unit (CPU) 1401.

It should be noted that the computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor, which may be described, for example, as: a processor comprising a sending unit, an obtaining unit, a determining unit, and a first processing unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the sending unit may also be described as a "unit sending a picture acquisition request to a connected server".

In another aspect, the embodiment of the present invention also provides a computer-readable medium, which may be included in the apparatus described in the above embodiment, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform the following method steps:

It should be clearly understood that the embodiments of the present invention describe how to make and use specific examples, but the principles of the embodiments of the present invention are not limited to any of the details of these examples. Rather, these principles can be applied to many other implementations based on the teachings disclosed in the present examples.
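As one hedged example of the word-ranking steps recited in the claims (removing stop words, computing a weight from the number of files containing a word and the total number of files, taking the product of word frequency and weight as the ranking reference value, and choosing a font size from that value), a minimal sketch follows. The logarithmic weight formula, the linear font-size mapping, the pre-tokenized input, and all function names and default sizes are assumptions for illustration only; the claims do not spell them out.

```python
import math
from collections import Counter


def rank_words(files, lexicon, stop_words):
    """Rank lexicon words found in `files` (a list of token lists).

    Returns (word, ranking_reference_value) pairs, largest value first.
    """
    # Keep only words from the lexicon that are not stop words.
    kept = [[w for w in f if w in lexicon and w not in stop_words] for f in files]
    all_tokens = [w for f in kept for w in f]
    if not all_tokens:
        return []

    counts = Counter(all_tokens)
    total_files = len(kept)
    ranked = []
    for word, count in counts.items():
        files_containing = sum(1 for f in kept if word in f)
        word_frequency = count / len(all_tokens)
        # Assumed weight: inverse document frequency in its plain log form.
        weight = math.log(total_files / files_containing)
        ranked.append((word, word_frequency * weight))

    # Sort by value descending; break ties alphabetically for stable output.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


def font_size(value, max_value, min_size=10, max_size=48):
    """Map a ranking reference value to a font size (assumed linear scale)."""
    if max_value <= 0:
        return min_size
    return round(min_size + (max_size - min_size) * value / max_value)


if __name__ == "__main__":
    files = [
        ["health", "insurance", "the"],
        ["health", "travel"],
        ["health", "insurance", "insurance"],
    ]
    ranked = rank_words(files, {"health", "insurance", "travel"}, {"the"})
    top_value = ranked[0][1]
    for word, value in ranked:
        print(word, round(value, 4), font_size(value, top_value))
```

With this assumed weight, a word that appears in every file receives a weight of zero and therefore the smallest font, which is the usual effect of down-weighting words that do not distinguish one file from another.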

Exemplary embodiments of the present invention are specifically illustrated and described above. It is to be understood that the embodiments of the invention are not limited to the precise arrangements or instrumentalities described herein; on the contrary, the embodiments of the invention are intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims

1. A method for generating a character word cloud portrait, characterized by comprising the following steps:

filling the words into the color lead effect picture according to the ranked document of the words to obtain a character word cloud portrait, which specifically comprises: filling the words into a non-white area of the color lead effect picture according to the ranked document of the words to obtain the character word cloud portrait;

wherein processing the input picture in the color lead mode comprises:

converting the input picture into a black and white picture;

performing color lead processing on the black and white picture to obtain the color lead effect picture, which specifically comprises: performing color lead processing on the areas of the black and white picture other than the white area to obtain the color lead effect picture.

2. The method of claim 1, wherein processing the input picture further comprises:

3. The method of claim 1, wherein processing the input text using the predetermined lexicon to generate the ranked document of the words comprises:

removing stop words from the input text;

4. The method of claim 3, wherein the input text is derived from files containing words in the predetermined lexicon, and wherein computing the weight of a word comprises:

counting the number of files containing the word;

5. The method according to claim 3 or 4, wherein the ranking reference value of a word is the product of the word frequency of the word and the weight of the word.

6. The method of claim 5, wherein filling the words into the color lead effect picture according to the ranked document of the words further comprises:

7. An apparatus for generating a character word cloud portrait, comprising:

a filling module configured to fill the words into the color lead effect picture according to the ranked document of the words to obtain a character word cloud portrait, which specifically comprises: filling the words into a non-white area of the color lead effect picture according to the ranked document of the words to obtain the character word cloud portrait;

wherein the picture processing module comprises:

a color lead sub-module configured to perform color lead processing on the black and white picture to obtain the color lead effect picture, which specifically comprises: performing color lead processing on the areas of the black and white picture other than the white area to obtain the color lead effect picture.

8. The apparatus of claim 7, further comprising:

9. The apparatus of claim 7, wherein the text processing module comprises:

a stop word submodule configured to remove stop words from the input text;

10. The apparatus of claim 9, wherein the input text is derived from files containing words in the predetermined lexicon, and the first computing sub-module is configured to count the number of files containing a word and to compute the weight of the word according to the number of files containing the word and the total number of files.

11. The apparatus according to claim 9 or 10, wherein the second computing sub-module obtains the ranking reference value of a word as the product of the word frequency of the word and the weight of the word.

12. The apparatus of claim 11, wherein the filling module determines the font size used to fill a word according to the magnitude of the ranking reference value of the word.

13. An electronic device, comprising: a memory; a processor; and a computer program stored on the memory and executable on the processor, characterized in that the program, when executed by the processor, implements the method steps of any one of claims 1-6.

14. A computer-readable medium having stored thereon computer-executable instructions, which when executed by a processor, perform the method steps of any one of claims 1-6.