CN109977368B - Text-to-vector diagram method and device - Google Patents


Info

Publication number
CN109977368B
Authority
CN
China
Prior art keywords
vector
word
primitive
sequence
diagram
Prior art date
Legal status
Active
Application number
CN201711472913.3A
Other languages
Chinese (zh)
Other versions
CN109977368A (en)
Inventor
胡庆
何茂强
Current Assignee
Hytera Communications Corp Ltd
Original Assignee
Hytera Communications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Hytera Communications Corp Ltd
Priority to CN201711472913.3A
Publication of CN109977368A
Application granted
Publication of CN109977368B
Status: Active
Anticipated expiration

Classifications

    • G: Physics
    • G06: Computing; calculating or counting
    • G06F: Electric digital data processing
    • G06F 16/56: Information retrieval of still image data having vectorial format
    • G06F 40/106: Display of layout of documents; previewing
    • G06F 40/151: Transformation (use of codes for handling textual entities)
    • G06F 40/284: Lexical analysis, e.g. tokenisation or collocates
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The application discloses a method and a device for converting text into a vector diagram. The method comprises: decomposing the text to be converted into a word sequence; converting each word in the decomposed word sequence into a word vector using an established word vector table, obtaining a word vector sequence; inputting the word vector sequence into a trained neural network model to obtain a vector-diagram vector sequence composed of a plurality of vector-diagram vectors, the neural network model having been trained with a plurality of known word vector sequences as input and a corresponding plurality of known vector-diagram vectors as output; and converting the vector-diagram vector sequence into a vector diagram according to preset requirements. In this way, text can be converted into a vector diagram automatically, and the editing threshold is low.

Description

Text-to-vector diagram method and device
Technical Field
The present disclosure relates to the field of text conversion technologies, and in particular, to a method and an apparatus for converting text into a vector diagram.
Background
Flowcharts are widely used in many fields. A flowchart is a diagram that represents an algorithm with specific graphic symbols plus explanatory text, so that the order of a working process can be described clearly. Many systems, for example office automation systems and plan-handling systems, provide visual flowchart editing tools so that users can conveniently customize their own flows.
In the prior art, flowcharts are usually edited manually with a drawing tool. Some dedicated flowchart-conversion software has also been produced, which mainly works by having the user define text in a strict, fixed format so that the text can be converted into a flowchart.
However, the inventors of the present application found, over a long period of research and development, that the manual editing approach takes a relatively long time and has a high editing threshold, while the dedicated software requires the user to be familiar with the format specification; such formats are not natural language, are not user-friendly enough, and the overall effort is not unlike producing the flowchart with a drawing tool.
Disclosure of Invention
The technical problem mainly solved by the present application is to provide a method and a device for converting text into a vector diagram, which can automatically convert text into a vector diagram, and which is simple and has a low editing threshold.
In order to solve the above technical problem, one technical scheme adopted by the present application is to provide a method for converting text into a vector diagram, the method comprising: decomposing the text to be converted into a word sequence; converting each word in the decomposed word sequence into a word vector using an established word vector table to obtain a word vector sequence; inputting the word vector sequence into a trained neural network model to obtain a vector-diagram vector sequence composed of a plurality of vector-diagram vectors, wherein the neural network model is obtained by training with a plurality of known word vector sequences as input and a corresponding plurality of known vector-diagram vectors as output; and converting the vector-diagram vector sequence into a vector diagram according to preset requirements.
In order to solve the above technical problem, another technical scheme adopted by the present application is to provide a device for converting text into a vector diagram, the device comprising: a processor, a memory, and an input-output device, the processor being coupled to the memory and the input-output device, respectively, wherein the memory is configured to store a program; the input-output device is used for inputting the text to be converted and outputting the vector diagram; and the processor is used for, when running the program, acquiring the text to be converted through the input-output device and decomposing it into a word sequence; converting each word in the decomposed word sequence into a word vector using the established word vector table to obtain a word vector sequence; inputting the word vector sequence into a trained neural network model to obtain a vector-diagram vector sequence composed of a plurality of vector-diagram vectors, wherein the neural network model is obtained by training with a plurality of known word vector sequences as input and a corresponding plurality of known vector-diagram vectors as output; and converting the vector-diagram vector sequence into a vector diagram according to preset requirements.
The beneficial effects of the present application are as follows. Different from the prior art, the text to be converted is decomposed into a word sequence; each word in the decomposed word sequence is converted into a word vector using the established word vector table, obtaining a word vector sequence; the word vector sequence is input into a trained neural network model to obtain a vector-diagram vector sequence composed of a plurality of vector-diagram vectors, the model having been trained with a plurality of known word vector sequences as input and the corresponding known vector-diagram vectors as output; and the vector-diagram vector sequence is converted into a vector diagram according to preset requirements. In this way, text can be converted into a vector diagram automatically, without the user having to master a special editing tool or a specific language; the editing threshold is low, and the method is simple, convenient, and fast.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; a person skilled in the art can obtain other drawings from them without inventive effort. In the drawings:
FIG. 1 is a flow chart of one embodiment of a method of text-to-vector mapping of the present application;
FIG. 2 is a flow chart of another embodiment of a method of text-to-vector mapping of the present application;
FIG. 3 is a schematic diagram of a BP neural network model;
FIG. 4 is another schematic diagram of a BP neural network model;
FIG. 5 is a specific flowchart of a method for converting text into a vector diagram according to the present application;
FIG. 6 is a flow chart of a method of converting text into a vector diagram according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an embodiment of the apparatus for converting text into vector diagrams in the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on these embodiments, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a flowchart of an embodiment of a method for converting text into a vector diagram of the present application, the method comprising:
step S101: the text to be converted is decomposed into a sequence of words.
The text to be converted refers to text that is to be converted into a vector diagram. Such text has a fixed logical structure, so it could also be converted into a vector diagram manually. A vector graphic, also known as an object-oriented or drawing-type image, is a computer image in which the picture is represented by geometric primitives based on mathematical equations, such as points, lines, or polygons. Plain text has a single form of expression: it cannot highlight key points or clearly describe the order of a process. A vector diagram, by contrast, can highlight key points in image form and describe a process clearly. Automatically converting the text to be converted into a vector diagram therefore meets users' needs and facilitates subsequent adoption of the method.
A word sequence is a sequence of words. In general, when the text to be converted is decomposed into a word sequence, different methods and different decomposition tools yield different words in the final sequence. The words in the word sequence may also include the punctuation marks in the text.
For example, the Chinese phrase 中国人民 ("the Chinese people") may be decomposed as 中 / 国 / 人民, as 中国 / 人民, or as 中国人 / 民, among other possibilities.
In one embodiment, the text to be converted is decomposed into a word sequence by a word segmenter, a tool that analyzes a piece of text entered by a user into logical words.
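The idea of a word segmenter can be sketched with a dictionary-based forward-maximum-matching routine. This is an illustrative toy, not the patent's implementation: the vocabulary is a hypothetical two-entry dictionary, and a production system would use a full segmentation tool.

```python
# Toy forward-maximum-matching word segmentation.
# `vocab` is a hypothetical dictionary; real segmenters use large lexicons
# and statistical models.
def segment(text, vocab, max_len=4):
    """Greedily match the longest vocabulary entry at each position;
    fall back to a single character when nothing matches."""
    words = []
    i = 0
    while i < len(text):
        for j in range(min(max_len, len(text) - i), 0, -1):
            cand = text[i:i + j]
            if j == 1 or cand in vocab:
                words.append(cand)
                i += j
                break
    return words

vocab = {"中国", "人民"}           # hypothetical dictionary
print(segment("中国人民", vocab))  # ['中国', '人民']
```

Note that with a dictionary containing 中国人 instead, the same greedy rule would return 中国人 / 民, illustrating why different tools decompose the same text differently.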
Step S102: and changing each word in the decomposed word sequence into a word vector by using the established word vector table to obtain a word vector sequence.
To hand natural language to machine-learning algorithms for processing, the language usually has to be mathematized first, and word vectors are one way of mathematizing the words of a language. One of the simplest word vector approaches is the one-hot representation, which represents a word with a very long vector whose dimension (length) is the size of the word vector table; exactly one component of the vector is 1 and all the others are 0, and the position of the 1 corresponds to the position of the word in the word vector table. The method is simple, but any two words are isolated: nothing in the two vectors shows whether the words are related, so similarity between words cannot be described.
For example, for two synonyms meaning "microphone":
"microphone" is denoted as [0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 …]
"mike" is denoted as [0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 …]
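A minimal sketch of the one-hot construction over a small hypothetical vocabulary makes its weakness visible: any two distinct one-hot vectors have a dot product of zero, so no similarity can be read off them.

```python
# One-hot word vectors over a toy, assumed vocabulary.
def one_hot(word, vocab):
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1  # the single 1 marks the word's table position
    return vec

vocab = ["microphone", "mike", "open", "call"]  # hypothetical word table
a = one_hot("microphone", vocab)
b = one_hot("mike", vocab)
print(b)                                     # [0, 1, 0, 0]
print(sum(x * y for x, y in zip(a, b)))      # 0: synonyms look unrelated
```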
Another word vector approach is the Distributed Representation, which overcomes the shortcomings of the one-hot representation. Its basic idea is to map, by training, each word in a language to a fixed-length short vector ("short" relative to the "long" one-hot vectors). All these vectors together form a word vector space in which each vector is a point; by introducing a "distance" in this space, the lexical and semantic similarity between words can be judged from the distance between them. Such a vector typically looks like: "microphone" [0.792, -0.177, -0.107, 0.109, -0.542, …]. Common dimensionalities are 50 and 100.
A word's vector is not unique; its quality depends on factors such as the training corpus, the training algorithm, and the word vector length. To explain how word vectors are trained, language models must be mentioned. To learn anything from a stretch of unlabeled natural text, statistics such as word frequency, word co-occurrence, and word collocation must be collected, and building a language model from natural text is the task that demands the most of these statistics.
Word vectors are learned by building a language model, without supervision, from a large amount of unlabeled ordinary text (this is the idea the language model was originally based on). With a labeled corpus, more word vector training methods would be available; but at the current corpus scale, methods that do not require labeled corpora are used. The three most classical word vector training methods are C&W 2008, M&H 2008, and Mikolov 2010.
The word vector table is established in advance through a language model for a specific corpus system. Using this table, decomposed words belonging to that corpus can be converted directly into word vectors; once every word in the decomposed word sequence has been converted, the word vector sequence is obtained. In this way, the word vector sequence can be obtained very quickly.
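Once the table exists, step S102 is a plain lookup. The sketch below assumes a hypothetical three-dimensional table and maps unknown words to the zero vector; both choices are illustrative assumptions, not the patent's specification.

```python
# Hypothetical pre-built word vector table (word -> fixed-length dense vector).
word_vector_table = {
    "call": [0.782, -0.623, 0.109],
    "open": [0.682, -0.623, 0.109],
}

def to_word_vectors(words, table, dim=3):
    """Convert a word sequence into a word vector sequence via table lookup.
    Out-of-vocabulary words map to the zero vector in this sketch."""
    return [table.get(w, [0.0] * dim) for w in words]

print(to_word_vectors(["call", "open", "foo"], word_vector_table))
```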
Step S103: the word vector sequence is input into a trained neural network model, a vector diagram vector sequence composed of a plurality of vector diagram vectors is obtained, and the neural network model is obtained by training with a plurality of known word vector sequences as input and a corresponding plurality of known vector diagram vectors as output.
A Neural Network (NN) is an artificial system with intelligent information-processing functions such as learning, association, memory, and pattern recognition, built by modeling and interconnecting neurons, the basic units of the human brain, in order to explore models that simulate the human brain's nervous system. An important feature of a neural network is that it can learn from its environment and store the learned results distributed across the synaptic connections of the network. Learning is a process: under the stimulation of the environment, sample patterns are input to the network one after another, and the weight matrices of the layers are adjusted according to a certain rule (the learning algorithm); learning ends when the weights of every layer converge to certain values.
A neural network model is a mathematical model built on the mathematical model of the neuron, and is described by its network topology, node characteristics, and learning rules. Neural network models have the following notable advantages: first, parallel distributed processing; second, high robustness and fault tolerance; third, distributed storage and learning ability; fourth, the ability to approximate complex nonlinear relationships well. Dozens of neural network models exist today; typical ones include the BP (Back Propagation) neural network, the Hopfield network, the ART network, and the Kohonen network.
Within a program, a vector diagram is implemented as structured data; to be able to use a neural network model, the vector diagram needs to be represented by a sequence of vectors. The neural network model can then be used for the purpose of automatically converting text into a vector diagram.
An untrained neural network must be trained before it can work properly. A trained neural network model is obtained in advance by training an untrained network with a plurality of known word vector sequences as input and the corresponding known vector-diagram vectors as output. The trained neural network model can be reused; after a period of use, it can be trained again to obtain a better model that fits actual needs more closely. Training may employ unsupervised learning, supervised learning, semi-supervised learning, reinforcement learning, or another mode.
Referring to fig. 2, in an embodiment, if the neural network has not yet been trained, the following may be performed before step S103:
Step S105: a supervised learning mode is employed, training with a plurality of known, labeled word vector sequences as input and the corresponding known vector-diagram vectors as output, to obtain the neural network model.
Supervised learning (Supervised Learning) trains the network using examples with known correct answers. The process comprises creation and classification of the data set, training, validation, and use. Supervised learning requires a large amount of labeled data (i.e., labeled word vector sequences): input data whose correct results are known. In this embodiment, a program can be used to automatically generate a large number of flowcharts and then convert those flowcharts into text, thereby obtaining a large amount of labeled data.
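The data-generation step can be sketched as follows: programmatically build simple flowchart vectors, then render them to text templates, yielding (text, flowchart-vector) training pairs. All numbers, templates, and field layouts here are illustrative assumptions, not the patent's data.

```python
import random

# Hypothetical operation and object numbering, loosely modeled on the
# example tables given later in the description.
OPS = {10034: "call {}", 10035: "open the checkpoint near {}"}
OBJS = {50007: "the duty leader", 50013: "the place of occurrence"}

def make_pair(rng):
    """Generate one labeled (text, flowchart-vector) training pair."""
    op = rng.choice(list(OPS))
    obj = rng.choice(list(OBJS))
    vector = [1, 0, op, obj, 0, 0]       # [serial, parent serial, op id, params...]
    text = OPS[op].format(OBJS[obj])     # render the primitive to natural text
    return text, vector

rng = random.Random(0)                   # seeded for reproducibility
text, vec = make_pair(rng)
print(text, vec)
```

Pairs generated this way supply both sides of the supervised objective: the text (after segmentation and word vector lookup) is the input, and the flowchart vector is the known correct output.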
Referring to fig. 3 and 4, in one embodiment the trained model is a BP neural network; its mathematical derivation is described below.
The training algorithm is essentially the problem of finding the minimum of an error function. The algorithm adopts the steepest-descent method of nonlinear programming, modifying the weight coefficients along the negative gradient direction of the error function.
Assume the neural network of this embodiment has m layers, with input X at the input layer. Let the sum of inputs of neuron i in layer k be U_i^k and its output be Y_i^k; let the weight coefficient from neuron j of layer k-1 to neuron i of layer k be W_ij; and let the excitation function of every neuron be f. The variables are then related by the following expressions:

U_i^k = Σ_j W_ij · Y_j^{k-1}    (1-1)

Y_i^k = f(U_i^k)    (1-2)
First define an error function e, taking half the sum of squares of the differences between the expected outputs and the actual outputs:

e = (1/2) · Σ_i (Ŷ_i - Y_i^m)²    (1-3)

where Ŷ_i is the expected value of output unit i, i.e. the training signal, and Y_i^m is the actual output of layer m. The essence of training is to modify the weight coefficients along the negative gradient direction of the error function e, so:

ΔW_ij = -η · ∂e/∂W_ij

where η is the learning rate, i.e. the step size.
According to the chain rule:

∂e/∂W_ij = (∂e/∂U_i^k) · (∂U_i^k/∂W_ij)

where, from formula (1-1), U_i^k = Σ_j W_ij · Y_j^{k-1}, so that:

∂U_i^k/∂W_ij = Y_j^{k-1}

thereby:

∂e/∂W_ij = (∂e/∂U_i^k) · Y_j^{k-1}

Thus:

ΔW_ij = -η · (∂e/∂U_i^k) · Y_j^{k-1}

Defining:

d_i^k = ∂e/∂U_i^k    (1-9)

then:

ΔW_ij = -η · d_i^k · Y_j^{k-1}

where η is the learning rate, taking a value between 0 and 1. Expanding d_i^k by the chain rule again:

d_i^k = ∂e/∂U_i^k = (∂e/∂Y_i^k) · (∂Y_i^k/∂U_i^k)

Since Y_i^k = f(U_i^k), it follows that ∂Y_i^k/∂U_i^k = f'(U_i^k). To facilitate derivation, f is taken as a continuous function, typically a nonlinear continuous function such as the Sigmoid function. When f is taken as the asymmetric Sigmoid function

f(U) = 1 / (1 + e^{-U})

then f'(U) = f(U) · (1 - f(U)), that is:

∂Y_i^k/∂U_i^k = Y_i^k · (1 - Y_i^k)

so that:

d_i^k = (∂e/∂Y_i^k) · Y_i^k · (1 - Y_i^k)    (1-11)

Consider the term ∂e/∂Y_i^k in formula (1-11). There are two cases:

1) If k = m, this is the output layer; Ŷ_i is the expected output value, a constant. From formula (1-3):

∂e/∂Y_i^m = Y_i^m - Ŷ_i

thereby:

d_i^m = Y_i^m · (1 - Y_i^m) · (Y_i^m - Ŷ_i)

2) If k < m, this layer is a hidden layer, and the layer above it must be considered, so:

∂e/∂Y_i^k = Σ_l (∂e/∂U_l^{k+1}) · (∂U_l^{k+1}/∂Y_i^k)

From formula (1-9), ∂e/∂U_l^{k+1} = d_l^{k+1}; from formula (1-1), ∂U_l^{k+1}/∂Y_i^k = W_li. Thus:

∂e/∂Y_i^k = Σ_l W_li · d_l^{k+1}

Finally:

d_i^k = Y_i^k · (1 - Y_i^k) · Σ_l W_li · d_l^{k+1}

From the above procedure, the training method of a multi-layer network is: apply a sample to the input layer and, following the forward-propagation rules U_i^k = Σ_j W_ij · Y_j^{k-1} and Y_i^k = f(U_i^k), propagate layer by layer until the output Y_i^m is finally obtained at the output layer. Compare Y_i^m with the expected output Ŷ_i; if the error between them is larger than the set value, back-propagate and modify the weight coefficients according to:

ΔW_ij = -η · d_i^k · Y_j^{k-1}    (1-22)

where:

d_i^m = Y_i^m · (1 - Y_i^m) · (Y_i^m - Ŷ_i)    (output layer, k = m)

d_i^k = Y_i^k · (1 - Y_i^k) · Σ_l W_li · d_l^{k+1}    (hidden layers, k < m)

In these formulas, computing d_i^k for a given layer uses d_l^{k+1} of the layer one level above it. Taking the derivative of the error function is therefore a back-propagation process from the output layer to the input layer, in which the error is found recursively, layer by layer.
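The Sigmoid property relied on in this derivation, namely that the derivative of f(u) = 1/(1 + e^{-u}) equals f(u)·(1 - f(u)), can be sanity-checked numerically with a central difference; this standalone sketch is only a verification aid, not part of the patent's method.

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

u = 0.3
y = sigmoid(u)
eps = 1e-5
# Central-difference estimate of f'(u).
numeric = (sigmoid(u + eps) - sigmoid(u - eps)) / (2 * eps)
print(abs(numeric - y * (1 - y)) < 1e-8)  # True
```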
Through repeated training on many samples, the weight coefficients are corrected in the direction of gradually decreasing error, so that the error is finally eliminated. It can also be seen from the above formulas that if the network has many layers, the amount of computation is considerable, so the convergence speed is not fast. To accelerate convergence, the previous weight correction is generally taken into account as one of the bases of the current correction, giving the correction formula:

ΔW_ij(t+1) = -η · d_i^k · Y_j^{k-1} + α · ΔW_ij(t)    (1-25)

where η is the learning rate, taking a value of about 0.1 to 0.4, and α is the weight-correction constant, taking a value of about 0.7 to 0.9.
For a neural network without hidden layers, this reduces to:

ΔW_ij = η · (Ŷ_j - Y_j) · Y_i

where Ŷ_j is the desired output, Y_j is the actual output, and Y_i is the output of the input layer.
When the back-propagation algorithm is applied to a feed-forward multilayer network with the Sigmoid excitation function, the weight coefficients W_ij of the network can be computed recursively by the following steps. Assume each layer has n neurons, i.e. i = 1, 2, …, n and j = 1, 2, …, n. For neuron i of layer k there are n weight coefficients W_i1, W_i2, …, W_in, plus an additional W_{i,n+1} representing the bias θ_i; accordingly, when a sample X is input, take X = (X_1, X_2, …, X_n, 1).
The execution steps of the BP network model are as follows:
1) Set initial values of the weight coefficients W_ij: assign each layer's W_ij a small non-zero random number, with W_{i,n+1} = -θ.
2) Input one sample X = (X_1, X_2, …, X_n, 1) and its corresponding desired output Y = (Y_1, Y_2, …, Y_n).
3) Compute the output of each layer. For the output Y_i^k of neuron i in layer k:

U_i^k = Σ_j W_ij · Y_j^{k-1}

Y_i^k = f(U_i^k)
4) Compute the learning error d_i^k of each layer. For the output layer, k = m, so:

d_i^m = Y_i^m · (1 - Y_i^m) · (Y_i^m - Ŷ_i)

For the other layers:

d_i^k = Y_i^k · (1 - Y_i^k) · Σ_l W_li · d_l^{k+1}
5) Correct the weight coefficients W_ij and the bias θ, using formula (1-22):

ΔW_ij(t+1) = -η · d_i^k · Y_j^{k-1}

or formula (1-25) with the momentum term:

ΔW_ij(t+1) = -η · d_i^k · Y_j^{k-1} + α · ΔW_ij(t)

where:

W_ij(t+1) = W_ij(t) + ΔW_ij(t+1)

6) After the weight coefficients of every layer have been obtained, judge whether they meet the requirement according to the given quality index. If the requirement is met, the algorithm ends; if not, return to step 3).
This learning process is carried out for every given sample X_p = (X_p1, X_p2, …, X_pn, 1) and desired output Y_p = (Y_p1, Y_p2, …, Y_pn) until all input-output pairs meet the requirement.
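The execution steps above can be sketched as a minimal runnable BP training loop with one hidden layer and Sigmoid activation. The layer sizes, sample, target, and learning rate below are illustrative assumptions, not values from the patent.

```python
import math, random

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

class TinyBP:
    def __init__(self, n_in, n_hid, n_out, eta=0.3, rng=None):
        rng = rng or random.Random(0)
        # Step 1): small non-zero random initial weights; last column is the bias.
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
        self.W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]
        self.eta = eta

    def forward(self, x):
        x = list(x) + [1.0]  # step 2): append the fixed 1 for the bias input
        self.h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in self.W1]
        hb = self.h + [1.0]
        self.y = [sigmoid(sum(w * hi for w, hi in zip(row, hb))) for row in self.W2]
        self.x, self.hb = x, hb
        return self.y

    def train_step(self, x, target):
        y = self.forward(x)  # step 3): compute each layer's output
        # Step 4): learning errors d for the output layer and the hidden layer.
        d_out = [yi * (1 - yi) * (yi - ti) for yi, ti in zip(y, target)]
        d_hid = [hi * (1 - hi) * sum(self.W2[l][i] * d_out[l] for l in range(len(d_out)))
                 for i, hi in enumerate(self.h)]
        # Step 5): correct weights along the negative gradient.
        for l, row in enumerate(self.W2):
            for i in range(len(row)):
                row[i] -= self.eta * d_out[l] * self.hb[i]
        for i, row in enumerate(self.W1):
            for j in range(len(row)):
                row[j] -= self.eta * d_hid[i] * self.x[j]
        return sum((ti - yi) ** 2 for ti, yi in zip(target, y)) / 2  # error e

net = TinyBP(2, 3, 1)
sample, target = [0.9, 0.1], [0.8]
e_first = net.train_step(sample, target)
for _ in range(200):  # step 6): repeat until the quality index is met
    e_last = net.train_step(sample, target)
print(e_last < e_first)  # the error decreases on this sample
```

The momentum term of formula (1-25) is omitted here for brevity; it would add α times the previous weight change to each update.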
Step S104: the vector-diagram vector sequence is converted into a vector diagram according to the preset requirements.
In one embodiment, the vector diagram is a flowchart composed of a plurality of primitives with up-down connection relations and/or left-right connection relations, and each vector-diagram vector comprises a primitive serial number, a parent-primitive serial number, a primitive unique identifier, and parameters 1, 2, …, n.
Further, step S104 may specifically include: determining, from the primitive serial number in each vector-diagram vector, in which layer of the up-down relation the primitive lies; determining, from the parent-primitive serial number, the upper-layer primitive to which it is directly connected; determining, from the primitive unique identifier, which primitive of the left-right relation it is; and determining the specific content of the primitive from parameters 1, 2, …, n.
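Decoding such a vector sequence back into a primitive hierarchy can be sketched as follows, assuming the layout [primitive serial number, parent serial number, primitive id, parameters…] with an all-zero vector meaning "empty"; the sample vectors are illustrative.

```python
def decode(vectors):
    """Index primitive vectors by serial number and link each to its parent."""
    primitives = {}
    for v in vectors:
        if not any(v):  # all-zero vector: no primitive
            continue
        serial, parent, prim_id, *params = v
        primitives[serial] = {"id": prim_id, "parent": parent,
                              "params": params, "children": []}
    for serial, p in primitives.items():
        if p["parent"] in primitives:  # parent 0 means "no parent" (top layer)
            primitives[p["parent"]]["children"].append(serial)
    return primitives

prims = decode([[1, 0, 10034, 50007, 90024],
                [2, 1, 10035, 50013, 5],
                [0, 0, 0, 0, 0]])
print(prims[1]["children"])  # [2]
```

A renderer would then walk this structure top-down (layers) and left-right (siblings) to lay out and draw the flowchart primitives.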
Referring to fig. 5, fig. 5 is a specific flowchart composed of a series of primitives with up, down, left, and right relations. Each vector-diagram vector is the vector of one primitive.
The vectorized representation of a primitive is: [primitive serial number, parent-primitive serial number, primitive id, parameter, …]. The all-zero vector [0, 0, …] represents empty.
Each operation has a number, for example: call 10034, open video checkpoint 10035, open train-ticket monitoring 10056, if 20001, end 1000, condition start 20002, condition end 20003.
Each operation object also has a number, for example: sub-bureau duty leader 50007, place of occurrence 50013, patrol police 50018, whole city 80001, report the situation 90024, return to handle the case 40035, possible escape 60019.
The primitive ordering rule of the flowchart is from top to bottom and from left to right; the vector sequence of the above flowchart is accordingly:
y1:[1,0,10034,50007,90024,0,0,…]
y2:[2,1,10035,50013,5,0,0,…]
y3:[3,2,10034,50018,50013,3,0,0,…]
y4:[4,3,20001,0,0,…]
y5:[5,4,1000,20002,40035,20003,…]
y6:[6,4,10056,80001,20002,60019,20003,…]
the method of the present application is described below with a specific example.
The first step: after the text to be converted passes through the word segmenter, the segmentation result is:
Call
City bureau
Duty leader
Then
Open
Place of occurrence
5
Kilometers
Checkpoint
The second step: the words are converted into word vectors according to the word vector table, forming a word vector sequence; the results are respectively:
[0.782,-0.623,0.109,…]
[0.682,-0.623,0.109,…]
[0.732,-0.523,0.109,…]
[0.7,-0.6,0.1,…]
[0.452,-0.603,0.149,…]
[0.982,0.673,0.109,…]
[-0.782,0.623,0.179,…]
[0.282,-0.223,0.309,…]
[0.902,-0.673,0.109,…]
[0.782,0.623,0.109,…]
The third step: the word vector sequence enters the trained neural network model as input, and the flowchart vector sequence is output according to the flowchart-vector convention; the results are respectively:
[1,0,10034,50007,90024,0,0,…]
[2,1,10035,50013,5,0,0,…]
[0,0,0,0,0,…]
......
[0,0,0,0,0,…]
The fourth step: the flowchart vector sequence is converted into a flowchart according to the flowchart-vector convention, as shown in FIG. 6.
In summary, this embodiment decomposes the text to be converted into a word sequence; converts each word in the word sequence into a word vector using the established word vector table, yielding a word vector sequence; inputs the word vector sequence into a trained neural network model to obtain a vector diagram vector sequence composed of a plurality of vector diagram vectors, the neural network model having been trained with a plurality of known word vector sequences as input and the corresponding known vector diagram vectors as output; and converts the vector diagram vector sequence into a vector diagram according to preset requirements. In this way, text is converted into a vector diagram automatically: the user does not need to master a special editing tool or a specific language, the editing threshold is low, and the method is simple, convenient, and fast.
Referring to FIG. 7, FIG. 7 is a schematic structural diagram of an embodiment of a device for converting text into a vector diagram. The device of this embodiment may perform the steps of the method above; for details, refer to the method section, which are not repeated here.
The device comprises: processor 1, memory 2, and input-output device 3, processor 1 being coupled to memory 2 and input-output device 3, respectively.
The memory 2 is used to store a program; the input-output device 3 is used to input the text to be converted and to output the vector diagram. When running the program, the processor 1 is used to: acquire the text to be converted through the input-output device 3 and decompose it into a word sequence; convert each word in the word sequence into a word vector using the established word vector table, yielding a word vector sequence; input the word vector sequence into a trained neural network model to obtain a vector diagram vector sequence composed of a plurality of vector diagram vectors, the neural network model having been trained with a plurality of known word vector sequences as input and the corresponding known vector diagram vectors as output; and convert the vector diagram vector sequence into a vector diagram according to preset requirements.
When running the program, the processor 1 is further adapted to adopt a supervised learning mode, training with a plurality of known, labeled word vector sequences as inputs and the corresponding known vector diagram vectors as outputs to obtain the neural network model.
Wherein the processor 1 is further adapted to decompose the text to be converted into a word sequence by means of a word segmentation unit when running the program.
The vector diagram is a flow chart composed of a plurality of primitives having up-down connection relationships and/or left-right connection relationships, and each vector diagram vector comprises a primitive serial number, a parent primitive serial number, a primitive unique identifier, parameter 1, parameter 2, …, and parameter n.
When running the program, the processor 1 is further configured to determine, from the primitive serial number in the vector diagram vector, which layer the primitive occupies in the up-down positional relationship; determine, from the parent primitive serial number, the upper-layer primitive directly connected to the primitive; determine, from the primitive unique identifier, the primitive's position in the left-right positional relationship; and determine the specific content of the primitive from parameter 1, parameter 2, …, and parameter n.
According to the method and device of the present application, the text to be converted is decomposed into a word sequence; each word in the word sequence is converted into a word vector using the established word vector table, yielding a word vector sequence; the word vector sequence is input into a trained neural network model to obtain a vector diagram vector sequence composed of a plurality of vector diagram vectors, the neural network model having been trained with a plurality of known word vector sequences as input and the corresponding known vector diagram vectors as output; and the vector diagram vector sequence is converted into a vector diagram according to preset requirements. In this way, text is converted into a vector diagram automatically: the user does not need to master a special editing tool or a specific language, the editing threshold is low, and the method is simple, convenient, and fast.
The foregoing description covers only embodiments of the present application and is not intended to limit its patent scope; all equivalent structures or equivalent processes made using the description of the present application, or applied directly or indirectly in other related technical fields, are likewise included within the patent protection scope of the present application.

Claims (6)

1. A method of converting text into a vector diagram, the method comprising:
decomposing the text to be converted into word sequences;
each word in the decomposed word sequence is changed into a word vector by using the established word vector table, and a word vector sequence is obtained;
inputting the word vector sequence into a trained neural network model to obtain a vector diagram vector sequence composed of a plurality of vector diagram vectors, wherein the neural network model is obtained by training a plurality of known word vector sequences serving as input and a corresponding plurality of known vector diagram vectors serving as output;
converting the vector sequence of the vector diagram into a vector diagram according to preset requirements;
the vector diagram is a flow diagram, and the flow diagram consists of a plurality of primitives with an up-down connection relationship and/or a left-right connection relationship, wherein the vector diagram vector comprises a primitive serial number, a parent primitive serial number, a primitive unique identifier, parameter 1, parameter 2, …, and parameter n;
the converting the vector sequence of the vector diagram into the vector diagram according to the preset requirement comprises the following steps:
determining, from the primitive serial number in the vector diagram vector, which layer the primitive occupies in the up-down positional relationship; determining, from the parent primitive serial number in the vector diagram vector, the upper-layer primitive directly connected to the primitive; determining, from the primitive unique identifier in the vector diagram vector, the primitive's position in the left-right positional relationship; and determining the specific content of the primitive from parameter 1, parameter 2, …, and parameter n.
2. The method according to claim 1, wherein the method further comprises:
the neural network model is obtained by adopting a supervised learning mode, training with a plurality of known, labeled word vector sequences as inputs and a corresponding plurality of known vector diagram vectors as outputs.
3. The method of claim 1, wherein the decomposing the text to be converted into word sequences comprises:
the text to be converted is decomposed into word sequences by a word segmentation device.
4. An apparatus for converting text into a vector image, the apparatus comprising: a processor, a memory, and an input-output device, the processor being coupled to the memory, the input-output device, respectively, wherein,
the memory is used for storing programs;
the input-output device is used for inputting the text to be converted and outputting a vector diagram;
the processor is used for acquiring the text to be converted through the input and output device when the program is run, and decomposing the text to be converted into word sequences; each word in the decomposed word sequence is changed into a word vector by using the established word vector table, and a word vector sequence is obtained; inputting the word vector sequence into a trained neural network model to obtain a vector diagram vector sequence composed of a plurality of vector diagram vectors, wherein the neural network model is obtained by training a plurality of known word vector sequences serving as input and a corresponding plurality of known vector diagram vectors serving as output; converting the vector sequence of the vector diagram into a vector diagram according to preset requirements;
the vector diagram is a flow diagram, and the flow diagram consists of a plurality of primitives with an up-down connection relationship and/or a left-right connection relationship, wherein the vector diagram vector comprises a primitive serial number, a parent primitive serial number, a primitive unique identifier, parameter 1, parameter 2, …, and parameter n;
the processor is further configured to, when running the program, determine, from the primitive serial number in the vector diagram vector, which layer the primitive occupies in the up-down positional relationship; determine, from the parent primitive serial number in the vector diagram vector, the upper-layer primitive directly connected to the primitive; determine, from the primitive unique identifier in the vector diagram vector, the primitive's position in the left-right positional relationship; and determine the specific content of the primitive from parameter 1, parameter 2, …, and parameter n.
5. The apparatus of claim 4, wherein the processor, when executing the program, is further configured to take a supervised learning mode and to train with a plurality of known, labeled word vector sequences as inputs and a corresponding plurality of known vector graphics vectors as outputs to obtain the neural network model.
6. The apparatus of claim 4, wherein the processor, when executing the program, is further configured to decompose text to be converted into a word sequence by a word segmentation unit.
CN201711472913.3A 2017-12-28 2017-12-28 Text-to-vector diagram method and device Active CN109977368B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711472913.3A CN109977368B (en) 2017-12-28 2017-12-28 Text-to-vector diagram method and device

Publications (2)

Publication Number Publication Date
CN109977368A (en) 2019-07-05
CN109977368B (en) 2023-06-16

Family

ID=67075712

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711472913.3A Active CN109977368B (en) 2017-12-28 2017-12-28 Text-to-vector diagram method and device

Country Status (1)

Country Link
CN (1) CN109977368B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106033403A (en) * 2015-03-20 2016-10-19 广州金山移动科技有限公司 Text transition method and device
CN107221019A (en) * 2017-03-07 2017-09-29 武汉唯理科技有限公司 Chart conversion method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102167719B1 (en) * 2014-12-08 2020-10-19 삼성전자주식회사 Method and apparatus for training language model, method and apparatus for recognizing speech




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant