CN110765997B - Interactive reading realization method

Interactive reading realization method

Info

Publication number
CN110765997B
CN110765997B
Authority
CN
China
Prior art keywords
image
feature
characteristic
page
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911097147.6A
Other languages
Chinese (zh)
Other versions
CN110765997A (en)
Inventor
江周平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Anxin Zhitong Technology Co ltd
Original Assignee
Shenzhen Yikuai Interactive Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yikuai Interactive Network Technology Co ltd
Priority to CN201911097147.6A
Publication of CN110765997A
Application granted
Publication of CN110765997B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/062 Combinations of audio and printed presentations, e.g. magnetically striped cards, talking books, magnetic tapes with printed texts thereon

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an interactive reading implementation method comprising the following steps: building a cover-page feature library and a content-page feature library; pre-building a multimedia content library in which specific regions of the printed matter's original pages correspond to specific multimedia files; using the image acquisition assembly of a hardware reading pen to capture a partial image of the cover page, and using the processor to match feature points extracted from that image against the cover-page feature library to identify the printed matter; then using the image acquisition assembly to capture a region image of a region of interest on a content page, extracting its feature points, matching them against the content-page feature library, deriving the position of the region of interest from the matching result, retrieving the multimedia file preset for that position, and playing it. This achieves the practical effect of "read wherever you point the pen". The invention requires no codes to be pre-printed on books, escapes the limitations of pre-printed codes, and still ensures accurate content playback.

Description

Interactive reading realization method
Technical Field
The invention relates to the technical field of multimedia education, in particular to an interactive reading realization method.
Background
Click-to-read is an intelligent reading and learning mode built on optical image recognition and digital voice technology; it embodies the integration of electronic multimedia technology with the education industry and reflects a human-oriented approach to technology.
Existing point-and-read devices usually require the book to be pre-processed: specific codes must be printed or pasted onto its pages, otherwise the book contents cannot be identified. Moreover, the coding rules limit the total number of distinct codes, so code-based reading shows obvious limitations for books with large amounts of content.
Disclosure of Invention
The invention aims to provide an interactive reading implementation method that requires no pre-printed codes on books and escapes the content limitations imposed by coding.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
the utility model provides an interactive reading realization method, its is realized based on hardware reading pen, the reading pen includes pen main part, inductive switch subassembly, illumination suggestion subassembly and image acquisition subassembly, be equipped with treater and memory in the pen main part, inductive switch subassembly is located on the pen main part and be located the pen body position, the image acquisition subassembly is located on the pen main part and be located the nib position, illumination suggestion subassembly is located on the pen main part and be located near the image acquisition subassembly, inductive switch subassembly, image acquisition subassembly, illumination suggestion subassembly and memory are connected respectively the treater, this method includes the following steps:
S1, extracting feature points of the cover page and content pages of a printed matter in advance to obtain a cover-page feature library and a content-page feature library, and storing both libraries in the memory;
S2, pre-building, for the original pages of the printed matter, a multimedia content library in which specific regions correspond to specific multimedia files;
S3, illuminating the cover page of the printed matter with the reading pen, capturing a partial image of the cover page with the image acquisition assembly, extracting feature points of the partial image with the processor, and matching the extracted feature points against the cover-page feature library to identify the printed matter;
S4, illuminating the page-number position of a content page with the reading pen, capturing a page image with the image acquisition assembly, and performing OCR on the digits in the page image with the processor to obtain the page number; this narrows the search range for the region of interest within the content-page feature library, improving both search speed and search precision;
S5, illuminating a region of interest on a content page with the reading pen; the operator's finger on the sensing switch assembly triggers the image acquisition assembly to capture a region image of the region of interest; the processor extracts feature points from the region image, matches them against the content-page feature library, and derives the position of the region of interest from the matching result; step S4 may be skipped if a clear and reliable matching result can be obtained without narrowing the search range through S4;
S6, based on the position information from S5, retrieving the multimedia file preset at the corresponding position in S2 and playing it.
Further, the feature point extraction in steps S1, S3 and S5 is performed as follows:
extract feature points that are scale-invariant, rotation-invariant, robust to brightness variation, and stable under noise and viewing-angle changes;
describe the feature points to obtain feature descriptors.
Preferably, if the computing power of the pen's built-in processor is sufficient, the key-point detection algorithm extracts feature points as follows:
continuously downsample the original image to obtain a series of images of different sizes; apply Gaussian filtering to the images at the different scales; subtract the two Gaussian-filtered versions of the same image at adjacent scales to obtain a difference-of-Gaussians image; and perform extremum detection, where the extremum points satisfying a curvature condition are the feature points.
If the computing power of the pen's built-in processor is limited, the key-point detection algorithm extracts feature points as follows:
Step one: select a point P in the image and draw a circle of radius 3 pixels centred on P. If the gray values of n consecutive pixels on the circle are all greater or all smaller than the gray value of P, P is regarded as a feature point; typically n is set to 12. To speed up extraction and quickly discard non-feature points, first test the gray values at circle positions 1, 9, 5 and 13: if P is a feature point, at least 3 of these four pixel values must all be greater or all be smaller than the gray value of P; otherwise the point is discarded immediately.
Step two: train a decision tree with the ID3 algorithm, feed the 16 pixels on the circle around each candidate point into the tree, and screen out the best feature points.
Step three: apply non-maximum suppression to remove locally clustered feature points.
Step four: set a scale factor and the number of pyramid layers; shrink the original image by the scale factor into a sequence of images, and take the union of the feature points extracted at the different scales as the image's feature points, achieving scale invariance.
Step five: use image moments to compute the intensity centroid of each feature point within a radius r; the vector from the feature point's coordinates to the centroid gives the feature point's direction, achieving rotation invariance.
When the computing power of the pen's built-in processor cannot meet the computational demands of feature point extraction, the extraction is performed by a wirelessly connected external processor.
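To make this tiered strategy concrete, here is a minimal sketch assuming OpenCV's SIFT and ORB as stand-ins for the DoG-based and FAST-based detectors described above; the MIPS estimate, thresholds and offload callback are illustrative assumptions rather than the patent's exact implementation.

```python
# Sketch: pick a detector tier from the available computing power.
# cv2.SIFT_create and cv2.ORB_create stand in for the DoG-based and
# FAST/ORB-style algorithms described above; thresholds follow the
# embodiment's illustrative 80/200 MIPS values.
import cv2

MIPS_THRESHOLD_1 = 80    # below this: offload to an external processor
MIPS_THRESHOLD_2 = 200   # at or above this: full DoG-style detection

def extract_features(gray_image, estimated_mips, offload=None):
    if estimated_mips >= MIPS_THRESHOLD_2:
        detector = cv2.SIFT_create()               # DoG pyramid + extrema
    elif estimated_mips >= MIPS_THRESHOLD_1:
        detector = cv2.ORB_create(nfeatures=500)   # FAST corners + rBRIEF
    else:
        # Insufficient computing power: hand the raw image to a wirelessly
        # connected external processor (offload is a hypothetical callback).
        return offload(gray_image)
    keypoints, descriptors = detector.detectAndCompute(gray_image, None)
    return keypoints, descriptors
```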
Preferably, the step S1 specifically includes the following substeps:
S11, for the cover page of the printed matter, extract feature points from the cover-page image, hash-transform and sort the feature descriptors, and store them in the cover-page feature library;
S12, for the content pages of the printed matter, first divide each content-page image into a group of image blocks (division methods include, but are not limited to, uniform division and selected-region division), then extract feature points from the image blocks, and finally hash-transform and sort the feature descriptors and store them in the content-page feature library.
Preferably, the matching of the extracted feature points with the cover page feature library in the step S3 is specifically implemented by the following method:
hash-transform and sort the feature descriptors corresponding to the feature points extracted from the partial image, and compare each hash value with the hash values of the feature points stored in the cover-page feature library; if the distance is smaller than a preset first threshold, the pair of feature points is considered matched;
count the number of matched feature points; if this number is greater than a preset second threshold, the partial image is considered to match the corresponding cover-page image.
Preferably, the matching of the extracted feature points with the content page feature library in the step S5 is specifically implemented by the following method:
hash-transform and sort the feature descriptors corresponding to the feature points extracted from the region image, and compare each hash value with the hash values of the feature points stored in the content-page feature library; if the distance is smaller than a preset first threshold, the pair of feature points is considered matched;
count the number of matched feature points; if this number is greater than a preset second threshold, the region image is considered to match the corresponding image block.
Preferably, the hash transformation uses a locality-sensitive hash function to map each multidimensional feature to a single value, such that point pairs far apart in the multidimensional space map to values that differ greatly, while point pairs close together map to values that differ little.
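As an illustration of such a mapping, the following is a minimal random-projection sketch (a standard locality-sensitive family for Euclidean distance); the descriptor dimension, seed and bucket width are assumptions for the example.

```python
# Sketch: map a multidimensional descriptor to a single value so that
# nearby descriptors receive nearby values with high probability.
import numpy as np

DIM = 20                         # descriptor dimension after PCA reduction
rng = np.random.default_rng(seed=42)
a = rng.normal(size=DIM)         # random projection direction
b = rng.uniform(0.0, 4.0)        # random offset
W = 4.0                          # quantisation (bucket) width

def lsh_value(descriptor):
    """Return one scalar hash value for a DIM-dimensional descriptor."""
    return float(np.floor((a @ descriptor + b) / W))
```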
In this technical scheme, there is no direct physical contact between the reading pen and the printed matter during use, and the distance between them is variable. The illumination prompt assembly projects an indication spot visible to the human eye onto the printed matter, guiding the user in selecting the region of interest.
Compared with the background art, the adopted technical scheme gives the invention the following advantages:
1. The invention identifies the content region of interest by extracting and matching image feature points; no codes need to be pre-printed on books, escaping the content limitations imposed by coding.
2. The invention identifies the cover page, the page number and the region of interest separately, realizing a book - page number - content position query pattern with a small data-processing load and high processing efficiency during identification and matching.
3. After feature point extraction, the invention applies hash transformation and sorting, reducing the data volume and improving the efficiency of the subsequent identification and matching steps.
Drawings
FIG. 1 is a schematic workflow diagram of the present invention;
FIG. 2 is a schematic illustration of the use of the present invention;
FIG. 3 is a flow chart of content page processing in the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Examples
The invention discloses an interactive reading implementation method based on a hardware reading pen. For a better understanding, the structure of the reading pen is described before the method itself.
As shown in FIG. 2, the hardware reading pen provided by the invention comprises a pen body, a sensing switch assembly, an illumination prompt assembly and an image acquisition assembly. A processor and a memory are arranged in the pen body (both can, of course, be augmented with external cloud services). The sensing switch assembly is arranged at the barrel position of the pen body, where it is convenient to operate with a finger; the image acquisition assembly is arranged on the pen body at the tip position; the illumination prompt assembly sits near the image acquisition assembly; and the sensing switch assembly, illumination prompt assembly, image acquisition assembly and memory are each connected to the processor.
When the reading pen is used, the user holds it above the printed matter; the illumination prompt assembly casts a light spot onto the printed matter, prompting the user to adjust the position and extent of the region of interest; the sensing switch assembly detects the finger-touch signal and passes it to the processor, which controls the image acquisition assembly to take a picture. In this embodiment, the sensing switch assembly uses a capacitive touch sensor and the image acquisition assembly uses a camera.
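A minimal sketch of this trigger loop is given below; the TouchSensor- and Camera-like objects are hypothetical stand-ins for the pen's driver layer, since the patent does not name a particular hardware API.

```python
# Sketch: poll the capacitive touch sensor; a finger touch makes the
# processor grab one frame of the illuminated region of interest.
import time

def run_pen(touch_sensor, camera, process_frame):
    while True:
        if touch_sensor.is_touched():    # finger detected on the pen barrel
            frame = camera.capture()     # photograph the lit region
            process_frame(frame)         # feature extraction + matching
        time.sleep(0.01)                 # simple polling interval
```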
As shown in FIGS. 1-3, the interactive reading implementation method of the invention comprises the following steps:
S1, extract feature points of the cover page and content pages of a printed matter in advance to obtain a cover-page feature library and a content-page feature library, and store both in the memory. Specifically, this comprises the following steps:
S11, for the cover page of the printed matter, extract feature points from the cover-page image, hash-transform and sort the feature descriptors, and store them in the cover-page feature library.
S12, for the content pages of the printed matter, first divide each content-page image into a group of image blocks, where each image block contains at most one picture; then extract feature points from the image blocks; finally, hash-transform and sort the feature descriptors and store them in the content-page feature library.
S2, pre-build, for the original pages of the printed matter, a multimedia content library in which specific regions correspond to specific multimedia files.
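As a sketch, such a region-to-multimedia library can be a plain mapping keyed by book, page and region; the key structure and file paths below are illustrative assumptions, not the patent's storage format.

```python
# Sketch: the step-S2 content library as a (book, page, region) -> file map.
MULTIMEDIA_LIBRARY = {
    ("book-001", 12, "region-a"): "audio/book-001/p12_word_apple.mp3",
    ("book-001", 12, "region-b"): "video/book-001/p12_story.mp4",
}

def lookup_media(book_id, page, region_id):
    """Return the preset multimedia file for a region, or None."""
    return MULTIMEDIA_LIBRARY.get((book_id, page, region_id))
```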
S3, illuminate the cover page of the printed matter with the reading pen, capture a partial image of the cover page with the image acquisition assembly, extract feature points of the partial image with the processor, and match the extracted feature points against the cover-page feature library to obtain the printed matter information (i.e., determine which book it is). The matching is performed as follows:
hash-transform and sort the feature descriptors corresponding to the feature points extracted from the partial image, and compare each hash value with the hash values of the feature points stored in the cover-page feature library; if the distance is smaller than a preset first threshold, the pair of feature points is considered matched;
count the number of matched feature points; if this number is greater than a preset second threshold, the partial image is considered to match the corresponding cover-page image.
S4, illuminate the page-number position of a content page with the reading pen, capture a page image with the image acquisition assembly, and perform OCR on the digits in the page image with the processor to obtain the page number. This narrows the search range for the region of interest within the content-page feature library, improving both search speed and search precision.
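A sketch of this page-number OCR follows, with pytesseract as a stand-in engine (the patent specifies only "OCR", not a particular library).

```python
# Sketch: recognise the digits of a page number in a captured page image.
import pytesseract
from PIL import Image

def read_page_number(page_image_path):
    img = Image.open(page_image_path)
    # Treat the crop as a single text line and allow digits only.
    text = pytesseract.image_to_string(
        img, config="--psm 7 -c tessedit_char_whitelist=0123456789")
    digits = text.strip()
    return int(digits) if digits.isdigit() else None
```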
S5, illuminate a region of interest on a content page of the printed matter with the reading pen; the image acquisition assembly captures a region image of the region of interest; the processor extracts feature points from the region image, matches them against the content-page feature library, and derives the position of the region of interest from the matching result. The matching is performed as follows:
hash-transform and sort the feature descriptors corresponding to the feature points extracted from the region image, and compare each hash value with the hash values of the feature points stored in the content-page feature library; if the distance is smaller than a preset first threshold, the pair of feature points is considered matched;
count the number of matched feature points; if this number is greater than a preset second threshold, the region image is considered to match the corresponding image block.
Step S4 may be skipped if a clear and reliable matching result can be obtained without narrowing the search range through S4.
The feature points extracted in steps S1, S3 and S5 are chosen to be scale-invariant, rotation-invariant, robust to brightness variation, and stable under noise and viewing-angle changes.
In the present embodiment, the feature point extraction operations in steps S1, S3, and S5 are implemented by the following methods:
when the calculation force of the processor arranged in the reading pen exceeds the calculation force threshold value II, the following algorithm is adopted, and the calculation force threshold value II is defined as 200MIPS in the embodiment.
a. Image graying. The acquired image is a color image (for example, an RGB three-channel image) and must first be converted to gray scale to facilitate the subsequent steps. In this embodiment the graying formula is:
Gray = (R*30 + G*59 + B*11 + 50) / 100
where Gray is the gray value (the +50 rounds the integer division by 100).
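This formula translates directly into, for example, the following NumPy sketch.

```python
# Sketch: the embodiment's integer graying formula applied to an RGB image.
import numpy as np

def to_gray(rgb):
    """rgb: H x W x 3 uint8 array -> H x W uint8 gray image."""
    r = rgb[..., 0].astype(np.int32)
    g = rgb[..., 1].astype(np.int32)
    b = rgb[..., 2].astype(np.int32)
    return ((r * 30 + g * 59 + b * 11 + 50) // 100).astype(np.uint8)
```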
b. Extract feature points with a key-point detection algorithm. Continuously downsample the original image to obtain a series of images of different sizes; apply Gaussian filtering at the different scales; subtract the two Gaussian-filtered versions of the same image at adjacent scales to obtain a difference-of-Gaussians image; and perform extremum detection, where extremum points satisfying a curvature condition are the feature points. The difference-of-Gaussians image D(x, y, σ) is computed as follows, where G(x, y, σ) is the Gaussian filter function, I(x, y) is the original image, and L(x, y, σ) is the image Gaussian-filtered at scale σ:
D(x, y, σ) = (G(x, y, σ(s+1)) − G(x, y, σ(s))) * I(x, y) = L(x, y, σ(s+1)) − L(x, y, σ(s))
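A minimal sketch of this difference-of-Gaussians step, using SciPy's Gaussian filter (extremum detection and the curvature test are omitted for brevity; the scale ratio k is an assumption):

```python
# Sketch: filter one image at two adjacent scales and subtract.
import numpy as np
from scipy.ndimage import gaussian_filter

def difference_of_gaussians(image, sigma, k=1.6):
    img = image.astype(np.float32)
    L1 = gaussian_filter(img, sigma)       # L(x, y, sigma(s))
    L2 = gaussian_filter(img, k * sigma)   # L(x, y, sigma(s+1))
    return L2 - L1                         # D(x, y, sigma)
```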
c. Identify feature point directions via histogram statistics. After the gradient computation for a feature point is complete, a histogram is used to accumulate the gradient magnitudes and directions of the pixels in its neighbourhood. The gradient histogram divides the 0-360 degree direction range into 18 bins of 20 degrees each; the peak direction of the histogram is the feature point's main direction. With L the scale-space value at the key point, the gradient magnitude m and direction θ of each pixel are computed as:
m(x, y) = √((L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²)
θ(x, y) = tan⁻¹((L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y)))
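A sketch of the 18-bin main-direction assignment built from these two formulas (border pixels are skipped for simplicity):

```python
# Sketch: accumulate gradient magnitude m into 20-degree direction bins
# and return the centre of the peak bin as the main direction.
import numpy as np

def main_direction(L):
    dx = L[1:-1, 2:] - L[1:-1, :-2]          # L(x+1, y) - L(x-1, y)
    dy = L[2:, 1:-1] - L[:-2, 1:-1]          # L(x, y+1) - L(x, y-1)
    m = np.sqrt(dx ** 2 + dy ** 2)           # gradient magnitude
    theta = np.degrees(np.arctan2(dy, dx)) % 360.0
    hist, _ = np.histogram(theta, bins=18, range=(0.0, 360.0), weights=m)
    return (np.argmax(hist) + 0.5) * 20.0    # centre of the peak bin
```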
and describing the feature points to obtain feature descriptors. Determining a neighborhood with the size of 21 multiplied by 21 for the feature points, and rotating the neighborhood to a main direction; calculating the horizontal gradient and the vertical gradient of the pixel points in the neighborhood, so that each feature point determines a feature descriptor with the size of 19 multiplied by 2=722; the description of the feature points includes coordinates, dimensions, directions. It should be noted that, since the obtained feature descriptors are high-dimensional (722-dimensional in this embodiment), for convenience of subsequent processing, the dimension reduction and hash transformation is performed, in this embodiment, the dimension reduction processing is performed by using the principal component analysis dimension reduction method, the dimension reduction processing is performed in 20 dimensions, and after the locally sensitive hash transformation, namely, the descriptor Ha Xihua in fig. 3, the feature descriptors in 20 dimensions are mapped into 1 32-bit floating point values. The specific operation of PCA is as follows:
First, construct a feature matrix X from the descriptor data of a large number of acquired images; compute the eigenvalues of X, sort them by magnitude, and assemble the eigenvectors corresponding to the leading eigenvalues into a transformation matrix W^T. Projecting the original high-dimensional feature matrix Y through W^T (Z = W^T Y) reduces it to a low-dimensional new feature matrix Z whose features are linearly uncorrelated.
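A minimal NumPy sketch of this PCA step under the embodiment's 722 → 20 sizes:

```python
# Sketch: learn the transform W from many descriptors, then project.
import numpy as np

def fit_pca(X, out_dim=20):
    """X: N x 722 matrix of training descriptors -> 722 x out_dim transform."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)                 # 722 x 722 covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]              # largest eigenvalues first
    return eigvecs[:, order[:out_dim]]             # columns form W

def project(W, Y):
    """Y: M x 722 descriptors -> Z: M x out_dim, linearly uncorrelated axes."""
    return Y @ W
```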
When the computing power of the pen's built-in processor is below the second computing-power threshold but above the first, the following algorithm is adopted; in this embodiment the second threshold is defined as 200 MIPS and the first as 80 MIPS.
1. First, construct a scale pyramid.
The pyramid has n layers, each holding a single image. Layer s has scale Scale_s = Factor^s, where Factor is the initial scale factor (default 1.2) and layer 0 is the original image. The image size of layer s is accordingly:
Size_s = (W / Scale_s) × (H / Scale_s), where W × H is the size of the original image.
2. Detect feature points at the different scales.
Select a point P in the image and draw a circle of radius 3 pixels centred on P. If the gray values of n consecutive pixels on the circle are all greater or all smaller than the gray value of P, P is regarded as a feature point; typically n is set to 12. To speed up extraction and quickly discard non-feature points, first test the gray values at circle positions 1, 9, 5 and 13: if P is a feature point, at least 3 of these four pixel values must all be greater or all be smaller than the gray value of P; otherwise the point is discarded immediately (a sketch of this segment test follows the list below). Then train a decision tree with the ID3 algorithm, feed the 16 pixels on each candidate's circle into the tree, and screen out the best feature points. Finally, apply non-maximum suppression to remove locally clustered feature points.
3. Sort the feature points on each layer by their Harris corner response values and keep the top n as that layer's feature points.
4. Compute the main direction of each feature point (intensity-centroid method).
5. Rotate each feature point's patch to its main direction and perform τ tests on the feature points using the selected optimal 256 point pairs, forming a 256-dimensional binary descriptor.
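The segment test of step 2 can be sketched as follows; the intensity threshold t is an illustrative assumption (the text compares raw gray values directly), and the 16 circle offsets are the standard radius-3 Bresenham ring.

```python
# Sketch: FAST-style segment test with the quick rejection on positions
# 1, 9, 5 and 13, then the full search for n contiguous brighter/darker
# pixels on the 16-point ring.
CIRCLE = [(0, 3), (1, 3), (2, 2), (3, 1), (3, 0), (3, -1), (2, -2), (1, -3),
          (0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def is_fast_corner(img, x, y, t=20, n=12):
    p = int(img[y, x])
    ring = [int(img[y + dy, x + dx]) for dx, dy in CIRCLE]
    # Quick rejection: ring positions 1, 9, 5, 13 are indices 0, 8, 4, 12.
    quick = [ring[i] for i in (0, 8, 4, 12)]
    if sum(v > p + t for v in quick) < 3 and sum(v < p - t for v in quick) < 3:
        return False
    # Full test: n contiguous pixels all brighter or all darker than P.
    for sign in (1, -1):
        run = 0
        for v in ring + ring:                 # doubled ring handles wrap-around
            run = run + 1 if (v - p) * sign > t else 0
            if run >= n:
                return True
    return False
```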
When the computing power of the pen's built-in processor is below the first computing-power threshold, feature point extraction is performed by a wirelessly connected external processor. This embodiment defines the first computing-power threshold as 80 MIPS.
Both computing-power thresholds can be adjusted to the practical application.
The hash transformation with the locality-sensitive hash function (LSH) operates as follows:
(1) choose a locality-sensitive hash function satisfying (d1, d2, p1, p2)-sensitivity;
(2) determine the number L of hash tables, the number K of hash functions in each table, and the parameters of the sensitive hash, according to the required accuracy of the search results;
(3) hash all data into the corresponding buckets through the locality-sensitive hash function, forming one or more hash tables.
The distance computation during matching proceeds as follows:
compute the distance between the hash value of the query feature point and the candidate data retrieved from the L hash tables; the distance is defined as, but not limited to, the absolute value of the difference of the two values; if the distance is smaller than the preset first threshold, the feature point pair is judged to be matched.
S6, based on the position information from S5, retrieve the multimedia file preset at the corresponding position in S2 and play it.
The present invention is not limited to the above embodiments; any changes or substitutions that can readily occur to those skilled in the art within the technical scope of the present invention are intended to fall within its scope. The protection scope of the present invention is therefore defined by the claims.

Claims (8)

1. An interactive reading implementation method, characterized in that it is realized with a hardware reading pen, the reading pen comprising a pen body, a sensing switch assembly, an illumination prompt assembly and an image acquisition assembly, a processor and a memory being arranged in the pen body, the sensing switch assembly being arranged on the pen body at the barrel position, the image acquisition assembly being arranged on the pen body at the tip position, the illumination prompt assembly being arranged on the pen body near the image acquisition assembly, and the sensing switch assembly, image acquisition assembly, illumination prompt assembly and memory each being connected to the processor, the method comprising the following steps:
S1, extracting feature points of the cover page and content pages of a printed matter in advance to obtain a cover-page feature library and a content-page feature library, and storing both libraries in the memory;
S2, pre-building, for the original pages of the printed matter, a multimedia content library in which set regions correspond to multimedia files;
S3, illuminating the cover page of the printed matter with the reading pen, capturing a partial image of the cover page with the image acquisition assembly, extracting feature points of the partial image with the processor, and matching the extracted feature points against the cover-page feature library to obtain the printed matter information;
S4, illuminating the page-number position of a content page of the printed matter with the reading pen, capturing a page image with the image acquisition assembly, and performing OCR on the digits in the page image with the processor to obtain the page number, thereby narrowing the search range for the region of interest within the content-page feature library and improving search speed and search precision;
S5, illuminating a region of interest on a content page of the printed matter with the reading pen, the operator's finger on the sensing switch assembly triggering the image acquisition assembly to capture a region image of the region of interest, the processor extracting feature points of the region image, matching the extracted feature points against the content-page feature library, and obtaining position information of the region of interest from the matching result, wherein step S4 is skipped if a clear and reliable matching result can be obtained without narrowing the search range through step S4;
S6, based on the position information of S5, retrieving the multimedia file preset at the corresponding position in S2 and playing it;
the feature point extraction in steps S1, S3 and S5 being realized as follows:
extracting feature points that are scale-invariant, rotation-invariant, robust to brightness variation, and stable under noise and viewing-angle changes;
describing the feature points to obtain feature descriptors;
when the computing power of the pen's built-in processor exceeds a second computing-power threshold, the key-point detection algorithm extracting feature points as follows:
continuously downsampling the original image to obtain a series of images of different sizes, applying Gaussian filtering to the images at the different scales, subtracting the two Gaussian-filtered versions of the same image at adjacent scales to obtain a difference-of-Gaussians image, and performing extremum detection, wherein extremum points satisfying a curvature condition are the feature points;
when the computing power of the pen's built-in processor is below the second computing-power threshold but above a first computing-power threshold, the key-point detection algorithm extracting feature points as follows:
step one: selecting a point P in the image, drawing a circle of radius 3 pixels centred on P, and regarding P as a feature point if the gray values of n consecutive pixels on the circle are all greater or all smaller than the gray value of P;
step two: training a decision tree with the ID3 algorithm, feeding the 16 pixels on the circle around each candidate point into the decision tree, and screening out the best feature points;
step three: applying non-maximum suppression to remove locally clustered feature points;
step four: setting a scale factor and the number of pyramid layers, shrinking the original image by the scale factor into a sequence of images, and taking the union of the feature points extracted at the different scales as the image's feature points, to achieve scale invariance;
step five: using image moments to compute the intensity centroid of each feature point within a radius, wherein the vector from the feature point's coordinates to the centroid gives the feature point's direction, achieving rotation invariance;
the step S1 specifically comprising the following sub-steps:
S11, for the cover page of the printed matter, extracting feature points from the cover-page image, hash-transforming and sorting the feature descriptors, and storing them in the cover-page feature library;
S12, for the content pages of the printed matter, first dividing each content-page image into a group of image blocks, the division methods comprising uniform division and selected-region division, then extracting feature points from the image blocks, and finally hash-transforming and sorting the feature descriptors and storing them in the content-page feature library;
the matching of the extracted feature points against the cover-page feature library in step S3 being realized as follows:
hash-transforming and sorting the feature descriptors corresponding to the feature points extracted from the partial image, and comparing each hash value with the hash values of the feature points stored in the cover-page feature library, wherein if the distance is smaller than a preset first threshold, the pair of feature points is considered matched;
counting the number of matched feature points, wherein if the number is greater than a preset second threshold, the partial image is considered to match the corresponding cover-page image;
the matching of the extracted feature points against the content-page feature library in step S5 being realized as follows:
hash-transforming and sorting the feature descriptors corresponding to the feature points extracted from the region image, and comparing each hash value with the hash values of the feature points stored in the content-page feature library, wherein if the distance is smaller than a preset first threshold, the pair of feature points is considered matched;
counting the number of matched feature points, wherein if the number is greater than a preset second threshold, the region image is considered to match the corresponding image block.
2. The interactive reading implementation method according to claim 1, wherein during the image capturing of steps S3, S4 and S5 there is no direct physical contact between the reading pen and the printed matter, and the distance between them is variable.
3. The interactive reading implementation method according to claim 1, wherein, during use, the illumination prompt assembly projects an indication spot visible to the human eye onto the printed matter, guiding the user in selecting the region of interest.
4. The interactive reading implementation method according to claim 1, wherein the processor is a processor built into the pen body or a wirelessly connected external processor with computing power.
5. The interactive reading implementation method according to claim 1, wherein the feature libraries of step S1 are stored in a storage device built into the pen body or in an external storage device.
6. The interactive reading implementation method according to claim 1, wherein the multimedia files of steps S2 and S6 are stored in a storage device built into the pen body or in an external storage device.
7. The interactive reading implementation method according to claim 1, wherein the multimedia file of step S6 is played on a display screen or speaker integrated in the reading pen, or on the screen or speaker of an external intelligent terminal connected via the pen's WIFI or Bluetooth function, or via a projection device integrated in the reading pen.
8. The interactive reading implementation method according to claim 1, wherein, when the computing power of the pen's built-in processor is below the first computing-power threshold, the feature point extraction is performed as follows:
extracting the feature points using a wirelessly connected external processor.
CN201911097147.6A 2019-11-11 2019-11-11 Interactive reading realization method Active CN110765997B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911097147.6A CN110765997B (en) 2019-11-11 2019-11-11 Interactive reading realization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911097147.6A CN110765997B (en) 2019-11-11 2019-11-11 Interactive reading realization method

Publications (2)

Publication Number Publication Date
CN110765997A CN110765997A (en) 2020-02-07
CN110765997B (en) 2023-12-26

Family

ID=69337435

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911097147.6A Active CN110765997B (en) 2019-11-11 2019-11-11 Interactive reading realization method

Country Status (1)

Country Link
CN (1) CN110765997B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104281847A (en) * 2013-07-12 2015-01-14 步步高教育电子有限公司 Point reading method, device and equipment
CN106599028A (en) * 2016-11-02 2017-04-26 华南理工大学 Book content searching and matching method based on video image processing


Also Published As

Publication number Publication date
CN110765997A (en) 2020-02-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TA01 Transfer of patent application right

Effective date of registration: 20231222

Address after: 1406, 14th Floor, Building 2, No.1 Courtyard, Shangdi 10th Street, Haidian District, Beijing, 100080

Applicant after: Beijing Anxin Zhitong Technology Co.,Ltd.

Address before: Room 403, C4, building 2, software industry base, No. 87, 89, 91, South 10th Road, Gaoxin, Binhai community, Yuehai street, Nanshan District, Shenzhen, Guangdong 518000

Applicant before: Shenzhen yikuai Interactive Network Technology Co.,Ltd.
