CN110765997A - Interactive reading implementation method - Google Patents

Interactive reading implementation method

Info

Publication number
CN110765997A
CN110765997A (application CN201911097147.6A)
Authority
CN
China
Prior art keywords
feature
image
point
page
reading
Prior art date
Legal status
Granted
Application number
CN201911097147.6A
Other languages
Chinese (zh)
Other versions
CN110765997B (en)
Inventor
江周平 (Jiang Zhouping)
Current Assignee
Beijing Anxin Zhitong Technology Co ltd
Original Assignee
Shenzhen Yikuai Interactive Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Yikuai Interactive Network Technology Co Ltd
Priority to CN201911097147.6A
Publication of CN110765997A
Application granted
Publication of CN110765997B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/22: Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00: Electrically-operated educational appliances
    • G09B5/06: Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/062: Combinations of audio and printed presentations, e.g. magnetically striped cards, talking books, magnetic tapes with printed texts thereon

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an interactive reading implementation method comprising the following steps: obtain a cover-page feature library and a content-page feature library; form, in advance, a multimedia content library keyed to regions of the printed matter's original pages; acquire a local image of the cover page with the image acquisition component of a hardware reading pen, and use the processor to match the feature points extracted from the local image against the cover-page feature library to identify the printed matter; acquire an image of a region of interest on a content page of the printed matter with the image acquisition component of the reading pen, extract its feature points, match them against the content-page feature library, obtain the position of the region of interest from the matching result, and fetch and play the multimedia file preset for the corresponding position. The practical effect of 'read wherever you point' is achieved. The invention requires no codes to be prefabricated on books, escapes the limitations of prefabricated codes, and still ensures accurate content playback.

Description

Interactive reading implementation method
Technical Field
The invention relates to the technical field of multimedia education, in particular to an interactive reading implementation method.
Background
Point reading is an intelligent reading and learning mode built on optical image recognition and digital audio technology; it embodies the close integration of electronic multimedia technology with the education industry and realizes a people-oriented approach to technology.
Existing point-reading devices usually require the book to be pre-processed, with a specific code printed on or attached to it; otherwise the contents of the book cannot be identified. Moreover, the encoding rules cap the total number of available codes, so for books with extensive content this code-based point-reading approach is clearly limited.
Disclosure of Invention
The invention aims to provide an interactive reading implementation method that requires no codes on books and frees point-reading content from the limits imposed by encoding.
In order to achieve the purpose, the invention adopts the following technical scheme:
the utility model provides an interactive reading implementation method, its realization based on hardware reading pen, reading pen includes pen main part, inductive switch subassembly, illumination suggestion subassembly and image acquisition subassembly, be equipped with treater and memory in the pen main part, the inductive switch subassembly is located in the pen main part and be located a body position, the image acquisition subassembly is located in the pen main part and be located the nib position, illumination suggestion subassembly is located in the pen main part and be located near the image acquisition subassembly, inductive switch subassembly, image acquisition subassembly, illumination suggestion subassembly and memory are connected respectively the treater, this method includes following steps:
S1. Extract feature points from the cover page and the content pages of the printed matter in advance to obtain a cover-page feature library and a content-page feature library, and store both in the memory.
S2. Build, in advance, a multimedia content library for the printed matter's original pages, in which certain specific regions correspond to certain multimedia files.
S3. Illuminate the cover page of the printed matter with the reading pen; the image acquisition assembly acquires a local image of the cover page, and the processor extracts feature points from the local image and matches them against the cover-page feature library to identify the printed matter.
S4. Illuminate the page-number position of a content page with the reading pen; the image acquisition assembly acquires a page-number image, and the processor performs OCR (optical character recognition) on the digits in it to obtain the page number. The purpose of this step is to narrow the search range for the region of interest within the content-page feature library, speeding up the search and improving retrieval accuracy.
S5. Illuminate a region of interest on a content page with the reading pen; the operator's finger actuates the inductive switch assembly, which triggers the image acquisition assembly to acquire an image of the region of interest; the processor extracts feature points from the region image, matches them against the content-page feature library, and derives the position of the region of interest from the match result. Step S4 may be skipped when a reliable match can be obtained without narrowing the search range.
S6. Based on the position information from S5, fetch the multimedia file preset for the corresponding position in S2 and play it.
Further, the feature point extraction in steps S1, S3, and S5 is realized as follows:
extract feature points that are scale-invariant and rotation-invariant, robust to brightness changes, and stable under noise and viewing-angle changes;
describe the feature points to obtain feature descriptors.
Preferably, when the computing power of the processor built into the reading pen is sufficient, extracting the feature points with the key-point detection algorithm proceeds as follows:
successively downsample the original image to obtain a series of images of different sizes; apply Gaussian filtering to these images at different scales; subtract pairs of Gaussian-filtered images of adjacent scales of the same image to obtain difference-of-Gaussian images; and perform extremum detection on them. Extremum points satisfying a curvature condition are the feature points.
When the computing power of the processor built into the reading pen is limited, extracting the feature points with the key-point detection algorithm proceeds as follows:
Step 1: pick a point P in the image and draw a circle of radius 3 pixels centered on P. If the gray values of n consecutive pixels on that circle are all larger, or all smaller, than the gray value of P, then P is taken to be a feature point; typically n is set to 12. To speed up extraction and discard non-feature points quickly, first examine the gray values at circle positions 1, 9, 5, and 13: if P is a feature point, at least 3 of these four pixel values are all larger, or all smaller, than the gray value of P; if not, the point is discarded immediately.
Step 2: train a decision tree with the ID3 algorithm, feed the 16 circle pixels of each candidate into the tree, and screen out the optimal feature points.
Step 3: apply non-maximum suppression to remove locally over-dense feature points.
Step 4: set a scale factor and the number of pyramid layers; shrink the original image by the scale factor into several images, and take the union of the feature points extracted from these images at different scales as the image's feature points, giving the feature points scale invariance.
Step 5: compute, using image moments, the centroid within a radius r of each feature point, and take the vector from the feature point's coordinates to that centroid as the feature point's direction, giving the feature points rotation invariance.
When the computing power of the processor built into the reading pen cannot meet the demands of feature point extraction, the extraction is carried out by a wirelessly connected external processor.
Preferably, step S1 specifically comprises the following sub-steps:
S11. For the cover page of the printed matter, extract feature points from the cover-page image, apply hash transformation and sorting to the feature descriptors, and store them in the cover-page feature library.
S12. For the content pages of the printed matter, first divide each content-page image into a group of image blocks (division methods include, but are not limited to, uniform division and selected-region division), then extract feature points from the image blocks, and finally apply hash transformation and sorting to the feature descriptors and store them in the content-page feature library.
Preferably, the matching of the extracted feature points against the cover-page feature library in step S3 is realized as follows:
apply hash transformation and sorting to the feature descriptors of the feature points extracted from the local image, then compare each hash value with the hash values of the feature points stored in the cover-page feature library; if the distance is smaller than a preset first threshold, that pair of feature points is deemed matched;
count the matched feature points; if their number exceeds a preset second threshold, the local image is deemed to match the corresponding cover-page image.
Preferably, the matching of the extracted feature points against the content-page feature library in step S5 is realized as follows:
apply hash transformation and sorting to the feature descriptors of the feature points extracted from the region image, then compare each hash value with the hash values of the feature points stored in the content-page feature library; if the distance is smaller than a preset first threshold, that pair of feature points is deemed matched;
count the matched feature points; if their number exceeds a preset second threshold, the region image is deemed to match the corresponding image block.
Preferably, the hash transformation uses a locality-sensitive hash function to map the multidimensional features to a single value, such that point pairs far apart in the multidimensional space map to values with a large difference, and point pairs close together map to values with a small difference.
A notable characteristic of this technical scheme is that, in use, there is no direct physical contact between the reading pen and the printed matter, and the distance between them is variable. During use the illumination prompting assembly casts an indicating light point or light spot, visible to the human eye, onto the printed matter, marking the region of interest selected by the user.
Having adopted this technical scheme, the invention offers the following advantages over the background art:
1. Content regions of interest are identified by image feature point extraction and matching; no codes need to be prefabricated on books, freeing point-reading content from the limits imposed by encoding.
2. The cover page, the page number, and the region of interest are identified separately, realizing a book → page number → content-position query scheme with a small data-processing load and high identification and matching efficiency.
3. Hash transformation and sorting after feature point extraction reduce the data volume, improving the efficiency of the subsequent identification and matching steps.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic representation of the use of the present invention;
FIG. 3 is a flow chart of content-page processing according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Embodiment
The invention discloses an interactive reading implementation method implemented with a hardware reading pen. Before describing the method in detail, the structure of the reading pen is explained to aid understanding of the invention.
As shown in FIG. 2, the hardware reading pen of the present invention comprises a pen body, an inductive switch assembly, an illumination prompting assembly, and an image acquisition assembly. A processor and a memory are provided in the pen body (both can, of course, also be augmented by an external cloud service). The inductive switch assembly is arranged at the barrel position of the pen body, where it is convenient for the user to operate with a finger; the image acquisition assembly is arranged on the pen body at the pen-tip position; the illumination prompting assembly is arranged near the image acquisition assembly; and the inductive switch assembly, the illumination prompting assembly, the image acquisition assembly, and the memory are each connected to the processor.
In use, the user holds the pen suspended above the printed matter. The illumination prompting assembly casts light onto the printed matter, forming a light spot that prompts the user to adjust the position and extent of the region of interest; the inductive switch assembly detects a finger-touch signal and passes it to the processor, which then triggers the image acquisition assembly to take a picture. In this embodiment the inductive switch assembly is a capacitive touch sensor and the image acquisition assembly is a camera.
With reference to FIGS. 1-3, the interactive point-reading implementation method of the present invention comprises the following steps:
S1. Extract feature points from the cover page and the content pages of the printed matter in advance, obtaining a cover-page feature library and a content-page feature library, and store both in the memory. This comprises the following sub-steps:
S11. For the cover page of the printed matter, extract feature points from the cover-page image, apply hash transformation and sorting to the feature descriptors, and store them in the cover-page feature library.
S12. For the content pages of the printed matter, first divide each content-page image into a group of image blocks (each image block may contain only one picture), then extract feature points from the image blocks, and finally apply hash transformation and sorting to the feature descriptors and store them in the content-page feature library. A sketch of one division mode follows.
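Uniform division, one of the division modes named in the summary above, can be sketched as follows; the grid size is illustrative and a NumPy image array (as delivered by OpenCV) is assumed:

    def divide_uniform(page_img, rows=8, cols=6):
        # Split a page image (an H x W NumPy array) into rows*cols roughly
        # equal blocks; each block is later indexed by its feature points.
        h, w = page_img.shape[:2]
        return [page_img[r * h // rows:(r + 1) * h // rows,
                         c * w // cols:(c + 1) * w // cols]
                for r in range(rows) for c in range(cols)]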
S2, forming a corresponding area playing multimedia content library on the original page of the printed matter in advance in a mode that certain specific areas correspond to certain multimedia files;
s3, irradiating the cover page of the printed product by using a reading pen, acquiring a local image of the cover page by using an image acquisition assembly, extracting the characteristic points of the local image by using a processor, and matching the extracted characteristic points with a cover page characteristic library to obtain printed product information (namely, which book is determined). The matching of the extracted feature points with the cover page feature library is specifically realized by the following method:
performing Hash transformation and sorting on feature descriptors corresponding to feature points extracted from the local image, then comparing the Hash value with the Hash value of the feature points stored in the cover page feature library, and if the distance is smaller than a preset first threshold value, determining that the pair of feature points are matched;
and counting the number of the matched feature points, and if the number of the matched feature points is greater than a preset second threshold value, determining that the local image is matched with the corresponding cover page image.
S4, irradiating the page position of the content page of the printed matter by using the reading pen, acquiring the page image by the image acquisition assembly, and performing OCR recognition on the numbers in the page image by the processor to obtain page information. The purpose of this step is to narrow the search range of the region of interest in the content page feature library, improve the search speed and improve the retrieval accuracy.
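The OCR in S4 only needs to read a few digits. A minimal sketch of this step, assuming OpenCV plus the pytesseract wrapper for the Tesseract engine (the patent does not name a specific OCR engine, and the preprocessing choices here are illustrative):

    import cv2
    import pytesseract

    def read_page_number(page_image_bgr):
        # Grayscale and binarize so the page-corner digits stand out for OCR.
        gray = cv2.cvtColor(page_image_bgr, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # Treat the crop as a single text line and whitelist digits only.
        text = pytesseract.image_to_string(
            binary, config="--psm 7 -c tessedit_char_whitelist=0123456789")
        digits = "".join(ch for ch in text if ch.isdigit())
        return int(digits) if digits else None  # None: fall back to skipping S4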
S5. Illuminate a region of interest on a content page with the reading pen; the image acquisition assembly acquires an image of the region of interest, and the processor extracts feature points from the region image, matches them against the content-page feature library, and derives the position of the region of interest from the match result. The matching against the content-page feature library is realized as follows:
apply hash transformation and sorting to the feature descriptors of the feature points extracted from the region image, then compare each hash value with the hash values of the feature points stored in the content-page feature library; if the distance is smaller than a preset first threshold, that pair of feature points is deemed matched;
count the matched feature points; if their number exceeds a preset second threshold, the region image is deemed to match the corresponding image block.
Step S4 may be skipped when an unambiguous and reliable matching result can be obtained without S4 narrowing the search range.
The feature point extraction in steps S1, S3, and S5 aims to extract feature points that are scale- and rotation-invariant, robust to brightness changes, and stable under noise and viewing-angle changes.
In the present embodiment, the feature point extraction in steps S1, S3, and S5 is implemented as follows.
when the calculated power of the processor built in the reading pen exceeds the calculated power threshold two, the following algorithm is adopted, and the calculated power threshold two is defined as 200MIPS in the embodiment.
a. Image graying. Since the acquired image is a color image (for example, an RGB three-channel color image), graying must be performed first to ease the subsequent steps. In this embodiment the graying formula is:
Gray = (R*30 + G*59 + B*11 + 50) / 100
where Gray is the gray value of the pixel, computed in integer arithmetic (the +50 term rounds the division).
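A direct sketch of this graying formula, assuming an 8-bit BGR image as delivered by OpenCV (the channel order is that assumption, not stated in the text):

    import numpy as np

    def to_gray(img_bgr):
        # Gray = (R*30 + G*59 + B*11 + 50) / 100, in integer arithmetic.
        b = img_bgr[:, :, 0].astype(np.int32)
        g = img_bgr[:, :, 1].astype(np.int32)
        r = img_bgr[:, :, 2].astype(np.int32)
        return ((r * 30 + g * 59 + b * 11 + 50) // 100).astype(np.uint8)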
b. Extract feature points with the key-point detection algorithm. Successively downsample the original image to obtain a series of images of different sizes; apply Gaussian filtering at different scales; subtract pairs of Gaussian-filtered images of adjacent scales of the same image to obtain difference-of-Gaussian images; and perform extremum detection, where extremum points satisfying a curvature condition are the feature points. The difference-of-Gaussian image D(x, y, σ) is computed as follows, where G(x, y, σ) is the Gaussian filter function, I(x, y) is the original image, and L(x, y, σ) is the Gaussian-filtered image at scale σ:
D(x,y,σ) = (G(x,y,σ(s+1)) - G(x,y,σ(s))) * I(x,y) = L(x,y,σ(s+1)) - L(x,y,σ(s))
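A sketch of the difference-of-Gaussian construction for a single octave (the per-octave downsampling is omitted, and σ0 = 1.6 with k = √2 are conventional values assumed here, not taken from the patent):

    import cv2

    def dog_stack(gray, num_scales=4, sigma0=1.6, k=2 ** 0.5):
        g = gray.astype("float32")  # float so the difference can go negative
        # L(x, y, σ(s)): the image filtered at a geometric ladder of scales.
        L = [cv2.GaussianBlur(g, (0, 0), sigma0 * k ** s)
             for s in range(num_scales)]
        # D(x, y, σ(s)) = L(x, y, σ(s+1)) - L(x, y, σ(s))
        return [L[s + 1] - L[s] for s in range(num_scales - 1)]

    # A pixel is a candidate feature point if it is a maximum or minimum among
    # its 26 neighbours (8 in its own D image, 9 in each adjacent scale) and
    # then passes the curvature condition.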
c. Identify the direction of each feature point from histogram statistics. After the gradient at the feature point is computed, the gradients and directions of the pixels in its neighborhood are accumulated into a histogram. The gradient histogram divides the 0-360 degree range into 18 bins of 20 degrees each; the direction at the histogram's peak is the dominant direction of the feature point. With L the scale-space image in which the key point lies, the gradient magnitude m and direction θ of each pixel are computed as:
m(x,y) = sqrt((L(x+1,y) - L(x-1,y))^2 + (L(x,y+1) - L(x,y-1))^2)
θ(x,y) = tan^-1((L(x,y+1) - L(x,y-1)) / (L(x+1,y) - L(x-1,y)))
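A sketch of the 18-bin orientation histogram; the neighborhood radius and the omission of Gaussian weighting are simplifications assumed here, and the caller must keep the neighborhood inside the image:

    import numpy as np

    def dominant_direction(L, x, y, radius=8):
        hist = np.zeros(18)               # 18 bins x 20 degrees = 360 degrees
        for j in range(y - radius, y + radius + 1):
            for i in range(x - radius, x + radius + 1):
                dx = float(L[j, i + 1]) - float(L[j, i - 1])
                dy = float(L[j + 1, i]) - float(L[j - 1, i])
                m = (dx * dx + dy * dy) ** 0.5          # gradient magnitude
                theta = np.degrees(np.arctan2(dy, dx)) % 360.0
                hist[int(theta // 20)] += m             # weight bin by magnitude
        return np.argmax(hist) * 20 + 10  # centre of the peak bin, in degrees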
and describing the feature points to obtain a feature descriptor. Determining a neighborhood with the size of 21 multiplied by 21 for the feature point, and rotating the neighborhood to the main direction; calculating the horizontal gradient and the vertical gradient of pixel points in the neighborhood, thus determining a characteristic descriptor with the size of 19 multiplied by 2 to 722 dimensions for each characteristic point; the description of the feature points includes coordinates, dimensions, and directions. It should be noted here that, since the obtained feature descriptor is high-dimensional (722 dimensions in this embodiment), for convenience of subsequent processing, dimension reduction and hash transformation are performed, in this embodiment, a principal component analysis dimension reduction method is used to perform dimension reduction processing, the dimension reduction processing is 20 dimensions, and after the locality sensitive hash transformation, that is, hashing of the descriptor in fig. 3, the 20-dimensional feature descriptor is mapped to 1 32-bit floating point value. The specific operation of the PCA is as follows:
First construct a feature matrix X from the feature data of a large number of acquired images; compute the eigenvalues of the covariance of X, sort them by magnitude, and assemble the eigenvectors corresponding to the leading eigenvalues into a transformation matrix W. With W in hand, any feature data Y of an acquired image is projected as Z = Y·W^T; the high-dimensional feature matrix Y is thereby reduced to a low-dimensional new feature matrix Z whose new features are linearly uncorrelated.
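A sketch of this PCA step under the usual formulation; mean-centering is an assumption the text leaves implicit:

    import numpy as np

    def fit_pca(X, out_dim=20):
        # X: (num_descriptors, 722) feature matrix from many collected images.
        mean = X.mean(axis=0)
        cov = np.cov(X - mean, rowvar=False)         # 722 x 722 covariance
        eigvals, eigvecs = np.linalg.eigh(cov)       # symmetric eigendecomposition
        order = np.argsort(eigvals)[::-1][:out_dim]  # largest eigenvalues first
        W = eigvecs[:, order].T                      # (20, 722) transform matrix
        return mean, W

    def project(Y, mean, W):
        return (Y - mean) @ W.T                      # Z = Y W^T, now 20-dim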
When the computing power of the processor built into the reading pen is below computing-power threshold two but above computing-power threshold one, the following algorithm is adopted; in this embodiment threshold two is defined as 200 MIPS and threshold one as 80 MIPS.
1. First construct a scale pyramid.
The pyramid has n layers, each holding one image; the scale of the s-th layer is scale_s = Factor^s, where Factor is the initial scale factor (1.2 by default) and the original image sits at layer 0. The size of the image at the s-th layer is:
Size_s = Size_0 / Factor^s
where Size_0 is the size of the original image.
2. Detect feature points at each of the different scales.
A point P is selected in the image and a circle of radius 3 pixels is drawn centered on P. If the gray values of n consecutive pixels on the circle are all larger, or all smaller, than the gray value of P, then P is taken to be a feature point; typically n is set to 12. To speed up extraction and discard non-feature points quickly, the gray values at circle positions 1, 9, 5, and 13 are examined first: if P is a feature point, at least 3 of these four pixel values are all larger, or all smaller, than the gray value of P; if not, the point is discarded immediately (a code sketch of this segment test follows step 5 below). A decision tree is then trained with the ID3 algorithm, the 16 circle pixels of each candidate are fed into the tree, and the optimal feature points are screened out. Non-maximum suppression removes locally over-dense feature points.
3. Sort the feature points of each layer by their Harris corner response values and keep the top n as that layer's feature points.
4. Compute the principal direction of each feature point (centroid method).
5. Rotate each feature point's patch to its main direction and apply the τ test with the optimal 256 point pairs selected in step 3, forming a 256-dimensional (256-bit) descriptor:
τ(p; x, y) = 1 if p(x) < p(y), and 0 otherwise
f(p) = Σ_{i=1..256} 2^(i-1) · τ(p; x_i, y_i)
where p(x) is the gray value of the (smoothed) patch p at point x.
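The segment test of step 2 can be sketched as follows; the brightness margin t is an assumption (the text compares raw gray values, while implementations usually add such a margin):

    # Bresenham circle of radius 3: the 16 ring offsets, numbered 1..16 in the
    # text (index 0 here is position 1).
    CIRCLE16 = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2),
                (1, 3), (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1),
                (-2, -2), (-1, -3)]

    def is_corner(gray, x, y, t=20, n=12):
        # gray: 2-D gray image; (x, y) at least 3 pixels from the border.
        p = int(gray[y, x])
        ring = [int(gray[y + dy, x + dx]) for dx, dy in CIRCLE16]
        # Quick rejection on positions 1, 9, 5, 13 (indices 0, 8, 4, 12):
        # at least 3 of the 4 must all be brighter or all darker than P.
        quad = [ring[0], ring[8], ring[4], ring[12]]
        if max(sum(v > p + t for v in quad),
               sum(v < p - t for v in quad)) < 3:
            return False
        # Full test: n consecutive ring pixels all brighter or all darker.
        ring2 = ring + ring  # doubled so consecutive runs can wrap around
        return any(all(v > p + t for v in ring2[s:s + n]) or
                   all(v < p - t for v in ring2[s:s + n]) for s in range(16))

Steps 1-5 taken together closely mirror the ORB pipeline; in OpenCV the same path is available via cv2.ORB_create(scaleFactor=1.2, scoreType=cv2.ORB_HARRIS_SCORE), whose detectAndCompute() returns the Harris-ranked key points and 256-bit binary descriptors.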
When the computing power of the processor built into the reading pen is below computing-power threshold one, feature point extraction is carried out by a wirelessly connected external processor. This embodiment defines threshold one as 80 MIPS.
Both computing-power thresholds can be adjusted to suit the practical application.
The locality-sensitive hash (LSH) function used for the hash transformation operates as follows:
(1) choose a locality-sensitive hash function family satisfying (d1, d2, p1, p2)-sensitivity;
(2) determine the number L of hash tables, the number K of hash functions in each table, and the other parameters of the sensitive hashes according to the required accuracy of the search results;
(3) hash all data into the corresponding buckets with the locality-sensitive hash functions, forming one or more hash tables.
the matching calculation distance process is as follows:
and calculating the distance between the hash value of the query feature point and 2L data in the database, wherein the distance is defined as but not limited to the absolute value of the difference between the two numbers, and if the distance is smaller than a set second threshold, the feature point is judged to be matched.
S6. Based on the position information from S5, fetch the multimedia file preset for the corresponding position in S2 and play it.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (14)

1. An interactive reading implementation method, characterized in that it is implemented with a hardware reading pen, the reading pen comprising a pen body, an inductive switch assembly, an illumination prompting assembly, and an image acquisition assembly, wherein a processor and a memory are provided in the pen body, the inductive switch assembly is arranged on the pen body at the barrel position, the image acquisition assembly is arranged on the pen body at the pen-tip position, the illumination prompting assembly is arranged on the pen body near the image acquisition assembly, and the inductive switch assembly, the image acquisition assembly, the illumination prompting assembly, and the memory are each connected to the processor, the method comprising the following steps:
S1. extracting feature points from the cover page and the content pages of the printed matter in advance to obtain a cover-page feature library and a content-page feature library, and storing both in the memory;
S2. building, in advance, a multimedia content library for the printed matter's original pages, in which certain specific regions correspond to certain multimedia files;
S3. illuminating the cover page of the printed matter with the reading pen, the image acquisition assembly acquiring a local image of the cover page, and the processor extracting feature points from the local image and matching them against the cover-page feature library to identify the printed matter;
S4. illuminating the page-number position of a content page with the reading pen, the image acquisition assembly acquiring a page-number image, and the processor performing OCR (optical character recognition) on the digits in it to obtain the page number, the purpose of this step being to narrow the search range for the region of interest within the content-page feature library, speeding up the search and improving retrieval accuracy;
S5. illuminating a region of interest on a content page with the reading pen, the operator's finger actuating the inductive switch assembly to trigger the image acquisition assembly to acquire an image of the region of interest, and the processor extracting feature points from the region image, matching them against the content-page feature library, and deriving the position of the region of interest from the match result, wherein step S4 may be skipped when a reliable match can be obtained without narrowing the search range;
S6. based on the position information from S5, fetching the multimedia file preset for the corresponding position in S2 and playing it.
2. The interactive reading implementation method of claim 1, wherein in the image acquisition of steps S3, S4, and S5 there is no direct physical contact between the reading pen and the printed matter, and the distance between the reading pen and the printed matter is variable.
3. The interactive reading implementation method of claim 1, wherein during use the illumination prompting assembly casts an indicating light point or light spot, recognizable by the human eye, onto the printed matter to indicate the region of interest selected by the user.
4. The interactive reading implementation method of claim 1, wherein the feature point extraction in steps S1, S3, and S5 is realized by:
extracting feature points that are scale-invariant and rotation-invariant, robust to brightness changes, and stable under noise and viewing-angle changes; and
describing the feature points to obtain feature descriptors.
5. The interactive reading implementation method of claim 1, wherein the processor is a processor built into the pen body or a wirelessly connected external processor with sufficient computing power.
6. The interactive reading implementation method of claim 1, wherein the feature libraries of step S1 are stored in a storage device built into the pen body or in an external storage device.
7. The interactive reading implementation method of claim 1, wherein the multimedia files of steps S2 and S6 are stored in a storage device built into the pen body or in an external storage device.
8. The interactive reading implementation method of claim 1, wherein the multimedia file of step S6 is played through a display screen or speaker integrated in the reading pen; or through the Wi-Fi or Bluetooth function of the reading pen, by connecting to an external intelligent terminal and playing through its screen or speaker; or through a projection device integrated in the reading pen.
9. The interactive reading implementation method of claim 4, wherein when the computing power of the processor built into the reading pen exceeds computing-power threshold two, extracting the feature points with the key-point detection algorithm specifically comprises:
successively downsampling the original image to obtain a series of images of different sizes, applying Gaussian filtering to the images at different scales, subtracting pairs of Gaussian-filtered images of adjacent scales of the same image to obtain difference-of-Gaussian images, and performing extremum detection, extremum points satisfying a curvature condition being the feature points.
10. The interactive reading implementation method of claim 4, wherein when the computing power of the processor built into the reading pen is below computing-power threshold two but above computing-power threshold one, extracting the feature points with the key-point detection algorithm specifically comprises:
step 1: picking a point P in the image and drawing a circle of radius 3 pixels centered on P; if the gray values of n consecutive pixels on the circle are all larger, or all smaller, than the gray value of P, taking P to be a feature point;
step 2: training a decision tree with the ID3 algorithm, feeding the 16 circle pixels of each candidate feature point into the tree, and screening out the optimal feature points;
step 3: applying non-maximum suppression to remove locally over-dense feature points;
step 4: setting a scale factor and the number of pyramid layers, shrinking the original image by the scale factor into several images, and taking the union of the feature points extracted from the images at different scales as the image's feature points, thereby achieving scale invariance;
step 5: computing, using image moments, the centroid within a radius r of each feature point, and taking the vector from the feature point's coordinates to the centroid as the feature point's direction, thereby achieving rotation invariance.
11. The interactive reading implementation method of claim 4, wherein when the computing power of the processor built into the reading pen is below computing-power threshold one, extracting the feature points with the key-point detection algorithm specifically comprises:
performing the feature point extraction with a wirelessly connected external processor.
12. The interactive reading implementation method of claim 1, wherein step S1 specifically comprises the following sub-steps:
S11. for the cover page of the printed matter, extracting feature points from the cover-page image, applying hash transformation and sorting to the feature descriptors, and storing them in the cover-page feature library;
S12. for the content pages of the printed matter, first dividing each content-page image into a group of image blocks, the division methods comprising uniform division and selected-region division, then extracting feature points from the image blocks, and finally applying hash transformation and sorting to the feature descriptors and storing them in the content-page feature library.
13. The interactive reading implementation method of claim 1, wherein the matching of the extracted feature points against the cover-page feature library in step S3 is realized by:
applying hash transformation and sorting to the feature descriptors of the feature points extracted from the local image, then comparing each hash value with the hash values of the feature points stored in the cover-page feature library, a pair of feature points being deemed matched if the distance is smaller than a preset first threshold; and
counting the matched feature points, the local image being deemed to match the corresponding cover-page image if their number exceeds a preset second threshold.
14. The interactive reading implementation method of claim 1, wherein the matching of the extracted feature points against the content-page feature library in step S5 is realized by:
applying hash transformation and sorting to the feature descriptors of the feature points extracted from the region image, then comparing each hash value with the hash values of the feature points stored in the content-page feature library, a pair of feature points being deemed matched if the distance is smaller than a preset first threshold; and
counting the matched feature points, the region image being deemed to match the corresponding image block if their number exceeds a preset second threshold.
Application CN201911097147.6A, priority date 2019-11-11, filing date 2019-11-11: Interactive reading realization method. Active; granted as CN110765997B.

Priority Applications (1)

CN201911097147.6A (granted as CN110765997B): priority date 2019-11-11, filing date 2019-11-11, Interactive reading realization method


Publications (2)

Publication Number / Publication Date:
CN110765997A: 2020-02-07
CN110765997B: 2023-12-26

Family

ID=69337435

Family Applications (1)

CN201911097147.6A: Interactive reading realization method; priority date 2019-11-11, filing date 2019-11-11; Active, granted as CN110765997B

Country Status (1)

CN: CN110765997B

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104281847A (en) * 2013-07-12 2015-01-14 步步高教育电子有限公司 Point reading method, device and equipment
CN106599028A (en) * 2016-11-02 2017-04-26 华南理工大学 Book content searching and matching method based on video image processing


Also Published As

Publication number Publication date
CN110765997B (en) 2023-12-26

Similar Documents

Publication Publication Date Title
Liu et al. Content-based image retrieval using color difference histogram
Yao et al. A new pedestrian detection method based on combined HOG and LSS features
CN102938065B (en) Face feature extraction method and face identification method based on large-scale image data
Hua et al. Content-based image retrieval using color volume histograms
CN110555435B (en) Point-reading interaction realization method
US20190180094A1 (en) Document image marking generation for a training set
CN102054178A (en) Chinese painting image identifying method based on local semantic concept
EP2356583B1 (en) Method and system for analysing an image generated by at least one camera
Chu et al. Image Retrieval Based on a Multi‐Integration Features Model
WO2011044058A2 (en) Detecting near duplicate images
CN110569818A (en) intelligent reading learning method
Lim et al. Scene recognition with camera phones for tourist information access
Su et al. Robust video fingerprinting based on visual attention regions
CN110796119A (en) Interactive reading implementation method
CN110991371A (en) Intelligent reading learning method based on coordinate recognition
CN110765997B (en) Interactive reading realization method
Zhu et al. Color orthogonal local binary patterns combination for image region description
Groeneweg et al. A fast offline building recognition application on a mobile telephone
Lian et al. Fast pedestrian detection using a modified WLD detector in salient region
CN110647844A (en) Shooting and identifying method for articles for children
Cheng et al. Visual object retrieval via block-based visual-pattern matching
Chen et al. Fast scene recognition based on saliency region and surf
Feryanto et al. Location recognition using detected objects in an image
Aarthi et al. Saliency based modified chamfers matching method for sketch based image retrieval
KR100983779B1 (en) Book information service apparatus and method thereof

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
TA01: Transfer of patent application right
Effective date of registration: 2023-12-22
Address after: 1406, 14th Floor, Building 2, No.1 Courtyard, Shangdi 10th Street, Haidian District, Beijing, 100080
Applicant after: Beijing Anxin Zhitong Technology Co.,Ltd.
Address before: Room 403, C4, building 2, software industry base, No. 87, 89, 91, South 10th Road, Gaoxin, Binhai community, Yuehai street, Nanshan District, Shenzhen, Guangdong 518000
Applicant before: Shenzhen yikuai Interactive Network Technology Co.,Ltd.