CN113450361A - Crawler image processing method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN113450361A
Authority
CN
China
Prior art keywords
picture
size
preset
original
compressed picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110528247.0A
Other languages
Chinese (zh)
Other versions
CN113450361B (en)
Inventor
宁林林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yingxin Computer Technology Co Ltd
Original Assignee
Shandong Yingxin Computer Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yingxin Computer Technology Co Ltd
Priority to CN202110528247.0A
Publication of CN113450361A
Application granted
Publication of CN113450361B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Geometry (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Television Signal Processing For Recording (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a crawler image processing method and device, computer equipment and a storage medium. The method comprises the following steps: crawling pictures on an internet webpage by using a web crawler and loading the pictures into a memory to obtain original pictures; compressing and adjusting the original picture based on the occupied byte amount and size of the original picture, a preset byte amount and a preset size to generate a compressed picture, and storing the compressed picture in a release directory; and loading the compressed picture into the memory and performing a thumbnail cropping operation to generate a thumbnail, and storing the thumbnail correspondingly in the release directory. The scheme automatically compresses oversized pictures, saves local application storage space, makes the file size limit used during compression configurable, and automatically generates thumbnails of uniform size.

Description

Crawler image processing method and device, computer equipment and storage medium
Technical Field
The invention relates to the technical field of crawler image processing, in particular to a crawler image processing method and device, computer equipment and a storage medium.
Background
Content aggregation internet applications are often constrained by application scale, network resources and storage cost, so when their own original content is limited they reprint and publish content from other applications in the same industry. Besides manual reprinting, a crawler system is the most common content reprinting tool. By efficiently reprinting text, multimedia and other information published on the internet by applications in the same industry, a crawler system can effectively enrich the content of a local application and attract more users. The crawler system crawls webpage data from the internet, stores the data, and then makes further use of it. A webpage contains different types of data such as text, pictures, audio and video; because these types are loaded in different ways within the webpage, the crawler system crawls and stores each type separately.
At present, a traditional crawler system downloads webpage data to the local machine and then publishes and displays it directly in the local application. However, pictures crawled from multiple target websites differ in dimensions and file size, and the number of pictures on a target page is uncontrolled (some pages may contain hundreds of thousands of pictures). If such pictures are published directly on the homepage of the local application system, the following problems arise: (1) If a picture is displayed at its original size, it may occupy a large amount of page space and spoil the appearance of the page. (2) If the display size of the picture is constrained, the original picture is stretched and distorted. (3) Some picture files are too large and waste hard disk space; for example, a high-definition picture can occupy several tens of megabytes (MB), and the waste is especially noticeable when many pictures are crawled. (4) Without thumbnails, the list page cannot show a preview of a news item. The stretching, distortion and wasted-space problems can be solved in a traditional crawler system by manual processing after the pictures are crawled, but an art designer must first crop each picture before a website editor can reprint and publish it. This consumes considerable manpower and is inefficient, so an improvement is urgently needed.
Disclosure of Invention
In view of the above, there is a need to provide a crawler image processing method, apparatus, computer device and storage medium.
According to a first aspect of the present invention, there is provided a crawler image processing method, the method including:
crawling pictures on an internet webpage by using a web crawler and loading the pictures into a memory to obtain original pictures;
compressing and adjusting the original picture based on the occupied byte amount and size of the original picture, the preset byte amount and the preset size to generate a compressed picture and storing the compressed picture in a release directory;
and loading the compressed picture into a memory and carrying out thumbnail cutting operation to generate a thumbnail and correspondingly storing the thumbnail to the publishing directory.
In one embodiment, the step of performing compression adjustment on the original picture based on the occupied byte amount and size of the original picture, the preset byte amount, and the preset size to generate a compressed picture, and storing the compressed picture in the distribution directory includes:
performing pixel adjustment on an original picture based on the occupied byte quantity and the preset byte quantity of the original picture to generate a first compressed picture;
and carrying out size adjustment on the first compressed picture based on the size and the preset size of the first compressed picture to generate a second compressed picture and storing the second compressed picture in a release directory.
In one embodiment, the step of performing pixel adjustment on the original picture based on the occupied byte amount and the preset byte amount of the original picture to generate a first compressed picture comprises:
calculating the file capacity of the original picture to obtain the occupied byte quantity;
comparing the byte quantity occupied by the original picture with a preset byte quantity;
and in response to the occupied byte amount of the original picture exceeding the preset byte amount, adjusting the pixels of the original picture to a preset value to generate the first compressed picture.
In one embodiment, the resizing the first compressed picture based on the size of the first compressed picture and a preset size to generate a second compressed picture and storing the second compressed picture in a distribution directory includes:
calculating the transverse size and the longitudinal size of the first compressed picture, and the ratio of the transverse size to the longitudinal size to obtain an original ratio;
in response to that the transverse size is larger than a first preset value and the longitudinal size is smaller than or equal to a second preset value, modifying the transverse size of the first compressed picture into the first preset value, and adjusting the longitudinal size according to the original proportion;
in response to that the longitudinal size is larger than a second preset value and the transverse size is smaller than or equal to a first preset value, modifying the longitudinal size of the first compressed picture into the second preset value, and adjusting the transverse size according to the original proportion;
in response to that the transverse size is larger than a first preset value, the longitudinal size is larger than a second preset value, and the original proportion is larger than 1, modifying the transverse size of the first compressed picture into the first preset value, and adjusting the longitudinal size according to the original proportion;
in response to that the transverse size is larger than a first preset value, the longitudinal size is larger than a second preset value, and the original proportion is smaller than 1, modifying the longitudinal size of the first compressed picture into the second preset value, and adjusting the transverse size according to the original proportion;
and taking the first compressed picture after the size adjustment as the second compressed picture and writing the second compressed picture into a release directory.
In one embodiment, the step of loading the compressed picture into the memory and performing thumbnail cropping operation to generate a thumbnail and correspondingly storing the thumbnail in the distribution directory includes:
loading a second compressed picture into a memory, and calculating the transverse size and the longitudinal size of the second compressed picture;
in response to the fact that the original proportion is larger than the preset horizontal-vertical proportion, calculating a new horizontal size according to the preset horizontal-vertical proportion by taking the vertical length of the second compressed picture as a standard, and horizontally cutting the second compressed picture according to the newly calculated new horizontal size;
in response to the original proportion being smaller than the preset horizontal-vertical proportion, calculating a new longitudinal size according to the preset horizontal-vertical proportion by taking the horizontal length of the second compressed picture as a standard, and cutting the longitudinal direction of the second compressed picture according to the newly calculated new longitudinal size;
and adjusting the size of the cut second compressed picture according to the preset transverse size or the preset longitudinal size to generate a thumbnail and write the thumbnail into the release directory.
In one embodiment, the method further comprises:
establishing a mapping relation between the thumbnails in the release directory and the second compressed pictures;
displaying the thumbnail in the release directory in response to receiving the release command;
and in response to receiving a display command of a certain thumbnail, determining a second compressed picture corresponding to the certain thumbnail based on the mapping relation between the thumbnail and the second compressed picture and displaying the second compressed picture.
In one embodiment, the method further comprises:
adjusting the generated second compressed picture and the thumbnail into a preset format before storing the second compressed picture and the thumbnail into a release directory;
wherein the preset format is a jpeg format and/or a png format.
According to a second aspect of the present invention, there is provided a crawler picture processing apparatus, the apparatus comprising:
the crawling module is used for crawling pictures on internet webpages by using a web crawler and loading the pictures into the memory to obtain original pictures;
the compression module is used for compressing and adjusting the original picture based on the occupied byte amount and size of the original picture, the preset byte amount and the preset size to generate a compressed picture and storing the compressed picture in the release directory;
and the thumbnail module is used for loading the compressed picture into the memory and carrying out thumbnail cutting operation to generate a thumbnail and correspondingly storing the thumbnail to the publishing directory.
According to a third aspect of the present invention, there is also provided a computer apparatus comprising:
at least one processor; and
a memory storing a computer program executable on the processor, wherein the processor performs the aforementioned crawler image processing method when executing the program.
According to a fourth aspect of the present invention, there is also provided a computer-readable storage medium storing a computer program which, when executed by a processor, performs the aforementioned crawler picture processing method.
In the above crawler image processing method, a web crawler crawls pictures on an internet webpage and the pictures are loaded into a memory to obtain original pictures; each original picture is compressed and adjusted based on its occupied byte amount and size, a preset byte amount and a preset size to generate a compressed picture, which is stored in a release directory; the compressed picture is then loaded into the memory and a thumbnail cropping operation is performed to generate a thumbnail, which is stored correspondingly in the release directory. Oversized pictures are thus compressed automatically and local application storage space is saved. The file size limit used during compression is parameterized, so the parameters can be set flexibly according to the display requirements of the local application and the hard disk capacity. Thumbnails are generated automatically, and after this processing the pictures crawled by the crawler system can be published and displayed directly on the homepage of the local application without manual modification, which significantly improves the efficiency of crawler picture processing.
In addition, the invention also provides a crawler image processing device, a computer device and a computer readable storage medium, which can also achieve the technical effects and are not repeated herein.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below obviously show only some embodiments of the present invention, and those skilled in the art can obtain other embodiments from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a crawler image processing method according to an embodiment of the present invention;
Fig. 2 is a schematic view of a crawler image processing flow according to another embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a crawler image processing apparatus according to another embodiment of the present invention;
Fig. 4 is an internal structural view of a computer device according to another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two different entities or two different parameters that share the same name. "First" and "second" are used merely for convenience of description and should not be construed as limiting the embodiments of the present invention; this is not explained again in the following embodiments.
In an embodiment, referring to fig. 1, the present invention provides a method for processing a crawler image, including the following steps:
S100, crawling pictures on an internet webpage by using a web crawler and loading the pictures into a memory to obtain original pictures;
S200, compressing and adjusting the original picture based on the occupied byte amount and size of the original picture, a preset byte amount and a preset size to generate a compressed picture, and storing the compressed picture in a release directory;
It should be noted that both the preset byte amount and the preset size are parameters that can be set flexibly according to the display requirements of the local application and the hard disk capacity. For example, when the storage capacity is large, the preset byte amount can be set relatively high and the preset size relatively large; neither is limited to a specific value.
S300, loading the compressed picture into the memory, performing a thumbnail cropping operation to generate a thumbnail, and storing the thumbnail correspondingly in the release directory.
In the above crawler image processing method, a web crawler crawls pictures on an internet webpage and the pictures are loaded into a memory to obtain original pictures; each original picture is compressed and adjusted based on its occupied byte amount and size, a preset byte amount and a preset size to generate a compressed picture, which is stored in a release directory; the compressed picture is then loaded into the memory and a thumbnail cropping operation is performed to generate a thumbnail, which is stored correspondingly in the release directory. Oversized pictures are thus compressed automatically and local application storage space is saved. The file size limit used during compression is parameterized, so the parameters can be set flexibly according to the display requirements of the local application and the hard disk capacity. Thumbnails are generated automatically, and after this processing the pictures crawled by the crawler system can be published and displayed directly on the homepage of the local application without manual modification, which significantly improves the efficiency of crawler picture processing.
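As a minimal sketch of how the configurable presets described above might be grouped, the following C fragment collects the preset byte amount and preset sizes into one parameter set. The struct name, field names, and the rule that the thumbnail must fit inside the size limits are illustrative assumptions and are not taken from the patent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parameter set; names and layout are assumptions made
 * for illustration only. */
struct crawler_image_config {
    size_t max_bytes;     /* preset byte amount for a single picture */
    int    max_width;     /* first preset value (transverse size), pixels */
    int    max_height;    /* second preset value (longitudinal size), pixels */
    int    thumb_width;   /* preset thumbnail transverse size, pixels */
    int    thumb_height;  /* preset thumbnail longitudinal size, pixels */
};

/* Assumed sanity rule: every limit is positive and the thumbnail is
 * no larger than the size limits for the compressed picture. */
static bool crawler_config_is_valid(const struct crawler_image_config *cfg)
{
    return cfg != NULL
        && cfg->max_bytes > 0
        && cfg->max_width > 0 && cfg->max_height > 0
        && cfg->thumb_width > 0 && cfg->thumb_height > 0
        && cfg->thumb_width <= cfg->max_width
        && cfg->thumb_height <= cfg->max_height;
}

/* Comparison of the occupied byte amount with the preset byte amount:
 * true only when the file is strictly larger than the preset. */
static bool exceeds_byte_limit(size_t file_bytes,
                               const struct crawler_image_config *cfg)
{
    return file_bytes > cfg->max_bytes;
}
```

An administrator with ample storage could raise `max_bytes`, `max_width` and `max_height` in such a structure without touching the processing code, which is the flexibility the note on preset parameters describes.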
In another embodiment, the foregoing step S200 specifically includes the following sub-steps:
S210, performing pixel adjustment on the original picture based on the occupied byte amount of the original picture and the preset byte amount to generate a first compressed picture;
Preferably, the first compressed picture is generated as follows:
S211, calculating the file capacity of the original picture to obtain the occupied byte amount;
S212, comparing the occupied byte amount of the original picture with the preset byte amount;
S213, in response to the occupied byte amount of the original picture exceeding the preset byte amount, adjusting the pixels of the original picture to a preset value to generate the first compressed picture.
S220, resizing the first compressed picture based on the size of the first compressed picture and the preset size to generate a second compressed picture, and storing the second compressed picture in the release directory.
Preferably, the second compressed picture is generated as follows:
S221, calculating the transverse size and the longitudinal size of the first compressed picture, and taking the ratio of the transverse size to the longitudinal size as the original proportion;
S222, in response to the transverse size being larger than a first preset value and the longitudinal size being smaller than or equal to a second preset value, modifying the transverse size of the first compressed picture to the first preset value, and adjusting the longitudinal size according to the original proportion;
S223, in response to the longitudinal size being larger than the second preset value and the transverse size being smaller than or equal to the first preset value, modifying the longitudinal size of the first compressed picture to the second preset value, and adjusting the transverse size according to the original proportion;
S224, in response to the transverse size being larger than the first preset value, the longitudinal size being larger than the second preset value, and the original proportion being larger than 1, modifying the transverse size of the first compressed picture to the first preset value, and adjusting the longitudinal size according to the original proportion;
S225, in response to the transverse size being larger than the first preset value, the longitudinal size being larger than the second preset value, and the original proportion being smaller than 1, modifying the longitudinal size of the first compressed picture to the second preset value, and adjusting the transverse size according to the original proportion;
S226, taking the resized first compressed picture as the second compressed picture and writing it into the release directory.
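The size decisions in S221 to S225 reduce to integer arithmetic on the picture dimensions. The following is a minimal sketch under stated assumptions: the names `image_size`, `scale_dim` and `fit_to_presets` are hypothetical, rounding to the nearest pixel is a choice made here, and the case where the original proportion equals exactly 1 (which the steps do not cover) is handled like S225.

```c
#include <stdint.h>

struct image_size { int width; int height; };

/* Scale `value` by num/den, rounding to nearest, never below 1 pixel. */
static int scale_dim(int value, int num, int den)
{
    int64_t scaled = ((int64_t)value * num + den / 2) / den;
    return scaled < 1 ? 1 : (int)scaled;
}

/* Apply the S222-S225 rules; max_w and max_h stand for the first and
 * second preset values. Pictures within both limits are returned as is. */
struct image_size fit_to_presets(struct image_size in, int max_w, int max_h)
{
    struct image_size out = in;
    int too_wide = in.width > max_w;
    int too_tall = in.height > max_h;

    if (too_wide && !too_tall) {                 /* S222 */
        out.width = max_w;
        out.height = scale_dim(in.height, max_w, in.width);
    } else if (too_tall && !too_wide) {          /* S223 */
        out.height = max_h;
        out.width = scale_dim(in.width, max_h, in.height);
    } else if (too_wide && too_tall) {
        if (in.width > in.height) {              /* S224: proportion > 1 */
            out.width = max_w;
            out.height = scale_dim(in.height, max_w, in.width);
        } else {                                 /* S225: proportion < 1;
                                                    == 1 is an assumption */
            out.height = max_h;
            out.width = scale_dim(in.width, max_h, in.height);
        }
    }
    return out;
}
```

Following S224 literally, a 4000x3000 picture with presets of 1000 and 500 becomes 1000x750, so the longitudinal size can still exceed the second preset value; an implementation that needs both limits enforced would have to add a further check, which the steps as written do not include.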
In another embodiment, the foregoing step S300 specifically includes the following sub-steps:
S310, loading the second compressed picture into the memory, and calculating the transverse size and the longitudinal size of the second compressed picture;
S320, in response to the original proportion being larger than the preset horizontal-vertical proportion, calculating a new transverse size according to the preset horizontal-vertical proportion by taking the longitudinal size of the second compressed picture as the reference, and cropping the second compressed picture in the transverse direction to the new transverse size;
S330, in response to the original proportion being smaller than the preset horizontal-vertical proportion, calculating a new longitudinal size according to the preset horizontal-vertical proportion by taking the transverse size of the second compressed picture as the reference, and cropping the second compressed picture in the longitudinal direction to the new longitudinal size;
S340, resizing the cropped second compressed picture according to the preset transverse size or the preset longitudinal size to generate a thumbnail, and writing the thumbnail into the release directory.
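The crop computation in the thumbnail steps above can be sketched as follows. This is an illustration only: `crop_rect` and `center_crop_for_ratio` are hypothetical names, the proportions are compared by cross-multiplication to avoid floating point, and the crop is centered, in line with the later worked example, which takes the center of the picture as the cropping reference. The final resize to the preset thumbnail size would be done by an image library and is not shown.

```c
#include <stdint.h>

struct crop_rect { int x; int y; int width; int height; };

/* Compute the region of a src_w x src_h picture to keep so that it has
 * the same proportion as a thumb_w x thumb_h thumbnail. */
struct crop_rect center_crop_for_ratio(int src_w, int src_h,
                                       int thumb_w, int thumb_h)
{
    struct crop_rect r = {0, 0, src_w, src_h};
    /* src_w/src_h compared with thumb_w/thumb_h without division */
    int64_t lhs = (int64_t)src_w * thumb_h;
    int64_t rhs = (int64_t)thumb_w * src_h;

    if (lhs > rhs) {
        /* Picture is proportionally wider: keep the full longitudinal
         * size and derive a new, smaller transverse size. */
        r.width = (int)(((int64_t)src_h * thumb_w) / thumb_h);
        if (r.width < 1) r.width = 1;
        r.x = (src_w - r.width) / 2;
    } else if (lhs < rhs) {
        /* Picture is proportionally taller: keep the full transverse
         * size and derive a new, smaller longitudinal size. */
        r.height = (int)(((int64_t)src_w * thumb_h) / thumb_w);
        if (r.height < 1) r.height = 1;
        r.y = (src_h - r.height) / 2;
    }
    /* Equal proportions: nothing is cropped. */
    return r;
}
```

For a 1000x500 picture and a 200x150 thumbnail, this keeps a 666x500 region starting 167 pixels from the left edge; that region is then scaled down to the thumbnail size.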
In a further embodiment, on the basis of the previous embodiment, the method further comprises:
S410, establishing a mapping relation between the thumbnails in the release directory and the second compressed pictures;
S420, in response to receiving a release command, displaying the thumbnails in the release directory;
S430, in response to receiving a display command for a given thumbnail, determining the second compressed picture corresponding to that thumbnail based on the mapping relation between thumbnails and second compressed pictures, and displaying that second compressed picture.
For example, both the second compressed picture and the thumbnail are stored in the release directory. For convenient display, the thumbnails are what is shown when the pictures in the release directory are presented. Because each thumbnail is produced by further processing a second compressed picture, every thumbnail necessarily has exactly one corresponding second compressed picture. When a user wants to view the content behind a thumbnail, the corresponding second compressed picture is displayed so that the user can examine it in detail.
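The patent does not say how the mapping relation is stored. One simple possibility, shown here purely as an assumption, is a file naming convention in which the thumbnail carries a `_thumb` suffix before the extension, so the mapping can be computed in both directions without a lookup table. The function names and the suffix are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define THUMB_SUFFIX "_thumb"   /* hypothetical naming convention */

/* Pointer to the extension dot in the last path component, or to the
 * terminating NUL if the file name has no extension. */
static const char *find_ext(const char *path)
{
    const char *slash = strrchr(path, '/');
    const char *name = slash ? slash + 1 : path;
    const char *dot = strrchr(name, '.');
    return (dot && dot != name) ? dot : path + strlen(path);
}

/* "pub/a.jpg" -> "pub/a_thumb.jpg". Returns 0 on success, -1 if the
 * output buffer is too small. */
int thumbnail_path_for(const char *picture, char *out, size_t out_size)
{
    const char *ext = find_ext(picture);
    int n = snprintf(out, out_size, "%.*s%s%s",
                     (int)(ext - picture), picture, THUMB_SUFFIX, ext);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Inverse mapping: "pub/a_thumb.jpg" -> "pub/a.jpg". Returns -1 if the
 * name does not follow the convention or the buffer is too small. */
int picture_path_for(const char *thumb, char *out, size_t out_size)
{
    const char *ext = find_ext(thumb);
    size_t stem = (size_t)(ext - thumb);
    size_t suf = strlen(THUMB_SUFFIX);
    if (stem < suf || strncmp(ext - suf, THUMB_SUFFIX, suf) != 0)
        return -1;
    int n = snprintf(out, out_size, "%.*s%s",
                     (int)(stem - suf), thumb, ext);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With such a convention, handling a display command for a thumbnail amounts to calling `picture_path_for` on the thumbnail's path; a database table keyed by thumbnail would serve equally well and is just as consistent with the text.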
In a further embodiment, on the basis of the previous embodiment, the method further comprises:
S500, converting the generated second compressed picture and the thumbnail into a preset format before storing them in the release directory;
wherein the preset format is a jpeg format and/or a png format.
In another embodiment, to facilitate understanding of the technical solution of the present invention, please refer to fig. 2. The complete processing of one crawler picture is described in detail below as an example; the crawler picture may be a picture from any webpage. The specific processing steps are as follows:
Step one, downloading the original picture and performing the following operations:
a) Load the picture data into the memory.
b) Calculate the file size of the original picture.
c) If the picture file is larger than the maximum size of a single picture preset by the system (for example, the maximum can be set to 1M), go to step i.
i. Reduce the image quality proportionally by the percentage preset by the system, compressing the file size.
ii. Calculate the file size of the picture again and return to step b).
d) If the picture file is smaller than the maximum size of a single picture preset by the system, go to step e).
e) Clear any temporary files generated in the steps above and go to Step two.
Step two, calculating the dimensions of the picture.
a) If the transverse size of the picture is larger than the transverse preset value and the longitudinal size is smaller than the longitudinal preset value, calculate the proportion of the original picture, set the transverse size to the transverse preset value, reduce the longitudinal size in equal proportion, and go to Step three.
b) If the longitudinal size of the picture is larger than the longitudinal preset value and the transverse size is smaller than the transverse preset value, calculate the proportion of the original picture, set the longitudinal size to the longitudinal preset value, reduce the transverse size in equal proportion, and go to Step three.
c) If both the transverse and longitudinal sizes of the picture are larger than their corresponding preset values, calculate the horizontal-vertical proportion and reset the size accordingly.
d) If both the transverse and longitudinal sizes of the picture are smaller than their corresponding preset values, go directly to Step three.
Step three, converting the picture data into the specified encoding according to the picture format preset by the system (such as jpeg or png), and writing the corrected picture data from the memory into the specified release directory.
Step four, loading the picture obtained in Step three into the memory, calculating the horizontal-vertical proportion of the picture, cropping the picture, resetting its size and generating a thumbnail. Because the focus of most content pictures lies at the center of the picture, the thumbnail should be based on the center of the picture, and the center should be used as the reference when cropping. Processing depends on the picture dimensions:
a) If the picture is proportionally wider than the thumbnail proportion specified by the system, calculate the transverse size from the preset proportion using the longitudinal size of the picture as the reference, and crop the picture at the calculated coordinates so that it conforms to the thumbnail size set by the system.
b) If the picture is proportionally taller than the thumbnail proportion specified by the system, calculate the longitudinal size from the preset proportion using the transverse size of the picture as the reference, and crop the picture at the calculated coordinates so that it conforms to the thumbnail size set by the system.
c) If the horizontal-vertical proportion of the picture is the same as that of the thumbnail size specified by the system, the picture can be resized directly to the thumbnail size specified by the system.
Step five, converting the thumbnail data into the specified encoding according to the picture format preset by the system (such as jpeg or png), and storing the thumbnail data from the memory into the specified release directory.
Specifically, Step one and Step five can be implemented in the C language; see the following code for details:
[Figures BDA0003066937120000101, BDA0003066937120000111 and BDA0003066937120000121: C code listings, present in the original only as images]
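Independently of the patent's own C listing, the compress-and-recheck loop of Step one (items b, c, i and ii) can be sketched as below. Everything here is an assumption made for illustration: the encoder is abstracted as a caller-supplied callback that reports the encoded file size for a given quality, `demo_encode_size` is a stand-in linear model rather than a real image encoder, and the `min_quality` floor is added so the loop always terminates.

```c
#include <stddef.h>

/* Hypothetical encoder hook: returns the encoded file size in bytes
 * for a quality setting in the range 1..100. */
typedef size_t (*encode_size_fn)(int quality, void *ctx);

/* Lower the quality by `percent` (for example 80 means "keep 80%")
 * until the encoded size fits within max_bytes. Returns the chosen
 * quality, or min_quality if the limit cannot be met. */
int pick_quality(encode_size_fn encode, void *ctx, size_t max_bytes,
                 int start_quality, int percent, int min_quality)
{
    int quality = start_quality;
    while (encode(quality, ctx) > max_bytes && quality > min_quality) {
        int next = quality * percent / 100;        /* step i */
        if (next >= quality)
            next = quality - 1;                    /* guarantee progress */
        quality = next < min_quality ? min_quality : next;
        /* step ii: the loop condition measures the size again */
    }
    return quality;
}

/* Stand-in for a real encoder: pretends that file size grows linearly
 * with quality. ctx points to the size in bytes at quality 100. */
size_t demo_encode_size(int quality, void *ctx)
{
    const size_t *full = ctx;
    return (size_t)((unsigned long long)*full * (unsigned)quality / 100u);
}
```

With the stand-in model, a 4,000,000-byte picture, a 1,048,576-byte limit and a percentage of 80, the quality steps down through 80, 64, 51, 40 and 32 before settling at 25. A real implementation would re-encode the picture at each step, for instance with a JPEG library, and measure the actual output size.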
After each crawled picture has been processed in this way, the content captured by the crawler meets users' browsing requirements and can be published directly together with the text content. The crawler system can therefore store each piece of crawled content directly in the release directory and publish it automatically once crawling is complete.
The method of the invention at least has the following beneficial technical effects:
(1) With this scheme, when the crawler system crawls content, it resizes pictures and compresses and dumps the files according to parameters preset by a system administrator, generates thumbnails automatically, and automatically cleans up original picture files that are no longer needed during processing. Completing the crawl is therefore equivalent to completing the reprint: no manual intervention is required, manpower is saved, and content can be reprinted quickly, and the timeliness of information propagation helps maintain user stickiness.
(2) After the crawler crawls a picture, the C language code can compress it immediately. This spreads out the CPU load of picture compression and avoids the excessive CPU load caused by compressing pictures in a concentrated batch.
(3) In terms of saving hard disk capacity, when distributed crawlers target picture sites, or when massive distributed crawlers crawl large amounts of content containing pictures, the compress-and-dump technique significantly reduces the burden on hard disk space, and the configurable parameters allow the working time of the crawler system to be extended flexibly.
In another embodiment, referring to fig. 3, the present invention further provides a crawler image processing apparatus 60, including:
the crawling module 61, configured to crawl pictures on internet webpages by using a web crawler and load the pictures into a memory to obtain original pictures;
the compression module 62, configured to perform compression adjustment on the original picture based on the occupied byte amount and size of the original picture, a preset byte amount, and a preset size to generate a compressed picture, and to store the compressed picture in the release directory;
and the thumbnail module 63, configured to load the compressed picture into the memory and perform a thumbnail cropping operation to generate a thumbnail, and to store the thumbnail in the publishing directory in correspondence with the compressed picture.
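The size adjustment performed by the compression module 62 might look roughly like the following C sketch, which follows the method's case distinction on the transverse and longitudinal sizes against the first and second preset values. The names Size, scale_dim and fit_to_preset are hypothetical, and the handling of an exactly square picture that exceeds both preset values is an assumption, since the method only covers proportions larger or smaller than 1.

```c
/* Picture dimensions in pixels. */
typedef struct {
    int w, h;
} Size;

/* Scale `value` by num/den, rounding to nearest, never returning less than 1. */
static int scale_dim(int value, int num, int den)
{
    long long scaled = ((long long)value * num + den / 2) / den;
    return scaled < 1 ? 1 : (int)scaled;
}

/*
 * Apply the size rule: max_w is the first preset value (transverse limit),
 * max_h the second preset value (longitudinal limit). The other dimension
 * is recomputed from the original proportion w/h.
 */
Size fit_to_preset(int w, int h, int max_w, int max_h)
{
    Size out = {w, h};
    int too_wide = w > max_w;
    int too_tall = h > max_h;

    if (!too_wide && !too_tall)
        return out;                 /* already within both preset values */

    /* Limit the width when only the width is too large, or when both are
     * too large and the proportion w/h is greater than 1. The square case
     * (w == h) is not covered by the source; treating it like w/h > 1 is
     * an assumption. */
    if (too_wide && (!too_tall || w >= h)) {
        out.w = max_w;
        out.h = scale_dim(h, max_w, w);
    } else {
        out.h = max_h;
        out.w = scale_dim(w, max_h, h);
    }
    return out;
}
```

For example, with preset values of 800 and 600, a 2000 x 1000 picture would become 800 x 400 and a 500 x 1500 picture would become 200 x 600.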
It should be noted that, for specific limitations of the crawler image processing apparatus, reference may be made to the above limitations on the crawler image processing method, which are not repeated here. The modules in the crawler image processing apparatus can be implemented wholly or partially by software, hardware, or a combination thereof. The modules can be embedded in, or be independent of, a processor in the computer device in hardware form, or can be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
According to another aspect of the present invention, a computer device is provided, and the computer device may be a server, and its internal structure is shown in fig. 4. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing data. The network interface of the computer device is used for communicating with an external terminal through a network connection. When executed by a processor, the computer program implements the above-mentioned crawler image processing method, and specifically, the method includes the following steps:
crawling pictures on an internet webpage by using a web crawler and loading the pictures into a memory to obtain original pictures;
compressing and adjusting the original picture based on the occupied byte amount and size of the original picture, the preset byte amount and the preset size to generate a compressed picture and storing the compressed picture in a release directory;
and loading the compressed picture into a memory and carrying out thumbnail cutting operation to generate a thumbnail and correspondingly storing the thumbnail to the publishing directory.
According to still another aspect of the present invention, there is also provided a computer-readable storage medium storing a computer program which, when executed by a processor, performs the foregoing crawler picture processing method, specifically, the method includes the steps of:
crawling pictures on an internet webpage by using a web crawler and loading the pictures into a memory to obtain original pictures;
compressing and adjusting the original picture based on the occupied byte amount and size of the original picture, the preset byte amount and the preset size to generate a compressed picture and storing the compressed picture in a release directory;
and loading the compressed picture into a memory and carrying out thumbnail cutting operation to generate a thumbnail and correspondingly storing the thumbnail to the publishing directory.
It will be understood by those skilled in the art that all or part of the processes of the methods in the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), SyncLink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described, but any combination should be considered within the scope of this specification as long as it involves no contradiction.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A crawler image processing method is characterized by comprising the following steps:
crawling pictures on an internet webpage by using a web crawler and loading the pictures into a memory to obtain original pictures;
compressing and adjusting the original picture based on the occupied byte amount and size of the original picture, the preset byte amount and the preset size to generate a compressed picture and storing the compressed picture in a release directory;
and loading the compressed picture into a memory and carrying out thumbnail cutting operation to generate a thumbnail and correspondingly storing the thumbnail to the publishing directory.
2. The crawler image processing method according to claim 1, wherein the step of compressing and adjusting the original image based on the occupied byte amount and size, the preset byte amount, and the preset size of the original image to generate the compressed image and storing the compressed image in the publishing directory comprises:
performing pixel adjustment on an original picture based on the occupied byte quantity and the preset byte quantity of the original picture to generate a first compressed picture;
and carrying out size adjustment on the first compressed picture based on the size and the preset size of the first compressed picture to generate a second compressed picture and storing the second compressed picture in a release directory.
3. The crawler picture processing method according to claim 2, wherein the step of performing pixel adjustment on the original picture based on the occupied byte amount and the preset byte amount of the original picture to generate the first compressed picture comprises:
calculating the file capacity of the original picture to obtain the occupied byte quantity;
comparing the byte quantity occupied by the original picture with a preset byte quantity;
and in response to the occupied byte amount of the original picture exceeding the preset byte amount, adjusting the pixels of the original picture to a preset value to generate the first compressed picture.
4. The crawler picture processing method according to claim 2, wherein the step of resizing the first compressed picture based on the size of the first compressed picture and a preset size to generate a second compressed picture and storing the second compressed picture in a distribution directory comprises:
calculating the transverse size and the longitudinal size of the first compressed picture, and the ratio of the transverse size to the longitudinal size to obtain an original ratio;
in response to that the transverse size is larger than a first preset value and the longitudinal size is smaller than or equal to a second preset value, modifying the transverse size of the first compressed picture into the first preset value, and adjusting the longitudinal size according to the original proportion;
in response to that the longitudinal size is larger than a second preset value and the transverse size is smaller than or equal to a first preset value, modifying the longitudinal size of the first compressed picture into the second preset value, and adjusting the transverse size according to the original proportion;
in response to that the transverse size is larger than a first preset value, the longitudinal size is larger than a second preset value, and the original proportion is larger than 1, modifying the transverse size of the first compressed picture into the first preset value, and adjusting the longitudinal size according to the original proportion;
in response to that the transverse size is larger than a first preset value, the longitudinal size is larger than a second preset value, and the original proportion is smaller than 1, modifying the longitudinal size of the first compressed picture into the second preset value, and adjusting the transverse size according to the original proportion;
and taking the first compressed picture after the size adjustment as the second compressed picture and writing the second compressed picture into a release directory.
5. The crawler image processing method according to claim 4, wherein the step of loading the compressed image into the memory and performing thumbnail cropping operation to generate a thumbnail and correspondingly storing the thumbnail in the publishing directory comprises:
loading a second compressed picture into a memory, and calculating the transverse size and the longitudinal size of the second compressed picture;
in response to the fact that the original proportion is larger than the preset horizontal-vertical proportion, calculating a new horizontal size according to the preset horizontal-vertical proportion by taking the vertical length of the second compressed picture as a standard, and horizontally cutting the second compressed picture according to the newly calculated new horizontal size;
in response to the original proportion being smaller than the preset horizontal-vertical proportion, calculating a new longitudinal size according to the preset horizontal-vertical proportion by taking the horizontal length of the second compressed picture as a standard, and cutting the longitudinal direction of the second compressed picture according to the newly calculated new longitudinal size;
and adjusting the size of the cut second compressed picture according to the preset transverse size or the preset longitudinal size to generate a thumbnail and write the thumbnail into the release directory.
6. The crawler picture processing method according to claim 5, further comprising:
establishing a mapping relation between the thumbnails in the release directory and the second compressed pictures;
displaying the thumbnail in the release directory in response to receiving the release command;
and in response to receiving a display command of a certain thumbnail, determining a second compressed picture corresponding to the certain thumbnail based on the mapping relation between the thumbnail and the second compressed picture and displaying the second compressed picture.
7. The crawler picture processing method according to any one of claims 2 to 6, wherein the method further comprises:
adjusting the generated second compressed picture and the thumbnail into a preset format before storing the second compressed picture and the thumbnail into a release directory;
wherein the preset format is a jpeg format and/or a png format.
8. A crawler picture processing apparatus, the apparatus comprising:
the crawling module is used for crawling pictures on internet webpages by using a web crawler and loading the pictures into the memory to obtain original pictures;
the compression module is used for compressing and adjusting the original picture based on the occupied byte amount and size of the original picture, the preset byte amount and the preset size to generate a compressed picture and storing the compressed picture in the release directory;
and the thumbnail module is used for loading the compressed picture into the memory and carrying out thumbnail cutting operation to generate a thumbnail and correspondingly storing the thumbnail to the publishing directory.
9. A computer device, comprising:
at least one processor; and
a memory storing a computer program operable in the processor, the processor when executing the program performing the method of any of claims 1-7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1 to 7.
CN202110528247.0A 2021-05-14 2021-05-14 Crawler image processing method and device, computer equipment and storage medium Active CN113450361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110528247.0A CN113450361B (en) 2021-05-14 2021-05-14 Crawler image processing method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113450361A (en) 2021-09-28
CN113450361B (en) 2022-08-19

Family

ID=77809745

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110528247.0A Active CN113450361B (en) 2021-05-14 2021-05-14 Crawler image processing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113450361B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514271A (en) * 2013-09-13 2014-01-15 北京奇虎科技有限公司 Method and device for providing thumbnail image corresponding to webpage content
CN103514272A (en) * 2013-09-13 2014-01-15 北京奇虎科技有限公司 Method and device for providing thumbnail corresponding to webpage content
CN103870597A (en) * 2014-04-01 2014-06-18 北京奇虎科技有限公司 Method and device for searching for watermark-free picture
CN104199728A (en) * 2014-08-14 2014-12-10 腾讯科技(深圳)有限公司 Image transmission information displaying method and device
CN105261050A (en) * 2015-09-23 2016-01-20 北京金山安全软件有限公司 Picture compression method and device and mobile terminal
CN107527319A (en) * 2016-06-20 2017-12-29 阿里巴巴集团控股有限公司 Image shrinking method and device
CN109308155A (en) * 2018-08-21 2019-02-05 中国平安人寿保险股份有限公司 Adjust method, apparatus, computer equipment and the storage medium of the page
CN109447072A (en) * 2018-11-08 2019-03-08 北京金山安全软件有限公司 Thumbnail clipping method and device, electronic equipment and readable storage medium
CN109727257A (en) * 2018-12-28 2019-05-07 北京金山安全软件有限公司 Method, device and terminal for automatically cutting picture
CN109784342A (en) * 2019-01-24 2019-05-21 厦门商集网络科技有限责任公司 A kind of OCR recognition methods and terminal based on deep learning model
CN112099873A (en) * 2020-09-15 2020-12-18 广州华多网络科技有限公司 Application program home page loading method, device, equipment and storage medium
CN112752107A (en) * 2020-12-26 2021-05-04 广东工业大学 Webpage picture preprocessing method, system, storage medium and computer equipment

Also Published As

Publication number Publication date
CN113450361B (en) 2022-08-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant