CN110990450A - Method and device for processing potential invasive livestock and poultry epidemic situation data - Google Patents

Method and device for processing potential invasive livestock and poultry epidemic situation data

Info

Publication number
CN110990450A
Authority
CN
China
Prior art keywords
data
epidemic
analysis
analysis data
standardized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911090230.0A
Other languages
Chinese (zh)
Inventor
吴绍强
仇松寅
林祥梅
刘晓飞
王慧煜
刘丹丹
李晓琳
梅琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese Academy of Inspection and Quarantine CAIQ
Original Assignee
Chinese Academy of Inspection and Quarantine CAIQ
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chinese Academy of Inspection and Quarantine CAIQ filed Critical Chinese Academy of Inspection and Quarantine CAIQ
Priority to CN201911090230.0A priority Critical patent/CN110990450A/en
Publication of CN110990450A publication Critical patent/CN110990450A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465: Query processing support for facilitating data mining operations in structured databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2282: Tablespace storage structures; Management thereof
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/70: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in livestock or poultry

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Biomedical Technology (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Primary Health Care (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)

Abstract

The disclosure relates to a method and a device for processing potentially invasive livestock and poultry epidemic situation data. In the disclosed technical solution, a plurality of original data tables are derived from an imported animal and animal product quarantine information database and from animal epidemic situation information websites, so the data are comprehensive and relevant to livestock and poultry epidemics. The original data tables are then filtered, deduplicated and spliced into a single summary table, which facilitates data processing and analysis. The total analysis data table formed from the summary table also incorporates weather information, because analyzing the potential invasion of insect-vector-borne epidemics requires determining whether the animals came into contact with the relevant insect vectors and whether they encountered weather in which those vectors are highly active. By providing a total analysis data table that contains more data directly related to the spread of epidemic disease, the disclosed technical solution makes it possible to analyze the occurrence of potentially invasive livestock and poultry epidemics effectively.

Description

Method and device for processing potential invasive livestock and poultry epidemic situation data
Technical Field
The disclosure relates to the technical field of livestock and poultry epidemic diseases, and in particular to a method and a device for processing potentially invasive livestock and poultry epidemic situation data.
Background
With the development of information technology and big data technology, the livestock industry has paid increasing attention to acquiring and accumulating data on livestock and poultry during breeding, transportation, quarantine and similar processes. However, existing data entry systems are neither sufficiently standardized nor complete, and the people who collect and enter the data are veterinarians and other workers rather than data professionals. As a result, little of the accumulated data is directly related to epidemic disease transmission, which makes it difficult to analyze the occurrence of potentially invasive livestock and poultry epidemics effectively.
Disclosure of Invention
The invention aims to provide a method and a device for processing potentially invasive livestock and poultry epidemic situation data, so as to solve the problem that the occurrence of potentially invasive livestock and poultry epidemics is difficult to analyze effectively because little data is directly related to epidemic disease transmission.
To achieve the above object, an embodiment of the present disclosure provides a method for processing potentially invasive livestock and poultry epidemic situation data, the method comprising:
acquiring data from an imported animal and animal product quarantine information database and from animal epidemic situation information websites to form a plurality of original data tables;
deleting, from the plurality of original data tables, the post-arrival data tables and the animal epidemic situation information data tables that cannot be associated with specific animals, to obtain a plurality of first analysis data tables;
deduplicating the first analysis data tables that share the same service table ID, to obtain a plurality of second analysis data tables;
splicing the plurality of second analysis data tables into a summary table according to a primary key input by a user, wherein each row of the summary table represents one piece of data and each column represents one record item; and
acquiring, for each row of data in the summary table, corresponding weather information from a meteorological data website according to the longitude and latitude information of that row, and splicing the weather information into the row to obtain a total analysis data table.
Optionally, the method further comprises:
for each row of data in the total analysis data table, deleting the row if the state in the row is "diseased" and the epidemic disease name is a missing value;
and for each column of data in the total analysis data table, deleting the column if the number of missing values in the column is greater than 95% of the total number of values, or if the column holds a single value.
Optionally, the method further comprises:
converting all epidemic disease names in the total analysis data table into standardized epidemic disease names according to the correspondence between standardized epidemic disease names and epidemic disease names, wherein each standardized epidemic disease name corresponds to one epidemic disease and to all the names of that epidemic disease;
performing semantic analysis on the fields in the total analysis data table and the standardized fields in a standardized field table, and replacing each field in the total analysis data table with the standardized field that has the same semantics.
Optionally, the method further comprises:
filling missing values of the number of deaths in the total analysis data table with 0;
filling missing values of the number of animals refused entry in the total analysis data table with the number of deaths.
Optionally, the method further comprises:
performing one-hot encoding on preset discrete variables in the total analysis data table, wherein the preset discrete variables comprise: epidemic disease name, bedding grass information, livestock and poultry variety, departure place and transport vehicle name;
normalizing preset continuous variables in the total analysis data table to values between 0 and 1, wherein the preset continuous variables comprise: number of deaths, number of animals refused entry, and shipment quantity.
An embodiment of the present disclosure also provides a device for processing potentially invasive livestock and poultry epidemic situation data, the device comprising:
an original data table acquisition module, configured to acquire data from an imported animal and animal product quarantine information database and from animal epidemic situation information websites to form a plurality of original data tables;
a first analysis module, configured to delete, from the plurality of original data tables, the post-arrival data tables and the animal epidemic situation information data tables that cannot be associated with specific animals, to obtain a plurality of first analysis data tables;
a second analysis module, configured to deduplicate the first analysis data tables that share the same service table ID, to obtain a plurality of second analysis data tables;
a summary table splicing module, configured to splice the plurality of second analysis data tables into a summary table according to a primary key input by a user, wherein each row of the summary table represents one piece of data and each column represents one record item; and
a weather splicing module, configured to acquire, for each row of data in the summary table, corresponding weather information from a meteorological data website according to the longitude and latitude information of that row, and to splice the weather information into the row to obtain a total analysis data table.
Optionally, the apparatus further comprises:
a row deleting module, configured to delete, for each row of data in the total analysis data table, the row if the state in the row is "diseased" and the epidemic disease name is a missing value;
and a column deleting module, configured to delete, for each column of data in the total analysis data table, the column if the number of missing values in the column is greater than 95% of the total number of values, or if the column holds a single value.
Optionally, the apparatus further comprises:
an epidemic disease name standardization module, configured to convert all epidemic disease names in the total analysis data table into standardized epidemic disease names according to the correspondence between standardized epidemic disease names and epidemic disease names, wherein each standardized epidemic disease name corresponds to one epidemic disease and to all the names of that epidemic disease;
and a field standardization module, configured to perform semantic analysis on the fields in the total analysis data table and the standardized fields in a standardized field table, and to replace each field in the total analysis data table with the standardized field that has the same semantics.
The disclosed embodiments also provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the above-described method.
An embodiment of the present disclosure further provides an electronic device, including:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the above method.
In the technical solution provided by the disclosure, a plurality of original data tables are derived from an imported animal and animal product quarantine information database and from animal epidemic situation information websites, so the data are comprehensive and relevant to livestock and poultry epidemics. The original data tables are then filtered, deduplicated and spliced into a single summary table, which facilitates data processing and analysis. The total analysis data table formed from the summary table also incorporates weather information, because analyzing the potential invasion of insect-vector-borne epidemics requires determining whether the animals came into contact with the relevant insect vectors and whether they encountered weather in which those vectors are highly active. By providing a total analysis data table that contains more data directly related to the spread of epidemic disease, the disclosed technical solution makes it possible to analyze the occurrence of potentially invasive livestock and poultry epidemics effectively.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
Fig. 1 is a flowchart of a method for processing potentially invasive livestock and poultry epidemic situation data provided by an embodiment of the disclosure.
Fig. 2 is a block diagram of a device for processing potentially invasive livestock and poultry epidemic situation data provided by an embodiment of the disclosure.
Fig. 3 is a block diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
The following detailed description of specific embodiments of the present disclosure is provided in connection with the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present disclosure, are given by way of illustration and explanation only, not limitation.
The embodiment of the disclosure provides a method for processing potential invasive livestock and poultry epidemic situation data. Fig. 1 is a flowchart illustrating a method for processing data of potential epidemic diseases of livestock and poultry according to an embodiment of the present disclosure. As shown in fig. 1, the method comprises the steps of:
Step S11: acquire data from the imported animal and animal product quarantine information database and from animal epidemic situation information websites to form a plurality of original data tables.
Optionally, the imported animal and animal product quarantine information database includes at least one of export animal information data, overseas farm quarantine information data, overseas quarantine information data, and domestic quarantine information data, and the animal epidemic situation information websites include at least one of the World Organisation for Animal Health (OIE), the global animal disease information system, the Food and Agriculture Organization of the United Nations (FAO), the World Health Organization (WHO), and international infectious disease information sources.
Step S12: delete, from the plurality of original data tables, the post-arrival data tables and the animal epidemic situation information data tables that cannot be associated with specific animals, to obtain a plurality of first analysis data tables.
The technical solution of this application is intended to support prediction of epidemics before imported animals arrive at port, and data tables recorded after arrival are irrelevant to the epidemic situation before entry. The post-arrival data tables among the plurality of original data tables are therefore deleted. Likewise, a data table that cannot be associated with a specific animal cannot be used effectively, because it is impossible to determine which animal or batch of animals it relates to, so it is also deleted.
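As a minimal sketch of this filtering step, the Python below keeps only tables that were recorded before arrival and that can be linked to an animal or batch. The metadata keys `stage` and `linked_to_animal` and the table names are hypothetical; the source does not say how these properties are recorded.

```python
def select_first_analysis_tables(raw_tables):
    """Drop post-arrival tables and tables not linked to a specific animal."""
    kept = []
    for table in raw_tables:
        if table["stage"] == "post_arrival":
            continue  # irrelevant to the epidemic situation before entry
        if not table["linked_to_animal"]:
            continue  # cannot be traced to an animal or batch
        kept.append(table)
    return kept


# Hypothetical metadata for four original data tables.
raw_tables = [
    {"name": "overseas_farm_quarantine", "stage": "pre_arrival", "linked_to_animal": True},
    {"name": "port_inspection", "stage": "post_arrival", "linked_to_animal": True},
    {"name": "regional_outbreak_bulletin", "stage": "pre_arrival", "linked_to_animal": False},
    {"name": "transport_quarantine", "stage": "pre_arrival", "linked_to_animal": True},
]
first_analysis_tables = select_first_analysis_tables(raw_tables)
kept_names = [t["name"] for t in first_analysis_tables]
```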
Step S13: deduplicate the first analysis data tables that share the same service table ID among the plurality of first analysis data tables, to obtain a plurality of second analysis data tables.
Because of how the entry system records data, the original data contain a large amount of repeated data: a single record for one animal or one batch of animals may be entered many times. In the technical field of livestock and poultry epidemic diseases, the service table ID serves as the unique identifier of each animal or each batch of animals, so whether two records refer to the same animal or batch can be determined by checking whether their service table IDs are the same.
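The deduplication can be illustrated with a short Python sketch that keeps the first record seen for each service table ID. The field name `service_table_id` and the sample records are assumptions made for illustration, and keeping the first occurrence (rather than, say, the latest) is also an assumption.

```python
def deduplicate_by_service_id(rows, id_field="service_table_id"):
    """Keep the first row seen for each service table ID, preserving order."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = row[id_field]
        if key in seen:
            continue  # a repeated entry for the same animal or batch
        seen.add(key)
        unique_rows.append(row)
    return unique_rows


# Hypothetical records: batch B001 was entered twice.
first_analysis_table = [
    {"service_table_id": "B001", "variety": "cattle", "shipment_quantity": 120},
    {"service_table_id": "B001", "variety": "cattle", "shipment_quantity": 120},
    {"service_table_id": "B002", "variety": "sheep", "shipment_quantity": 300},
]
second_analysis_table = deduplicate_by_service_id(first_analysis_table)
```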
Step S14: splice the plurality of second analysis data tables into a summary table according to the primary key input by the user, wherein each row of the summary table represents one piece of data and each column represents one record item.
After the original data tables are obtained, they can be stored in a database, and the primary key of each original data table can be generated from user input at the time it is stored. Each row of the summary table can hold all data for one animal or one batch of animals of the same kind, drawn from both the imported animal and animal product quarantine information database and the animal epidemic situation information websites (for example, farm quarantine data, transport vehicle quarantine data and the transport route). Each column is one record item. For example, the summary table may include columns for disease state, epidemic disease name, bedding grass information, excrement, detection items, detection methods, overseas testing laboratory, prevention and treatment (epidemic disease name), prevention and treatment (drug name), immunization (vaccine type), immunization (epidemic disease name), livestock and poultry variety, departure place, transport vehicle name, number of deaths, number of animals refused entry, and shipment quantity.
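One way to picture the splicing is as a merge of several tables on a shared key. The sketch below is an illustration only: it assumes each table is a list of dictionaries, that `service_table_id` is the user-chosen primary key, and that columns absent from a source are left as `None`. None of these details are specified in the source.

```python
def splice_tables(tables, primary_key):
    """Merge tables (lists of dicts) into one summary table keyed on primary_key."""
    merged = {}
    for table in tables:
        for row in table:
            # Rows with the same key value are combined into a single row.
            merged.setdefault(row[primary_key], {}).update(row)
    # Collect every column name in first-seen order so all rows align.
    columns = []
    for row in merged.values():
        for name in row:
            if name not in columns:
                columns.append(name)
    return [{name: row.get(name) for name in columns} for row in merged.values()]


# Hypothetical second analysis data tables.
farm_quarantine = [
    {"service_table_id": "B001", "disease_state": "healthy"},
    {"service_table_id": "B002", "disease_state": "diseased"},
]
transport = [
    {"service_table_id": "B001", "vehicle_name": "MV Example", "departure_place": "Australia"},
]
summary_table = splice_tables([farm_quarantine, transport], "service_table_id")
```

Here batch B001 ends up as one row carrying both its farm and transport fields, while B002 has `None` in the transport columns.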
Step S15: for each row of data in the summary table, acquire corresponding weather information from a meteorological data website according to the longitude and latitude information of that row, and splice the weather information into the row to obtain a total analysis data table.
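The weather splicing can be sketched as a per-row lookup keyed on coordinates. In the Python below, `stub_fetch_weather` is a hypothetical stand-in for a query to a meteorological data website, and the field names and weather values are invented for illustration; the source does not name a website, an API, or the weather fields used.

```python
def splice_weather(summary_rows, fetch_weather):
    """Append weather fields to each row, looked up by (latitude, longitude)."""
    cache = {}  # avoid repeated lookups for rows that share coordinates
    analysis_rows = []
    for row in summary_rows:
        coords = (row["latitude"], row["longitude"])
        if coords not in cache:
            cache[coords] = fetch_weather(*coords)
        analysis_rows.append({**row, **cache[coords]})
    return analysis_rows


def stub_fetch_weather(latitude, longitude):
    """Hypothetical stand-in for a meteorological data website query."""
    if latitude < 0:
        return {"temperature_c": 28.5, "humidity_pct": 80}
    return {"temperature_c": 12.0, "humidity_pct": 55}


summary_table = [
    {"service_table_id": "B001", "latitude": -33.9, "longitude": 151.2},
    {"service_table_id": "B002", "latitude": 52.4, "longitude": 4.9},
]
total_analysis_table = splice_weather(summary_table, stub_fetch_weather)
```

Passing the lookup in as a function keeps the splicing logic independent of whichever data source is actually used.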
In the technical solution provided by the disclosure, a plurality of original data tables are derived from an imported animal and animal product quarantine information database and from animal epidemic situation information websites, so the data are comprehensive and relevant to livestock and poultry epidemics. The original data tables are then filtered, deduplicated and spliced into a single summary table, which facilitates data processing and analysis. The total analysis data table formed from the summary table also incorporates weather information, because analyzing the potential invasion of insect-vector-borne epidemics requires determining whether the animals came into contact with the relevant insect vectors and whether they encountered weather in which those vectors are highly active. By providing a total analysis data table that contains more data directly related to the spread of epidemic disease, the disclosed technical solution makes it possible to analyze the occurrence of potentially invasive livestock and poultry epidemics effectively.
Optionally, the method further comprises:
For each row of data in the total analysis data table, delete the row if the state in the row is "diseased" and the epidemic disease name is a missing value.
Because the aim of this application is to predict the probability that a given epidemic disease occurs, the relevant fields are the epidemic disease name and the state. Rows in which the state is "diseased" and the epidemic disease name is missing are therefore deleted; that is, this application does not consider disease of unknown cause.
For each column of data in the total analysis data table, delete the column if the number of missing values in the column is greater than 95% of the total number of values, or if the column holds a single value.
If the number of missing values in a column exceeds 95% of the total, most values of the column are missing, so the column contributes little to the final result and is deleted. Likewise, if a column holds a single value, for example a column named "feed qualified" or "bedding disinfected" whose value is "yes" in every row, the column cannot affect whether disease occurs, analyzing it is meaningless, and the column is deleted.
Through the above technical solution, meaningless rows and columns that have no influence on the result are deleted, which reduces the data volume and facilitates later calculation, analysis and modeling.
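The two deletion rules can be sketched in Python as follows. This is an illustration under stated assumptions: missing values are represented by `None`, the state value is the string `"diseased"`, the field names are hypothetical, and a "single value" column is taken to mean one whose entries are all identical.

```python
def drop_rows_unknown_disease(rows):
    """Remove rows that report disease but carry no epidemic disease name."""
    return [
        r for r in rows
        if not (r.get("state") == "diseased" and r.get("epidemic_name") is None)
    ]


def drop_uninformative_columns(rows, max_missing_ratio=0.95):
    """Remove columns that are mostly missing or hold a single value."""
    if not rows:
        return rows
    keep = []
    for name in rows[0]:
        values = [r.get(name) for r in rows]
        missing = sum(v is None for v in values)
        if missing > max_missing_ratio * len(rows):
            continue  # more than 95% of the values are missing
        if len(set(values)) == 1:
            continue  # every row has the same value
        keep.append(name)
    return [{name: r.get(name) for name in keep} for r in rows]


table = [
    {"state": "diseased", "epidemic_name": "pseudorabies", "feed_qualified": "yes", "deaths": 2},
    {"state": "diseased", "epidemic_name": None, "feed_qualified": "yes", "deaths": 1},
    {"state": "healthy", "epidemic_name": None, "feed_qualified": "yes", "deaths": 0},
]
cleaned = drop_uninformative_columns(drop_rows_unknown_disease(table))
```

In this sample the second row is dropped (diseased, name missing) and the `feed_qualified` column is dropped because it is "yes" everywhere.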
Optionally, the method further comprises:
Convert all epidemic disease names in the total analysis data table into standardized epidemic disease names according to the correspondence between standardized epidemic disease names and epidemic disease names.
Each standardized epidemic disease name corresponds to one epidemic disease and to all the names of that epidemic disease. Because one epidemic disease may go by several names, the names need to be standardized to reduce the number of distinct names, which simplifies calculation and analysis. For an epidemic disease with several names, the standardization process is as follows: one of the names is chosen as the standardized name of the epidemic disease, all of the names are mapped to that standardized name, and any epidemic disease name in the total analysis data table that matches one of those names is converted into the standardized name.
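A minimal sketch of this mapping in Python is shown below. The alias table is illustrative only and is not taken from the source; the field name `epidemic_name` and the case-insensitive matching are likewise assumptions.

```python
# Illustrative alias table: standardized name -> all names of that disease.
STANDARD_TO_ALIASES = {
    "pseudorabies": ["pseudorabies", "Aujeszky's disease"],
    "tuberculosis": ["tuberculosis", "TB"],
}


def build_lookup(standard_to_aliases):
    """Invert the alias table into a name -> standardized name lookup."""
    lookup = {}
    for standard, aliases in standard_to_aliases.items():
        lookup[standard.casefold()] = standard
        for alias in aliases:
            lookup[alias.casefold()] = standard
    return lookup


def standardize_names(rows, lookup, field="epidemic_name"):
    """Replace each known epidemic disease name with its standardized name."""
    result = []
    for row in rows:
        new_row = dict(row)
        name = new_row.get(field)
        if name is not None:
            # Unknown names are left unchanged rather than guessed.
            new_row[field] = lookup.get(name.strip().casefold(), name)
        result.append(new_row)
    return result


rows = [
    {"epidemic_name": "Aujeszky's disease"},
    {"epidemic_name": "tb"},
    {"epidemic_name": "Akabane disease"},
    {"epidemic_name": None},
]
standardized = standardize_names(rows, build_lookup(STANDARD_TO_ALIASES))
```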
Perform semantic analysis on the fields in the total analysis data table and the standardized fields in the standardized field table, and replace each field in the total analysis data table with the standardized field that has the same semantics.
Similarly, the total analysis data table is spliced together from data tables of different origins, and because the data were entered by different people, among other reasons, it is highly unstructured. Fields that have the same meaning but are expressed differently therefore need to be standardized. For example, departure-place values that are different ways of writing Australia are all replaced with the single standardized value Australia.
Through the above technical solution, epidemic disease names that denote the same disease and fields that have the same semantics are standardized, which reduces the number of categories and facilitates later calculation, analysis and modeling.
Optionally, the method further comprises:
Fill missing values of the number of deaths in the total analysis data table with 0.
In actual data entry, a missing number of deaths most likely means that no animals died. In this application, missing values of the number of deaths in the total analysis data table are therefore filled with 0.
Fill missing values of the number of animals refused entry in the total analysis data table with the number of deaths.
The rationale for this fill is that the number of animals refused entry is at least the number of deaths, so the number of deaths is a more reasonable fill value than any other.
Through the above technical solution, missing values are filled, which makes the data more complete and increases its usability.
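The two fill rules depend on each other, so the order matters: deaths must be filled before the number of animals refused entry. The Python sketch below illustrates this, assuming missing values are represented by `None` and using the hypothetical field names `deaths` and `refused_entry`.

```python
def fill_missing_counts(rows):
    """Fill missing deaths with 0, then missing refusals with the death count."""
    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row.get("deaths") is None:
            new_row["deaths"] = 0  # missing is taken to mean no deaths
        if new_row.get("refused_entry") is None:
            # At least the dead animals are refused entry.
            new_row["refused_entry"] = new_row["deaths"]
        filled.append(new_row)
    return filled


rows = [
    {"deaths": None, "refused_entry": None},
    {"deaths": 3, "refused_entry": None},
    {"deaths": 1, "refused_entry": 5},
]
filled_rows = fill_missing_counts(rows)
```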
Optionally, the method further comprises:
Perform one-hot encoding on the preset discrete variables in the total analysis data table.
One-hot encoding, also known as one-bit-effective encoding, encodes N states with an N-bit state register: each state has its own register bit, and only one bit is active at any time. The preset discrete variables include, but are not limited to: epidemic disease name, bedding grass information, livestock and poultry variety, departure place and transport vehicle name. For example, suppose for ease of explanation that the total analysis data table contains three epidemic disease names: Akabane disease, pseudorabies and tuberculosis. After one-hot encoding they become 001, 010 and 100, respectively. One-hot encoding the preset discrete variables extends the values of each discrete feature into Euclidean space, so that each value of the feature corresponds to a point in that space, which makes distances between features more reasonable to compute. The mapping to Euclidean space matters because computing distances or similarities between features is central to machine learning algorithms such as regression, classification and clustering, and the commonly used distance and similarity measures, including cosine similarity, are defined in Euclidean space.
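The three-disease example can be reproduced with a short Python sketch. The helper names and the `epidemic_name` field are hypothetical, and assigning bits in list order (first category gets the lowest bit) is an arbitrary choice made here so that the codes match the 001, 010, 100 example.

```python
def one_hot_codes(categories):
    """Map each category to an N-bit string with exactly one bit set."""
    n = len(categories)
    return {c: format(1 << i, f"0{n}b") for i, c in enumerate(categories)}


def one_hot_encode(rows, field):
    """Replace a discrete column with one 0/1 indicator column per category."""
    categories = []
    for row in rows:
        if row[field] not in categories:
            categories.append(row[field])
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != field}
        for c in categories:
            new_row[f"{field}={c}"] = 1 if row[field] == c else 0
        encoded.append(new_row)
    return encoded


codes = one_hot_codes(["Akabane disease", "pseudorabies", "tuberculosis"])
rows = [
    {"epidemic_name": "pseudorabies", "deaths": 2},
    {"epidemic_name": "tuberculosis", "deaths": 0},
]
encoded_rows = one_hot_encode(rows, "epidemic_name")
```

With indicator columns, any two different categories are the same Euclidean distance apart (the square root of 2), so no category is artificially "closer" to another, which is the property the paragraph above relies on.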
Normalize the preset continuous variables in the total analysis data table to values between 0 and 1.
The preset continuous variables include, but are not limited to: number of deaths, number of animals refused entry, and shipment quantity. For a preset continuous variable, the magnitude of the number represents an actual quantity, so normalizing it to a value between 0 and 1 reduces the influence of units. The processing formula is as follows:
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ is a value of the preset continuous variable, $x_{\max}$ is the maximum value of that variable, $x_{\min}$ is its minimum value, and $x^{*}$ is the normalized value of $x$.
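Scaling a variable into the range 0 to 1 using its maximum and minimum, as defined above, is commonly called min-max normalization. The sketch below assumes that form and applies it to one column of values. Returning 0.0 for a column whose values are all equal is an assumption made here to avoid division by zero; the source does not address that case.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant column carries no information, so use 0.0.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


# Hypothetical death counts for three batches.
deaths = [0, 5, 20]
normalized_deaths = min_max_normalize(deaths)
```

For the sample counts 0, 5 and 20, the normalized values are 0.0, 0.25 and 1.0.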
Based on the same inventive concept, an embodiment of the disclosure also provides a device for processing potentially invasive livestock and poultry epidemic situation data. As shown in Fig. 2, the device comprises: an original data table acquisition module 11, a first analysis module 12, a second analysis module 13, a summary table splicing module 14 and a weather splicing module 15.
The original data table acquisition module 11 is configured to acquire data from the imported animal and animal product quarantine information database and from animal epidemic situation information websites to form a plurality of original data tables.
The first analysis module 12 is configured to delete, from the plurality of original data tables, the post-arrival data tables and the animal epidemic situation information data tables that cannot be associated with specific animals, to obtain a plurality of first analysis data tables.
The second analysis module 13 is configured to deduplicate the first analysis data tables that share the same service table ID among the plurality of first analysis data tables, to obtain a plurality of second analysis data tables.
The summary table splicing module 14 is configured to splice the plurality of second analysis data tables into a summary table according to a primary key input by a user, where each row of the summary table represents one piece of data and each column represents one record item.
The weather splicing module 15 is configured to acquire, for each row of data in the summary table, corresponding weather information from a meteorological data website according to the longitude and latitude information of that row, and to splice the weather information into the row to obtain a total analysis data table.
In the technical solution provided by the disclosure, a plurality of original data tables are derived from an imported animal and animal product quarantine information database and from animal epidemic situation information websites, so the data are comprehensive and relevant to livestock and poultry epidemics. The original data tables are then filtered, deduplicated and spliced into a single summary table, which facilitates data processing and analysis. The total analysis data table formed from the summary table also incorporates weather information, because analyzing the potential invasion of insect-vector-borne epidemics requires determining whether the animals came into contact with the relevant insect vectors and whether they encountered weather in which those vectors are highly active. By providing a total analysis data table that contains more data directly related to the spread of epidemic disease, the disclosed technical solution makes it possible to analyze the occurrence of potentially invasive livestock and poultry epidemics effectively.
Optionally, the apparatus further comprises:
a row deleting module, configured to, for each row of data in the total analysis data table, delete the row if the status in the row is diseased and the epidemic disease name is a default (that is, missing) value; and
a column deleting module, configured to, for each column of data in the total analysis data table, delete the column if the number of default values in the column is greater than 95% of the total number, or if the column contains only a single value.
With this technical solution, rows and columns that have no influence on the result are deleted, which reduces the data volume and facilitates later calculation, analysis, and modeling.
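The row and column deletion rules above can be sketched as follows. The sketch assumes that a default value is represented as `None`, that the status and disease name live in columns called `status` and `disease_name`, and that all rows share the same columns; these names and the sample data are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def delete_rows(rows: List[Row]) -> List[Row]:
    """Drop rows whose status is 'diseased' but whose disease name is missing."""
    return [
        r for r in rows
        if not (r.get("status") == "diseased" and r.get("disease_name") is None)
    ]


def delete_columns(rows: List[Row], max_missing_ratio: float = 0.95) -> List[Row]:
    """Drop columns that are more than 95% missing or hold one single value."""
    if not rows:
        return []
    keep = []
    for col in rows[0]:  # assumes every row has the same columns
        values = [r.get(col) for r in rows]
        missing = sum(v is None for v in values)
        too_sparse = missing > max_missing_ratio * len(rows)
        single_valued = len(set(values)) == 1
        if not (too_sparse or single_valued):
            keep.append(col)
    return [{col: r.get(col) for col in keep} for r in rows]


# Illustrative data only.
table = [
    {"status": "diseased", "disease_name": None, "port": "A", "deaths": 2},
    {"status": "diseased", "disease_name": "ASF", "port": "A", "deaths": 1},
    {"status": "healthy", "disease_name": None, "port": "A", "deaths": 0},
]
cleaned = delete_columns(delete_rows(table))
```

In this sample the first row is removed (diseased with no disease name) and the `port` column is removed (a single value throughout).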
Optionally, the apparatus further comprises:
an epidemic disease name standardization module, configured to convert all epidemic disease names in the total analysis data table into standardized epidemic disease names according to a correspondence between standardized epidemic disease names and epidemic disease names, where each standardized epidemic disease name corresponds to one epidemic disease and to all the names of that epidemic disease; and
a field standardization module, configured to perform semantic analysis on the fields in the total analysis data table and the standardized fields in a standardized field table, and to replace each field in the total analysis data table with the standardized field that has the same semantics.
With this technical solution, names that refer to the same epidemic disease and fields that have the same semantics are standardized, which reduces the number of categories and facilitates later calculation, analysis, and modeling.
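A minimal sketch of the two standardization steps follows. The name conversion uses a lookup built from the correspondence between each standardized name and its variants. For field standardization, the disclosure describes semantic analysis; the sketch swaps in a much simpler case-insensitive alias lookup purely for illustration. All function names, column names, and sample aliases are assumptions.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def build_alias_index(standard_to_aliases: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every normalized alias (and the standard name itself) to the standard name."""
    index = {}
    for standard, aliases in standard_to_aliases.items():
        for name in [standard, *aliases]:
            index[name.strip().casefold()] = standard
    return index


def standardize_names(rows: List[Row], index: Dict[str, str],
                      column: str = "disease_name") -> List[Row]:
    """Replace each disease name with its standardized form when one is known."""
    out = []
    for row in rows:
        raw = row.get(column)
        key = raw.strip().casefold() if isinstance(raw, str) else None
        out.append({**row, column: index.get(key, raw)})
    return out


def standardize_fields(rows: List[Row], field_index: Dict[str, str]) -> List[Row]:
    """Rename fields via alias lookup (a stand-in for the semantic analysis)."""
    return [
        {field_index.get(k.strip().casefold(), k): v for k, v in row.items()}
        for row in rows
    ]


# Illustrative correspondences only.
disease_index = build_alias_index({
    "African swine fever": ["ASF"],
    "Foot-and-mouth disease": ["FMD"],
})
field_index = build_alias_index({"disease_name": ["epidemic_name", "Disease"]})

raw_rows = [{"Disease": "asf"}, {"epidemic_name": "FMD "}]
standardized = standardize_names(standardize_fields(raw_rows, field_index), disease_index)
```

Names with no known correspondence are left unchanged, so unmapped values remain visible for later review instead of being silently altered.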
Optionally, the apparatus further comprises:
a death number filling module, configured to fill default values of the number of deaths in the total analysis data table with 0; and
an entry rejection number filling module, configured to fill default values of the number of animals rejected for entry in the total analysis data table with the number of deaths.
With this technical solution, default values are filled in, which makes the data more complete and increases its usability.
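The two filling rules can be sketched as below, assuming default values are represented as `None` and using the hypothetical column names `deaths` and `rejected`. The sketch fills the number of deaths first, so that a row missing both values ends up with 0 in each; that ordering is an assumption, not something the disclosure states.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def fill_missing(rows: List[Row]) -> List[Row]:
    """Fill missing death counts with 0, then missing rejection counts with deaths."""
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input table is left untouched
        if new_row.get("deaths") is None:
            new_row["deaths"] = 0
        if new_row.get("rejected") is None:
            new_row["rejected"] = new_row["deaths"]
        filled.append(new_row)
    return filled


# Illustrative data only.
table = [
    {"deaths": None, "rejected": None},
    {"deaths": 3, "rejected": None},
    {"deaths": 2, "rejected": 5},
]
filled_table = fill_missing(table)
```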
Optionally, the apparatus further comprises:
a one-hot encoding module, configured to perform one-hot encoding on preset discrete variables in the total analysis data table; and
a continuous variable standardization module, configured to normalize preset continuous variables in the total analysis data table to values between 0 and 1.
With this technical solution, one-hot encoding of the preset discrete variables extends the values of each discrete feature into Euclidean space, so that each value of a discrete feature corresponds to a point in that space. This makes distances between features more reasonable to compute. The reason for mapping into Euclidean space is that in machine learning algorithms such as regression, classification, and clustering, computing distances or similarities between features is very important, and the commonly used distance and similarity measures, including cosine similarity, are computed in Euclidean space. For the preset continuous variables, the magnitude of a number represents an actual quantity, so normalizing them to values between 0 and 1 reduces the influence of units.
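The two transformations can be sketched as follows. The disclosure only says that continuous variables are normalized to values between 0 and 1; min-max scaling is used here as one common way to do that and is an assumption, as are the function names, the column names, and the sample data.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def one_hot(rows: List[Row], column: str) -> List[Row]:
    """Replace a discrete column with one 0/1 indicator column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}={cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


def min_max(rows: List[Row], column: str) -> List[Row]:
    """Rescale a numeric column linearly so its values lie in [0, 1]."""
    values = [row[column] for row in rows]
    low, high = min(values), max(values)
    span = high - low
    # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
    return [
        {**row, column: (row[column] - low) / span if span else 0.0}
        for row in rows
    ]


# Illustrative data only.
table = [
    {"species": "pig", "deaths": 0},
    {"species": "cattle", "deaths": 5},
    {"species": "pig", "deaths": 10},
]
encoded = min_max(one_hot(table, "species"), "deaths")
```

After encoding, any two distinct categories are the same Euclidean distance apart, which avoids the artificial ordering that integer category codes would introduce.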
With regard to the apparatus in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and will not be elaborated here.
Based on the above inventive concept, the embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above method for processing potentially invasive livestock and poultry epidemic situation data.
Based on the inventive concept, the embodiment of the present disclosure further provides an electronic device. Fig. 3 is a block diagram illustrating an electronic device 700 according to an example embodiment. As shown in fig. 3, the electronic device 700 may include: a processor 701 and a memory 702. The electronic device 700 may also include one or more of a multimedia component 703, an input/output (I/O) interface 704, and a communication component 705.
The processor 701 is configured to control the overall operation of the electronic device 700 so as to complete all or part of the steps of the above method for processing potentially invasive livestock and poultry epidemic situation data. The memory 702 is configured to store various types of data to support operation of the electronic device 700; such data may include, for example, instructions for any application or method operating on the electronic device 700, as well as application-related data such as contact data, transmitted and received messages, pictures, audio, and video. The memory 702 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. The multimedia component 703 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is configured to output and/or input audio signals. For example, the audio component may include a microphone for receiving external audio signals. A received audio signal may further be stored in the memory 702 or transmitted through the communication component 705. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 704 provides an interface between the processor 701 and other interface modules, such as a keyboard, a mouse, or buttons. These buttons may be virtual buttons or physical buttons. The communication component 705 is configured for wired or wireless communication between the electronic device 700 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, 4G, NB-IoT, eMTC, 5G, or a combination of one or more of these, which is not limited herein. Accordingly, the communication component 705 may include a Wi-Fi module, a Bluetooth module, an NFC module, and the like.
In an exemplary embodiment, the electronic device 700 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method for processing potentially invasive livestock and poultry epidemic situation data.
In another exemplary embodiment, a computer-readable storage medium including program instructions is further provided, and the program instructions, when executed by a processor, implement the steps of the above method for processing potentially invasive livestock and poultry epidemic situation data. For example, the computer-readable storage medium may be the memory 702 including the program instructions, and the program instructions may be executed by the processor 701 of the electronic device 700 to perform the above method.
In another exemplary embodiment, a computer program product is further provided, comprising a computer program executable by a programmable apparatus, the computer program having code portions for performing the above method for processing potentially invasive livestock and poultry epidemic situation data when executed by the programmable apparatus.
The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings. However, the present disclosure is not limited to the specific details of the above embodiments. Various simple modifications may be made to the technical solution of the present disclosure within its technical idea, and all such simple modifications fall within the protection scope of the present disclosure.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner. To avoid unnecessary repetition, the possible combinations are not described separately in the present disclosure.
In addition, the various embodiments of the present disclosure may be combined in any manner, and such combinations should likewise be regarded as content disclosed herein, as long as they do not depart from the spirit of the present disclosure.

Claims (10)

1. A method for processing potentially invasive livestock and poultry epidemic situation data, characterized by comprising the following steps:
acquiring data from an imported animal and animal product quarantine information database and an animal epidemic situation information website to form a plurality of original data tables;
deleting, from the plurality of original data tables, the post-arrival data tables and the animal epidemic situation information data tables that cannot be associated with specific animals, to obtain a plurality of first analysis data tables;
deduplicating, among the plurality of first analysis data tables, the first analysis data tables having the same service table ID, to obtain a plurality of second analysis data tables;
splicing the plurality of second analysis data tables into a summary table according to a primary key input by a user, wherein each row of the summary table represents one piece of data, and each column represents one record item;
and acquiring corresponding weather information from a meteorological data website according to the longitude and latitude information of each row of data in the summary table, and splicing the weather information into each row of data to obtain a total analysis data table.
2. The method of claim 1, further comprising:
for each row of data in the total analysis data table, deleting the row if the status in the row is diseased and the epidemic disease name is a default value;
and for each column of data in the total analysis data table, deleting the column if the number of default values in the column is greater than 95% of the total number, or if the column contains only a single value.
3. The method of claim 1, further comprising:
converting all epidemic disease names in the total analysis data table into standardized epidemic disease names according to a correspondence between standardized epidemic disease names and epidemic disease names, wherein each standardized epidemic disease name corresponds to one epidemic disease and to all the names of that epidemic disease;
performing semantic analysis on the fields in the total analysis data table and the standardized fields in a standardized field table, and replacing each field in the total analysis data table with the standardized field having the same semantics.
4. The method of claim 1, further comprising:
filling default values of the number of deaths in the total analysis data table with 0;
filling default values of the number of animals rejected for entry in the total analysis data table with the number of deaths.
5. The method according to any one of claims 1-4, further comprising:
performing one-hot encoding processing on preset discrete variables in the total analysis data table, wherein the preset discrete variables comprise: epidemic disease name, bedding grass information, livestock and poultry variety, departure place and transport means name;
standardizing preset continuous variables in the total analysis data table into values between 0 and 1, wherein the preset continuous variables comprise: number of deaths, number of animals rejected for entry, and number of shipments.
6. An apparatus for processing potentially invasive livestock and poultry epidemic situation data, characterized in that the apparatus comprises:
an original data table acquisition module, configured to acquire data from an imported animal and animal product quarantine information database and an animal epidemic situation information website to form a plurality of original data tables;
a first analysis module, configured to delete, from the plurality of original data tables, the post-arrival data tables and the animal epidemic situation information data tables that cannot be associated with specific animals, to obtain a plurality of first analysis data tables;
a second analysis module, configured to deduplicate, among the plurality of first analysis data tables, the first analysis data tables having the same service table ID, to obtain a plurality of second analysis data tables;
a summary table splicing module, configured to splice the plurality of second analysis data tables into a summary table according to a primary key input by a user, wherein each row of the summary table represents one piece of data, and each column represents one record item;
and a weather splicing module, configured to acquire corresponding weather information from a meteorological data website according to the longitude and latitude information of each row of data in the summary table, and to splice the weather information into each row of data to obtain a total analysis data table.
7. The apparatus of claim 6, further comprising:
a row deleting module, configured to, for each row of data in the total analysis data table, delete the row if the status in the row is diseased and the epidemic disease name is a default value;
and a column deleting module, configured to, for each column of data in the total analysis data table, delete the column if the number of default values in the column is greater than 95% of the total number, or if the column contains only a single value.
8. The apparatus of claim 6, further comprising:
an epidemic disease name standardization module, configured to convert all epidemic disease names in the total analysis data table into standardized epidemic disease names according to a correspondence between standardized epidemic disease names and epidemic disease names, wherein each standardized epidemic disease name corresponds to one epidemic disease and to all the names of that epidemic disease;
and a field standardization module, configured to perform semantic analysis on the fields in the total analysis data table and the standardized fields in a standardized field table, and to replace each field in the total analysis data table with the standardized field having the same semantics.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
10. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to carry out the steps of the method of any one of claims 1 to 5.
CN201911090230.0A 2019-11-08 2019-11-08 Method and device for processing potential invasive livestock and poultry epidemic situation data Pending CN110990450A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911090230.0A CN110990450A (en) 2019-11-08 2019-11-08 Method and device for processing potential invasive livestock and poultry epidemic situation data

Publications (1)

Publication Number Publication Date
CN110990450A true CN110990450A (en) 2020-04-10

Family

ID=70083878

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911090230.0A Pending CN110990450A (en) 2019-11-08 2019-11-08 Method and device for processing potential invasive livestock and poultry epidemic situation data

Country Status (1)

Country Link
CN (1) CN110990450A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106960124A (en) * 2017-03-17 2017-07-18 北京农信互联科技有限公司 Livestock epidemic situation alarming method and device
CN109509558A (en) * 2018-11-20 2019-03-22 河南省疾病预防控制中心 Fever epidemic situation fast reaction intelligence public affairs based on B/S framework defend service system
CN109918376A (en) * 2019-02-26 2019-06-21 北京致远互联软件股份有限公司 Tables of data processing method, device and electronic equipment
CN110119413A (en) * 2019-04-30 2019-08-13 京东城市(南京)科技有限公司 The method and apparatus of data fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200410