WO2020202444A1 - Physical condition detection system - Google Patents

Physical condition detection system

Info

Publication number
WO2020202444A1
WO2020202444A1 (PCT/JP2019/014526)
Authority
WO
WIPO (PCT)
Prior art keywords
physical condition
baby
voice data
voice
detection system
Application number
PCT/JP2019/014526
Other languages
French (fr)
Japanese (ja)
Inventor
伴之 服部 (Tomoyuki Hattori)
Original Assignee
株式会社ファーストアセント (First Ascent Co., Ltd.)
Application filed by 株式会社ファーストアセント
Priority to PCT/JP2019/014526
Publication of WO2020202444A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03: characterised by the type of extracted parameters
    • G10L25/15: the extracted parameters being formant information
    • G10L25/48: specially adapted for particular use
    • G10L25/51: for comparison or discrimination
    • G10L25/66: for extracting parameters related to health condition
    • G10L25/90: Pitch determination of speech signals
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00: ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance

Definitions

  • the present invention relates to a physical condition detection system.
  • Patent Document 1 discloses an infant health monitoring system that processes measurement data to determine whether there is a problem with the health of the infant.
  • the measurement data collected by the infant monitoring device includes infant movement, temperature, position and awakening.
  • The infant monitoring system disclosed in Patent Document 1 determines infant health problems, such as being in a state of SIDS (Sudden Infant Death Syndrome), epilepsy, a disturbed sleep pattern, fever, or stress, based on the measurement data and environmental sensor data collected by the infant monitoring device.
  • However, the health problems of infants are not limited to these, and various other illnesses are possible. The infant monitoring system disclosed in Patent Document 1 therefore has the problem that it is difficult to detect that an infant is affected by such various illnesses.
  • The physical condition detection system includes: a voice data acquisition unit that acquires first voice data representing a baby's voice; a frequency component detection unit that detects a plurality of frequency components included in the first voice data; a voice parameter extraction unit that extracts a plurality of parameters based on the plurality of frequency components included in the first voice data; and a physical condition determination unit that, when second voice data representing the baby's voice is acquired by the voice data acquisition unit, determines whether or not the baby is in a state of poor physical condition based on the acquired second voice data and the plurality of parameters.
  • Preferably, the frequency component detection unit further detects the plurality of frequency components included in the second voice data, the voice parameter extraction unit further extracts the plurality of parameters based on those frequency components, and the physical condition determination unit performs the determination when at least a part of the plurality of parameters based on the plurality of frequency components included in the second voice data satisfies a predetermined condition.
  • Preferably, the at least a part of the parameters is the fundamental frequency of the vocalization included in the second voice data, and the predetermined condition is that the fundamental frequency is 300 Hz or more and 800 Hz or less.
  • Preferably, the physical condition determination unit makes the determination based on the similarity between the plurality of parameters based on the plurality of frequency components included in the first voice data and the plurality of parameters based on the plurality of frequency components included in the second voice data.
  • Preferably, the plurality of parameters include the number of occurrences of the utterance per unit time and the duration of one utterance.
  • Preferably, the state of poor physical condition includes at least one of a state in which the baby is sick and a state in which the baby is becoming sick.
  • Preferably, the system further includes a message output device that, when the physical condition determination unit determines that the baby is in poor physical condition, outputs a message to that effect.
  • Preferably, the system further includes an emotion estimation unit that, when the second voice data is acquired by the voice data acquisition unit, estimates the baby's emotions based on the acquired second voice data and the plurality of parameters.
  • FIG. 1 is a diagram showing a configuration of a physical condition detection system according to an embodiment of the present invention.
  • FIG. 2 is a diagram illustrating how voice data representing the utterance of an infant is acquired.
  • FIG. 3 is a diagram illustrating recorded data in which a plurality of frequency components included in voice data representing a baby's utterance and values of a plurality of parameters based on the plurality of frequency components are recorded.
  • FIG. 4 is a diagram showing an example of parameter recording processing executed by running a computer program on the voice detection device according to the embodiment.
  • FIG. 5 is a diagram illustrating a case where voice data representing the vocalization of an infant is acquired and the infant is determined to be in a poor physical condition.
  • FIG. 6 is a diagram showing an example of a physical condition determination process executed by running a computer program on the voice detection device according to the embodiment.
  • FIG. 7 is a diagram illustrating product supply modes of the computer program executed by the voice detection device according to the embodiment.
  • FIG. 8 is a diagram illustrating recorded data used in the physical condition determination process executed by the physical condition detection system in the second modification of the embodiment of the present invention.
  • FIG. 9 is a diagram showing a configuration of a voice detection device included in the physical condition detection system according to the fifth modification of the embodiment of the present invention.
  • FIG. 10 is a diagram showing a configuration of a physical condition detection system according to a modification 7 of the embodiment of the present invention.
  • FIG. 1 is a diagram showing a configuration of a physical condition detection system 2 according to an embodiment of the present invention.
  • the physical condition detection system 2 includes a voice detection device 5 and an electronic device 10.
  • the voice detection device 5 and the electronic device 10 are connected to each other via wireless communication such as Bluetooth (registered trademark).
  • the electronic device 10 is, for example, a mobile terminal such as a smartphone, a tablet computer, or a personal computer, and may be a dedicated terminal or a general-purpose terminal.
  • the voice detection device 5 includes a microphone 3, a storage 4, a processor 6, a memory 7, and a communication module 8.
  • The microphone 3 converts sound, including human utterances and environmental sounds, into voice data as an electric signal.
  • The processor 6 functions logically as a voice data acquisition unit 61, a frequency component detection unit 62, a voice parameter extraction unit 63, and a physical condition determination unit 64 by executing a computer program stored in the memory 7.
  • The voice data acquisition unit 61 acquires voice data representing the voice of an infant via the microphone 3, together with voice data representing environmental sounds in the installation environment of the voice detection device 5 and/or voice data representing the voice of an adult.
  • the acquired voice data includes a plurality of frequency components.
  • The frequency component detection unit 62 detects a plurality of frequency components included in voice data of sound that includes the utterance of the baby. Infant vocalizations include crying and babbling. From those detected frequency components, a fundamental frequency, a first formant frequency, and a second formant frequency can be obtained. It is generally known that the fundamental frequency determines the pitch, and that the first formant frequency and the second formant frequency determine the timbre.
  • the voice parameter extraction unit 63 extracts a plurality of parameters based on a plurality of frequency components detected by the frequency component detection unit 62 included in the voice data acquired by the voice data acquisition unit 61.
  • the plurality of extracted parameters are the number of infant utterances per unit time, the duration of one infant utterance, and the fundamental frequency of the infant utterance, as will be described later using FIG. 3 (b).
  • When the audio waveform is approximated by a combination of a plurality of sine waves, the frequency of the sine wave with the lowest frequency among them is called the fundamental frequency.
  • The first formant frequency and the second formant frequency are obtained, in ascending order from the fundamental frequency, as the frequencies corresponding to amplitude peaks among the sine waves whose frequencies are integral multiples of the fundamental frequency. The fundamental frequency, the first formant frequency, and the second formant frequency included in the plurality of frequency components detected by the frequency component detection unit 62 in the voice data acquired by the voice data acquisition unit 61 can therefore be extracted as the above-mentioned parameters by the voice parameter extraction unit 63.
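The document does not specify an extraction algorithm, so the peak-picking it describes can only be sketched. The following is a minimal FFT-based routine under assumptions: the function name `estimate_frequencies` and the synthetic test signal are illustrative, the lowest significant spectral peak is taken as the fundamental, and the next two peaks stand in for the first and second formant frequencies in the text's simplified sense (production formant tracking typically uses LPC instead).

```python
import numpy as np

def estimate_frequencies(samples, rate, fmin=100.0, n_peaks=3):
    """Return up to n_peaks spectral peak frequencies in ascending order:
    the lowest significant peak as the fundamental, the next two standing
    in for the first and second "formant" frequencies of the text."""
    windowed = samples * np.hanning(len(samples))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    threshold = 0.1 * spectrum.max()   # ignore leakage and the noise floor
    peaks = [freqs[i] for i in range(1, len(spectrum) - 1)
             if freqs[i] >= fmin
             and spectrum[i] > threshold
             and spectrum[i] >= spectrum[i - 1]
             and spectrum[i] >= spectrum[i + 1]]
    return peaks[:n_peaks]

# Synthetic one-second "cry" frame: 400 Hz fundamental plus two partials.
rate = 16000
t = np.arange(rate) / rate
frame = (np.sin(2 * np.pi * 400 * t)
         + 0.6 * np.sin(2 * np.pi * 1200 * t)
         + 0.4 * np.sin(2 * np.pi * 2400 * t))
print([round(f) for f in estimate_frequencies(frame, rate)])  # → [400, 1200, 2400]
```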
  • The values of the plurality of parameters extracted by the voice parameter extraction unit 63 from the voice data acquired by the voice data acquisition unit 61 are recorded in the storage 4 by the voice parameter extraction unit 63 as recorded data, which is referred to when the physical condition determination unit 64 determines whether or not the baby is in a state of poor physical condition.
  • This recorded data may be recorded in a storage (not shown) outside the voice detection device 5 instead of the storage 4 inside the voice detection device 5.
  • When the voice data acquisition unit 61 acquires new voice data representing the baby's utterance, the physical condition determination unit 64 determines whether or not the baby is in a state of poor physical condition based on the acquired new voice data and the values of the plurality of parameters recorded in the storage 4.
  • the frequency component detection unit 62 detects a plurality of frequency components included in the new voice data.
  • the voice parameter extraction unit 63 extracts the above-mentioned plurality of parameters based on the plurality of frequency components detected by the frequency component detection unit 62 included in the new voice data.
  • When at least a part of the extracted parameters satisfies a predetermined condition, for example when the fundamental frequency of the vocalization is 300 Hz or more and 800 Hz or less, the physical condition determination unit 64 determines whether or not the baby is in a state of poor physical condition.
  • This is because, when the fundamental frequency of the utterance is below 300 Hz or above 800 Hz, it is highly likely that the sound represented by the new voice data is not an infant's vocalization but, for example, an adult's utterance or an environmental sound such as a household noise.
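The 300-800 Hz gate described above reduces to a one-line check; the function name is an illustrative assumption:

```python
def is_likely_infant_vocalization(fundamental_hz):
    """Predetermined condition from the text: treat a segment as an infant
    vocalization only when its fundamental frequency is 300-800 Hz
    (function name is illustrative)."""
    return fundamental_hz is not None and 300.0 <= fundamental_hz <= 800.0

print(is_likely_infant_vocalization(380.0))   # infant cry range → True
print(is_likely_infant_vocalization(150.0))   # likely an adult voice → False
print(is_likely_infant_vocalization(1200.0))  # likely environmental sound → False
```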
  • Specifically, as will be described later, the physical condition determination unit 64 calculates the similarity between the plurality of parameters recorded in the storage 4 and the plurality of parameters extracted based on the plurality of frequency components included in the new voice data. Based on the calculated similarity, the physical condition determination unit 64 determines whether or not the baby is in a state of poor physical condition. When it determines that the baby is in poor physical condition, the physical condition determination unit 64 sends a notification signal to that effect to the electronic device 10 by wireless communication via the communication module 8.
  • the electronic device 10 includes a processor 11, a memory 12, an input interface 13, a message output device 14, and a communication module 15.
  • The processor 11 functions logically as an input control unit 111 that controls the operation of the input interface 13 and a message control unit 112 that controls the message output device 14 by executing the computer program stored in the memory 12.
  • the input interface 13 is, for example, a touch panel.
  • the period information described later is input to the input interface 13 as screen input data by the user.
  • The message output device 14 is, for example, a display and/or a touch panel, and has the functions of outputting a message notifying a user, such as the baby's caregiver, that the baby is in poor physical condition, and of displaying an input screen for the period information described later. The message control unit 112 receives the notification signal indicating that the baby is in poor physical condition from the voice detection device 5 by wireless communication via the communication module 15.
  • FIG. 2 is a diagram illustrating how voice data representing the utterance of the baby 1 is acquired.
  • the voice detection device 5 is installed in the vicinity of the baby 1 so that voice data representing the utterance of the baby 1 can be acquired.
  • Since the voice data 50 acquired by the voice detection device 5 is an electric signal, it can be represented by a waveform as illustrated in FIG. 2B, with time on the horizontal axis and amplitude on the vertical axis.
  • the voice data 50 represented by the waveform includes a plurality of frequency components.
  • the above-mentioned plurality of parameters extracted based on the plurality of frequency components included in the voice data representing the vocalization of the baby 1 are different depending on whether the baby 1 is in a poor physical condition or not.
  • When the baby 1 is in poor physical condition, its vocalization is more likely to contain noise than in normal times. Therefore, by recording the values of the plurality of parameters in the storage 4 as recorded data as described above, it is possible to determine whether or not the baby 1 is in a state of poor physical condition based on the recorded values of the plurality of parameters and the value of at least one parameter extracted from newly acquired voice data representing the utterance of the baby 1.
  • The parameter recording process by which the recorded data is recorded will now be described.
  • When the voice data 50 representing the utterance of the baby 1 illustrated in FIG. 2B is acquired, the input screen illustrated in FIG. 2C is displayed on the message output device 14 of the electronic device 10.
  • Through this input screen, the caregiver of the baby 1 can be asked about the period during which the baby 1 was sick.
  • the message 141 and the period information 142 are displayed on the input screen illustrated in FIG. 2C. Specifically, as the message 141, the message "Please input the period of illness of the child" is displayed with respect to the voice data 50 exemplified in FIG. 2 (b).
  • Period information 142 indicating the beginning and end of the period during which the baby 1 was sick is input by the caregiver of the baby 1. In this example, the period information 142 indicates that the period during which the baby 1 was sick began at 18:00 on March 5, 2019 and ended at 9:00 on March 6, 2019. The period information 142 input by the caregiver of the baby 1 is transmitted to the voice detection device 5 via the communication module 15 under the control of the input control unit 111 of the processor 11 of the electronic device 10.
  • The period during which the baby 1 is sick, indicated by the period information 142, is shown as the period T1 in the voice data 50 illustrated in FIG. 2(b). During the period T1, in which the baby 1 is sick, a waveform different from that of the normal state, in which the baby 1 is not sick, is considered to be observed in the acquired voice data. Since the period T1 is a period in which the baby 1 is sick, the voice parameter extraction unit 63 of the processor 6 included in the voice detection device 5 specifies the period T1 as a poor physical condition period, based on the period information 142 acquired from the electronic device 10 via the communication module 8. In the example shown in FIG. 2B, the period T1 is the period from 18:00 on March 5, 2019 to 9:00 on March 6, 2019.
  • When the baby 1 changes from a state of not being sick to a state of being sick, it is considered that the baby 1 is already becoming sick during the transitional period.
  • Therefore, the voice parameter extraction unit 63 of the processor 6 included in the voice detection device 5 may also specify the period T2, which precedes the period T1, as a poor physical condition period in which the baby 1 is in a state of poor physical condition. In the example shown in FIG. 2B, the period T2 is the period from 15:00 on March 5, 2019 to 18:00 on March 5, 2019.
  • The voice parameter extraction unit 63 specifies, for example, the three-hour time period immediately before the period T1 as the period T2.
  • Alternatively, the period T2 may be specified by the voice parameter extraction unit 63 as a period before the period T1 in which a waveform different from normal times is observed in the voice data 50.
  • the processor 6 included in the voice detection device 5 may specify the period T3, which is a combination of the period T1 and the period T2 illustrated in FIG. 2B, as the period of poor physical condition in which the baby 1 is in a state of poor physical condition.
  • the period T3 is the period from 15:00 on March 5, 2019 to 9:00 on March 6, 2019.
  • the frequency component detection unit 62 of the processor 6 included in the voice detection device 5 detects a plurality of frequency components included in the voice data 50 of the voice including the utterance of the baby 1.
  • In the following, it is assumed that a plurality of frequency components are detected by the frequency component detection unit 62, and that the fundamental frequency, the first formant frequency, and the second formant frequency of the vocalization of the infant 1 are obtained from those frequency components every second.
  • the plurality of frequency components may be obtained every predetermined time, such as every 5 seconds or every 10 seconds instead of every second.
  • The voice parameter extraction unit 63 extracts a plurality of parameters based on the detected plurality of frequency components.
  • FIG. 3 is a diagram illustrating the plurality of frequency components included in the voice data 50 representing the utterance of the baby 1 and the recorded data 41 in which the values of the plurality of parameters based on those frequency components are recorded.
  • the recorded data 41 is recorded in the storage 4 for each baby by the voice parameter extraction unit 63.
  • FIG. 3A shows time-series data 31 including, for each second, the fundamental frequency, the first formant frequency, and the second formant frequency of the infant 1's utterance detected by the frequency component detection unit 62.
  • the time series data 31 is recorded in the storage 4.
  • The time-series data 31 shown in FIG. 3A corresponds to the voice data 50 in the poor physical condition period T1 or T3 of the baby 1 shown in FIG. 2B, and is generated by adding, every second, a record containing the fundamental frequency and the first and second formant frequencies of the vocalization of the infant 1 at each time, based on the voice data 50.
  • A poor physical condition period display flag is associated with each per-second record composed of those frequency components.
  • the poor physical condition period display flag is associated with each record by the voice parameter extraction unit 63 after the time series data 31 is recorded in the storage 4 by the frequency component detection unit 62.
  • In records at times when the infant 1 did not utter, none of the fundamental frequency, the first formant frequency, or the second formant frequency is detected, so the value "NULL" is set.
  • In records at times included in the poor physical condition period, the value "1" is set as the poor physical condition period display flag, and in records at times outside that period, the value "0" is set.
  • This may save processing by the processor 6.
  • "Null” is logically used as each frequency component value of the record at the time when the baby 1 does not utter. Be done.
  • The number of vocalizations of the infant 1 per unit time and the duration of one vocalization of the infant 1 are calculated by the voice parameter extraction unit 63 for each reference time, taking each second as a reference time and looking back over the period of the past one hour from that reference time. They are calculated as follows. For example, if the reference time is 19:31:44 on March 5, 2019, they are calculated based on whether or not a value of the fundamental frequency of the vocalization of the infant 1 exists in each of the 3600 records of the time-series data 31 covering the past hour, that is, from 18:31:45 on March 5, 2019 to 19:31:44 on March 5, 2019.
  • For example, if a value of the fundamental frequency exists in two consecutive records, the duration of that one utterance is calculated as 2 seconds.
  • Suppose that the number of utterances per hour of the infant 1 is 5, and that the durations of these 5 utterances are 120 seconds, 100 seconds, 80 seconds, 110 seconds, and 100 seconds, respectively. Then 102 seconds, the average of the five utterance durations, is calculated as the value of the utterance duration (seconds) per utterance of the infant 1.
  • Note that when the fundamental frequency disappears at a given time, it cannot necessarily be said that the utterance of the baby 1 ended after k seconds, because the baby 1 may resume vocalizing after pausing for a few seconds to take a breath. Therefore, if the time during which the baby 1 is not uttering is less than a predetermined number of seconds, for example less than 10 seconds, it may be interpreted that the utterance of the baby 1 does not end at k seconds but continues beyond k seconds.
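The counting rule above, per-second frames in which silent gaps shorter than a threshold (for example 10 seconds) are treated as breaths inside one continuing utterance, can be sketched as follows; the function name and window layout are assumptions:

```python
def utterance_stats(f0_by_second, max_gap=10):
    """Return (number of utterances, average duration in seconds) for a
    window of per-second fundamental-frequency values (None = silence).
    Silences shorter than max_gap seconds are treated as breaths within
    a single continuing utterance, as described in the text."""
    durations = []
    start = None        # second at which the current utterance began
    last_voiced = None  # most recent second with a fundamental frequency
    for i, f0 in enumerate(f0_by_second):
        if f0 is None:
            continue
        if start is None:
            start = i
        elif i - last_voiced - 1 >= max_gap:
            # the silent gap was long enough: the previous utterance ended
            durations.append(last_voiced - start + 1)
            start = i
        last_voiced = i
    if start is not None:
        durations.append(last_voiced - start + 1)
    count = len(durations)
    return count, (sum(durations) / count if count else 0.0)

# 120 s of crying, 15 s silence (splits), 50 s cry, 5 s breath (merges), 50 s cry
window = [400.0] * 120 + [None] * 15 + [400.0] * 50 + [None] * 5 + [400.0] * 50
print(utterance_stats(window))  # → (2, 112.5)
```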
  • In this way, the number of utterances per hour of the infant 1 and the utterance duration of the infant 1 are obtained as voice parameters extracted by the voice parameter extraction unit 63 based on the time-series data 31.
  • The parameter values in the poor physical condition period and the parameter values in normal times are recorded separately and updated every predetermined period.
  • The poor physical condition period is the portion of the period in which the voice data 50 was acquired that is included in the above-mentioned period T1, T2, or T3, and is identified based on the poor physical condition period display flag set in the time-series data 31 illustrated in FIG. 3A. The normal period is part or all of the period in which the voice data 50 was acquired other than the poor physical condition period.
  • For the number of utterances per hour of the baby 1, the average of the values calculated at each one-second reference time over the past year is used, separately for the poor physical condition periods and the normal periods, at the timing of recording or updating.
  • For the utterance duration (seconds) per utterance of the baby 1, the average value over the past year in each of the poor physical condition periods and the normal periods is likewise used.
  • For the fundamental frequency, the first formant frequency, and the second formant frequency of the vocalization of the infant 1, the averages of the values recorded in the records of the time-series data 31 over the past year are used, again separately for the poor physical condition periods and the normal periods. Although the average over the past year is used for the above-mentioned five types of parameters, this average is not limited to a simple average; it may be, for example, a weighted average in which relatively recent values within the past year are given greater weight than older values.
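The recency-weighted average mentioned here might look like the following exponential-decay sketch; the 90-day half-life and the daily granularity are illustrative assumptions, not figures from the text:

```python
def recency_weighted_average(values, half_life_days=90.0):
    """Weighted average of daily values (oldest first) in which recent
    values count more, via exponential decay; the 90-day half-life is an
    illustrative assumption."""
    n = len(values)
    weights = [0.5 ** ((n - 1 - i) / half_life_days) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# A year of daily fundamental-frequency values that drifted from 300 to 400 Hz.
year = [300.0] * 180 + [400.0] * 180
print(round(sum(year) / len(year)))           # simple average → 350
print(round(recency_weighted_average(year)))  # recent-weighted → 380
```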
  • An example of the voice parameters extracted by the voice parameter extraction unit 63 in this way is shown in FIG. 3(b).
  • In the example of FIG. 3(b), the parameter values recorded for the poor physical condition period of the infant 1 are: number of utterances per hour, 30 times/hour; utterance duration per utterance, 120 seconds; fundamental frequency of the utterance, 380 Hz; first formant frequency of the utterance, 1250 Hz; second formant frequency of the utterance, 2700 Hz.
  • The parameter values recorded for the normal period of the infant 1 are: number of utterances per hour, 20 times/hour; utterance duration per utterance, 80 seconds; fundamental frequency of the utterance, 350 Hz; first formant frequency of the utterance, 1200 Hz; second formant frequency of the utterance, 2500 Hz.
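The recorded data 41 of FIG. 3(b) could be represented as a small per-baby mapping of the two profiles; the key names are illustrative assumptions:

```python
# recorded_data mirrors FIG. 3(b): five averaged parameters per period.
recorded_data = {
    "poor_condition": {"utterances_per_hour": 30.0, "duration_per_utterance_s": 120.0,
                       "fundamental_hz": 380.0, "formant1_hz": 1250.0, "formant2_hz": 2700.0},
    "normal":         {"utterances_per_hour": 20.0, "duration_per_utterance_s": 80.0,
                       "fundamental_hz": 350.0, "formant1_hz": 1200.0, "formant2_hz": 2500.0},
}
# In this example the sick baby cries more often, longer, and at higher pitch.
shift = recorded_data["poor_condition"]["fundamental_hz"] - recorded_data["normal"]["fundamental_hz"]
print(shift)  # → 30.0
```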
  • the recorded data 41 is recorded in the storage 4 by the voice parameter extraction unit 63 included in the processor 6 of the voice detection device 5.
  • FIG. 4 is a diagram showing an example of parameter recording processing executed by running a computer program on the voice detection device 5 according to the present embodiment.
  • The processor 6 of the voice detection device 5 executes the computer program stored in the memory 7 to carry out the process shown in FIG. 4, thereby functioning as the voice data acquisition unit 61, the frequency component detection unit 62, and the voice parameter extraction unit 63.
  • this parameter recording process is repeatedly executed every second.
  • In step S410, the voice data acquisition unit 61 acquires voice data 50 representing the utterance of the baby 1 via the microphone 3.
  • In step S420, the frequency component detection unit 62 detects a plurality of frequency components included in the voice data 50 acquired in step S410, and generates the time-series data 31 illustrated in FIG. 3A.
  • the fundamental frequency of the utterance of the infant 1, the first formant frequency of the utterance of the infant 1, and the second formant frequency of the utterance of the infant 1 are detected.
  • the time series data 31 is recorded in the storage 4 by the frequency component detection unit 62.
  • In step S430, the voice parameter extraction unit 63 specifies the poor physical condition period T3 based on the period information 142 acquired from the electronic device 10 via the communication module 8. For example, the period T3 from 15:00 on March 5, 2019 to 9:00 on March 6, 2019 illustrated in FIG. 2B is specified as the poor physical condition period. By referring to the specified poor physical condition period T3, the voice parameter extraction unit 63 identifies, in the time-series data 31 generated in step S420, the poor physical condition period in which the baby 1 was in a state of poor physical condition, and sets the poor physical condition period display flag in association with the plurality of frequency components included in the time-series data 31.
  • In step S440, the voice parameter extraction unit 63 extracts a plurality of parameters based on the plurality of frequency components detected in step S420.
  • In this processing step, for example, the average number of utterances of the infant 1 per unit time, the average duration of one utterance of the infant 1, the average fundamental frequency of the utterance of the infant 1, the average first formant frequency of the utterance of the infant 1, and the average second formant frequency of the utterance of the infant 1 are extracted for each of the poor physical condition period and the normal period other than it.
  • In step S450, the voice parameter extraction unit 63 records the values of the plurality of parameters extracted in step S440 in the storage 4 as the recorded data 41. When the process of step S450 is completed, this parameter recording process ends.
  • FIG. 5 is a diagram illustrating a case where voice data 51 representing the vocalization of the baby 1 is acquired and the baby 1 is determined to be in a poor physical condition.
  • the voice data is first recorded.
  • the acquisition unit 61 acquires new voice data 51 representing the utterance of the baby illustrated in FIG. 5 (a).
  • the frequency component detection unit 62 detects a plurality of frequency components included in the acquired new voice data 51.
  • a plurality of parameters are extracted from the acquired new voice data 51 by the voice parameter extraction unit 63 in the same manner as the parameter recording process described above.
  • the physical condition determination unit 64 determines whether or not the baby 1 is in a poor physical condition.
• the similarity used in the determination by the physical condition determination unit 64 described above is calculated, for example, as follows.
  • the sum of squares X of the difference between the value of each parameter extracted from the new voice data 51 and the value of each parameter in the period of poor physical condition of the baby 1 recorded in the recorded data 41 is calculated.
  • the sum of squares Y of the difference between the value of each parameter extracted from the new voice data 51 and the value of each parameter recorded in the recorded data 41 in normal times of the baby 1 is calculated.
• When the value of the sum of squares X is smaller than the value of the sum of squares Y, it means that the plurality of parameters extracted from the new voice data 51 are more similar to the plurality of parameters in the poor physical condition period of the baby 1 than to the plurality of parameters in the normal time of the baby 1.
• In that case, the physical condition determination unit 64 determines that the baby 1 is in a poor physical condition.
• The reciprocals 1/X and 1/Y of the sums of squares X and Y may be used as the similarity values mentioned above.
• The value of each parameter may also be weighted according to its importance. For example, among the plurality of parameters described above, the weights given to the duration of one vocalization of the infant 1 and to the fundamental frequency of the vocalization of the infant 1 may be set larger than the weights given to the number of vocalizations of the infant 1 per unit time, the first formant frequency of the vocalization of the infant 1, and the second formant frequency of the vocalization of the infant 1, and the sum of squares X and the sum of squares Y are then calculated using these weights.
• In the above description, not only the parameter values during the poor physical condition period but also the parameter values in normal times are recorded; alternatively, the parameter values in normal times may be left unrecorded and only the parameter values during the poor physical condition period recorded. In that case, in the similarity calculation described above, the sum of squares Y of the differences between the value of each parameter extracted from the new voice data 51 and the value of each parameter in normal times of the baby 1 recorded in the recorded data 41 is not calculated; only the sum of squares X of the differences between the value of each parameter extracted from the new voice data 51 and the value of each parameter during the poor physical condition period of the baby 1 recorded in the recorded data 41 is calculated.
• The value of the sum of squares X, or its reciprocal 1/X, is then used as the similarity value.
• When the value of the sum of squares X is smaller than a predetermined threshold value, it means that the plurality of parameters extracted from the new voice data 51 have a high degree of similarity with the plurality of parameters during the poor physical condition period of the baby 1.
• In that case, the physical condition determination unit 64 determines that the baby 1 is in a poor physical condition.
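The sum-of-squares similarity described above, including the optional weighting and the threshold-only variant, can be sketched as follows. The function names and the list-based parameter representation are assumptions for illustration, not from the patent.

```python
def sum_of_squares(new_vals, ref_vals, weights=None):
    """Weighted sum of squared differences between the parameter values
    extracted from the new voice data and a recorded reference profile."""
    if weights is None:
        weights = [1.0] * len(new_vals)
    return sum(w * (n - r) ** 2 for w, n, r in zip(weights, new_vals, ref_vals))

def judged_unwell(new_vals, unwell_vals, normal_vals=None, threshold=None):
    """X < Y when both profiles are recorded; X < threshold when only the
    poor-physical-condition profile is available."""
    x = sum_of_squares(new_vals, unwell_vals)
    if normal_vals is not None:
        return x < sum_of_squares(new_vals, normal_vals)
    return x < threshold
```

With both profiles, a smaller X than Y means the new vocalization is closer to the recorded poor-physical-condition profile; with only the poor-physical-condition profile, a fixed threshold plays the role of Y.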
• When the physical condition determination unit 64 determines that the baby 1 is in a poor physical condition, it sends a notification signal indicating that the baby 1 is in a poor physical condition to the electronic device 10 via the communication module 8.
• When the message control unit 112 of the processor 11 of the electronic device 10 receives the transmitted notification signal via the communication module 15, it outputs a message 145 to the message output device 14 indicating that the baby 1 is in poor physical condition, as illustrated in FIG. 5B.
• In the example shown in FIG. 5B, the message "Your child has or is getting sick. We recommend seeing a doctor." is displayed as the message 145 on the screen of the message output device 14.
• The message 145 also includes a search button display 146 that says "search for a nearby hospital" so that the caregiver of the baby 1 who sees this message can search for a hospital near the current location.
• In addition, the voice parameter extraction unit 63 may set the value "1" as the poor physical condition period display flag for the record in the time series data 31 corresponding to the time when the voice data 51 was acquired by the voice data acquisition unit 61.
  • FIG. 6 is a diagram showing an example of a physical condition determination process executed by running a computer program on the voice detection device 5 according to the present embodiment.
• The processor 6 of the voice detection device 5 activates the computer program stored in the memory 7 and executes the process shown in FIG. 6, thereby functioning as the voice data acquisition unit 61, the frequency component detection unit 62, the voice parameter extraction unit 63, and the physical condition determination unit 64.
• In step S610, the voice data acquisition unit 61 acquires voice data 51 representing the utterance of the baby 1 via the microphone 3.
• In step S620, the frequency component detection unit 62 detects a plurality of frequency components included in the voice data 51 acquired in step S610. In this processing step, for example, the fundamental frequency, the first formant frequency, and the second formant frequency of the utterance of the infant 1 are detected.
• In step S630, the voice parameter extraction unit 63 extracts a plurality of parameters based on the plurality of frequency components detected in step S620.
• In step S640, the voice parameter extraction unit 63 determines whether or not the fundamental frequency of the vocalization of the baby 1, among the plurality of parameters based on the plurality of frequency components included in the voice data 51 extracted in step S630, satisfies a predetermined condition.
• Here, the predetermined condition is that the fundamental frequency of the vocalization of the baby 1 is 300 Hz or more and 800 Hz or less. If this condition is not satisfied, it is highly probable that the utterance included in the voice data 51 acquired in step S610 was not actually an utterance of the baby; a negative determination is therefore obtained in the fundamental frequency condition determination process in step S640, and the physical condition determination process ends.
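The 300 Hz to 800 Hz gate of step S640 amounts to a single range check; the function name below is illustrative only.

```python
def is_infant_f0(f0_hz, low_hz=300.0, high_hz=800.0):
    """Return True only when the detected fundamental frequency falls in the
    range expected for an infant's vocalization; outside this range the sound
    is probably not the baby's, so the determination ends without a judgment."""
    return low_hz <= f0_hz <= high_hz
```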
• In step S650, the physical condition determination unit 64 refers to the recorded data 41 recorded in the storage 4 in the parameter recording process described above.
• In step S660, the physical condition determination unit 64 calculates the similarity between the values of the plurality of parameters extracted in step S630 and the values of the plurality of parameters recorded in the recorded data 41, using the similarity calculation method described above.
  • the values of the plurality of parameters extracted in step S630 are those extracted based on the plurality of frequency components included in the voice data 51.
• The values of the plurality of parameters recorded in the recorded data 41 are those extracted based on the plurality of frequency components included in the voice data 50 for each of the poor physical condition period and the normal time other than the poor physical condition period.
• In step S670, the physical condition determination unit 64 determines whether or not the baby 1 is in a poor physical condition based on the similarity calculated in step S660. If a negative determination is obtained in the determination process in step S670, the physical condition determination process ends.
• If a positive determination is obtained in step S670, then in step S680 the physical condition determination unit 64 sends a notification signal indicating that the baby 1 is in poor physical condition to the electronic device 10 via the communication module 8. As illustrated in FIG. 5B, the message output device 14 of the electronic device 10 then outputs a message 145 to the effect that the baby 1 is in poor physical condition.
• This physical condition determination process then ends.
  • the voice detection device 5 included in the physical condition detection system 2 includes a voice data acquisition unit 61, a frequency component detection unit 62, a voice parameter extraction unit 63, and a physical condition determination unit 64.
  • the voice data acquisition unit 61 acquires voice data 50 representing the utterance of the baby 1.
  • the frequency component detection unit 62 detects a plurality of frequency components included in the voice data 50.
  • the voice parameter extraction unit 63 extracts a plurality of parameters based on a plurality of frequency components included in the voice data 50.
• When the voice data acquisition unit 61 newly acquires voice data 51 representing the utterance of the baby 1, the physical condition determination unit 64 determines whether or not the baby 1 is in a poor physical condition based on the newly acquired voice data 51 and the plurality of parameters already extracted.
• Therefore, there is a high possibility that the voice detection device 5 included in the physical condition detection system 2 in the present embodiment can detect that the baby 1 has been affected by any of various possible diseases.
  • the frequency component detection unit 62 further detects a plurality of frequency components included in the newly acquired voice data 51.
  • the voice parameter extraction unit 63 further extracts a plurality of parameters based on the plurality of frequency components included in the newly acquired voice data 51.
• When the fundamental frequency of the utterance of the baby 1, among the plurality of parameters based on the plurality of frequency components included in the newly acquired voice data 51, satisfies the condition of being 300 Hz or more and 800 Hz or less, the physical condition determination unit 64 determines whether or not the baby 1 is in a poor physical condition. Therefore, when the utterance included in the voice data 51 is not actually an utterance of the baby 1, it is unlikely that the physical condition determination process for the baby 1 is mistakenly performed.
• The physical condition determination unit 64 determines whether or not the baby 1 is in a poor physical condition based on the degree of similarity between the plurality of parameters based on the plurality of frequency components included in the voice data 50 acquired in the past and the plurality of parameters based on the plurality of frequency components included in the newly acquired voice data 51. Therefore, a stable determination result regarding whether or not the baby 1 is in a poor physical condition can be obtained.
  • the physical condition detection system 2 further includes an electronic device 10 having a message output device 14.
• When the physical condition determination unit 64 determines that the baby 1 is in poor physical condition, the message output device 14 of the electronic device 10 outputs a message indicating that the baby 1 is in poor physical condition.
  • the caregiver of the baby 1 who sees the output message can promptly take measures such as searching for a hospital near the current location.
• FIG. 7 is a diagram illustrating product supply modes of the computer program run by the voice detection device 5 of the physical condition detection system 2 in the above-described embodiment.
  • the computer program running on the voice detection device 5 can be provided to the voice detection device 5 through a recording medium 45 such as a CD-ROM or a USB memory, or a data signal flowing through a communication network 30 such as the Internet.
• The computer program is read from the recording medium 45 by the operation terminal 46, which is connected to the voice detection device 5 wirelessly or by wire, and is provided to the voice detection device 5.
  • the computer program providing server 40 is a server computer that provides the above-mentioned computer program, and stores the computer program in a storage device such as a hard disk.
  • the communication network 30 is the Internet, a wireless LAN, a telephone network, a dedicated line, or the like.
  • the computer program providing server 40 reads out the computer program stored in the storage device, puts it on a carrier wave as a data signal, and transmits it to the voice detection device 5 or the operation terminal 46 via the communication network 30.
  • the operation terminal 46 provides the computer program to the voice detection device 5.
  • the computer program can be supplied as a computer-readable computer program product in various forms such as a recording medium and a data signal.
• In Modification 1, when the voice data 50 for the parameter recording process is acquired by the voice data acquisition unit 61 of the processor 6 included in the voice detection device 5 of the physical condition detection system 2, the voice parameter extraction unit 63 of the processor 6 identifies the poor physical condition period in which the infant 1 was in a state of poor physical condition based on the period information 142 acquired from the electronic device 10 via the communication module 8.
  • the period T3 from 15:00 on March 5, 2019 to 9:00 on March 6, 2019 is specified as the period of poor physical condition.
• The period T3 is a period combining the period T1 in which the baby 1 is sick and the period T2 in which the baby 1 is becoming sick.
• Alternatively, only the period T1 in which the infant 1 is sick (the period from 18:00 on March 5, 2019 to 9:00 on March 6, 2019 illustrated in FIG. 2B), or only the period T2 in which the infant 1 is becoming sick (the period from 15:00 on March 5, 2019 to 18:00 on March 5, 2019 illustrated in FIG. 2B), may be specified as the poor physical condition period.
  • the physical condition determination unit 64 determines whether or not the baby 1 is in a state of poor physical condition. At this time, since the state in which the baby 1 is getting sick is also detected, there is a high possibility that the deterioration of the physical condition of the baby 1 can be prevented.
• In Modification 2, the physical condition determination unit 64 included in the processor 6 of the voice detection device 5 determines whether or not the infant 1 is in a poor physical condition based on the degree of similarity between the values of the plurality of parameters based on the plurality of frequency components included in the new voice data 51 and the values of the plurality of parameters based on the plurality of frequency components included in the past voice data 50, recorded in the recorded data 41 for each of the poor physical condition period and the normal time other than the poor physical condition period.
  • FIG. 8 is a diagram illustrating recorded data 41 used in the physical condition determination process executed by the physical condition detection system in the second modification.
  • the recorded data 41 illustrated in FIG. 8 is different from the recorded data 41 illustrated in FIG. 3 (b) in that the standard deviation of the value of each parameter is recorded.
  • the recorded data 41 illustrated in FIG. 8 records the values of a plurality of parameters based on the plurality of frequency components included in the past voice data 50.
• Specifically, for each of the poor physical condition period x and the normal time y other than the poor physical condition period, the number of utterances of the infant 1 per unit time C1 (that is, C1x and C1y), the duration of one utterance of the infant 1 C2 (C2x and C2y), the fundamental frequency of the utterance of the infant 1 C3 (C3x and C3y), the first formant frequency of the utterance of the infant 1 C4 (C4x and C4y), and the second formant frequency of the utterance of the infant 1 C5 (C5x and C5y) are recorded, together with their standard deviations S1 (S1x and S1y), S2 (S2x and S2y), S3 (S3x and S3y), S4 (S4x and S4y), and S5 (S5x and S5y).
• From the newly acquired voice data 51, the number of vocalizations of the infant 1 per unit time M1, the duration of one utterance M2, the fundamental frequency M3, the first formant frequency M4, and the second formant frequency M5 are extracted.
  • the above-mentioned similarity may be determined based on the Euclidean distances Dx and Dy calculated by the following equations (1) and (2).
• The Euclidean distance Dx represents the distance between the value of each parameter extracted from the voice data 51 and the value of each parameter in the poor physical condition period x of the infant 1, and the Euclidean distance Dy represents the distance between the value of each parameter extracted from the voice data 51 and the value of each parameter in the normal time y of the infant 1.
• Dx² = {(C1x - M1)/S1x}² + {(C2x - M2)/S2x}² + {(C3x - M3)/S3x}² + {(C4x - M4)/S4x}² + {(C5x - M5)/S5x}²  … (1)
• Dy² = {(C1y - M1)/S1y}² + {(C2y - M2)/S2y}² + {(C3y - M3)/S3y}² + {(C4y - M4)/S4y}² + {(C5y - M5)/S5y}²  … (2)
• The Euclidean distances Dx and Dy calculated by the above equations (1) and (2) are compared with each other; the smaller the Euclidean distance, the higher the similarity.
• When Dx is smaller than Dy, the physical condition determination unit 64 determines that the baby 1 is in a state of poor physical condition.
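Equations (1) and (2) can be sketched as a standardized Euclidean distance; the function names below are illustrative, not from the patent.

```python
import math

def standardized_distance(recorded_means, recorded_stds, new_vals):
    """Distance of equations (1)/(2): each difference (Ci - Mi) is divided
    by the recorded standard deviation Si of that parameter before squaring."""
    return math.sqrt(sum(((c - m) / s) ** 2
                         for c, m, s in zip(recorded_means, new_vals, recorded_stds)))

def closer_to_unwell(dx, dy):
    """The smaller distance means the higher similarity: Dx < Dy -> unwell."""
    return dx < dy
```

Dividing by the standard deviation puts parameters with very different scales, such as utterance counts and formant frequencies in Hz, on a comparable footing before the distances Dx and Dy are compared.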
  • the caregiver of the baby 1 may determine whether the physical condition determination result obtained through the physical condition determination process illustrated in FIG. 6 is correct or not.
  • the determination result by the caregiver is used for the identification process of the poor physical condition period by the voice parameter extraction unit 63, which is performed in step S430 of the parameter recording process illustrated in FIG.
• As a result, the values of the parameters recorded in the recorded data 41 for each of the poor physical condition period and the normal period can become more accurate.
• Furthermore, by feeding back the caregiver's judgment results as teacher data for machine learning, it becomes possible to improve the identification accuracy of the poor physical condition period within the period in which the voice data 50 including the utterance of the baby 1 is acquired.
• If the baby 1 is in a poor physical condition, the value "1" is set in the poor physical condition period display flag; if the baby 1 is in a very healthy state, the value "2" is set; if the baby 1 is not in a very healthy state but is in a "normal" state that is not poor physical condition, the value "3" is set; and if the physical condition is unknown, the value "4" is set. By doing so, it may be possible to estimate the physical condition of the baby 1 not only when the baby 1 is in a poor physical condition.
  • FIG. 9 is a diagram showing a configuration of a voice detection device 5 included in the physical condition detection system 2 in the modified example 5.
  • the processor 6 of the voice detection device 5 shown in FIG. 9 is different from the processor 6 of the voice detection device 5 shown in FIG. 1B in that it has an emotion estimation unit 65.
• The frequency component detection process and the voice parameter extraction process for the voice data including the utterance of the baby 1 are performed in the same manner as the frequency component detection process and the voice parameter extraction process in the parameter recording process shown in FIG. 4 and the physical condition determination process shown in FIG. 6.
• The emotion estimation unit 65 calculates the degree of similarity between the parameters extracted from the newly acquired voice data and the parameters extracted from voice data acquired in the past and recorded in association with emotion types, and thereby estimates the emotion of the infant 1 as one of a plurality of classified emotion types.
• The emotion estimation result for the baby 1 produced by the emotion estimation unit 65 may be used as the emotion type that is associated with the parameters extracted from the acquired voice data and recorded.
• The voice parameters used for estimating the emotions of the baby 1 by the emotion estimation unit 65 and the voice parameters used for determining the poor physical condition of the baby 1 by the physical condition determination unit 64 may be, for example, the same kinds of parameters extracted by the voice parameter extraction unit 63.
• The parameters of the same kind are, for example, the number of utterances of the infant 1 per unit time, the duration of one utterance of the infant 1, the fundamental frequency of the utterance of the infant 1, the first formant frequency of the utterance of the infant 1, and the second formant frequency of the utterance of the infant 1.
  • the emotion estimation unit 65 and the physical condition determination unit 64 may be integrally configured.
  • a functional unit in which the emotion estimation unit 65 and the physical condition determination unit 64 are integrally configured is referred to as an emotion estimation and physical condition determination unit.
  • the parameter recording process illustrated in FIG. 4 is realized by letting the learning model learn the teacher data.
  • the teacher data is the frequency component detection result detected in step S420, and may be, for example, spectrogram image data corresponding to the audio data 50 acquired in step S410.
• Spectrogram image data is input to the input layer of the learning model, and, for example, six classification items "unwell", "happy", "angry", "hungry", "sleepy", and "wants to play" are set in the output layer.
  • Each of those classification items set in the output layer is classified by the learning model according to the above-mentioned parameters extracted based on the plurality of frequency components included in the voice data 50.
  • the learning model performs machine learning with the input layer and the output layer set. As a result, a series of processes such as parameter extraction in steps S430, S440 and S450 of FIG. 4 by the voice parameter extraction unit 63 are completed.
• In the combined emotion estimation and physical condition determination process, obtained by adding the emotion estimation process to the physical condition determination process illustrated in FIG. 6, the frequency component detection result detected in step S620, for example spectrogram image data, is input to the machine-learned learning model, and the learning model outputs the result of classifying the input data.
  • the learning model outputs the degree of similarity between the input image data and each of the above-mentioned six classification items as the classification result of the input data.
• This completes the parameter extraction process in step S630 of FIG. 6 by the voice parameter extraction unit 63, the parameter similarity calculation in steps S650 and S660 by the physical condition determination unit, and the emotion estimation process.
• The fundamental frequency condition determination processing in step S640 of FIG. 6 by the voice parameter extraction unit 63 can also be included in the output of the learning model, by having the model take into consideration the fundamental frequency condition that the fundamental frequency of the utterance of the infant 1 is 300 Hz or more and 800 Hz or less.
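The classification output described above can be sketched as follows. This is a toy stand-in for the learned model: the patent does not specify an architecture, so the single linear layer followed by a softmax, and all names, are assumptions for illustration only.

```python
import math

LABELS = ("unwell", "happy", "angry", "hungry", "sleepy", "wants to play")

def classify(features, weights, bias):
    """Map flattened spectrogram features to a similarity score per
    classification item: one linear layer, then a softmax over six items."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    peak = max(logits)                      # subtract the peak for stability
    exp = [math.exp(v - peak) for v in logits]
    total = sum(exp)
    return {label: e / total for label, e in zip(LABELS, exp)}
```

The returned scores sum to one and can be read as the degree of similarity between the input data and each of the six classification items, as described for the learning model above.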
• In the above description, the physical condition detection system 2 includes the voice detection device 5 and the electronic device 10, and the processor 6 of the voice detection device 5 is logically provided with the voice data acquisition unit 61, the frequency component detection unit 62, the voice parameter extraction unit 63, the physical condition determination unit 64, and the emotion estimation unit 65.
• However, the voice detection device 5 and the electronic device 10 may be integrally configured.
  • the processor 6 of the voice detection device 5 may have a voice data acquisition unit 61 and a frequency component detection unit 62, and the voice parameter extraction unit 63 and the physical condition determination unit 64 may be included in the processor 11 of the electronic device 10.
• Alternatively, the processor 6 of the voice detection device 5 may have the voice data acquisition unit 61 and the frequency component detection unit 62, while the processor 11 of the electronic device 10 has the voice parameter extraction unit 63, the physical condition determination unit 64, and the emotion estimation unit 65.
  • FIG. 10 is a diagram showing the configuration of the physical condition detection system 2 in the modified example 7.
• In this modification, the physical condition detection system 2 includes the voice detection device 5, the electronic device 10, and the physical condition determination device 20, which are connected to each other via the communication network 30.
  • the physical condition determination device 20 may be, for example, a large-capacity server, and may perform physical condition determination processing of not only the baby 1 but also other infants.
  • the voice detection device 5 shown in FIG. 10B has a microphone 3, a storage 4, a processor 6, a memory 7, and a communication module 8 similar to the voice detection device 5 shown in FIG. 1B.
• The processor 6 of the voice detection device 5 shown in FIG. 10B logically has the frequency component detection unit 62 and the voice parameter extraction unit 63 by activating the computer program stored in the memory 7; unlike the voice detection device 5 shown in FIG. 1B, it does not perform the physical condition determination process and therefore is not provided with the physical condition determination unit 64.
  • the voice parameter extraction unit 63 does not record the extracted plurality of parameters in the storage 4 as recorded data 41, but transmits the extracted parameters to the electronic device 10 via the communication module 8 as described later.
• Steps S410 to S440 of the parameter recording process illustrated in FIG. 4 are executed, in this modification, by the voice detection device 5 shown in FIG. 10(b), in the same manner as by the voice detection device 5 shown in FIG. 1(b) in the above-described embodiment.
• The extracted values of the plurality of parameters are transmitted by the voice parameter extraction unit 63 to the electronic device 10 via the communication module 8, and are further transmitted from the electronic device 10 to the physical condition determination device 20 via the communication network 30.
• Likewise, steps S610 to S640 of the physical condition determination process illustrated in FIG. 6 are executed, in this modification, by the voice detection device 5 shown in FIG. 10(b), in the same manner as by the voice detection device 5 shown in FIG. 1(b) in the above-described embodiment.
• If a positive determination is obtained in step S640 of FIG. 6, the values of the plurality of parameters extracted in step S630 are transmitted by the voice parameter extraction unit 63 to the electronic device 10 via the communication module 8, and are further transmitted from the electronic device 10 to the physical condition determination device 20 via the communication network 30.
  • the physical condition determination device 20 includes a processor 21, a storage 24, a memory 27, and a communication module 28.
  • the processor 21 of the physical condition determination device 20 logically has the voice parameter acquisition unit 211 and the physical condition determination unit 214 by activating the computer program stored in the memory 27.
  • the voice parameter acquisition unit 211 acquires the values of a plurality of parameters extracted by the voice detection device 5 via the communication network 30 and the communication module 28.
  • the values of the plurality of parameters extracted in the parameter recording process are recorded in the storage 24 as recorded data 41 by the voice parameter acquisition unit 211.
  • the voice parameter acquisition unit 211 executes the determination process in step S640 for the fundamental frequency among the values of the plurality of parameters extracted in the physical condition determination process.
• The remainder of the physical condition determination process is executed by the physical condition determination unit 214 included in the processor 21 of the physical condition determination device 20 shown in FIG. 10(c), in the same manner as by the physical condition determination unit 64 included in the processor 6 of the voice detection device 5 shown in FIG. 1(b).
• In this way, in this modification, the function of the physical condition determination unit 64 included in the processor 6 of the voice detection device 5 shown in FIGS. 1 and 9 can be provided as the physical condition determination unit 214 in the processor 21 of the physical condition determination device 20.
  • the function of the emotion estimation unit 65 included in the voice detection device 5 shown in FIG. 9 may be provided in the processor 21 of the physical condition determination device 20 in this modification.
  • the present invention is not limited to the configurations in each of the above-described embodiments and modifications as long as the characteristic functions of the present invention are not impaired.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

This physical condition detection system comprises: a speech data acquisition unit for acquiring first speech data that represents a vocalization of an infant; a frequency component detection unit for detecting a plurality of frequency components included in the first speech data; a speech parameter extraction unit for extracting a plurality of parameters based on the plurality of frequency components; and a physical condition determination unit for determining whether or not the infant is in poor physical condition on the basis of second speech data that represents a vocalization of the infant and the plurality of parameters.

Description

Physical condition detection system
The present invention relates to a physical condition detection system.
Patent Document 1 discloses an infant health monitoring system that processes measurement data to determine whether there is a problem with the health of the infant. The measurement data collected by the infant monitoring device includes the infant's movement, body temperature, position, and awakening. Environmental sensor data collected from devices such as a microphone or camera includes audio levels and video streams.
Patent Document 1: International Publication No. 2016/164373
The infant monitoring system disclosed in Patent Document 1 determines infant health problems, such as SIDS (sudden infant death syndrome), epileptic seizures, disturbed sleep patterns, fever, or a state of stress, based on the measurement data and environmental sensor data collected by the infant monitoring device. However, infant health problems are not limited to these, and various other diseases are conceivable; it is therefore difficult for the infant monitoring system disclosed in Patent Document 1 to detect that an infant has contracted such various diseases.
According to a first aspect of the present invention, a physical condition detection system comprises: a voice data acquisition unit that acquires first voice data representing a vocalization of a baby; a frequency component detection unit that detects a plurality of frequency components included in the first voice data; a voice parameter extraction unit that extracts a plurality of parameters based on the plurality of frequency components included in the first voice data; and a physical condition determination unit that, when second voice data representing a vocalization of the baby is acquired by the voice data acquisition unit, determines whether or not the baby is in poor physical condition based on the acquired second voice data and the plurality of parameters.
According to a second aspect of the present invention, in the physical condition detection system of the first aspect, it is preferable that the frequency component detection unit further detects the plurality of frequency components included in the second voice data, that the voice parameter extraction unit further extracts the plurality of parameters based on the plurality of frequency components included in the second voice data, and that the physical condition determination unit performs the determination when at least some of the plurality of parameters based on the plurality of frequency components included in the second voice data satisfy a predetermined condition.
According to a third aspect of the present invention, in the physical condition detection system of the second aspect, it is preferable that the at least some of the parameters are the fundamental frequency of the vocalization included in the second voice data, and that the predetermined condition is that the fundamental frequency is 300 Hz or more and 800 Hz or less.
According to a fourth aspect of the present invention, in the physical condition detection system of the second or third aspect, it is preferable that the physical condition determination unit performs the determination based on the similarity between the plurality of parameters based on the plurality of frequency components included in the first voice data and the plurality of parameters based on the plurality of frequency components included in the second voice data.
According to a fifth aspect of the present invention, in the physical condition detection system of any one of the first to fourth aspects, it is preferable that the plurality of parameters include at least one of the number of vocalizations per unit time, the duration of one vocalization, the fundamental frequency of the vocalization, and the formant frequency of the vocalization.
According to a sixth aspect of the present invention, in the physical condition detection system of any one of the first to fifth aspects, it is preferable that the state of poor physical condition includes at least one of a state in which the baby is sick and a state in which the baby is becoming sick.
According to a seventh aspect of the present invention, it is preferable that the physical condition detection system of any one of the first to sixth aspects further comprises a message output device that outputs a message to the effect that the baby is in poor physical condition when the physical condition determination unit determines that the baby is in poor physical condition.
According to an eighth aspect of the present invention, it is preferable that the physical condition detection system of any one of the first to seventh aspects further comprises an emotion estimation unit that, when the second voice data is acquired by the voice data acquisition unit, estimates the baby's emotion based on the acquired second voice data and the plurality of parameters.
According to the present invention, there is a high possibility of being able to detect that a baby has contracted any of various conceivable diseases.
FIG. 1 is a diagram showing the configuration of a physical condition detection system according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating how voice data representing a baby's vocalization is acquired.
FIG. 3 is a diagram illustrating recorded data in which a plurality of frequency components included in voice data representing a baby's vocalization and the values of a plurality of parameters based on those frequency components are recorded.
FIG. 4 is a diagram showing an example of the parameter recording process executed by running a computer program on the voice detection device according to the embodiment.
FIG. 5 is a diagram illustrating a case in which voice data representing a baby's vocalization is acquired and the baby is determined to be in poor physical condition.
FIG. 6 is a diagram showing an example of the physical condition determination process executed by running a computer program on the voice detection device according to the embodiment.
FIG. 7 is a diagram illustrating how the computer program that runs on the voice detection device according to the embodiment is supplied as a product.
FIG. 8 is a diagram illustrating recorded data used in the physical condition determination process executed by the physical condition detection system in Modification 2 of the embodiment of the present invention.
FIG. 9 is a diagram showing the configuration of a voice detection device included in the physical condition detection system in Modification 5 of the embodiment of the present invention.
FIG. 10 is a diagram showing the configuration of the physical condition detection system in Modification 7 of the embodiment of the present invention.
FIG. 1 is a diagram showing the configuration of a physical condition detection system 2 according to an embodiment of the present invention. In FIG. 1(a), the physical condition detection system 2 includes a voice detection device 5 and an electronic device 10. The voice detection device 5 and the electronic device 10 are connected to each other via wireless communication such as Bluetooth (registered trademark). The electronic device 10 is, for example, a mobile terminal such as a smartphone, a tablet computer, or a personal computer, and may be either a dedicated terminal or a general-purpose terminal.
As shown in FIG. 1(b), the voice detection device 5 includes a microphone 3, a storage 4, a processor 6, a memory 7, and a communication module 8. The microphone 3 converts sound, including human utterances and environmental sounds, into voice data in the form of an electric signal. By running a computer program stored in the memory 7, the processor 6 logically implements a voice data acquisition unit 61, a frequency component detection unit 62, a voice parameter extraction unit 63, and a physical condition determination unit 64.
The voice data acquisition unit 61 acquires, via the microphone 3, voice data representing a baby's vocalization together with, for example, voice data representing environmental sounds in the environment where the voice detection device 5 is installed and/or voice data representing adult utterances. The acquired voice data includes a plurality of frequency components. The frequency component detection unit 62 detects the plurality of frequency components included in the voice data of sound that includes the baby's vocalization. A baby's vocalizations include crying and babbling. From the plurality of detected frequency components, a fundamental frequency, a first formant frequency, and a second formant frequency are obtained. It is generally known that the fundamental frequency determines the pitch of a sound, while the first and second formant frequencies determine its timbre.
The voice parameter extraction unit 63 extracts a plurality of parameters based on the plurality of frequency components detected by the frequency component detection unit 62 in the voice data acquired by the voice data acquisition unit 61. As will be described later with reference to FIG. 3(b), the extracted parameters include at least one of the number of the baby's vocalizations per unit time, the duration of one vocalization, the fundamental frequency of the vocalization, and the formant frequencies (the first formant frequency and/or the second formant frequency) included in the vocalization. When a voice waveform is approximated as a sum of sine waves, the frequency of the sine wave with the lowest frequency among them is called the fundamental frequency. It is also known that the first formant frequency and the second formant frequency are obtained, in order of proximity to the fundamental frequency, as the frequencies corresponding to amplitude peaks among the sine waves at integer multiples of the fundamental frequency. Accordingly, the fundamental frequency, the first formant frequency, and the second formant frequency included in the plurality of frequency components detected by the frequency component detection unit 62 in the acquired voice data can be extracted as the above-mentioned parameters by the voice parameter extraction unit 63.
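The relationship between the waveform and its fundamental frequency can be illustrated with a short sketch. The description does not specify how the frequency component detection unit 62 performs this detection; the following is merely an assumed, minimal autocorrelation-based F0 estimator applied to a synthetic tone, not the patented implementation.

```python
import math

def estimate_f0(samples, sample_rate, f_min=100.0, f_max=1000.0):
    """Estimate the fundamental frequency of a voiced frame by finding
    the autocorrelation peak within the lag range implied by f_min..f_max.
    Returns None if no peak is found (e.g. silence)."""
    n = len(samples)
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, min(lag_max, n - 1) + 1):
        corr = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag if best_lag else None

# Synthetic 400 Hz cry-like tone sampled at 8 kHz.
sr = 8000
frame = [math.sin(2 * math.pi * 400 * t / sr) for t in range(1024)]
f0 = estimate_f0(frame, sr)  # → 400.0
```

Formant frequencies would additionally require locating spectral-envelope peaks (e.g. via linear prediction), which is omitted here for brevity.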
In the parameter recording process described later with reference to FIG. 4, the values of the plurality of parameters extracted by the voice parameter extraction unit 63 from the voice data acquired by the voice data acquisition unit 61 are recorded in the storage 4 by the voice parameter extraction unit 63 as recorded data to be referenced when the physical condition determination unit 64 determines whether or not the baby is in poor physical condition. This recorded data may be recorded in a storage (not shown) outside the voice detection device 5 instead of the storage 4 inside the voice detection device 5.
In the physical condition determination process described later with reference to FIG. 6, when new voice data representing the baby's vocalization is acquired by the voice data acquisition unit 61, the physical condition determination unit 64 determines whether or not the baby is in poor physical condition based on the acquired new voice data and the values of the plurality of parameters recorded in the storage 4 as described above. When the new voice data is acquired by the voice data acquisition unit 61, the frequency component detection unit 62 detects the plurality of frequency components included in the new voice data, and the voice parameter extraction unit 63 extracts the above-mentioned plurality of parameters based on those frequency components. The physical condition determination unit 64 performs the determination when at least some of the parameters extracted from the new voice data satisfy a predetermined condition. At least some of the parameters satisfying a predetermined condition means, for example, that the fundamental frequency of the vocalization is 300 Hz or more and 800 Hz or less. When the fundamental frequency of the vocalization is below 300 Hz or above 800 Hz, the vocalization represented by the new voice data is unlikely to be a baby's vocalization; it may instead be, for example, an adult's utterance, or not a vocalization at all but an environmental sound such as a household noise.
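The 300–800 Hz gate described above reduces to a one-line predicate. Only the frequency bounds come from the description; the function name and the treatment of missing values are assumptions made for illustration.

```python
def is_baby_vocalization(f0_hz, low_hz=300.0, high_hz=800.0):
    """Gate applied before the poor-physical-condition determination:
    a frame whose fundamental frequency falls outside 300-800 Hz is
    treated as adult speech or environmental sound rather than a baby's
    vocalization. f0_hz may be None when no vocalization was detected."""
    return f0_hz is not None and low_hz <= f0_hz <= high_hz
```

For example, 387 Hz (a value appearing in FIG. 3(a)) passes the gate, while 150 Hz (typical adult speech) and silence do not.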
When determining whether or not the baby is in poor physical condition, the physical condition determination unit 64 calculates, as described later, the similarity between the plurality of parameters recorded in the storage 4 and the plurality of parameters extracted based on the plurality of frequency components included in the new voice data. Based on the calculated similarity, the physical condition determination unit 64 determines whether or not the baby is in poor physical condition. When the baby is determined to be in poor physical condition, the physical condition determination unit 64 sends a notification signal to that effect to the electronic device 10 by wireless communication via the communication module 8.
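The description does not specify which similarity measure the physical condition determination unit 64 uses. As one assumed possibility, the recorded parameter values and the newly extracted ones can be compared as vectors using cosine similarity; the parameter ordering, the example values, and the decision threshold below are all illustrative.

```python
import math

def cosine_similarity(p, q):
    """Cosine similarity between two parameter vectors, e.g.
    [vocalizations/hour, duration per vocalization (s), F0, F1, F2]."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

recorded_unwell = [5.0, 102.0, 387.0, 1261.0, 2732.0]  # profile from storage 4
new_params      = [5.0, 100.0, 390.0, 1270.0, 2740.0]  # from the new voice data
similarity = cosine_similarity(recorded_unwell, new_params)
likely_unwell = similarity >= 0.999  # threshold is an assumption
```

A high similarity to parameters recorded during a known poor-physical-condition period would then trigger the notification signal to the electronic device 10.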
As shown in FIG. 1(c), the electronic device 10 includes a processor 11, a memory 12, an input interface 13, a message output device 14, and a communication module 15. By running a computer program stored in the memory 12, the processor 11 logically implements an input control unit 111 that controls the operation of the input interface 13 and a message control unit 112 that controls the message output device 14. The input interface 13 is, for example, a touch panel. The period information described later is input to the input interface 13 as screen input data by the user. The message output device 14 is, for example, a display and/or a touch panel, and has the functions of outputting a message notifying a user, such as the baby's caregiver, that the baby is in poor physical condition, and of displaying an input screen for the period information described later. The message control unit 112 receives, by wireless communication via the communication module 15, a notification signal from the voice detection device 5 indicating that the baby is in poor physical condition.
FIG. 2 is a diagram illustrating how voice data representing the vocalization of a baby 1 is acquired. As shown in FIG. 2(a), the voice detection device 5 is installed near the baby 1 so that voice data representing the baby's vocalization can be acquired. Since the voice data 50 acquired by the voice detection device 5 is an electric signal, it is represented by a waveform as illustrated in FIG. 2(b), with time on the horizontal axis and amplitude on the vertical axis. As described above, the voice data 50 represented by the waveform includes a plurality of frequency components. The above-mentioned plurality of parameters extracted based on the frequency components of voice data representing the vocalization of the baby 1 differ depending on whether or not the baby 1 is in poor physical condition. For example, when the baby 1 is in poor physical condition due to an inflamed throat or lungs, the baby's vocalizations are more likely to contain noise than in normal times. Therefore, by recording the values of the plurality of parameters in the storage 4 as recorded data as described above, it is possible to determine whether or not the baby 1 is in poor physical condition based on the recorded parameter values and the value of at least one parameter extracted from newly acquired voice data representing the baby's vocalization.
First, the parameter recording process by which the recorded data is recorded will be described. When the voice data 50 representing the vocalization of the baby 1 illustrated in FIG. 2(b) is acquired, the input screen illustrated in FIG. 2(c) is displayed on the message output device 14 of the electronic device 10. Through this input screen, the caregiver of the baby 1 can be asked for the period during which the baby 1 was sick. The input screen illustrated in FIG. 2(c) displays a message 141 and period information 142. Specifically, as the message 141, the message "Please enter the period during which your child was sick" is displayed with respect to the voice data 50 illustrated in FIG. 2(b). The figure shows an example in which, in response to this message, period information 142 indicating the beginning and end of the period during which the baby 1 was sick has been entered by the caregiver of the baby 1. In this example, the entered period information 142 indicates that the period during which the baby 1 was sick began at 18:00 on March 5, 2019 and ended at 9:00 on March 6, 2019. The period information 142 is entered by the caregiver of the baby 1 under the control of the input control unit 111 of the processor 11 of the electronic device 10, and is transmitted to the voice detection device via the communication module 15.
The period during which the baby 1 was sick, indicated by the period information 142, is shown as period T1 in the voice data 50 illustrated in FIG. 2(b). Of the period over which the voice data 50 was acquired, a waveform different from that of the normal state in which the baby 1 is not sick is considered to be observed during the period T1, in which the baby 1 was sick. Since this period T1 is a period during which the baby 1 was sick, the voice parameter extraction unit 63 of the processor 6 of the voice detection device 5 identifies the period T1, based on the period information 142 acquired from the electronic device 10 via the communication module 8, as a poor-physical-condition period during which the baby 1 was in poor physical condition. In the example shown in FIG. 2(b), the period T1 is from 18:00 on March 5, 2019 to 9:00 on March 6, 2019.
When the baby 1 transitions from not being sick to being sick, the baby 1 is considered to be in the process of becoming sick during the transitional period. Of the period over which the voice data 50 was acquired, a waveform different from that of the normal state in which the baby 1 is not sick is also considered to be observed during the period T2, in which the baby 1 was becoming sick. Since this period T2 is a period during which the baby 1 was becoming sick, the voice parameter extraction unit 63 of the processor 6 of the voice detection device 5 may also identify the period T2, which precedes the period T1, as a poor-physical-condition period. In the example shown in FIG. 2(b), the period T2 is from 15:00 on March 5, 2019 to 18:00 on March 5, 2019. The voice parameter extraction unit 63 identifies, for example, the three hours immediately preceding the period T1 as the period T2. Alternatively, the period T2 may be identified by the voice parameter extraction unit 63 as a period in the voice data 50, preceding the period T1, in which a waveform different from that of normal times is observed.
Furthermore, as described above, the period T1 is a period during which the baby 1 was sick and the period T2 is a period during which the baby 1 was becoming sick, so the voice parameter extraction unit 63 of the processor 6 of the voice detection device 5 may identify the period T3 obtained by merging the periods T1 and T2 illustrated in FIG. 2(b) as a poor-physical-condition period during which the baby 1 was in poor physical condition. In the example shown in FIG. 2(b), the period T3 is from 15:00 on March 5, 2019 to 9:00 on March 6, 2019.
As described above, the frequency component detection unit 62 of the processor 6 of the voice detection device 5 detects the plurality of frequency components included in the voice data 50 of sound that includes the vocalization of the baby 1. In the present embodiment, a plurality of frequency components are detected by the frequency component detection unit 62, and from among them, the fundamental frequency of the vocalization of the baby 1 and the first and second formant frequencies of that vocalization are obtained every second. These frequency components may instead be obtained at any other predetermined interval, such as every 5 seconds or every 10 seconds. As described above, the voice parameter extraction unit 63 extracts a plurality of parameters based on the detected frequency components. FIG. 3 is a diagram illustrating the plurality of frequency components included in the voice data 50 representing the vocalization of the baby 1 and recorded data 41 in which the values of a plurality of parameters based on those frequency components are recorded. The recorded data 41 is recorded in the storage 4 by the voice parameter extraction unit 63 for each baby.
FIG. 3(a) shows time-series data 31 including the fundamental frequency of the vocalization of the baby 1 and the first and second formant frequencies of that vocalization, detected every second by the frequency component detection unit 62. The time-series data 31 is recorded in the storage 4. The time-series data 31 shown in FIG. 3(a) corresponds to the voice data 50 in the poor-physical-condition period T1 or T3 of the baby 1 shown in FIG. 2(b), and is generated by adding, every second, a record containing the fundamental frequency and the first and second formant frequencies of the vocalization of the baby 1 at each time, based on that voice data 50. Furthermore, in the time-series data 31, a poor-physical-condition period flag is associated with each per-second record composed of those frequency components. After the time-series data 31 is recorded in the storage 4 by the frequency component detection unit 62, the poor-physical-condition period flag is associated with each record by the voice parameter extraction unit 63.
In the example shown in FIG. 3(a), at 18:31:45 on March 5, 2019, the baby 1 vocalized with a fundamental frequency of 387 Hz, a first formant frequency of 1261 Hz, and a second formant frequency of 2732 Hz, and the value "1", indicating that the baby 1 was in poor physical condition at this time, is set as the poor-physical-condition period flag for the record at this time. At 18:31:46 on March 5, 2019, the baby 1 vocalized with a fundamental frequency of 388 Hz, a first formant frequency of 1264 Hz, and a second formant frequency of 2735 Hz, and the flag value "1" is likewise set for the record at this time. At 19:31:44 on March 5, 2019, the baby 1 did not vocalize, so none of the fundamental frequency, the first formant frequency, and the second formant frequency were detected; the value "NULL" is therefore set in the record at this time, and the flag value "1" is set as well. In periods other than the above-mentioned period T1 or T3, that is, in periods not identified as poor-physical-condition periods of the baby 1, the flag value "0" is set in each record within the period. Alternatively, in a record at a time when the baby 1 did not vocalize, the processing load on the processor 6 may be reduced by setting nothing as each frequency component value instead of "NULL", that is, by treating each frequency component value as a missing value. In that case, in the process (described later) in which the voice parameter extraction unit 63 extracts the voice parameters based on the time-series data 31, "NULL" is logically used as each frequency component value of a record at a time when the baby 1 did not vocalize.
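One per-second record of the time-series data 31, together with its poor-physical-condition period flag, could be modeled as follows. The field names are assumptions for illustration; the example values are the ones given above for 18:31:45 and 19:31:44 on March 5, 2019.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VoiceRecord:
    """One per-second record of the time-series data 31 (field names assumed).
    Frequency fields are None when no vocalization was detected (the
    missing-value variant described in the text)."""
    timestamp: str
    f0_hz: Optional[float]        # fundamental frequency
    formant1_hz: Optional[float]  # first formant frequency
    formant2_hz: Optional[float]  # second formant frequency
    unwell_flag: int              # 1 inside an identified poor-physical-condition period, else 0

voiced = VoiceRecord("2019-03-05 18:31:45", 387.0, 1261.0, 2732.0, 1)
silent = VoiceRecord("2019-03-05 19:31:44", None, None, None, 1)
```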
Based on the time-series data 31, the voice parameter extraction unit 63 calculates, every second, taking the time at that point as a reference time, the number of utterances of the baby 1 per unit time and the duration of a single utterance of the baby 1 over the one-hour period preceding that reference time. These are calculated as follows. For example, when the reference time is 19:31:44 on March 5, 2019, the calculation is based on whether a value of the fundamental frequency of the baby 1's utterance exists in each of the 3600 records of the time-series data 31 for the preceding hour, that is, from 18:31:45 on March 5, 2019 to 19:31:44 on March 5, 2019. At 18:31:45 on March 5, 2019 and at the following second, 18:31:46, fundamental frequency values exist. While fundamental frequency values exist in consecutive records in this way, a single utterance can be interpreted as continuing. At 19:31:44 on March 5, 2019 no fundamental frequency value exists. If no fundamental frequency value existed at any time from 18:31:47 to 19:31:44 on March 5, 2019, then for the reference time of 19:31:44 on March 5, 2019 the number of utterances of the baby 1 per unit time, that is, the value of utterances/hour, is calculated as 1 (utterance), and the duration of a single utterance of the baby 1, that is, the value of utterance duration (seconds)/utterance, is calculated as 2 (seconds). If, for example, the value of utterances/hour for the baby 1 were 5 (utterances) and the durations of those five utterances were, in order, 120 seconds, 100 seconds, 80 seconds, 110 seconds, and 100 seconds, then 102 (seconds), the per-utterance average of those five durations, would be calculated as the value of utterance duration (seconds)/utterance for the baby 1.
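The per-second counting just described can be sketched as follows. This is a minimal illustration, not the actual implementation of the voice parameter extraction unit 63; the record format (one value per second, `None` for a missing fundamental frequency) and all names are assumptions for the example.

```python
from typing import Optional, Sequence


def utterance_stats(f0_per_second: Sequence[Optional[float]]) -> tuple[int, float]:
    """Count utterances and their mean duration over a one-record-per-second
    window, where each record holds the fundamental frequency (Hz) of the
    baby's voice or None when no voice was detected.

    A run of consecutive records containing a fundamental frequency value
    is interpreted as one continuing utterance.
    """
    count = 0
    durations: list[int] = []
    run = 0
    for f0 in f0_per_second:
        if f0 is not None:
            run += 1          # utterance continues
        elif run > 0:
            count += 1        # silence closes the current utterance
            durations.append(run)
            run = 0
    if run > 0:               # window ends mid-utterance
        count += 1
        durations.append(run)
    mean = sum(durations) / count if count else 0.0
    return count, mean


# The example in the text: voice at 18:31:45 and 18:31:46, then silence
# for the rest of the one-hour (3600-record) window.
window = [387.0, 388.0] + [None] * 3598
assert utterance_stats(window) == (1, 2.0)  # 1 utterance/hour, 2 s/utterance
```

With five utterances of 120, 100, 80, 110, and 100 seconds, the same function yields a count of 5 and a mean duration of 102 seconds, matching the worked example above.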
Note, however, that even if the baby 1 has been uttering continuously for k seconds and is then silent for several seconds, it cannot necessarily be concluded on that basis that the baby 1's utterance ended after k seconds, because the baby 1 may resume uttering after pausing for those few seconds to take a breath. Therefore, when the time during which the baby 1 is not uttering is shorter than a predetermined number of seconds, for example shorter than 10 seconds, it may be interpreted that the baby 1's utterance did not end after k seconds but continues beyond k seconds.
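This breath-pause interpretation can be layered on top of the run detection by merging utterance intervals separated by a short silence. The interval representation, the function name, and the 10-second default below are illustrative assumptions consistent with the text, not the embodiment's actual code.

```python
def merge_breath_pauses(
    intervals: list[tuple[int, int]], max_pause: int = 10
) -> list[tuple[int, int]]:
    """Merge chronologically sorted utterance intervals [(start_s, end_s), ...]
    that are separated by a silence shorter than max_pause seconds, on the
    interpretation that the baby merely paused to take a breath and the
    utterance continued.
    """
    merged: list[tuple[int, int]] = []
    for start, end in intervals:
        if merged and start - merged[-1][1] < max_pause:
            # Silence too short to end the utterance: extend the previous one.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# k = 30 s of voice, a 4 s breath, then 20 s more: treated as one utterance.
assert merge_breath_pauses([(0, 30), (34, 54)]) == [(0, 54)]
# A 15 s silence exceeds the threshold, so the utterances stay separate.
assert merge_breath_pauses([(0, 30), (45, 60)]) == [(0, 30), (45, 60)]
```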
In the recorded data 41 illustrated in FIG. 3(b), the voice parameters extracted by the voice parameter extraction unit 63 from the time-series data 31, namely the number of utterances of the baby 1 per hour, the duration (seconds) of a single utterance of the baby 1, the fundamental frequency (Hz) of the baby 1's utterances, the first formant frequency (Hz) of the baby 1's utterances, and the second formant frequency (Hz) of the baby 1's utterances, are each recorded separately as parameter values during poor physical condition periods and parameter values during normal times, and are updated each time a predetermined period elapses. The poor physical condition periods are those portions of the period over which the voice data 50 was acquired that fall within the above-mentioned periods T1, T2, or T3, and are identified based on the poor physical condition period display flag set in the time-series data 31 illustrated in FIG. 3(a). The normal times are some or all of the period over which the voice data 50 was acquired other than the poor physical condition periods.
Among the voice parameters extracted by the voice parameter extraction unit 63, the value recorded or updated for the baby 1's utterances/hour is the average, taken separately over the poor physical condition periods and the normal times, of the utterances/hour values calculated at one-second reference-time intervals over, for example, the past year. Similarly, for the baby 1's utterance duration (seconds)/utterance, the averages over the poor physical condition periods and the normal times within the past year are used. Further, for the fundamental frequency, the first formant frequency, and the second formant frequency of the baby 1's utterances, the averages over the poor physical condition periods and the normal times are used, each based on the values recorded in the records of the time-series data 31 for the past year. Although averages over the past year are used for the five types of parameters described above, these averages are not limited to values obtained by a simple average; they may be, for example, values obtained by a weighted average in which relatively recent values within the past year are given greater weight than older values. An example of the voice parameters extracted by the voice parameter extraction unit 63 in this way is shown in FIG. 3(b).
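One way to realize a weighted average that favors recent values is an exponentially decaying weight, sketched below. The decay factor and function name are illustrative assumptions; the text only requires that newer values receive larger weights.

```python
def recency_weighted_average(values: list[float], decay: float = 0.99) -> float:
    """Average a chronologically ordered series (oldest first), giving each
    value `decay` times the weight of the value that follows it, so that
    recent values count more than old ones.  With decay=1.0 this reduces
    to the simple average.
    """
    n = len(values)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# With decay=1.0 the result equals the simple average of 300 and 400 Hz.
assert recency_weighted_average([300.0, 400.0], decay=1.0) == 350.0
# With decay=0.5 the recent 400 Hz value dominates, pulling the average up.
assert recency_weighted_average([300.0, 400.0], decay=0.5) > 350.0
```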
In the example shown in FIG. 3(b), the parameter values recorded for the baby 1's poor physical condition periods are 30 utterances/hour, an utterance duration of 120 seconds/utterance, a fundamental frequency of 380 Hz, a first formant frequency of 1250 Hz, and a second formant frequency of 2700 Hz. The parameter values recorded for the baby 1's normal times are 20 utterances/hour, an utterance duration of 80 seconds/utterance, a fundamental frequency of 350 Hz, a first formant frequency of 1200 Hz, and a second formant frequency of 2500 Hz. The recorded data 41 is recorded in the storage 4 by the voice parameter extraction unit 63 of the processor 6 of the voice detection device 5.
FIG. 4 shows an example of the parameter recording process executed by running a computer program on the voice detection device 5 according to the present embodiment. The processor 6 of the voice detection device 5 starts the computer program stored in the memory 7 and, by executing the process shown in FIG. 4, functions as the voice data acquisition unit 61, the frequency component detection unit 62, and the voice parameter extraction unit 63. In the present embodiment, this parameter recording process is repeatedly executed every second.
When this parameter recording process starts, in step S410 the voice data acquisition unit 61 acquires, via the microphone 3, voice data 50 representing utterances of the baby 1. In step S420, the frequency component detection unit 62 detects a plurality of frequency components contained in the voice data 50 acquired in step S410 and generates the time-series data 31 illustrated in FIG. 3(a). In this processing step, for example, the fundamental frequency, the first formant frequency, and the second formant frequency of the baby 1's utterances are detected. The time-series data 31 is recorded in the storage 4 by the frequency component detection unit 62.
As described above, the voice parameter extraction unit 63 can identify the poor physical condition period T3 based on the period information 142 acquired from the electronic device 10 via the communication module 8. For example, the period T3 from 15:00 on March 5, 2019 to 9:00 on March 6, 2019 illustrated in FIG. 2(b) is identified as a poor physical condition period. In step S430, by referring to the identified poor physical condition period T3, the voice parameter extraction unit 63 identifies, for the time-series data 31 generated in step S420, the poor physical condition period during which the baby 1 was in a state of poor physical condition. Based on the identified poor physical condition period T3, the voice parameter extraction unit 63 sets the poor physical condition period display flag in association with the plurality of frequency components included in the time-series data 31 generated in step S420.
In step S440, the voice parameter extraction unit 63 extracts a plurality of parameters based on the plurality of frequency components detected in step S420. In this processing step, for example, the average number of utterances of the baby 1 per unit time, the average duration of a single utterance of the baby 1, the average fundamental frequency of the baby 1's utterances, the average first formant frequency of the baby 1's utterances, and the average second formant frequency of the baby 1's utterances are each extracted for the poor physical condition periods and for the normal times other than the poor physical condition periods. In step S450, the voice parameter extraction unit 63 records the values of the plurality of parameters extracted in step S440 in the storage 4 as the recorded data 41. When the process of step S450 is completed, this parameter recording process ends.
FIG. 5 illustrates a case in which voice data 51 representing an utterance of the baby 1 is acquired and the baby 1 is determined to be in a state of poor physical condition. After the values of the parameters of the baby 1's utterances have been recorded as the recorded data 41 by the parameter recording process described with reference to FIG. 4, when the physical condition determination process described later with reference to FIG. 6 is executed, the voice data acquisition unit 61 first acquires new voice data 51 representing the baby's utterance illustrated in FIG. 5(a). The frequency component detection unit 62 then detects a plurality of frequency components contained in the acquired new voice data 51. Subsequently, as in the parameter recording process described above, the voice parameter extraction unit 63 extracts a plurality of parameters from the acquired new voice data 51. The physical condition determination unit 64 then determines whether the baby 1 is in a state of poor physical condition, based on the similarity between the values of the plurality of parameters thus extracted from the plurality of frequency components contained in the voice data 51 and the values, recorded in the recorded data 41, of the plurality of parameters based on the plurality of frequency components contained in the voice data 50 for the poor physical condition periods and for the normal times other than those periods.
The above-mentioned similarity used in the determination by the physical condition determination unit 64 is calculated, for example, as follows. The sum of squares X of the differences between the value of each parameter extracted from the new voice data 51 and the value of each parameter for the baby 1's poor physical condition periods recorded in the recorded data 41 is calculated. The smaller the value of the sum of squares X, the higher the similarity between the plurality of parameters extracted from the new voice data 51 and the plurality of parameters for the baby 1's poor physical condition periods. Likewise, the sum of squares Y of the differences between the value of each parameter extracted from the new voice data 51 and the value of each parameter for the baby 1's normal times recorded in the recorded data 41 is calculated. The smaller the value of the sum of squares Y, the higher the similarity between the plurality of parameters extracted from the new voice data 51 and the plurality of parameters for the baby 1's normal times. When the value of the sum of squares X is smaller than the value of the sum of squares Y, the plurality of parameters extracted from the new voice data 51 are more similar to the plurality of parameters for the baby 1's poor physical condition periods than to the plurality of parameters for the baby 1's normal times. In this case, the physical condition determination unit 64 determines that the baby 1 is in a state of poor physical condition. The reciprocals 1/X and 1/Y of the sums of squares X and Y may be used as the above-mentioned similarity values.
When calculating the above-mentioned sums of squares X and Y, the value of each parameter may be weighted according to its importance. For example, among the plurality of parameters described above, the sums of squares X and Y may be calculated with the weights given to the duration of a single utterance of the baby 1 and to the fundamental frequency of the baby 1's utterances set larger than the weights given to the number of utterances of the baby 1 per unit time, the first formant frequency of the baby 1's utterances, and the second formant frequency of the baby 1's utterances.
Although both the parameter values for the poor physical condition periods and the parameter values for normal times are recorded in the recorded data 41 illustrated in FIG. 3(b), only the parameter values for the poor physical condition periods may be recorded, without recording the parameter values for normal times. In that case, in the similarity calculation described above, the sum of squares Y of the differences between the value of each parameter extracted from the new voice data 51 and the value of each parameter for the baby 1's normal times is not calculated; only the sum of squares X of the differences between the value of each parameter extracted from the new voice data 51 and the value of each parameter for the baby 1's poor physical condition periods recorded in the recorded data 41 is calculated. The value of the sum of squares X or its reciprocal is used as the similarity value. When the value of the sum of squares X is smaller than a predetermined threshold, the similarity between the plurality of parameters extracted from the new voice data 51 and the plurality of parameters for the baby 1's poor physical condition periods is high, and the physical condition determination unit 64 therefore determines that the baby 1 is in a state of poor physical condition.
When the physical condition determination unit 64 determines that the baby 1 is in a state of poor physical condition, it sends a notification signal indicating that the baby 1 is in poor physical condition to the electronic device 10 via the communication module 8. When the message control unit 112 of the processor 11 of the electronic device 10 receives the transmitted notification signal via the communication module 15, it causes the message output device 14 to output a message 145 indicating that the baby 1 is in poor physical condition, as illustrated in FIG. 5(b). In the example shown in FIG. 5(b), the message "Your child has become ill or is becoming ill. We recommend that you see a doctor." is displayed on the screen of the message output device 14 as the message 145. The message 145 includes a search button display 146 reading "Search for nearby hospitals" so that the caregiver of the baby 1 who views this message can search for hospitals near the current location. When the caregiver of the baby 1 touches the search button display 146, hospitals near the current location are searched for via a communication network (not shown), and the screen displaying the message 145 switches to a screen displaying the search results.
When the physical condition determination unit 64 determines that the baby 1 is in poor physical condition, the voice parameter extraction unit 63 may set the value "1" as the poor physical condition period display flag for the record of the time-series data 31 corresponding to the time at which the voice data 51 was acquired by the voice data acquisition unit 61.
FIG. 6 shows an example of the physical condition determination process executed by running a computer program on the voice detection device 5 according to the present embodiment. The processor 6 of the voice detection device 5 starts the computer program stored in the memory 7 and, by executing the process shown in FIG. 6, functions as the voice data acquisition unit 61, the frequency component detection unit 62, the voice parameter extraction unit 63, and the physical condition determination unit 64.
When this physical condition determination process starts, in step S610 the voice data acquisition unit 61 acquires, via the microphone 3, voice data 51 representing an utterance of the baby 1. In step S620, the frequency component detection unit 62 detects a plurality of frequency components contained in the voice data 51 acquired in step S610. In this processing step, for example, the fundamental frequency, the first formant frequency, and the second formant frequency of the baby 1's utterance are detected.
In step S630, the voice parameter extraction unit 63 extracts a plurality of parameters based on the plurality of frequency components detected in step S620. In this processing step, for example, the number of utterances of the baby 1 per unit time, the duration of a single utterance of the baby 1, the fundamental frequency of the baby 1's utterance, the first formant frequency of the baby 1's utterance, and the second formant frequency of the baby 1's utterance are extracted. In step S640, the voice parameter extraction unit 63 determines whether, among the plurality of parameters extracted in step S630 based on the plurality of frequency components contained in the voice data 51, the fundamental frequency of the baby 1's utterance satisfies a predetermined condition. In the present embodiment, the predetermined condition is that the fundamental frequency of the baby 1's utterance is between 300 Hz and 800 Hz inclusive. If this predetermined condition is not satisfied, the utterance contained in the voice data 51 acquired in step S610 is considered highly likely not to have actually been an utterance of a baby, so a negative determination is obtained in the fundamental frequency condition determination process of step S640 and the physical condition determination process ends.
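The plausibility gate of step S640 amounts to a simple range check on the fundamental frequency. The function name and band defaults below are illustrative; only the 300–800 Hz condition comes from the text.

```python
def plausibly_a_baby(fundamental_hz: float,
                     low: float = 300.0, high: float = 800.0) -> bool:
    """Step S640 in sketch form: proceed to the physical condition judgment
    only when the fundamental frequency lies in the band expected of a
    baby's voice; otherwise the sound was probably not the baby at all."""
    return low <= fundamental_hz <= high


assert plausibly_a_baby(387.0)      # the baby's cry in FIG. 3(a) passes
assert not plausibly_a_baby(120.0)  # e.g. a typical adult voice is rejected
```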
When an affirmative determination is obtained in the fundamental frequency condition determination process of step S640, in step S650 the physical condition determination unit 64 refers to the recorded data 41 recorded in the storage 4 by the parameter recording process described above. In step S660, the physical condition determination unit 64 calculates, by the similarity calculation method described above, the similarity between the values of the plurality of parameters extracted in step S630 and the values of the plurality of parameters recorded in the recorded data 41. The values of the plurality of parameters extracted in step S630 were extracted based on the plurality of frequency components contained in the voice data 51. The values of the plurality of parameters recorded in the recorded data 41 were extracted based on the plurality of frequency components contained in the voice data 50, for the poor physical condition periods and for the normal times other than those periods.
In step S670, the physical condition determination unit 64 determines whether the baby 1 is in a state of poor physical condition based on the similarity calculated in step S660. If a negative determination is obtained in the determination process of step S670, the physical condition determination process ends. If an affirmative determination is obtained in the determination process of step S670, in step S680 the physical condition determination unit 64 sends a notification signal indicating that the baby 1 is in poor physical condition to the electronic device 10 via the communication module 8. The message output device 14 of the electronic device 10 outputs the message 145 indicating that the baby 1 is in poor physical condition, as illustrated in FIG. 5(b). When the process of step S680 is completed, this physical condition determination process ends.
According to the physical condition detection system 2 of the present embodiment, the following effects are obtained.
(1) The voice detection device 5 included in the physical condition detection system 2 comprises the voice data acquisition unit 61, the frequency component detection unit 62, the voice parameter extraction unit 63, and the physical condition determination unit 64. The voice data acquisition unit 61 acquires voice data 50 representing utterances of the baby 1. The frequency component detection unit 62 detects a plurality of frequency components contained in the voice data 50. The voice parameter extraction unit 63 extracts a plurality of parameters based on the plurality of frequency components contained in the voice data 50. When voice data 51 representing an utterance of the baby 1 is newly acquired by the voice data acquisition unit 61, the physical condition determination unit 64 determines whether the baby 1 is in a state of poor physical condition based on the newly acquired voice data 51 and the plurality of parameters already extracted. If the voice data 50 was acquired while the baby 1 was in a state of poor physical condition in the past, there is a high possibility that the voice detection device 5 included in the physical condition detection system 2 of the present embodiment can detect that the baby 1 has contracted any of various conceivable illnesses.
(2) In the voice detection device 5 included in the physical condition detection system 2, the frequency component detection unit 62 further detects a plurality of frequency components contained in the newly acquired voice data 51. The voice parameter extraction unit 63 further extracts a plurality of parameters based on the plurality of frequency components contained in the newly acquired voice data 51. The physical condition determination unit 64 performs the determination of whether the baby 1 is in a state of poor physical condition only when, among the plurality of parameters based on the plurality of frequency components contained in the newly acquired voice data 51, the fundamental frequency of the baby 1's utterance satisfies the condition of being between 300 Hz and 800 Hz inclusive. Accordingly, when the utterance contained in the voice data 51 was not actually an utterance of the baby 1, the physical condition determination process for the baby 1 is unlikely to be performed erroneously.
(3) In the voice detection device 5 included in the physical condition detection system 2, the physical condition determination unit 64 determines whether the baby 1 is in a state of poor physical condition based on the similarity between the plurality of parameters based on the plurality of frequency components contained in the voice data 50 acquired in the past and the plurality of parameters based on the plurality of frequency components contained in the newly acquired voice data 51. Accordingly, a stable determination result as to whether the baby 1 is in a state of poor physical condition is obtained.
(4) The physical condition detection system 2 further includes the electronic device 10 having the message output device 14. When the physical condition determination unit 64 of the processor 6 of the voice detection device 5 determines that the baby 1 is in poor physical condition, the message output device 14 of the electronic device 10 outputs a message indicating that the baby 1 is in poor physical condition. The caregiver of the baby 1, viewing the output message, can promptly take action such as searching for a hospital near the current location.
FIG. 7 illustrates how the computer program executed by the voice detection device 5 of the physical condition detection system 2 in the embodiment described above can be supplied as a product. The computer program executed by the voice detection device 5 can be provided to it through a recording medium 45 such as a CD-ROM or USB memory, or through a data signal carried over a communication network 30 such as the Internet. For example, the computer program is read from the recording medium 45 by an operation terminal 46 connected to the voice detection device 5 by wire or wirelessly, and is then provided to the voice detection device 5.
The computer program providing server 40 is a server computer that provides the above computer program and stores it in a storage device such as a hard disk. The communication network 30 is the Internet, a wireless LAN, a telephone network, a dedicated line, or the like. The computer program providing server 40 reads the computer program from the storage device, places it on a carrier wave as a data signal, and transmits it to the voice detection device 5 or the operation terminal 46 via the communication network 30. When the computer program is transmitted to the operation terminal 46, the operation terminal 46 in turn provides it to the voice detection device 5. In this way, the computer program can be supplied as a computer-readable computer program product in various forms, such as a recording medium or a data signal.
The following modifications are also within the scope of the present invention, and one or more of them can be combined with the embodiment described above.
(Modification 1) In the embodiment described above, when the voice data acquisition unit 61 of the processor 6 of the voice detection device 5 in the physical condition detection system 2 acquires voice data 50 for the parameter recording process, the voice parameter extraction unit 63 of the processor 6 identifies the poor physical condition period, during which the baby 1 is unwell, based on the period information 142 acquired from the electronic device 10 via the communication module 8. In the voice data 50 illustrated in FIG. 2(b), the period T3 from 15:00 on March 5, 2019 to 9:00 on March 6, 2019 is identified as the poor physical condition period.
As described above, the period T3 is the union of the period T1, during which the baby 1 is already ill, and the period T2, during which the baby 1 is becoming ill. Either the period T1 (from 18:00 on March 5, 2019 to 9:00 on March 6, 2019 in the example of FIG. 2(b)) or the period T2 (from 15:00 to 18:00 on March 5, 2019 in the same example) may instead be identified as the poor physical condition period. When the period T2, or the period T3 that includes it, is identified as the poor physical condition period, the determination by the physical condition determination unit 64 of whether the baby 1 is in poor physical condition also detects the state in which the baby 1 is becoming ill, so a deterioration in the baby 1's physical condition can very likely be prevented.
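The relationship between T1, T2 and T3 and the choice of which period to treat as the poor physical condition period can be sketched as follows, using the timestamps of the FIG. 2(b) example; the helper names are assumptions:

```python
from datetime import datetime

# Example timestamps from FIG. 2(b): T3 is the union of T2 (becoming
# ill) and T1 (already ill), which are adjacent in time.
T2_START = datetime(2019, 3, 5, 15, 0)   # onset period begins
T1_START = datetime(2019, 3, 5, 18, 0)   # fully ill period begins
T1_END = datetime(2019, 3, 6, 9, 0)      # illness ends

def in_poor_condition_period(t: datetime, include_onset: bool = True) -> bool:
    """True when t falls in T3 (= T1 + T2). With include_onset=False,
    only the fully ill period T1 is used, as the variation allows."""
    start = T2_START if include_onset else T1_START
    return start <= t <= T1_END
```

Including the onset period T2 is what lets the determination flag the baby while the illness is still developing.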
(Modification 2) In the embodiment and modification described above, the physical condition determination unit 64 of the processor 6 of the voice detection device 5 determines whether the baby 1 is in poor physical condition based on the degree of similarity between the values of a plurality of parameters based on the frequency components of new voice data 51 and the values of a plurality of parameters based on the frequency components of past voice data 50, recorded in the recorded data 41 both for the poor physical condition period and for normal times outside that period. The similarity was calculated using the sum of squares X of the differences between each parameter value extracted from the new voice data 51 and the corresponding value recorded in the recorded data 41 for the baby 1's poor physical condition period, and the sum of squares Y of the differences between each parameter value extracted from the new voice data 51 and the corresponding value recorded for the baby 1 in normal times. In this modification, the similarity is calculated by another method. FIG. 8 illustrates the recorded data 41 used in the physical condition determination process executed by the physical condition detection system in Modification 2. The recorded data 41 illustrated in FIG. 8 differs from the recorded data 41 illustrated in FIG. 3(b) in that the standard deviation of each parameter value is also recorded.
Through the parameter recording process described above, the recorded data 41 illustrated in FIG. 8 records the values of a plurality of parameters based on the frequency components of the past voice data 50. For each of the poor physical condition period x and normal times y, the recorded data 41 holds: the number of utterances of the baby 1 per unit time C1 (C1x and C1y); the duration of one utterance of the baby 1 C2 (C2x and C2y); the fundamental frequency of the baby 1's utterance C3 (C3x and C3y); the first formant frequency of the utterance C4 (C4x and C4y); and the second formant frequency of the utterance C5 (C5x and C5y). It also holds, for both x and y, the corresponding standard deviations: S1 (S1x and S1y) for C1, S2 (S2x and S2y) for C2, S3 (S3x and S3y) for C3, S4 (S4x and S4y) for C4, and S5 (S5x and S5y) for C5. In the example of FIG. 8, C1x = 30, S1x = 4.5, C1y = 20, S1y = 3, C2x = 120, S2x = 20, C2y = 80, S2y = 35, C3x = 380, S3x = 130, C3y = 350, S3y = 80, C4x = 1250, S4x = 120, C4y = 1200, S4y = 100, C5x = 2700, S5x = 200, C5y = 2500, S5y = 150.
Next, suppose that in the physical condition determination process described above, the following parameters are extracted from the new voice data 51 as the plurality of parameters based on its frequency components: the number of utterances of the baby 1 per unit time M1, the duration of one utterance M2, the fundamental frequency of the utterance M3, the first formant frequency M4, and the second formant frequency M5. For example, M1 = 28, M2 = 110, M3 = 360, M4 = 1230, M5 = 2600. The similarity described above may be determined based on the Euclidean distances Dx and Dy calculated by the following equations (1) and (2). The Euclidean distance Dx represents the distance between the parameter values extracted from the voice data 51 and the parameter values for the baby 1's poor physical condition period x, and the Euclidean distance Dy represents the distance between the parameter values extracted from the voice data 51 and the parameter values for the baby 1 in normal times y.

Dx² = {(C1x-M1)/S1x}² + {(C2x-M2)/S2x}²
   + {(C3x-M3)/S3x}² + {(C4x-M4)/S4x}²
   + {(C5x-M5)/S5x}²     ...(1)

Dy² = {(C1y-M1)/S1y}² + {(C2y-M2)/S2y}²
   + {(C3y-M3)/S3y}² + {(C4y-M4)/S4y}²
   + {(C5y-M5)/S5y}²     ...(2)
The Euclidean distances Dx and Dy calculated by equations (1) and (2) are compared with each other; the smaller the Euclidean distance, the higher the similarity. For example, when Dx < Dy, the physical condition of the baby 1 at the time the voice data 51 was acquired resembles the baby 1's condition during the poor physical condition period x, so the physical condition determination unit 64 determines that the baby 1 is in poor physical condition. Substituting M1 = 28, M2 = 110, M3 = 360, M4 = 1230 and M5 = 2600 into equations (1) and (2) yields Dx ≈ 0.87 and Dy ≈ 2.90, so Dx < Dy holds and the baby 1 is determined to be in poor physical condition.
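The worked example above can be reproduced directly from equations (1) and (2). This is a sketch; the variable names are assumptions:

```python
import math

# Recorded values from FIG. 8 (x: poor physical condition period,
# y: normal times), in the order C1, C2, C3, C4, C5.
C_x = [30, 120, 380, 1250, 2700]
S_x = [4.5, 20, 130, 120, 200]
C_y = [20, 80, 350, 1200, 2500]
S_y = [3, 35, 80, 100, 150]

# Newly extracted parameter values M1..M5 from voice data 51.
M = [28, 110, 360, 1230, 2600]

def standardized_distance(C, S, M):
    """Euclidean distance of equations (1)/(2): each parameter
    difference is divided by its recorded standard deviation."""
    return math.sqrt(sum(((c - m) / s) ** 2 for c, m, s in zip(C, M, S)))

dx = standardized_distance(C_x, S_x, M)
dy = standardized_distance(C_y, S_y, M)
# dx comes out near 0.87 and dy near 2.90, so dx < dy and the new data
# resembles the poor physical condition period.
```

Dividing each difference by its standard deviation keeps parameters with large absolute values, such as the formant frequencies, from dominating the distance.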
(Modification 3) In the embodiment and modifications described above, the caregiver of the baby 1 may judge whether the physical condition determination result obtained through the determination process illustrated in FIG. 6 is correct. The caregiver's judgment is used in the identification of the poor physical condition period performed by the voice parameter extraction unit 63 in step S430 of the parameter recording process illustrated in FIG. 4. In this way, the parameter values recorded in the recorded data 41 for the poor physical condition period and for normal times may become more accurate. Furthermore, by using this caregiver feedback as teacher data for machine learning, the accuracy of identifying the poor physical condition period within the period over which the voice data 50 containing the baby 1's utterances was acquired can be improved.
(Modification 4) In the embodiment and modifications described above, when the baby 1 is in poor physical condition, the poor physical condition period display flag illustrated in FIG. 3(a) is set to the value 1. However, to cover the case where the name of the disease affecting the baby 1 is known from a doctor's examination, the flag may instead be set to 2 when the baby 1 has, for example, an inflamed throat, to 3 in the case of pneumonia, and to 1 for other diseases or when the name of the disease is unknown. In this way, it may become possible to estimate the cause of the baby 1's poor physical condition.
Alternatively, the poor physical condition period display flag may be set to 1 when the baby 1 is in poor physical condition, to 2 when the baby 1 is in very good condition, to 3 when the baby 1 is in a "normal" state that is neither very good nor poor, and to 4 when the baby 1's condition is unknown. In this way, it may become possible to estimate the baby 1's physical condition in general, and not only whether the baby 1 is unwell.
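The four-state variant of the flag can be sketched as a simple lookup; the label strings are illustrative assumptions, not values from the specification:

```python
# Assumed mapping for the four-state poor physical condition period
# display flag described above.
CONDITION_FLAG_LABELS = {
    1: "unwell",
    2: "very well",
    3: "normal",
    4: "unknown",
}

def flag_label(flag: int) -> str:
    """Map a recorded flag value to a human-readable condition label,
    treating unrecognized values as unknown."""
    return CONDITION_FLAG_LABELS.get(flag, "unknown")
```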
(Modification 5) In the embodiment and modifications described above, the physical condition determination unit 64 determines whether the baby 1 is in poor physical condition using a plurality of parameters extracted by the voice parameter extraction unit 63 based on the frequency components detected by the frequency component detection unit 62 from the voice data acquired by the voice data acquisition unit 61 of the processor 6 of the voice detection device 5. In addition, a process for estimating the baby 1's emotion, using parameters extracted based on those or other frequency components, may be performed by an emotion estimation unit 65 logically implemented by the processor 6 of the voice detection device 5 shown in FIG. 9. FIG. 9 shows the configuration of the voice detection device 5 included in the physical condition detection system 2 in Modification 5. The processor 6 of the voice detection device 5 shown in FIG. 9 differs from the processor 6 shown in FIG. 1(b) in that it has the emotion estimation unit 65.
The frequency component detection and voice parameter extraction performed on the voice data containing the baby 1's utterances are carried out in the same way as in the parameter recording process of FIG. 4 and the physical condition determination process of FIG. 6 described above. The emotion estimation unit 65 estimates the baby 1's emotion as one of a plurality of classified emotion types by calculating the degree of similarity between the parameters extracted from newly acquired voice data and the parameters extracted from previously acquired voice data and recorded in association with each emotion type. The emotion estimation result produced by the emotion estimation unit 65 may in turn be used as the emotion type associated with the parameters that are extracted from the acquired voice data and recorded.
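The similarity-based emotion estimation can be sketched as a nearest-prototype classifier. The prototype values below are illustrative assumptions, not figures from the specification, and plain (unstandardized) Euclidean distance is used for brevity:

```python
import math

# Assumed recorded parameters per emotion type, in the order
# C1..C5 (utterance rate, duration, F0, first and second formants).
EMOTION_PROTOTYPES = {
    "hungry": [26, 100, 420, 1100, 2400],
    "sleepy": [12, 60, 330, 1000, 2300],
    "playful": [18, 70, 360, 1300, 2600],
}

def estimate_emotion(params):
    """Return the emotion whose recorded parameters are most similar
    (smallest Euclidean distance) to the newly extracted ones."""
    def dist(proto):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(proto, params)))
    return min(EMOTION_PROTOTYPES, key=lambda e: dist(EMOTION_PROTOTYPES[e]))
```

The same scheme extends to the physical condition determination by adding "unwell"/"normal" prototypes, which is what makes the two units natural to merge.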
The voice parameters used by the emotion estimation unit 65 to estimate the baby 1's emotion and the voice parameters used by the physical condition determination unit 64 to determine poor physical condition may be, for example, parameters of the same types extracted by the voice parameter extraction unit 63, namely the number of utterances of the baby 1 per unit time, the duration of one utterance, the fundamental frequency of the utterance, and the first and second formant frequencies of the utterance. In that case, the emotion estimation unit 65 and the physical condition determination unit 64 may be implemented as a single unit, referred to here as the emotion estimation and physical condition determination unit.
A trained machine learning model may be used for the emotion estimation and physical condition determination performed by that unit, and for the parameter extraction performed beforehand by the voice parameter extraction unit 63. To that end, the parameter recording process illustrated in FIG. 4 is first realized by training the learning model on teacher data. The teacher data is the frequency component detection result obtained in step S420; it may be, for example, spectrogram image data corresponding to the voice data 50 acquired in step S410. Spectrogram image data is fed to the model's input layer, and six classification items relating to physical condition and emotion, such as "unwell", "happy", "angry", "hungry", "sleepy" and "wants to play", are set on the output layer. Each of these classification items is classified by the learning model according to the parameters extracted based on the frequency components of the voice data 50. The learning model is trained with this input layer and output layer in place. This completes the series of processing steps by the voice parameter extraction unit 63, such as the parameter extraction in steps S430, S440 and S450 of FIG. 4.
The combined emotion estimation and physical condition determination process, obtained by adding emotion estimation to the determination process illustrated in FIG. 6, is then realized by feeding the trained model the frequency component detection result obtained in step S620 and reading its output over the six classification items relating to physical condition and emotion, such as "unwell", "happy", "angry", "hungry", "sleepy" and "wants to play". When the spectrogram image data corresponding to the voice data 51 acquired in step S610 is input, the learning model outputs the result of classifying that input according to the parameters extracted based on the frequency components of the voice data. As the classification result, the model outputs the degree of similarity between the input image data and each of the six classification items. This completes the parameter extraction of step S630 of FIG. 6 by the voice parameter extraction unit 63, the parameter similarity calculations of steps S650 and S660 by the emotion estimation and physical condition determination unit, and the emotion estimation process. In the supervised training of the learning model, the fundamental frequency condition that the baby 1's utterance has a fundamental frequency of 300 Hz or more and 800 Hz or less can also be taken into account, so that the fundamental frequency condition judgment of step S640 of FIG. 6, performed by the voice parameter extraction unit 63, is likewise covered by the model's output.
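The six-class output described above can be sketched as a softmax over the model's raw scores, giving one similarity value per class plus a top label. The logit values in the usage are placeholders standing in for a real trained model's output:

```python
import math

# The six physical-condition/emotion classification items set on the
# model's output layer.
CLASSES = ["unwell", "happy", "angry", "hungry", "sleepy", "wants to play"]

def softmax(logits):
    """Numerically stable softmax: normalized similarity per class."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def classify(logits):
    """Turn raw per-class scores for one spectrogram image into
    normalized similarity scores and the best-matching label."""
    scores = softmax(logits)
    best = max(range(len(CLASSES)), key=lambda i: scores[i])
    return dict(zip(CLASSES, scores)), CLASSES[best]
```

A real deployment would obtain the logits from a convolutional network over the spectrogram image; only the final normalization and argmax are shown here.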
(Modification 6) In the embodiment and modifications described above, as shown in FIGS. 1 and 9, the physical condition detection system 2 includes the voice detection device 5 and the electronic device 10, and the processor 6 of the voice detection device 5 logically comprises the voice data acquisition unit 61, the frequency component detection unit 62, the voice parameter extraction unit 63, the physical condition determination unit 64 and the emotion estimation unit 65. Other configurations are also possible. For example, the voice detection device 5 and the electronic device 10 may be integrated into one device. Alternatively, the processor 6 of the voice detection device 5 may have the voice data acquisition unit 61 and the frequency component detection unit 62, while the processor 11 of the electronic device 10 has the voice parameter extraction unit 63 and the physical condition determination unit 64. In Modification 5 described above, the processor 6 of the voice detection device 5 may have the voice data acquisition unit 61 and the frequency component detection unit 62, while the processor 11 of the electronic device 10 has the voice parameter extraction unit 63, the physical condition determination unit 64 and the emotion estimation unit 65.
(Modification 7) In the embodiment and modifications described above, the parameter recording process illustrated in FIG. 4 and the physical condition determination process illustrated in FIG. 6 are executed by the processor 6 of the voice detection device 5 included in the physical condition detection system 2. However, some of the processing may be executed by another device different from the voice detection device 5. FIG. 10 shows the configuration of the physical condition detection system 2 in Modification 7. In FIG. 10(a), the physical condition detection system 2 includes the voice detection device 5, the electronic device 10 and a physical condition determination device 20, which are connected to one another via the communication network 30. The physical condition determination device 20 may be, for example, a large-capacity server, and may perform the physical condition determination process not only for the baby 1 but also for other infants.
Like the voice detection device 5 shown in FIG. 1(b), the voice detection device 5 shown in FIG. 10(b) has the microphone 3, the storage 4, the processor 6, the memory 7 and the communication module 8. By running the computer program stored in the memory 7, the processor 6 of the voice detection device 5 shown in FIG. 10(b) logically comprises the frequency component detection unit 62 and the voice parameter extraction unit 63; unlike the voice detection device 5 shown in FIG. 1(b), it has no physical condition determination unit 64, because the determination process is not performed there. Instead of recording the extracted parameters in the storage 4 as recorded data 41, the voice parameter extraction unit 63 transmits them to the electronic device 10 via the communication module 8, as described below.
The processing steps S410 to S440 of the parameter recording process illustrated in FIG. 4 are executed by the voice detection device 5 shown in FIG. 10(b), just as they are by the voice detection device 5 of FIG. 1(b) in the embodiment described above. In this modification, instead of the parameter values being recorded in step S450 of FIG. 4, the extracted parameter values are transmitted by the voice parameter extraction unit 63 to the electronic device 10 via the communication module 8, and from the electronic device 10 onward to the physical condition determination device 20 via the communication network 30.
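A minimal sketch of what the transmitted parameter set might look like on the wire, assuming a JSON encoding; the field names and format are assumptions, as the specification does not define a wire format:

```python
import json

def build_parameter_payload(m1, m2, m3, m4, m5):
    """Serialize the five extracted parameter values for forwarding
    from the voice detection device to the determination server."""
    return json.dumps({
        "utterances_per_unit_time": m1,
        "utterance_duration": m2,
        "fundamental_hz": m3,
        "first_formant_hz": m4,
        "second_formant_hz": m5,
    })

payload = build_parameter_payload(28, 110, 360, 1230, 2600)
```

Sending only these derived parameters, rather than the raw audio, keeps the transmitted data small and avoids moving recordings of the baby off the device.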
The processing steps S610 to S640 of the physical condition determination process illustrated in FIG. 6 are likewise executed by the voice detection device 5 shown in FIG. 10(b). In this modification, when an affirmative judgment is obtained in step S640 of FIG. 6, the parameter values extracted in step S630 are transmitted by the voice parameter extraction unit 63 to the electronic device 10 via the communication module 8, and from the electronic device 10 onward to the physical condition determination device 20 via the communication network 30.
As shown in FIG. 10(c), the physical condition determination device 20 has a processor 21, a storage 24, a memory 27 and a communication module 28. By running the computer program stored in the memory 27, the processor 21 of the physical condition determination device 20 logically comprises a voice parameter acquisition unit 211 and a physical condition determination unit 214. The voice parameter acquisition unit 211 acquires the parameter values extracted by the voice detection device 5 via the communication network 30 and the communication module 28. The parameter values extracted in the parameter recording process are recorded by the voice parameter acquisition unit 211 in the storage 24 as the recorded data 41. For the fundamental frequency among the parameter values extracted in the physical condition determination process, the voice parameter acquisition unit 211 executes the judgment of step S640. The processing steps from step S650 onward, after an affirmative judgment is obtained, are executed by the physical condition determination unit 214 of the processor 21 of the physical condition determination device 20 shown in FIG. 10(c), in the same way as by the physical condition determination unit 64 of the processor 6 of the voice detection device 5 shown in FIG. 1(b).
 As described above, the function of the physical condition determination unit 64 of the processor 6 of the voice detection device 5 shown in FIGS. 1 and 9 can be deployed, as the physical condition determination unit 214, in the processor 21 of the physical condition determination device 20 of this modification. Similarly, the function of the emotion estimation unit 65 of the voice detection device 5 shown in FIG. 9 may be deployed in the processor 21 of the physical condition determination device 20 of this modification.
 The present invention is in no way limited to the configurations of the embodiments and modifications described above, as long as the characteristic functions of the invention are not impaired.
1 infant, 2 physical condition detection system, 3 microphone, 4 storage, 5 voice detection device, 6 processor, 7 memory, 8 communication module, 10 electronic device, 11 processor, 12 memory, 13 input interface, 14 message output device, 15 communication module, 20 physical condition determination device, 21 processor, 24 storage, 27 memory, 28 communication module, 30 communication network, 31 time-series data, 40 computer program providing server, 41 recorded data, 45 recording medium, 46 operation terminal, 50 voice data, 51 voice data, 61 voice data acquisition unit, 62 frequency component detection unit, 63 voice parameter extraction unit, 64 physical condition determination unit, 65 emotion estimation unit, 111 input control unit, 112 message control unit, 141 message, 142 period information, 145 message, 146 search button display, 211 voice parameter acquisition unit, 214 physical condition determination unit

Claims (8)

  1.  A physical condition detection system comprising:
     a voice data acquisition unit that acquires first voice data representing an infant's vocalization;
     a frequency component detection unit that detects a plurality of frequency components included in the first voice data;
     a voice parameter extraction unit that extracts a plurality of parameters based on the plurality of frequency components included in the first voice data; and
     a physical condition determination unit that, when second voice data representing the infant's vocalization is acquired by the voice data acquisition unit, determines whether or not the infant is in a state of poor physical condition based on the acquired second voice data and the plurality of parameters.
  2.  The physical condition detection system according to claim 1, wherein:
     the frequency component detection unit further detects the plurality of frequency components included in the second voice data;
     the voice parameter extraction unit further extracts the plurality of parameters based on the plurality of frequency components included in the second voice data; and
     the physical condition determination unit makes the determination when at least some of the plurality of parameters based on the plurality of frequency components included in the second voice data satisfy a predetermined condition.
  3.  The physical condition detection system according to claim 2, wherein:
     the at least some parameters comprise the fundamental frequency of the vocalization included in the second voice data; and
     the predetermined condition is that the fundamental frequency is 300 Hz or more and 800 Hz or less.
  4.  The physical condition detection system according to claim 2 or claim 3, wherein the physical condition determination unit makes the determination based on a degree of similarity between the plurality of parameters based on the plurality of frequency components included in the first voice data and the plurality of parameters based on the plurality of frequency components included in the second voice data.
  5.  The physical condition detection system according to any one of claims 1 to 4, wherein the plurality of parameters include at least one of: the number of vocalizations per unit time; the duration of a single vocalization; the fundamental frequency of the vocalization; and a formant frequency of the vocalization.
  6.  The physical condition detection system according to any one of claims 1 to 5, wherein the state of poor physical condition includes at least one of a state in which the infant suffers from an illness and a state in which the infant is developing the illness.
  7.  The physical condition detection system according to any one of claims 1 to 6, further comprising a message output device that outputs a message indicating that the infant is in poor physical condition when the physical condition determination unit determines that the infant is in poor physical condition.
  8.  The physical condition detection system according to any one of claims 1 to 7, further comprising an emotion estimation unit that, when the second voice data is acquired by the voice data acquisition unit, estimates an emotion of the infant based on the acquired second voice data and the plurality of parameters.
PCT/JP2019/014526 2019-04-01 2019-04-01 Physical condition detection system WO2020202444A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2019/014526 WO2020202444A1 (en) 2019-04-01 2019-04-01 Physical condition detection system


Publications (1)

Publication Number Publication Date
WO2020202444A1 true WO2020202444A1 (en) 2020-10-08

Family

ID=72666745

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/014526 WO2020202444A1 (en) 2019-04-01 2019-04-01 Physical condition detection system

Country Status (1)

Country Link
WO (1) WO2020202444A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004531800A (en) * 2001-03-15 2004-10-14 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Automated system for monitoring persons requiring monitoring and their caretakers
WO2007102505A1 (en) * 2006-03-06 2007-09-13 Nagasaki University Infant emotion judging method, and device and program therefor
KR20100000466A (en) * 2008-06-25 2010-01-06 김봉현 Infant diagnostic apparatus and diagnostic methode using it
US20150265206A1 (en) * 2012-08-29 2015-09-24 Brown University Accurate analysis tool and method for the quantitative acoustic assessment of infant cry


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ASTHANA, SHUBHAM ET AL.: "Preliminary Analysis of Causes of Infant Cry", IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT), 15 December 2014 (2014-12-15), pages 468 - 473, XP032795524 *


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19922813

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19922813

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP