JP2018018184A

JP2018018184A - Vehicle event discrimination device

Info

Publication number: JP2018018184A
Application number: JP2016145907A
Authority: JP
Inventors: 悠輔谷澤; Yusuke Tanizawa
Original assignee: Calsonic Kansei Corp
Current assignee: Marelli Corp
Priority date: 2016-07-26
Filing date: 2016-07-26
Publication date: 2018-02-01

Abstract

PROBLEM TO BE SOLVED: To train a model with high discrimination accuracy in a short period. SOLUTION: A vehicle event discrimination device includes a voice data recognition section 4, a keyword extraction section 7, an acoustic discrimination section 12, a discrimination event specification section 16, and a data registration section 18. The vehicle event discrimination device also includes: voice collection devices 21 for collecting voice 2 uttered in daily life outside a vehicle 1; and model training sections 22 for training a language model 5, a normal acoustic model 8, and a specific acoustic model 9 from the voice 2 collected in daily life outside the vehicle 1 by the voice collection devices 21. SELECTED DRAWING: Figure 1

Description

The present invention relates to a vehicle event determination device.

Information collection and distribution devices are known that collect road traffic information based on words uttered by occupants in a vehicle and that can distribute the collected road traffic information (see, for example, Patent Documents 1 and 2).

JP 2003-123185 A
JP 2014-123239 A

In the information collection and distribution devices described above, the system is built from words uttered by occupants inside the vehicle. However, because relatively few words are uttered inside a vehicle, it takes a long time to train a system with high discrimination accuracy. Training such a system is said to require, for example, the time one person needs to drive 20,000 km.

Accordingly, the main object of the present invention is to solve the problem described above.

In order to solve the above problem, the present invention comprises:
a voice data recognition unit that recognizes voice data from a voice uttered in a vehicle;
a keyword extraction unit that extracts, from the voice data recognized by the voice data recognition unit, a keyword related to an event to be determined, using a language model set for event determination;
an acoustic discrimination unit that compares acoustic information accompanying the keyword extracted by the keyword extraction unit with a preset normal acoustic model and a preset specific acoustic model, and determines whether the occupant state is a normal state or a specific state;
a discrimination event specifying unit that specifies an event occurring to the occupant based on the keyword extracted by the keyword extraction unit, the occupant state determined by the acoustic discrimination unit, and any one of the occupant situation and the vehicle situation at or around the time the keyword was extracted and the data accumulated in a database; and
a data registration unit that registers in the database at least data associating the discriminated event specified by the discrimination event specifying unit with the keyword determined by the keyword extraction unit.
In addition, a voice collection device capable of collecting voices uttered in daily life outside the vehicle is provided, and
a model training unit is provided that can train the language model, the normal acoustic model, and the specific acoustic model from the voices collected in daily life outside the vehicle by the voice collection device.

According to the present invention, the above configuration makes it possible to train a model with high discrimination accuracy in a short period.

A configuration diagram of the vehicle event determination device according to an example of the present embodiment.
A diagram showing the structure of the language model, the normal acoustic model, and the specific acoustic model in a plurality of voice collection devices, where (a) is for the first terminal and (b) is for the second terminal.
A diagram showing an example of the data in the database.
A diagram in which the vehicle event determination device with a minimal configuration is applied to a vehicle.
A configuration diagram of the vehicle event determination device in which the input system is added to FIG. 4.
A configuration diagram of the vehicle event determination device in which the output system is added to FIG. 5.
A detailed diagram of the input system configuration (part 1).
A detailed diagram of the input system configuration (part 2).
A detailed diagram of the output system configuration.
A diagram showing the status of the database.

The present embodiment is described in detail below with reference to the drawings.
FIGS. 1 to 9 illustrate this embodiment.

<Configuration> The configuration of the vehicle event determination device according to this example is described below.

(1) First, as shown in FIG. 1, the device includes:
a voice data recognition unit 4 that recognizes "voice data 3" from a voice 2 uttered in the vehicle 1;
a keyword extraction unit 7 that extracts, from the voice data 3 recognized by the voice data recognition unit 4, a keyword 6 related to the event to be determined, using a language model 5 (or event discrimination language model) set for event determination (that is, it performs event keyword discrimination processing);
an acoustic discrimination unit 12 that compares the acoustic information accompanying the keyword 6 extracted by the keyword extraction unit 7 with a preset normal acoustic model 8 (or normal-time acoustic model) and a preset specific acoustic model 9 (or event discrimination acoustic model), and determines whether the occupant state 11 is a normal state or a specific state (that is, it performs event voice discrimination processing);
a discrimination event specifying unit 16 that specifies an event occurring to the occupant (that is, it performs event discrimination processing) based on the keyword 6 extracted by the keyword extraction unit 7, the occupant state 11 determined by the acoustic discrimination unit 12, and any one of the occupant situation 13 (including the driver situation) and the vehicle situation 14 at or around the time the keyword 6 was extracted and the data accumulated in the database 15; and
a data registration unit 18 that registers in the database 15 at least data (event-related data) associating the discriminated event 17 specified by the discrimination event specifying unit 16 with the keyword 6 determined by the keyword extraction unit 7.
In addition, a voice collection device 21 capable of collecting the voice 2 uttered in daily life outside the vehicle 1 is provided.
A model training unit 22 is also provided that can train the language model 5, the normal acoustic model 8, and the specific acoustic model 9 from the voice 2 collected in daily life outside the vehicle 1 by the voice collection device 21.
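The flow through the units in (1) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class and field names, the single `volume` feature standing in for the acoustic models, and the `sudden_brake` flag standing in for the vehicle situation 14 are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EventRecord:
    person_id: str
    keyword: str
    occupant_state: str  # "normal" or "specific"
    event: str


@dataclass
class Pipeline:
    keywords: set            # hypothetical stand-in for language model 5
    normal_volume: float     # hypothetical stand-in for normal acoustic model 8
    specific_volume: float   # hypothetical stand-in for specific acoustic model 9
    database: list = field(default_factory=list)  # stand-in for database 15

    def extract_keyword(self, text: str) -> Optional[str]:
        # Role of keyword extraction unit 7: first registered keyword found.
        for word in text.split():
            if word in self.keywords:
                return word
        return None

    def discriminate_state(self, volume: float) -> str:
        # Role of acoustic discrimination unit 12: the nearer model wins.
        if abs(volume - self.specific_volume) < abs(volume - self.normal_volume):
            return "specific"
        return "normal"

    def specify_event(self, keyword: str, state: str, sudden_brake: bool) -> str:
        # Role of discrimination event specifying unit 16 (assumed rule).
        if keyword == "dangerous" and state == "specific" and sudden_brake:
            return "near-miss"
        return "normal"

    def process(self, person_id, text, volume, sudden_brake):
        keyword = self.extract_keyword(text)
        if keyword is None:
            return None
        state = self.discriminate_state(volume)
        event = self.specify_event(keyword, state, sudden_brake)
        record = EventRecord(person_id, keyword, state, event)
        self.database.append(record)  # role of data registration unit 18
        return record
```

Voice recognition itself (unit 4) is outside the sketch; `text` is assumed to be the already recognized voice data 3.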

Here, the voice 2 uttered in the vehicle 1 is picked up by a microphone 25, has its noise removed by noise reduction (NR) processing in a noise reduction unit 26, and is then sent to the voice data recognition unit 4 (VR: voice recognition). In addition to the occupant situation 13 (including the driver situation), the vehicle situation 14, and the data accumulated in the database 15, the surrounding environment situation and the like can also be used to specify the discriminated event 17. The database 15 can also include, for example, a normal behavior model and an in-vehicle behavior model. The normal behavior model and the in-vehicle behavior model are trained for each individual starting from a standard model. The term "occupant" broadly covers the driver, the front passenger, rear-seat passengers, and so on; where the driver, the front passenger, and rear-seat passengers are distinguished from one another, however, "occupant" may be used to mean the occupants other than those named.

As shown in FIG. 2, the language model 5 holds at least a personal ID, keywords, and appearance frequency information. The appearance frequency is used, for example, as an index for examining how an individual uses keywords (verbal habits and the like). The language model 5 is trained for each individual starting from a standard model. In addition to the above, various kinds of situational information such as position, number of people, and mood, as well as other information, can be attached to a keyword as appropriate.
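One way to picture the language model described above is as per-person keyword counts. The sketch below is only an assumption about a possible representation: the class name `LanguageModel`, its methods, and the whitespace tokenization are hypothetical and are not taken from the patent.

```python
from collections import Counter, defaultdict


class LanguageModel:
    """Per-person keyword frequency table (hypothetical representation)."""

    def __init__(self, standard_keywords):
        # The standard model supplies the initial keyword vocabulary.
        self.standard_keywords = set(standard_keywords)
        # person_id -> Counter mapping keyword -> appearance frequency
        self.counts = defaultdict(Counter)

    def observe(self, person_id, utterance):
        # Count every registered keyword that appears in the utterance.
        for word in utterance.lower().split():
            if word in self.standard_keywords:
                self.counts[person_id][word] += 1

    def frequency(self, person_id, keyword):
        # A Counter returns 0 for keywords that were never observed.
        return self.counts[person_id][keyword]
```

Because the counts are keyed by personal ID, two people who use the same keyword with different frequency end up with different entries, which is the kind of individual habit the appearance frequency is meant to capture.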

The normal acoustic model 8 holds, as input values, at least a personal ID, a keyword, and at least one item of acoustic information such as volume, intonation, or (voice) pitch. The normal acoustic model 8 evaluates numerically the acoustic information observed in the normal state. The normal acoustic model 8 is also trained for each individual starting from a standard model.

Similarly, the specific acoustic model 9 holds, as input values, at least a personal ID, a keyword, and at least one item of acoustic information such as volume, intonation, or (voice) pitch. It evaluates numerically the acoustic information observed in the specific state. The specific acoustic model 9 is also trained for each individual starting from a standard model.
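The comparison against the two acoustic models can be illustrated with a nearest-model rule. This is a sketch under stated assumptions: the patent does not specify how the comparison is computed, so the Euclidean distance, the three chosen features, and the names `AcousticModel`, `distance`, and `occupant_state` are all hypothetical. The features are left unscaled for brevity; a practical version would normalize them first.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AcousticModel:
    person_id: str
    keyword: str
    volume: float      # assumed unit: dB
    intonation: float  # assumed dimensionless score
    pitch: float       # assumed unit: Hz


def distance(model, volume, intonation, pitch):
    # Plain Euclidean distance between the observation and the model values.
    return math.sqrt(
        (model.volume - volume) ** 2
        + (model.intonation - intonation) ** 2
        + (model.pitch - pitch) ** 2
    )


def occupant_state(normal, specific, volume, intonation, pitch):
    # The occupant state is "specific" only when the observation lies
    # strictly closer to the specific model than to the normal model.
    d_normal = distance(normal, volume, intonation, pitch)
    d_specific = distance(specific, volume, intonation, pitch)
    return "specific" if d_specific < d_normal else "normal"
```

Ties resolve to "normal" here, a conservative choice made for the example so that an ambiguous utterance is not treated as a specific state.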

A plurality of voice collection devices 21 can be used (terminal 21a, terminal 21b, and so on). The model training unit 22 is preferably provided in each voice collection device 21. The model training unit 22 can be implemented as software (an application). On each of the terminals 21a and 21b, the model training unit 22 then trains its own language model 5, normal acoustic model 8, specific acoustic model 9, and database 15 independently.

As shown for example in FIG. 3, the data accumulated in the database 15 includes at least a personal ID, a keyword, and the occupant state 11, and further includes information such as the occupant situation 13, the vehicle situation 14, and the discriminated event 17 as needed. It may also include the cause of the discriminated event 17 and, as described later, information such as provided information and vehicle control content. The occupant state 11 is determined from acoustic information, whereas the occupant situation 13 covers everything else and can include a variety of items. The vehicle control content can be used, for example, by self-driving vehicles.
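The record layout described above, with three required fields and several optional ones, could be modeled as follows. The field names and the in-memory `Database` class are assumptions for illustration; the patent does not prescribe a schema or a storage engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventData:
    # Required fields named in the text.
    person_id: str
    keyword: str
    occupant_state: str  # "normal" or "specific"
    # Optional fields, filled in only when available.
    occupant_situation: Optional[str] = None
    vehicle_situation: Optional[str] = None
    event: Optional[str] = None
    cause: Optional[str] = None
    vehicle_control: Optional[str] = None


class Database:
    """Minimal in-memory stand-in for database 15 (hypothetical)."""

    def __init__(self):
        self.rows = []

    def register(self, row: EventData) -> None:
        self.rows.append(row)

    def by_event(self, event: str):
        # Return every registered row that was classified as the given event.
        return [row for row in self.rows if row.event == event]
```

Keeping the optional fields nullable mirrors the wording that such information is included "as needed" rather than for every record.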

The language model 5, the normal acoustic model 8, the specific acoustic model 9, the database 15, and the model training unit 22 may also be provided in the vehicle 1. Furthermore, as shown in FIG. 4, they can be provided in a cloud 29 on the Internet (a managed cloud, an open cloud, or the like). When they are provided on the cloud 29, the language model 5, the normal acoustic model 8, the specific acoustic model 9, the database 15, and so on held in the vehicle 1 and the voice collection device 21 can be simplified versions. The copies of the language model 5, the normal acoustic model 8, the specific acoustic model 9, and the database 15 held in the vehicle 1, the voice collection device 21, and the cloud 29 are preferably maintained against one another so that the latest information is shared. Using the cloud 29 also makes it possible to draw on various other databases on the cloud 29.
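Sharing the latest information between the vehicle, terminal, and cloud copies could be done in many ways; the patent does not say how. The sketch below assumes one simple policy, "newest entry wins", and assumes each copy is a dictionary mapping a key to a `(value, updated_at)` pair. Both the data shape and the function name `merge_latest` are hypothetical.

```python
def merge_latest(*copies):
    """Merge several model copies, keeping the newest entry for each key.

    Each copy maps a key, for example (person_id, keyword), to a
    (value, updated_at) tuple, where updated_at is any comparable timestamp.
    """
    merged = {}
    for copy in copies:
        for key, (value, updated_at) in copy.items():
            # Keep the entry only if it is new or strictly more recent.
            if key not in merged or updated_at > merged[key][1]:
                merged[key] = (value, updated_at)
    return merged
```

After merging, the result can be written back to each location so that the vehicle 1, the voice collection device 21, and the cloud 29 all hold the same entries.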

FIG. 4 shows the vehicle event determination device applied concretely to the vehicle 1. In this case, the noise reduction unit 26, the voice data recognition unit 4, the keyword extraction unit 7, the acoustic discrimination unit 12, and the discrimination event specifying unit 16 described above are provided in each of a driver sensing controller and an occupant sensing controller. The driver sensing controller and the occupant sensing controller are arithmetic control units provided separately for the driver and for the front passenger (or a rear-seat passenger or the like; the front passenger case is described below). The noise reduction unit 26, the voice data recognition unit 4, the keyword extraction unit 7, the acoustic discrimination unit 12, and the discrimination event specifying unit 16 are implemented as software functional blocks within each arithmetic control unit. For the microphone 25, a highly directional array microphone or the like that can pick up only the voice from the driver's seat or from the front passenger seat is used. A separate array microphone is provided for the driver's seat and for the front passenger seat.

The driver status data from the driver sensing controller (the discriminated event 17 for the driver) and the occupant status data from the occupant sensing controller (the discriminated event 17 for the front passenger) are combined into integrated status data by an information integration controller and registered in the in-vehicle database (database 15) by an information control controller (the data registration unit 18 and the model training unit 22). The integrated status data and the in-vehicle database data can also be exchanged with the cloud 29, the voice collection device 21, and so on via a communication controller and communication devices.

In addition to the model training unit 22, the voice collection device 21 may, like the driver sensing controller and the occupant sensing controller, include the microphone 25, the noise reduction unit 26, the voice data recognition unit 4, the keyword extraction unit 7, the acoustic discrimination unit 12, the discrimination event specifying unit 16, and the data registration unit 18, and it can exchange information with the cloud 29 and the vehicle 1 via communication devices. Similarly, the cloud 29 may include the voice data recognition unit 4, the keyword extraction unit 7, the acoustic discrimination unit 12, the discrimination event specifying unit 16, the data registration unit 18, and the model training unit 22.

In addition, as shown in FIG. 5, the device can include a fusion controller and a vehicle-exterior sensing controller that take in information from the various sensors installed in the vehicle 1 to obtain environmental status data. It can also include a vehicle communication controller for obtaining ITS status data (ITS: Intelligent Transport Systems) through vehicle-to-vehicle and road-to-vehicle communication. It can further include a vehicle state detection controller that takes in information from CAN (FD) communication, private CAN communication, individual sensors, and the like to obtain vehicle data. The environmental status data, ITS status data, and vehicle data are input to the information integration controller and combined with the driver status data and occupant status data to form the integrated status data.
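The role of the information integration controller can be pictured as combining whichever status sources are present into one structure. The sketch below is an assumption about one possible shape of the integrated status data; the function name `integrate`, the source labels, and the use of plain dictionaries are hypothetical.

```python
def integrate(driver, occupant, environment=None, its=None, vehicle=None):
    """Combine the available status sources into integrated status data.

    Each argument is a dict of readings or None when that source is absent.
    The result keeps every source under its own label so that readings
    from different controllers cannot overwrite one another.
    """
    sources = {
        "driver": driver,
        "occupant": occupant,
        "environment": environment,
        "its": its,
        "vehicle": vehicle,
    }
    # Copy each dict so later changes to the inputs do not alter the result.
    return {name: dict(data) for name, data in sources.items() if data is not None}
```

For example, `integrate({"event": "near-miss"}, {"event": "normal"}, vehicle={"brake": "sudden"})` yields a dictionary with `driver`, `occupant`, and `vehicle` entries and no `environment` or `its` entry.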

(2) As shown in FIG. 6, a notification unit 31 may be provided that announces necessary information inside the vehicle 1 based on the discriminated event 17 specified by the discrimination event specifying unit 16.

Here, the notification unit 31 announces information inside the vehicle 1 by voice, display, or the like. The notification unit 31 is provided as a driver output system, an occupant output system, and so on. The driver output system is controlled by a driver HMI controller (HMI: human-machine interface), which acts as a notification control unit. The occupant output system is likewise controlled by an occupant HMI controller, which also acts as a notification control unit. Signals from the occupant HMI controller are input to the driver HMI controller.

At this time, a time-series statistical model generator trains a driver behavior model, a driver mental model, and an occupant mental model using the data in the in-vehicle database (database 15) and the integrated status data. The driver behavior model and the driver mental model are input to the driver HMI controller and used for output to the driver output system. The occupant mental model is input to the occupant HMI controller and used for output to the occupant output system.

Furthermore, each service algorithm is trained using the data in the in-vehicle database (database 15) and the integrated status data. Information from the service algorithms is input, together with the integrated status data, to a service controller and to the occupant HMI controller.

The states of the driver HMI controller, the occupant HMI controller, and the service controller are input to the information integration controller as a driver HMI control flag, an occupant HMI control flag, and a service control flag, respectively.

(3) The voice collection device 21 is a terminal 21a or 21b such as a portable mobile terminal, or a terminal 21f such as a fixed terminal installed outside the vehicle 1.

Here, various IoE (Internet of Everything) solutions can be used as the portable terminals 21a and 21b, for example smartphones (multifunction mobile communication terminals), smartwatches (wristwatch-type wearable terminals), and smart glasses (eyeglass-type wearable terminals).

As the fixed terminal 21f, a wide range of appliances that have a microphone 25 and can connect to the Internet can be used as IoE (Internet of Everything) solutions, for example landline telephones and intercoms installed in a house, and other appliances such as next-generation lighting, televisions, air conditioners, refrigerators, and vacuum cleaners.

(4) As shown in FIG. 3, the discriminated event 17 can be a near-miss event.

Here, a near-miss event is an event in which an occupant is startled or alarmed while riding in the vehicle 1; in other words, an event related to road traffic conditions. In this example, a near-miss event is determined comprehensively from the keyword "dangerous", the occupant state 11 being the specific state, a change being observed in the occupant situation 13, sudden braking being applied as the vehicle situation 14, and so on. The keywords related to near-miss events are not limited to "dangerous", and the contents of the other items involved in the determination are likewise not limited to the above.

By contrast, even when the keyword "dangerous" is present, the event is determined to be a normal event if, for example, the occupant state 11 is the normal state, no change is observed in the occupant situation 13, or no change is observed in the vehicle situation 14.

(5) The discriminated event 17 can be an exciting event.

Here, an exciting event is an event in which an occupant (including the driver) feels excited while riding in the vehicle 1; in other words, an event arising from the occupant's current emotional state. In this example, an exciting event is determined comprehensively from the keyword "fun", the occupant state 11 being the specific state, a change being observed in the occupant situation 13, the vehicle being in a gentle acceleration state as the vehicle situation 14, and so on. The keywords related to exciting events are not limited to "fun", and the contents of the other items involved in the determination are likewise not limited to the above.

By contrast, even when the keyword "fun" is present, the event is determined to be a normal event if, for example, the occupant state 11 is the normal state, no change is observed in the occupant situation 13, or no change is observed in the vehicle situation 14.

(6) The discriminated event 17 can be a moving (emotionally touching) event.

Here, a moving event is an event that emotionally moves an occupant in connection with riding in the vehicle 1; in other words, an event closely related to the occupant's values. Moving events include, for example, matters concerning scenic spots and historical heritage. In this example, a moving event is determined comprehensively from the keyword "beautiful", the occupant state 11 being the specific state, a change being observed in the occupant situation 13, the vehicle being in a gentle deceleration state as the vehicle situation 14, and so on. The keywords related to moving events are not limited to "beautiful", and the contents of the other items involved in the determination are likewise not limited to the above.

By contrast, even when the keyword "beautiful" is present, the event is determined to be a normal event if, for example, the occupant state 11 is the normal state, no change is observed in the occupant situation 13, or no change is observed in the vehicle situation 14.
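The three event types in (4) to (6) follow the same pattern, so they can be expressed as one table-driven rule. The sketch below is an interpretation, not the patent's algorithm: the text says the determination is made "comprehensively", and requiring every condition to hold is an assumption, as are the table `EVENT_RULES`, the function name, and the string labels.

```python
# keyword -> (event name, vehicle situation that supports the event)
EVENT_RULES = {
    "dangerous": ("near-miss", "sudden_braking"),
    "fun": ("exciting", "gentle_acceleration"),
    "beautiful": ("moving", "gentle_deceleration"),
}


def specify_event(keyword, occupant_state, situation_changed, vehicle_situation):
    """Return the discriminated event, or "normal" when the evidence is weak."""
    rule = EVENT_RULES.get(keyword)
    if rule is None:
        return "normal"
    event, expected_vehicle_situation = rule
    # The keyword alone is not enough: the acoustic state must be "specific",
    # the occupant situation must have changed, and the vehicle situation
    # must match the one associated with this event.
    if (
        occupant_state == "specific"
        and situation_changed
        and vehicle_situation == expected_vehicle_situation
    ):
        return event
    return "normal"
```

Holding the keywords in a table reflects the statement that the keywords are not limited to the three examples; further entries can be added without touching the rule itself.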

(7) As already shown in FIG. 2, the language model 5 is a data representation of the keywords 6 related to the discriminated event 17.
The normal acoustic model 8 has, as an input quantity, at least one item of acoustic information accompanying words uttered in daily life, such as volume, intonation, pitch, speed, length, intensity, tame (delayed onset), kobushi (ornamental pitch turn), shakuri (scooping up to a note), vibrato, falsetto (falsetto rate, degree of falsetto change), and breath (breath pattern).
The specific acoustic model 9 has, as an input quantity, at least one item of acoustic information accompanying the keyword 6 uttered at the time of the discriminated event 17, such as volume, intonation, pitch, speed, length, intensity, tame, kobushi, shakuri, vibrato, falsetto (falsetto rate, degree of falsetto change), and breath (breath pattern).

Here, tame, kobushi, shakuri, vibrato, falsetto, and the like are widely recognized as singing techniques and are used, for example, as scoring criteria in singing ability assessment software. Falsetto refers to the light, high vocal register above the normal voice.

(8) As already described, the language model 5, the specific acoustic model 9, and the normal acoustic model 8 are preferably created for each individual.

Here, individuals can be distinguished by, for example, registering a personal ID. The ID may also be associated with personal information such as date of birth, address, name, age, and occupation.
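Per-individual models that start from a standard model and are keyed by personal ID might be organized as in the following sketch. The class `ModelRegistry`, its methods, and the use of a flat dictionary of parameters as "the model" are hypothetical simplifications.

```python
class ModelRegistry:
    """Holds one model per personal ID, each seeded from a standard model."""

    def __init__(self, standard_model):
        self.standard = dict(standard_model)
        self.personal = {}  # person_id -> that person's own copy of the model

    def register(self, person_id):
        # A newly registered person starts with a copy of the standard model.
        self.personal.setdefault(person_id, dict(self.standard))

    def update(self, person_id, key, value):
        # Training adjusts only the individual's copy, never the standard.
        self.register(person_id)
        self.personal[person_id][key] = value

    def get(self, person_id):
        # Unregistered people fall back to the standard model.
        return self.personal.get(person_id, self.standard)
```

Copying the standard model at registration keeps one person's training from leaking into another person's model or into the shared baseline.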

The details of the vehicle event determination device are described below.

FIGS. 7A and 7B show the subsystems of the input system.

As shown in FIG. 7A, the driver sensing controller described above can use both direct (contact) sensing and non-contact sensing. Direct sensing includes seat sensors, steering sensors, belt sensors, armrest sensors, embedded sensors, wearable sensors, smartphone sensors, touch panel sensors, shift knob sensors, sun visor sensors, and switch sensors. The switch sensors include column switch sensors, audio switch sensors, mirror switch sensors, steering switch sensors, navigation switch sensors, air conditioning switch sensors, commander switch sensors, and other switch sensors. Non-contact sensing includes state detection cameras, acoustic sensing, laser sensors, ultrasonic sensors, and radio wave sensors. In acoustic sensing, as described above, the voice input to the microphone 25 undergoes noise reduction in the noise reduction unit 26 and is then subjected to voice recognition. The voice input to the microphone 25 can also be used for physical condition detection, emotion recognition, and the like.

Similarly, the occupant sensing controller described above can use both direct (contact) sensing and non-contact sensing. Direct sensing includes seat sensors, belt sensors, armrest sensors, wearable sensors, smartphone sensors, touch panel sensors, switch sensors, sun visor sensors, and embedded sensors. The switch sensors include commander switch sensors, audio switch sensors, navigation switch sensors, air conditioning switch sensors, and other switch sensors. Non-contact sensing includes acoustic sensing, state detection cameras, laser sensors, ultrasonic sensors, and radio wave sensors. In acoustic sensing, as described above, the voice input to the microphone 25 undergoes noise reduction in the noise reduction unit 26 and is then subjected to voice recognition. The voice input to the microphone 25 can also be used for physical condition detection, emotion recognition, and the like.

As shown in FIG. 7B, the sensors that feed the fusion controller described above include a surroundings detection camera, front detection camera, rear detection camera, millimeter-wave radar, infrared laser, front detection LiDAR (Light Detection and Ranging, or Laser Imaging Detection and Ranging), left and right detection LiDAR, rear LiDAR, ultrasonic sensor, and vehicle-exterior acoustic sensor.

Inputs to the vehicle state detection controller described above, supplied via CAN or other means, include CAN FD, OBD II, GPS, a motion sensor, an in-vehicle acoustic sensor, a period sensor, an in-vehicle state sensor, and a condition sensor.

The communication devices that communicate with the communication control controller described above are the voice collection devices 21 (terminals 21a, 21b, 21f, etc.), which can connect via Bluetooth class 1 ver. 4.0, Bluetooth class 2 ver. 4.0 (Bluetooth is a registered trademark), WiMAX 2, IEEE 802.11, LTE, Zigbee, 5G, or the like.

Vehicle-to-vehicle communication by the vehicle communication control controller described above may use the 5.8G band or the 760 MHz band. Road-to-vehicle communication may use DSRC (Dedicated Short Range Communications), optical beacons, the 760 MHz band, or the like.

These input-side subsystems can be used to detect the occupant situation 13, the vehicle situation 14, the surrounding environment situation, and the like.

Next, FIG. 8 shows the output-side subsystems.

The output systems (notification unit 31) controlled by the driver HMI control controller can include a HUD, navigation monitor, second monitor, sound output system, seat output system, rear mirror, steering output system, rearview mirror, meter display, window, non-contact tactile output system, odor output system, armrest output system, front window AR, pseudo sound system, Orbis radar, switch output system, front light, seat belt, A-pillar AR, smartphone, wearable device, tablet terminal, and the like.

The output systems (notification unit 31) controlled by the occupant HMI control controller can include a navigation monitor, second monitor, sound output system, seat output system, window, rear seat display, non-contact tactile output system, odor output system, armrest output system, front window AR, pseudo sound system, AR output, game terminal, switch output system, front light, seat belt, A-pillar AR, smartphone, wearable device, tablet terminal, and the like.

These output-side subsystems can be used for the notification unit 31 and the like, either alone or in combination.

FIG. 9 shows the state of the database 15.

The database 15 in the vehicle 1 is a simplified database that holds, for example, near-miss related data, excitement related data, and a simplified in-vehicle behavior model.

As static data, it can hold information such as the vehicle number, driver type, occupant type, date and time, day of the week, and number of drives or rides.

The vehicle situation can include information such as speed, latitude and longitude, altitude, gradient, accelerator state, brake state, clutch state, turn signal state, gear state, wiper state, door mirror state, seat state, driving operation level, audio state, warning state, light state, steering state, idle state, air conditioner state, and seat belt state.

The (surrounding) environment information can include information such as the pedestrian situation, communication state, road situation, rules, vehicle situation, traffic signal situation, POI (Point of Interest), own-vehicle position situation, situation of moving objects in the surrounding airspace, SNS, and server information. Of these, the own-vehicle position situation includes detailed information such as latitude, altitude, lane position, humidity, weather, longitude, gradient, temperature, brightness, and transparency. The road situation can include detailed information such as the road type and the number of lanes. The vehicle situation can include detailed information such as the situation of vehicles ahead, vehicles running alongside, crossing vehicles, and vehicles behind. The pedestrian situation, the situation of moving objects in the surrounding airspace, the traffic signal situation, and the like can also include various kinds of detailed information.

The driver situation and occupant situation can include information such as the biological condition, line of sight, fetal movement, excitement level, degree of being emotionally moved, requests, driving (riding) duration, satisfaction, emotion, face orientation, ease of seeing (glare), ease of hearing, utterance content, enjoyment, and blink state. Of these, the biological condition can include detailed information such as arousal level, brain waves, cerebral blood flow, HbA1c, blood glucose level, perceived temperature, fatigue, blood pressure, γ-GPT, blood amino acids, heart rate, body temperature, blood concentration of regular medication, and hunger.
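The categories listed for the database 15 (static data, vehicle situation, environment information, driver or occupant situation) can be pictured as one record per observation. The following is only a minimal sketch of such a record. The class names, field names, units, and sample values are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StaticData:
    """Static data items named in the description (field names are assumed)."""
    vehicle_no: str
    driver_type: str
    occupant_type: str
    timestamp: str       # date and time; ISO 8601 text is an assumption
    day_of_week: str
    ride_count: int      # number of drives or rides


@dataclass
class SituationRecord:
    """One observation combining the four categories of the database 15."""
    static: StaticData
    vehicle: Dict[str, float] = field(default_factory=dict)      # e.g. speed, gradient
    environment: Dict[str, str] = field(default_factory=dict)    # e.g. weather, road type
    occupant: Dict[str, float] = field(default_factory=dict)     # e.g. heart rate


# Hypothetical sample record.
record = SituationRecord(
    static=StaticData("V-001", "owner", "adult", "2016-07-26T09:00:00", "Tue", 42),
    vehicle={"speed_kmh": 38.0, "gradient_pct": 2.5},
    environment={"weather": "rain", "road_type": "urban"},
    occupant={"heart_rate_bpm": 96.0, "excitement": 0.2},
)
```

Free-form dictionaries are used for the three situation categories because the description lists their items as open-ended examples rather than a fixed schema.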

The voice collection device 21 internally holds a simplified database, a simplified normal behavior model, and the like. Here, the user situation can include information such as the biological condition, line of sight, fetal movement, excitement level, degree of being emotionally moved, requests, driving (riding) duration, satisfaction, emotion, face orientation, ease of seeing (glare), ease of hearing, utterance content, enjoyment, and blink state. Of these, the biological condition can include detailed information such as arousal level, brain waves, cerebral blood flow, HbA1c, blood glucose level, perceived temperature, fatigue, blood pressure, γ-GPT, blood amino acids, heart rate, body temperature, blood concentration of regular medication, and hunger.

The server in the cloud 29 holds various databases, an in-vehicle behavior model, and a normal behavior model.

<Operation> The operation of this embodiment is described below.

As shown in FIG. 1, in the vehicle event discrimination device, the voice data recognition unit 4 first recognizes voice data 3 from the voice 2 uttered in the vehicle 1.

Next, the keyword extraction unit 7 uses the language model 5 set for event discrimination (the event discrimination language model) to extract, from the voice data 3 recognized by the voice data recognition unit 4, a keyword 6 related to the event to be discriminated (event keyword discrimination processing).
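As an illustration of the keyword extraction step, the sketch below treats the language model 5 as a plain set of keywords and the voice data 3 as already-recognized text. Both are simplifying assumptions. The keyword list and the function name `extract_keywords` are hypothetical, and the specification does not say how the language model is represented internally.

```python
from typing import List, Set

# Hypothetical event-discrimination language model: a set of keywords.
LANGUAGE_MODEL: Set[str] = {"watch out", "whoa", "brake", "wow"}


def extract_keywords(voice_data: str, language_model: Set[str]) -> List[str]:
    """Return the model keywords found in the recognized text, in spoken order.

    Matching is a case-insensitive substring search, which is the simplest
    possible stand-in for a real keyword spotter.
    """
    text = voice_data.lower()
    hits = [(text.find(keyword), keyword) for keyword in language_model if keyword in text]
    return [keyword for _, keyword in sorted(hits)]


found = extract_keywords("Whoa, brake! That was close.", LANGUAGE_MODEL)
```

Substring matching will also fire inside longer words, so a practical version would need word-boundary handling suited to the target language.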

The acoustic discrimination unit 12 then compares the acoustic information accompanying the keyword 6 extracted by the keyword extraction unit 7 with the preset normal acoustic model 8 (the normal-time acoustic model) and the preset specific acoustic model 9 (the event discrimination acoustic model) to determine whether the occupant state 11 is the normal state or the specific state (event voice discrimination processing).
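One way to picture this comparison is a nearest-model rule: the observed acoustic features are scored against both models, and the closer model decides the occupant state. The sketch below assumes each model stores a mean and standard deviation per feature and uses a root-mean-square z-score as the distance. The specification does not prescribe this metric, and the feature names and numeric values are hypothetical.

```python
from math import sqrt
from typing import Dict, Tuple

Features = Dict[str, float]
Model = Dict[str, Tuple[float, float]]  # feature name -> (mean, standard deviation)

# Hypothetical models; the numbers are illustrative only.
NORMAL_MODEL: Model = {"volume_db": (60.0, 5.0), "pitch_hz": (180.0, 20.0), "speed_sps": (5.0, 1.0)}
SPECIFIC_MODEL: Model = {"volume_db": (78.0, 6.0), "pitch_hz": (260.0, 30.0), "speed_sps": (7.5, 1.5)}


def distance(features: Features, model: Model) -> float:
    """Root-mean-square z-score over the features shared with the model."""
    shared = [name for name in model if name in features]
    if not shared:
        raise ValueError("no acoustic feature in common with the model")
    total = sum(((features[n] - model[n][0]) / model[n][1]) ** 2 for n in shared)
    return sqrt(total / len(shared))


def discriminate(features: Features,
                 normal: Model = NORMAL_MODEL,
                 specific: Model = SPECIFIC_MODEL) -> str:
    """Return "specific" if the utterance is closer to the specific model, else "normal"."""
    if distance(features, specific) < distance(features, normal):
        return "specific"
    return "normal"


state = discriminate({"volume_db": 80.0, "pitch_hz": 270.0, "speed_sps": 7.0})
```

Because only shared features are scored, the rule still works when a single feature is available, which matches the "at least one" wording used for the acoustic information.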

The discrimination event identification unit 16 then identifies the event occurring to the occupant (event discrimination processing) based on the keyword 6 extracted by the keyword extraction unit 7, the occupant state 11 determined by the acoustic discrimination unit 12, and any one of the following: the occupant situation 13 (including the driver situation) at or around the time the keyword 6 was extracted, the vehicle situation 14, and the data accumulated in the database 15.
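The identification step combines three inputs: the keyword, the occupant state, and one piece of contextual evidence. A minimal rule-based sketch follows. The keyword-to-event table, the use of deceleration as the vehicle-situation evidence, and the 4.0 m/s² threshold are all assumptions made for illustration and do not come from the specification.

```python
from typing import Callable, Dict, Optional

Context = Dict[str, float]

# Hypothetical mapping from keyword to candidate event.
KEYWORD_TO_EVENT: Dict[str, str] = {"whoa": "near_miss", "brake": "near_miss", "wow": "excitement"}

# Hypothetical context checks on the vehicle situation (deceleration in m/s^2).
CONFIRM: Dict[str, Callable[[Context], bool]] = {
    "near_miss": lambda ctx: ctx.get("decel_ms2", 0.0) >= 4.0,   # hard braking observed
    "excitement": lambda ctx: ctx.get("decel_ms2", 0.0) < 4.0,   # no hard braking
}


def identify_event(keyword: str, occupant_state: str, context: Context) -> Optional[str]:
    """Return the identified event, or None if the evidence does not line up."""
    if occupant_state != "specific":
        return None                      # calm delivery: treat as ordinary speech
    candidate = KEYWORD_TO_EVENT.get(keyword)
    if candidate is not None and CONFIRM[candidate](context):
        return candidate
    return None


event = identify_event("brake", "specific", {"decel_ms2": 6.0})
```

Requiring all three inputs to agree keeps a loud "brake" spoken during gentle cruising from being logged as a near-miss.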

Further, the data registration unit 18 registers in the database 15 data (event related data) that associates at least the discrimination event 17 identified by the discrimination event identification unit 16 with the keyword 6 determined by the keyword extraction unit 7.
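The registration step can be sketched with an in-memory SQLite table standing in for the database 15. The table name, column names, and the extra `occupant_state` and `recorded_at` columns are hypothetical. The specification only requires that the event and the keyword be associated.

```python
import sqlite3


def open_database() -> sqlite3.Connection:
    """Create an in-memory stand-in for the database 15 (schema is assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event_data ("
        " id INTEGER PRIMARY KEY,"
        " event TEXT NOT NULL,"
        " keyword TEXT NOT NULL,"
        " occupant_state TEXT,"
        " recorded_at TEXT)"
    )
    return conn


def register_event(conn: sqlite3.Connection, event: str, keyword: str,
                   occupant_state: str, recorded_at: str) -> int:
    """Store one event-keyword association and return its row id."""
    cur = conn.execute(
        "INSERT INTO event_data (event, keyword, occupant_state, recorded_at)"
        " VALUES (?, ?, ?, ?)",
        (event, keyword, occupant_state, recorded_at),
    )
    conn.commit()
    return cur.lastrowid


conn = open_database()
row_id = register_event(conn, "near_miss", "brake", "specific", "2016-07-26T09:00:00")
```

Rows accumulated this way are what the identification step could later consult as "data accumulated in the database".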

Meanwhile, the voice collection device 21 collects voice 2 uttered in daily life outside the vehicle 1.

The model training unit 22 then trains the language model 5, the normal acoustic model 8, and the specific acoustic model 9 from the voice 2 collected in daily life outside the vehicle 1 by the voice collection device 21.
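To illustrate how continuously collected voice could grow an acoustic model, the sketch below updates a per-feature mean and standard deviation one sample at a time using Welford's online algorithm. This technique is chosen here only for illustration. The specification does not state how the model training unit 22 updates its models, and the feature name and sample values are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, Iterable


@dataclass
class RunningStat:
    """Online mean and variance for one acoustic feature (Welford's algorithm)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0      # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        """Sample standard deviation; 0.0 until two samples have been seen."""
        return sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def train_acoustic_model(samples: Iterable[Dict[str, float]]) -> Dict[str, RunningStat]:
    """Fold each collected utterance's features into the per-feature statistics."""
    model: Dict[str, RunningStat] = {}
    for sample in samples:
        for name, value in sample.items():
            model.setdefault(name, RunningStat()).update(value)
    return model


# Hypothetical everyday utterances collected outside the vehicle.
model = train_acoustic_model([{"volume_db": 58.0}, {"volume_db": 60.0}, {"volume_db": 62.0}])
```

An online update suits this setting because each terminal can fold in new utterances as they arrive without storing the raw audio history.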

<Effects> This embodiment provides the following effects.

(Effect 1) If the reference models, such as the language model 5, the normal acoustic model 8, and the specific acoustic model 9, were trained using only the voice 2 uttered in the vehicle 1, the amount of such voice would be insufficient, and it would take a long time to obtain models with high discrimination accuracy. Training a system with high discrimination accuracy is said to require, for example, the time it takes one person to drive 20,000 km.

Therefore, the external voice collection device 21 collects everyday voice 2 uttered outside the vehicle 1, and the model training unit 22 uses that voice to train models such as the language model 5, the normal acoustic model 8, and the specific acoustic model 9. This makes it possible to obtain models with high discrimination accuracy in a short period and to readily perform discrimination using those models.

(Effect 2) A notification unit 31 that delivers necessary information inside the vehicle 1 is provided. Based on the discrimination event 17 identified by the discrimination event identification unit 16, the notification unit 31 can proactively deliver necessary information inside the vehicle 1 at the right time. Road traffic information can therefore be distributed, or other useful information provided, on the basis of the discrimination event 17.

(Effect 3) The portable terminals 21a and 21b and the fixed terminal 21f are used as the voice collection device 21. Because individuals carry the portable terminals 21a and 21b with them, these terminals are ideally suited to collecting a large amount of voice 2. Because the fixed terminal 21f is installed where people live or in similar places, it can also serve effectively as a voice collection device 21. Together, these allow a larger amount of voice 2 (everyday voice 2 uttered outside the vehicle 1) to be collected reliably in a short time. In addition, the portable terminals 21a and 21b and the fixed terminal 21f are equipped with various sensors. Using these sensors makes it possible to collect many kinds of information other than the voice 2, associate that information with the voice 2, and improve the discrimination accuracy for the discrimination event 17.

(Effect 4) The discrimination event 17 may be a near-miss event. Accurately discriminating near-miss events makes it possible to provide support for driving with peace of mind and support for preventing accidents before they occur. The device can be used, for example, as an information collection and distribution device that collects and distributes road traffic information.

(Effect 5) The discrimination event 17 may be an excitement event. Accurately discriminating excitement events makes it possible to heighten the anticipation of occupants riding in the vehicle 1 and to provide support that brings out the enjoyment of outings and driving. In other words, beyond an information collection and distribution device, the device can serve as a guidance device that gives driving new value and makes it more meaningful.

(Effect 6) The discrimination event 17 may be an emotionally moving event. Accurately discriminating emotionally moving events makes it possible to provide support that makes outings and driving more meaningful. In other words, the device can serve as a guidance device that furthers the original purpose of driving.

(Effect 7) The language model 5 is a data representation of the keywords 6 related to the discrimination event 17. The normal acoustic model 8 has, as an input quantity, at least one item of acoustic information accompanying words uttered in daily life, such as volume, intonation, pitch, speed, length, intensity, held pauses (tame), pitch turns (kobushi), pitch scoops (shakuri), vibrato, falsetto (falsetto rate, degree of falsetto change), and breath (breath form). The specific acoustic model 9 has, as an input quantity, at least one item of acoustic information accompanying the keyword 6 uttered when the discrimination event 17 occurs, such as volume, intonation, pitch, speed, length, intensity, held pauses (tame), pitch turns (kobushi), pitch scoops (shakuri), vibrato, falsetto (falsetto rate, degree of falsetto change), and breath (breath form).

As a result, the keyword extraction unit 7 can use the language model 5 to extract the keyword 6 related to the discrimination event 17. The acoustic discrimination unit 12 can then accurately judge whether the occupant state 11 is the specific state or the normal state by comparing at least one item of acoustic information observed when the keyword 6 is detected in the vehicle, such as volume, intonation, pitch, speed, length, intensity, held pauses (tame), pitch turns (kobushi), pitch scoops (shakuri), vibrato, falsetto (falsetto rate, degree of falsetto change), and breath (breath form), with the acoustic information registered in the specific acoustic model 9 and the normal acoustic model 8.

(Effect 8) The language model 5, the specific acoustic model 9, and the normal acoustic model 8 are created for each individual. This allows the discrimination event 17 to be judged accurately for each individual. Because the models are created per individual, events can be discriminated accurately whether that individual is riding as the driver or as another occupant, so the discrimination event 17 can be judged accurately and individually for every occupant. Even when the vehicle 1 in use changes, the models can be applied per individual to the new vehicle 1 and used for event discrimination as they are. Furthermore, by classifying the per-individual language models 5, specific acoustic models 9, and normal acoustic models 8, the same judgment can be made for, or the same information delivered to, people with similar tendencies.
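Two consequences of per-individual models can be sketched briefly: lookup depends only on the person, not on the vehicle or seat, and people with similar profiles can be grouped. The sketch below assumes each person is summarized by a few already-comparable feature means and uses squared Euclidean distance. Both choices, along with the identifiers and numbers, are hypothetical.

```python
from typing import Dict

Profile = Dict[str, float]   # feature name -> that person's typical value

# Hypothetical per-person profiles keyed by a person identifier.
PROFILES: Dict[str, Profile] = {
    "person_a": {"volume_db": 60.0, "pitch_hz": 180.0},
    "person_b": {"volume_db": 61.0, "pitch_hz": 185.0},
    "person_c": {"volume_db": 75.0, "pitch_hz": 250.0},
}


def profile_for(person_id: str, vehicle_id: str, seat: str,
                profiles: Dict[str, Profile] = PROFILES) -> Profile:
    """Look up a person's profile; vehicle and seat deliberately play no role."""
    return profiles[person_id]


def most_similar(person_id: str, profiles: Dict[str, Profile] = PROFILES) -> str:
    """Return the other person whose profile is closest (squared Euclidean distance)."""
    target = profiles[person_id]
    others = [p for p in profiles if p != person_id]
    return min(others, key=lambda p: sum((profiles[p][k] - target[k]) ** 2 for k in target))


same = profile_for("person_a", "car_1", "driver") is profile_for("person_a", "car_2", "rear_left")
nearest = most_similar("person_a")
```

In practice the features would need to be put on a common scale before measuring distance, since otherwise the feature with the largest numeric range dominates the result.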

The embodiment has been described in detail above with reference to the drawings, but it is merely an example. The present invention is therefore not limited to the embodiment, and design changes and the like that do not depart from the gist of the invention are of course included in it. For example, where an embodiment includes a plurality of configurations, possible combinations of those configurations are included even if not expressly stated. Where a plurality of embodiments or modifications are disclosed, possible combinations of configurations spanning them are likewise included even if not expressly stated. Configurations depicted in the drawings are included even if not expressly described. Further, the term "etc." is used to mean that equivalents are included, and terms such as "substantially," "about," and "approximately" are used to include ranges and accuracies that are accepted as a matter of common sense.

Description of symbols
1 Vehicle
2 Voice
3 Voice data
4 Voice data recognition unit
5 Language model
6 Keyword
7 Keyword extraction unit
8 Normal acoustic model
9 Specific acoustic model
11 Occupant state
12 Acoustic discrimination unit
13 Occupant situation
14 Vehicle situation
15 Database
16 Discrimination event identification unit
17 Discrimination event
18 Data registration unit
21 Voice collection device
21a Portable terminal
21b Portable terminal
21f Fixed terminal
22 Model training unit
31 Notification unit

Claims

A vehicle event discrimination device comprising:
a voice data recognition unit that recognizes voice data from voice uttered in a vehicle;
a keyword extraction unit that extracts, from the voice data recognized by the voice data recognition unit, a keyword related to an event to be discriminated, using a language model set for event discrimination;
an acoustic discrimination unit that compares acoustic information accompanying the keyword extracted by the keyword extraction unit with a preset normal acoustic model and a preset specific acoustic model to determine whether an occupant state is a normal state or a specific state;
a discrimination event identification unit that identifies an event occurring to an occupant based on the keyword extracted by the keyword extraction unit, the occupant state determined by the acoustic discrimination unit, and any one of an occupant situation at or around the time the keyword was extracted, a vehicle situation, and data accumulated in a database; and
a data registration unit that registers, in the database, data associating at least the discrimination event identified by the discrimination event identification unit with the keyword determined by the keyword extraction unit,
wherein a voice collection device capable of collecting voice uttered in daily life outside the vehicle is further provided, and
a model training unit is provided that is capable of training the language model, the normal acoustic model, and the specific acoustic model from the voice collected in daily life outside the vehicle by the voice collection device.

The vehicle event discrimination device according to claim 1,
further comprising a notification unit that delivers necessary information inside the vehicle based on the discrimination event identified by the discrimination event identification unit.

The vehicle event discrimination device according to claim 1 or claim 2,
wherein the voice collection device is a portable terminal that can be carried, or a fixed terminal installed outside the vehicle.

The vehicle event discrimination device according to any one of claims 1 to 3,
wherein the discrimination event is a near-miss event.

The vehicle event discrimination device according to any one of claims 1 to 3,
wherein the discrimination event is an excitement event.

The vehicle event discrimination device according to any one of claims 1 to 3,
wherein the discrimination event is an emotionally moving event.

The vehicle event discrimination device according to any one of claims 1 to 6,
wherein the language model is a data representation of keywords related to the discrimination event,
the normal acoustic model has, as an input quantity, at least one item of acoustic information accompanying words uttered in daily life, such as volume, intonation, pitch, speed, length, intensity, held pauses (tame), pitch turns (kobushi), pitch scoops (shakuri), vibrato, falsetto (falsetto rate, degree of falsetto change), and breath (breath form), and
the specific acoustic model has, as an input quantity, at least one item of acoustic information accompanying a keyword uttered when the discrimination event occurs, such as volume, intonation, pitch, speed, length, intensity, held pauses (tame), pitch turns (kobushi), pitch scoops (shakuri), vibrato, falsetto (falsetto rate, degree of falsetto change), and breath (breath form).

The vehicle event discrimination device according to claim 7,
wherein the language model, the specific acoustic model, and the normal acoustic model are created for each individual.