WO2023204288A1 - Information processing device and program - Google Patents

Information processing device and program

Info

Publication number
WO2023204288A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
financial data
information processing
account item
financial
Prior art date
Application number
PCT/JP2023/015855
Other languages
French (fr)
Japanese (ja)
Inventor
孝志 森
伸太郎 今井
伊織 三浦
青雲 山根
Original Assignee
有限責任監査法人トーマツ
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 有限責任監査法人トーマツ
Publication of WO2023204288A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12Accounting

Definitions

  • the present invention relates to an information processing device that inspects accounting processing.
  • An object of the present invention is to provide a technology that allows highly accurate estimation of the existence of fraud in accounting procedures and objectively presents the basis for determining fraud.
  • An information processing apparatus includes an acquisition means, an extraction means, and a determination means.
  • the acquisition means acquires financial data related to multiple accounting periods for one business operator.
  • the extraction means extracts, for each account item generated based on the financial data acquired by the acquisition means, transition information indicating a change in value related to the account item.
  • the determining means determines the inappropriateness of the accounting process by the one business operator based on the transition information extracted by the extracting means. According to this information processing device, when there is fraud in accounting processing by one business operator, the fact of fraud can be estimated with high accuracy. In addition, since the basis for determining fraud can be presented objectively, subsequent manual verification becomes more efficient.
  • the determination means includes a preprocessing means, a plurality of weak learners, and a bagging means.
  • the preprocessing means generates a plurality of data sets by performing undersampling based on the transition information extracted by the extraction means.
  • the plurality of data sets output from the preprocessing means are respectively input to the plurality of weak learners.
  • These plurality of weak learners construct a learning model for outputting the indicator indicating the inappropriateness based on the financial data regarding the plurality of businesses.
  • the bagging means calculates an index indicating inappropriateness by bagging outputs from a plurality of weak learners. According to this aspect, the fact of fraud can be estimated with even higher accuracy. Furthermore, it is possible to provide a learning method suitable for situations where it is difficult to collect fraudulent training data.
  • the plurality of weak learners operate based on gradient boosting. According to this aspect, the accuracy of determining whether there is fraud in accounting processing is improved.
  • the preprocessing means generates data by adding, to the transition information extracted by the extraction means, feature quantities obtained by inputting the financial data into a plurality of learning models different from the learning model.
  • the preprocessing means generates the plurality of data sets by performing undersampling on the generated data. According to this aspect, the accuracy of determining whether there is fraud in accounting processing is improved.
  • the determining means calculates a SHAP value indicating the degree of contribution for each account item based on the index calculated for each account item. According to this aspect, the basis for determining fraud is visualized.
  • the preprocessing means corrects the transition information based on market data.
  • In another aspect, the present invention provides a program for causing a computer to execute: a step of acquiring financial data relating to a plurality of accounting periods for one business operator; a step of extracting, for each account item generated based on the acquired financial data, transition information indicating the transition of a value related to that account item; and a step of determining the inappropriateness of the accounting process by the one business operator based on the generated information.
  • FIG. 1 is a diagram illustrating a configuration example of an information processing device 10 according to an embodiment of the present invention.
  • FIG. 2 is a functional block diagram showing functions realized by a control unit 100 of the information processing device 10 according to a financial analysis program PA.
  • FIG. 3 is a flowchart showing the flow of a financial analysis method executed by the control unit 100 of the information processing device 10 according to the financial analysis program PA.
  • FIG. 4 is a diagram illustrating an example of SHAP values displayed on the display unit 130 of the information processing device 10.
  • FIG. 1 is a diagram showing a configuration example of an information processing device 10 according to an embodiment of the present invention.
  • the information processing device 10 is, for example, a computer device owned and managed by an auditing corporation, but it does not matter who is the administrator or user of the computer device.
  • the information processing device 10 in this embodiment is, for example, a personal computer.
  • the information processing device 10 includes a control unit 100, a communication I/F unit 110, an operation input unit 120, a display unit 130, a storage unit 140, and a bus 150 that mediates data exchange between these components.
  • the control unit 100 is, for example, a CPU (Central Processing Unit).
  • the control unit 100 functions as a control center of the information processing device 10 by executing the financial analysis program PA stored in the storage unit 140.
  • the communication I/F unit 110 is connected to a telecommunications line such as the Internet wirelessly or by wire.
  • Communication I/F unit 110 receives data sent via a telecommunications line, and delivers the received data to control unit 100. Furthermore, the communication I/F section 110 sends data given from the control section 100 to the telecommunications line.
  • the operation input unit 120 includes one or both of a pointing device such as a mouse and a keyboard.
  • the operation input unit 120 outputs operation content data indicating the content of the user's operation to the control unit 100. Thereby, the content of the user's operation is transmitted to the control unit 100.
  • the display unit 130 is a display device including a liquid crystal panel and its driving circuit.
  • the display unit 130 displays various images under the control of the control unit 100.
  • the storage unit 140 is a storage device that includes a volatile storage unit 142 and a nonvolatile storage unit 144.
  • the volatile storage unit 142 is, for example, a RAM (Random Access Memory).
  • the volatile storage unit 142 is used by the control unit 100 as a work area when executing various programs.
  • the nonvolatile storage unit 144 is, for example, a hard disk.
  • Various programs and various data are stored (installed) in the nonvolatile storage unit 144 in advance.
  • The programs stored in advance in the nonvolatile storage unit 144 include a kernel program for implementing an OS (Operating System) in the control unit 100 and the financial analysis program PA. Note that in FIG. 1, illustration of the kernel program is omitted.
  • the financial analysis program PA is a program that causes the control unit 100 to execute the financial analysis method according to the present invention.
  • When the power supply (not shown in FIG. 1) of the information processing device 10 is turned on, the control unit 100 reads the kernel program from the nonvolatile storage unit 144 into the volatile storage unit 142 and starts executing it.
  • When the control unit 100 operating according to the kernel program receives operation content data from the operation input unit 120 instructing the start of execution of another program, it reads that program from the nonvolatile storage unit 144 into the volatile storage unit 142 and starts executing it.
  • The following description focuses on the case where an operation instructing execution of the financial analysis program PA is performed on the operation input unit 120.
  • Upon receiving operation content data from the operation input unit 120 instructing the start of execution of the financial analysis program PA, the control unit 100 reads the financial analysis program PA from the nonvolatile storage unit 144 into the volatile storage unit 142 and starts executing it.
  • the control unit 100 operating according to the financial analysis program PA realizes the functions of the present invention.
  • FIG. 2 is a functional block diagram showing the functions that the control unit 100 implements according to the financial analysis program PA.
  • the control unit 100 operating according to the financial analysis program PA functions as an acquisition means 1002, an extraction means 1004, and a determination means 1006. That is, the acquisition means 1002, extraction means 1004, and determination means 1006 shown in FIG. 2 are software modules realized by operating a computer such as a CPU according to software (financial analysis program PA).
  • the functions of each of the acquisition means 1002, extraction means 1004, and determination means 1006 are as follows.
  • the acquisition means 1002 acquires financial data D1 related to a plurality of fiscal periods, such as five fiscal periods, for one business entity that is subject to risk management (hereinafter referred to as a target business entity).
  • a business entity may be a corporation, other organization, sole proprietorship, or any other entity that conducts some kind of business activity and prepares and manages accounting information, and does not necessarily have to match the legal definition or classification.
  • So-called group companies, including consolidated subsidiaries, etc. may be regarded as one business operator.
  • a business entity related to a transaction described in one financial data D1 can be regarded as one business entity.
  • the acquisition means 1002 acquires the financial data D1 from the server in charge of financial management in the target business entity by communicating with the server via the communication I/F section 110.
  • Financial data for one fiscal year typically represents a financial statement that describes the values for each account such as cost of goods sold, sales, and inventories at the fiscal year end.
  • the financial data D1 represents financial statements for each of a plurality of financial periods.
  • the financial data D1 does not need to indicate the contents of the account book or financial statement itself prepared by the business operator, and it is sufficient if it is created based on records such as the financial statement or account book.
  • the financial data D1 may include information generated using information on a plurality of account items, such as sales profit rate.
  • the financial data D1 may be in a format required by a regulatory agency such as the Financial Services Agency or by an institution specified by law such as a stock exchange, or may be in a format unique to the target business entity.
  • the financial data D1 may be a ledger in which the details of past transactions conducted by the business operator are recorded, as long as it is composed of headings (account items, display items) and their contents. It does not matter whether it is public or private.
  • the financial data D1 may or may not be associated with information indicating whether or not fraud has actually been discovered.
  • the fiscal year-end does not necessarily have to be the one required by the company's articles of incorporation or by law; it may simply be the date set as the cutoff for the books mentioned above. Note that if the financial data D1 is made public by a regulatory agency or stock exchange, the acquisition means 1002 may obtain the financial data D1 from that regulatory agency, stock exchange, or the like, rather than from the target business entity's server.
  • the extraction means 1004 extracts from the financial data D1, for each account item extracted based on the financial data D1 acquired by the acquisition means 1002, transition information D2 indicating changes in values over a plurality of accounting periods. For example, if the financial statements of the target business entity include M (M is an integer of 2 or more) account items, the extraction means 1004 extracts M account items at maximum. When M account items are extracted, the transition information D2 represents the change in value over a plurality of settlement periods for each of the M account items.
  • the extraction means 1004 may store the extracted transition information D2 in the nonvolatile storage unit 144 so that the basis for a fraud determination can be presented at a later date, or may cause the display unit 130 to display it.
  • For example, the display unit 130 may display a graph or the like showing the transition of values for each account item over a plurality of accounting periods of the target business entity.
  • The determination means 1006 determines the inappropriateness of the target business entity's accounting processing based on the transition information D2 generated by the extraction means 1004. As shown in FIG. 2, the determination means 1006 in this embodiment includes a preprocessing means 1006a, weak learners 1006b(1) to 1006b(N), and a bagging means 1006c. Note that N is an integer of 2 or more. In the following, when there is no need to distinguish between the weak learners 1006b(1) to 1006b(N), they are referred to as weak learners 1006b.
  • The determination means 1006 has a plurality of weak learners 1006b.
  • A weak learner is an AI model that cannot be expected to achieve high prediction accuracy on its own.
  • A specific example of a weak learner is a decision tree.
  • A decision tree is a tree structure for drawing a conclusion regarding the target value of a certain item from the observation results of that item.
  • A learning model for outputting, for each account item, an index indicating the inappropriateness of accounting treatment, based on financial data of multiple businesses different from the target business, is constructed in the form of an ensemble of the weak learners 1006b(1) to 1006b(N).
  • The weak learners 1006b(1) to 1006b(N) in this embodiment function as the above learning model by operating based on gradient boosting.
  • Gradient boosting here refers to a method that randomly acquires partial data from the training data, sequentially constructs models (decision trees), and determines the final predicted value by a weighted majority vote of the predicted values of the models.
  • The preprocessing means 1006a generates a plurality of data sets based on the transition information D2 extracted by the extraction means 1004. To explain in more detail, the preprocessing means 1006a first inputs the financial data D1 into a plurality of learning models different from the learning model constructed by the weak learners 1006b(1) to 1006b(N), and generates data by adding the feature quantities thus obtained to the transition information D2. Specific examples of the plurality of learning models in this embodiment include isolation forest and the CFO modified Jones model.
  • The preprocessing means 1006a then performs undersampling on the data generated in this way to create a plurality of data sets, namely data sets D3(1) to D3(N).
  • Undersampling refers to randomly extracting Y (Y is a positive integer less than or equal to X) pieces of data from X (X is an integer greater than or equal to 2) pieces of data.
  • The preprocessing means 1006a performs undersampling N times to generate the data sets D3(1) to D3(N); the data extracted in the individual undersampling runs may partially overlap. Note that, during undersampling, each of the data sets D3(1) to D3(N) always includes all of the transition information D2 for fraud cases from past years.
  • The preprocessing means 1006a inputs the data sets D3(1) to D3(N) to the weak learners 1006b(1) to 1006b(N) one by one. Specifically, the preprocessing means 1006a inputs the data set D3(n) to the weak learner 1006b(n).
  • The bagging means 1006c calculates an index V1 indicating inappropriateness for each account item by bagging the outputs of the weak learners 1006b(1) to 1006b(N) (in this embodiment, by taking a weighted majority vote).
  • The bagging means 1006c calculates one index V1 for each of the M account items, that is, M indices V1.
  • The determination means 1006 determines the inappropriateness of accounting processing for each target business entity and account item based on the index V1. For example, if the index V1 calculated by the bagging means 1006c exceeds a predetermined threshold, the determination means 1006 determines that there is a high possibility that inappropriate accounting processing is being performed.
  • The determination means 1006 may cause the nonvolatile storage unit 144 to store data indicating the index V1 and the determination result based on the index V1, or may cause the display unit 130 to display the index V1 and the determination result based on the index V1.
  • FIG. 3 is a flowchart showing the flow of this financial analysis method. As shown in FIG. 3, this financial analysis method includes acquisition processing SA110, extraction processing SA120, and determination processing SA130.
  • In the acquisition process SA110, the control unit 100 functions as the acquisition means 1002.
  • The control unit 100 acquires financial data relating to a plurality of accounting periods for the target business entity.
  • In the extraction process SA120, the control unit 100 functions as the extraction means 1004.
  • The control unit 100 generates, for each account item extracted based on the financial data acquired in the acquisition process SA110, transition information indicating changes in the values related to that account item over multiple accounting periods.
  • In the determination process SA130, the control unit 100 functions as the determination means 1006.
  • The control unit 100 first generates the data sets (that is, N data sets) to be input one by one into the weak learners 1006b(1) to 1006b(N), based on the transition information generated in the extraction process SA120.
  • The control unit 100 then inputs the N data sets one by one into the weak learners 1006b(1) to 1006b(N), bags the outputs of the weak learners 1006b(1) to 1006b(N), and calculates an index indicating inappropriateness for each account item.
  • FIG. 4 shows an example of the calculated index for each account item and the total value of the index (estimated value indicating the probability that fraud is presumed).
  • The total of the SHAP values is calculated as a total risk score between 0 and 1 by logit conversion; the higher the value, the greater the possibility that the accounting treatment of the target business is inappropriate. From the same figure, it can be seen that the inappropriateness of indicators related to inventory, such as "inventory turnover period" and "inventory-to-book ratio", is a factor pushing up the total of the indicators. Auditors who look at these results can therefore decide to focus on checking these items, since overstatement of inventory is suspected. In addition, overstatement of inventories leads to inflated profits, and in this example the SHAP values of the profit-related indicators are also high, which provides evidence supporting the judgment that inventories may be overstated.
  • An index indicating the inappropriateness of accounting processing is calculated for each account item by using machine learning to analyze the change in value for each account item over multiple accounting periods. For example, if the value of a specific account item has suddenly decreased in the most recent year (for example, by more than a predetermined threshold compared to the previous year), or conversely, if the value of a specific account item is so constant across multiple accounting periods as to seem unnatural, this is reflected in the SHAP value for that account item.
  • The preprocessing means 1006a may correct the transition information based on market data representing market conditions, and generate the plurality of data sets based on the corrected transition information.
  • Market conditions refer to buying and selling conditions in various markets such as stock markets or commodity markets.
  • Market data is data representing one or more of the following: a stock price index, the domestic corporate goods price index, the exchange rate against the US dollar (or the average value of that exchange rate), the credit growth rate, the total of cash currency (banknotes issued + money in circulation) and deposits held at domestic banks and the like, lending interest rates, and long-term bond yields.
  • a stock price index is the rate of change in stock prices from the previous year.
  • the domestic corporate goods price index is the rate of change in the total added value of newly produced goods and services within the country.
  • the preprocessing means 1006a may correct the transition information based on the business type, the industry to which the business belongs, and other attributes, instead of or in addition to the market data. Since the characteristics and trends of financial data can vary depending on the business content, this is expected to improve the accuracy of the determination.
  • the determination means 1006 may calculate, based on the index calculated for each account item, a SHAP value for each account item indicating the degree of contribution of that account item to the inappropriateness.
  • the SHAP value may be stored in the nonvolatile storage section 144 and displayed on the display section 130.
  • Although the acquisition means 1002, the extraction means 1004, and the determination means 1006 in the above embodiment were software modules, any one, several, or all of them may be hardware modules such as an ASIC. Even if one, several, or all of the acquisition means 1002, the extraction means 1004, and the determination means 1006 are hardware modules, the same effects as in the above embodiment can be achieved.
  • the financial analysis program PA that causes a computer such as a CPU to execute the financial analysis method of the present invention is stored in advance in the storage unit 140 of the information processing device 10.
  • the financial analysis program PA may be manufactured alone or transferred (that is, provided) for a fee or free of charge.
  • Specific ways of providing the financial analysis program PA include writing it to a computer-readable recording medium such as a flash ROM and distributing that medium, and distributing it by download via a telecommunications line such as the Internet. By operating a general computer according to the financial analysis program PA distributed in these ways, the computer can be made to execute the financial analysis method of the present invention, and the same effects as in the above embodiment can be obtained.
  • In the information processing system, it is only necessary that the following steps are performed: a step of acquiring financial data relating to a plurality of accounting periods for one business operator; a step of extracting, for each account item generated based on the acquired financial data, transition information indicating the transition of the value related to that account item; and a step of determining the inappropriateness of the accounting process by the one business operator based on the generated information.
  • 10... Information processing device, 100... Control unit, 110... Communication I/F unit, 120... Operation input unit, 130... Display unit, 140... Storage unit, 150... Bus, 1002... Acquisition means, 1004... Extraction means, 1006... Determination means, 1006a... Preprocessing means, 1006b, 1006b(1) to 1006b(N)... Weak learner, 1006c... Bagging means, PA... Financial analysis program.
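The threshold-based determination and the total risk score described in the embodiment above can be sketched as follows. This is only an illustration under stated assumptions: the source does not give the formula behind the "logit conversion", so the sketch assumes that the per-item SHAP values are in log-odds units and that the conversion is the logistic function. The function names, the `base_value` parameter, the threshold of 0.5, and the sample values are all hypothetical.

```python
import math
from typing import Dict


def total_risk_score(shap_values: Dict[str, float], base_value: float = 0.0) -> float:
    """Map the sum of per-account-item SHAP values to a score in (0, 1).

    Assumption: the values are log-odds contributions, so the logistic
    function is used for the conversion.
    """
    z = base_value + sum(shap_values.values())
    return 1.0 / (1.0 + math.exp(-z))


def is_high_risk(score: float, threshold: float = 0.5) -> bool:
    """Flag the case when the score exceeds a predetermined threshold."""
    return score > threshold


# Hypothetical SHAP values per account item (not taken from the source).
example_shap = {
    "inventory turnover period": 1.2,
    "inventory-to-book ratio": 0.8,
    "sales": -0.5,
}
score = total_risk_score(example_shap)
flagged = is_high_risk(score)
```

Keeping the per-item values alongside the total lets a reviewer see which account items push the score up, which is the point of presenting SHAP values in the first place.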

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

An information processing device 10 includes: an acquisition means 1002; an extraction means 1004; and a determination means 1006. The acquisition means 1002 acquires financial data which pertains to an entity of interest and which is from a plurality of accounting periods. The extraction means 1004 extracts, per account title generated on the basis of the financial data acquired by the acquisition means 1002, transition information indicating the transition of a value related to each account title. The determination means 1006 determines, on the basis of the transition information extracted by the extraction means 1004, whether the entity of interest has been conducting account processing improperly.

Description

Information processing device and program
The present invention relates to an information processing device that inspects accounting processing.
The number of companies engaging in inappropriate accounting, such as falsifying accounting documents, is increasing year by year. Taking into account the effects of poor business performance caused by the spread of the novel coronavirus, the risk of inappropriate accounting is expected to remain high. The need to inspect accounting documents is thus growing year by year. Meanwhile, the shortage of accountants and other personnel involved in accounting audits has become a problem. Against this background, and given that advanced data analysis technologies such as machine learning and deep learning have become widespread in recent years, it has been proposed to introduce these technologies into financial analysis in accounting audits (for example, Patent Document 1).
Japanese Patent No. 6345856
However, at present the accuracy of mechanically determining whether fraud has occurred in accounting processing is not sufficient. One reason determination accuracy does not improve is that companies committing fraud are only a small fraction of all companies; as a result, the financial data used as training data contains only a very small amount of fraudulent financial data (that is, the data that is actually wanted), which makes it difficult to improve learning efficiency. In addition, when fraud is determined to exist, there is the problem of presenting the basis for that determination objectively. Because a determination of fraud is an important matter that can be a matter of life and death for a company, it is extremely important to improve the trust that users of a machine learning system (such as accounting auditors and personnel within the company) can place in it.
An object of the present invention is to provide a technology that estimates the existence of fraud in accounting processing with high accuracy and makes it possible to objectively present the basis for determining fraud.
An information processing device according to one aspect of the present invention includes an acquisition means, an extraction means, and a determination means. The acquisition means acquires financial data relating to a plurality of accounting periods for one business operator. The extraction means extracts, for each account item generated based on the financial data acquired by the acquisition means, transition information indicating the transition of a value related to that account item. The determination means determines the inappropriateness of the accounting processing by the one business operator based on the transition information extracted by the extraction means.
According to this information processing device, when there is fraud in the accounting processing by one business operator, the fact of fraud can be estimated with high accuracy. In addition, since the basis for determining fraud can be presented objectively, subsequent manual verification becomes more efficient.
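The extraction of transition information per account item can be sketched as follows. This is a minimal sketch, assuming the financial data arrives as a mapping from accounting period to account-item values. The `FinancialData` layout, the function names, the period labels, and the sample figures are hypothetical, and the period-over-period change rate is just one possible representation of a "transition".

```python
from typing import Dict, List

# Hypothetical layout: accounting period -> {account item: value}
FinancialData = Dict[str, Dict[str, float]]


def extract_transitions(data: FinancialData) -> Dict[str, List[float]]:
    """Return, for each account item, its values ordered by accounting period."""
    periods = sorted(data)
    items = sorted({item for p in periods for item in data[p]})
    # Items missing in a period are skipped rather than imputed.
    return {
        item: [data[p][item] for p in periods if item in data[p]]
        for item in items
    }


def change_rates(series: List[float]) -> List[float]:
    """Period-over-period relative change; assumes non-zero previous values."""
    return [(cur - prev) / prev for prev, cur in zip(series, series[1:])]


# Made-up figures for one business operator over three accounting periods.
financial_data: FinancialData = {
    "FY2019": {"sales": 100.0, "inventories": 20.0},
    "FY2020": {"sales": 110.0, "inventories": 22.0},
    "FY2021": {"sales": 121.0, "inventories": 44.0},
}
transitions = extract_transitions(financial_data)
rates = {item: change_rates(values) for item, values in transitions.items()}
```

In this made-up data the inventories double in the last period while sales grow by about 10 percent, which is the kind of per-item movement the determination step would then evaluate.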
In a preferred embodiment, the determination means includes a preprocessing means, a plurality of weak learners, and a bagging means. The preprocessing means generates a plurality of data sets by performing undersampling based on the transition information extracted by the extraction means. The plurality of data sets output from the preprocessing means are respectively input to the plurality of weak learners. These weak learners construct a learning model for outputting an index indicating the inappropriateness, based on the financial data of a plurality of business operators. The bagging means calculates the index indicating inappropriateness by bagging the outputs of the plurality of weak learners.
According to this aspect, the fact of fraud can be estimated with even higher accuracy. Furthermore, it is possible to provide a learning method suited to situations where training data from fraud cases is difficult to collect.
In a preferred embodiment of the information processing device according to the present invention, the plurality of weak learners operate based on gradient boosting.
According to this aspect, the accuracy of determining whether there is fraud in accounting processing is improved.
In a preferred embodiment, the preprocessing means generates data by adding, to the transition information extracted by the extraction means, feature quantities obtained by inputting the financial data into a plurality of learning models different from the above learning model. The preprocessing means then generates the plurality of data sets by undersampling the generated data.
According to this aspect, the accuracy of determining whether there is fraud in accounting processing is improved.
In a preferred embodiment, the determination means calculates, based on the index calculated for each account item, a SHAP value indicating the degree of contribution of each account item.
According to this aspect, the basis for determining fraud is visualized.
In a preferred embodiment, the preprocessing means corrects the transition information based on market data.
According to this aspect, the transition information that serves as the basis for determining inappropriateness in accounting processing is corrected based on market data, so the determination accuracy is improved.
In another aspect, the present invention provides a program for causing a computer to execute the steps of: acquiring financial data relating to a plurality of accounting periods for one business operator; extracting, for each account item generated based on the acquired financial data, transition information indicating a change in value related to that account item; and determining inappropriateness in the accounting processing by the one business operator based on the generated information.
FIG. 1 is a diagram showing a configuration example of an information processing device 10 according to an embodiment of the present invention.
FIG. 2 is a functional block diagram showing the functions realized by the control unit 100 of the information processing device 10 according to the financial analysis program PA.
FIG. 3 is a flowchart showing the flow of the financial analysis method executed by the control unit 100 of the information processing device 10 according to the financial analysis program PA.
FIG. 4 is a diagram showing an example of SHAP values displayed on the display unit 130 of the information processing device 10.
Various technically preferable limitations are attached to the embodiments described below. However, embodiments of the present invention are not limited to the forms described below.
A. Embodiment
FIG. 1 is a diagram showing a configuration example of an information processing device 10 according to an embodiment of the present invention. The information processing device 10 is, for example, a computer device owned and managed by an audit firm, although who administers or uses the computer device does not matter. The information processing device 10 in this embodiment is, for example, a personal computer. As shown in FIG. 1, the information processing device 10 includes a control unit 100, a communication I/F unit 110, an operation input unit 120, a display unit 130, a storage unit 140, and a bus 150 that mediates data exchange between these components.
The control unit 100 is, for example, a CPU (Central Processing Unit). The control unit 100 functions as the control center of the information processing device 10 by executing the financial analysis program PA stored in the storage unit 140.
The communication I/F unit 110 is connected wirelessly or by wire to a telecommunications line such as the Internet. The communication I/F unit 110 receives data sent via the telecommunications line and passes the received data to the control unit 100. The communication I/F unit 110 also sends data supplied by the control unit 100 to the telecommunications line.
The operation input unit 120 includes a pointing device such as a mouse, a keyboard, or both. When the user of the information processing device 10 operates the operation input unit 120, the operation input unit 120 outputs operation content data indicating the content of the operation to the control unit 100. The content of the user's operation is thereby conveyed to the control unit 100.
The display unit 130 is a display device including a liquid crystal panel and its drive circuit. The display unit 130 displays various images under the control of the control unit 100.
The storage unit 140 is a storage device that includes a volatile storage unit 142 and a nonvolatile storage unit 144. The volatile storage unit 142 is, for example, a RAM (Random Access Memory) and is used by the control unit 100 as a work area when executing various programs. The nonvolatile storage unit 144 is, for example, a hard disk. Various programs and various data are stored (installed) in the nonvolatile storage unit 144 in advance.
Examples of the programs stored in advance in the nonvolatile storage unit 144 are a kernel program that causes the control unit 100 to implement an OS (Operating System), and the financial analysis program PA. The kernel program is not shown in FIG. 1. The financial analysis program PA is a program that causes the control unit 100 to execute the financial analysis method according to the present invention.
When the power supply of the information processing device 10 (not shown in FIG. 1) is turned on, the control unit 100 reads the kernel program from the nonvolatile storage unit 144 into the volatile storage unit 142 and starts executing it. When the control unit 100, operating according to the kernel program, receives from the operation input unit 120 operation content data instructing it to start executing another program, it reads that program from the nonvolatile storage unit 144 into the volatile storage unit 142 and starts executing it. The following description focuses on the case where an operation instructing execution of the financial analysis program PA is performed on the operation input unit 120.
Upon receiving from the operation input unit 120 operation content data instructing the start of execution of the financial analysis program PA, the control unit 100 reads the financial analysis program PA from the nonvolatile storage unit 144 into the volatile storage unit 142 and starts executing it. The control unit 100 operating according to the financial analysis program PA realizes the functions of the present invention.
FIG. 2 is a functional block diagram showing the functions that the control unit 100 realizes according to the financial analysis program PA. As shown in FIG. 2, the control unit 100 operating according to the financial analysis program PA functions as an acquisition means 1002, an extraction means 1004, and a determination means 1006. In other words, the acquisition means 1002, the extraction means 1004, and the determination means 1006 shown in FIG. 2 are software modules realized by operating a computer such as a CPU according to software (the financial analysis program PA). The functions of the acquisition means 1002, the extraction means 1004, and the determination means 1006 are as follows.
The acquisition means 1002 acquires financial data D1 covering a plurality of accounting periods, for example five periods, for one business operator that is subject to risk management (hereinafter, the target business operator). A business operator here may be a corporation or other organization, a sole proprietor, or any other entity that conducts some kind of business activity and prepares and manages accounting information; it need not match any legal definition or classification. A so-called corporate group, including consolidated subsidiaries and the like, may also be regarded as one business operator. In short, the business entities involved in the transactions recorded in one set of financial data D1 can be regarded as one business operator.
The acquisition means 1002 acquires the financial data D1 from the server that handles financial management at the target business operator by communicating with that server via the communication I/F unit 110. The financial data for one accounting period typically represents a financial statement listing the value, for that period, of each account item such as cost of sales, sales, and inventories. In other words, the financial data D1 represents the financial statements for each of a plurality of accounting periods. However, the financial data D1 need not reproduce the contents of the books or financial statements prepared by the business operator as they are; it only needs to be created based on such records. For example, even when the financial statements contain only the account items of sales and profit, the financial data D1 may include information generated from multiple account items, such as the profit margin on sales.
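As a non-limiting illustration, a derived item of this kind might be computed as sketched below. The function name `add_profit_margin` and the keys `sales`, `profit`, and `profit_margin` are assumptions introduced only for this sketch and do not appear in the embodiment.

```python
def add_profit_margin(statement):
    """Return a copy of one period's statement with a derived 'profit_margin' item.

    `statement` is a hypothetical dict {account_item: value} for one accounting period.
    """
    derived = dict(statement)
    sales = statement["sales"]
    # Leave the derived item undefined (None) when no sales were recorded.
    derived["profit_margin"] = statement["profit"] / sales if sales else None
    return derived


# Hypothetical figures for one accounting period.
period = {"sales": 1000.0, "profit": 80.0}
enriched = add_profit_margin(period)
```

The original statement is left unmodified so that the recorded figures and the derived figures remain distinguishable.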
The financial data D1 may be in a format required by a supervisory authority such as the Financial Services Agency or by a legally designated institution such as a stock exchange, or it may be in a format unique to the target business operator. In short, the financial data D1 may be any ledger that records the details of past transactions conducted by the business operator and consists of headings (account items, presentation items) and their contents, regardless of whether it is public or private. The financial data D1 may or may not be associated with information indicating whether fraud was actually discovered.
An accounting period does not necessarily have to coincide with the one required by the business operator's articles of incorporation, by law, or the like; it only needs to be a period defined as a division of the above books.
If the financial data D1 is published by a supervisory authority or a stock exchange, the acquisition means 1002 may acquire the financial data D1 from the supervisory authority, stock exchange, or the like rather than from the target business operator's server.
The extraction means 1004 extracts from the financial data D1, for each account item extracted based on the financial data D1 acquired by the acquisition means 1002, transition information D2 indicating the change in value over a plurality of accounting periods. For example, if the financial statements of the target business operator contain M account items (M is an integer of 2 or more), the extraction means 1004 extracts at most M account items. When M account items are extracted, the transition information D2 represents, for each of the M account items, the change in value over the plurality of accounting periods.
So that the basis for a fraud determination can be presented at a later date, the extraction means 1004 may store the extracted transition information D2 in the nonvolatile storage unit 144. Based on the transition information stored in the nonvolatile storage unit 144, the control unit 100 may cause the display unit 130 to display, for example, a graph showing the change in value of each account item over the plurality of accounting periods of the target business operator.
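The extraction described above can be pictured with the following minimal sketch. It assumes, purely for illustration, that the financial data D1 is held as a list of per-period dictionaries ordered from oldest to newest; the function names `extract_transitions` and `year_on_year` and the sample figures are hypothetical.

```python
def extract_transitions(financial_data):
    """Build {account_item: [value per period]} from a list of per-period dicts.

    Only account items present in every period are kept, so at most M items
    are extracted when the statements contain M account items.
    """
    common = set(financial_data[0])
    for period in financial_data[1:]:
        common &= set(period)
    return {item: [p[item] for p in financial_data] for item in sorted(common)}


def year_on_year(series):
    """Relative change between consecutive periods (None when the base is zero)."""
    return [(cur - prev) / prev if prev else None
            for prev, cur in zip(series, series[1:])]


# Hypothetical financial data D1 for three accounting periods, oldest first.
d1 = [
    {"sales": 100.0, "inventory": 20.0},
    {"sales": 110.0, "inventory": 30.0},
    {"sales": 121.0, "inventory": 45.0},
]
d2 = extract_transitions(d1)           # transition information per account item
inventory_growth = year_on_year(d2["inventory"])
```

Whether the transition information is kept as raw values, as period-to-period changes, or both is a design choice that the embodiment leaves open.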
The determination means 1006 determines inappropriateness in the accounting processing of the target business operator based on the transition information D2 generated by the extraction means 1004. As shown in FIG. 2, the determination means 1006 in this embodiment includes a preprocessing means 1006a, weak learners 1006b(1) to 1006b(N), and a bagging means 1006c, where N is an integer of 2 or more. Below, when the weak learners 1006b(1) to 1006b(N) need not be distinguished from one another, they are referred to as weak learners 1006b.
As described above, the determination means 1006 has a plurality of weak learners 1006b. A weak learner is an AI model that cannot be expected to achieve high prediction accuracy on its own. A specific example of a weak learner is a decision tree, which is a tree structure for deriving a conclusion about the target value of an item from observations of that item. In this embodiment, a learning model (prediction model) that outputs, for each account item, an index indicating the inappropriateness of accounting processing is constructed as an ensemble of the weak learners 1006b(1) to 1006b(N), based on the financial data of a plurality of business operators other than the target business operator.
More specifically, the weak learners 1006b(1) to 1006b(N) in this embodiment function as the above learning model by operating based on gradient boosting. Gradient boosting here refers to a method that randomly draws partial data from the training data, constructs models (decision trees) sequentially, and takes a weighted majority vote of the predicted values of the models as the final predicted value.
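For orientation only, the sequential construction of models can be sketched as follows. The sketch shows gradient boosting in its common textbook form for squared-error loss, in which each depth-one tree (stump) is fitted to the residuals of the models built so far; the random subsampling mentioned above is omitted for brevity. The names `fit_stump` and `gradient_boost`, the learning rate, and the sample data are assumptions of this sketch, not details of the embodiment.

```python
def fit_stump(xs, residuals):
    """Fit a depth-one tree on a 1-D feature by minimising squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate thresholds between distinct values
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Build stumps one after another, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Hypothetical one-feature training data: label 1 marks a known fraud case.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0.0, 0.0, 1.0, 1.0]
model = gradient_boost(xs, ys)
```

Each stump is weak on its own; the accuracy comes from summing many of them, each scaled by the learning rate.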
The preprocessing means 1006a generates a plurality of data sets based on the transition information D2 extracted by the extraction means 1004. More specifically, the preprocessing means 1006a first generates data in which feature quantities, obtained by inputting the financial data D1 into a plurality of learning models different from the learning model constructed by the weak learners 1006b(1) to 1006b(N), are added to the transition information D2. Specific examples of these learning models in this embodiment include Isolation Forest and the CFO-modified Jones model. Next, the preprocessing means 1006a undersamples this data to generate the plurality of data sets, namely the data sets D3(1) to D3(N). Undersampling means randomly extracting Y pieces of data (Y is a positive integer not greater than X) from X pieces of data (X is an integer of 2 or more). In this embodiment, the preprocessing means 1006a performs undersampling N times to generate the data sets D3(1) to D3(N), and the data extracted in different rounds may partially overlap. During undersampling, the data sets D3(1) to D3(N) are always made to contain all of the transition information D2 for fraud cases from past fiscal years.
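The undersampling step can be sketched as follows, under the assumption that every data set keeps all known fraud records and that only the non-fraud records are randomly thinned out. The function name `undersample`, the record layout, and the parameter `n_normal` are hypothetical and serve only to make the sketch runnable.

```python
import random


def undersample(records, n_datasets, n_normal, seed=0):
    """Generate n_datasets data sets from labelled records.

    records: list of (features, label) pairs, label 1 = known fraud, 0 = normal.
    Each data set holds every fraud record plus n_normal randomly drawn normal
    records, so the scarce fraud cases are never discarded.
    """
    rng = random.Random(seed)
    fraud = [r for r in records if r[1] == 1]
    normal = [r for r in records if r[1] == 0]
    datasets = []
    for _ in range(n_datasets):
        # Sampling is without replacement inside one data set; different
        # data sets may share some of the same normal records.
        datasets.append(fraud + rng.sample(normal, n_normal))
    return datasets


# Hypothetical records: 2 fraud cases among 22 business operators.
records = [({"id": i}, 1 if i < 2 else 0) for i in range(22)]
d3 = undersample(records, n_datasets=5, n_normal=4)
```

Drawing a different random subset of normal records for each data set is what gives the N weak learners different views of the majority class.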
Next, the preprocessing means 1006a inputs the data sets D3(1) to D3(N) to the weak learners 1006b(1) to 1006b(N), one data set per learner. Specifically, the preprocessing means 1006a inputs the data set D3(n) to the weak learner 1006b(n).
The bagging means 1006c performs bagging (weighted majority voting in this embodiment) on the outputs that the weak learners 1006b(1) to 1006b(N) produce in response to the data sets D3(1) to D3(N), thereby calculating an index V1 indicating inappropriateness for each account item. When M account items are extracted from the financial data D1, the bagging means 1006c calculates one index V1 for each of the M account items, that is, M indices V1. The determination means 1006 determines, based on the index V1, inappropriateness in accounting processing for each target business operator and each account item. For example, if the index calculated by the bagging means 1006c exceeds a predetermined threshold, the determination means 1006 determines that inappropriate accounting processing is likely to have been performed. The determination means 1006 may store the index V1 and data indicating the determination result based on the index V1 in the nonvolatile storage unit 144, and may display the index V1 and the determination result on the display unit 130.
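A weighted majority vote followed by a threshold test might look like the sketch below. The function names `weighted_vote` and `judge`, the choice of weights, and the threshold of 0.5 are assumptions of this sketch; the embodiment does not specify how the weights or the threshold are chosen.

```python
def weighted_vote(predictions, weights):
    """Weighted share of weak learners voting 'inappropriate' (a value in [0, 1]).

    predictions: 0/1 outputs of the N weak learners for one account item.
    weights: one positive weight per weak learner (hypothetical values).
    """
    total = sum(weights)
    return sum(w for p, w in zip(predictions, weights) if p == 1) / total


def judge(index_v1, threshold=0.5):
    """Flag the account item when the index exceeds the predetermined threshold."""
    return index_v1 > threshold


# Three hypothetical weak learners; the third is trusted twice as much.
v1 = weighted_vote([1, 0, 1], [1.0, 1.0, 2.0])
flagged = judge(v1)
```

Applying the same vote to each of the M account items yields the M indices V1 described above.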
The control unit 100 operating according to the financial analysis program PA executes the financial analysis method according to the present invention. FIG. 3 is a flowchart showing the flow of this financial analysis method. As shown in FIG. 3, the financial analysis method includes an acquisition process SA110, an extraction process SA120, and a determination process SA130.
In the acquisition process SA110, the control unit 100 functions as the acquisition means 1002 and acquires financial data relating to a plurality of accounting periods for the target business operator.
In the extraction process SA120, which follows the acquisition process SA110, the control unit 100 functions as the extraction means 1004. In the extraction process SA120, the control unit 100 generates, for each account item extracted based on the financial data acquired in the acquisition process SA110, transition information indicating the change in value of that account item over the plurality of accounting periods.
In the determination process SA130, which follows the extraction process SA120, the control unit 100 functions as the determination means 1006. In the determination process SA130, the control unit 100 first generates, based on the transition information generated in the extraction process SA120, the data sets to be input one each to the weak learners 1006b(1) to 1006b(N) (that is, N data sets). Next, the control unit 100 inputs the N data sets one each to the weak learners 1006b(1) to 1006b(N), bags the outputs of the weak learners 1006b(1) to 1006b(N), and calculates the index indicating inappropriateness for each account item. FIG. 4 shows an example of the calculated index for each account item and the total of the indices (an estimate of the likelihood of fraud).
In FIG. 4, the larger the SHAP value of an account item, the more closely its movement resembles that of past fraud cases. The total of the SHAP values is converted by a logit transformation into a total risk score between 0 and 1; the larger this score, the more likely it is that the accounting processing of the target business operator is inappropriate. The figure shows that indicators related to inventories, such as "inventory turnover period" and "ratio of inventories to net assets", have high inappropriateness and are the factors pushing up the total. An auditor or the like who sees this result can therefore suspect an overstatement of inventories and decide to check these items with priority. In addition, an overstatement of inventories leads to inflated profits, and in this example the SHAP values of the profit-related indicators are indeed high as well, which supports the judgment that inventories may be overstated.
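The conversion of the SHAP total into a score between 0 and 1 can be sketched as follows. The sketch assumes that the SHAP values are expressed in log-odds and that the transformation in question is the logistic (sigmoid) function; the function name `risk_score`, the `base_value` parameter, and the sample values are hypothetical and are not taken from FIG. 4.

```python
import math


def risk_score(shap_values, base_value=0.0):
    """Map base value + sum of per-item SHAP values (log-odds) into (0, 1)."""
    log_odds = base_value + sum(shap_values.values())
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical per-item SHAP values (not the values shown in FIG. 4).
shap_by_item = {
    "inventory_turnover_period": 1.2,
    "inventory_to_net_assets": 0.8,
    "sales_growth": -0.5,
}
score = risk_score(shap_by_item)
# Items ordered from the largest upward contribution to the smallest.
ranked = sorted(shap_by_item, key=shap_by_item.get, reverse=True)
```

Ranking the items by SHAP value reproduces the kind of prioritised checklist that the auditor is described as using.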
According to the above embodiment, an index indicating the inappropriateness of accounting processing is calculated for each account item by using machine learning to analyze the change in value of each account item over a plurality of accounting periods. For example, if the value of a particular account item has changed sharply in the most recent year (for example, by at least a predetermined threshold year on year), or conversely, if a particular account item has remained so constant over a plurality of accounting periods as to appear unnatural, this is reflected in the SHAP value for that account item.
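The two patterns mentioned above can be made concrete with the simple checks below. These are hand-written rules given purely for illustration; in the embodiment such patterns are learned from data rather than hard-coded. The function names and the thresholds of 0.5 and 0.01 are assumptions of this sketch.

```python
def sharp_change(series, threshold=0.5):
    """True when the latest year-on-year change is at least `threshold` (relative)."""
    prev, cur = series[-2], series[-1]
    return prev != 0 and abs(cur - prev) / abs(prev) >= threshold


def unnaturally_flat(series, tolerance=0.01):
    """True when every value stays within `tolerance` (relative) of the mean."""
    mean = sum(series) / len(series)
    return mean != 0 and all(abs(v - mean) / abs(mean) <= tolerance for v in series)


# Hypothetical five-period series for two account items.
sales = [100.0, 102.0, 101.0, 100.0, 40.0]      # drops sharply in the last period
margin = [100.0, 100.5, 99.8, 100.2, 100.0]     # suspiciously constant
```

Either pattern alone is not proof of fraud; it only marks the account item as worth a closer look.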
Specifically, combining undersampling with ensemble learning (bagging) of weak learners improves learning efficiency even in the situation, peculiar to this field, in which only a very small fraction of the large volume of financial data used as training data involves fraud. As a result, the inappropriateness of accounting processing can be estimated with high accuracy. In addition, because the basis for a fraud determination is presented as a SHAP value for each account item, objectivity and reliability are improved.
When an auditor or the like re-verifies the results of the system according to the present invention for the financial data D1, the points to check (which account items should be checked first) can be narrowed down in advance, compared with the case where the system is not used, so verification work is expected to become more efficient.
B. Other Examples
The embodiment described above may be modified as follows.
When generating the plurality of data sets based on the transition information extracted by the extraction means 1004, the preprocessing means 1006a may correct the transition information based on market data representing market conditions and generate the plurality of data sets based on the corrected transition information. Market conditions refer to the state of trading in various markets such as stock markets or commodity markets.
The market data represents one or more of the following: a stock price index, the domestic corporate goods price index, the exchange rate against the US dollar (or the average of that exchange rate), the credit growth rate, the sum of cash currency (banknotes issued plus coins in circulation) and deposits held at domestic banks and the like, the lending interest rate, and the long-term bond yield. Here, the stock price index is the year-on-year rate of change in stock prices. The domestic corporate goods price index is the rate of change in the total added value of goods and services newly produced within the country.
Reflecting market conditions in the index indicating the inappropriateness of accounting processing has the following effect. Suppose, for example, that sales in the financial data D1 have fallen sharply over the most recent year while the economy has also deteriorated rapidly over the same year. The learning results can then lead to the conclusion that the fall in sales is attributable to the economic downturn and is not direct evidence of fraud. Conversely, if the economy had been strong over that year, the expected result would be that the fall in sales is attributable to fraudulent accounting such as falsification of books or false reporting.
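The embodiment does not specify how the correction is computed. One simple possibility, shown here only as an assumption, is to subtract a market-wide growth rate from the growth rate of the account item so that only the company-specific movement remains. The function name `market_adjusted_growth` and the figures are hypothetical.

```python
def market_adjusted_growth(account_growth, market_growth):
    """Subtract the market-wide growth rate from each period's account growth."""
    return [a - m for a, m in zip(account_growth, market_growth)]


# Hypothetical case: sales fell by 25% in the most recent year.
sales_growth = [-0.25]

# In a downturn of the same size, nothing company-specific remains.
in_recession = market_adjusted_growth(sales_growth, [-0.25])

# In a boom, the same fall stands out even more strongly.
in_boom = market_adjusted_growth(sales_growth, [0.25])
```

Supplying the market data as additional input features and letting the learning model discover the relationship would be an alternative to an explicit subtraction of this kind.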
Instead of or in addition to the market data, the preprocessing means 1006a may correct the transition information based on the business operator's type of industry, business format, the industry to which it belongs, or other attributes. Because the characteristics and trends of financial data can differ depending on the nature of the business, this is expected to improve the determination accuracy.
The determination means 1006 may calculate, for each account item, a SHAP value indicating the degree to which that account item contributes to the inappropriateness, based on the index calculated for each account item. In this case, the SHAP values may be stored in the nonvolatile storage unit 144 and displayed on the display unit 130.
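SHAP values are based on Shapley values, which divide a model's output among its input features. The brute-force sketch below computes exact Shapley values by averaging each feature's marginal contribution over every ordering of the features. It is an illustration of the underlying idea only: practical SHAP implementations use much faster algorithms, and the scoring function `toy_score`, the feature names, and the feature values are hypothetical stand-ins for the learned model.

```python
from itertools import permutations


def shapley_values(features, score):
    """Exact Shapley values of `score` for the given feature dict.

    `score` takes a dict containing any subset of the features; a feature
    absent from the dict is treated as missing.
    """
    names = list(features)
    contrib = {n: 0.0 for n in names}
    orders = list(permutations(names))
    for order in orders:
        present = {}
        prev = score(present)
        for n in order:
            present[n] = features[n]
            cur = score(present)
            contrib[n] += cur - prev   # marginal contribution in this ordering
            prev = cur
    return {n: c / len(orders) for n, c in contrib.items()}


def toy_score(x):
    """Hypothetical inappropriateness score with an interaction term."""
    inv = x.get("inventory_turnover", 0.0)
    pm = x.get("profit_margin", 0.0)
    return 2.0 * inv + 1.0 * pm + inv * pm


features = {"inventory_turnover": 1.0, "profit_margin": 2.0}
values = shapley_values(features, toy_score)
```

The values always sum to the difference between the score with all features present and the score with none, which is what allows them to be read as per-item contributions.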
In the above embodiment, the acquisition means 1002, the extraction means 1004, and the determination means 1006 are software modules, but any one, several, or all of them may be hardware modules such as ASICs. Even in that case, the same effects as in the above embodiment are obtained.
 In the above embodiment, the financial analysis program PA, which causes a computer such as a CPU to execute the financial analysis method of the present invention, is stored in advance in the storage unit 140 of the information processing device 10. However, the financial analysis program PA may also be manufactured on its own, or transferred (that is, provided) for a fee or free of charge. Specific modes of providing the financial analysis program PA include writing it to a computer-readable recording medium such as a flash ROM and distributing that medium, and distributing it by download over a telecommunication line such as the Internet. Operating a general-purpose computer in accordance with the financial analysis program PA distributed in these ways makes it possible to cause that computer to execute the financial analysis method of the present invention, and the same effects as in the above embodiment are obtained.
 In short, it suffices that the information processing system according to the present invention executes: a step of acquiring financial data relating to a plurality of accounting periods for one business operator; a step of extracting, for each account item generated based on the acquired financial data, transition information indicating the transition of the value of that account item; and a step of determining, based on the generated information, inappropriateness in the accounting processing by the one business operator.
10…information processing device, 100…control unit, 110…communication I/F unit, 120…operation input unit, 130…display unit, 140…storage unit, 150…bus, 1002…acquisition means, 1004…extraction means, 1006…determination means, 1006a…preprocessing means, 1006b, 1006b(1) to 1006b(N)…weak learner, 1006c…bagging means, PA…financial analysis program.

Claims (7)

  1.  An information processing device comprising:
     acquisition means for acquiring financial data relating to a plurality of accounting periods for one business operator;
     extraction means for extracting, for each account item generated based on the acquired financial data, transition information indicating the transition of a value of the account item; and
     determination means for determining, based on the generated information, inappropriateness in accounting processing by the one business operator.
  2.  The information processing device according to claim 1, wherein the determination means comprises:
     preprocessing means for performing undersampling based on the extracted transition information to generate a plurality of datasets;
     a plurality of weak learners to which the plurality of datasets output from the preprocessing means are respectively input, the weak learners being for constructing a learning model that outputs an index indicating the inappropriateness based on the financial data of a plurality of business operators; and
     bagging means for bagging outputs from the plurality of weak learners to calculate the index indicating the inappropriateness.
  3.  The information processing device according to claim 2, wherein the plurality of weak learners operate based on gradient boosting.
  4.  The information processing device according to claim 3, wherein the preprocessing means
     generates data in which feature quantities, obtained by inputting the financial data into a plurality of learning models different from the learning model, are added to the extracted transition information, and generates the plurality of datasets by performing undersampling on the generated data.
  5.  The information processing device according to any one of claims 1 to 4, wherein the determination means calculates a SHAP value indicating a degree of contribution of each account item, based on the index calculated for each account item.
  6.  The information processing device according to any one of claims 2 to 5, wherein the preprocessing means corrects the transition information based on market data.
  7.  A program for causing a computer to execute:
     a step of acquiring financial data relating to a plurality of accounting periods for one business operator;
     a step of extracting, for each account item generated based on the acquired financial data, transition information indicating the transition of a value of the account item; and
     a step of determining, based on the generated information, inappropriateness in accounting processing by the one business operator.
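
The arrangement recited in claim 2 (undersampling into a plurality of datasets, one weak learner per dataset, and bagging of the outputs) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: a one-feature decision stump stands in for the gradient-boosting learners of claim 3, the two features are hypothetical growth rates, and the data are synthetic.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (feature vector, label); label 1 = inappropriate


def undersample(data: Sequence[Sample], n_sets: int, rng: random.Random) -> List[List[Sample]]:
    """Build n_sets balanced datasets by sampling the majority class down."""
    pos = [s for s in data if s[1] == 1]
    neg = [s for s in data if s[1] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return [minority + rng.sample(majority, len(minority)) for _ in range(n_sets)]


class Stump:
    """Decision stump (one feature, one threshold): a stand-in weak learner."""

    def fit(self, data: Sequence[Sample]) -> "Stump":
        best = None
        for f in range(len(data[0][0])):
            for x, _ in data:
                t = x[f]
                for sign in (1, -1):
                    errors = sum(
                        (1 if sign * (xi[f] - t) > 0 else 0) != yi for xi, yi in data
                    )
                    if best is None or errors < best[0]:
                        best = (errors, f, t, sign)
        _, self.feature, self.threshold, self.sign = best
        return self

    def predict(self, x: List[float]) -> int:
        return 1 if self.sign * (x[self.feature] - self.threshold) > 0 else 0


def bagged_index(models: Sequence[Stump], x: List[float]) -> float:
    """Bagging: average the weak learners' outputs into one index in [0, 1]."""
    return sum(m.predict(x) for m in models) / len(models)


# Synthetic, imbalanced data: 40 ordinary operators and 5 flagged ones.
rng = random.Random(0)
normal = [([rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)], 0) for _ in range(40)]
flagged = [([rng.uniform(-0.6, -0.4), rng.uniform(-0.1, 0.1)], 1) for _ in range(5)]

datasets = undersample(normal + flagged, n_sets=7, rng=rng)
models = [Stump().fit(d) for d in datasets]

index_flagged = bagged_index(models, [-0.5, 0.0])
index_normal = bagged_index(models, [0.15, 0.0])
print(index_flagged, index_normal)
```

Each undersampled dataset keeps all samples of the rare class and a different random subset of the common class, so the ensemble sees most of the majority data without any single learner being dominated by it.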
PCT/JP2023/015855 2022-04-22 2023-04-21 Information processing device and program WO2023204288A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-070830 2022-04-22
JP2022070830A JP7216854B1 (en) 2022-04-22 2022-04-22 Information processing device and program

Publications (1)

Publication Number Publication Date
WO2023204288A1 true WO2023204288A1 (en) 2023-10-26

Family

ID=85119995

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/015855 WO2023204288A1 (en) 2022-04-22 2023-04-21 Information processing device and program

Country Status (2)

Country Link
JP (1) JP7216854B1 (en)
WO (1) WO2023204288A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7394272B2 (en) 2020-02-04 2023-12-08 株式会社Ihiインフラ建設 Backup power supply for water gates

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019067086A (en) * 2017-09-29 2019-04-25 新日本有限責任監査法人 Financial analysis device, financial analysis method, and financial analysis program
JP2021043840A (en) * 2019-09-13 2021-03-18 仰星監査法人 Accounting audit support device, accounting audit support method and accounting audit support program

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "Unbalanced data - Qiita", 5 November 2021 (2021-11-05), XP093101555, Retrieved from the Internet <URL:https://web.archive.org/web/20211105105517/https://qiita.com/tk-tatsuro/items/10e9dbb3f2cf030e2119> *
ICHIHARA, NAOTO: "Current State of FinTech × Auditing: Improper Accounting Recognized Using AI", ACCOUNTING, vol. 69, no. 6, 1 June 2017 (2017-06-01), pages 55 - 63, ISSN: 0386-4448 *
MIYAKAWA, DAISUKE: "Possibilities for AI-Based Improper Accounting Detection and Prediction: The Search for the Future of Auditing", ACCOUNTING, vol. 71, no. 11, 1 November 2019 (2019-11-01), pages 89 - 96, ISSN: 0386-4448 *

Also Published As

Publication number Publication date
JP7216854B1 (en) 2023-02-01
JP2023160443A (en) 2023-11-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23791936

Country of ref document: EP

Kind code of ref document: A1