WO2013009160A2

WO2013009160A2 - A geometric method for predicting landslide disaster

Info

Publication number: WO2013009160A2
Application number: PCT/MY2011/000204
Authority: WO
Inventors: Habibah Hj LATEH; Alireza BAHIRAIE; Anton Abdulbasah KAMIL
Original assignee: Universiti Sains Malaysia
Priority date: 2011-07-11
Filing date: 2011-09-13
Publication date: 2013-01-17
Also published as: MY185446A

Description

A Geometric Method For Predicting Landslide Disaster

Field of the Invention

[0001] The present invention relates to a method for providing a geometrical demonstration to predict a landslide disaster based on a landslide disaster prediction index (LDPI). In particular, the LDPI is adapted for carrying out a multi-dimensional analysis on real data to predict landslide occurrences in a particular land or area.

Background

[0002] Natural disasters such as earthquakes, floods and landslides have cost many lives and billions of dollars worldwide. A landslide is often described as a phenomenon in which a broad range of ground movement occurs. Such movement may occur due to various geographical (climatic, rainfall, etc.) or human (construction, deforestation, etc.) factors. Landslides often result from a combination of these factors. It is therefore important to study and analyze the various factors together.

[0003] The study and research of the various factors include collating, summarizing, analyzing and presenting data sets. Each data set comprises real data. These real data involve scores of numbers, results, data analyses, etc. that are generally dynamically changing. There are a large number of data sets, as various lands come from different areas and vary in size and primary scope. A researcher or data analyst needs to evaluate these data sets. Evaluation of these data sets may indicate whether a land is stable or unstable, or assist in predicting a landslide disaster. Various methods and statistical approaches have therefore been designed to represent these data effectively and efficiently. These statistical approaches and methods are typically known to those skilled in the art.

[0004] One such common method or approach typically used in data analysis is ratio analysis. Ratios are utilized to compare the data sets of lands and to assess the stability progress of the lands from one accounting period to the next. It is an important tool used to determine the condition of the land. However, if a ratio (r) has a data set X (as a numerator) and a data set Y (as a denominator) in its functional form, a plot of changes in X versus equivalent changes in Y in Cartesian coordinates exhibits a "heaviness" on the denominator that is characteristic of ratios. Equal but opposite changes in X and Y will not generate equal but opposite values in the ratios. Therefore, ratios tend to show a non-scaling and non-proportionality property. For example, let the ratio (r) of a pair of data sets be:

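$$r = \frac{X}{Y}$$ [Eq. 1]

wherein, $\forall\, X, Y > 0$; the function $r = \frac{X}{Y}$ can be described as heavy on its denominator. The disproportionate scaling of $r$ is demonstrated as follows:

$$\frac{\partial r}{\partial X} = \frac{1}{Y}, \qquad \frac{\partial r}{\partial Y} = -\frac{X}{Y^{2}}$$

Thus;

$$\left(\frac{\partial r}{\partial X}\right) \neq \left(\frac{\partial r}{\partial Y}\right) \;\Rightarrow\; \text{as } X \to 0 \text{ then } \ldots$$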

[0005] FIG. 1 exemplifies the rate of changes of ratios (r) with respect to a data set X and a data set Y. The rates of changes of ratios (r) are not similar. This shows that ratios (r) tend to exhibit a disproportionate scaling or a non-proportionality effect. The problem lies in the assumption underlying the use of ratios to control for size differences. Therefore, there is a need to address the proportionality between the ratio's numerator and denominator.
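The non-proportionality described above can be checked numerically. The following is a minimal sketch, not part of the described method: the function names and the sample values (X = 4, Y = 2, change of 1) are illustrative assumptions. It shows that equal and opposite changes in the numerator move r by equal and opposite amounts, while equal and opposite changes in the denominator do not.

```python
def ratio(x: float, y: float) -> float:
    """Return r = X / Y for strictly positive X and Y."""
    if x <= 0 or y <= 0:
        raise ValueError("X and Y must be strictly positive")
    return x / y


def ratio_changes(x: float, y: float, delta: float) -> dict:
    """Change in r for an equal rise and fall (delta) in X and in Y."""
    base = ratio(x, y)
    return {
        "X + delta": ratio(x + delta, y) - base,
        "X - delta": ratio(x - delta, y) - base,
        "Y + delta": ratio(x, y + delta) - base,
        "Y - delta": ratio(x, y - delta) - base,
    }


# Illustrative values only; they are not taken from the description.
for label, change in ratio_changes(x=4.0, y=2.0, delta=1.0).items():
    print(f"{label}: change in r = {change:+.4f}")
```

With these sample values the two numerator changes cancel each other, whereas the fall in Y raises r by far more than the equal rise in Y lowers it, which is the "heaviness" on the denominator.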

[0006] FIG. 2 illustrates a process diagram of an estimation method 200 for modelling and prediction testing of an observed sample data set 201 for landslide disaster prediction. The estimation method 200 for computation of stable and non-stable probabilities is proposed by selection of inputs and outputs. In landslide studies, there are two groups, stable and non-stable lands and slopes. Consequently, applying a plurality of prediction regression models to an observed data set provides a prediction fitness function. Based on these functions, it is possible to evaluate and predict the stable and non-stable status of a land. There are two types of tests for methodological validation. A first test 204 classifies the observed sample data set 201 into a training sample 202 and a validation sample 203. The validation sample 203 is a group of holdout observations, while the training sample 202 is the remaining observations. A second test 205 computes a hit rate, which is the number of correctly classified observations divided by the total number of observations.
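The two validation tests can be sketched as follows. This is a hedged illustration only: the function names, the 30% holdout fraction and the random seed are assumptions introduced here and do not come from the description.

```python
import random


def split_holdout(observations, holdout_fraction=0.3, seed=0):
    """Split observations into (training sample, validation sample).

    The validation sample is a randomly chosen group of holdout
    observations; the training sample is everything that remains.
    """
    rng = random.Random(seed)
    shuffled = list(observations)
    rng.shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def hit_rate(actual, predicted):
    """Number of correctly classified observations / total observations."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


# Hypothetical labels: 1 = non-stable slope, 0 = stable slope.
training, validation = split_holdout(range(10))
print(len(training), len(validation))
print(hit_rate([1, 0, 1, 1], [1, 0, 0, 1]))
```

A model would be fitted on the training sample only, and the hit rate would then be reported on the validation sample.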

[0007] There are many classical cross-sectional statistical methods in binomial class prediction, such as univariate analysis, risk index models, multivariate discriminant analysis and conditional probability models such as logit, probit and linear probability models. Multiple Discriminant Analysis (MDA) is a technique that is appropriate for testing hypotheses about group data means, whether for one group, two groups or multiple groups, by averaging the scores of all individual data of the sample set within each group. There could be many variables, depending on the number of groups. The mean is a calculation that represents the location that is most typical for a particular group of data. The discriminant function shows how far apart the groups may be along the dimension being tested. Once that dimension is found, it enables the researcher to compare the discriminant scores of the groups to see how they are distributed and to detect any overlap. The size of the overlap dictates how well or poorly the functions discriminate between groups. A small overlap is an indication that the discriminant performs well in separating the groups, whereas a larger overlap is a sign of a poor discriminator between groups.

[0008] Discriminant function analysis is a statistical technique used to determine which variables discriminate between two (or more) naturally existing groups. More specifically, it determines whether two (or more) groups are significantly different from each other with respect to the mean of a given variable or variables. If the means for a variable are significantly different in different groups, then this variable discriminates between the groups. The most common application of discriminant function analysis is to include many measures in order to determine the ones that discriminate between groups. This leads to building a model of how best to predict to which group a given case belongs. A model can be built systematically, where all variables are reviewed and evaluated at each step to determine which contribute the most to discriminating between groups. It can also be done backwards, where all variables are included in the model and, at each step, the variable that contributes the least to the predictive model is eliminated.

[0009] For model building, DA takes the following form:

[Eq. 2]

which is based on a stepwise approach to select the best discriminating variables that could predict stability or non-stability of lands and slopes. For exposition, the simple linear regression method is extended to $Z_i = \beta x_i + u_i$. $Z_i$ will be considered non-stable if $Z_i > 0$, and stable otherwise. Here, $x_i$ denotes the landslide data of the $i$-th observation, $u_i$ denotes the $i$-th error term, and the range of $Z_i$ is from $-\infty$ to $+\infty$. The coefficients are found by taking the inverse of the within-subjects covariance matrix $W$ and multiplying it by the predictor means:

[Eq. 3]

[0010] To create cross-matrices for between-group differences and within-group differences, $SS_{total} = SS_{bg} + SS_{wg}$ [Eq. 4]. The determinants are calculated for these matrices and used to calculate a test statistic, either Wilks' Lambda or Pillai's Trace. Wilks' Lambda follows the equation:
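For reference, the textbook definition of Wilks' Lambda, written here with the cross-product matrices of Eq. 4, is sketched below; that this is the exact form intended in the description is an assumption.

$$\Lambda = \frac{\det(SS_{wg})}{\det(SS_{bg} + SS_{wg})} = \frac{\det(SS_{wg})}{\det(SS_{total})}$$

Values of $\Lambda$ close to 0 indicate well separated groups, while values close to 1 indicate little separation between groups.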

Although it is the most used technique in the literature, discriminant analysis has certain problems in terms of the assumptions on which it is based. The first assumption is that ratios as independent variables are normally distributed, and the second assumption is that the ratios of stability and non-stability of lands and slopes have the same variance and covariance matrices. Even though Altman et al. (1967) formulated a discriminant analysis that relaxes the assumption of equal variance-covariance matrices, the estimation processes are very complicated. MDA depends largely on some restrictive assumptions, including linearity, normality, independence among input variables and a pre-existing functional form relating the independent variable and the dependent variable.

[0011] To overcome the disadvantages of MDA and provide higher prediction accuracy, later studies used logit (Ohlson, 1980) or probit (Zmijewski, 1984) to construct their predictive models. Some studies comparing the logit model and discriminant analysis, such as Martin (1977) and Wiginton (1980), generally stated that the logit model is preferable to discriminant analysis.

[0012] To respond to the limitations of MDA, Ohlson (1980) applied the logit method to develop a probabilistic model of stability and non-stability of lands and slopes. In the study, he selected nine explanatory variables from the previous literature using criteria of simplicity and popularity. In logit probability models, the probability of an event for a dichotomous dependent variable is derived using the coefficients on the independent variables. The marginal effect of these coefficients may be interpreted as the effect of a unit change in an independent variable on the probability of the dependent variable. A cumulative probability distribution is necessary to constrain the predicted values within an acceptable range between 0 and 1 (Maddala and Lahiri, 2008).

[0013] The Logit model proposes that, for each trial $i$, there is a set of explanatory variables that informs the final probability. These explanatory variables may be thought of as being in a $k$-vector $X_i$, and the model then takes the following form. The logits (the natural logs of the odds) of the unknown binomial probabilities are modeled as a linear function of the $X_i$:

$$\operatorname{logit}(p_i) = \ln\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 x_{1,i} + \cdots + \beta_k x_{k,i}$$ [Eq. 7]

A particular element of $X_i$ may be set to 1 for all $i$ to yield an intercept in the model. The unknown parameters $\beta_j$ are usually estimated by maximum likelihood using a method common to all generalized linear models. The model has an equivalent formulation:

$$p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_{1,i} + \cdots + \beta_k x_{k,i})}}$$ [Eq. 8]

where the logistic function $f(x) = \frac{1}{1 + e^{-x}}$ is an analytic function in $x$. This function is preferred because its derivative is easily calculated:

$$\frac{d}{dx} f(x) = f(x)\,\bigl(1 - f(x)\bigr)$$ [Eq. 9]
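The logistic formulation can be illustrated with a short sketch. It assumes the standard logistic function f(x) = 1 / (1 + e^(-x)) and its derivative f(x)(1 - f(x)); the function names and the sample coefficients are hypothetical and are not taken from the description.

```python
import math


def logistic(x: float) -> float:
    """Logistic function f(x) = 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logistic_derivative(x: float) -> float:
    """Derivative of the logistic function: f(x) * (1 - f(x))."""
    f = logistic(x)
    return f * (1.0 - f)


def logit_probability(coefficients, features) -> float:
    """Probability from a linear predictor.

    coefficients[0] is the intercept beta_0; coefficients[1:] pair up
    with the explanatory variables in `features`.
    """
    z = coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], features))
    return logistic(z)


# Hypothetical coefficients and explanatory variables.
p_non_stable = logit_probability([-1.0, 0.8, 0.5], [2.0, 1.0])
print(p_non_stable, 1.0 - p_non_stable)
```

If the returned value is read as the probability of non-stability, its complement is the probability of stability.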

[0014] In terms of classification or prediction ability among traditional models, some studies have found the Logit model superior to MDA (Gu, 2002). In contrast, research by Aziz and Dar (2006) has shown that the two models are equally efficient. The probability and likelihood function for non-stable lands is:

$$P_i = \frac{1}{1 + e^{-(\beta x_i + u_i)}}$$ [Eq. 10]

which is a logistic distribution function of non-stable land. For ease of exposition, it can be extended to

$$P_i = \frac{1}{1 + e^{-Z_i}}$$ [Eq. 11]

where;

$$Z_i = \beta x_i + u_i$$ [Eq. 12]

If $P_i$ represents the probability of non-stability, which is given in the equation above, then $1 - P_i$ would be the probability of stability.

[0015] The probability of stability is obtained by substituting into the cumulative probability function in standardized equation format. Although there are alternative methods that are computationally more complex and sophisticated than classical statistical methods, it is unclear whether they produce better performing failure prediction models, and whether the use of the statistical techniques is valid under very restrictive assumptions (Ooghe et al., 2005). Some sophisticated alternative methods do produce better performing failure prediction, such as the fuzzy rule-based classification model, multi-logit model, dynamic event history analysis, multidimensional scaling, rough set analysis, and expert systems, specifically neural networks, genetic algorithms and genetic programming (Landajo et al., 2007).

[0016] In statistics, an outlier is an observation that is numerically distant from the rest of the data. Outliers may distort the estimation results, and some researchers approximate univariate normality by a 'trimming' method known as 'outlier deletion'. This method involves segregating outliers to obtain a normal distribution (Ezzamel et al., 1987). Atkinson and Riani (2001) and Flores and Garrido (2001) have developed the theoretical foundations as well as the algorithm to obtain a consistent estimator in the logit model with outliers, but they did not provide applied studies. If outliers indeed exist when the dependent variable is binary, the conventional logit model might be biased.

[0017] Literature on linear regression analysis that takes outliers into account has been developing since 1960. The methods of Rousseeuw (1985), such as Least Median of Squares (LMS) and Least Trimmed Squares (LTS), are now standard options in a number of econometric software packages. The literature, however, has been slow to consider outliers where the Logit model is employed, with work appearing only since 1990. Furthermore, all developments concern the theoretical derivation of outliers in the logit method, and there is a lack of applications in the field.

[0018] Robust statistics provide an alternative approach to classical statistical methods. Robust methods provide automatic ways of detecting, down-weighting (or removing) and flagging outliers, largely removing the need for manual screening. There are various definitions of "a robust statistic". A robust statistic is resistant to errors in the results produced by deviations from assumptions. The median is a robust measure of central tendency, while the mean is not. For example, the median has a breakdown point of 50%, while the mean has a breakdown point of 0% (Maronna et al., 2006). The median absolute deviation and inter-quartile range are robust measures of statistical dispersion, while the standard deviation and range are not. Trimmed estimators and Winsorised estimators are general methods to make statistics more robust. The basic tools used to describe and measure robustness are a breakdown point, an influence function and a sensitivity curve. The breakdown point of an estimator is the proportion of incorrect observations an estimator can handle before giving an arbitrarily large result.

[0019] A plurality of approaches to robust estimation have been proposed, including R-estimators and L-estimators. However, M-estimators appear to provide a high breakdown point, generality and efficiency (Huber, 1981). M-estimators are a generalization of Maximum Likelihood Estimators (MLEs). MLEs try to maximize

$$\prod_{i=1}^{n} f(x_i)$$

or equivalently minimize

$$\sum_{i=1}^{n} -\log f(x_i)$$

In 1981, Huber proposed to generalize this to the minimization of

$$\sum_{i=1}^{n} \rho(x_i)$$

where $\rho$ is some function.

[0020] $\sum \rho(x_i)$ may be minimized by differentiating $\rho$ and solving

$$\sum_{i=1}^{n} \psi(x_i) = 0$$

where $\psi(x) = \frac{d\rho(x)}{dx}$ if $\rho$ has a derivative. M-estimators do not necessarily relate to a probability density function.

[0021] It is shown that M-estimators are asymptotically normally distributed, so that as long as their standard errors may be computed, an approximate approach to inference is available. It can also be shown that the influence function of an M-estimator is proportional to $\psi$ (Huber, 1981). Therefore, it is possible to derive the properties of such an estimator (such as its rejection point, gross-error sensitivity or local-shift sensitivity) when its $\psi$ function is known. The classical MLE for a generalized linear model may be highly influenced by outliers. In all of the robust models, the explanatory vectors $x_i$ can be highly influential outliers.
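As a concrete instance of solving the estimating equation above, the following sketch computes an M-estimate of location. It rests on several assumptions that are not stated in the description: Huber's clipped-linear psi function, the tuning constant k = 1.345, a fixed scale taken from the median absolute deviation, and iteratively reweighted averaging as the solver. The sample data are invented for illustration.

```python
import statistics


def huber_psi(r: float, k: float = 1.345) -> float:
    """Huber's psi: linear near zero, clipped to [-k, k] for large residuals."""
    return max(-k, min(k, r))


def huber_location(data, k: float = 1.345, tol: float = 1e-9, max_iter: int = 100) -> float:
    """M-estimate of location: solve sum(psi((x_i - theta) / scale)) = 0."""
    values = list(data)
    med = statistics.median(values)
    mad = statistics.median(abs(x - med) for x in values)
    scale = 1.4826 * mad if mad > 0 else 1.0  # MAD rescaled for normal data
    theta = med
    for _ in range(max_iter):
        weights = []
        for x in values:
            r = (x - theta) / scale
            # Weight psi(r) / r down-weights observations far from theta.
            weights.append(1.0 if r == 0 else huber_psi(r, k) / r)
        new_theta = sum(w * x for w, x in zip(weights, values)) / sum(weights)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


# Hypothetical measurements with one gross outlier (50.0).
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]
print("mean:  ", statistics.mean(sample))
print("median:", statistics.median(sample))
print("huber: ", huber_location(sample))
```

The mean is pulled strongly towards the outlier, while the median and the Huber estimate stay near the bulk of the data, which is the behaviour the breakdown-point discussion describes.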

[0022] The Robust Library in S-Plus software robustly fits Generalized Linear Models (GLIM) for response observations $y_i$, $i = 1, 2, \ldots, n$, that may follow either a Poisson or a Binomial distribution. The Binomial distribution is:

$$P(y_i = y) = \binom{n_i}{y}\, \mu_i^{\,y}\, (1 - \mu_i)^{\,n_i - y}$$

where; $y = 0, 1, \ldots, n_i$; $0 < \mu_i < 1$; and $n_i$ = the number of binomial trials for observation $y_i$.

[0023] When $n_i = 1$, the observations $y_i$ are called Bernoulli trials. The expected value of $y_i$ for the Binomial distribution is related to $\mu_i$ by:

$$E\left(\frac{y_i}{n_i}\right) = \mu_i$$

Then there is a vector $x_i^T = (x_{i1}, x_{i2}, \ldots, x_{ip})$ of $p$ independent explanatory variables, and a corresponding vector $\beta^T = (\beta_1, \beta_2, \ldots, \beta_p)$ of unknown regression coefficients, from which the software forms a linear predictor $\eta_i = x_i^T \beta$. The linear predictor $\eta_i$ and the expected value $\mu_i$ are related through a link function $g$ which maps $\mu_i$ to $\eta_i = g(\mu_i)$. The inverse link transformation $g^{-1}$ maps $\eta_i$ to $\mu_i = g^{-1}(\eta_i)$. The connection to the Logit model follows from the canonical link of the Binomial model, which is:

$$g(\mu_i) = \log\left(\frac{\mu_i}{1 - \mu_i}\right)$$

with inverse transformation;

$$g^{-1}(\eta_i) = \frac{\exp(\eta_i)}{1 + \exp(\eta_i)}$$

[0024] For the Binomial model, there is a conditional expectation:

$$E\left(\frac{y_i}{n_i} \,\middle|\, x_i\right) = \frac{\exp(x_i^T \beta)}{1 + \exp(x_i^T \beta)}$$

In the Bernoulli distribution, the response $y_i$ is either 0 or 1; therefore $y_i$ is not an outlier. In the general Binomial model, when $n_i$ is large, the $y_i$ can also be outliers in cases where the expected values of $y_i / n_i$ are small. Thus, in the general Binomial case, influential $y_i$ outliers call for a robust alternative to the MLE.

[0025] With regard to misclassification results, which are important in research, a misclassification model approach is used to estimate $\beta$ as the solution of an estimating equation summed over $i = 1, \ldots, n$, where $F$ is given by the misclassification model:

$$P(y_i = 1 \mid x_i) = g^{-1}(x_i^T \beta) + \gamma\left[1 - 2\, g^{-1}(x_i^T \beta)\right] = F(x_i^T \beta, \gamma)$$ [Eq. 20]

This estimator, introduced by Copas (1988), has properties similar to those of Mallows-type unbiased bounded influence estimates.
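The effect of the misclassification parameter can be seen in a small sketch of the probability function in Eq. 20. It assumes that the inverse link is the logistic function and that the misclassification parameter (called `gamma` here) lies in [0, 0.5); the function names are hypothetical, and the estimating equation itself is not implemented.

```python
import math


def inverse_logit(eta: float) -> float:
    """Inverse of the logit link: exp(eta) / (1 + exp(eta)), overflow-safe."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def misclassification_probability(eta: float, gamma: float) -> float:
    """F(eta, gamma) = g^-1(eta) + gamma * (1 - 2 * g^-1(eta)).

    eta is the linear predictor x_i^T beta; gamma is the assumed
    probability that a response has been recorded in the wrong class.
    """
    if not 0.0 <= gamma < 0.5:
        raise ValueError("gamma is assumed to lie in [0, 0.5)")
    p = inverse_logit(eta)
    return p + gamma * (1.0 - 2.0 * p)


for eta in (-50.0, 0.0, 50.0):
    print(eta, misclassification_probability(eta, gamma=0.1))
```

With gamma = 0 the function reduces to the ordinary logit probability. With gamma > 0 the fitted probability is confined to the interval [gamma, 1 - gamma], so a single extreme observation cannot be assigned a probability arbitrarily close to 0 or 1.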

[0026] High economic and social costs encountered in landslide disasters have spurred searches for better understanding and prediction capability (McKee and Lensberg, 2002). Since the application of neural networks (NN) as the first heuristic method to the problem in the 1990s, there has been a major shift in prediction methodology that has continued to the present time (Atiya, 2001). Further studies provided more comprehensive surveys on prediction methods such as the Genetic Algorithm (GA). Thereafter, a relatively new technique for prediction, Genetic Programming (GP), with a more accurate classification model for prediction, was introduced.

[0027] GP is a search methodology belonging to the family of evolutionary computation. GP may be considered an extension of GAs (Koza, 1992). A GA is a stochastic search technique that searches large and complicated spaces using ideas from natural genetics and evolutionary principles. GAs have been demonstrated to be effective and robust in searching very large spaces in a wide range of applications (Colin, 1994). GAs have been applied in a wide range of fields such as trading systems, stock selection, disaster prediction, etc. GP is a GA applied to a population of computer programs (CP). While a GA usually operates on (coded) strings of numbers, GP has to operate on CP. In comparison with a GA, GP allows the optimization of much more complicated structures and may therefore be applied to a greater diversity of problems (Sette and Boullart, 2001 and Nwogugu, 2006).

[0028] The Darwinian theory of evolution inspires GP models. In the most common implementation, a population of candidate solutions is maintained. After each generation is completed, the population is better fitted to the given problem. GP uses tree-like individuals representing mathematical expressions. Following a recent study by Etemadi et al. (2009), FIGs. 3A-3D exemplify the genetic programming of the study. FIG. 3A illustrates a tree representation of a program (expression): (X*Y) + 6 - (Z/8). The program includes a plurality of GP individuals 301.

[0029] Three genetic operators are mostly used in these algorithms: reproduction, crossover and mutation. FIGs. 3B-3C exemplify the crossover operators of the genetic programming between parents and children respectively. Firstly, a reproduction operator chooses an individual in the current population and copies it without any changes into the new population. Secondly, two parent individuals are selected and a sub-tree is picked on each one. The crossover then swaps the nodes and their relative sub-trees from one parent to the other. If a size condition is violated, the "too-large" offspring is simply replaced by one of the parents. There are other parameters that specify the frequency with which internal or external points are selected as crossover points.
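The tree representation and the crossover operator can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the cited study: trees are nested tuples of the form (operator, left, right), crossover points are given explicitly as index paths rather than chosen at random, and the depth limit of 5 is a hypothetical parameter.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def evaluate(tree, env):
    """Evaluate a tree: tuples are function nodes, str/number are terminals."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]
    return tree


def depth(tree):
    """Depth of a tree; a lone terminal has depth 0."""
    if isinstance(tree, tuple):
        return 1 + max(depth(tree[1]), depth(tree[2]))
    return 0


def get_subtree(tree, path):
    """Follow child indices (1 = left, 2 = right) down from the root."""
    for index in path:
        tree = tree[index]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of tree with the sub-tree at path replaced by new."""
    if not path:
        return new
    children = list(tree)
    children[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return tuple(children)


def crossover(parent1, path1, parent2, path2, max_depth=5):
    """Swap the chosen sub-trees; a too-deep child is replaced by its parent."""
    child1 = replace_subtree(parent1, path1, get_subtree(parent2, path2))
    child2 = replace_subtree(parent2, path2, get_subtree(parent1, path1))
    if depth(child1) > max_depth:
        child1 = parent1
    if depth(child2) > max_depth:
        child2 = parent2
    return child1, child2


# (X*Y) + 6 - (Z/8), read as ((X*Y) + 6) - (Z/8)
program = ("-", ("+", ("*", "X", "Y"), 6), ("/", "Z", 8))
other = ("+", "X", 1)
env = {"X": 2, "Y": 3, "Z": 16}
print(evaluate(program, env))
child1, child2 = crossover(program, (1, 1), other, (2,))
print(child1, child2)
```

Here the sub-tree X*Y of the first parent is exchanged with the terminal 1 of the second parent, producing two children that each inherit material from both parents.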

[0030] FIG. 3D exemplifies the mutation operator of the genetic programming. The mutation operator may be applied to either a function node or a terminal node. A node in the tree is randomly selected. If the chosen node is a terminal node, it is replaced by another terminal. If the chosen node is a function and point mutation is to be performed, it is replaced by a new function with the same arity (Lee and Miah, 2004). When tree mutation is carried out, a new function node is chosen and a new randomly generated sub-tree substitutes the original node together with its relative sub-tree. A depth ramp is used to set bounds on size when generating the replacement sub-tree. This checks that the replacement does not violate the depth limit. If the replacement violates the depth limit, the mutation operator reproduces the original tree into the new generation. Further parameters specify the probability with which internal or external points are selected as mutation points.

[0031] To obtain the best fitness function for all classification problems, and in order to apply a particular fitness function, the learning algorithms must convert the value returned by the evolved model into "1" or "0" using a "0/1 Rounding Threshold". If the value returned by the evolved model is equal to or greater than the rounding threshold, then the record is classified as "1", and as "0" otherwise. There are many varieties of fitness function, such as number of hits, sensitivity or specificity, relative squared error (RSE), mean squared error (MSE), etc., which may be applied for evaluating the performance of the generated classification rules. The "number of hits" is used as a fitness function due to its simplicity and efficiency, and is based on the number of samples correctly classified. More specifically, the fitness function ($f_i$) of an individual program $i$ corresponds to the number of hits and is evaluated by $f_i = h$, where $h$ is the number of fitness cases correctly evaluated, or number of hits. Therefore, for the fitness function ($f_i$), the maximum fitness ($f_{max}$) is given by $f_{max} = n$, where $n$ is the number of fitness cases.
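The rounding threshold and the "number of hits" fitness can be sketched as below. The threshold of 0.5, the toy linear model and the fitness cases are hypothetical values chosen only for illustration.

```python
def classify(value: float, threshold: float = 0.5) -> int:
    """Apply the 0/1 rounding threshold to a model output."""
    return 1 if value >= threshold else 0


def number_of_hits(model, cases, threshold: float = 0.5) -> int:
    """Fitness f_i = h: the number of fitness cases classified correctly.

    `cases` is a sequence of (features, target) pairs with target in {0, 1}.
    """
    return sum(
        1 for features, target in cases
        if classify(model(features), threshold) == target
    )


def toy_model(features):
    """Hypothetical evolved model; a real one would come from the GP run."""
    return 0.2 * features[0] + 0.1 * features[1]


# Hypothetical fitness cases: 1 = non-stable, 0 = stable.
cases = [((3.0, 1.0), 1), ((1.0, 1.0), 0), ((2.0, 2.0), 1), ((0.0, 1.0), 1)]
hits = number_of_hits(toy_model, cases)
print(hits, "hits out of a maximum fitness of", len(cases))
```

The maximum fitness equals the number of fitness cases, so a program that classifies every case correctly scores `len(cases)`.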

[0032] A "raw fitness function" ($rf_i$) is a counterpart of the fitness measure $f_i$. It is complemented with a parsimony term. Parsimony pressure puts a little pressure on the size of the evolving solutions, allowing the discovery of more compact models. Therefore, the raw maximum fitness is $rf_{max} = n$, and the overall fitness of the program population ($fpp_i$) is evaluated by:

$$fpp_i = rf_i \times \left(1 + \frac{1}{5000} \cdot \frac{S_{max} - S_i}{S_{max} - S_{min}}\right)$$ [Eq. 21]

where $S_i$ is the size of the program, and $S_{max}$ and $S_{min}$ represent the maximum and the minimum program sizes in the population respectively. The maximum and minimum program sizes are evaluated by the formulas:

$$S_{max} = G(h + t)$$ [Eq. 22]

$$S_{min} = G$$ [Eq. 23]

where $G$ is the number of genes, and $h$ and $t$ are the head and tail sizes, which are "0" and "1" respectively in classification problems. Therefore, when $rf_i = rf_{max}$ and $S_i = S_{min}$, with the condition

$$fpp_{max} = 1.0002 \times rf_{max}$$ [Eq. 26]

then the process will be optimized. The described procedure is depicted by Tsakonas (2006). Once the fitness function is defined, it may be optimized using optimization techniques. The genetic model is implemented to automatically extract an intelligent classification rule for predicting the classes of stable and non-stable lands and slopes in a sample from the given values of a number of ratios, called predicting variables. Each rule is constituted by a logical combination of these ratios. The combination determines a class description, which is used to construct the classification rule.
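The parsimony term can be illustrated with a short sketch. It assumes that the fitness with parsimony pressure has the form rf × (1 + (1/5000) × (S_max − S) / (S_max − S_min)); this form is an assumption, chosen because it yields exactly the factor 1.0002 of Eq. 26 when the raw fitness is maximal and the program has the minimum size. The function name and the sample sizes are hypothetical.

```python
def fitness_with_parsimony(raw_fitness: float, size: int,
                           size_min: int, size_max: int) -> float:
    """Raw fitness scaled by a small bonus for compact programs.

    Assumed form: rf * (1 + (1/5000) * (S_max - S) / (S_max - S_min)).
    """
    if size_max <= size_min:
        raise ValueError("size_max must exceed size_min")
    if not size_min <= size <= size_max:
        raise ValueError("size must lie between size_min and size_max")
    bonus = (1.0 / 5000.0) * (size_max - size) / (size_max - size_min)
    return raw_fitness * (1.0 + bonus)


# Hypothetical run: 100 fitness cases, program sizes between 3 and 30.
n_cases = 100
print(fitness_with_parsimony(n_cases, size=3, size_min=3, size_max=30))
print(fitness_with_parsimony(n_cases, size=30, size_min=3, size_max=30))
```

Because the bonus is at most 0.02% of the raw fitness, it only separates programs that classify equally well, favouring the smaller one without overriding classification accuracy.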

Summary

[0033] In one aspect of the present invention, there is provided a method for predicting a landslide disaster in a specific area. The method comprises predetermining data having variables of the specific area that are associated with landslides, wherein the variables include geological data, hydrological data and landslide historical data; pre-classifying each of the variables with specific indicators and boundaries, wherein the specific indicator is associated with a rating of the respective variable, and wherein the boundaries include an upper and a lower boundary for each of the variables; generating a mathematical multi-dimensional index in the specific area, wherein the mathematical multi-dimensional index comprises visualizations and rankings of the specific areas through a Cartesian co-ordinate, wherein the Cartesian co-ordinate represents a risk ratio of the specific indicator; ranking the specific areas based on the Cartesian co-ordinates; and identifying a disaster risk landslide area based on the rankings of the specific areas generated from the mathematical multi-dimensional index.

[0034] In one embodiment, the pre-classifying of the variables further comprises a plurality of statistical approaches, wherein the plurality of statistical approaches include a Multiple Discriminant Analysis (MDA), a Logistic Regression, a Robust Logistic Regression and a Genetic Programming.

[0035] In another embodiment, the upper and lower boundaries for each of the variables determine a high risk and a low risk respectively. The high risk is associated with a higher probability of the landslide disaster occurring. The low risk is associated with a lower probability of the landslide disaster occurring.

[0036] In yet another embodiment, the mathematical multi-dimensional index is symmetric; proportionate; and non-invariant. The mathematical multi-dimensional index is further compared with a Limit Equilibrium Method (LEM) to determine accuracy.

[0037] In another embodiment, rankings of the variables in the specific area are individually measured without any limitation and deletion.

Brief Description of the Drawings

[0038] This invention will be described by way of non-limiting embodiments of the present invention, with reference to the accompanying drawings, in which:

[0039] FIG. 1 exemplifies a rate of change of a ratio (r) with respect to X and Y in Cartesian coordinates;

[0040] FIG. 2 illustrates a process diagram of an estimation method for modeling and prediction testing of an observed sample data set;

[0041] FIG. 3A illustrates a tree representation of a program (expression): (X*Y) + 6 - (Z/8);

[0042] FIG. 3B exemplifies the crossover operators of the genetic programming between parents;

[0043] FIG. 3C exemplifies the crossover operators of the genetic programming between children;

[0044] FIG. 3D exemplifies the mutation operator of the genetic programming;

[0045] FIG. 4 illustrates a process flow of a method for predicting a landslide disaster;

[0046] FIG. 5A illustrates a plurality of Total Risk (TR) isolines of a risk box and its direction of increase;

[0047] FIG. 5B illustrates a plurality of Net Risk (NR) isoclines and its direction of increase;

[0048] FIG. 5C illustrates a plurality of Overlapping Risk (OR) isoclines and its direction of increase;

[0049] FIG. 5D illustrates a plurality of Share Risk (SR) isoclines;

[0050] FIG. 5E exemplifies a rate of change of an upper sector above the 45° line $\left(\frac{\partial SR}{\partial X}\right)$ that is similar but opposite to that in a lower sector below the 45° line $\left(\frac{\partial SR}{\partial Y}\right)$;

[0051] FIG. 5F illustrates an equi-SR ray in a Risk Box;

[0052] FIG. 5G illustrates a generalized Risk Box;

[0053] FIG. 5H illustrates Landslide Disaster Prediction Index (LDPI) isoclines;

[0054] FIG. 5I exemplifies an LDPI exhibiting proportional scaling;

[0055] FIG. 5J exemplifies a graphical representation of identifying boundaries for all factors;

[0056] FIG. 5K illustrates a 3-Dimensional (3D) index for landslide prediction;

[0057] FIG. 5L illustrates a multi-dimensional integrated index;

[0058] FIG. 6A provides a screen shot of the main page of a Landslide Disaster Prediction Software (LDPS);

[0059] FIG. 6B provides a screen shot as a plurality of combination panels are chosen; and

[0060] FIG. 6C provides a screen shot example of all the specific areas ranked using the LDPI value for any combination.

Detailed Description

[0061] The following descriptions of a number of specific and alternative embodiments are provided to understand the inventive features of the present invention. It shall be apparent to one skilled in the art, however, that this invention may be practiced without such specific details. Some of the details may not be described at length so as not to obscure the invention. For ease of reference, common reference numerals will be used throughout the figures when referring to the same or similar features common to the figures.

[0062] FIG. 4 illustrates a process flow of a method 400 for predicting a landslide disaster in a specific area according to one embodiment of the present invention. The method 400 comprises collecting data from field and laboratory work in step 401; applying statistical approaches to find a plurality of parameters for different classes and verifying the field investigation in step 402; providing a mathematical multi-dimensional index for landslide prediction in step 403; simulating a test and comparing the accuracy of the result with a Limit Equilibrium Method (LEM) in step 404; and tabulating the data collected in a spreadsheet software in step 405.

[0063] The data are collected from the field and laboratory work for various specific areas in step 401. The data collected are variables that are associated with a landslide disaster. The variables include slope geometry data (slope angle, slope length, slope height, etc.); slope materials data (soil types, soil characteristics, etc.); geological data (rock types, bed rock, weathering grade, etc.); hydrological data (rainfall, water table, moisture content, etc.); geomorphologic data (gully, erosion, etc.); geotechnical data (soil strength, pore water pressure, etc.); and landslide historical data. The specific area includes coordinates of the areas that may be prone to landslide disasters, areas where landslide disasters commonly occur, etc.

[0064] Statistical approaches are applied to find the parameters of the different classes and to verify the field investigation in step 402. The statistical approaches include Multiple Discriminant Analysis (MDA), Logistic Regression, Robust Logistic Regression and Genetic Programming. The variables are categorized into components of categorical data sets, more specifically into specific indicators, which include ratings of the variables. The statistical approaches also identify an upper and a lower boundary, or thresholds, for each of the specific indicators.

[0065] The mathematical multi-dimensional index in the specific area in step 403 is based on a landslide disaster prediction index (LDPI). The LDPI comprises visualizations and rankings of the specific area through a Cartesian co-ordinate, wherein the Cartesian co-ordinate represents a risk ratio of the specific indicator. Standard statistical approaches show the higher probability of the variables occurring. However, it is important to study and analyze all variables and data sets that may lead to a landslide occurrence. The multidimensionality of the LDPI measures all the variables in the specific area individually without any limitation and deletion.

[0066] In step 404, the data set is further tested with the LDPI to test its performance and check the results. The accuracy of the LDPI is further checked from a prediction perspective. The results may be compared with the LEM. In most slope stability analyses, the LEM investigates the equilibrium of a soil mass tending to slide down under the influence of gravity. The LEM approach is known to those skilled in the art, and therefore no further illustration is provided herewith.

[0067] Tabulation of the data collected in the spreadsheet software in step 405 provides easy visual results for interpreting the animated real performance of the data set. The spreadsheet software includes a Landslide Disaster Prediction Software (LDPS) that allows a data analyst or researcher to select each variable one by one and its total respectively, providing a plurality of combinations as desired. Concepts from different theoretical areas are used to create a complete picture of the specific area that can then be conveyed in a single face or a static graph.

[0068] The first essential step 401 in the method 400 for predicting a landslide disaster requires the collection of the data. The data collected has to be analyzed and compared, in order to derive information. The following TABLE 1 shows a comprehensive inventory of variables, which are collected for the field and laboratory work, in a particular study area.

Variables Definitions / descriptions

Slope Number A number assigned to a slope in a study plan of all slopes of the highway.

Left or right position A location of a slope whether it is on the right or left hand side.

Upslope position Upslope position relates to a slope that is located higher than a road.

Down-slope position Down-slope position relates to a slope that is located below a road or lower than a road.

Chainage A start and end chainage (study chainage) of a slope feature.

Associated features Types of features that adjoin either upslope or down-slope of the feature being surveyed.

Sheet erosion Removal of soil from a slope surface in the form of 'sheet' by runoff.

Rill Type of erosion that creates small rivulets (less than 0.5 m deep) running down the slope.

Gully An erosion channel by water, which is more than 0.5 meters deep, running down the slope.

Mass movement Any slumping and/or slippage of material on the slope face caused by erosion.

Percentage of erosion An estimated percentage of the area of the slope face that is affected by each erosion type.

Severity of sheet erosion Seriousness of sheet erosion on the cut slope face recorded in the form of 'very severe', 'moderate to severe' and 'minor'.

Severity of rill erosion Seriousness of rill erosion on the cut slope face recorded in the form of 'very severe', 'moderate to severe' and 'minor'.

Severity of gully erosion Seriousness of gully erosion on the cut slope face recorded in the form of 'very severe', 'moderate to severe' and 'minor'.

Maximum gully depth Maximum value of the vertical depth from the gully surface to the base or bed of the gully measured in meters.

Gully depth at head Vertical depth from the gully surface to the base of the gully measured at the gully head in meters.

Gully depth at toe/base Vertical depth from the gully surface to the base of the gully measured at the gully mouth or toe in meters.

Maximum gully width Maximum horizontal width of the gully measured at the surface in meters.

Gully width at head Horizontal width of the gully measured at the gully head in meters.

Gully width at middle Horizontal width of the gully measured at the middle of the gully in meters.

Gully width at toe Horizontal width of the gully measured at the gully mouth in meters.

Gully length Trace length of the gully as measured on the surface in meters.

Gully angle at head Angle of the gully measured from the base of gully channel to the slope base (at the head of the gully).

Gully angle at base Angle of the gully measured from the base of gully channel to the slope base (at gully mouth and base).

Sidewall angle Sidewall angle of the gully measured from the gully base to the gully sidewall in degrees.

Sidewall angle (side 1) Sidewall angle of the gully measured on the right hand side facing the slope (feature).

Sidewall angle (side 2) Sidewall angle of the gully measured on the left hand side facing the slope (feature).

Gully shape Shape of the gully in cross-section.

U-shaped Base of the gully channel is broad and the sidewall is steep or vertical (> 80°) and the shape looks like 'U'.

V-shaped Base of the gully is narrow and sharp, and wide near the surface, and the shape looks like 'V'.

Vertical sidewall Sidewall angle that is ~ 65°.

Sloping sidewall Slope of the wall is the slope of side wall of the gully

Benched sidewall Sidewall angle that is < 65°.

Gully located on Form of the slope on which the gully is located (whether the slope is convex, concave or rectilinear).

Causes of erosion Identifying the cause of erosion on the slope. Causes are represented by codes.

Damage Type of damage due to erosion.

Corrective measures Defined as the steps taken to reduce or overcome gully erosion from developing.

Measures taken Type of measures taken to reduce gully erosion.

Slope angle Angle to the horizontal of the line connecting the slope toe to the crest of the cut slope.

Slope height Maximum vertical height of the slope in meters.

Slope length Length of the slope in meters measured from start of slope base to the end of slope base (from one end to the other end of the slope).

Slope aspect Compass bearing of the slope faces.

Slope type Slope type is recorded as simple, planer, asymmetrical or compound depending on the curvature of the slope crest.

Slope area Area of cut slope surface in square meters.

Slope form Identify whether the slope is in the form of concave, convex or rectilinear.

Slope cross-section Defined and categorized by the relevant curvature indexes when looking at the feature in cross-section.

Slope plan profile Defined and categorized in plan by relevant curvature indexes.

Cut slope Natural slope that has been cut in order to construct the road.

Slope status Condition of the cut slope. Code is used to record whether the slope is failed or stable.

Failure type Type of cut slope failure.

Logging activities Coding is used to record if there are activities of logging on the upslope or down-slope of the study feature.

Other activities Defined as any other activities (excluding logging) present on the upslope such as settlements, roads, agricultural activities and so on.

Drainage entering the slope Code (1) is used to record if there is a natural drainage flowing into the cut slope and code (2) otherwise.

Types of drainage Type of drainage entering the slope whether the type is dendrite-like, trellis or parallel.

Soil types (British Classification) Type of soil comprising the slope classified based on the Unified Soil Classification System.

Soil types (AASHTO Classification) Soil is classified in the laboratory based on American Association of State Highway and Transportation Officials (AASHTO) Classification Systems.

Soil origin Source of the soil comprising the slope.

Rock type Type of rock comprising the study slope.

Parent rock Type of rock the soil is derived from.

Profile type Profile type of each slope is assigned based on weathering grades.

Percentage colluvium Percentage by area of feature considered to be comprised of colluvium.

Percentage core stone Estimated percentage of the area covered by the core stone visible on the slope face.

Percentage surface crusting Percentage of dark brown iron oxide covering the surface of the cut slope.

Weathering grades Grade 1 to 6 is used to assign different weathering grades of materials exposed by cut.

Does slope have rock exposure To indicate whether there is a rock exposed on the slope face. Code is used to record the rock exposure as '1' for yes and '0' for no.

Exposed rock type Type of rock that can be seen or appear on the cut slope face.

Percentage exposure Estimated percentage of rock exposed on the slope face.

Water table Water table is recorded as high if the water table is above the road formation level, and low if the water table is lower than the road level.

Slope cover Any type of cover (natural or artificial) present on the cut slope.

Percentage uncovered Area of cut slope face uncovered expressed as a percentage of the whole cut slope face.

Main cover type Name of the dominant type of slope cover appeared on the slope face.

Percentage cover Area of main slope cover expressed as percentage of the whole cut slope face.

Percentage cover of tree Area of cut slope face covered by tree expressed as a percentage of the whole cut slope face.

Percentage cover of vegetation Area of cut slope face covered by shrubs expressed as a percentage of the whole cut slope face.

Percentage of artificial cover Area of cut slope face covered by man-made cover such as grass, trees, flowers, netting and so on, expressed as a percentage of the whole cut slope face.

Types of artificial cover Type of slope cover made by man and it is not a natural cover.

Cover on back slope Define if the back of the slope is bare or has a natural cover.

Catchment shape Identify if the catchment is elongated or circular in shape.

Land use Identify if the land in the catchment area is being used by any other activities such as logging and so on.

Land use type Codes are used to assign different types of land use in the catchment area.

Catchment area Calculated area of the catchment in square meters.

Shear strength Soil shear strength of gullied slope.

Permeability Measure of the ease with which water flows through the soil in meter/second.

Moisture content Moisture content of the soil sample taken from gullied slope.

Percentage of clay Percentage of clay content of gullied slope.

Percentage of silt Silt content of gullied slope expressed as a percentage.

Percentage of sand Sand content of gullied slope expressed as a percentage.

Drain functioning The functioning of the drain. The drain is functioning if the water can flow easily and smoothly.

Number of slope berms The total number of branches present on the cut slope face.

Upslope channel This is the natural channel of water flow and it usually occurred and appeared on the top of the slope or on the slope crest.

Upslope channel present Records 'Yes' if the upslope channel is present on the slope or 'No' otherwise.

Surface runoff flow in one direction Concentration of water flowing down on the slope in only one direction.

TABLE 1: List of variables collected for a field and laboratory work

[0069] Various methods are used to measure and estimate each variable, and the method applied to a given variable is kept consistent for every slope throughout the process of the data collection. Young (1972) suggests that the methods used to collect the data from an area have to be consistent in order to obtain good and valid results. The methods used to collect these data are known to those skilled in the art and therefore, no further illustration is provided herewith.

[0070] Thereafter, statistical approaches are applied in step 402 to further classify the data obtained in step 401 into specific indicators ($X_1, X_2, \ldots, X_N$). In the method 400, the Robust Logistic Regression and Genetic Programming are used for the first time in landslide disaster prediction. The use of these two statistical approaches improves the accuracy of the results and of the thresholds. Prediction under the Genetic Programming method also increases the classification rates. Robust Logistic Regression is found to be more accurate and compatible as compared to a Logit Regression approach. The plurality of statistical approaches applied is as shown from the [Eqs. 2-26] above.

[0071] At step 403, the LDPI is derived from a Risk Box approach. The proposed LDPI provides a symmetric, proportional and scaled representation of the landslide disaster in the specific area. The proposed LDPI also ranks the specific areas according to their calculated LDPI values. The LDPI utilizes ratios, whereby the researchers or data analysts compare the performances of the variables from one accounting period to the next. The LDPI is a generalized model that incorporates a stability status (stable or unstable) of the specific area, and allows for visualizing and ranking of each combination of the ratio as numerator and denominator. Derivation of the LDPI will be explained in FIGs. 5A-5L below.

[0072] At step 404, the simulation results of the LDPI are tested and compared with the LEM method.

[0073] Step 405 tabulates the data in the spreadsheet software, illustrating all the various landslide disaster areas for every specific combination of the variables over the sample data set. Step 405 also provides the ranking of all the various landslide disaster areas in another table based on the plurality of specific combinations of variables that have been chosen. These illustrate the situation of the various specific areas based on the specific combination of the variables. Researchers and data analysts are able to quickly visualize the LDPI transformation process, leading to the ranking of the specific areas.

[0074] As such, the data set may be fed into a tabulated data file (e.g. a Microsoft Excel file) and the step 405 runs for as many combinations as desired. Interpretation of the data collected in step 401 is a significant undertaking that must rest on a fuller understanding of the specific area and its operations by graphic representation of the specific area's landslide activity. An exemplary illustration of step 405 using the LDPS will be further explained in FIGs. 6A-6C below.
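The tabulate-and-rank idea of step 405 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample records and the choice to sort by descending value are hypothetical, and the score used is the share of overlapping risk, SR = 2 min(X, Y) / (X + Y), from the Risk Box discussion of this description rather than the full LDPI.

```python
from itertools import combinations


def share_risk(x, y):
    """Share of overlapping risk for X, Y > 0: SR = 2*min(X, Y) / (X + Y)."""
    return 2 * min(x, y) / (x + y)


def rank_areas(records, variables):
    """Return, for each (numerator, denominator) pair, the areas sorted by SR."""
    rankings = {}
    for num, den in combinations(variables, 2):
        scores = {area: share_risk(v[num], v[den]) for area, v in records.items()}
        # Descending SR is an illustrative ordering, not a rule from the source.
        rankings[(num, den)] = sorted(scores, key=scores.get, reverse=True)
    return rankings


# Hypothetical indicator values per slope (illustrative only).
records = {
    "slope_1": {"slope_angle": 40.0, "slope_height": 20.0, "percentage_clay": 10.0},
    "slope_2": {"slope_angle": 30.0, "slope_height": 30.0, "percentage_clay": 15.0},
    "slope_3": {"slope_angle": 60.0, "slope_height": 12.0, "percentage_clay": 36.0},
}
variables = ["slope_angle", "slope_height", "percentage_clay"]
rankings = rank_areas(records, variables)
```

Each key of `rankings` is one variable combination, so adding a variable to `variables` automatically extends the set of tables produced.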

[0075] The method 400 defines new data representations and examines their effects and significance on a landslide disaster prediction. The method 400 integrates the theories in mathematical science, landslide disaster information and data, and computer programming, and uses quantitative and qualitative analyses to make important contributions in both research and practice. When the LDPI model in step 403 needs to be generated, a researcher or data analyst must consider all the relevant variables. If there is doubt about the importance of a variable, that variable should be included in the LDPI model, and then tested for its importance in step 404. The researcher or data analyst should also try to find macroeconomic variables that might have a strong impact on the landslide disaster for which the landslide disaster prediction model was developed. Once all the relevant variables have been collected in step 401, all estimation methods available to the researcher or data analyst should be used to develop the model in step 402. In the present embodiment, the estimation model is used in a combined approach for predictive reinforcement.

[0076] The development of the method 400 may also be applied to financial information of a firm to aid in the interpretation of ratios. It can quickly highlight the areas in which the firm's operations are not as efficient or as effective as possible. The method 400 may also be applied as a planning tool to identify potential pitfalls in the firm's operations.

[0077] FIGs. 5A-5L provide a detailed explanation of the step 403 in the method 400. The step 403 of the method 400 includes the LDPI, which is derived from the Risk Box approach. FIGs. 5A-5G illustrate the development of the Risk Box that is adapted to provide a static measurement of a risk ratio. FIGs. 5H-5L illustrate the LDPI adapted to demonstrate the visualization of a transformation associated with a change in the risk ratio. The risk ratio includes real data of the variables that are accountable for a landslide disaster. FIG. 5J illustrates the identification of the boundaries for all the variables, as the specific indicators are integrated. Thereafter, FIGs. 5K-5L illustrate the index for landslide prediction. The LDPI is for predicting a landslide disaster, improving the predictive capacity for a landslide disaster risk area, and providing a multidimensional analysis in which every possible variable accountable for a landslide disaster occurrence is analyzed and considered.

[0078] In FIGs. 5A-5G, the Risk Box is a new measure of risk ratios as one embodiment of the present invention. A plurality of components of a ratio approach is first considered. The components include a numerator (X); a denominator (Y); a Total Risk (TR); a Net Risk (NR); an Overlapping Risk (OR); and a Share of overlapping Risk (SR) index. Using basic accounting relationships between each of these risk components, TR in any particular period is the sum of X and Y. In turn, TR can be simultaneously decomposed into its constituent components, namely NR and OR, where TR = NR + OR. The static relationships between the six components may be defined by the following definitions.

[0079] In the first definition, Peter Carti Brown (2007) stated that the total risk consists of all developments and movements of a variable over time that will affect risk and that can be controlled and observed. Total risk sometimes consists of market attempts to measure the relationship between risk and prediction. The factors relating to the landslide disaster determine, systematically or unsystematically, the effects on the particular area. The locus of equi TR values in the Risk Box consists of all those points whose co-ordinates (X, Y) share a common sum (X + Y). That is,

X + Y= TR = NR + OR [Eq. 27]

[0080] In the second definition, the locus of equi NR values in the Risk Box consists of all points, which are identified as residuals and provide an impact and probability in risk management. By using the NR, data analysts give a score to companies. The locus comprises all those points, and only those points, whose co-ordinates (X, Y) share a common value in $|X - Y|$. That is,

$|X - Y| = NR$ [Eq. 28]

[0081] In the third definition, two measures stand in a strong and equal contrast such that neither X nor Y has temporal precedence. When they overlap, the combination will improve the reliability. The locus of equi OR values in the Risk Box comprises all those points at which risk factors work together, in other words, two policies providing the same coverage of risk whose factors are interrelated, and only those points, whose co-ordinates (X, Y) share a common value in $(X + Y) - |X - Y|$. That is,

$(X + Y) - |X - Y| = OR$ [Eq. 29]

[0082] Lastly, in the fourth definition, the locus of equi SR indices measuring the share of overlapping valuation in the Risk Box comprises all those points, and only those points, whose co-ordinates (X, Y) share a common value of risk in

$SR = \frac{OR}{TR} = \frac{(X + Y) - |X - Y|}{X + Y}$ [Eq. 30]

The geometries of the linear equations [Eqs. 27-30] may therefore be represented in the Cartesian plane.
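As a minimal sketch of the four static components, the following Python function evaluates TR, NR, OR and SR at a single point (X, Y). The function name is hypothetical, the inputs are assumed non-negative, and SR is taken as OR / TR, consistent with the later statement that SR = 1 - |X - Y| / (X + Y).

```python
def risk_components(x, y):
    """Return (TR, NR, OR, SR) for one point (X, Y) with X, Y >= 0."""
    total = x + y                 # TR = X + Y
    net = abs(x - y)              # NR = |X - Y|
    overlap = total - net         # OR = (X + Y) - |X - Y| = 2 * min(X, Y)
    share = overlap / total if total else 0.0  # SR = OR / TR (0 at the origin by convention)
    return total, net, overlap, share


tr, nr, ovr, sr = risk_components(30.0, 10.0)
```

For the point (30, 10) this gives TR = 40, NR = 20, OR = 20 and SR = 0.5, and TR = NR + OR holds as in [Eq. 27].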

[0083] Thereafter, in the fifth definition, the locus of a linear equation in X and Y is a straight line containing all the points, and only the points, whose co-ordinates satisfy the equation $aX + bY + c = 0$, where a, b and c are real and a and b are not both equal to zero.

[0084] In the sixth definition, the data set of all the variables and landslide disaster factors in a particular area over $i = \{1, 2, 3, \ldots, n\}$ years is taken into account. Each respective Risk Box will have sides equal to $\max(X_i)$ or $\max(Y_i)$. Thus the area or dimensions of the Risk Box is given by:

$\text{Area} = [\max(X_i, Y_i)]^2$ [Eq. 31]

Following after the sixth definition, the co-ordinates of the maximum NR are $[\max(X), 0]$ or $[0, \max(Y)]$.
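The sizing rule of the sixth definition can be illustrated as follows. This is a sketch that assumes the Risk Box is a square whose side is the largest value observed in either series; the function names and sample values are hypothetical.

```python
def risk_box_side(xs, ys):
    """Side of the square Risk Box: the largest value in either series."""
    return max(max(xs), max(ys))


def risk_box_area(xs, ys):
    """Area of the square Risk Box (side squared)."""
    side = risk_box_side(xs, ys)
    return side * side


# Hypothetical yearly values of a numerator X and a denominator Y.
xs = [3.0, 8.0, 5.0]
ys = [4.0, 6.0, 7.0]
side = risk_box_side(xs, ys)
area = risk_box_area(xs, ys)
```

With these values the side is 8 and the area is 64, so the point of maximum NR on the horizontal axis is (8, 0).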

[0085] The dynamic behavior of NR is as follows:

if $X_{t-1} > Y_{t-1}, X_t > Y_t \Rightarrow \Delta(NR) = (X - Y)_t - (X - Y)_{t-1}$ [Eq. 32]

if $Y_{t-1} > X_{t-1}, Y_t > X_t \Rightarrow \Delta(NR) = (Y - X)_t - (Y - X)_{t-1}$ [Eq. 33]

However, if $X_{t-1} > Y_{t-1}, X_t < Y_t \Rightarrow \Delta(NR) = (Y - X)_t - (X - Y)_{t-1}$ [Eq. 34]

And, if $X_{t-1} < Y_{t-1}, X_t > Y_t \Rightarrow \Delta(NR) = (X - Y)_t - (Y - X)_{t-1}$ [Eq. 35]

From the [Eq. 29],

$\forall X = Y \Rightarrow OR = (X + Y) = TR$ [Eq. 36]

$\forall X > Y \Rightarrow OR = (X + Y) - (X - Y) = 2Y$ [Eq. 37]

$\forall X < Y \Rightarrow OR = (X + Y) - (Y - X) = 2X$ [Eq. 38]

From the [Eq. 36], TR equals OR whenever X = Y. From the [Eq. 38], ratio points with equi OR share similar X values. The dynamics of OR is as follows: $\forall X_{t-1} > Y_{t-1}, X_t > Y_t$ implies that the change in OR is dependent only on changes in Y, i.e.

$\Delta OR = OR_t - OR_{t-1} = 2\Delta Y = 2(Y_t - Y_{t-1})$ [Eq. 39]

From the [Eq. 39], points with equi OR values share similar Y values. Also, $\forall Y_{t-1} > X_{t-1}, Y_t > X_t$ implies that the change in OR is dependent only on changes in X, i.e.

$\Delta OR = OR_t - OR_{t-1} = 2\Delta X = 2(X_t - X_{t-1})$ [Eq. 40]

In general, the change in OR between any two periods in the Risk Box is given by,

$\Delta OR = 2[\min(X, Y)_t - \min(X, Y)_{t-1}]$ [Eq. 41]
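The period-to-period changes can be checked numerically. In the sketch below, NR is computed with an absolute value so that the four cases of [Eqs. 32-35] collapse into one expression, and OR is computed as 2 min(X, Y) as in [Eq. 41]. The function names and sample points are hypothetical.

```python
def net_risk(x, y):
    """NR = |X - Y|."""
    return abs(x - y)


def overlapping_risk(x, y):
    """OR = 2 * min(X, Y)."""
    return 2 * min(x, y)


def risk_changes(prev, curr):
    """Return (delta NR, delta OR) between two (X, Y) points."""
    d_nr = net_risk(*curr) - net_risk(*prev)
    d_or = overlapping_risk(*curr) - overlapping_risk(*prev)
    return d_nr, d_or


# X stays above Y in both periods: delta OR depends only on the change in Y.
same_side = risk_changes((10.0, 4.0), (12.0, 7.0))
# The point crosses the 45-degree line between the two periods.
crossing = risk_changes((10.0, 4.0), (5.0, 9.0))
```

In the first case the change in OR is 2(7 - 4) = 6, as [Eq. 39] predicts; in the crossing case only the general form of [Eq. 41] applies.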

[0086] From the [Eq. 30],

$SR = 1 - \frac{|X - Y|}{X + Y}$ [Eq. 42]

$\forall X \text{ or } Y = 0 \Rightarrow SR = 0$, and [Eq. 43]

$\forall X = Y \Rightarrow SR = 1$ [Eq. 44]

$\forall X > Y \Rightarrow (SR - 1) = -\frac{X - Y}{X + Y}$ [Eq. 45]

$(X + Y)(SR - 1) = -X + Y$ [Eq. 46]

$X(SR) = -[(SR - 2)Y]$ [Eq. 47]

or $X = -\frac{(SR - 2)Y}{SR}$ [Eq. 48]

or $X = \frac{(2 - SR)Y}{SR}$ [Eq. 49]

For a constant SR, this is a ray through the origin. $\forall X < Y$, we have a similar demonstration with $Y = \left(\frac{2 - SR}{SR}\right)X$.
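A short numerical check of the ray result: for a fixed SR and X > Y, every point generated by X = (2 - SR) Y / SR should return the same SR. The function names below are hypothetical and the values are chosen only for illustration.

```python
def share_risk(x, y):
    """SR = 1 - |X - Y| / (X + Y), for X, Y > 0."""
    return 1 - abs(x - y) / (x + y)


def x_on_equi_sr_ray(sr, y):
    """X co-ordinate on the equi-SR ray in the lower triangle (X >= Y)."""
    return (2 - sr) * y / sr


# Three points on the ray SR = 0.5: (3, 1), (6, 2) and (12, 4).
points = [(x_on_equi_sr_ray(0.5, y), y) for y in (1.0, 2.0, 4.0)]
values = [share_risk(x, y) for x, y in points]
```

All three points lie on the line X = 3Y through the origin and share SR = 0.5.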

[0087] Referring to FIG. 5D, OA = TR* and TR* - OR* = NR* = AB. Therefore,

$\tan\theta_i = \frac{AB}{OA} = \frac{TR^* - OR^*}{TR^*} = 1 - \frac{OR^*}{TR^*} = 1 - SR^*$ [Eq. 50]

$\Rightarrow SR^* = 1 - \tan\theta_i$, so SR values are constant along any ray from the origin. FIG. 5D presents two equi-SR rays from the origin where $\theta_1 = \theta_2$. Every point on each ray shares an equal SR value.
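The angle relation of [Eq. 50] can be verified numerically. The sketch below assumes that θ is the signed angle between the 45° axis of symmetry and the ray through (X, Y), positive toward the X-axis, and compares 1 - tan θ with the signed form of SR given in [Eq. 62]. Function names are hypothetical.

```python
import math


def sr_from_angle(x, y):
    """SR = 1 - tan(theta), theta measured from the 45-degree axis toward the X-axis."""
    theta = math.pi / 4 - math.atan2(y, x)
    return 1 - math.tan(theta)


def sr_signed(x, y):
    """Signed form of the index: SR = 1 - (X - Y) / (X + Y)."""
    return 1 - (x - y) / (x + y)


below = (sr_from_angle(3.0, 1.0), sr_signed(3.0, 1.0))   # X > Y
above = (sr_from_angle(1.0, 3.0), sr_signed(1.0, 3.0))   # X < Y
```

The identity tan(45° - arctan(Y/X)) = (X - Y) / (X + Y) is what makes the two computations agree.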

[0088] The framework outlined above is a simple risk valuation co-ordinate system confined within a square "box". The unit of measurement can be denoted as $\frac{1}{\max(X, Y)}$. With the co-ordinate representation of the variables in the various specific areas, any change in the stability of the specific area can now be visualized within this simple construct.

[0089] For a particular year, each of these four components (TR, NR, OR and SR) is respectively contained within any single point (X, Y) in the Risk Box as shown in FIG. 5D. In a dynamic context, for example over a period of ten years, there are ten points in the Risk Box, each carrying the respective static values of each component.

[0090] In order to develop the new measure of risk ratios that has three desirable properties, i.e., symmetric, proportional and non-invariant, the Risk Box and new concepts in ratio analysis are introduced as a complementary measure of ratios. The framework is a two-dimensional box. Associated with it are the ratio values of $X_i$ and $Y_i$, each of which is represented as a Cartesian co-ordinate. For expositional purposes, suppose the chosen proxy for risk employs $X_i$ as numerator and $Y_i$ as denominator. For any number $i = 1, 2, \ldots, n$, the proposed Share Risk ($SR_i$) is defined as a function of $X_i$ and $Y_i$. Consider a square two-dimensional space that captures all values in the numerator $X_i$ and denominator $Y_i$ for any variable i and any period t, where $X_i$ and $Y_i$ can be positive, negative or zero (applicable to any level of aggregation such as cross-country studies, cross-sector studies, etc.). Assume for all variables in a sector j that $X_i > 0$ and $Y_i > 0$. All risk component indices, namely Total Risk ($TR = X + Y$), Net Risk ($NR = |X - Y|$), Overlapping Risk ($OR = (X + Y) - |X - Y|$), and lastly the proposed Share measure of Risk (SR) as defined below, are linear functions of $X_i$ and $Y_i$. Suppose that there are n variables, i.e. $i = 1, 2, 3, \ldots, n$, then,

$TR_i, NR_i, OR_i, SR_i > 0$ [Eq. 51]

where,

$SR = \frac{OR}{TR} = \frac{(X + Y) - |X - Y|}{X + Y} = \frac{2\min(X, Y)}{X + Y}$

[0091] The dimensions of the Risk Box are generated by the maximum value of either $X_i$ or $Y_i$ during the period of study. For example, each respective Risk Box will have sides equal to $\max(X_i)$ if $\max(X_i) > \max(Y_i)$, or equal to $\max(Y_i)$ otherwise. Explanation of the dimensions of the box is as defined below.

[0092] A two-dimensional box that encapsulates all of these values for n variables is constructed. Each respective Risk Box will have sides equal to $\max(X_i)$ if $\max(X_i) > \max(Y_i)$, or equal to $\max(Y_i)$ otherwise. A 45° line from the origin bisects the Risk Box into two equal triangles. FIG. 5A illustrates the TR equi lines, and their direction of increase is

$\forall (X_i, Y_i):\ TR_{i-1} < TR_i < TR_{i+1}$ [Eq. 55]

Therefore, $Y = -X + TR$ [Eq. 56]

[0093] Hence, the locus of equi TR is perpendicular to the axis of symmetry. This is the risk components' axis of symmetry. The two triangular planes in the Risk Box consist of a lower triangle containing co-ordinate points $(X_i, Y_i)$ where $X_i > Y_i$, and an upper triangle containing co-ordinate points $(X_i, Y_i)$ where $Y_i > X_i$. A fixed value TR = TR* implies that X = TR* - Y and a gradient m equal to minus unity. Hence, the locus of equi TR* is perpendicular to the axis of symmetry.

[0094] Recalling that NR = |X - Y|, the 45° line can be regarded as the contour of the value NR* = 0. For a positive value NR* > 0, above the central 45° line, Y - X = NR*. Therefore, X = Y - NR*, which also slopes upward at 45°, meeting the vertical Y-axis. Below the 45° line through the origin, there is another segment of the same contour, the line Y = X - NR*, which meets the horizontal X-axis. These two 45° lines form the contour corresponding to NR*. Increasing the value of the constant NR* moves both segments further along their respective axes, away from the central NR* = 0 line. Referring to FIG. 5B, as the line of the risk axis is in symmetry with NR, there are similar values for the gradient, i.e. m = 1 and c = NR*, which is constant.

[0095] Considering a Leontief production function, OR = 2 min(X, Y). Below the central 45° line, where X > Y, OR = 2Y, which remains constant for constant Y. Above the line, where Y > X, OR = 2X, which remains constant for constant X. Therefore, referring to FIG. 5C, the equi corresponding to the constant Overlapping Risk (OR*) is an L-shaped line, where the kink occurs along the central 45° line. As OR* increases, the kink point moves up the line, away from the origin. Considering a proposed unit-free $SR = \frac{2\min(X, Y)}{X + Y}$, the following two scenarios are obtained.

[0096] The first scenario occurs above the 45° line, where Y > X and thus $SR = \frac{2X}{X + Y}$. The equi corresponding to a constant value SR* is defined by the relation $SR^*(X + Y) = 2X$, which may be solved for Y to yield $Y = \frac{2 - SR^*}{SR^*}X$. Given a constant value SR*, a ray with a constant slope $\gamma = \frac{2 - SR^*}{SR^*}$ is obtained. Since 0 < SR < 1, then $1 < \gamma < \infty$, and the ray passes between the central 45° line and the vertical axis.

[0097] The second scenario occurs below the central 45° line, where X > Y and thus $SR = \frac{2Y}{X + Y}$. The equi corresponding to a constant value SR* is defined by the relation $SR^*(X + Y) = 2Y$, which may be solved for Y to yield $Y = \frac{SR^*}{2 - SR^*}X$. Therefore, this segment of the equi is a ray from the origin with a constant slope $\gamma' = \frac{SR^*}{2 - SR^*}$. Since 0 < SR < 1, then $0 < \gamma' < 1$, showing that the ray passes between the central 45° line and the horizontal axis.
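The two slopes derived in the scenarios above are reciprocals of one another, which is what makes the pair of rays symmetric about the 45° line. The following sketch computes both slopes for a given SR* and confirms that a point on either ray returns the same SR. The function names are hypothetical.

```python
def share_risk(x, y):
    """SR = 2 * min(X, Y) / (X + Y), for X, Y > 0."""
    return 2 * min(x, y) / (x + y)


def equi_sr_slopes(sr_star):
    """Return (upper slope, lower slope) of the two equi-SR rays, 0 < SR* < 1."""
    upper = (2 - sr_star) / sr_star   # ray above the 45-degree line (Y > X)
    lower = sr_star / (2 - sr_star)   # ray below the 45-degree line (X > Y)
    return upper, lower


upper, lower = equi_sr_slopes(0.5)
sr_upper = share_risk(2.0, 2.0 * upper)   # point (2, 6) on the upper ray
sr_lower = share_risk(6.0, 6.0 * lower)   # point (6, 2) on the lower ray
```

For SR* = 0.5 the slopes are 3 and 1/3, so the rays pass through (2, 6) and (6, 2), mirror images about the line Y = X.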

[0098] Hence, the equi corresponding to a particular value SR* consists of two rays in the positive quadrant, meeting at the origin, with slopes $\gamma$ and $\gamma'$. Referring to FIG. 5D, these rays are shown as OC and OB. The symmetry of the diagram about the central 45° line implies that the angles $\theta_1$ and $\theta_2$ are equal. In FIG. 5D, to show the relationships between the four risk measures and the slopes $\gamma$ and $\gamma'$, consider the rays OC and OB, subtending the angles $\theta_1$ and $\theta_2$ measured from the symmetry axis. A, B, C, and D as shown in FIG. 5D represent points on the risk plane, with A, B, and C sharing equal Total Risk values (TR*). In addition, B, C, and D share equal OR values (OR*). Let

$OA = TR^*$ [Eq. 57]

and define

$TR^* - OR^* = NR^*$ [Eq. 58]

Hence, $\forall X, Y > 0$:

$\tan\theta = \frac{AB}{OA} = \frac{TR^* - OR^*}{TR^*} = 1 - \frac{OR^*}{TR^*} = 1 - SR^*$ [Eq. 59]

$\Rightarrow SR^* = 1 - \tan\theta$ [Eq. 60]

$SR_i = 1 - \frac{X_i - Y_i}{X_i + Y_i}$, where $X_i = Y_i \Rightarrow SR_i = 1$, $\forall X_i < Y_i \Rightarrow SR_i < 2$, and $\forall X_i > Y_i \Rightarrow SR_i > 0$ [Eq. 61]

The expression above confirms that the SR values are constant along any ray from the origin, and the two extreme cases are as follows.

[0099] The first extreme case is when $\theta_1 = 45°$, in which case $\tan\theta = 1 \Rightarrow 1 - SR = 1 \Rightarrow SR = 0$ and the Y value is zero. When $\theta_2 = -45° = -\theta_1$, $\tan\theta = -1 \Rightarrow 1 - SR = -1 \Rightarrow SR = 2$ and the X value is zero.

[00100] The second extreme case is when $\theta_1 = \theta_2 = 0°$, in which case SR = 1 and X = Y. The positive-slope diagonal is the locus of balanced risk, where X = Y, TR equals OR, SR equals unity, and NR equals zero.

[00101] Further, the convergence of equi SR rays to unity as they approach the axis of symmetry from either the net Y value or net X value plane is demonstrated in the following. Firstly, the equation of SR may be an alternative measure to be used other than the aggregated version. From [Eq. 30] and [Eq. 60], $\forall X, Y > 0$:

i) $\forall X < Y \Rightarrow 1 < SR < 2$ for the upper sector, above the 45° line;

ii) $\forall X > Y \Rightarrow 0 < SR < 1$ for the lower sector, below the 45° line.

Therefore, $0 < SR < 2$.

A simple demonstration of the proportionate scaling of SR is as follows:

$SR = f(X, Y) = 1 - \frac{X - Y}{X + Y}$ [Eq. 62]

then,

$\frac{\partial SR}{\partial X} = -\frac{2Y}{(X + Y)^2}$, and as $X \to 0$, then $\frac{\partial SR}{\partial X} \to -\frac{2}{Y}$ [Eq. 64]
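The partial derivatives of the signed index in [Eq. 62] can be checked with a central finite difference. The closed forms used below, ∂SR/∂X = -2Y / (X + Y)² and ∂SR/∂Y = 2X / (X + Y)², follow from the quotient rule; the function names and the step size are assumptions made for this sketch.

```python
def sr_signed(x, y):
    """SR = 1 - (X - Y) / (X + Y)."""
    return 1 - (x - y) / (x + y)


def d_sr_dx(x, y):
    """Closed-form partial derivative of SR with respect to X."""
    return -2 * y / (x + y) ** 2


def d_sr_dy(x, y):
    """Closed-form partial derivative of SR with respect to Y."""
    return 2 * x / (x + y) ** 2


def central_diff(f, x, y, wrt, h=1e-6):
    """Central finite-difference estimate of the partial derivative of f."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


numeric_dx = central_diff(sr_signed, 2.0, 3.0, "x")
numeric_dy = central_diff(sr_signed, 2.0, 3.0, "y")
```

Swapping the arguments shows the "similar but opposite" behaviour of the two sectors: `d_sr_dx(x, y)` equals `-d_sr_dy(y, x)`.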

[00102] Referring to FIG. 5E, the partials presented verify that the rate of change of the upper sector above the 45° line is similar but opposite to that in the lower sector below the 45° line. Hence the Risk Box method exhibits proportionate scaling. It is then posited that this share measure should have symmetric, proportional and scaling properties with respect to both axes, and its values should also exhibit the proportionality feature with respect to its co-ordinate position. In order to show the geometrical relation between $(X_i, Y_i)$ and $SR_i$, $\forall X, Y > 0$, consider the proposed measure of Risk Box:

This may be rewritten as,

Hence;

This shows the geometrical relationship which exhibits a linear equation.

[00103] Consider again [Eq. 61], which is $f(X_i, Y_i)$; for $(-X_i, -Y_i)$, it is

$f(-X_i, -Y_i) = 1 - \frac{-X_i + Y_i}{-X_i - Y_i} = 1 + \frac{-X_i + Y_i}{X_i + Y_i} = f(X_i, Y_i)$ [Eq. 67]

This shows that the Risk Box index is symmetrical about the diagonal $X_i = -Y_i$ for each $X_i = Y_i$. Within this geometric framework, using the criteria of symmetry and proportionality, the characteristic of the ratio being heavy on its denominator as compared to the numerator is illustrated.
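The symmetry stated in [Eq. 67] is easy to confirm numerically: negating both co-ordinates leaves the signed index unchanged. The sketch below also contrasts this with a plain ratio X / Y, whose response to equal and opposite changes in X and Y is not symmetric. The function name and the sample values are hypothetical.

```python
def sr_signed(x, y):
    """SR = 1 - (X - Y) / (X + Y)."""
    return 1 - (x - y) / (x + y)


# Reflection through the origin leaves the index unchanged.
original = sr_signed(3.0, 1.0)
reflected = sr_signed(-3.0, -1.0)

# Swapping X and Y mirrors SR about 1, whereas the plain ratio is not mirrored.
sr_gap_low = 1 - sr_signed(3.0, 1.0)     # distance below 1 when X > Y
sr_gap_high = sr_signed(1.0, 3.0) - 1    # distance above 1 when X < Y
ratio_gap_high = 3.0 / 1.0 - 1           # plain ratio moves by 2 above 1
ratio_gap_low = 1 - 1.0 / 3.0            # but only by about 0.67 below 1
```

The unequal ratio gaps are the "heaviness on the denominator" that the description attributes to ordinary ratios.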

[00104] The SR method may also be used for aggregated and weighted perspectives. Suppose $\forall X, Y > 0$ and a weighted version of the risk measures in accordance with the sectoral significance of each respective sector needs to be presented. The use of weights (W) may be formulated for the proposed risk measure $SR_i = 1 - \frac{X_i - Y_i}{X_i + Y_i}$, which can be:

$0 < SR_i = 1 - \frac{X_i - Y_i}{X_i + Y_i} < 2 \Rightarrow -1 < \frac{X_i - Y_i}{X_i + Y_i} < 1$ [Eq. 68]

Using the common formulation for weighting, this can be defined as:

The [Eq. 69] can be an alternative measure to be used other than the aggregated version in another embodiment of the present invention.

[00105] The identification of the underlying functional relationship between the numerator and denominator reveals the proportionality and symmetrical properties. From the perspective of Y, Share Risk co-ordinates in the top left triangle (to the left of the 45° line) represent cases where Y is higher than X. Similarly, in the bottom right triangle (to the right of the 45° line), Y is lower than X. One benefit of the Risk Box is that any ray from the origin represents a locus of equal SR values. The equi-SR ray is exemplified in the Risk Box as shown in FIG. 5F.

[00106] However, there are limitations in the use of the Risk Box index in a dynamic context. To illustrate this limitation, assume that points A (SR = 0.55), B and C in FIG. 5F represent the X and Y co-ordinates for three individual variables. A co-ordinate change from A to B involves only a small change in the value of the Risk Box index (e.g. SR = 0.85), indicating an increase in X. This scenario is in turn reversed in the case of a co-ordinate change from A to C involving changes in Y, while the Risk Box index at C remains unchanged with respect to B.

[00107] Suppose $R_T$ is a function of the independent variables B and M, that is

$R_T = R_T(B, M)$ [Eq. 70]

The total differential of $R_T$ is given by:

$dR_T = X\,dB + Y\,dM$ [Eq. 71]

If the [Eq. 71] is needed to determine an expression for X and Y, and if there is,

$\left(\frac{\partial R_T}{\partial B}\right)_M = X$

$\left(\frac{\partial R_T}{\partial M}\right)_B = Y$ [Eq. 73]

$R_T = 1 - \frac{B - M}{B + M}$ [Eq. 74]

f(B,M)) _(B-M)^l(B+M) + (B+M)^{(B-M)

[Eq.75]

dB J (β+Μ) f(B, M) ) (XdB - YdM)(B + M) + (XdB + YdM)(B - M)

[Eq.76]

K dM J_B (B+M)² f(B, M) X(B + M) + X(B - M) X2B

[Eq.77]

dM (B+M)² (B+M)¹ And in other case,

f (B, M) ) (B - M)(B + M + (B+ M)(B - Mj

[Eq.78]

dM (B+M)² ' f{B, M) _ {XdB + YdM)(B - M) + {XdB - YdM){B + M)

dM J I) (B + M)

f{B, M) ) -Y(B + M) + Y{B - M) -72 M

[Eq. 80]

dM J n (B + MY (B + M)

[00108] From the Risk Box methodology presented, the range of Risk Box values associated with any values of X and Y for a specific area's landslide performance is obtained. Considering the positive and negative values for X and Y and the respective value for the Risk Box, FIG. 5G exemplifies a generalized Risk Box in a new box. From FIG. 5G, a similar risk box is reproduced as shown in FIG. 5D.

[00109] Referring back to FIG. 5D, the line AOB (which is similar to the 45° line) represents where $X_t = Y_t$. Values for X (numerator) and Y (denominator) are both associated with an initial central point O and may move to any other single point on either side for any negative or positive value, that is, to the right-hand side or left-hand side of FIG. 5G.

[00110] Along the movements to the sides of FIG. 5G, there may be the following cases:

if ∀X_i, Y_i > 0 and X_i > Y_i, then 0 < SR < 1; and

if ∀X_i, Y_i > 0 and X_i < Y_i, then −1 < SR < 0.

[00111] Although the Risk Box is useful in its own right, the implications of negative values and of changes in X and Y need to be examined, so a new version of the Risk Box index needs to be constructed. The solution is to translate the Risk Box into another space named the LDPI. This allows the data analyst or researcher to visualize the evolution of changes in intensity of dynamic counters where both X and Y are changing. Therefore, the LDPI is constructed to capture all changes in the units of X and Y for any variable, where a change in X (ΔX) and a change in Y (ΔY) can be positive, negative, or zero. Let the horizontal axis consist of the set of all ΔX and the vertical axis of the set of all ΔY for n specific areas (i = {1, 2, 3, ..., n}). The dimensions of the LDPI are central to the construction of the Share Risk index.

[00112] In FIG. 5G, the essential ingredient is that the length of any side is set at two times the largest absolute change in the values of X or Y, whichever is bigger, recorded in the considered data set. Y is depicted on the vertical axis (positive or negative ΔY) and X on the horizontal axis (positive or negative ΔX).

[00113] Therefore, the following criteria for the new measure are discussed in the context of X and Y. Firstly, the greater the sectoral disparity in values, the greater the factor market disruption. Thus, an index should be an increasing function of the change in values (monotonic). Secondly, the factor reallocation requirements are associated with a given level of X or Y changes and equal risk partners. Thus, the risk value associated with an increase in factors is equal to that associated with a contraction in factors (consistency).

[00114] Following the static framework (Risk Box) as shown in FIGs. 5A-5G, which is a two-dimensional box, a new dynamic geometric device named the LDPI is introduced. This new approach allows visualization of the evolution of the transformation associated with changes in ratio values, in which the pair of values of each risk ratio (X_i, Y_i) is represented as Cartesian co-ordinates. Suppose the proxy chosen for a risk employs X_i as the numerator and Y_i as the denominator of the ratio X_i/Y_i. For any number of specific areas ∀i = 1, 2, 3, ..., n, the proposed LDPI (LDPI_i) is defined as a function of X_i and Y_i.

[00115] Consider a square two-dimensional space that captures all changes in the numerator X_i and the denominator Y_i for any variable i and any period t, where changes in X (ΔX) and changes in Y (ΔY) can be positive, negative or zero. Let the risk flows for any hypothetical variable i consist of the set of all ΔX and ΔY. Ratio values are usually available at uniform discrete time intervals, such as annually or quarterly. The dimensions of the LDPI are set with respect to max(ΔX) and max(ΔY). The essential length of any side is set at two times the largest absolute change, whichever is bigger, of the numerator or denominator values recorded during the considered period t. Correspondingly, the side length of the LDPI is 2 × max(max|ΔX_i|) = 2L if the largest absolute value is from the ΔX_i values, or 2 × max(max|ΔY_i|) = 2L if the largest absolute value is from the ΔY_i values, where L is half the length of one side of the LDPI.

[00116] Referring to FIG. 5H, the ΔY_i values are depicted on the vertical axis (±ΔY) and the ΔX_i values on the horizontal axis (±ΔX), with the axis limits labeled (±ΔY_max) and (±ΔX_max) respectively. The actual values of ±ΔX_max and ±ΔY_max depend on which of the two is largest. This value is then applied to both axes to ensure a perfect square.

[00117] Consider the location of an arbitrary risk change co-ordinate (ΔX, ΔY). The lower and upper triangles ADC and ABC are defined as the Net Risk for ΔX_i (NRX) and the Net Risk for ΔY_i (NRY) respectively. The axes are labeled in accordance with the Cartesian plane, so the LDPI comprises four quadrants I-IV. The origin represents the unique (ΔX_i, ΔY_i) = 0 case. Quadrant I includes all positive changes and quadrant III all negative changes in a period of study. Quadrant II comprises negative ΔX and positive ΔY, while quadrant IV comprises negative ΔY and positive ΔX. The 45° line AOC is that of perfectly matched risk changes, ΔX_i = ΔY_i, and zero Net values. This positively sloped diagonal is the locus of balanced risk, where ΔX_i = ΔY_i and LDPI_i equals zero. It forms the axis of symmetry of the risk components.

[00118] Following thereafter, a plurality of lines parallel to the 45° line AOC are termed risk lines. For points such as K, L and J as shown in FIG. 5H, J and L lie on the same risk line and have equal risk values, while K has a higher risk than points J and L. Accordingly, ΔY_i values will fall and ΔX_i values will remain unchanged in period M to P, while ΔX_i values will rise and ΔY_i values will remain unchanged in period M to N. Correspondingly, the further a point such as P or M lies from the AOC line, the greater the risk.

[00119] One of the primary innovations of the LDPI is its scaling and ranking ability, which stems directly from the LDPI construction: the side of the LDPI is two times the absolute maximum of the largest change for the period of study. That is equivalent to 2(L) in FIG. 5H. It is important to note that the ΔX or ΔY value in the numerator and the denominator will only be equal when either ΔX or ΔY is also the largest change during the period of study.

[00120] Considering that changes are a monotonically increasing function; that risk value requirements for both ΔX_i values and ΔY_i values are equal; and constructing a dynamic measure of risk that satisfies criteria (I)-(IV) for variables ∀i = 1, 2, 3, ..., n and ∀(X, Y) ∈ ℝ, then:

-2\max\{\max|\Delta X_n|, \max|\Delta Y_n|\} \le \Delta X_i - \Delta Y_i \le 2\max\{\max|\Delta X_n|, \max|\Delta Y_n|\} [Eq. 81]

Thus, in order to have a scaled and consistent estimator, the LDPI is given by:

\mathrm{LDPI}_i = \frac{\Delta X_i - \Delta Y_i}{2\max\{\max|\Delta X_n|, \max|\Delta Y_n|\}} = \frac{1}{2L}(\Delta X_i - \Delta Y_i) [Eq. 82]

Thus the LDPI has a range:

-1 \le \mathrm{LDPI}_i = \frac{\Delta X_i - \Delta Y_i}{2\max\{\max|\Delta X_n|, \max|\Delta Y_n|\}} \le 1

[00121] Consider the proposed measure of the LDPI from [Eq. 82], which can be rewritten as:

\mathrm{LDPI}_i = \frac{\Delta X_i}{2\max\{\max|\Delta X_n|, \max|\Delta Y_n|\}} - \frac{\Delta Y_i}{2\max\{\max|\Delta X_n|, \max|\Delta Y_n|\}} [Eq. 83]

Hence,

\Delta X_i = \Delta Y_i + 2\max\{\max|\Delta X_n|, \max|\Delta Y_n|\} \times \mathrm{LDPI}_i = \Delta Y_i + 2L \times \mathrm{LDPI}_i [Eq. 84]

This shows that for every LDPI_i there exists a unique straight line with a slope of unity and an intercept of 2(max{max|ΔX_n|, max|ΔY_n|}) × LDPI_i. This implies that LDPI values will be the same for every point (ΔX_i, ΔY_i) on the same line.
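The index in [Eq. 82] and the risk-line property in [Eq. 84] can be illustrated with a short computation. This is a minimal sketch, assuming the changes ΔX_i and ΔY_i are supplied as two equal-length lists. The function name ldpi_values, the sample numbers, and the choice to return zero when no change is recorded anywhere are illustrative assumptions.

```python
def ldpi_values(dx, dy):
    """Return LDPI_i = (dX_i - dY_i) / (2L) for paired changes (Eq. 82).

    L is the largest absolute change found in either series, so 2L is
    the side length of the square LDPI box.
    """
    if len(dx) != len(dy):
        raise ValueError("dx and dy must have the same length")
    if not dx:
        return []
    half_side = max(max(abs(v) for v in dx), max(abs(v) for v in dy))
    if half_side == 0:
        # Assumption: with no change recorded anywhere, every point sits
        # at the origin, so the index is taken as zero.
        return [0.0 for _ in dx]
    return [(x - y) / (2 * half_side) for x, y in zip(dx, dy)]


# Hypothetical changes for four specific areas.
dx = [4.0, -2.0, 1.0, 3.0]
dy = [1.0, -5.0, 1.0, -4.0]
values = ldpi_values(dx, dy)
print(values)
```

In this sample the first two areas have the same difference ΔX_i − ΔY_i = 3 and therefore lie on the same risk line with the same LDPI value, while the third area lies on the 45° line and has an LDPI of zero.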

[00122] Consider [Eq. 82] again, written as f(ΔX_i, ΔY_i). For (−ΔX_i, −ΔY_i), it will be:

f(-\Delta X_i, -\Delta Y_i) = \frac{-\Delta X_i - (-\Delta Y_i)}{2\max\{\max|\Delta X_n|, \max|\Delta Y_n|\}} = -\frac{\Delta X_i - \Delta Y_i}{2\max\{\max|\Delta X_n|, \max|\Delta Y_n|\}} = -f(\Delta X_i, \Delta Y_i)

This shows that the LDPI provides a symmetrical space about the diagonal for each ΔX_i.

[00123] Scaling by the largest value for a given time scale (that could be months, years or even decades) allows the data analyst or researcher to observe the progress of risk change over time. A partial presentation of the [Eq. 82] computes to:

and

which verifies the rate of change of the LDPI in the upper sector of FIG. 5H, which

Therefore, the LDPI exhibits proportional scaling.
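As a sketch of the partial derivatives of [Eq. 82], assuming that L is held fixed over the period of study (that is, the point being varied is not the one that sets the maximum), differentiation with respect to each change would give:

```latex
\frac{\partial\,\mathrm{LDPI}_i}{\partial\,\Delta X_i} = \frac{1}{2L},
\qquad
\frac{\partial\,\mathrm{LDPI}_i}{\partial\,\Delta Y_i} = -\frac{1}{2L},
\qquad
L = \max\{\max|\Delta X_n|,\ \max|\Delta Y_n|\}
```

Under that assumption both rates are constant and of equal magnitude, so equal and opposite changes in ΔX_i and ΔY_i move the index by equal and opposite amounts.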

[00124] FIG. 5I exemplifies the LDPI exhibiting proportional scaling. When the LDPI is used to measure changing levels, it is summed at a disaggregated level. Therefore, appropriate weights to measure the changes need to be chosen. The solution, an index weighted by the significance of the sector, is as follows. Referring to [Eq. 82] and employing a common weighting function, the weighted LDPI_i is

\mathrm{LDPI}_{wi} = \frac{\mathrm{LDPI}_i}{\sum(\mathrm{LDPI}_i)} [Eq. 88]

[00125] [Eq. 88] provides a multilayered view of the changes in each ratio value. Thus LDPI^* (as the topmost layer) will encapsulate all the LDPI_i cells. The LDPI can be applied equally to a variety of distributional forms, making the technique particularly useful in ratio analysis, where a diverse set of distributional functions has been identified.

[00126] For any ΔX_i and ΔY_i, the difference (ΔX_i − ΔY_i) has a range of −∞ < (ΔX_i − ΔY_i) < +∞. However, according to the LDPI box and the derivatives and measures with respect to [Eq. 82], the LDPI provides a scaled measure whose values lie between negative unity and unity: −1 ≤ LDPI_i ≤ +1.

[00127] As the new transformation (LDPI) is naturally bounded and unaffected by the distance between observations, any outlier effect will be reduced. Similarly, for distance data containing white noise, the sensitivity and power of statistical tests are improved. Negative values will be transformed to a specific variation, thus removing the need to delete negative data as was done in previous studies. Besides, proportionality is a theoretical assumption that may not in fact hold, and the degree of departure varies across industries and size classes. Thus, if the relationship between the elements of a ratio is constant over time, size and industry, then the proportionality effect will be satisfied for ratios by using the LDPI. Given the properties of the LDPI, it is suggested to utilize this methodology for ratio analysis, as it can provide a conceptual and complementary methodological solution to many problems associated with the use of ratios.

[00128] The specific indicators (X_1, X_2, ..., X_N) of the landslide disaster are integrated two by two into a plurality of points to provide the multidimensional space, which includes all the variables two by two and measures them all together. All of the variables are examined to find the upper and lower boundary for each variable, which can be recognized as a disaster risk variable. All of the points that lie in a 3-dimensional space are in a plane geometry, which can be identified by the rules and equations as follows:

\mathrm{LDPI}_3 = \frac{\Delta X_i - \Delta Z_i}{2\max\{\max|\Delta X|_n, \max|\Delta Z|_n\}} = \frac{1}{2L}(\Delta X_i - \Delta Z_i)

[00129] FIG. 5K illustrates a 3-dimensional (3D) integrated index using the rules for "Vector Equations of Plane Geometry". In predicting landslide disasters, it is more applicable to utilize the "Co-Planar Vectors". This results in an equation of a plane in 3D space. The distance from the origin is measured as follows:

which is the distance of the plane vector from the origin.

[00130] Referring to FIG. 5L, the same procedure runs for the multidimensional space, whose dimensions are considered as (x, y, z, ..., n). Additionally, similar to the 3D equations and vectors, multi-vectors and multi-dimensions are provided. In this multi-dimensional space, the distance will be measured from the origin.
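The two-by-two integration of indicators described above can be sketched as follows. This is a minimal illustration, assuming each indicator's changes are held in a list with one entry per specific area and that every pair of indicators is scaled by its own largest absolute change, as in the LDPI_3 expression. The function name pairwise_ldpi and the sample values are hypothetical, and the sketch stops at the pairwise values without attempting the distance-from-origin step.

```python
from itertools import combinations


def pairwise_ldpi(changes):
    """Compute the LDPI for every pair of indicators.

    changes maps an indicator name to a list of changes, one per
    specific area. Returns {(name_a, name_b): [LDPI per area]}.
    """
    result = {}
    for (name_a, a), (name_b, b) in combinations(changes.items(), 2):
        # L for this pair: largest absolute change in either indicator.
        half_side = max(max(abs(v) for v in a), max(abs(v) for v in b))
        if half_side == 0:
            # Assumption: no recorded change means an index of zero.
            result[(name_a, name_b)] = [0.0 for _ in a]
        else:
            result[(name_a, name_b)] = [
                (p - q) / (2 * half_side) for p, q in zip(a, b)
            ]
    return result


# Hypothetical changes in three indicators for two specific areas.
sample = {"X": [2.0, -1.0], "Y": [1.0, 1.0], "Z": [-4.0, 2.0]}
pairs = pairwise_ldpi(sample)
for pair, scores in pairs.items():
    print(pair, scores)
```

With N indicators this produces N(N−1)/2 pairwise indices per area, each bounded between −1 and +1.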

[00131] Step 405 in the method 400 tabulates all of the data collected from step 401 into a tabular form. FIGs. 6A-6B provide screen shot images of tabulating and representing the data in the LDPS as one embodiment of the present invention. The LDPS is based on the LDPI, which provides visualization and ranks each ratio combination separately. The LDPS also allows researchers or data analysts to view charts that may be used to emphasize different points of view of the data.

[00132] The LDPS comprises a main page where a user (e.g. researcher or data analyst) chooses the source input tabular file with any desired combination; a display in which any selected combination is drawn and each point provides its LDPI value; and a window which provides a ranking of all the specific areas with their names and LDPI values.

[00133] In FIG. 6A, a source file is first chosen and the Excel file including the names of the various specific areas and related data for each landslide is loaded. In FIG. 6B, after the data set is loaded, the user chooses any combination by pressing on it, which allows the user to view that particular combination for all specific areas in the inputted data set. By positioning a mouse pointer over any specified point in the LDPS, the corresponding LDPI value appears at the bottom right side of the window. In FIG. 6C, another display automatically computes the LDPI value for each specific area. The names of the specific areas (Point A, Point B, etc.) are then ranked with respect to their LDPI values.
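The load, compute and rank sequence described for the LDPS might be sketched as below. This is not the LDPS itself: it assumes the tabular source has been exported as CSV text rather than read from an Excel file, the column names area, dX and dY are hypothetical, and ranking by the absolute LDPI value (distance from the 45° line) is an assumption drawn from the description of risk lines.

```python
import csv
import io

# Hypothetical tabular input: one row per specific area.
SAMPLE = """area,dX,dY
Point A,4.0,1.0
Point B,-2.0,3.0
Point C,1.0,1.0
"""


def rank_areas(csv_text):
    """Return (area, LDPI) pairs sorted from highest to lowest |LDPI|."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return []
    dx = [float(r["dX"]) for r in rows]
    dy = [float(r["dY"]) for r in rows]
    # L: largest absolute change in either column.
    half_side = max(max(abs(v) for v in dx), max(abs(v) for v in dy))
    scores = [
        (x - y) / (2 * half_side) if half_side else 0.0
        for x, y in zip(dx, dy)
    ]
    names = [r["area"] for r in rows]
    # Assumption: areas farther from the 45-degree line rank as riskier.
    return sorted(zip(names, scores), key=lambda p: abs(p[1]), reverse=True)


ranking = rank_areas(SAMPLE)
for name, score in ranking:
    print(name, score)
```

Sorting on the signed value instead of the absolute value would be a one-line change if the direction of imbalance matters for the ranking.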

[00134] The LDPS illustrates all the specific areas that may have a landslide disaster occurring for every specific combination over the sample data set, and provides the ranking of all the specific areas in another window, indicating the satisfactory combinations. FIGs. 6A-6C illustrate the various situations of the specific areas. The LDPS is a simple and accessible computer program. It is adapted to provide a dynamic representation of a landslide's activities, utilizing control and display data sets. It also allows the user to explore and visualize how the landslide is operating in various aspects, or to identify the disaster risk landslide area based on the rankings of the specific areas.

[00135] The above description illustrates various embodiments of the present invention along with examples of how aspects of the present invention may be implemented. While specific embodiments have been described and illustrated, it is understood that many changes, modifications, variations and combinations thereof could be made to the present invention without departing from the scope of the present invention. The above examples, embodiments, instruction semantics, and drawings should not be deemed to be the only embodiments, and are presented to illustrate the flexibility and advantages of the present invention as defined by the following claims:

Claims

1. A method for predicting a landslide disaster on a specific area, the method comprising: pre-determining data having variables of the specific area that are associated with landslides, wherein the variables include geological data, hydrological data, and landslide historical data; pre-classifying each of the variables with specific indicators and boundaries, wherein the specific indicator is associated with a rating of the respective variables, wherein the boundaries include an upper and lower boundary for each of the variables; generating a mathematical multi-dimensional index for the specific area, wherein the mathematical multi-dimensional index comprises visualizations and rankings of the specific areas through a Cartesian co-ordinate, wherein the Cartesian co-ordinate represents a risk ratio of the specific indicator; ranking the specific areas based on the Cartesian co-ordinates; and identifying a disaster risk landslide area based on the rankings of the specific areas generated from the mathematical multi-dimensional index.

2. The method according to claim 1, wherein the pre-classifying of the variables further comprises a plurality of statistical approaches, wherein the plurality of statistical approaches include a Multiple Discriminant Analysis (MDA), a Logistic Regression, a Robust Logistic Regression and a Genetic Programming.

3. The method according to claim 1, wherein the upper and lower boundary for each of the variables determines a high risk and a low risk respectively.

4. The method according to claim 1, wherein the high risk is associated with a higher probability of the landslide disaster occurring.

5. The method according to claim 1, wherein the low risk is associated with a lower probability of the landslide disaster occurring.

6. The method according to claim 1, wherein the mathematical multi-dimensional index is symmetric; proportionate; and non-invariant.

7. The method according to claim 1, wherein the mathematical multi-dimensional index is further compared with a Limit Equilibrium Method (LEM) to determine accuracy.

8. The method according to claim 1, wherein the variables in the specific area are individually measured without any limitation and deletion.