DSS for implementing systemic approach to forecasting

A computer-based decision support system is proposed whose basic tasks are modeling and forecasting of financial processes and credit risk estimation. The system is developed on the basis of system analysis principles, i


Introduction
Acceptable forecasting quality for dynamic processes can often be reached with commercially available data processing instruments. For example, the well-known SAS system contains sophisticated data processing procedures that are usually capable of producing high quality final results [1]. However, such instruments are usually very costly, require special training for solving specific problems, and need powerful computers for implementation. All this restricts access to them outside the banking system and large enterprises. Besides, new modeling and forecasting techniques that appear from time to time in specialized publications need to be appropriately implemented and tested. To solve this problem it would be reasonable to develop simpler and much less costly modeling systems that take into consideration modern principles of system analysis. The systemic approach is based on the results of applying systems analysis methodology.
Processing of statistical data represented by time series is usually accompanied by uncertainties of various kinds and nature. More particularly, these are (at least) uncertainties of structural, statistical and parametric form. Structural uncertainties are usually encountered when analysis of the time series data does not reveal a clear structure for the model describing it. Recall that the notion of model structure includes the following elements: dimensionality (number of equations comprising a model); model order (highest order of a model equation); input delay time (lag) for independent variables (regressors); nonlinearity and its type (nonlinearity in variables or in parameters); stochastic disturbance and its type (distribution and its parameters) [2]. For example, the model order or input delay time (lags for independent variables) cannot be determined exactly enough using correlation analysis or appropriate time lag estimation algorithms. It is also not always possible to determine the type of nonlinearity uniquely, especially when processing short samples. Statistical uncertainties exist because it is impossible to determine uniquely the type of probability distribution for the stochastic disturbances and the collected data itself and, as a consequence, the parameters of the distributions, the degree of nonstationarity of the stochastic processes, etc. As a consequence of the two previous types of uncertainties we can trace the parametric uncertainties: mathematical model parameter estimates computed from statistical data can be inconsistent, biased, and inefficient.
Thus, the existence of process and data uncertainties, the necessity for hierarchical organization of the data processing system, and the necessity for functional completeness of the whole processing require development and application of the systemic approach [3,4], which makes it possible to solve many substantial problems encountered in statistical data analysis, model construction, forecasting and generation of decision alternatives. In this study we consider some possibilities for constructing a data processing scheme based on the principles of the systemic approach.

Problem Statement
The purpose of the study is as follows: to consider some principles of systems analysis that are appropriate for solving the problem of successful short-term forecasting; to develop an efficient data processing scheme for implementation in decision support systems based on these principles; to consider some uncertainties inherent in the model building and forecasting process and to show the ways for their elimination; to show the advantages of the approach developed.

Some System Analysis Principles Used
For development and implementation of the modeling and forecasting system it is proposed to use the following systems analysis principles [4]: the systemic coordination principle; the principle of procedural completeness; the functional orthogonality principle; the principle of mutual informational dependence; the principle of goal directed correspondence; the principle of functional rationality; the principle of multipurpose generalization; the principle of multifactor adaptation; and the principle of rational supplement. Fig. 1 illustrates the tasks that arise when the systems analysis approach is applied to forecasting.
April 05, 2015
According to the principle of systemic coordination all the techniques, approaches, and algorithms used in the system should be structurally and functionally coordinated, and should be mutually dependent (linked). This way it is possible to reach a unified systematic methodology for data analysis within the modern decision support system (DSS) constructed, and to improve the quality of intermediate and final results.
The principle of procedural completeness guarantees that the DSS developed will provide timely and in-place execution of all necessary functions directed towards data collection (editing and renewing), formalization of a problem statement, model construction, computing forecasts, and estimating the quality of the model and of the forecasts based on it.
Implementation of all computational procedures in the DSS using mutually independent functions corresponds to the principle of functional orthogonality. This approach to DSS construction substantially enhances the computational stability of the system and simplifies its possible further modification and functional expansion. According to the principle of mutual informational dependence the results generated by each procedure should correspond to the formats and requirements of the other procedures (or modules). This is provided by the respective project development solutions for the system created.
Application of the principle of goal directed correspondence to computational procedures and functions makes it possible to reach a single goal: high quality of the final result in the form of forecast estimates for the process under study, as well as of the decisions based on the forecasts.
The principle of functional rationality prevents duplication of separate DSS functions. Applying this principle saves the respective computational resources.
According to the principle of multipurpose generalization all the functional modules developed possess the necessary degree of generalization, which makes it possible to reach high quality solutions for a set of distinct problems that belong to the same class (say, high quality forecasting for linear and nonlinear nonstationary processes). Among these problems are the following: accumulation and preliminary processing of data; estimation of structure and parameters for the necessary mathematical models; constructing forecasting functions and computing the forecasts; selecting the best results using appropriate sets of statistical quality criteria.
The principle of multifactor adaptation is directed towards adapting the computing procedures to the modeling of processes of different complexity, depending on the completeness of available information and user requirements. The adaptation is performed in the process of model structure and parameter estimation, i.e. the whole identification process of an object under study is represented by a set of adaptive procedures directed towards reaching the main goal: constructing an adequate model and computing high quality forecasts.
The use of the rational supplement principle makes it possible to widen the sphere of application of the DSS constructed by expanding it with new data types, computing procedures and criteria. The new procedures could be directed towards additional preliminary data processing, model structure and parameter estimation, generation of forecasting functions, and selection of the best result for further use in generating an appropriate decision.
A hierarchical structure of the DSS proposed for analysis of data in the form of time series is given in Fig. 2. Implementation of the system analysis principles mentioned above in the DSS favors its functional flexibility, computational reliability, enhanced quality of the final results, a longer life span of the system in general, and simpler procedures for eliminating drawbacks and making modifications.
Consider some functional elements of the general hierarchical structure of the DSS proposed that uses some principles of the systemic approach (Fig. 1). After preliminary data processing directed towards preparing the data for further model building, statistical analysis of the data is performed with the aim of determining the type of the process under consideration (linear/nonlinear, and stationary/nonstationary) and estimating the model structure. An important and useful feature of the system is that it uses separate sets of statistics for analyzing the quality of data, the adequacy of the model constructed, and the quality of the forecast estimates generated with the model. If the forecasts are used for generating alternative decisions then we include another set of criteria for testing decision quality. This way high quality forecasts are generated and the decisions based on these forecasts are usually acceptable.
Further statistical analysis of data is performed for the following purposes: testing for heteroskedasticity, for integration (presence of trend) and for nonlinearity, preliminary model structure estimation using correlation techniques, lag estimation, and determining the type of data distribution. After this step it is possible to determine the class of the process under study and to estimate the elements of model structure mentioned above in the introduction. The forecasting methods used in the system are as follows: regression analysis (linear and nonlinear models), the group method of data handling (GMDH), fuzzy GMDH, fuzzy logic, appropriate versions of the Kalman filter, neural nets, support vector regression, nearest neighbor and probabilistic type techniques (Bayesian networks, Bayesian regression). Information about the model structure makes it possible to select parameter estimation techniques for the candidate models, among which are: ordinary and nonlinear LS (NLS), recursive LS (RLS), maximum likelihood (ML), and some versions of Markov Chain Monte Carlo (MCMC). Some special optimization techniques are applied when estimating fuzzy GMDH structures.
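As an illustration of preliminary structure estimation with correlation techniques, the following minimal sketch computes the sample autocorrelation function of a time series; lags with large absolute values are natural candidates for the model order. The function name `autocorrelation` is hypothetical and is not taken from the DSS described here.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation coefficients r(0), ..., r(max_lag).

    r(lag) = sum_k (y(k) - mean) * (y(k - lag) - mean) / sum_k (y(k) - mean)^2
    Assumes the series is not constant (non-zero variance).
    """
    n = len(series)
    mean = sum(series) / n
    c0 = sum((y - mean) ** 2 for y in series)
    acf = []
    for lag in range(max_lag + 1):
        c = sum((series[k] - mean) * (series[k - lag] - mean)
                for k in range(lag, n))
        acf.append(c / c0)
    return acf


# A strictly alternating series shows strong negative correlation at lag 1
# and positive correlation at lag 2.
print(autocorrelation([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], max_lag=2))
```

In practice such coefficients would be compared with their confidence bounds before a lag is accepted into the model structure.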
The candidate models estimated are tested with a set of statistical quality criteria, some of which are shown in Fig. 1. One or more acceptable models are used for computing forecasts that are also tested for quality with another set of statistical criteria. Usually two or more forecasting techniques are applied to the same dataset so that the forecasts can be combined to further improve the final estimate.

Uncertainties Identification and Processing
Processing statistical uncertainty. The statistical uncertainties most often met in model development and estimation of forecasts are provoked by the following factors: measurement errors (noise), which are present in practically all cases of data collection independently of the data origin (including economy and finances); stochastic external disturbances that usually negatively influence the process under study and shift it from the desired mode (say, offshore capital transfer from some country, low quality of higher administration, unstable and often changed laws); missing measurements (observations) and outliers; multicollinearity, which requires special data processing techniques to reduce the degree of mutual correlation between separate time series.
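The most common means of fighting measurement noise and external stochastic disturbances are digital and optimal filters (among them the Kalman filter is a widely used instrument) [5]. Digital filters (DF) help to select the frequency band of interest for subsequent processing by passing the time series data through linear structures that can be represented by autoregression (AR) or autoregression with moving average (ARMA) equations of the type:

$$y(k) = a_0 + \sum_{i=1}^{p} a_i\, y(k-i) + \varepsilon(k),$$

$$y(k) = a_0 + \sum_{i=1}^{p} a_i\, y(k-i) + \sum_{j=1}^{q} b_j\, \varepsilon(k-j) + \varepsilon(k),$$

where $y(k)$ is the process value at discrete time $k$; $a_i$, $b_j$ are coefficients; $p$, $q$ are the orders of the autoregression and moving average parts; $\varepsilon(k)$ is a random term.

An appropriately designed adaptive Kalman filter makes it possible to estimate the covariances of the stochastic disturbances and measurement noise as well as to compute short-term forecasts [5,6]. Optimal filter design requires a model of the process (system) under study in the state space form:

$$x(k) = A(k, k-1)\, x(k-1) + B(k, k-1)\, u(k-1) + w(k), \qquad (1)$$

$$z(k) = H(k)\, x(k) + v(k), \qquad (2)$$

where $x(k)$ is the state vector; $u(k-1)$ is the vector of deterministic control inputs; $w(k)$ is the vector of external random disturbances; $z(k)$ is the measurement vector; $v(k)$ is the vector of measurement noise; $A(k, k-1)$, $B(k, k-1)$ and $H(k)$ are the matrices of system dynamics, control coefficients and measurements, respectively. The double argument $(k, k-1)$ means that a quantity is used at moment $k$ but is based on data available up to moment $k-1$. Usually such double argument is not used in text to avoid overloading mathematical expressions with symbols. The main advantage of the model (1), (2) is that it explicitly includes two random processes; consequently it is more adequate to reality than, say, linear regression. The main task of the optimal filter is computing state vector estimates taking into consideration the statistical characteristics (covariances) of the two stochastic processes mentioned. This approach makes it possible to improve the state estimates and to compute estimates of nonmeasurable components of the state vector $x(k)$ if such components exist. The main equation of the filter is as follows:

$$\hat{x}(k) = A\,\hat{x}(k-1) + B\,u(k-1) + K(k)\,\big[z(k) - H A\,\hat{x}(k-1) - H B\,u(k-1)\big],$$

where $K(k)$ is the optimal filter coefficient (gain) in matrix form. The coefficient is computed by minimizing the functional:

$$J = E\big\{[\hat{x}(k) - x(k)]^{T}\,[\hat{x}(k) - x(k)]\big\},$$

where $x(k)$ is the exact value of the state vector that could be found using the deterministic part of model (1). In the linear discrete case the coefficient is rather easily computed by solving the Riccati equation.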
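The predict and correct cycle of the optimal filter can be sketched for the simplest case. The code below is only an illustration under stated assumptions: a scalar, time-invariant model x(k) = a x(k-1) + w(k), z(k) = h x(k) + v(k) without control input, with known noise variances `q` and `r`. The function name `kalman_1d` and all parameter names are hypothetical and are not taken from the DSS.

```python
def kalman_1d(z, a=1.0, h=1.0, q=1e-3, r=0.5, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x(k) = a*x(k-1) + w(k), z(k) = h*x(k) + v(k).

    q, r   -- assumed known variances of w(k) and v(k)
    x0, p0 -- initial state estimate and its error variance
    Returns the list of filtered state estimates, one per measurement.
    """
    x, p = x0, p0
    estimates = []
    for zk in z:
        # Prediction: propagate the estimate and its error variance.
        x_pred = a * x
        p_pred = a * p * a + q
        # Optimal gain K(k); in the scalar case the Riccati recursion
        # reduces to these two variance updates.
        gain = p_pred * h / (h * p_pred * h + r)
        # Correction with the innovation z(k) - h * x_pred.
        x = x_pred + gain * (zk - h * x_pred)
        p = (1.0 - gain * h) * p_pred
        estimates.append(x)
    return estimates


# Noise-free constant measurements: the estimate moves from x0 towards 5.
estimates = kalman_1d([5.0] * 50)
print(estimates[0], estimates[-1])
```

The one-step prediction `a * x` computed inside the loop is the internal forecasting function of the filter; an adaptive version would additionally re-estimate `q` and `r` from the innovations.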
Thus, the optimal filter makes it possible to reduce the uncertainties caused by the influence of the two stochastic processes $w(k)$ and $v(k)$, and to estimate nonmeasurable components of the state vector when the respective components of the covariance matrices are known (we mean the covariance matrix of the system state vector estimation errors). Especially useful are adaptive versions of the filters, which are the most suitable for practical applications (in on-line and off-line modes of operation). They are suitable for repeated estimation of the system (object) matrices $A(k)$ and $B(k)$ as well as the covariances of the two stochastic processes mentioned [5,7].

Coping with uncertainties appearing due to missing observations. For data in the time series form the most suitable imputation techniques are: simple averaging when it is possible (when only a few values are missing); generation of forecast estimates with a model constructed from the available measurements; generation of the missing (lost) values from distributions whose form and parameters are again determined from the available part of the data; the use of optimization techniques, say appropriate forms of the EM (expectation maximization) algorithm; exponential smoothing, etc. The simplest model that could be used for generating forecasts is AR(1):

$$y(k) = a_0 + a_1\, y(k-1) + \varepsilon(k).$$

The conditional forecast for $s$ steps ahead is

$$\hat{y}(k+s) = a_0 \sum_{i=0}^{s-1} a_1^{i} + a_1^{s}\, y(k),$$

and for $|a_1| < 1$

$$\lim_{s \to \infty} \hat{y}(k+s) = \frac{a_0}{1 - a_1}.$$

The last expression means that for stationary AR or ARMA processes the conditional forecasts asymptotically ($s \to \infty$) converge to the unconditional mean (long-term forecast). It should also be mentioned here that the optimal filter can be used for missing data imputation as well, because it contains an "internal" forecasting function that makes it possible to generate quality short-term forecasts.
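A minimal sketch of imputation with AR(1) forecasts follows. It assumes that the coefficients `a0` and `a1` have already been estimated from the available part of the data and that the first observation is present; the function names are hypothetical.

```python
def ar1_forecast(y_last, a0, a1, steps):
    """Conditional forecasts 1..steps ahead for y(k) = a0 + a1*y(k-1) + e(k),
    obtained by iterating the deterministic part of the model."""
    forecasts = []
    y = y_last
    for _ in range(steps):
        y = a0 + a1 * y
        forecasts.append(y)
    return forecasts


def impute_missing(series, a0, a1):
    """Replace None entries by one-step AR(1) forecasts from the previous
    (observed or already imputed) value. The first entry must be observed."""
    filled = list(series)
    for k in range(1, len(filled)):
        if filled[k] is None:
            filled[k] = a0 + a1 * filled[k - 1]
    return filled


print(ar1_forecast(10.0, a0=2.0, a1=0.5, steps=3))      # [7.0, 5.5, 4.75]
print(impute_missing([10.0, None, None, 4.0], 2.0, 0.5))  # [10.0, 7.0, 5.5, 4.0]
```

With `a0 = 2.0` and `a1 = 0.5` the forecasts approach 2.0 / (1 - 0.5) = 4.0 as the horizon grows, which is the unconditional mean of this stationary process.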
Further reduction of the uncertainty is possible by applying several forecasting techniques to the same problem and subsequently combining the separate forecasts using appropriate weighting coefficients. The best results of combining the forecasts are achieved when the variances of forecasting errors for the different forecasting techniques do not differ substantially.
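The text does not fix a particular weighting scheme. As one common choice, the sketch below assumes weights inversely proportional to the forecast error variance of each technique; the function names and the numbers in the usage line are illustrative only.

```python
def inverse_variance_weights(error_variances):
    """Weights proportional to 1 / variance, normalized to sum to one."""
    inverse = [1.0 / v for v in error_variances]
    total = sum(inverse)
    return [w / total for w in inverse]


def combine_forecasts(forecasts, error_variances):
    """Weighted combination of forecasts of the same quantity produced by
    different techniques."""
    weights = inverse_variance_weights(error_variances)
    return sum(w * f for w, f in zip(weights, forecasts))


# Two techniques forecast the same value; the first has the smaller
# error variance and therefore receives the larger weight.
print(inverse_variance_weights([1.0, 3.0]))
print(combine_forecasts([10.0, 14.0], [1.0, 3.0]))
```

When the error variances are equal the scheme reduces to simple averaging of the forecasts.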
Coping with uncertainties of model parameter estimates. Usually uncertainties of model parameter estimates, such as bias and inconsistency, arise when the data carry little information or do not follow the normal distribution that is required when LS is applied for parameter estimation. This situation may also take place in the case of multicollinearity of regressors, or of a substantial process nonlinearity that for some reason was not taken into account when the model was constructed. When the sample size is not satisfactory for model construction, the sample can be expanded by applying special techniques, simulation can be used, or special model building techniques, such as GMDH, can be applied. Very often GMDH produces results of acceptable quality with short samples. If the data do not follow the normal distribution, then the ML technique or appropriate Markov Chain Monte Carlo (MCMC) procedures could be used [8]. The latter techniques can be applied with quite modest computational expense when the number of parameters is not large.
Dealing with model structure uncertainties. When using a DSS, the model structure should practically always be estimated from data. This means that the elements of model structure almost always take only approximate values. When a model is constructed for forecasting we build several candidates and select the best of them with a set of model quality statistics. Generally we could define the following techniques to fight structural uncertainties: gradual improvement of the model order (AR(p) or ARMA(p, q)) by applying an adaptive approach to modeling and automatic search for the "best" structure using complex statistical quality criteria; adaptive estimation (improvement) of input delay time (lag) and of the data distribution type with its parameters; describing detected process nonlinearities with alternative analytical forms with subsequent estimation of model adequacy and forecast quality.
An example of a complex model and forecast criterion may look as follows:

$$J = \left|1 - R^2\right| + \alpha \ln\Big(\sum_{k=1}^{N} e^2(k)\Big) + \left|2 - DW\right| + \beta \ln(MAPE) + U \;\to\; \min_{\hat{\theta}_i},$$

where $R^2$ is the coefficient of determination; $\sum_{k=1}^{N} e^2(k)$ is the sum of squared model errors; $DW$ is the Durbin-Watson statistic; $MAPE$ is the mean absolute percentage error for one-step-ahead forecasts; $U$ is the Theil coefficient that measures the forecasting characteristic of a model; $\alpha$, $\beta$ are appropriately selected weighting coefficients; $\hat{\theta}_i$ is the parameter vector of the $i$-th candidate model. A criterion of this type is used for automatic selection of the best candidate model. The criterion also allows operation of the DSS in adaptive mode. Certainly, other forms of the complex criteria are possible. While constructing the criterion it is important not to overweight separate terms on the right-hand side.
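Automatic selection of a candidate model with such a criterion can be sketched as follows. The code assumes one possible form of the criterion, J = |1 - R^2| + alpha ln(SSE) + |2 - DW| + beta ln(MAPE) + U, and one common definition of the Theil coefficient; the function names, the default weights and the data layout are illustrative assumptions, not the implementation used in the DSS.

```python
import math


def sse(errors):
    return sum(e * e for e in errors)


def durbin_watson(errors):
    num = sum((errors[k] - errors[k - 1]) ** 2 for k in range(1, len(errors)))
    return num / sse(errors)


def r_squared(actual, fitted):
    mean = sum(actual) / len(actual)
    ss_tot = sum((y - mean) ** 2 for y in actual)
    ss_res = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    return 1.0 - ss_res / ss_tot


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values non-zero)."""
    terms = [abs((y - f) / y) for y, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)


def theil_u(actual, forecast):
    """Theil inequality coefficient in the form bounded between 0 and 1."""
    n = len(actual)
    rmse = math.sqrt(sum((y - f) ** 2 for y, f in zip(actual, forecast)) / n)
    scale = (math.sqrt(sum(y * y for y in actual) / n)
             + math.sqrt(sum(f * f for f in forecast) / n))
    return rmse / scale


def complex_criterion(actual, fitted, test_actual, test_forecast,
                      alpha=1.0, beta=1.0):
    """Assumed criterion; undefined for a perfect fit because of ln(0)."""
    errors = [y - f for y, f in zip(actual, fitted)]
    return (abs(1.0 - r_squared(actual, fitted))
            + alpha * math.log(sse(errors))
            + abs(2.0 - durbin_watson(errors))
            + beta * math.log(mape(test_actual, test_forecast))
            + theil_u(test_actual, test_forecast))


def select_best(candidates, alpha=1.0, beta=1.0):
    """candidates: name -> (actual, fitted, test_actual, test_forecast)."""
    return min(candidates,
               key=lambda name: complex_criterion(*candidates[name],
                                                  alpha=alpha, beta=beta))
```

The candidate with the smallest value of the criterion is returned; changing `alpha` and `beta` shifts the balance between in-sample fit and forecast accuracy.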
Coping with uncertainties of a level (amplitude) type. The use of random variables (i.e. with random amplitude or level) and/or non-measurable variables makes it necessary to use fuzzy sets for describing such situations. A variable with random amplitude can be described with some probability distribution if measurements are available or arrive for analysis within an acceptable time span. However, some variables cannot be measured (registered) in principle, say the amount of shadow capital that "disappears" offshore every month, or the amount of shadow salaries paid at some company, or a technology parameter that cannot be measured on-line due to the absence of an appropriate gauge. In such situations we could assign to the variable a set of possible values in linguistic form as follows: capital amount = { very low, low, medium, high, very high }. There is a necessary set of mathematical operations to be applied to such fuzzy variables. Finally, a fuzzy value can be transformed into exact form using known techniques.
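The linguistic description and the transformation back to an exact value can be illustrated with a small sketch. It assumes a variable normalized to the interval [0, 1], triangular membership functions with hypothetical boundaries, and the simple height method of defuzzification; none of these choices is prescribed by the text.

```python
def triangular(x, left, peak, right):
    """Degree of membership of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical terms for a normalized "capital amount": (left, peak, right).
TERMS = {
    "very low": (-0.25, 0.0, 0.25),
    "low": (0.0, 0.25, 0.5),
    "medium": (0.25, 0.5, 0.75),
    "high": (0.5, 0.75, 1.0),
    "very high": (0.75, 1.0, 1.25),
}


def fuzzify(x):
    """Membership degree of an exact value x in every linguistic term."""
    return {name: triangular(x, *bounds) for name, bounds in TERMS.items()}


def defuzzify(memberships):
    """Height method: membership-weighted average of the term peaks.
    Assumes at least one non-zero membership degree."""
    num = sum(mu * TERMS[name][1] for name, mu in memberships.items())
    den = sum(memberships.values())
    return num / den


# An expert opinion "between medium and high" expressed as degrees.
print(defuzzify({"medium": 0.5, "high": 0.5}))  # 0.625
```

For an unmeasurable variable the membership degrees would come from expert judgment rather than from `fuzzify`, which is included only to show the opposite direction of the transformation.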
Probabilistic uncertainties and their description. The use of random variables makes it necessary to construct probability distributions and to apply them in inference procedures. Usually an observation value is known only approximately, though we know the limits for the values. Probability distributions are very useful for describing such situations. When dealing with discrete outcomes, we assign probabilities to specific outcomes using a mass function. It shows how much "weight" (or mass) to assign to each observation or measurement. An answer to the question about the value of a particular outcome will be its mass. Kolmogorov's axioms of probability are helpful for deeper understanding of what is going on. If two or more variables are analyzed simultaneously we need to construct and use joint distributions. Joint distributions allow us to calculate conditional probabilities using renormalization procedures when necessary. Very helpful for performing probabilistic computations is the notion of independence: $P(x, y) = P(x)\,P(y)$, where $x$ and $y$ are independent events. The identities of independence are very handy, though one should be careful when using them, i.e. the events should be really independent. The remarkable intuitive meaning of the discrete Bayes' law, $P(B \mid A) = \dfrac{P(A \mid B)\,P(B)}{P(A)}$, is that it allows us to ask reverse questions: "Given that event A happened, what is the probability that a particular event B evoked it?" [9].
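A reverse question of this kind can be answered numerically with a few lines of code. The sketch below applies the discrete Bayes' law to a set of mutually exclusive and exhaustive hypotheses B_i; the probability of the evidence A in the denominator is obtained by summing over the hypotheses. The function name and the numbers are hypothetical and serve only as an illustration.

```python
def posterior(priors, likelihoods):
    """Discrete Bayes' law.

    priors      -- {hypothesis: P(B_i)}, summing to one
    likelihoods -- {hypothesis: P(A | B_i)}
    Returns {hypothesis: P(B_i | A)}.
    """
    # P(A) by the law of total probability.
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}


# Hypothetical numbers: 20% of clients default; a late payment (event A)
# is observed for 70% of defaulting and 10% of reliable clients.
priors = {"default": 0.2, "reliable": 0.8}
likelihoods = {"default": 0.7, "reliable": 0.1}
print(posterior(priors, likelihoods))
```

With these numbers the probability of default given a late payment is 0.14 / 0.22, i.e. about 0.64, which is much higher than the prior value of 0.2.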
The marginal probability, $P(B)$, can be computed from conditionals. The probability that $B$ occurs in general, $P(B)$, can be obtained from the condition: $P(B) = \sum_i P(B \mid A_i)\,P(A_i)$, where $A_i$ are mutually exclusive events that together cover all possibilities. Probabilistic uncertainties regarding whether or not some event will happen can be taken into consideration with various probabilistic models. To describe and take into account such uncertainties a variety of Bayesian models could be used that are presented in the form of the so-called Bayesian Programming formalism. The set of models includes Bayesian networks (BN) [10], dynamic Bayesian networks (DBN), Bayesian filters, particle filters, hidden Markov models (discrete and continuous), Kalman filters, Bayesian maps, etc. The structure of a Bayesian program includes the following elements (steps): (1) problem description and statement formulation with a basic question of the form:

Example of DSS Application
Numerous model construction problems have been solved with the DSS created. In this specific example we used a database consisting of 4700 records that was divided into a learning sample (4300 records) and a test sample (400 records). The default probabilities were computed and compared to actual data, and the errors of the first and second type were computed for different cut-off values. It was established that the maximum model accuracy reached for the Bayesian network was 0.787 with the cut-off value 0.3. The Bayesian network is "inclined to over insurance", i.e. it more often rejects clients who could return the credit. The model accuracy and the errors of type I and type II depend on the cut-off value. The cut-off value determines the lowest probability limit for a client's solvency, i.e. below this limit a client is considered as one who will not return the credit. Alternatively, the cut-off value determines the lowest probability limit for a client's default, i.e. below this limit a client is considered as one who will return the credit. Since a cut-off value of 0.1 or 0.2 is considered as not important, in practice it is reasonable to set the cut-off value at the level of about 0.25-0.30. Statistical characteristics of the quality of the models constructed are given in Table 1. The results of computing experiments lead to the conclusion that today scoring models and Bayesian networks belong to the set of the best instruments for the banking system because BN make it possible to detect "bad" clients and to reduce the financial risks caused by such clients. It should also be stressed that the DSS constructed is a very useful instrument for a decision maker: it helps to perform quality processing of statistical data using different techniques, to generate alternatives, and to select the best one relying on a set of appropriate criteria.
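The dependence of accuracy and of the two error types on the cut-off value can be illustrated with a short sketch. It assumes that the model outputs a default probability for each client, that a client is rejected when this probability is not below the cut-off, and that a type I error means rejecting a reliable client while a type II error means accepting a defaulting one. These definitions, the function name and the data are assumptions made for the illustration; they are not the records used in the experiment.

```python
def evaluate_cutoff(default_probs, actual_defaults, cutoff):
    """Accuracy and error rates for a given cut-off on default probability.

    default_probs   -- estimated probabilities of default, one per client
    actual_defaults -- True if the client actually defaulted
    Both classes are assumed to be present in the sample.
    """
    predicted = [p >= cutoff for p in default_probs]
    pairs = list(zip(predicted, actual_defaults))
    correct = sum(1 for pred, act in pairs if pred == act)
    reliable = sum(1 for _, act in pairs if not act)
    defaulted = sum(1 for _, act in pairs if act)
    rejected_reliable = sum(1 for pred, act in pairs if pred and not act)
    accepted_defaulted = sum(1 for pred, act in pairs if not pred and act)
    return {
        "accuracy": correct / len(pairs),
        "type_1": rejected_reliable / reliable,     # reliable client rejected
        "type_2": accepted_defaulted / defaulted,   # defaulting client accepted
    }


# Hypothetical scores for six clients, scanned over several cut-off values.
probs = [0.05, 0.20, 0.35, 0.60, 0.10, 0.28]
actual = [False, False, False, True, True, False]
for cutoff in (0.1, 0.2, 0.3, 0.4):
    print(cutoff, evaluate_cutoff(probs, actual, cutoff))
```

Lowering the cut-off rejects more clients, which raises the type I error rate and lowers the type II error rate; the cut-off is chosen as a compromise between the two.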
The system tracks the whole computational process using separate sets of statistical quality criteria at each stage (level of hierarchy) of decision making: quality of data, of models, and of forecasts (or risk estimates).
Thus, the systemic approach to forecasting proposed helps to develop a DSS that performs a directed search for the best forecasting model in the respective spaces of structures and parameters, and consequently to enhance the quality of the model. Preliminary computational experiments with actual data showed the high practical usefulness of the systemic approach to modeling and forecasting and the necessity for its further refinement in future studies. It is especially important to improve the descriptions of the uncertainties mentioned and to use them for reducing the degree of uncertainty in the process of model building and forecast estimation.

Conclusions
A methodology was proposed for constructing a DSS for mathematical modeling and forecasting of economic and financial processes and for credit risk prediction (estimation). It is based on the following system analysis principles: hierarchical system structure, taking probabilistic and statistical uncertainties into consideration, availability of adaptation features, generation of multiple decision alternatives, and tracking of computational processes at all stages of data processing with appropriate sets of statistical quality criteria.
The system proposed has a modular architecture that makes it easy to extend its functionality with new parameter estimation techniques, forecasting methods, financial risk estimation, and generation of decision alternatives. High quality of the final result is achieved thanks to appropriate tracking of the computational processes at all data processing stages: preliminary data processing, model structure and parameter estimation, computing of short- and middle-term forecasts, and estimation of risk variables (parameters), as well as thanks to a user-convenient intermediate and final representation of results. The system is based on ideologically different techniques of modeling and risk forecasting, which creates a good basis for combining various approaches to achieve the best results. The example of the system application shows that it can be used successfully for solving practical problems of forecasting and risk estimation. The results of computing experiments lead to the conclusion that today scoring models, nonlinear regression and Bayesian networks are the best instruments for the banking system because they make it possible to detect "bad" clients and to reduce the financial risks caused by such clients. It should also be stressed that the DSS constructed turned out to be a very useful instrument for a decision maker: it helps to perform quality processing of statistical data using different techniques, to generate alternatives, and to select the best one. The system tracks the whole computational process using separate sets of statistical quality criteria at each stage of decision making. The DSS can be used for supporting the decision making process in various areas of human activity, including development of strategy for the banking system, industrial enterprises, investment companies, etc.
Further extension of the system functions is planned with new forecasting techniques based on probabilistic techniques, fuzzy sets and other artificial intelligence methods. Appropriate attention should also be paid to constructing a user-friendly adaptive interface based on human factors principles.