MODELING THE DEFAULT PROBABILITY OF THE RUSSIAN BANKS

Importance. The article focuses on modeling the default probability of Russian commercial banks. The research reviews two categories of Russian commercial banks, i.e. those whose licenses were recalled by the Central Bank of Russia between August 2013 and May 2016 and the banks that are still in operation. We investigate the reliability and sustainability of credit institutions, and the factors that lead to default.
Objectives. The research builds an econometric model for evaluating the probability of banks' default in line with the specifics of the Russian market.
Methods. Logistic regression is used to determine whether bankruptcy is probable, since it takes into account figures of financial statements and some institutional factors. The data set comprises quarterly reports of Russian commercial banks that subsequently went bankrupt.
Results. The article outlines trends in the contemporary banking system and shows the key stages of setting up a model for evaluating the default probability of Russian commercial banks. Based on the properties of the model, we conclude that it is of high quality in terms of statistical significance and economic substance.
Conclusions and Relevance. The findings can prove useful for researchers who study the bankruptcy of credit institutions, and for banks' management. The model can also be used by banking oversight agencies of the Russian Federation for purposes of remote monitoring, and by companies choosing a bank to service their accounts. The simplicity and clarity of the data allow banks to be analyzed from the perspective of their would-be customers.


Introduction
Sustainable development of the banking sector is a top priority for financial supervision authorities. To plan their activities and prevent possible crises, the authorities continuously develop and improve a set of measures for monitoring, identifying, controlling and forecasting possible risks.
Nowadays, there is growing interest in early warning systems that detect banks exposed to default risk. In addition to governmental regulators, commercial banks also emphasize the importance of models for detecting bankruptcy, since these techniques flag possible troubles in good time and prompt the bank to undertake recovery measures, thus avoiding future losses. Every year scholars present more papers on various aspects of banks' operation and, in particular, on modeling the default probability of commercial banks. Here we should highlight the works of A.A. Vasilyuk, S.A. Golovan', A.M. Karminskii, A.V. Kopylov, A.V. Kostrov, T.N. Murzenkov and A.A. Peresetskii [1][2][3][4][5][6][7], whose expertise made a considerable contribution to this article.
The above works review the specifics of modeling the default probability of banks in the Russian Federation on the basis of national financial statements, macroeconomic and institutional data. Furthermore, scholars pay much attention to testing the reliability of models and to a comparative analysis of econometric models of the default probability (with logit regression as the baseline) and alternative models.
Logit regression constitutes the base because, we believe, only the logistic model provides accurate results corresponding to actual bankruptcy cases, as compared with other schemes. Hence, we decided to analyze closed banks more thoroughly and to exclude banks with regulatory breaches from the sample.
As its leading idea, Peresetskii's paper [6] divides default causes into two groups: poor financial standing of the credit institution, and fraud and money laundering. The research is based on financial statements of Russian banks whose licenses were recalled from Q2 2005 through Q4 2008. As the outcome shows, improving the quality of the default probability model requires singling out the banks involved in money laundering and excluding them from the sample.
Macroeconomic indicators are used with reference to the hypothesis that a bank's sustainability depends on cyclically changing external conditions. The authors of [2,3] scrutinize whether macroeconomic variables can be applied to the model. Based on econometric models of binary choice, the bankruptcy probability of Russian banks was evaluated for the period from 1996 through 2002.
When macroeconomic indicators are added to the model, they improve its statistical quality and reduce errors. Moreover, the model was complemented with such parameters as balance sheet profit, credit to the economy and non-governmental debt obligations.
In a recent empirical study, a group of authors led by A.M. Karminskii [3] reviews the banking sector of Russia in terms of the objectives that risk managers of major credit institutions and the principal regulator should meet. Following a regression analysis of a sample of Russian banks for the 1998-2011 period, the authors drew notable conclusions.
First, they empirically proved the assumption of non-linear interactions (quadratic dependency) of selected factors.
Second, the researchers managed to significantly improve the quality of the final model as they used macroeconomic factors and indicators of the institutional environment (for example, year, Consumer Price Index, unemployment rates, etc.).
The research Edward Altman carried out in 1968 became the first and foremost study into modeling default probability [8]. He performed a multiple discriminant analysis to classify foreign companies as sustainable or unsustainable by analyzing their financial statements.
The economist proposed the Z-score, which was regarded as a direct measure of risk. In his research, E. Altman used relative indicators as factors, i.e. working capital/total assets, retained earnings/total assets, earnings before interest and taxes/total assets, market value of equity/carrying amount of all liabilities, and revenue/total assets. The model underwent multiple transformations afterward, paving the way for further research by Altman.
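As an illustration of how the five ratios combine into one score, the sketch below uses the commonly quoted weights of the 1968 Z-score and its usual zone boundaries (1.81 and 2.99). These numbers are not restated in this article and are given here as an assumption; the input figures are hypothetical.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_equity, revenue, total_assets, total_liabilities):
    """Z-score built from the five ratios listed above.

    The weights are the commonly quoted 1968 values (an assumption here).
    """
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_equity / total_liabilities
    x5 = revenue / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def z_zone(z):
    """Usual interpretation zones of the Z-score (assumed boundaries)."""
    if z < 1.81:
        return "distress"
    if z > 2.99:
        return "safe"
    return "grey"


# Hypothetical company, all figures in the same currency units.
z = altman_z(working_capital=20, retained_earnings=30, ebit=15,
             market_equity=80, revenue=150,
             total_assets=100, total_liabilities=50)
print(round(z, 3), z_zone(z))
```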

Review of Default Probability Models
Currently, there are many mathematical models for evaluating whether banks are exposed to default risk. The best-known ones are listed below:
• market models. They are based on market data on listed securities. Such models can be subdivided into structural and reduced-form models;
• models based on financial reporting and accounting data [20]. Depending on the statistical method used, these can be scoring models, models based on discriminant analysis and binary choice models;
• models based on macroeconomic factors;
• models used by international rating agencies;
• non-parametric models.
For purposes of this article, we model the default probability of banks using logistic regression, which belongs to the class of binary choice models.
Nowadays, researchers prefer logit models, though practice shows that results based on probit and logit models usually coincide.
The main distinction of such models is that the dependent variable is binary, i.e. it equals 1 if the bank is declared bankrupt and 0 otherwise. This approach keeps the estimated default probability within the interval [0; 1]. It also allows for non-linear dependence of the default probability on the explanatory factors used.
The logistic regression has the following formula:
$$P(y_i = 1) = \frac{1}{1 + e^{-z_i}}, \qquad z_i = \sum_{j} b_j x_{ij},$$
where $P(y_i = 1)$ stands for the bankruptcy probability of the $i$-th bank; $z_i$ stands for a linear combination of independent factors; $b_j$ is the regression coefficient for the $j$-th factor; $x_{ij}$ is the value of the $j$-th factor for the $i$-th bank.
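A minimal sketch of how this formula turns a linear combination of factors into a probability is given below. The coefficient and factor values are hypothetical, and the separate intercept argument is an assumption added for convenience (it defaults to zero).

```python
import math


def default_probability(coefficients, factors, intercept=0.0):
    """Logistic transform of a linear combination of factors.

    coefficients: regression coefficients b_j (hypothetical values below)
    factors:      factor values x_ij for one bank
    intercept:    optional constant term (an assumption, defaults to 0)
    """
    z = intercept + sum(b * x for b, x in zip(coefficients, factors))
    # Numerically stable evaluation of 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical bank with two factors.
p = default_probability([2.0, -1.5], [0.4, 0.2], intercept=-1.0)
print(round(p, 4))
```

Whatever the values of the factors, the result stays strictly between 0 and 1, which is the property that motivates the choice of the logit form.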

Characteristics of the Subject
When data for modeling are gathered, it becomes necessary to define the concept of default, since the initial sample of banks with recalled licenses also contains banks that were deprived of their licenses due to unreliable financial statements or fraud (money laundering, financing of terrorism).
We introduce the following definition: a bank shall be deemed bankrupt only if at least one of the following conditions is met:
• equity capital adequacy falls below 2 percent;
• equity (capital) becomes lower than the minimum authorized capital as of the bank incorporation date;
• the credit institution has lost its equity entirely;
• the bank fails to make reserves and provisions as required by the Central Bank of Russia;
• the bank is unable to perform its monetary obligations to creditors.
Information on instances and causes of license recalls from Russian banks was collected from relevant orders issued by the Central Bank of Russia. The selected population includes 139 commercial organizations (19.7 percent of the total sample), which went bankrupt from August 2013 through May 2016 and had publicly available financial statements for the period from two to six quarters before their bankruptcy.
We matched the defaulting banks with comparable entities that have similar net assets but were not declared bankrupt.
As a result, we selected 560 such banks (80.3 percent of the sample). The full sample comprised 699 banks.
To construct logit-regressions, we split the sample in two parts. Part one that underlies models Thus, we produced a set of possible explanatory variables (Tab. 1) to assess relative figures.
In constructing the model, we used not the absolute values of financial indicators but their derivative and relative values. Absolute values were mainly divided by net assets so as to adjust for the size of each bank. As a result, we formed a series of financial ratios selected by their discriminatory power (based on ANOVA) between bankrupt banks and banks that avoided default.
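The two steps described above can be sketched as follows: an absolute figure is scaled by net assets, and the resulting ratio is compared across the two groups of banks with a one-way ANOVA F-statistic. The function names and the sample values are hypothetical; the article does not report the exact procedure.

```python
def to_ratio(absolute_value, net_assets):
    """Scale an absolute indicator by net assets to adjust for bank size."""
    return absolute_value / net_assets


def anova_f(group_a, group_b):
    """One-way ANOVA F-statistic for two groups of observations.

    A larger F means the ratio separates the two groups more clearly.
    """
    groups = [group_a, group_b]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical liquid-assets-to-net-assets ratios for the two groups.
bankrupt = [to_ratio(v, 100.0) for v in (10.0, 20.0, 30.0)]
operating = [to_ratio(v, 100.0) for v in (40.0, 50.0, 60.0)]
f_stat = anova_f(bankrupt, operating)
print(round(f_stat, 2))
```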
Financial indicators (Tab. 2) were finally selected by choosing an optimal combination of factors in terms of the model quality and by including indicators of each group on a step-by-step basis.
As a result of the final selection, we dropped the following variables: netprofit_netassets (correlated with profit_netassets), liquidity_liabilities, overdue_cashbal, gratedloans_netassets, deposits_netassets, overdue_reserves.

Addressing the Unbalanced Nature of the Sample and Determining the Forecast Horizon
A distinctive feature of logit regression is that the model must be trained on both defaulting and operating banks. The initial sample is unbalanced, because there are fewer observations of bankrupt banks than of operating banks.
To mitigate the distortion caused by this imbalance, we applied the following balancing method. We reviewed three options for the sample structure: the initial sample, a sample with a 35-percent share of bankrupt banks, and a 1:1 sample.
Moreover, we manually formed 10 sub-samples for each structure so as to include all 139 bankrupt banks and a certain number of randomly chosen stable institutions. Hereinafter, coefficients and classification results are reported as the arithmetic mean over the 10 models.
As the number of observations rose, the overall share of correctly classified observations increased (from 70 to 82.7 percent); however, the number of type II errors (labeling unreliable banks as sustainable) grew as well.
Increasing the share of bankrupt banks in each sub-sample helped to address the insufficient sensitivity of the model and raised this indicator from 21.4 to 48.7 percent. Considering this aspect and the changing significance of the coefficients, the sample of 139 bankrupt banks (35 percent) and 256 operating credit institutions (65 percent) seems to be the most appropriate one.
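The balancing scheme can be sketched as below. This is only an illustration under stated assumptions: banks are represented by hypothetical integer identifiers, the random draw replaces the manual selection described above, and the coefficient vectors in the averaging example are made up.

```python
import random


def balanced_subsamples(bankrupt, stable, n_stable, n_subsamples=10, seed=0):
    """Build sub-samples that keep every bankrupt bank and add
    `n_stable` randomly drawn stable banks to each of them."""
    rng = random.Random(seed)
    return [list(bankrupt) + rng.sample(list(stable), n_stable)
            for _ in range(n_subsamples)]


def mean_coefficients(coefficient_sets):
    """Arithmetic mean of each coefficient across the fitted models."""
    n_models = len(coefficient_sets)
    return [sum(column) / n_models for column in zip(*coefficient_sets)]


# Hypothetical identifiers: 139 bankrupt banks and 560 stable banks.
bankrupt_ids = list(range(139))
stable_ids = list(range(139, 699))

subsamples = balanced_subsamples(bankrupt_ids, stable_ids, n_stable=256)
share = len(bankrupt_ids) / len(subsamples[0])
print(len(subsamples), len(subsamples[0]), round(share, 3))

# Averaging made-up coefficient vectors from two fitted models.
print(mean_coefficients([[1.0, 2.0], [3.0, 4.0]]))
```

With 139 bankrupt and 256 stable banks per sub-sample, the share of bankrupt banks is about 35 percent, which corresponds to the structure chosen in the text.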
As the next step, we had to find an appropriate forecast horizon that would allow the bankruptcy probability to be determined in advance. We constructed logistic regressions using the selected relative financial variables for each horizon separately (from two to six quarters, on a quarterly basis) (Fig. 2). In practice, the forecast horizon depends on the objectives of the model used [7]. To pinpoint banks that may not survive, it is even possible to apply the model to a horizon of six quarters (a year and a half), thus leaving more time for measures that improve the bank's sustainability.
A period of four quarters is considered the optimal forecast horizon, since it minimizes the Akaike information criterion (AIC) and keeps the area under the ROC curve at 0.7.
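The selection rule can be illustrated with the standard definition of the criterion, AIC = 2k - 2 ln L, where k is the number of estimated parameters and ln L is the maximized log-likelihood. The log-likelihood values below are made up for illustration only and are not the results of this study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_horizon(fits):
    """Pick the horizon whose fitted model has the lowest AIC.

    fits maps a horizon in quarters to (log_likelihood, n_params).
    """
    return min(fits, key=lambda horizon: aic(*fits[horizon]))


# Hypothetical fit summaries for horizons of two to six quarters.
fits = {
    2: (-190.0, 8),
    3: (-185.0, 8),
    4: (-180.0, 8),
    5: (-188.0, 8),
    6: (-195.0, 8),
}
chosen = best_horizon(fits)
print(chosen, aic(*fits[chosen]))
```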

Analyzing the Institutional Factors
Taking the specifics of the bank's external environment into account allows the default probability to be determined more precisely. We reviewed three institutional variables reflecting whether the bank has branches, whether it participates in the deposit insurance system, and where its headquarters are located (Tab. 3).
When we introduce the ACB variable, the statistical quality of the model deteriorates. That is why we exclude it from further examination.
In addition to the branch and location factors, we considered the bank's size, i.e. the logarithm of net assets (LNnetassets). Since it is unclear how the size of the bank influences the default probability, we used a second-degree polynomial of the size variable by adding its square (LNnetassets2). This helps us take into account a possible U-shaped dependence [4].
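With a linear and a squared size term, the contribution of size to the linear predictor is b1·s + b2·s², where s is the logarithm of net assets. Its turning point lies at s* = -b1 / (2·b2), and the dependence is U-shaped when b2 is positive. The sketch below computes this point; the coefficient values are hypothetical and are not the estimates of this study.

```python
import math


def size_turning_point(b_size, b_size_sq):
    """Turning point of b_size * s + b_size_sq * s**2.

    Returns the log of net assets at the turning point, the
    corresponding level of net assets, and the shape of the curve.
    """
    if b_size_sq == 0:
        raise ValueError("no turning point without a quadratic term")
    ln_assets = -b_size / (2 * b_size_sq)
    shape = "U-shaped" if b_size_sq > 0 else "inverted U-shaped"
    return ln_assets, math.exp(ln_assets), shape


# Hypothetical coefficients for LNnetassets and LNnetassets2.
ln_star, assets_star, shape = size_turning_point(-2.0, 0.05)
print(round(ln_star, 2), shape)
```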

Evaluation of the Model Quality
Following the research, we devised a logit regression based on relative figures of financial statements, institutional factors and the size of the bank (Fig. 3). The coefficients of the following variables are significant at the 1-percent level: the location of headquarters, the bank's long-term liquidity ratio H4, the ratio of carrying amount to net assets, the ratio of total deposits of individuals to net assets, the ratio of liquid assets to net assets, the ratio of provisions for possible losses to net assets, the ratio of other banks' accounts (correspondent accounts) to net assets, the logarithm of net assets, and the square of the logarithm of net assets. At the 5-percent level, the coefficient of the variable indicating the existence of branches is significant. The ROC curve is depicted in Fig. 4. The AUC (area under the ROC curve), which summarizes the ROC curve quantitatively, equals 0.888, with a 95-percent confidence interval from 0.853 to 0.953.
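The AUC can be read as the probability that a randomly chosen bankrupt bank receives a higher predicted default probability than a randomly chosen operating bank. A minimal sketch of this pairwise computation follows; the labels and scores are hypothetical and serve only to show the mechanics.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (bankrupt, operating) pairs in which the
    bankrupt bank has the higher score; ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n)
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = bankrupt, 0 = operating.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.35, 0.7, 0.3, 0.2, 0.1]
area = auc(labels, scores)
print(round(area, 3))
```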
As our next step, we evaluate the quality of the model using the classification table (Tab. 4), which shows how many observations were correctly classified into their a priori category and how many were misclassified.
Classification errors can be mitigated by changing the cut-off threshold, i.e. the probability value that separates the a priori classes. In the context of this research, it is especially important to avoid false negatives (labeling unreliable banks as sustainable). Hence, having analyzed the classification diagram, we set the cut-off threshold at 0.3.
In fact, the cut-off threshold depends on the stringency of the regulator's approach to remote monitoring of banks' operations.
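The effect of the cut-off threshold on the classification table can be sketched as follows. The labels and predicted probabilities are hypothetical; the example only shows that lowering the threshold from 0.5 to 0.3 trades false negatives for false positives.

```python
def classification_table(labels, probabilities, threshold):
    """Count true/false positives and negatives at a given cut-off.

    A bank is classified as bankrupt when its predicted probability
    is at or above the threshold.
    """
    table = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, p in zip(labels, probabilities):
        predicted = 1 if p >= threshold else 0
        if actual == 1 and predicted == 1:
            table["tp"] += 1
        elif actual == 0 and predicted == 1:
            table["fp"] += 1
        elif actual == 0 and predicted == 0:
            table["tn"] += 1
        else:
            table["fn"] += 1
    return table


def sensitivity(table):
    """Share of bankrupt banks that the model detects."""
    return table["tp"] / (table["tp"] + table["fn"])


# Hypothetical data: 1 = bankrupt, 0 = operating.
labels = [1, 1, 1, 0, 0, 0, 0]
probabilities = [0.9, 0.45, 0.35, 0.6, 0.32, 0.2, 0.1]

at_050 = classification_table(labels, probabilities, 0.5)
at_030 = classification_table(labels, probabilities, 0.3)
print(at_050, round(sensitivity(at_050), 2))
print(at_030, round(sensitivity(at_030), 2))
```

A stricter regulator would choose a lower threshold and accept more false alarms in exchange for fewer missed defaults.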

Economic Analysis and Interpretation of the Model
For purposes of economic analysis, the interpretation of the model is of most interest. Explanatory variables were split into groups:
1. Variables relating to loans issued and deposits (corresp_netassets, depindiv_netassets).
When the ratio of other banks' accounts (correspondent accounts) to net assets grows, the probability of the bank's default increases. When the ratio of individuals' deposits to net assets increases, the bank's default also becomes more probable.
Deposits constitute not only a pool of the bank's resources, but also its liabilities for temporarily raised funds. We assume this effect stems from depositors' proclivity to banking panics, when the sector faces massive withdrawals of deposits from one or several banks, causing the collapse of a credit institution that becomes unable to discharge its obligations to depositors.
2. Variables relating to profit (profit_netassets). When profit_netassets gets smaller, the default probability increases, which is economically consistent because profit is the main source of funds for development.
The non-current liquidity ratio H4 curbs the solvency risk arising when funds are invested in non-current assets. The highest acceptable value of H4 is set at 120 percent. As the non-current liquidity ratio goes up, the probability of the bank's default also increases, which complies with the logic of the indicator.
As corroborated by the model, insufficient liquidity may cause the bank's insolvency, i.e. a decrease in the ratio of liquid assets to net assets increases the default probability.
Additional provisions for possible losses reduce banks' profit and exert more pressure on the capital. They bring the capital safety margin down.
As the model shows, when the ratio of provisions for possible losses to net assets grows, the default probability of the bank increases.
4. Variables relating to the size of the bank (LNnetassets, LNnetassets2).
The LNnetassets variable (logarithm of the bank's net assets) describes the size of the bank. Relying on the model, we found that the size of the bank affects the default probability. We believe this results from a better diversified loan portfolio and range of services.
However, major banks are often noted to be exposed to risk, since they rely heavily on State aid in case of financial difficulties because they are considered too big to fail.
To check whether major banks adhere to risky policies, we introduced an additional variable, a second-degree polynomial term of the bank-size variable. Following the analysis, we refute the hypothesis that a bank will be supported by the State in case of financial difficulties. The same can be seen in practice.

Institutional indicators are very important for the model. We confirm the hypotheses that the existence of branches reduces the default probability and that the Central Bank of Russia is less inclined to recall licenses from regional banks.
Perhaps the reason is that the Central Bank of Russia seeks to sustain the existing competition in the regions, which is not that high.
Therefore, the estimated coefficients are fully consistent with their economic substance and can be used to predict the default probability of banks.

Conclusions
The final model primarily allows unsustainable banks to be detected. By recognizing the high cost of false negatives and balancing the sample, we ensured high precision in classifying bankrupt banks.
Since the model preserved its classification capacity on the test sample of banks declared bankrupt in 2016, it proved to be practicable and feasible.
The findings can be useful for researchers who examine the bankruptcy of credit institutions, and for banks' management. Considering only six indicators of financial reporting, managers will be able to evaluate the financial position of their banks and counterparties. Furthermore, the model of the default probability of Russian banks can be used by banking supervisory bodies of the Russian Federation as a system for remote monitoring, and by any company choosing a bank to service its accounts.
The simplicity of the model and its variables helps existing and would-be customers to analyze their banks.