Journal of Applied Mathematics

Volume 2019, Article ID 8350464, 11 pages

https://doi.org/10.1155/2019/8350464

## Dynamic Credit Quality Evaluation with Social Network Data

^{1}Pan African University Institute for Basic Sciences, Technology and Innovation, Kenya

^{2}School of Mathematics, University of Nairobi, Kenya

Correspondence should be addressed to Stanley Sewe; moc.liamg@ewesyelnats

Received 2 November 2018; Accepted 3 February 2019; Published 1 April 2019

Academic Editor: Junzo Watada

Copyright © 2019 Stanley Sewe et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We investigate the filtering problem in which a borrower’s time-varying credit quality process is estimated using a continuous-time observation process and her ego-network data (in this paper we refer to the borrower as female and the lender as male). The hidden credit quality is modeled as a hidden Gaussian mean-reverting process, whilst the social network is modeled as a continuous-time latent space network model. At discrete times, the network data provide unbiased estimates of the current credit state of the borrower and her ego-network. Combining the continuously observed behavioral data and the network information, we provide filter equations for the hidden credit quality and show how the network information reduces information asymmetry between the borrower and the lender. Further, we consider the case when the network information arrival times are random and solve a stochastic optimal control problem for a lender with a linear quadratic utility function.

#### 1. Introduction

In this study we consider the problem of stochastic filtering in the presence of network-generated information. Consider a continuous-time credit quality process that is hidden and only partially observed through a randomized function of it. Additionally, at discrete time points the observer has access to unbiased signals of the process and of the nodes directly linked to the node. Standard stochastic filtering is thus carried out until the network information arrives, at which point the unbiased signals from the network are used to improve the estimate of the hidden true credit quality.

We assume that the nodes are individual (potential) borrowers in a dynamic social network, with each borrower’s true credit quality modeled as an Ornstein-Uhlenbeck process. The hidden process is partially observed in continuous time through a function of it. This observation process models the typical borrower’s behavior and the financial information available to the lender (e.g., account turnover, periodic account balance). The hidden and observed processes are modeled as scalar processes for simplicity, though the results apply to the vector case too. The borrower knows her true credit quality, and this is also visible to her direct social contacts. A social link is formed upon mutual consent between the parties. The probability of a link forming is influenced by the distance between the borrowers’ credit qualities. Thus links are formed by credit type homophily. Homophily [1] is the idea that individuals with similar characteristics are more likely to have a network tie than individuals with different characteristics. The network is thus modeled as an undirected continuous-time latent space model. The lender is able to observe the borrower’s ego-network at discrete time points. Hence, at these time points the lender is able to update his estimate of the borrower’s credit quality using the network data.

Kalman-Bucy filtering techniques provide the estimates for the hidden process and the conditional variance based on the continuous-time observations. At each network information date, the estimates and the conditional variances are updated using the vector of network signals. The dimension of the vector depends on the borrower’s degree (number of friends) at that date. Thus we combine Kalman-Bucy filtering and Bayesian updating techniques to improve the lender’s estimate of the borrower’s credit quality. Further, we prove that the network data lead to a lower variance for the estimated credit quality. In addition, we consider the case whereby the network information arrives at random times given by the jump times of a Poisson process. The dynamics of the filtered estimates and conditional variances are the same as in the case of discrete-time information arrivals, though the conditional variance becomes a piecewise deterministic stochastic process. We solve a stochastic optimization problem for a lender using the linear quadratic utility. Due to the random arrival times, the problem reduces to a linear quadratic Gaussian problem with jumps. Our credit model can be seen as a continuous-time version of the static credit scoring model of [2]. Also related is the work of [3], who derived filtered estimates and the conditional variance of a hidden drift process in the presence of continuous-time observations and discrete-time scalar expert opinions. Our stochastic optimal control problem is close to the optimal credit limit problem of [4], whereby the lender optimizes the credit limit available to a corporate entity based on the status of the entity’s surplus process. Reference [5] solved a singular optimization problem within a random time horizon to obtain the optimal interest rate payable by a corporate entity. Reference [6] provides a good account of stochastic optimization for jump diffusion processes whilst [7] is an excellent treatment of discrete and stochastic optimization.

Application of social network analysis and social network data in consumer credit has become ubiquitous; see [8] for a review. With the availability of social network data, digital credit and mobile phone based lending have emerged as key areas of study in finance. Some previous studies of dynamic social network include the coevolution model of [9] where the node’s attributes and network ties influence each other. The nodal attributes and network ties are modeled as continuous time Markov chains. The static latent distance models of [10] and discrete time model of [11] are able to capture key network properties like homophily and transitivity (two nodes connected to a common third node are likely to be connected) and group structures. Reference [12] provides a Bayesian dynamic model of relational structure on a latent variable continuous time social network. Reference [13] gives a good review of the static and dynamic latent distance and latent space network models.

There exist two approaches in modelling credit risk: the structural and reduced form approach, respectively. In the structural approach, default occurs when the observed process related to the value of the firm (or firm cash flow) passes below a deterministic default threshold. The threshold is chosen by the managers of the firm, with the default time being a predictable stopping time. In the reduced form approach, default is modeled as an inaccessible stopping time, with the default threshold modelled as a random variable. Reference [14] provides a good introduction to credit risk modelling. There have been attempts to apply the corporate credit risk models in modelling unsecured consumer credit. The challenge therein lies in the fact that it is hard to measure a borrower’s assets. In addition, default in consumer loans is related more to cash flow issues than to the borrower’s debts exceeding her assets. Reference [15] modeled the consumer credit risk using the structural approach of credit risk modeling. Treating the consumer’s behavioral score as a continuous time jump diffusion process, the authors were able to apply the option pricing theory to model the probability of default. Our model is related to the structural approach of credit risk modeling in that the credit worthiness score is seen as an asset to the borrower. However, in the model the lender manages credit risk through continuous updating of the credit limit and borrower’s credit worthiness. The credit limit is affected by the borrower’s credit worthiness, which is related to the probability of default. In turn, the sanctioned credit limit (credit exposure) affects the borrower’s credit worthiness.

Some existing studies on consumer credit scoring include [16] whereby the borrower’s credit rating is modelled as a discrete time Markov chain process upon incorporating a latent variable driven by economic conditions. Reference [2] used social network links and signals from the borrower’s ego-network to estimate her credit quality. The model assumed that the credit quality which is normally distributed is hidden and can only be estimated using the noisy signals from the borrower and her ego-network. Thus the model was based on homophily by preference. Reference [17] proposed the SEN-HMM-CSD (Social Economic Network-Hidden Markov Model-Credit Scores and Default) model whereby economic and social network variables including trust, distrust, and reputation were used to estimate the hidden credit quality of a borrower in a social network. In the SEN-HMM-CSD model, the true credit quality was modeled as a discrete time, discrete state hidden Markov model. Assuming a fully connected social network on a population with no drift, the network based variables were used to generate the credit risk analysis factors, which in turn were used to estimate the hidden Markov parameters for each individual borrower. Our work generalizes the SEN-HMM-CSD model in the continuous time continuous state Hidden Markov model direction through the following: inclusion of neighbouring nodes signals in the estimation of credit worthiness, credit limit variability tied to individual’s credit worthiness, and the network ties based on credit type homophily; thus there is no assumption on a fully connected network model.

The objective of stochastic filtering is to find the best estimate of a hidden process that is only partially seen through an observation process. Excellent textbook treatments of stochastic filtering include [18–20]. Recent studies on stochastic filtering and information modeling include [3], who studied a market model whereby the unobserved drift process (modeled as an Ornstein-Uhlenbeck process) is filtered conditional on continuous-time stock return observations and discrete-time expert opinions. By incorporating expert opinion in the estimation, the model is a continuous-time version of the Black-Litterman model [21]. Reference [22] considered a market model whereby the unobserved drift parameter is estimated on an observation filtration initially enlarged with the terminal value of the stock price perturbed by some constant variance noise. Reference [23] considered an enlargement of filtration problem with a random variable combined with partial information, with an application in linear stochastic control.

The paper is organized as follows. In Section 2, we present our credit model, dynamic social network model, and the information setup. Stochastic filtering and Bayesian updating results are presented in Section 3. Within this section, we also present the properties of the conditional variance. Credit limit management optimization problem is solved in Section 4. Brief numerical results are presented in Section 5, whilst Section 6 concludes.

#### 2. The Credit Model

Consider a filtered probability space satisfying the usual conditions of right continuity and completeness. All processes are assumed to be adapted. We are interested in the following filtering problem.

*Borrower’s Behavioral Dynamics.* We model the borrower’s observed behavioral dynamics as a linear diffusion process defined as

The parameters are assumed to be constants and is a adapted one dimensional Brownian motion. The hidden credit quality process driving the drift is modeled as a mean-reverting Ornstein-Uhlenbeck process defined as

where are constants and is a Brownian motion. We assume that and are independent. is a measurable normally distributed random variable independent of and with mean and variance . is a Gaussian process; its mean and variance are given by
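As an illustration, the state and observation dynamics described above can be simulated with an Euler-Maruyama scheme. The sketch below is not the authors' code. The parameter names (`kappa`, `x_bar`, `sigma_x`, `a0`, `a1`, `sigma_y`) and the parametrisation `dX = kappa * (x_bar - X) dt + sigma_x dW`, `dY = (a0 + a1 * X) dt + sigma_y dB` are assumptions chosen to match the verbal description of a mean-reverting hidden process driving the drift of a linear observed diffusion. The helper `ou_moments` returns the standard closed-form mean and variance of an Ornstein-Uhlenbeck process under this assumed parametrisation.

```python
import numpy as np


def ou_moments(t, m0, v0, kappa, x_bar, sigma_x):
    """Mean and variance at time t of an OU process started from N(m0, v0).

    Assumed parametrisation: dX = kappa * (x_bar - X) dt + sigma_x dW.
    """
    mean = x_bar + (m0 - x_bar) * np.exp(-kappa * t)
    stationary_var = sigma_x ** 2 / (2.0 * kappa)
    var = stationary_var + (v0 - stationary_var) * np.exp(-2.0 * kappa * t)
    return mean, var


def simulate_paths(T, n_steps, m0, v0, kappa, x_bar, sigma_x, a0, a1, sigma_y, rng):
    """Euler-Maruyama paths of the hidden state x and the observation y on [0, T]."""
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    y = np.empty(n_steps + 1)
    x[0] = rng.normal(m0, np.sqrt(v0))  # Gaussian initial credit quality
    y[0] = 0.0
    for k in range(n_steps):
        # Independent Brownian increments for the state and the observation
        dw, db = rng.normal(0.0, np.sqrt(dt), size=2)
        x[k + 1] = x[k] + kappa * (x_bar - x[k]) * dt + sigma_x * dw
        y[k + 1] = y[k] + (a0 + a1 * x[k]) * dt + sigma_y * db
    return x, y
```

For example, `simulate_paths(1.0, 1000, 0.2, 0.5, 1.5, 1.0, 0.4, 0.0, 1.0, 0.3, np.random.default_rng(0))` returns one hidden path and one observed path on a grid of 1001 points; all numerical values here are illustrative only.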

*Network Dynamics.* We consider a society with a population . Each individual in the population has a normally distributed time varying credit quality . The credit qualities are assumed to be independent across the individuals. At each time , with no other additional borrower information, an individual’s true credit quality is assumed to be normally distributed with mean and variance . Individuals and interact and a network tie is formed by mutual consent. Assuming an undirected network, i.e., , we let for every and with

Thus the network ties are conditionally independent Bernoulli random variables given the corresponding probabilities of tie formation . is the survival function of a standard Rayleigh distribution. The existence of a network tie between individuals and depends on the distance between their credit types, with a shorter distance leading to a higher probability of a network tie. Thus the network model captures homophily, whereby individuals with similar characteristics are more likely to have social network ties than individuals with different characteristics. We assume there is no cost incurred in network tie formation or destruction.
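The tie-formation mechanism can be sketched as follows. The survival function of a standard Rayleigh distribution is `exp(-d**2 / 2)`, so a tie probability of this form decays with the distance `d` between two credit types. The sketch assumes that the distance is the absolute difference of the two scalar credit qualities; the function names and the array `x` of current credit qualities are hypothetical and introduced only for illustration.

```python
import numpy as np


def tie_probability(x_i, x_j):
    """Rayleigh survival function of the distance between two credit types.

    Assumption: the distance is |x_i - x_j| with unit Rayleigh scale.
    """
    d = np.abs(x_i - x_j)
    return np.exp(-0.5 * d ** 2)


def sample_ego_network(i, x, rng):
    """Draw borrower i's friends as conditionally independent Bernoulli ties."""
    p = tie_probability(x[i], x)          # tie probability with every individual
    ties = rng.random(x.shape[0]) < p     # independent Bernoulli draws
    ties[i] = False                       # no self-tie
    return np.flatnonzero(ties)
```

Identical credit types give a tie probability of one, and the probability falls off quickly as the types move apart, which is the homophily property described in the text.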

Let denote the set of friendship ties in the society at time . The set of borrower ’s friends at time , known as her ego-network, is defined as . For a particular borrower , we consider her true credit quality process and behavioral score .

*Lender’s information.* As a result of the information asymmetry between the lender and the borrowers, the lender is unable to directly observe the true credit quality process for the individual . He however gets noisy observation of the hidden process via the behavioural process . Additionally, the lender is able to receive at discrete time points noisy signals about the current state of and neighbouring nodes within her ego-network. For each , let the set be the set of signals of the borrower’s ego-network such that . The variable is i.i.d across individuals and for . Thus at time point the lender observes borrower ego-network and the vector comprising of the noisy observation of her current credit state and the credit states of her immediate social contacts . The variance is a measure of the reliability of the network information received by the lender.

The information available to the lender can thus be represented by the following filtrations: corresponds to the continuous time behavioral score only and consists of the network information received at discrete times whilst is the combination of behavioral score and the network information. We assume that the -algebras and are augmented by the null sets.

*Assumption 1. *It is obvious that for all . is generated by the innovation process . We assume that for all . Thus is immersed in ; hence every square integrable martingale is a square integrable martingale.

#### 3. Stochastic Filtering

Stochastic filtering entails obtaining estimate of the hidden process conditioned on the observed process. Let be the projection of the process onto the observed filtration , i.e., . is the optimal estimator of in the mean square sense. In this section, we derive the recursive equations for the filtered estimates conditioned on the different information settings.

##### 3.1. Behavioural Observations

In the case whereby the lender observes the borrower’s behavioral information only, i.e., where the lender’s observation filtration is , the state and observation processes constitute a linear system of equations. Besides, the bivariate process is Gaussian. Thus the usual Kalman-Bucy filtering technique, see, e.g., [19, 20], can be used to obtain the conditional mean and the conditional variance .

The dynamics of is given by the following SDE: whilst the dynamics of is given by the deterministic ODE

Equation (8) is the well-known Riccati equation. With initial value , the unique solution for the equation can be given as

with , , and (see, e.g., [24]).
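A discretised version of the Kalman-Bucy recursion can be sketched as below. This is a minimal illustration, not the authors' implementation. It assumes the parametrisation `dX = kappa * (x_bar - X) dt + sigma_x dW` for the hidden state and `dY = (a0 + a1 * X) dt + sigma_y dB` for the observation, under which the conditional variance satisfies the Riccati ODE `dgamma/dt = sigma_x**2 - 2 * kappa * gamma - (a1 / sigma_y)**2 * gamma**2`. All names are assumptions, and a simple Euler step is used in place of the closed-form solution.

```python
import numpy as np


def riccati_rhs(gamma, kappa, sigma_x, a1, sigma_y):
    """Right-hand side of the Riccati ODE for the conditional variance."""
    return sigma_x ** 2 - 2.0 * kappa * gamma - (a1 / sigma_y) ** 2 * gamma ** 2


def stationary_variance(kappa, sigma_x, a1, sigma_y):
    """Positive root of the Riccati right-hand side (long-run filter variance)."""
    c = (a1 / sigma_y) ** 2
    return (-kappa + np.sqrt(kappa ** 2 + c * sigma_x ** 2)) / c


def kalman_bucy(y, dt, m0, v0, kappa, x_bar, sigma_x, a0, a1, sigma_y):
    """Euler-discretised Kalman-Bucy filter for observations y on a grid of step dt."""
    n = len(y) - 1
    m = np.empty(n + 1)   # conditional mean
    g = np.empty(n + 1)   # conditional variance
    m[0], g[0] = m0, v0
    for k in range(n):
        # Innovation: observed increment minus its predicted drift
        innovation = (y[k + 1] - y[k]) - (a0 + a1 * m[k]) * dt
        gain = a1 * g[k] / sigma_y ** 2
        m[k + 1] = m[k] + kappa * (x_bar - m[k]) * dt + gain * innovation
        g[k + 1] = g[k] + riccati_rhs(g[k], kappa, sigma_x, a1, sigma_y) * dt
    return m, g
```

The variance recursion does not depend on the observed path, so it can be computed in advance; for a long horizon it settles at the value returned by `stationary_variance`.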

##### 3.2. Behavioral Observations and Network Information

This is the case whereby the lender uses both the observed behavioral score and the network information to obtain an estimate of the borrower’s credit quality. The following lemma shows that the expected number of friends for individual at any time can be treated as a constant.

Lemma 2. *Let be the borrower’s degree (number of friends) at time . At each time , conditioned on the borrower’s true credit quality , the expected degree is a constant denoted as .*

*Proof.* Conditioned on the borrower’s true credit type , the probability of having a network tie with any other individual is

Thus the conditional expected number of friends is given by

As , we make the simplifying assumption that is a constant denoted by . Thus in the limit .

The simplifying assumption is justified by the consideration that, in a typical human social network, types are diffuse and the population size is large. Besides, in a small social network with less diffuse individual types, the benefits of social network scoring would be limited.
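The role of diffuse types in this assumption can be illustrated numerically. The sketch below assumes that another individual's credit type is drawn from a normal distribution with mean `mu` and variance `v`, and that the tie probability is the unit-scale Rayleigh survival function `exp(-d**2 / 2)`; both names are hypothetical. Under these assumptions the probability that a borrower of type `x` is tied to a randomly drawn individual is a Gaussian integral with the closed form `exp(-(x - mu)**2 / (2 * (1 + v))) / sqrt(1 + v)`, which becomes nearly flat in `x` when `v` is large.

```python
import numpy as np


def expected_tie_probability(x, mu, v):
    """Closed form of E[exp(-(x - Z)**2 / 2)] for Z ~ N(mu, v)."""
    return np.exp(-0.5 * (x - mu) ** 2 / (1.0 + v)) / np.sqrt(1.0 + v)


def monte_carlo_tie_probability(x, mu, v, n_draws, rng):
    """Monte Carlo estimate of the same expectation, as a sanity check."""
    z = rng.normal(mu, np.sqrt(v), size=n_draws)
    return np.exp(-0.5 * (x - z) ** 2).mean()
```

With a large type variance, the ratio `expected_tie_probability(mu + 1, mu, v) / expected_tie_probability(mu, mu, v)` is close to one, so the expected degree depends only weakly on the borrower's own type. This is consistent with, though not a proof of, treating the expected degree as a constant.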

Proposition 3. *For any and , let be the precision of the network information at time . Further define the parameter*

*Then it holds that*

(i) *For any , is Gaussian and its dynamics satisfies the following: whilst the conditional variance is given by with initial values . is the same as in (9) whilst and*

(ii) *At information date , is Gaussian with the conditional mean and variance updated from their respective values at (that is, before the arrival of the network information) to be mean and variance*

*Proof.* *Part (i)*. Between two information dates, , there is no arrival of new network information. The only observed process is the continuous time returns with the lender’s information set being . Thus, we have the standard Kalman-Bucy filtering case, with the initial values for conditional mean and conditional variance being and , respectively. The result thus follows from (8) and (9).

*Part (ii)*. At the information date, , the estimates for the mean and variance are updated from their values at time using the Bayesian method. The borrower’s ego network at time comprises all her direct network ties. We assume that the lender is able to observe the borrower’s complete ego-network at time . The posterior probability of the borrower’s credit type is obtained by

The last equality is as a result of the assumption of independence for the . We have being the assumed density of any individual for . The lender does not update his knowledge of individual signal. The remaining part of the integrand is

where

(i) The first term denotes the product of the prior density (before the arrival of network information) and the likelihood function for the observation .

(ii) (a) denotes the probability that at time borrower is friends with the individuals within her ego-network (in ) whose signals are in and that these friends have the signals as collected in

(iii) (b) denotes the probability that at time borrower is not friends with anyone outside .

As , , by the monotone convergence theorem and applying Lemma 2 above, then

which has no term. Hence we have

whereby the integrand is a product of Gaussian densities. Upon integrating out and matching the terms of and we obtain the posterior density which is Gaussian with the given expectation and variance.
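The structure of the update at an information date can be illustrated with a simplified conjugate Gaussian calculation. The sketch below is not the update formula of Proposition 3; it is a stand-in under explicit assumptions: the borrower's own signal has noise variance `signal_var`, and each friend's signal is treated as a conditionally independent Gaussian observation of the borrower's credit quality with variance `signal_var + friend_extra_var`. The extra unit variance is what a unit-scale Gaussian tie kernel contributes when a friend's type is integrated out under a diffuse prior. All names are hypothetical.

```python
import numpy as np


def network_update(prior_mean, prior_var, own_signal, friend_signals,
                   signal_var, friend_extra_var=1.0):
    """Precision-weighted Gaussian update from own and friends' signals.

    Simplifying assumption: friends' signals are independent Gaussian
    observations of the borrower's type with inflated variance.
    """
    friend_signals = np.asarray(friend_signals, dtype=float)
    own_prec = 1.0 / signal_var
    friend_prec = 1.0 / (signal_var + friend_extra_var)
    # Posterior precision is the sum of prior and signal precisions
    post_prec = 1.0 / prior_var + own_prec + friend_signals.size * friend_prec
    post_mean = (prior_mean / prior_var
                 + own_prec * own_signal
                 + friend_prec * friend_signals.sum()) / post_prec
    return post_mean, 1.0 / post_prec
```

In this simplified setting every additional friend adds precision, so the posterior variance falls as the degree grows, and a very noisy signal (large `signal_var`) leaves the prior essentially unchanged.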

For the case whereby the lender observes only the network information, i.e., , we have that between information dates , there is no arrival of new information. We thus get the following corollary.

Corollary 4. *When the lender’s information set is we have the following:*

(i) *For , the estimates for the mean and variance are given by*

(ii) *At information date , it holds that is Gaussian with mean and variance respectively.*

*Proof.* Between information dates, it follows that for . Thus and . From the state/observation model, we get that has the form

Therefore

The conditional variance is given by

Here we have employed the martingale property of the Brownian motion process and used the Ito isometry to obtain the variance. At information date , the estimates are updated by the information received from the random vector . With a Gaussian prior for , we use the Bayesian update similar to Proposition 3 part (ii) to get the posterior expectation and variance.

*Remark 5.*

(i) For the case where the individual has no network ties (isolated node), i.e., when , then reduces to . Considering the behavior of the estimates with limiting values of , when we have that and whilst when we have and . For the implication is that no additional information on the hidden process can be obtained by observing the network data.

(ii) Further, the estimates and can be obtained from and as limiting cases as . Thus between information dates, the dynamics for the conditional mean and conditional variance are and , respectively. These are deterministic ODEs which can be solved to yield the expressions in Corollary 4. The implication is that on account of no additional information relating to the hidden process can be obtained by observing the continuous time process .

(iii) We may consider a case whereby the information arrival times are random times with intensity akin to the jump times of a Poisson process. The random times are independent of the processes and the Brownian motions ; thus the timing of jumps does not carry any information. We deal with this case in Section 4 on optimal credit limit management.

In the following proposition, we show how the additional network information improves the lender’s estimate of the borrower’s true credit quality. This is captured by the lower variance associated with the estimates from the information set compared to the other two alternative information settings. Thus a lender incorporating signals from the borrower’s ego-network in his analysis is bound to have a better estimate of the borrower’s true credit history.

Proposition 6 (properties of the conditional variance). *For , and*

*Proof.* For , the dynamics of the conditional variance is given by the function for . From Proposition 3 (i), we have

whilst following Remark 5

Comparing and , we note that, at time , since for all values of . Since and a unique solution for the ODE exists, we note that the inequality will persist for all . At time , we again have the inequality . Extending the argument for all yields that .

To prove that , we note that, at time , . For , we have . This means that the two deterministic functions starting at the same initial value will have the relation . At time , since the map is nondecreasing, we shall have . Iterating this argument for all we conclude that .

The implication of this result is that the network information improves the accuracy of the lender’s belief about the borrower’s true credit quality. Periodically observing the borrower’s ego-network at discrete times without even observing the entire network leads to a better estimate of the hidden true credit quality. This is as a result of the network ties being based on borrower type homophily. This is quite in line with industry’s practice in digital and mobile phone based lending whereby an individual’s social network data can be used to improve the estimate of her credit quality.
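The variance ordering can be visualised with a short numerical experiment. The sketch below is an illustration under assumptions, not a reproduction of the paper's formulas: between information dates both variances follow an Euler discretisation of the Riccati ODE `dgamma/dt = sigma_x**2 - 2 * kappa * gamma - (a1 / sigma_y)**2 * gamma**2`, and at each information date the variance with network data is reduced by adding a fixed amount of precision, `update_precision`. That constant is a hypothetical stand-in for the total precision of the network signals when the degree is treated as constant.

```python
import numpy as np


def variance_paths(T, n_steps, info_every, v0, kappa, sigma_x, a1, sigma_y,
                   update_precision):
    """Conditional variance with behavioural data only and with network updates."""
    dt = T / n_steps
    c = (a1 / sigma_y) ** 2
    g_beh = np.empty(n_steps + 1)   # behavioural observations only
    g_all = np.empty(n_steps + 1)   # behavioural observations plus network data
    g_beh[0] = g_all[0] = v0        # no network information arrives at time 0
    for k in range(n_steps):
        g_beh[k + 1] = g_beh[k] + (sigma_x ** 2 - 2.0 * kappa * g_beh[k]
                                   - c * g_beh[k] ** 2) * dt
        g_all[k + 1] = g_all[k] + (sigma_x ** 2 - 2.0 * kappa * g_all[k]
                                   - c * g_all[k] ** 2) * dt
        if (k + 1) % info_every == 0:
            # Bayesian update at an information date: precisions add up
            g_all[k + 1] = 1.0 / (1.0 / g_all[k + 1] + update_precision)
    return g_beh, g_all
```

Plotting `g_all` against `g_beh` shows a sawtooth path that drops at every information date and never rises above the behavioural-only variance, provided the time step is small enough for the Euler map to be monotone.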

#### 4. Optimal Credit Limit Management

Following Remark 5 (iii), in this section, we consider the case whereby the network information arrival times are modeled as Poisson jump times, with jump intensity . No information arrives at time . As such the sequence is a marked point process, though the dimension of the vector is governed by the borrower’s degree at time . In this case, the dynamics of the filtered process and will be the same as in Proposition 3, save for the fact that will be a piecewise deterministic process with stochastic jump times. Denote the jumps of at time as

From the properties of the filtered process , has a Gaussian distribution. Let denote the density of with first and second moments and , respectively. We note that the jump sizes are not i.i.d. as in the compound Poisson case, since the density is dependent on the state of the system at time .
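Random information dates of this kind can be generated by accumulating independent exponential inter-arrival times. The sketch below assumes a homogeneous Poisson process with a constant intensity; the function name and the argument names `T` and `intensity` are introduced for illustration only.

```python
import numpy as np


def poisson_arrival_times(T, intensity, rng):
    """Jump times on (0, T) of a homogeneous Poisson process with given intensity."""
    times = []
    t = rng.exponential(1.0 / intensity)   # first arrival is strictly after time 0
    while t < T:
        times.append(t)
        t += rng.exponential(1.0 / intensity)
    return np.array(times)
```

On average the function returns `intensity * T` arrival times, and no arrival occurs at time zero, in line with the convention stated above.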

Consider a financial market model with terminal time whereby a lender has access to the continuous time information and the network information generated by the marked point process . The lender desires to obtain an estimate of the borrower’s true credit quality based on the information set . Besides, the lender desires to optimize his credit decision, namely, the credit limit sanctioned to the borrower from time to time. Thus we are faced with a stochastic optimal control problem. For mathematical tractability, we assume a nondefaulting borrower who is credit constrained and thus will accept and fully utilize the credit limit as and when availed. Thus, the lender avails a credit line facility on a revolving fund basis, and the borrower is expected to continuously utilize and repay any outstanding amounts. The lender seeks to adjust the sanctioned credit limit in line with the borrower’s time varying credit quality. Unlike with consumer credit card limits, the borrower is expected to fully repay any outstanding amounts before any new borrowing. Such a credit arrangement is applicable in unsecured mobile and digital banking products, especially within the emerging market economies; see, e.g., [25, 26].

Define the controlled credit quality process as

When the borrower has no outstanding credit balance, . As soon as the borrower obtains credit within the sanctioned limit, the finances are channeled towards short-term working capital/consumption with return of (different from the drift before borrowing), which we take net of the price per unit of funds . We assume that borrowing impacts the drift of the borrower’s credit quality without affecting her volatility. Let and be the lender’s objective function, where is the lender’s discounting factor, is the price per unit of funds, is the lender’s cost per unit of funds, and is the marginal lending cost. , and are all positive. The lender’s optimization problem is

where , the admissible set of lending strategies, is defined below. Since the controlled state process is not observable, this is a situation of stochastic optimization with a partially observed process. From Proposition 3 and Remark 5 and applying Ito’s formula for jump diffusion processes, we condition on the information set to obtain the filtered controlled state process as

where is the innovation process and is the Poisson process with jump intensity . Let be the compensator for the jump process. The complete information optimization problem is given as

Thus the lender seeks to obtain the value function and the optimal loan limit such that . Problem (37) with dynamics (36) is a linear quadratic Gaussian regulator problem with jump diffusion. We solve the optimization problem using the dynamic programming approach.

*Definition 7.* The admissible set is defined as

where is the family of functions such that and its derivatives are continuous on . The upper bound on the loan amount is established exogenously. The upper bound may be in respect to the credit product specification or the lender’s risk management criteria.

##### 4.1. The Dynamic Programming Method

We know that if the value function , then it satisfies the Hamilton-Jacobi-Bellman (HJB) equation

We proceed to solve the optimization problem in the following proposition.

Proposition 8. *The function solves the HJB equation with the constants , and given as*

*Proof.* By solving the minimization problem in the HJB equation, we obtain the candidate optimal loan amount as

Plugging this into the HJB equation yields the following second-order ODE:

Given that the value function is a twice continuously differentiable function of , we make the following quadratic *ansatz*:

whereby , and are constants to be determined. The derivatives of are given by and . Further, we have

Inserting the derivatives of together with (45) and rearranging yields the following:

which holds for all . The terms in the large parentheses must equal zero. As such, solving for the constants , and we conclude the proof. The case is trivial since then, with the optimal control being a constant.

Proposition 9.

(i) *The value function is given by*

*where*

(ii) *The optimal credit limit process is given by*

*Proof.*

(i) Given that the solution of the HJB equation is a function in , it follows by the verification theorem for jump diffusion processes that the value function .

(ii) In addition, the candidate optimal loan limit process obtained from the optimization problem with the HJB is indeed the optimal loan limit process. Reference [6] provides an excellent review of the verification theorem for jump diffusion processes.

The optimal loan limit is a straight line with a y-intercept that depends on the expectation of the jump size . Since the density of depends on the state of the system at time , the intercept will change in line with the time varying values of . The positive gradient of the line is independent of the jump sizes.

It may be optimal for the lender to decline to avail any credit line to the borrower, i.e., to have . This occurs when

In this case, for to be positive we must have .

Corollary 10. *For the case when there is no network data, i.e., when , the value function is given by*

*where and the optimal credit limit process is*

*Proof.* To obtain the optimal value function when the observation filtration is , simply set the jump intensity parameter to zero in Proposition 9, which completes the proof.

When , the lender’s optimization problem reduces to the normal linear Gaussian quadratic regulator and the objective function ; see, e.g., [7].

*Remark 11.* The impact of the network information on the lender’s optimal decision process is seen by noting that whenever for all then . This, coupled with the variance results in Proposition 6, captures the gains made by incorporating the borrower’s ego-network information in the lender’s credit risk strategy.

#### 5. Numerical Results

In this section we illustrate our findings in the preceding sections. We model the hidden credit quality process as an Ornstein-Uhlenbeck process and the observed returns process as a linear diffusion process. We assume that the network information arrives at equidistant time points . We simulate the processes using the parameter values as found in Table 1.