**International Journal of Mathematical Research**

January 2017, Volume 6, 1, pp 29-35

Aliu, A. Hassan, Abiodun A. A., Ipinyomi R.A.


**Article History:**

Received:
24 February, 2017

Revised:
25 April, 2017

Accepted:
18 May, 2017

Published:
13 June, 2017

Diffusion processes governed by stochastic differential equations (SDEs) are a well-known tool for modeling continuous-time data. Consequently, there is wide interest in efficiently estimating diffusion parameters from discretely observed data. Likelihood-based inference can be problematic, as the transition densities are rarely available in closed form. One widely used solution, proposed by Pedersen (1995), introduces latent data points between every pair of observations so that an Euler-Maruyama approximation of the true transition densities becomes accurate. Markov chain Monte Carlo (MCMC) methods are then used to sample the posterior distribution of the latent data and model parameters. We apply this method to epidemic data that are discretely observed and undergo stochastic transition rates. We introduce a modified innovation scheme that addresses the degeneracy problem afflicting such MCMC schemes. The method is capable of efficiently sampling estimates of diffusion parameters from discretely observed epidemic data with measurement error.


**Keywords**: Diffusion process, Stochastic differential equation, Bayesian inference, Numerical solution, Partially observed data, Diffusion bridge, MCMC, SEIR epidemic model.

This study contributes to the existing work of Golightly and Wilkinson (2008). We use a Bayesian data augmentation approach on high-frequency, discretely observed diffusions. The primary focus is a modified innovation scheme that guards against degeneracy of the sampler when the number of imputed time points is very large.

Most epidemic data are discretely observed and undergo stochastic transition rates. Stochastic epidemic models allow a more realistic description of disease transmission than deterministic epidemic models (Becker, 1989; Andersson and Britton, 2000). However, parameter estimation for stochastic models is challenging when the data are discretely observed (Sørensen, 2004; Jimenez *et al.*, 2006). Several frequentist procedures for inferring the parameters have been considered in the literature. Most of these techniques struggle when inter-observation times are large.

Here, we employ an efficient Bayesian estimation approach based on stochastic differential equations (SDEs). SDE models play a prominent role in a range of application areas, including biology, chemistry, epidemiology, mechanics, microelectronics, economics, and finance (Black and Scholes, 1973; Merton, 1976; Cox *et al.*, 1985a; Bibby and Sørensen, 2001; Elerian *et al.*, 2001; Eraker, 2001; Chiarella *et al.*, 2009). A complete understanding of SDE theory requires familiarity with advanced probability and stochastic processes. The solutions of such equations are often referred to as diffusion processes.

Diffusion processes are a promising instrument for realistically modeling the continuous-time evolution of natural phenomena. They have an advantage over some other stochastic formulations in that they can be derived directly from the deterministic system of ordinary differential equations and have a relatively simple form (Øksendal, 2003).

Inferring the parameters of such models from discrete observations is a challenging problem.

In this paper, we review some of the existing solutions to this parameter estimation problem. We adopt a Bayesian imputation approach to infer the parameters of interest: we replace the intractable transition densities with a first-order Euler-Maruyama approximation and use data augmentation to limit the discretisation error incurred by the approximation.

We restrict attention to estimation within the Bayesian imputation approach. The essential idea of the Bayesian imputation approach is to augment low frequency data by introducing intermediate time-points between observation times. An Euler-Maruyama scheme is then applied by approximating the transition densities over the induced discretisation.

To deal with such data, we partition the process as $X_t = (X_t^{(1)}, X_t^{(2)})^{T}$ and write the data $D$ as a discretely observed part $D_n^{(1)}$ and an unobserved part $D_n^{(2)}$, where $X_t^{(1)}$ has dimension $d_1 > 0$ and $X_t^{(2)}$ has dimension $d_2 \ge 0$, with $d_1 + d_2 = d$. If $d_2 = 0$, the process is fully observed.

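We consider a parameterized family of $d$-dimensional diffusion processes $\{X_t, t \ge 0\}$ satisfying a stochastic differential equation of the form

$$dX_t = \mu(X_t, \theta)\,dt + \sigma(X_t, \theta)\,dW_t,$$

where $X_t$ is the value of the process at time $t$, $\mu$ is the drift, $\sigma$ is the diffusion coefficient, $W_t$ is a standard $d$-dimensional Brownian motion and $\theta$ is the parameter vector. Over a small time step $\Delta t$, the Euler-Maruyama approximation is

$$\Delta X_t = \mu(X_t, \theta)\,\Delta t + \sigma(X_t, \theta)\,\Delta W_t,$$

where $\Delta W_t \sim N_d(0, I\Delta t)$. Since a diffusion process is a Markov process, and assuming equidistant observation times, the likelihood of the observations given the parameters is of the form

$$L(\theta) = \prod_{k=0}^{K} \pi(x_{t_{k+1}} \mid x_{t_k}, \theta), \qquad (4)$$

where $\pi(x_{t_{k+1}} \mid x_{t_k}, \theta)$ denotes the transition density from $x_{t_k}$ to $x_{t_{k+1}}$.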

This likelihood function is very rarely available in closed form, so maximum likelihood estimation is generally intractable. We therefore consider a Bayesian method of estimation.
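The Euler-Maruyama discretisation referred to above can be illustrated with a minimal sketch for a scalar SDE. The function names and the Ornstein-Uhlenbeck example are hypothetical choices made for illustration only; they are not taken from the paper, whose own implementation was in R.

```python
import math
import random


def euler_maruyama(x0, drift, diffusion, theta, t_end, n_steps, rng):
    """Simulate a scalar SDE dX = drift dt + diffusion dW on [0, t_end]."""
    dt = t_end / n_steps
    path = [x0]
    x = x0
    for _ in range(n_steps):
        # Brownian increment over one step: N(0, dt)
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(x, theta) * dt + diffusion(x, theta) * dw
        path.append(x)
    return path


# Hypothetical example model (assumption): Ornstein-Uhlenbeck process.
def ou_drift(x, theta):
    return -theta["rate"] * x


def ou_diffusion(x, theta):
    return theta["sigma"]


rng = random.Random(1)
path = euler_maruyama(1.0, ou_drift, ou_diffusion,
                      {"rate": 0.5, "sigma": 0.2}, 10.0, 1000, rng)
```

The accuracy of each step depends on the step size, which is why the approximation is only trusted over short intervals.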

In statistics, Bayesian inference is a method of inference in which Bayes' rule is used to update the probability estimate for a hypothesis as additional evidence is acquired. The idea behind Bayesian inference is that the likelihood and prior are combined using Bayes' theorem to compute the posterior distribution.

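The posterior density from (4) is given by

$$\pi(\theta \mid D) \propto \pi(\theta) \prod_{k} \pi(x_{t_{k+1}} \mid x_{t_k}, \theta), \qquad (5)$$

where $\pi(\theta)$ is the prior density and the product runs over consecutive pairs of observations. Replacing the intractable transition densities with the Euler-Maruyama approximation might not be accurate if the interval $[t_k, t_{k+1}]$ is too large. We therefore adopt a *data augmentation* approach.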

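In data augmentation, we insert $m-1$ additional time points within each interval $[t_k, t_{k+1}]$:

$$t_k = \tau_{km} < \tau_{km+1} < \dots < \tau_{(k+1)m} = t_{k+1}, \qquad k = 0, \dots, K \qquad (6)$$

where $\Delta\tau = \tau_{j+1} - \tau_j = (t_{k+1} - t_k)/m$.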
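As a small illustration of the augmentation step, the sketch below builds the refined time grid by placing m-1 equally spaced points inside each observation interval. The function name `augment_times` is a hypothetical helper, and equal spacing within each interval is an assumption.

```python
def augment_times(obs_times, m):
    """Insert m-1 equally spaced points between consecutive observation times."""
    grid = []
    for t_lo, t_hi in zip(obs_times[:-1], obs_times[1:]):
        d_tau = (t_hi - t_lo) / m
        # t_lo itself plus the m-1 imputed times in (t_lo, t_hi)
        grid.extend(t_lo + j * d_tau for j in range(m))
    grid.append(obs_times[-1])
    return grid


grid = augment_times([0.0, 1.0, 2.0], 4)
```

With two observation intervals and m = 4, the grid has 2 × 4 + 1 = 9 points, of which three are observation times.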

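Therefore, the joint posterior for the parameters and imputed data is

$$\pi(\theta, x \mid D) \propto \pi(\theta) \prod_{j} \pi_E(x_{\tau_{j+1}} \mid x_{\tau_j}, \theta), \qquad (7)$$

where the product runs over all consecutive points of the augmented grid and the Euler density is

$$\pi_E(x_{\tau_{j+1}} \mid x_{\tau_j}, \theta) = N_d\big(x_{\tau_{j+1}};\; x_{\tau_j} + \mu(x_{\tau_j}, \theta)\,\Delta\tau,\; \sigma(x_{\tau_j}, \theta)\,\sigma(x_{\tau_j}, \theta)^{T}\,\Delta\tau\big).$$

Here $\mu(\cdot, \theta)$ and $\sigma(\cdot, \theta)$ are the drift and diffusion coefficient of the SDE, and $N_d(\cdot\,; a, \Sigma)$ denotes the multivariate Gaussian density with mean $a$ and variance-covariance matrix $\Sigma$.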
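To make the Euler density concrete, here is a minimal scalar sketch of the log-likelihood of an augmented path under the Euler-Maruyama approximation. The function names are hypothetical, and the scalar setting and the callable `drift`/`diffusion` arguments are simplifying assumptions; adding a log-prior to the result would give an unnormalised log-posterior.

```python
import math


def normal_logpdf(x, mean, var):
    """Log-density of a univariate Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def euler_loglik(path, d_tau, drift, diffusion, theta):
    """Sum of Euler log-densities over consecutive points of a skeleton path."""
    total = 0.0
    for x_now, x_next in zip(path[:-1], path[1:]):
        mean = x_now + drift(x_now, theta) * d_tau
        var = diffusion(x_now, theta) ** 2 * d_tau
        total += normal_logpdf(x_next, mean, var)
    return total
```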

**3.1. Sampling Procedure**

The posterior distribution is typically analytically intractable; we therefore sample from it via a Markov chain Monte Carlo (MCMC) scheme.

For the path update, we sample

$x \mid x_0, x_T, \theta$

For the parameter update, we sample

$\theta \mid x_0, x_T, x$

For the path update, various diffusion bridge proposal mechanisms for sampling the skeleton path have been proposed in the literature, such as the diffusion bridge of Roberts and Stramer (2001), the modified diffusion bridge of Durham and Gallant (2001) and the regularized sampler of Lindstrom (2012), among others.

Here we adopt the modified diffusion bridge proposed by Durham and Gallant (2001).

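Assuming the start point ($x_0 = x_{\tau_0}$) and the end point ($x_T = x_{\tau_m}$) of an interval are observed, the path update requires a proposal distribution

$q(x_{\tau_{k+1}} \mid x_{\tau_k}, x_{\tau_m}, \theta)$

with mean $\mu_{\tau_k}$ and variance $\Sigma_{\tau_k}$.

For a univariate model, the modified diffusion bridge is of the form:

$$x_{\tau_{k+1}} \mid x_{\tau_k}, x_{\tau_m}, \theta \sim N(\mu_{\tau_k}, \Sigma_{\tau_k}), \qquad k = 0, \dots, m-1 \qquad (8)$$

where $\mu_{\tau_k} = x_{\tau_k} + \dfrac{x_{\tau_m} - x_{\tau_k}}{m-k}$, $\Sigma_{\tau_k} = \dfrac{m-k-1}{m-k}\,\sigma^{2}(x_{\tau_k}, \theta)\,\Delta\tau$, and $\sigma(\cdot, \theta)$ is the diffusion coefficient of the SDE.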
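A minimal sketch of drawing one skeleton path from the modified diffusion bridge of Durham and Gallant is given below for the univariate case. It assumes the commonly stated form of the bridge, in which the proposal mean moves linearly toward the end point and the variance is scaled by (m-k-1)/(m-k); the function name and argument layout are hypothetical.

```python
import math
import random


def mdb_bridge(x_start, x_end, m, d_tau, diffusion, theta, rng):
    """Propose m-1 latent points between two fixed end points (univariate)."""
    path = [x_start]
    for k in range(m - 1):
        x = path[-1]
        remaining = m - k  # number of steps left to reach the end point
        mean = x + (x_end - x) / remaining
        var = (remaining - 1) / remaining * diffusion(x, theta) ** 2 * d_tau
        path.append(rng.gauss(mean, math.sqrt(var)))
    path.append(x_end)  # the end point is observed, so it is never resampled
    return path


rng = random.Random(3)
bridge = mdb_bridge(2.0, 3.0, 10, 0.1, lambda x, th: 0.5, None, rng)
```

The variance shrinks to zero as k approaches m-1, which is what forces the proposed path to meet the observed end point.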

A path proposed from this bridge targets the marginal posterior density of the imputed data, $\pi(x \mid x^{(i)}_{\tau_{k-1}}, x^{(i)}_{\tau_m}, \theta)$, and is accepted with probability of the form:

(9)
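For orientation, a Metropolis-Hastings acceptance probability for a bridge proposal of this kind typically takes the following form. This is a sketch under assumptions, not necessarily the authors' exact expression: $x^{*}$ denotes the proposed path, $x$ the current path, $\pi_E$ the Euler density and $q$ the bridge proposal density, with both paths sharing the same end points; the index ranges are likewise assumed.

```latex
\alpha(x^{*} \mid x) = \min\left\{ 1,\;
\frac{\prod_{k=0}^{m-1} \pi_E\left(x^{*}_{\tau_{k+1}} \mid x^{*}_{\tau_k}, \theta\right)}
     {\prod_{k=0}^{m-1} \pi_E\left(x_{\tau_{k+1}} \mid x_{\tau_k}, \theta\right)}
\times
\frac{\prod_{k=0}^{m-2} q\left(x_{\tau_{k+1}} \mid x_{\tau_k}, x_{\tau_m}, \theta\right)}
     {\prod_{k=0}^{m-2} q\left(x^{*}_{\tau_{k+1}} \mid x^{*}_{\tau_k}, x_{\tau_m}, \theta\right)}
\right\}
```

The first ratio compares the two paths under the target, and the second corrects for the fact that the paths were drawn from the proposal rather than from the target itself.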

Under this update scheme, the MCMC sampler becomes degenerate as m → ∞, because of the dependence between the parameters and the imputed values and the dependence among the values of the imputed latent process itself. This was first highlighted as a problem by Roberts and Stramer (2001). To overcome it, we consider the innovation scheme used by Golightly and Wilkinson (2008), although in its original form it is not applicable to discrete observation.

Our contribution is a modified innovation scheme. The MCMC sampling strategy builds on the innovation scheme first introduced by Chib *et al.* (2006). For a diffusion, there is a one-to-one relationship between ΔX_{t} and ΔW_{t}.

(10)

This implies:

Hence,

(11)

Rather than sampling θ conditional on the imputed data, the innovation scheme uses a subtle reparameterisation: θ is sampled conditional on the driving Brownian motion, and the latent path *x _{τk}* is then obtained deterministically, consistent with the parameters of the model. This overcomes the dependence problem. We let w_{τk} denote the Brownian increment innovations.
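The reparameterisation can be illustrated with a minimal scalar sketch of the one-to-one map between a skeleton path and its Brownian innovations under the Euler-Maruyama approximation. This shows the plain Euler mapping only and is not claimed to be the authors' modified scheme; the function names are hypothetical.

```python
def path_to_innovations(path, d_tau, drift, diffusion, theta):
    """Recover the Brownian increments that generate a path under Euler."""
    return [
        (x_next - x_now - drift(x_now, theta) * d_tau) / diffusion(x_now, theta)
        for x_now, x_next in zip(path[:-1], path[1:])
    ]


def innovations_to_path(x0, innovations, d_tau, drift, diffusion, theta):
    """Rebuild the path deterministically from innovations and parameters."""
    path = [x0]
    for dw in innovations:
        x = path[-1]
        path.append(x + drift(x, theta) * d_tau + diffusion(x, theta) * dw)
    return path


# Hypothetical example (assumption): mean-reverting drift, constant diffusion.
def example_drift(x, theta):
    return -theta["rate"] * x


def example_diffusion(x, theta):
    return theta["sigma"]


theta = {"rate": 0.5, "sigma": 0.2}
skeleton = [1.0, 1.1, 0.9, 0.95]
w = path_to_innovations(skeleton, 0.1, example_drift, example_diffusion, theta)
rebuilt = innovations_to_path(skeleton[0], w, 0.1, example_drift, example_diffusion, theta)
```

Holding `w` fixed and calling `innovations_to_path` with a new `theta` yields a path that is automatically consistent with the new parameters, which is the property the innovation scheme relies on.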

Here, we sample the parameters of interest (θ), given the driving Brownian innovations (w_{τk}) and the observations (D_{T}), thus:

(12)

where the Jacobian for one increment is

The target distribution therefore becomes

(13)

Having set this update scheme, the acceptance probability now becomes

(14)

We demonstrate the performance of the methods described above by applying them to a synthetic, simulated epidemic diffusion model. We consider a stochastic infection model (the SEIR model) governed by the diffusion system:

(15)

Here, the state variable is $X_t = (x_1, x_2, x_3)^{T}$, where $x_1$ denotes the susceptible, $x_2$ the exposed, and $x_3$ the infectious individuals, with initial conditions (500000, 1000, 10) respectively. The parameter of interest is $\theta = (\beta, \gamma, \alpha)^{T}$; we initialized the sampler with $0 < \beta < 1$, $0 < \gamma < 0.7$ and $0.1 < \alpha < 1$, which represent the transmission rate, exposed rate and infection rate respectively. We ran the sampler for $10^{4}$ iterations with three different numbers of imputed time points (m = 5, m = 15 and m = 50). For the parameter proposal, we used an independent sampler of the form $N_d(0, \psi_j^{2})$, where $\psi_j^{2}$ is the tuned variance, {0.009, 0.009, 0.001} for the beta, gamma and alpha parameters respectively.

To show that the proposed method does not degenerate as the number of imputed time points increases, we applied the modified innovation scheme. We set the starting time at $t_0 = 0$ and the end time at T = 30, with equidistant time interval Δτ = 0.001.

We chose an uninformative prior for each parameter and applied the MCMC scheme to infer the posterior distribution of the model parameters.
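A single parameter update with Gaussian proposals can be sketched as follows. Several points here are assumptions rather than details from the paper: the N(0, ψ²) draw is treated as a random-walk increment added to the current value, the uninformative prior is taken to be flat on the initialization ranges quoted above, and the flat `flat_log_post` target is only a placeholder for the real log-posterior.

```python
import math
import random

# Ranges quoted in the text for (beta, gamma, alpha); using them as the
# support of a flat prior is an assumption made for this sketch.
BOUNDS = [(0.0, 1.0), (0.0, 0.7), (0.1, 1.0)]


def flat_log_post(theta):
    """Placeholder target: flat inside the bounds, zero mass outside."""
    inside = all(lo < t < hi for t, (lo, hi) in zip(theta, BOUNDS))
    return 0.0 if inside else -math.inf


def metropolis_step(theta, log_post, step_sd, rng):
    """One Metropolis update with independent Gaussian perturbations."""
    proposal = [t + rng.gauss(0.0, s) for t, s in zip(theta, step_sd)]
    log_ratio = log_post(proposal) - log_post(theta)
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return proposal, True
    return theta, False


rng = random.Random(42)
step_sd = [math.sqrt(v) for v in (0.009, 0.009, 0.001)]  # tuned variances
state = [0.5, 0.3, 0.5]
chain = [state]
for _ in range(200):
    state, _accepted = metropolis_step(state, flat_log_post, step_sd, rng)
    chain.append(state)
```

In an actual run, `flat_log_post` would be replaced by the log-prior plus the Euler log-likelihood of the path rebuilt from the innovations.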

We compared the naive method with our new method (the modified innovation scheme) for the path and parameter updates; the results are shown below.

The implementation was carried out in R.

**Figure-1.** (a) Density plots under the innovation scheme for the three numbers of imputed points; the three densities are very close. (b) Trace plots for the three parameters; the chains mix very well.

**Source:** Simulated SEIR synthetic data.

**2(a).** Autocorrelation for the naive method

**2(b).** Autocorrelation for the modified innovation scheme

**Figure-2(a) & (b).** Autocorrelation for the parameter beta under the traditional naive method (a) and the modified innovation scheme (b).

**Source:** Simulated SEIR synthetic data.

**3(a).** Autocorrelation for the naive method

**3(b).** Autocorrelation for the modified innovation scheme

**Figure-3(a) & (b).** Autocorrelation for the parameter alpha under the naive method (a) and the modified innovation scheme (b).

**Source:** Simulated SEIR synthetic data.
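Autocorrelation plots of this kind summarise how strongly successive MCMC draws depend on one another. A minimal sketch of the sample autocorrelation function for a single parameter chain is shown below; the function name is hypothetical and the estimator is the usual biased one normalised by the lag-0 sum of squares.

```python
def autocorrelation(chain, max_lag):
    """Sample autocorrelation of a scalar chain for lags 0..max_lag."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [c - mean for c in chain]
    var = sum(c * c for c in centred)
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / var
        for lag in range(max_lag + 1)
    ]


acf = autocorrelation([1.0, -1.0] * 50, 2)
```

A chain that mixes well shows autocorrelations that fall quickly toward zero as the lag grows, whereas a degenerate sampler produces values that stay close to one.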

We considered a diffusion process approach based on a stochastic discrete-time approximation of the diffusion process, with the aim of estimating the unobserved latent data and the parameters of a given epidemic model when the number of imputed time points is very large. We presented a naive class of estimation alongside the modified innovation scheme, which is computationally and statistically efficient and can be readily applied in situations where discrete observation of the process is possible. Diffusion processes governed by stochastic differential equations (SDEs) are a well-known tool for modeling continuous-time data. However, most epidemic data are discretely observed and undergo stochastic transition rates. Likelihood-based inference can be problematic, as the transition densities are rarely available in closed form. Consequently, there is wide interest in efficiently estimating diffusion parameters from discretely observed data. We focused on an innovation scheme that addresses the degeneracy problems reported in the literature. The modified innovation method is capable of efficiently sampling estimates of diffusion parameters from discretely observed epidemic data even for a very large number of imputed time points (see Figures 1, 2 and 3). Under the modified innovation scheme, increasing the number of imputed points does not worsen the mixing of the chain (Figures 2 and 3). Also, as the number of imputed points tends to infinity (m → ∞), both the parameter and path updates remain consistent, and the scheme does not become degenerate.

Our work can be extended in a number of ways, especially to partially observed discrete data and to observations with measurement error.

Funding: This study received no specific financial support.

Competing Interests: The authors declare that they have no competing interests.

Contributors/Acknowledgement: All authors contributed equally to the conception and design of the study.

Allen, E.J., 2007. Modeling with Itô stochastic differential equations. Dordrecht, The Netherlands: Springer.

Andersson, H. and T. Britton, 2000. Stochastic epidemic models and their statistical analysis.

Becker, N., 1989. Analysis of infectious disease data. London: Chapman & Hall.

Bibby, B. and M. Sørensen, 2001. Simplified estimating functions for diffusion models with a high-dimensional parameter. Scandinavian Journal of Statistics, 28(1): 99-112.

Black, F. and M. Scholes, 1973. The pricing of options and corporate liabilities. Journal of Political Economy, 81(3): 637-654.

Chiarella, C., H. Hung and T.D. Tô, 2009. The volatility structure of the fixed income market under the HJM framework: A nonlinear filtering approach. Computational Statistics and Data Analysis, 53(6): 2075-2088.

Chib, S., M.K. Pitt and N. Shephard, 2006. Likelihood based inference for diffusion driven models. Economics Papers No. 2004-W20, Economics Group, Nuffield College, University of Oxford.

Cox, J., J. Ingersoll and S. Ross, 1985a. An intertemporal general equilibrium model of asset prices. Econometrica, 53(2): 363-384.

Durham, G.B. and A.R. Gallant, 2001. Numerical techniques for maximum likelihood estimation of continuous-time diffusion processes. Journal of Business and Economic Statistics, 20(3): 297-338.

Elerian, O., S. Chib and N. Shephard, 2001. Likelihood inference for discretely observed nonlinear diffusions. Econometrica, 69(4): 959-993.

Eraker, B., 2001. MCMC analysis of diffusion models with application to finance. Journal of Business & Economic Statistics, 19(2): 177-191.

Golightly, A. and D.J. Wilkinson, 2008. Bayesian inference for nonlinear multivariate diffusion models observed with error. Computational Statistics and Data Analysis, 52(3): 1674-1693.

Jimenez, J., R. Biscay and T. Ozaki, 2006. Inference methods for discretely observed continuous-time stochastic volatility models: A commented overview. Asia-Pacific Financial Markets, 12(2): 109-141.

Lindstrom, E., 2012. A regularised bridge sampler for sparsely sampled diffusions. Statistics and Computing, 22(2): 615-623.

Merton, R., 1976. Option pricing when underlying stock returns are discontinuous. Journal of Financial Economics, 3(1-2): 125-144.

Øksendal, B., 2003. Stochastic differential equations: An introduction with applications. 6th Edn., New York: Springer.

Pedersen, A.R., 1995. Consistency and asymptotic normality of an approximate maximum likelihood estimator for discretely observed diffusion processes. Bernoulli, 1(3): 257-279.

Roberts, G.O. and O. Stramer, 2001. On inference for partially observed nonlinear diffusion models using the Metropolis-Hastings algorithm. Biometrika, 88(3): 603-621.

Sørensen, H., 2004. Parametric inference for diffusion processes observed at discrete points in time: A survey. International Statistical Review, 72(3): 337-354.

Views and opinions expressed in this article are the views and opinions of the author(s). International Journal of Mathematical Research shall not be responsible or answerable for any loss, damage or liability etc. caused in relation to/arising out of the use of the content.