International Journal of Mathematical Research

January 2017, Volume 6, Issue 1, pp 29-35

Statistical Inference for Discretely Observed Diffusion Epidemic Models


Aliu, A. Hassan (1), Abiodun A. A. (2), Ipinyomi R.A. (2)

  1. Department of Mathematics and Statistics, Rufus Giwa Polytechnic, Owo, Ondo State, Nigeria

  2. Department of Statistics, University of Ilorin, Nigeria


DOI: 10.18488/journal.24.2017.61.29.35


Article History:

Received: 24 February, 2017
Revised: 25 April, 2017
Accepted: 18 May, 2017
Published: 13 June, 2017


Abstract

Diffusion processes governed by stochastic differential equations (SDEs) are a well-known tool for modelling continuous-time data. Consequently, there is wide interest in efficiently estimating diffusion parameters from discretely observed data. Likelihood-based inference can be problematic, as the transition densities are rarely available in closed form. One widely used solution, proposed by Pedersen (1995), introduces latent data points between every pair of observations so that an Euler-Maruyama approximation of the true transition densities becomes accurate. Markov chain Monte Carlo (MCMC) methods are then used to sample the posterior distribution of the latent data and the model parameters. We apply this method to discretely observed epidemic data that undergo stochastic transition rates. We introduce a modified innovation scheme that addresses the degeneracy problem afflicting standard MCMC schemes. The method is capable of efficiently sampling diffusion parameters from discretely observed epidemic data with measurement error.


Keywords: Diffusion process, Stochastic differential equation, Bayesian inference, Numerical solution, Partially observed data, Diffusion bridge, MCMC, SEIR epidemic model.

Contribution/ Originality:

This study contributes to the existing work of Golightly and Wilkinson (2008). We use a Bayesian data augmentation approach for high-frequency, discretely observed diffusions. The primary focus is a modified innovation scheme that addresses sampler degeneracy when the number of imputed time points is very large.


1. INTRODUCTION

Most epidemic data are discretely observed and undergo stochastic transition rates. Stochastic epidemic models allow a more realistic description of disease transmission than deterministic epidemic models (Becker, 1989; Andersson and Britton, 2000). However, parameter estimation for stochastic models is challenging when the data are discretely observed (Sørensen, 2004; Jimenez et al., 2006). Several frequentist procedures for inferring the parameters have been considered in the literature. Most techniques struggle when inter-observation times are large.

Here, we employ an efficient Bayesian estimation approach within the stochastic differential equation (SDE) framework. SDE models play a prominent role in a range of application areas, including biology, chemistry, epidemiology, mechanics, microelectronics, economics, and finance (Black and Scholes, 1973; Merton, 1976; Cox et al., 1985a; Bibby and Sørensen, 2001; Elerian et al., 2001; Eraker, 2001; Chiarella et al., 2009). A complete understanding of SDE theory requires familiarity with advanced probability and stochastic processes. The solutions of such equations are often referred to as diffusion processes.

Diffusion processes are a promising instrument for realistically modelling the continuous-time evolution of natural phenomena. They have an advantage over some other stochastic formulations in that they can be derived directly from the deterministic system of ordinary differential equations and have a relatively simple form (Øksendal, 2003).

Inferring the parameters of such models from discrete observations is a challenging problem.

In this paper, we review some empirical solutions to the parameter estimation problem. We adopt a Bayesian imputation approach to infer the parameters of interest: the intractable transition density is replaced with a first-order Euler-Maruyama approximation, and data augmentation is used to limit the discretisation error incurred by the approximation.

2. MATERIAL AND METHOD

We restrict attention to estimation within the Bayesian imputation approach. The essential idea of this approach is to augment low-frequency data by introducing intermediate time points between observation times. The Euler-Maruyama scheme is then used to approximate the transition densities over the induced discretisation.

To deal with such data, we define the observations $D$ as follows:

$D_n^{(1)}$ denotes the discretely observed part and $D_n^{(2)}$ the unobserved part,

where $X_t^{(1)}$ has dimension $d_1 > 0$ and $X_t^{(2)}$ has dimension $d_2 \ge 0$, with $d_1 + d_2 = d$. If $d_2 = 0$, the process is fully observed.

We consider a parameterised family of $d$-dimensional diffusion processes $\{X_t, t \ge 0\}$ satisfying a stochastic differential equation of the form:

Here $X_t^i$ is the value of the process at time $t$, $\theta$ is the parameter vector of length $p$, $\alpha(X_t^i, \theta)$ is the drift function, $\beta(X_t^i, \theta)$ is the diffusion coefficient, and $W_t^i$ is a standard Brownian motion (a $d$-vector Wiener process). The vector of initial conditions for the process is $X_0^i = x_0$. We seek a numerical solution via the Euler-Maruyama approximation: the idea is to discretise Equation (2) using the Euler scheme, as in Allen (2007):
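As a point of reference, Equation (2) and its Euler-Maruyama discretisation can be sketched as follows. This assumes that $\beta$ multiplies the Brownian increment directly, a convention the text does not fix. If $\beta$ is instead meant as a diffusion (covariance) matrix, $\beta$ below should be read as its matrix square root.

```latex
\begin{align*}
dX_t &= \alpha(X_t,\theta)\,dt + \beta(X_t,\theta)\,dW_t, \qquad X_0 = x_0,\\
\Delta X_t &= X_{t+\Delta t} - X_t
            = \alpha(X_t,\theta)\,\Delta t + \beta(X_t,\theta)\,\Delta W_t
\end{align*}
```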

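where $\Delta W_t^i \sim N_d(0, I\Delta t)$. Since a diffusion process is Markov, and assuming equidistant observation times, the likelihood of the observations given the parameters is a product of transition densities over consecutive observations:

$$L(\theta) = \prod_{k} \pi(x_{t_{k+1}} \mid x_{t_k}, \theta) \qquad (4)$$

where $\pi(x_{t_{k+1}} \mid x_{t_k}, \theta)$ denotes the transition density from $X_{t_k} = x_{t_k}$ to $X_{t_{k+1}} = x_{t_{k+1}}$.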

This likelihood function is very rarely available in closed form, so maximum likelihood estimation is generally intractable. We therefore consider a Bayesian method of estimation.
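The Euler-Maruyama scheme and the Gaussian likelihood approximation it induces can be illustrated with a minimal sketch for a scalar diffusion. The Ornstein-Uhlenbeck drift and diffusion functions, the parameter values, and all function names are hypothetical choices for illustration, and the diffusion function is assumed to multiply the Brownian increment directly.

```python
import math
import random


def euler_maruyama(x0, theta, drift, diffusion, dt, n_steps, rng):
    """Simulate one path of dX = drift dt + diffusion dW on a grid of step dt."""
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        path.append(x + drift(x, theta) * dt + diffusion(x, theta) * dw)
    return path


def euler_loglik(path, theta, drift, diffusion, dt):
    """Sum of the Gaussian log transition densities implied by the Euler scheme."""
    total = 0.0
    for x, x_next in zip(path[:-1], path[1:]):
        mean = x + drift(x, theta) * dt
        var = diffusion(x, theta) ** 2 * dt
        total += -0.5 * math.log(2.0 * math.pi * var) - (x_next - mean) ** 2 / (2.0 * var)
    return total


# Hypothetical example model: theta = (rate, level, sigma)
def ou_drift(x, theta):
    return theta[0] * (theta[1] - x)


def ou_diffusion(x, theta):
    return theta[2]


rng = random.Random(1)
true_theta = (2.0, 1.0, 0.3)
path = euler_maruyama(0.0, true_theta, ou_drift, ou_diffusion, 0.01, 2000, rng)

# The approximate log-likelihood should favour the generating parameters
ll_true = euler_loglik(path, true_theta, ou_drift, ou_diffusion, 0.01)
ll_wrong = euler_loglik(path, (2.0, 1.0, 3.0), ou_drift, ou_diffusion, 0.01)
```

The approximation is only reliable when the step between consecutive points is small, which is what motivates imputing latent points between sparse observations.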

3. BAYESIAN INFERENCE

In statistics, Bayesian inference is a method of inference in which Bayes' rule is used to update the probability estimate for a hypothesis as additional evidence is acquired. The likelihood and the prior are combined using Bayes' theorem to compute the posterior distribution.

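The posterior density obtained from the likelihood (4) is given by:

$$\pi(\theta \mid D) \propto \pi(\theta) \prod_{k} \pi(x_{t_{k+1}} \mid x_{t_k}, \theta) \qquad (5)$$

where $\pi(\theta)$ is the prior density. The Euler-Maruyama approximation might not be accurate if the interval $[t_k, t_{k+1}]$ is too large. We therefore adopt a data augmentation approach.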

In data augmentation, we insert $m - 1$ additional time points within each interval $[t_k, t_{k+1}]$, $k = 0, \ldots, K$ (6).

The joint posterior for the parameters and the imputed data (7) is then proportional to the prior multiplied by the product of Euler densities over the augmented grid, where $N_d(\cdot\,; \mu, \Sigma)$ denotes the multivariate Gaussian density with mean $\mu$ and variance-covariance matrix $\Sigma$.
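To make the augmented target concrete, one consistent way of writing it is sketched below. The within-interval index $j$, the symbol $\pi_E$ for the Euler density, and the convention that $\beta$ multiplies the Brownian increment (so that the Euler covariance is $\beta\beta^{\top}\Delta\tau$) are assumptions made for illustration rather than notation taken from the text.

```latex
\begin{align*}
& t_k = \tau_{0} < \tau_{1} < \cdots < \tau_{m} = t_{k+1},
  \qquad \Delta\tau = \frac{t_{k+1} - t_k}{m},\\
& \pi(\theta, x \mid D) \;\propto\; \pi(\theta)
  \prod_{j} \pi_E\!\left(x_{\tau_{j+1}} \mid x_{\tau_j}, \theta\right),\\
& \pi_E\!\left(x_{\tau_{j+1}} \mid x_{\tau_j}, \theta\right)
  = N_d\!\left(x_{\tau_{j+1}};\;
      x_{\tau_j} + \alpha(x_{\tau_j},\theta)\,\Delta\tau,\;
      \beta(x_{\tau_j},\theta)\,\beta(x_{\tau_j},\theta)^{\top}\Delta\tau\right)
\end{align*}
```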

3.1. Sampling Procedure

The posterior distribution is typically analytically intractable; we therefore sample from it via a Markov chain Monte Carlo (MCMC) scheme that alternates two updates:

  1. For the path update, we sample $x \mid x_0, x_T, \theta$

  2. For the parameter update, we sample $\theta \mid x_0, x_T, x$

For the path update, various diffusion bridge proposal mechanisms for sampling the skeleton path have been proposed in the literature, such as the diffusion bridge of Roberts and Stramer (2001), the modified diffusion bridge of Durham and Gallant (2001) and the regularised sampler of Lindstrom (2012), among others.

Here we adopt the modified diffusion bridge proposed by Durham and Gallant (2001).

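Assuming the starting point ($x_0 = x_{\tau_k}$) and the end point ($x_T = x_{\tau_m}$) are observed, our aim is to construct a proposal for the path update. To this end we define a proposal distribution

$q(x_{\tau_{k+1}} \mid x_{\tau_k}, x_{\tau_m}, \theta)$

and determine its mean $\mu_{\tau_k}$ and variance $\Sigma_{\tau_k}$.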

For a univariate model, the modified diffusion bridge proposal is Gaussian, with mean $\mu_{\tau_k}$ and variance $\Sigma_{\tau_k}$ defined for $k = 0, \ldots, m-1$ (8).
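A minimal sketch of a modified diffusion bridge proposal for a scalar diffusion is given below. It uses the commonly stated form in which the mean moves linearly towards the end point and the Euler variance is shrunk by the fraction of bridge time remaining. The function name, its arguments and the convention that the diffusion function multiplies the Brownian increment are assumptions for illustration, not the authors' implementation.

```python
import math
import random


def mdb_proposal(x_start, x_end, m, dt, theta, diffusion, rng):
    """Propose x_1..x_{m-1} between fixed end points; return (path, log q)."""
    path = [x_start]
    log_q = 0.0
    for k in range(m - 1):
        x = path[-1]
        remaining = m - k  # number of steps left until the end point
        # Mean: linear interpolation from the current point to the end point
        mean = x + (x_end - x) / remaining
        # Variance: Euler variance scaled by (remaining - 1) / remaining
        var = (remaining - 1) / remaining * diffusion(x, theta) ** 2 * dt
        x_new = rng.gauss(mean, math.sqrt(var))
        log_q += -0.5 * math.log(2.0 * math.pi * var) - (x_new - mean) ** 2 / (2.0 * var)
        path.append(x_new)
    path.append(x_end)  # the end point is observed, not sampled
    return path, log_q


# Hypothetical usage: constant diffusion 0.5, ten sub-intervals of width 0.1
bridge, log_q = mdb_proposal(
    0.0, 1.0, m=10, dt=0.1, theta=None,
    diffusion=lambda x, th: 0.5, rng=random.Random(42),
)
```

The returned log proposal density is what a Metropolis-Hastings path update would need in order to correct for the difference between the bridge proposal and the Euler target.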

A proposed path is accepted with respect to the conditional posterior density of the imputed data, $\pi(x \mid x^{(i)}_{\tau_{k-1}}, x^{(i)}_{\tau_m}, \theta)$, with an acceptance probability of the form:

(9)
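For orientation, the generic Metropolis-Hastings acceptance probability for an independence-type bridge proposal is sketched below. Here $x$ is the current path, $x^{*}$ the proposed path, and $\pi(\cdot \mid x_0, x_T, \theta)$ the target over the imputed points; this notation is an assumption, and the expression is the standard form rather than a reproduction of Equation (9).

```latex
A(x \to x^{*}) = \min\left\{ 1,\;
  \frac{\pi(x^{*} \mid x_0, x_T, \theta)\; q(x \mid x_0, x_T, \theta)}
       {\pi(x \mid x_0, x_T, \theta)\; q(x^{*} \mid x_0, x_T, \theta)} \right\}
```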

Under this update scheme, the MCMC sampler becomes degenerate as $m \to \infty$ because of the dependence between the parameters and the imputed values, as well as the dependence among the values of the imputed latent process itself. This was first highlighted as a problem by Roberts and Stramer (2001). To overcome it, we consider the innovation scheme earlier proposed by Golightly and Wilkinson (2008), although it is not directly applicable to discrete observations.

4. IMPROVEMENT

Our contribution is a modified innovation scheme. The MCMC sampling strategy considered is the innovation scheme first introduced by Chib et al. (2006). For the discretised diffusion there is a one-to-one relationship between $\Delta X_t$ and $\Delta W_t$ (10), which can be inverted to express the Brownian increment in terms of the increment of the process (11).

Rather than sampling the parameters conditional on the imputed data, the innovation scheme uses a subtle reparameterisation: it samples conditional on the driving Brownian motion, and the latent path $x_{\tau_k}$ is then obtained deterministically, in a way that is consistent with the parameters of the model. This overcomes the dependence problem. We refer to the Brownian increments as innovations.
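The reparameterisation can be illustrated with a short sketch of the map between a discretised path and its Brownian innovations. This is the plain Euler-based map for a scalar diffusion and ignores end-point conditioning. The function names and the Ornstein-Uhlenbeck example are hypothetical, and the diffusion function is assumed to multiply the Brownian increment.

```python
def path_to_innovations(path, theta, drift, diffusion, dt):
    """Recover the Brownian increments that generate `path` under theta."""
    return [
        (x1 - x0 - drift(x0, theta) * dt) / diffusion(x0, theta)
        for x0, x1 in zip(path[:-1], path[1:])
    ]


def innovations_to_path(x0, innovations, theta, drift, diffusion, dt):
    """Rebuild the path deterministically from the innovations under theta."""
    path = [x0]
    for dw in innovations:
        x = path[-1]
        path.append(x + drift(x, theta) * dt + diffusion(x, theta) * dw)
    return path


# Hypothetical example model: theta = (rate, level, sigma)
def ou_drift(x, theta):
    return theta[0] * (theta[1] - x)


def ou_diffusion(x, theta):
    return theta[2]


current_theta = (2.0, 1.0, 0.3)
current_path = [0.0, 0.10, 0.05, 0.20]
innov = path_to_innovations(current_path, current_theta, ou_drift, ou_diffusion, 0.01)

# Holding the innovations fixed, a proposed theta yields a path consistent with it
proposed_theta = (2.0, 1.0, 0.6)
proposed_path = innovations_to_path(0.0, innov, proposed_theta, ou_drift, ou_diffusion, 0.01)
```

Because the path is a deterministic function of the parameters and the innovations, a parameter move automatically carries the path with it, which is the property that removes the dependence described above.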

Here, we sample the parameters of interest ($\theta$) given the driving Brownian motion ($w_{\tau_k}$) and the observations ($D_T$), thus:

(12)

where the Jacobian for one increment is

The target distribution therefore becomes

(13)

With this update scheme in place, the acceptance probability becomes

(14)

5. SIMULATION AND APPLICATION

We demonstrate the performance of the methods described above by applying them to a synthetic epidemic diffusion model. We consider a stochastic infection model (SEIR model) described by the following diffusion system:

(15)

Here the state variable is $X_t^{(i)} = (x_1, x_2, x_3)^T$, where $x_1$ denotes the susceptible, $x_2$ the exposed and $x_3$ the infectious individuals, with initial conditions (500000, 1000, 10) respectively. The parameter of interest is $\theta = (\beta, \gamma, \alpha)^T$, representing the transmission rate, exposed rate and infection rate respectively; we initialised the sampler with $0 < \beta < 1$, $0 < \gamma < 0.7$ and $0.1 < \alpha < 1$. We ran $10^4$ iterations for each of three numbers of imputed time points ($m = 5$, $m = 15$ and $m = 50$). For the parameter proposal, we used an independent sampler of the form $N_d(0, \psi_j^2)$, where $\psi_j^2$ is the tuned variance, set to {0.009, 0.009, 0.001} for the beta, gamma and alpha parameters respectively.

To show that the proposed method does not degenerate as the number of imputed time points increases, we applied the modified innovation scheme. We set the start time to $t_0 = 0$ and the end time to $T = 30$, with equidistant time step $\Delta\tau = 0.001$.
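A synthetic data set of this kind could be generated along the following lines. The sketch assumes the standard SEIR transitions (S to E at rate βSI/N, E to I at rate γE, I to removed at rate αI) with independent Gaussian noise on each transition, which may differ from the exact system in Equation (15). The population size N, the clipping at zero, the parameter values and the function names are all hypothetical.

```python
import math
import random


def seir_step(state, theta, n_pop, dt, rng):
    """One Euler-Maruyama step of an assumed SEIR diffusion approximation."""
    s, e, i = state
    beta, gamma, alpha = theta
    # Assumed transition rates: S->E, E->I, I->removed
    rates = (beta * s * i / n_pop, gamma * e, alpha * i)
    # Each transition count: mean rate*dt plus Gaussian noise of variance rate*dt
    jumps = [r * dt + math.sqrt(r * dt) * rng.gauss(0.0, 1.0) for r in rates]
    s_new = s - jumps[0]
    e_new = e + jumps[0] - jumps[1]
    i_new = i + jumps[1] - jumps[2]
    # Clipping at zero is an ad hoc safeguard, not part of the model
    return (max(s_new, 0.0), max(e_new, 0.0), max(i_new, 0.0))


def simulate_seir(state0, theta, dt, n_steps, rng):
    """Simulate a trajectory; N is taken as the initial S + E + I (assumption)."""
    n_pop = sum(state0)
    trajectory = [tuple(float(v) for v in state0)]
    for _ in range(n_steps):
        trajectory.append(seir_step(trajectory[-1], theta, n_pop, dt, rng))
    return trajectory


# Initial state and time grid from the text; theta values are illustrative only
trajectory = simulate_seir(
    (500000, 1000, 10), theta=(0.5, 0.3, 0.2),
    dt=0.001, n_steps=30000, rng=random.Random(2017),
)
observations = trajectory[::1000]  # thin to one observation per unit of time
```

Thinning the fine-grid trajectory mimics discrete observation: the sampler sees only `observations` and must impute the points in between.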

We chose an uninformative prior for each parameter and applied the MCMC scheme to infer the posterior distribution of the model parameters.

We compared the empirical (naive) method with our new method (the modified innovation scheme) for the path and parameter updates; the results are shown below.

The implementation was carried out in R.

Figure 1. (a) Density plots under the innovation scheme for the three numbers of imputed points; the three densities are very close. (b) Trace plots for the three parameters; the chains mix very well.

Source: Simulated SEIR synthetic data.

Figure 2. Autocorrelation for parameter beta: (a) naive scheme; (b) modified innovation scheme.

Source: Simulated SEIR synthetic data.

Figure 3. Autocorrelation for parameter alpha: (a) naive scheme; (b) modified innovation scheme.

Source: Simulated SEIR synthetic data.
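The autocorrelation comparison reported in Figures 2 and 3 can be computed from sampler output along the following lines. This is a minimal sketch: the function name is hypothetical, and the two chains are artificial stand-ins (independent draws and a strongly correlated AR(1) series) rather than output from the samplers discussed here.

```python
import random


def autocorrelation(chain, max_lag):
    """Sample autocorrelation of an MCMC chain at lags 0..max_lag."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [c - mean for c in chain]
    var = sum(c * c for c in centred) / n
    return [
        sum(centred[t] * centred[t + lag] for t in range(n - lag)) / (n * var)
        for lag in range(max_lag + 1)
    ]


rng = random.Random(0)

# Stand-in for a well-mixing chain: independent draws
good_chain = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Stand-in for a poorly mixing chain: AR(1) with coefficient 0.9
poor_chain = [0.0]
for _ in range(4999):
    poor_chain.append(0.9 * poor_chain[-1] + rng.gauss(0.0, 1.0))

acf_good = autocorrelation(good_chain, 20)
acf_poor = autocorrelation(poor_chain, 20)
```

A chain whose autocorrelation decays quickly towards zero, like `acf_good`, is mixing well; slow decay, as in `acf_poor`, is the signature of the degeneracy that the naive scheme exhibits as the number of imputed points grows.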

6. CONCLUSION AND DISCUSSION

We considered inference for diffusion processes based on a discrete-time approximation, with the aim of estimating the unobserved latent data and the parameters of an epidemic model when the number of imputed time points is very large. We compared a naive estimation scheme with the modified innovation scheme, which is computationally and statistically efficient and can be readily applied whenever discrete observation of the process is possible. Diffusion processes governed by stochastic differential equations (SDEs) are a well-known tool for modelling continuous-time data. However, most epidemic data are discretely observed and undergo stochastic transition rates, and likelihood-based inference can be problematic because the transition densities are rarely available in closed form. Consequently, there is wide interest in efficiently estimating diffusion parameters from discretely observed data. The innovation scheme considered here focuses on the degeneracy problem reported in the literature. The modified innovation scheme is capable of efficiently sampling diffusion parameters from discretely observed epidemic data as the number of imputed time points grows (Figures 1, 2 and 3). Under the modified innovation scheme, increasing the number of imputed points does not worsen the mixing of the chain (Figures 2 and 3). Moreover, as the number of imputed points tends to infinity ($m \to \infty$), both the parameter and path updates remain consistent, and the degeneracy of the naive scheme does not occur.

7. EXTENSIONS

Our work can be extended in a number of ways, in particular to partially observed processes and to observations with measurement error.

Funding: This study received no specific financial support.
Competing Interests: The authors declare that they have no competing interests.
Contributors/Acknowledgement: All authors contributed equally to the conception and design of the study.

REFERENCES

Allen, E.J., 2007. Modeling with Itô stochastic differential equations. Dordrecht, The Netherlands: Springer.

Andersson, H. and T. Britton, 2000. Stochastic epidemic models and their statistical analysis.

Becker, N., 1989. Analysis of infectious disease data. London: Chapman & Hall.

Bibby, B. and M. Sørensen, 2001. Simplified estimating functions for diffusion models with a high-dimensional parameter. Scandinavian Journal of Statistics, 28(1): 99-112.

Black, F. and M. Scholes, 1973. The pricing of options and corporate liabilities. Journal of Political Economy, 81(3): 637-654.

Chiarella, C., H. Hung and T.D. Tô, 2009. The volatility structure of the fixed income market under the HJM framework: A nonlinear filtering approach. Computational Statistics and Data Analysis, 53(6): 2075-2088.

Chib, S., M.K. Pitt and N. Shephard, 2006. Likelihood based inference for diffusion driven models. Economics Papers No. 2004-W20, Economics Group, Nuffield College, University of Oxford.

Cox, J., J. Ingersoll and S. Ross, 1985a. An intertemporal general equilibrium model of asset prices. Econometrica, 53(2): 363-384.

Durham, G.B. and A.R. Gallant, 2001. Numerical techniques for maximum likelihood estimation of continuous-time diffusion processes. Journal of Business and Economic Statistics, 20(3): 297-338.

Elerian, O., S. Chib and N. Shephard, 2001. Likelihood inference for discretely observed nonlinear diffusions. Econometrica, 69(4): 959-993.

Eraker, B., 2001. MCMC analysis of diffusion models with application to finance. Journal of Business & Economic Statistics, 19(2): 177-191.

Golightly, A. and D.J. Wilkinson, 2008. Bayesian inference for nonlinear multivariate diffusion models observed with error. Computational Statistics and Data Analysis, 52(3): 1674-1693.

Jimenez, J., R. Biscay and T. Ozaki, 2006. Inference methods for discretely observed continuous-time stochastic volatility models: A commented overview. Asia-Pacific Financial Markets, 12(2): 109-141.

Lindstrom, E., 2012. A regularised bridge sampler for sparsely sampled diffusions. Statistics and Computing, 22(2): 615-623.

Merton, R., 1976. Option pricing when underlying stock returns are discontinuous. Journal of Financial Economics, 3(1-2): 125-144.

Øksendal, B., 2003. Stochastic differential equations: An introduction with applications. 6th Edn., New York: Springer.

Pedersen, A.R., 1995. Consistency and asymptotic normality of an approximate maximum likelihood estimator for discretely observed diffusion processes. Bernoulli, 13(3): 257-279.

Roberts, G.O. and O. Stramer, 2001. On inference for partially observed nonlinear diffusion models using the Metropolis-Hastings algorithm. Biometrika, 88(3): 603-621.

Sørensen, H., 2004. Parametric inference for diffusion processes observed at discrete points in time: A survey. International Statistical Review, 72(3): 337-354.
