% BW5.tex : Discounted additive functionals of Markov processes
%

\input mssymb
\magnification=\magstep1
\vsize=9.5truein
\parskip=\medskipamount

% Font selections

\font\sc=cmcsc10

%\baselineskip=18pt   % directives for increased leading
%\lineskiplimit=2pt
%\lineskip=4pt

% Start of definitions

\def\cl{\centerline}
\def\down{\downarrow} \def \up{\uparrow}       % some symbols
\def\leq{\leqslant} \def\geq{\geqslant}
\def\brak#1#2{\langle#1,#2\rangle}

\def\diag{\mathop{\rm diag}\nolimits}          % new roman operators
\def\spect{\mathop{\rm spect}\nolimits}
\def\Leb{\mathop{\rm Leb}\nolimits}
\def\supp{\mathop{\rm supp}\nolimits}
\def\Re{\mathop{\rm Re}} \def\Int{\mathop{\rm Int}}
\def\tendsd{\mathrel{\mathop\rightarrow^{\cal D}}}
\def\wsim{\mathrel{\displaystyle\mathop\sim^{\rm w}}}
\def\eqd{\mathrel{\mathop=^{\cal D}}}
\def\Var{\mathop{\rm Var}\nolimits}

\def\a{\alpha} \def\b{\beta} \def\g{\gamma} \def\d{\delta}    % greek letters
\def\e{\epsilon} \def\l{\lambda} \def\phi{\varphi} \def\s{\sigma}
\def\t{\theta}

\def\Al{A_\l} \def\elt{e^{-\l t}} \def\L{{\cal L}}
\def\D{\triangle} \def\trho{\triangle\rho}
\def\ind{I_{(0,\infty)}}
\def\half{\raise1pt\hbox{$\scriptstyle{1 \over 2}\displaystyle$}}
\def\endbox{\hfill$\square$\par}

% Macros
%
% AUTOMATIC EQUATION NUMBERING          29January1992 (c)MWBAXTER
%
% cno = chapter number : should be set by user
% sno = section number : initialised (0) at chapter start
% eno = equation number : initialised (0) at section start
% pno = proposition number : initialised (0) at section start
%
% \enum{fred} will number equation, eg (1.23), and set \fred={(1.23)}
% \lnum{fred} is as \enum but without the \eqno, for use in \eqalignno
% \lnuma{fred} will number as (1.23a) with \fred={(1.23)} \freda={(1.23a)}
% \lnumb{fred} will number as (1.23b) with \fredb={(1.23b)}
%
% usage: "equation \fred\ suggests that.."  (extra "\ " is trailing space)
%
% WARNING : DO NOT USE TEX or USER MACRO NAMES AS EQUATION NAMES
%
\newcount\cno
\newcount\sno
\newcount\eno
\newcount\pno
\def\addone{\global\advance\eno by1\edef\cs{(\the\eno)}} % x.x FORMAT
\def\lnum#1{\addone\global\expandafter\let\csname#1\endcsname=\cs\cs}
\def\enum#1{\eqno{\lnum{#1}}}
\def\lnuma#1{\addone\edef\csa{(\the\eno{\rm a})}% x.x FORMAT
\global\expandafter\let\csname#1\endcsname=\cs%
\global\expandafter\let\csname#1a\endcsname=\csa\csa}
\def\lnumb#1{\edef\csb{(\the\eno{\rm b})}% x.x FORMAT
\global\expandafter\let\csname#1b\endcsname=\csb\csb}
%
% \pnum{Proposition}{fred} will number proposition/theorem/etc,
%                      eg Proposition 1.1, and set \fred={Proposition 1.1}
\def\paddone#1{\global\advance\pno by 1
  \edef\cs{#1\ \the\pno}} % ADAPTED FOR x.x FORMAT
\def\pnum#1#2{\paddone{#1}%
  \global\expandafter\let\csname#2\endcsname=\cs\cs}
\def\proc#1{\noindent{\bf#1\ \ }}
\def\proof#1{\noindent{\bf Proof of #1\ \ }}
%
% END OF AUTOMATIC NUMBERING MACROS

\sno=0
\eno=0 \pno=0
\def\newsect#1{\bigbreak\bigskip\global\advance\sno by 1
\cl{\bf\the\sno. #1}\nobreak\bigskip\nobreak}

%
\def\newpage{\mark{}\vfill\eject}
%
% References
\newcount\rno
\newcount\rflag
\rflag=0
\input ihp_ref
%
% Begining of text

\cl{\bf Discounted additive functionals of Markov processes}
\bigskip
\cl{{\bf M}ARTIN {\bf B}AXTER}
\medskip
%\cl{\it Pembroke College, Cambridge CB2 1RF}
%\smallskip
\cl{\it Statistical Laboratory, University of Cambridge, Cambridge CB2
1SB}
\bigskip\bigskip
\centerline{\hbox{\vbox{\halign{#.\quad&#\hfil\cr
1&Introduction\cr
2&Large Deviations\cr
3&More exact results for Markov chains\cr}}}}

\newsect{Introduction}%
Much study has been made of the time averages of random processes.
Most of this effort has been directed towards the Ces\`aro average
which weights times uniformly up to a finite horizon. In this
paper, we shall derive some results about more general averages.
The initial motivation was the exponential discount (we will
call this the Abel average), which appears
frequently in the contexts of, among others, systems control and models 
of financial markets. The techniques developed, however, extend easily to 
other shapes of discount.

In this paper we will push forward two strands of inquiry
that have been developed in proceeding work. The generality that
we will obtain should strictly encompass the existing material and can
be read on its own, though reference will be made 
to earlier papers when a proof is essentially the same as one before.

For general results about the large-deviations behaviour of discounted
occupation times of processes, in Section 2 we will build on
Section (a) of \bwii\ and on \baxiii. Previously we knew
that the large-deviation property held for the Abel discounted
average of a general finite-state Markov chain, and we will fully
extend this to completely general discounts of chains and partially
extend further to a wide class of Markov processes. We discover
that although the large-deviation rate function of the discounted
average can be written in terms of that for the Ces\`aro, the rate
is often different for a different discount. We also derive
results about the smoothness and finiteness of the rate function
which are used in the next Section to prove a central limit theorem.

Finally we shall go beyond the limited approximation precision of the
large-deviation property and give an asymptotic expansion of the
density of the distribution itself, following from the density studies
of Section (b) of \bwii. This again is now performed for general discounted
averages of finite-state Markov chains. We also notice a
pattern in the differential equations we worked with, and hypothesize
about their full solutions and other generalizations.

The main objects studied are the Ces\`aro and Abel averages of a process $X$.
Both are random measures of unit mass on the set $S$, the state space of $X$.
The former is defined as 
$$
  C_t(F):={1\over t}\int_0^t I_F(X_s)\,ds,\qquad \hbox{\rm for $F$ measurable
in $S$, $t>0$.}
                                                  $$
For any density $m$ on $[0,\infty)$, that is for $m$ in $L_1^+(\Bbb R^+)$,
the Abel average $\Al$ is
$$
  \Al(F):=\l\int_0^{\infty} m(\lambda t)I_F(X_t)\,dt,\qquad
  \hbox{\rm for $F$ in $S$, $\l>0$.}
                                                  $$
The two main results of the paper are:

\proc{Theorem A.}{\sl Suppose that $X$ is an
irreducible Markov chain on a finite state-space $S$, with Q-matrix
$Q$. Then both $C_t$ and $\Al$ obey the large-deviation principle
with rate functions $I$ and $K$ respectively. That is that
$$
  \eqalign{\limsup_{t\rightarrow\infty}t^{-1}\log\Bbb P(C_t\in F)
  &\leq-\inf_{\nu\in F}I(\nu),\cr
  \limsup_{\l\downarrow0}\l\log\Bbb P(\Al\in F)&\leq-\inf_{\nu\in F}K(\nu),\cr}
  \qquad\eqalign{\liminf_{t\rightarrow\infty}t^{-1}\log\Bbb P(C_t\in G)
  &\geq-\inf_{\nu\in G}I(\nu),\cr
  \liminf_{\l\downarrow0}\l\log\Bbb P(\Al\in G)
  &\geq-\inf_{\nu\in G}K(\nu),\cr}
                                                  $$ 
for $F$ and $G$ respectively closed and open subsets of $M:=M_1(S)$.
Additionally, $K$ is related to $I$ by the equation
$$
  K^*(v)=\int_0^\infty I^*\bigl(m(t)v\bigr)\,dt,\qquad
    \hbox{\sl $v$ in $\Bbb R^S$,}
                                                  $$
where $I^*$ and $K^*$ are the Legendre transforms (convex conjugates)
of $I$ and $K$, satisfying}%end of sl
$$
  \eqalign{I^*(v)&=\sup_{x\in M}\bigl(\brak{v}{x}-I(x)\bigr),\cr
           K^*(v)&=\sup_{x\in M}\bigl(\brak{v}{x}-K(x)\bigr),\cr}\qquad
  \eqalign{I(x)&=\sup_{v\in \Bbb R^S}\bigl(\brak{v}{x}-I^*(v)\bigr),\cr
           K(x)&=\sup_{v\in \Bbb R^S}\bigl(\brak{v}{x}-K^*(v)\bigr).\cr}
                                                  $$

\proc{Theorem B.}{\sl Additionally, if $m$ is of bounded variation, then
the density of $\Al$ on $M$ under the law starting $X$ at $i$, $f_i^\l$
can be written as
$$
  f^\l_i(x)=e^{-K(x)/\l}(2\pi\l)^{-(n-1)/2}\bigl(\det H_K(x)\bigr)^{1/2}
  z_i(x)r^\l_i(x),
                                                  $$
where $H_K$ is the Hessian of $K$ taken with respect to $M$, $z(x)$ is
the positive eigenvector of $Q+\diag(m_0\nabla K(x))$, and the
residue term $r^\l$ goes to 1 in the sense that $r^\l_i(x)\,dx$ converges
weakly to $dx$ on $\Int(M)$.}

\newsect{Large Deviations}%
We will work with general discount shapes and positive recurrent
processes. Let $X$ be a stochastic process with state-space $E$ and
invariant distribution $\pi$. In the Ces\`aro case we would expect 
some sort of ergodic theorem such as
$$
  C_t(F):={1\over t}\int_0^t I_F(X_s)\,ds\rightarrow\pi(F)\qquad
  \hbox{\rm as $t\rightarrow\infty$},\enum{zdba}
                                                     $$
for all measurable subsets $F$ of $E$. Then $C_t$ takes values
in $M_1(E)$, the space of probability measures on $E$. We might also
have a large-deviation result, which we can think of for the moment
as the slogan
$$
  \hbox{\rm ``}\Bbb P(C_t\in H)\approx\exp\bigl(-t\inf_{\nu\in H}I(\nu)
  \bigr)\hbox{\rm\quad as $t\rightarrow\infty$ ''}\qquad H\subset M_1(E),
                                                     $$
for some rate function $I$, with $I(\pi)=0$.
The space $M_1(E)$ and the continuous bounded functions on $E$, $C_b(E)$
are in duality via the bracket, $\brak{v}{\nu}=\int_E v(x)\nu(dx)$. A
related slogan is that of the Laplace transform
$$
  \hbox{\rm ``}\Bbb E\exp\bigl(\brak{v}{C_t}t\bigr)\approx\exp\bigl(t\d(v)
  \bigr)\hbox{\rm\quad as $t\rightarrow\infty$ ''}\qquad v\in C_b(E),
                                                     $$
where $\d$ and $I$ are related by Legendre transformation (convex conjugation),
in that
$$
  \eqalignno{
  I(\nu)&=\d^*(\nu):=\sup_{v\in C_b(E)}\bigl(\brak{v}\nu-\d(v)\bigr),&
  \lnum{zdbb}\cr
  \d(v)&=I^*(v):=\sup_{\nu\in M_1(E)}\bigl(\brak{v}\nu-I(\nu)\bigr),&
  \lnum{zdbc}\cr}
                                                     $$
Our program will be to study, for any discount density $m$ in 
$L_1^+(\Bbb R^+)$, the average
$$
  A_\l(F):=\int_0^\infty \l m(\l t)I_F(X_t)\,dt.\enum{zdbd}
                                                     $$
We will show that $A_\l\rightarrow\pi$ as $\l$ goes to 0, and that the
large-deviation principle holds with rate $K$ whose Legendre transform
$\eta$ is given by the equation
$$
  \eta(v)=\int_0^\infty\d\bigl(m(t)v\bigr)\,dt.\enum{zdbe}
                                                     $$
This is actually the same as the $\eta$-equation at (1.16) in \bwii\
(with discount $m_t=e^{-t}$) and at Theorem~C in \baxiii\
(with discount $m_t=(1-t)^\g$), but \zdbe\ is a more natural
formulation.

\noindent{\bf Standard set-up.} Let $X$ be an ergodic
Feller-Dynkin Markov process on a locally compact Polish space $E$,
with generator $L$. We define the Ces\`aro average $C_t$ and the
general average $\Al$ by \zdba\ and \zdbd\ respectively. Then $C_t$
and $\Al$ converge to $\pi$, the invariant distribution of $X$, with
respect to the weak topology on $M_1(E)$, that is, in the sense
of \zdba. A sufficient condition for the former limit is that, as in 8.11.2 of
\bingham, $\pi$ is a limiting distribution of the transition
semigroup $(P_t)$. The latter limit follows from the former by a similar
$L_1$-continuity argument to that which will be used in the proof of
\pnum{Theorem}{zxdbone}. \deustroo\ show that under an 
assumption of uniform ergodicity
the large-deviation property holds for $C_t$ with rate
function $I$ defined on $M_1(E)$. That is that
$$
  \eqalign{\limsup_{t\rightarrow\infty}t^{-1}\log\Bbb P(C_t\in F)
  &\leq-\inf_{\nu\in F}I(\nu),\cr
  \hbox{\rm and}\qquad\liminf_{t\rightarrow\infty}t^{-1}\log\Bbb P(C_t\in G)
  &\geq-\inf_{\nu\in G}I(\nu),\cr}\enum{zdbf}
                                                  $$
for $F$ and $G$ respectively closed and open subsets of $M_1(E)$.
We learn from 4.2.17 of \deustroo\ that
$$
  \lim_{t\rightarrow\infty}t^{-1}\log\Bbb E\exp\int_0^tv(X_s)\,ds=\d(v),
  \enum{zdbg}
                                                  $$
where $\d$ and $I$ are convex functions satisfying
\zdbb\ and \zdbc. Further there are, by 4.2.27 and
4.2.38 of \deustroo, explicit expressions for $\d$
and $I$ as
$$
  \eqalignno{\d(v)&=\lim_{t\rightarrow\infty}t^{-1}\log\|P_t^v\|_{\rm op},
  \quad\hbox{\rm where
  $P_t^vf(x):=\Bbb E_x\bigl(\exp(\int_0^tv(X_s)\,ds)f(X_t)\bigr)$,}
  &\lnum{zdbh}\cr
  I(\nu)&=\sup\biggl\{-\int_E{Lf(x)\over f(x)}\,\nu(dx):
  f\geq1,\ f\in\mathop{\rm Dom}(L)\,\biggr\}.&\lnum{zdbi}\cr}
                                                  $$
As in \bwii\ we shall be particularly interested
in the case where $X$ is a Markov chain on a finite
state-space $S$ with Q-matrix $Q$. Then
$\d(v)=\sup\{\Re(z):z\in\spect(Q+V)\}$, where $V$ denotes
the diagonal matrix $\diag(v)$ and $\spect(\cdot)$ denotes spectrum
(here the set of eigenvalues). This expression for $\d$ also holds
in the general Markov process setting, if the generator $L$ is
$\pi$-symmetric.

We begin by proving a result whose first part is similar to one
remarked by \kifer\
in the context of the large-deviations of the averages of dynamical
systems, but it is the second part which will be more useful in
our further work. In earlier papers we derived a differential equation
by the self-similarity of discount shapes such as $e^{-t}$, but it is
enough to study the shifts of the discount along the time-axis, which
provides a useful one-dimensional parameterisation.

\noindent{\bf\zxdbone} {\sl Suppose that $X$ is an FD Markov process
on a space $E$, with generator $L$, and $m$ is any density on $[0,\infty)$,
and for $x$ in $E$ and $v$ in $C_b(E)$ the limit
$\d(v)=\lim_\l\l\log\Bbb E_x\exp\int_0^{1/\l}v(X_s)\,ds$ exists uniformly
in $x$ on $E$. If we define $\phi$ by
$$
  \phi(x,t,\l,v)=\Bbb E_x\exp\int_0^\infty\t_t m(\l s)v(X_s)\,ds,
  \eqno\hbox{\rm\lnum{zdbj}}
                                                  $$
where $\t_t$ is the shift operator $\t_tf(s)=f(t+s)$, then
$$
  \lim_{\l\downarrow0}\l\log\phi(x,0,\l,v)=\eta(v):=
  \int_0^\infty\d(m_tv)\,dt.\eqno\hbox{\rm\lnum{zdbk}}
                                                  $$
Further, $\phi(\cdot,t,\l,v)$ is in the domain 
of $L$ and $\phi(x,\cdot,\l,v)$ is differentiable and}
$$
  -{\partial\phi\over\partial t}=\l^{-1}(L+m_tV)\phi.\enum{zdbl}
                                                  $$
\proof{\zxdbone} We prove the first part using
continuity arguments. If we define
$$
  H(\l,x,\a):=\l\log\Bbb E_x\exp\int_0^{1/\l}\a v(X_s)\,ds,\enum{zdbm}
                                                  $$
then $\sup_x|H(\l,x,\a)-\d(\a v)|$ goes to 0 as $\l$ does. We start
by proving the general average limit for $m$ of the form
$$
  m=\sum_{i=1}^nc_iI(a_i,b_i),
                                                  $$
where $\{(a_i,b_i)\}$ are disjoint intervals in $\Bbb R^+$ and $c_i>0$.
Set
$$
  Y^\l_i:=\exp\left(\int_{a_i/\l}^{b_i/\l}c_iv(X_s)\,ds
  -\l^{-1}(b_i-a_i)\d(c_iv)\right),\enum{zdbn}
                                                  $$
and define $y^\l_i(x):=\Bbb E_xY^\l_i$. Then for $\l$ sufficiently small
$|\l\log y^\l_i(x)|<\e$ uniformly in $x$. Thus
$$
  \eqalign{{\Bbb E_x\exp\int_0^\infty m(\l t)v(X_t)\,dt\over
  \exp\l^{-1}\int_0^\infty\d(m_tv)\,dt}&=
  \Bbb E_x(Y^\l_1\ldots Y^\l_n)=
  \Bbb E_x(Y^\l_1\ldots Y^\l_{n-1}y^\l_n(X_{a_n/\l}))\cr
  &\leq e^{\e/\l}\Bbb E_x(Y^\l_1\ldots Y^\l_{n-1})
  \leq\ldots\leq e^{n\e/\l},\cr}
                                                  $$
and so we have the right upper bound. Similarly we have the lower bound.

Let us now define $I_\l(m):=\int_0^\infty m(\l t)v(X_t)\,dt$,
$L_\l(m):=\l\log\Bbb E\exp I_\l(m)$, and $J(m):=\int_0^\infty\d(m_tv)\,dt$.
Then $\l|I_\l(m_1)-I_\l(m_2)|$ is bounded by $\|v\|_\infty\|m_1-m_2\|_1$
uniformly in $\omega$, and hence $|L_\l(m_1)-L_\l(m_2)|$ has the same
bound. Because $|\d(v_1)-\d(v_2)|\leq|v_1-v_2|$, the same bound also
dominates $|J(m_1)-J(m_2)|$. This $L_1$-continuity and (careful)
application of monotone class theorems let us generalize firstly to all
bounded $m$ of compact support and then to all $m$ in $L_1^+(\Bbb R^+)$.

For the differential equation, we need only apply the Feynman-Ka\u c
formula to the space-time process $Y_t:=(X_t,\tau_t)$, where
$\tau_t=\tau_0+t$, which has generator $L+\partial_t$.
Taking $\l=1$ for simplicity, for $v$ in $C_b(E)$,
we define $v_Y$ on $C_b(E\times\Bbb R^+)$ by $v_Y(x,t):=m_tv(x)$, and
$$
  A_t:=\int_0^t v_Y(Y_s)\,ds=\int_0^t\theta_{\tau_0}m(s)v(X_s)\,ds.
  \enum{zdbo}
                                                  $$
Without loss of generality we can assume that $v$ is non-negative,
because if \zdbk\ and \zdbl\ hold for some $v$, then they hold for all
vectors of the form $v+\a{\bf 1}$, where $\bf 1$ is the constant
vector $(1,1,\ldots,1)$. This shifting identity follows from the
fact that $\d(v+\a{\bf 1})=\d(v)+\a$, a property which $\eta$
inherits. Then the semigroup $P^v$ defined by
$$
  P^v_tf(y)=\Bbb E_{Y_0=y}\bigl(e^{A_t}f(Y_t)\bigr)
                                                  $$
has generator $L^v:=L+\partial_t+m_tV$, as seen in, for example, III.39 of 
\williams. Then
if we set $\phi((x,t)):=\phi(x,t,1,v)$, which is continuous in $t$,
we have that
$$
  P_t^v\phi(Y_0)=\Bbb E_{Y_0}\left(e^{A_t}\phi(Y_t)\right)
  =\Bbb E_{Y_0}\left(\Bbb E(\exp A_\infty|{\cal F}_t)\right)=\phi(Y_0).
  \enum{zdbp}
                                                  $$
Thus $t^{-1}(P^v_t-I)\phi=0$, implying that $\phi$ is in the domain
of $L^v$ and is annihilated by it. The equation $L^v\phi=0$ is
exactly \zdbl. \endbox

We note that in the case of $X$ a standard Brownian motion and
the exponential discount $m_t=e^{-t}$ and $v(x)=I(x>0)$, then
\zdbl\ is equation (3.5) of \bwi.

\proc{\pnum{Corollary}{zxdbthree}} {\sl Suppose that $X$ is an
irreducible Markov chain on a finite state-space $S$, with Q-matrix
$Q$, and $m$ is any density on $[0,\infty)$, and $\Al$ is defined by
\zdbd, then the large-deviation property analogue of \zdbf\ holds for $\Al$ 
with rate function $K$, 
$$
  \limsup_{\l\rightarrow0}\l\log\Bbb P(\Al\in F)
  \leq-\inf_{\nu\in F}K(\nu),\quad
  \hbox{\sl and}\quad\liminf_{\l\rightarrow0}\l\log\Bbb P(\Al\in G)
  \geq-\inf_{\nu\in G}K(\nu),\enum{zdbextra}
%  \eqalign{\limsup_{\l\rightarrow0}\l\log\Bbb P(\Al\in F)
%  &\leq-\inf_{\nu\in F}K(\nu),\cr
%  \hbox{\sl and}\qquad\liminf_{\l\rightarrow0}\l\log\Bbb P(\Al\in G)
%  &\geq-\inf_{\nu\in G}K(\nu),\cr}\enum{zdbextra}
                                                  $$
for $F$ and $G$ respectively closed and open subsets of $M$.
The rate function $K$ relates to the $\eta$ of \zdbk\ through the following 
equations:
$$
  \eqalignno{K(x)&=\sup_{v\in\Bbb R^S}\brak{v}{x}-\eta(v),&\lnum{zdbq}\cr
  \eta(v)&=\sup_{x\in M}\brak{v}{x}-K(x),&\lnum{zdbr}\cr}
                                                  $$
where $M:=M_1(S)=\{(x_i)_{i=1}^n:\sum_i x_i=1,\ x\geq0\}$.}

\proof{\zxdbthree} We are in the context of \zxdbone\
because $X$ will satisfy condition ($\tilde{\rm U}$) of 4.2.7 of \deustroo,
which is sufficient for the limit $\d$ to exist as
required by the theorem.
The large-deviation property and \zdbq\ come from theorem II.2 of \ellis.
In his language, $t$ is our $v$, $Y_n$ is our $\Al$, $c_n(\cdot)$ is
our $\l\log\phi(x,0,\l,\cdot)$, and $c(\cdot)$ is our $\eta(\cdot)$.
As $\eta$ is defined and differentiable on the whole of $\Bbb R^S$,
it meets Ellis' `steep' hypothesis. From \zdbk, $\eta$ inherits the
(strict) convexity and differentiability of $\d$, which gives \zdbr.\endbox

We complete this Section with a pair of results about the large-deviation
rate function $K$. The former of these is in the spirit of Proposition~D
of \bwii\ and identifies the points where the
various suprema in Legendre transforms \zdbq\ and \zdbr\ are achieved. This
leads to central limit results and the major result of the 
next Section.

\proc{\pnum{Proposition}{zxdbfive}} 
{\sl Under the conditions of \zxdbthree,
$K$ is finite, twice differentiable and strictly convex on $\Int(M)$,
and the supremum of \zdbq\ is attained uniquely (up to multiples of $\bf 1$) 
at $v=\nabla K(x)$, and the supremum of \zdbr\ is attained uniquely at 
$x=\nabla\eta(v)$.}

\proof{\zxdbfive} It is immediate from its definition
that $\d(v)/\|v\|_\infty\rightarrow1$ as $\|v\|_\infty\rightarrow\infty$
with $v\geq0$. But as also $|\d(v)|\leq\|v\|_\infty$, the Dominated
Convergence theorem gives us that $\eta(v)/\|v\|_\infty$ goes to 1 as well.
Take $x\in\Int(M)$ and suppose there exists a sequence of vectors $(v_n)$
such that
$$
  \brak{v_n}{x}-\eta(v_n)\rightarrow\infty.
                                                  $$
Without loss of generality we can replace $(v_n)$ by
$(v_n-(\min_iv_n(i))\bf 1)$,
because $\eta(v+\a{\bf 1})=\eta(v)+\a$, and thus assume that the $(v_n)$ are
positive, with at least one zero co-ordinate. The sequence must still
get infinitely large, but
$$
  \brak{v_n}{x}-\eta(v_n)\leq\|v\|_\infty\left((1-\min_ix_i)-
  \eta(v_n)/\|v_n\|_\infty\right),\enum{zdbs}
                                                  $$
which is large and negative for large $n$, contradicting our supposition
of $K(x)=\infty$.

As remarked in \zxdbthree, $\eta$ inherits the smoothness and the strict 
convexity on ${\bf 1}^\bot$ of $\d$. Its continuity means that the supremum 
must be attained at some finite point $\hat v(x)$, and the convexity gives 
the uniqueness. The differentiability shows that the maximizing $\hat v$
will be the solution of $\nabla\eta(v)=x$. We can expand $\hat v$
around an $x$ as $\hat v(x+\e)=\hat v(x)+H_\eta^{-1}\e$ to see that
$$
  K(x+\e)=K(x)+\brak{\hat v(x)}{\e}+\half\e^\top H_\eta^{-1}\e+o(\e^2).
  \enum{zdbt}
                                                  $$
Thus $K$ is twice differentiable, $\nabla K(x)-\hat v(x)$ is a
multiple of $\bf 1$, and $K$ is locally (and hence globally) strictly
convex. (Technical note: we are regarding $H_\eta$, the Hessian of $\eta$,
as an automorphism of ${\bf 1}^\bot$.)
By the above $x=\nabla\eta(v)$ is a solution of \zdbr,
and the strict convexity of $K$ shows it is unique.\endbox

In the simple example studied in Section (d) of \bwii,
the rate function was calculated exactly
as $K(x)=\sum\pi_i\log(\pi_i/x_i)$, which is infinite on the boundary
of $M$, whilst the Ces\`aro rate function $I$ is finite everywhere.
Note that we can see that $I$ is finite in the general \zxdbthree\ situation
by considering equation \zdbi.
We shall have further remarks about this example in the next Section,
but for the moment  we derive a necessary and sufficient condition for $K$ 
to be everywhere finite or infinite.

\proc{\pnum{Proposition}{zxdcsix}} {\sl Under the conditions of 
\zxdbthree, the rate function $K$ is either everywhere finite
or everywhere infinite on the boundary of $M$ 
according as to whether the support of the discount function $m$ is of finite 
or infinite (Lebesgue) length.}

\proof{\zxdcsix} Firstly let us define $V_+$ to be the
space of elements of $(\Bbb R^+)^n$ which have at least one zero component.
We note that $\l V_+=V_+$ for any positive $\l$, which is a feature we
shall use later. For $x$ in $\Int(M)$, we take $v_x$ to be the unique
choice in $V_+$ of the $v=\nabla K(x)$ in \zxdbfive. In fact the pair
$(\nabla K,\nabla\eta)$ represents a homeomorphism between $\Int(M)$
and $V_+$. Then $x=\nabla\eta(v_x)$,
so by taking the gradient of \zdbk, we can write $x$ as
$$
  x=\int_0^\infty m_t\nabla\d(m_tv_x)\,dt.
                                                  $$
Then because $\brak{v}{\nabla\d(v)}=\d(v)+I(\nabla\d(v))$ for all $v$ in
$\Bbb R^n$,
$$
  \brak{v_x}{x}=\int_0^\infty\Bigl(\d(m_tv_x)+I\bigl(\nabla\d(m_tv_x)\bigr)
  \Bigr)\,dt=\eta(v_x)+\int_0^\infty I\bigl(\nabla\d(m_tv_x)\bigr)\,dt.
                                                  $$
As $v_x$ is the optimal $v$ in \zdbq, we can express $K(x)$ as
$$
  K(x)=\int_0^\infty I\bigl(\nabla\d(m_tv_x)\bigr)\,dt.\enum{zdbspecial}
                                                  $$
Thus, for an upper bound,
$$
  K(x)\leq\sup_{y\in M}I(y)\,\int_0^\infty I_{\{m_t>0\}}\,dt=
  \sup_{y\in M}I(y)\,\Leb\supp(m),
                                                  $$
and so $K$ is bounded on all of $M$ if the support of $m$, $\supp(m)$, is
compact. The rate function $I$ is only 0 in $M$ at $\pi$, and $\nabla\d$
only takes the value $\pi$ in $V_+$ at 0. Thus from \zdbspecial\ we have
the lower bound
$$
  K(x)\geq\Leb\bigl\{t:m_t\geq\|v_x\|_\infty^{-1}\,\bigr\}\,
  \inf\bigl\{I\bigl(\nabla\d(v)\bigr):v\in V_+,\ \|v\|_\infty\geq1\,\bigr\}>0.
                                                  $$
Now as $x$ tends towards $\partial M$, the boundary of $M$, the vector $v_x$
tends to infinity in $V_+$. So if $m$ has unbounded support, then $K(x)$
tends to infinity as $x$ tends to $\partial M$.
The intuition, of course,
is that $X$ can with positive probability avoid hitting a certain state
for all times in a finite length set but not for all times in an infinite
length set. \endbox


\newsect{More exact results for Markov chains}%
Our aim is to obtain a sharper version of \zdbk\ for finite Markov
chains, and then to derive more terms of the asymptotic expansion
of the density of $\Al$.

The initial case studied in \bwii\ was of a 
symmetrizable (reversible) Markov chain and a smooth discount
density $m$. It turns out that $m$ need only be of bounded variation
(see below), but for technical ease we shall give the proof first in
the case where $m$ is also absolutely continuous.

More interestingly, the symmetrizability is seen now to have only
been needed to make one of the eigenvalues of $Q$ real and its corresponding
eigenvector orthogonal to the others. This in fact happens automatically
because every (non-diagonal) element of $Q$ is non-negative
(we say that $Q$ is {\it essentially non-negative}).
The following theorem collects all the facts about non-negative
matrices that we will need.

\proc{\pnum{Theorem}{zxdcone}} {\sl Let $R$ be an essentially
non-negative $n\times n$ matrix. Let $\d$ be its principal
eigenvalue (the one with greatest real part). Then $\d$ is itself real,
and its corresponding eigenvector is non-negative and no other is
positive.
If, in addition, $R$ is irreducible (in the stochastic sense), then
$\d$ is simple, its eigenvector is strictly positive and no other
is non-negative, and there exists a real diagonal matrix $F$ with
positive elements such that $S:=\d I-F^{-1}RF$ has a simple
eigenvalue zero, with an orthogonal eigenprojection $P$, and
that, for some positive $\sigma$}
$$
  \brak{Sx}{x}\geq\sigma\|(I-P)x\|^2,\qquad\hbox{\sl for all $x\in\Bbb R^n$.}
  \enum{zdca}
                                                  $$
(Where $\brak{\ }{\ }$ and $\|\ \|$ are the standard inner product and
its norm on $\Bbb R^n$, and an orthogonal projection $P$ satisfies
$P^\top=P^2=P$.)

\proof{\zxdcone} For the first parts see the Perron-Frobenius
theorem in, for example, theorem 1.5 of \seneta\ or theorems I.7.5
and I.7.10 of \kato. For the 
existence of $F$ see Theorem~B of \baxiii, which itself is adapted from 
theorem I.7.13 of \kato.\endbox

We recall that the {\it variation} of a measurable function
$x:[0,\infty)\rightarrow\Bbb R$ on $[a,b]$ is defined as
$$
  V_x(a,b):=\sup\sum_{i=1}^n|x(t_i)-x(t_{i-1})|,\enum{zdcb}
                                                  $$
where the supremum is taken over all partitions: $a=t_0<t_1<\ldots<t_n=b$
of $[a,b]$. We say that $x$ is of finite variation (FV) if
$V_x(0,t)$ is finite for all $t$, and that $x$ is of bounded
variation (BV) if $V_x(0,\infty):=\lim_{t\rightarrow\infty}V_x(0,t)$
is finite.
An absolutely continuous BV function is the partial integral of a function
in $L_1(0,\infty)$. (My thanks to James Norris for correcting a previous
mis-statement here.)

We can now begin by strengthening \zdbk:

\proc{\pnum{Theorem}{zxdctwo}} {\sl Let $X$ be an irreducible continuous-time
Markov chain on a finite set $S$, with Q-matrix $Q$. Let $m$ be
a non-negative absolutely continuous density on $[0,\infty)$ of bounded
variation. Then, if $X$ starts in state $i$ and $v$ is in $\Bbb R^S$,
$$  
  \Bbb E_i\exp\int_0^\infty m(\l t)v(X_t)\,dt=e^{\eta(v)/\l}
  \bigl(w_i(m_0v)+o(1)\bigr),\enum{zdcc}
                                                  $$
where $\eta(v)$ is as in \zdbk\ and $w(v)$ is the positive
eigenvector of $Q+V$ and $o(1)$ tends to 0 locally uniformly in $v$
as $\l$ goes to 0.}

\proof{\zxdctwo}

The chain has an invariant distribution $\pi$, but we do not need to
assume that $Q$ is $\pi$-symmetric.
We will aim to get a uniform bound for all $v$ in some compact subset
$V_K$ of $\Bbb R^S$, and for a fixed $m$ such that $V_m(0,\infty)\leq K_V$.
Since \zxdbone\ gives us the asymptotic exponential size of $\phi$,
it is sensible to discount it by the same, by defining
$$
  \psi_i(t,\l,v):=\exp\left(-\l^{-1}\int_0^\infty
  \d(\theta_tm_sv)\,ds\right) \phi(i,t,\l,v).\enum{zdcd}
                                                  $$
Then $\psi$ satisfies the vector differential equation transformed from \zdbl
$$
  \partial_t\psi=\l^{-1}R(m_tv)\psi,\qquad\psi(\infty,\l)={\bf 1},\enum{zdce}
                                                  $$
where $R(v)$ is $\d(v)I-(Q+V)$ which has a simple eigenvalue at 0, and all its
other eigenvalues have positive real part. From \zxdcone, there
exists a real diagonal matrix $F(v)$ with positive elements, such
that $S(v):=F^{-1}(v) R(v) F(v)$ has an orthogonal eigenprojection $P(v)$
onto the space spanned by the strictly positive eigenvector $y(v)$
corresponding to the eigenvalue zero. Further there exists a positive
$\sigma(v)$ such that \zdca\ holds, that is
$$
  \brak{S(v) x}{x}\geq\sigma(v)\|(I-P(v))x\|^2\qquad
  \hbox{\rm for all $x$,}\enum{zdcf}
                                                  $$
where $\brak{\ }{\ }$ and $\|\ \|$ are the standard inner product and
its norm on $\Bbb R^S$.
\kato, or otherwise, tells us that $R$, $S$, $F$, $P$,
$y$ and $\sigma$ are smooth in $v$ with bounded derivatives on $V_K$. Let
$\sigma_0:=\inf_{v\in V_K}\sigma(v)$, which is positive.
We now fix $v$, although our bounds will still be uniform,
and write $R_\a$ for $R(\a v)$,
and so on. We can choose the normalisation
of $F$ uniquely such that $F_0=\diag(\pi_i^{-1/2})$ and
$P_\a F^{-1}_\a F'_\a y_\a=0$, and by choosing $\|y(v)\|=1$ we
ensure that $P_\a y'_\a=0$ and $y_0=\sqrt\pi$.

As in \baxiii, we change bases appropriately by defining
$$
  \chi(t,\l):=F^{-1}_{m(t)}\psi(t,\l).\enum{zdcg}
                                                  $$
The differential equation \zdce\ now becomes
$$
  \partial_t\chi=\l^{-1}S_{m(t)}\chi+J_{m(t)}\chi m',
  \qquad\chi(\infty,\l)=\sqrt\pi,\enum{zdch}
                                                  $$
where $J_\a:=-F^{-1}_\a F'_\a$. Then by taking the inner product of \zdch\
with $\chi$ we can produce a differential inequality in the norm of $\chi$,
$$
  \half\partial_t\|\chi\|_t^2\geq0-K_1\|\chi\|_t^2|m'_t|,\qquad
  \hbox{\rm so that}\qquad \|\chi\|_t\leq\exp
  \left(K_1\int_t^\infty|m'_s|\,ds\right),\enum{zdci}
                                                  $$
where $K_1:=\sup\|J_\a\|$, the supremum taken over the range $\a\in[0,K_V]$
and $v\in V_K$.
Whence we deduce that $\chi$ is uniformly bounded in $t$ and $\l$ by
$K_\chi:=\exp(K_1 K_V)$.
Now we split $\chi$ up according to the decomposition $I=P_\a\oplus(I-P_\a)$,
and define $\chi_-(t):=(I-P_{m(t)})\chi_t$. We differentiate $\chi_-$,
using \zdch, to get
$$
  \partial_t\chi_-=\l^{-1}S_m\chi_-+(I-P_m)J_m\chi m'-P'_m\chi m',
  \qquad\chi_-(\infty,\l)=0.\enum{zdcj}
                                                  $$
Taking the inner product of this with $\chi_-$ itself, we derive the inequality
$$
  \half\partial_t\|\chi_-\|_t^2\geq\l^{-1}\sigma_0\|\chi_-\|_t^2
  -K_\chi(K_1+K_2)\|\chi_-\|_t|m'_t|,\enum{zdck}
                                                  $$
where $K_2:=\sup\|P'_\a\|$, with the supremum taken over the same range
as $K_1$. Which we can integrate to get the upper bound
$$
  \|\chi_-\|_t \leq (K_1+K_2)K_\chi\int_t^\infty 
  e^{-\sigma_0(s-t)/\l}|m'_s|\,ds.\enum{zdcl}
                                                  $$
And so we see that $\chi_-(t,\l)$ tends to $0$ as $\l$ tends to $0$
for all finite $t$, though note that the convergence is not necessarily
uniform in $t$.
Finally we consider the component of $\chi$ in the $y_m$ direction,
$\xi(t,\l):=\brak{\chi(t,\l)}{y_{m(t)}}$, which
is governed by the differential equation obtained from \zdch
$$
  \partial_t\xi_t=\brak{\chi_-}{Jy+y'}m'_t,\qquad\xi(\infty,\l)=1,\enum{zdcm}
                                                  $$
where we used the fact that $PJy=Py'=0$. Then
$$
  |\xi(t,\l)-1|\leq(K_1+K_2)\int_t^\infty\|\chi_-(s,\l)\|\,|m'_s|\,ds,
  \enum{zdcn}
                                                  $$
which, by the Dominated Convergence theorem, tends to 0 uniformly in $t$ as 
$\l$ goes to 0. So
$$
  \phi(i,0,\l,v)=\exp(\eta(v)/\l)\bigl((Fy)_i(m_0v)+o(1)\bigr),\enum{zdco}
                                                  $$
and $F(v)y(v)=w(v)$, where $w(v)$ is the positive eigenvector of
$Q+V$, with the normalisation that $w(0)={\bf 1}$ and
$\partial_\a w(\a v)$ is orthogonal to the positive eigenvector of
$Q^\top+\a V$. In the case where $Q$ is $\pi$-symmetrizable
($\pi_iq_{ij}=\pi_jq_{ji}$), then the normalisation condition
becomes $\|w\|_\pi=1$, where $\|v\|_\pi^2=\sum\pi_iv_i^2$.\endbox

The next theorem removes the restriction that $m$ need be continuous,
but takes us into the technicalities of FV functions. The casual reader
can pass this by without disadvantage.

An FV function can be written as the difference of two increasing
functions, that is
$$
  \displaylines{x_t=x_0+x^+_t-x^-_t,\cr
  \rlap{\hbox{\rm where}}\hfill x^+_t:=\half(x_t+V_x(0,t)-x_0)\quad
  \hbox{\rm and}\quad x^-_t:=\half(x_0+V_x(0,t)-x_t).\hfill\cr}
                                                  $$
And so $x$ has only countably many discontinuities (though they may even be
dense), and thus can be taken to be an R-function (right-continuous with
left limits). We shall take all our functions to be R-functions. We adapt
the calculus from the left-continuous integrands of V.18 of \rogersw\
(changing some signs) to give the formulae
$$
  \leqalignno{x&=x_0+x^c+x^a&(\hbox{\rm Decomposition})\cr
  d(xy)&=x\,dy+y\,dx-\D x\D y&(\hbox{\rm Integration by parts})\cr
  d(f(x))&=f'(x)\,dx^c+\D(f(x))&(\hbox{\rm It\^o's formula})\cr}
                                                  $$
where $x$ and $y$ are FV and $f$ is $C^1$, $\D x_t$ is $x_t-x_{t-}$, and
$x^c$ and $x^a$ denote the continuous and purely discontinuous parts of $x$
respectively. There is an expression for $x^a$ as $\sum_{0<s\leq t}\D x_s$.
As $x^+$ and $x^-$ are increasing they induce positive $\sigma$-finite
Lebesgue-Stieltjes measures on $(0,\infty)$, via $x^+(a,b]=x^+_b-x^+_a$.
So we can associate $x$ with the (signed) measure of their difference.
We write $dx_t=dx^+_t-dx^-_t$. We will also use the notation
$|dx_t|$ for $dx^+_t+dx^-_t=dV_x$. The differential expressions above
are symbolic, being merely shorthand for integral expressions.

We will also use an FV exponential result in that if $x$ is BV and
$$
  \hbox{\rm if}\quad dx_t\geq-x_t|dy_t|,\quad\hbox{\rm then}\quad
  x_t\leq x_\infty\prod_{s>t}(1+|\D y_s|)\exp V_{y^c}(t,\infty).\enum{zdcea}
                                                  $$
Another useful result follows from integration by parts, in that
$$
  \half d\|x_t\|^2=\brak{x_t}{dx_t}-\half\|\D x_t\|^2.\enum{zdceb}
                                                  $$

\proc{\pnum{Theorem}{zxdcextra}} {\sl \zxdctwo\ remains true if
$m$ is a discontinuous non-negative density of bounded variation.}

\proof{\zxdcextra} 
We follow the proof of \zxdctwo\ exactly down to \zdcg, except that
we take $R$, $F$ $S$ and $P$ to be functions of $t$ rather than $v$. We
write $G:=F^{-1}$, $w:=Fy$ and $w^*:=Gy$, and take the normalisation
that $\|y\|=1$ ($\iff\brak{w}{w^*}=1$) and $\brak{w}{dw^*}=0$
($\iff\brak{dw}{w^*_-}=0$). Note that all these functions are BV.
Then \zdch\ becomes
$$
  d\chi=\l^{-1}S\chi\,dt+dG\,F\chi.\enum{zdcec}
                                                  $$
$$
  \hbox{\rm So}\quad\half d\|\chi_t\|^2\geq\brak{dG\,F\chi}{\chi}_t
  -\half\|\D G F\chi\|^2_t\geq K_1\|\chi_t\|^2\|dG\|,\enum{zdced}
                                                  $$
by \zdceb\ and \zdcea, for some constant $K_1$. Hence $\|\chi_t\|$ is uniformly 
bounded in $t$
by some constant $K_\chi$. Now using \zdcec\ and \zdceb\ we again
work with the components of $\chi$ orthogonal to $y$,
$$
  \eqalign{
  \half d\|\chi_-\|_t^2&\geq{\sigma_0\over\l}\|\chi_-\|_t^2\,dt
  +\brak{dG\,F\chi-dP\,\chi+\D P\D\chi}{\chi_-}_t-\half\|\D(P\chi)\|^2_t \cr
  &\geq{\sigma_0\over\l}\|\chi_-\|_t^2\,dt-K_2(\|dG_t\|+\|dP_t\|),\cr}
                                                  $$
for some constant $K_2$.
So a result of the same form as \zdcl\ holds. Finally we find that
$$
  d\xi=\xi\brak{w}{dw^*}+\brak{F\chi_-}{dw^*},
                                                  $$
and we have a bound similar to that of \zdcn, because $\brak{w}{dw^*}=0$.
Explicitly, $w_t$ is the positive eigenvector of $Q+m_t V$ with
the normalisation that $w_\infty={\bf 1}$ and $dw_t$ is orthogonal to
the positive eigenvector of $Q^\top+m_{t-}V$.\endbox 

We can calculate an exact expression for $w$.

\proc{\pnum{Lemma}{zxdcestwo}} {\sl Let $y(v)$ be the positive
eigenvector of $Q+V$ of constant norm, with $y(0)={\bf 1}$.
If $w^0_t:=y(m_tv)$ and $w^*_t$ is the positive eigenvector of
$Q^\top+m_tV$ satisfying $\brak{w^0_t}{w^*_t}=1$, then
$$
  w_t=w^0_t\prod_{s>t}(1+\brak{\D w^0_s}{w^*_{s-}})\exp\int_t^\infty
  \brak{dw^{0,c}_s}{w^*_s}.
                                                  $$
Further $w_t=w_t(v,m)$ is continuous in $v$.}

\proof{\zxdcestwo} If we set $w_t=r_tw^0_t$, then
$$
  dw_t=dr^c_t\,w^0_t+r_t\,dw^0_t+\D r_tw^0_{t-}\quad
  \hbox{\rm so}\quad\brak{dw_t}{w^*_{t-}}=dr_t+r_t\brak{dw^0_t}{w^*_{t-}}.
                                                  $$
An application of \zdcea\ gives the expression for $w$. Elementary 
perturbation results, in for example Kato~[7], tell us that $y$ is 
smooth in $v$ and so
$$
  dw^0_t(v)=\brak{v}{\nabla}y(m_tv)\,dm^c_t+\D(w^0_t(v)).
                                                  $$
Thus the difference $dw^0_t(v)-dw^0_t(u)$ can be written as
$$
  \displaylines{
  \Bigl(\brak{v-u}{\nabla}y(m_tv)
  +\brak{u}\nabla(y(m_tv)-y(m_tu))\Bigr)\,dm^c_t+\D(y(m_tv)-y(m_tu)).\cr
  \rlap{\hbox{\rm Hence}}\hfill\quad|dw^0_t(v)-dw^0_t(u)|\leq K_1\|v-u\|\,
  |dm^c_t| +K_2\|v-u\|\,|\D m_t|,\hfill\cr}
                                                  $$
where $K_1:=\sup_{K_V}\|\nabla y(v)\|+V_K\sup_{K_V}\|v\|\,\|\nabla y(v)-
\nabla y(u)\|/\|v-u\|$ and some constant $K_2$. Hence $w_t$ is
(Lipschitz) continuous in $v$.\endbox

\proc{\pnum{Theorem}{zxdcfour}} {\sl Let $X$ be an irreducible continuous-time
Markov chain on a finite set $S$, with Q-matrix $Q$. Let $m$ be
a non-negative density on $[0,\infty)$ of bounded
variation. Then
where $f_i^\l$ is the density of $A_\l$ on $M$ under the law starting
$X$ at $i$, the $(f_i^\l)$ can be written as
$$
  f^\l_i(x)=e^{-K(x)/\l}(2\pi\l)^{-(n-1)/2}\bigl(\det H_K(x)\bigr)^{1/2}
  z_i(x)r^\l_i(x),\enum{zdcp}
                                                  $$
where $K$ is as defined by \zdbq, $H_K$ denotes its Hessian taken with
respect to $M$,
$z(x)$ is the positive eigenvector of $Q+\diag(m_0\nabla K(x))$, and the
residue term $r^\l$ goes to 1 as $\l$ goes to 0, in the sense that
$$
 \limsup_{\l\downarrow0}\int_F r_i^\l(x+\sqrt\l y)\,dy\leq|F|,\quad
  \hbox{\sl and}\quad\liminf_{\l\downarrow0}\int_G r_i^\l(x+\sqrt\l y)\,dy
  \geq|G|,\enum{zdcq}
                                                  $$
for all $x$ in $\Int(M)$, and for
$F$ and $G$ respectively closed and open bounded subsets of ${\bf 1}^\bot$.}

\noindent{\bf Notes:} (1)  We take the Hessian regarding $K$ as a function
on an open subset of $\Bbb R^{n-1}$, that is
$K(x_1,\ldots,x_{n-1},1-\sum_i^{n-1}x_i)$. See the example at the end
of this paper.

\noindent (2) Unfortunately we would really like to prove the result that
$$
  \Bbb P_i(A_\l\in H)\sim\int_H e^{-K(x)/\l}
  (2\pi\l)^{-(n-1)/2}\bigl(\det H_K(x)\bigr)^{1/2}z_i(x)\,dx,\enum{zdcr}
                                                  $$
for suitable $H$, as $\l$ goes to 0. This could be proved if the integrand
in our control of $r^\l$ was $r_i^\l(x+\l y)$
rather than $r_i^\l(x+\sqrt\l y)$.

\proof{\zxdcfour} \zxdctwo\ can be taken as saying
that for $v$ in $\Bbb R^S$,
$$
  f_i^{v,\l}(x):=\exp\Bigl(\l^{-1}\bigl(\brak{v}{x}-\eta(v)\bigr)\Bigr)
  f_i^\l(x)/w_i(v)\enum{zdcs}
                                                  $$
is (asymptotically) a density on $M$, where $w_i(v)$ is the $w_0(v,m)(i)$
of \zxdcestwo, and our $z_i(x)$ will be $w_i(\nabla K(x))$. If $A_\l^v$ 
under $\Bbb P_i$
has the law $f_i^{v,\l}$, then we can derive a central limit result
by considering $Z_\l^v:=(A_\l^v-\nabla\eta(v))/\sqrt\l$. We see that
for $u$ in $\Bbb R^S$,
$$
  \eqalignno{\relax\Bbb E_i\bigl(\exp\brak{u}{Z_\l^v}\bigr)
  &=\exp\Bigl(\l^{-1}\bigl(\eta(v+\sqrt\l u)-\eta(v)-
  \brak{\sqrt\l u}{\nabla\eta(v)}\,\bigr)\Bigr)\cr 
  &\qquad\ldots\Bigl(w_i(v+\sqrt\l u)+o(1)\Bigr)/w_i(v),
  &\lnum{zdct}\cr}
                                                  $$
using the local uniformity in $v$ of the convergence of $o(1)$. As $\eta$
inherits the smoothness of $\d$, we can expand it about $v$ as
$$
  \eta(v+\sqrt\l u)=\eta(v)+\brak{\sqrt\l u}{\nabla\eta(v)}+
  \half\l u^\top H_\eta(v)u+o(\l),\enum{zdcu}
                                                  $$
and hence deduce that
$$
  \lim_{\l\downarrow0}\Bbb E_i\bigl(\exp\brak{u}{Z_\l^v}\bigr)=
  \exp\bigl(\half u^\top H_\eta(v)u\bigr).\enum{zdcv}
                                                  $$
In other words
$$
  Z_\l^v\tendsd N(0,H_\eta(v)).
                                                  $$
We can think of $f_i^{v,\l}$ as the distribution of $\Al$
conditioned in some way to converge to $\nabla\eta(v)$, but we do
not make this formal. \zxdbfive\ provides the interpretation of $\nabla\eta(v)$
as the maximizing $x$ in the Legendre transform.

Recall that a sequence of laws $(\nu_n)$ on a Polish space $E$ converges
to a law $\nu$ with respect to the weak topology on $M_1(E)$ if
$\brak{v}{\nu_n}\rightarrow\brak{v}\nu$ for all $v$ in $C_b(E)$.
\billingsley, 2.1, shows that this is equivalent to each of the following
$$
  \eqalign{\limsup_{n\rightarrow\infty}\nu_n(F)&\leq\nu(F)\qquad
  \hbox{\rm $F$ closed in $E$,}\cr
  \liminf_{n\rightarrow\infty}\nu_n(G)&\geq\nu(G)\qquad
  \hbox{\rm $G$ open in $E$,}\cr
  \hbox{\rm and}\qquad\lim_{n\rightarrow\infty}\nu_n(H)&=\nu(H)\qquad
  \hbox{\rm $H$ in $E$ with $\nu(\partial H)=0$.}\cr}
                                                  $$
Setting $x=\nabla\eta(v)$, we recall from \zxdbfive\ that $\nabla K(x)$
is $v$, up to a multiple of~$\bf 1$. The asymptotics of
the density of $Z_\l^v$ are given by
$$
  f_i^{Z,\l}(y):=\l^{(n-1)/2}f_i^{v,\l}(x+\sqrt\l y)
  \sim(2\pi)^{-(n-1)/2}\det H_K(x)^{\half}e^{-\half y^\top H_K(x)y}\,
  r_i^\l(x+\sqrt\l y),
                                                  $$
because
$$
  \eqalign{&\brak{v}{x+\sqrt\l y}-\eta(v)-K(x+\sqrt\l y)\cr
  &\quad=\brak{v}{x}-\eta(v)-K(x)+\sqrt\l\brak{v-\nabla K(x)}{y}
  -\half\l y^\top H_K(x)y+o(\l)\cr
  &\quad=\l(-\half y^\top H_K(x)y+o(1)).\cr}
                                                  $$
The normal distribution $N(0,H_\eta(v))$ itself has density
$$
  f(y)=(2\pi)^{-(n-1)/2}\det H_K(x)^{\half} e^{-\half y^\top H_K(x)y},
                                                  $$
as \zxdbfive\ tells us that $H_K=H_\eta^{-1}$ on $M$.
By Lemma I.45.1 of \williams, if $H$ is bounded and $|\partial H|=0$
then
$$
  \int I_H(y)(f(y))^{-1}f_i^{Z,\l}(y)\,dy\rightarrow|H|,\qquad
  \hbox{\rm or}\qquad \int_H r_i^\l(x+\sqrt\l y)\,dy\rightarrow|H|.\enum{zdcw}
                                                  $$
Hence by the equivalence of the above expressions for weak convergence,
the result is proved.\endbox

The following Corollary is intended in the way of a remark, and was the
original statement of \zxdcfour, but is now seen to be weaker, although
perhaps a more natural formulation.

\proc{\pnum{Corollary}{zxdcfive}} {\sl Under the conditions of 
\zxdcfour,
$$
  \limsup_{\l\downarrow0}\int_F r_i^\l(x)\,dx\leq|F| \qquad\hbox{\sl and}
  \qquad\liminf_{\l\downarrow0}\int_G r_i^\l(x)\,dx\geq|G|,\enum{zdcx}
                                                  $$
for $F$ closed in $\Int(M)$ and $G$ open in $\Int(M)$. In other words,
$r_i^\l(x)\,dx$ converges weakly to $dx$ on $\Int(M)$.}

\proof{\zxdcfive} Take $G$ open in $\Int(M)$,
$\d$ small and positive with $G_\d:=\{y\in G:B(y,\d)\subseteq G\}$,
and $B$ a ball around 0, then by Fatou's lemma and Fubini's theorem
$$
  \eqalign{|G_\d|\,|B|&=\int_{G_\d}\left(\liminf_{\l\downarrow0}
  \int_B r_i^\l(x+\sqrt\l y)\,dy\right)\,dx\cr
  &\leq\liminf_{\l\downarrow0}\int_{G_\d}\int_B r_i^\l(x+\sqrt\l y)\,dy\,dx
  \leq\left(\liminf_{\l\downarrow0}\int_G r_i^\l(x)\,dx\right)\,|B|.\cr}
                                                  $$
Letting $\d$ tend to 0, we have one of our bounds. For some $F$ closed
in $\Int(M)$, we need $\int_B r_i^\l(x+\sqrt\l y)\,dy$ to be uniformly bounded
on $F$ and for $\l$ near 0. It is, and the bound is
$$
  \sup_{x\in F}(2\pi)^{(n-1)/2}\det H_K(x)^{-\half}\sup_{y\in B}
  e^{\half y^\top H_K(x)y}<\infty.
                                                  $$
Working with this $F$ and with $F^\d:=\{y\in M:d(y,F)\leq\d\}$, we
can show in a similar way that
$$
  \limsup_{\l\downarrow0}\int_F r_i^\l(x)\,dx\leq|F^\d|,
                                                  $$
and hence we are home.\endbox

\noindent{\bf Some remarks.}

\noindent{\bf (a)} Were the $r_i^\l$ to be equicontinuous (or some such
condition) we would have that $r_i^\l(x)\rightarrow1$ for all $x$ and
hence that $f_i^\l(x)/f_j^\l(x)\rightarrow z_i(x)/z_j(x)$ and
$\lim-(Qf^\l)_i/f^\l_i$ differs from $m_0\nabla K(x)$ only by a multiple of 
$\bf 1$, as in Section (c) of \bwii, where the choice
of $\nabla K(x)$ in $\ker(\d)$ was called $g(x)$.

\noindent{\bf (b)} Note that the proof of \zxdcfour\ gives us a
central limit theorem for $\Al$ as
$$
  Z_\l:=(A_\l-\pi)/\sqrt\l\tendsd N(0,H_\eta(0)).\enum{zdcy}
                                                  $$
Taking a Taylor expansion of $\d$ about 0 and integrating we discover
that $H_\eta(0)=\sigma^2 H_\d(0)$, where $\sigma=\|m\|_{L_2}$ which is
finite because $m$ is in both $L_1$ and $L_\infty$.

\noindent{\bf Example.} (This case was first studied in Section (d) of \bwii.)
Suppose we have a Markov chain which is symmetric and 
space-homogeneous, with Q-matrix
$q_{ij}:=\pi_j-\d_{ij}$, where $\pi$ is a distribution on a finite set $S$. 
The Ces\`aro large-deviation rate function
is $I(x)=1-(\sum\sqrt{\pi_ix_i})^2$, and the exponentially discounted
large-deviation rate is $K(x)=\sum\pi_i\log(\pi_i/x_i)$.
We found then that $\d(v)$ is the unique root $\d$ in $(\max_i(v_i-1),\infty)$
of
$$
  \sum_{i\in S}{\pi_i\over\d+1-v_i}=1,
                                                  $$
and that $\eta$ is given by $\eta(v)=\d(v)-\sum_i\pi_i\log(\d(v)+1-v_i)$.
We find now that
$$
  \eqalign{\nabla_i I(x)&=1-\sqrt{\pi_i\over x_i}\Bigl(\sum_{j\in S}
  \sqrt{\pi_j x_j}\Bigr),\cr
  \nabla_i\d(v)&={\pi_i\over(\d(v)+1-v_i)^2}\Big/\sum_{j\in S}
  {\pi_j\over(\d(v)+1-v_j)^2},\cr
  \nabla_iK(x)&=1-{\pi_i\over x_i},\cr
  \hbox{\rm and}\quad\nabla_i\eta(v)&={\pi_i\over\d(v)+1-v_i}.\cr}
                                                  $$
Here we chose $\nabla I$ and $\nabla K$ to be in the kernel of $\d$.
The distribution of $\Al$ can be calculated explicitly to be a multidimensional 
$\b$-distribution with density
$$
  f_i^\l(x)={x_i\over\pi_i}\Gamma(\l^{-1})\prod_{j\in S}{x_j^{(\pi_j/\l)-1}
  \over\Gamma(\pi_j/\l)}.
                                                  $$
Note that the Hessian of $K$ on $M$ is not the same as that derived
from the extension of $K$ to $\Bbb R^S$, but by using any of the following
co-ordinate schemes:
$$
  \eqalign{
  K_i : \Bbb R^{S\backslash\{i\}}&\longrightarrow\Bbb R
                    \qquad\hbox{\rm for each $i\in S$}\cr
\hbox{\rm where}\qquad(x_j)_{j\neq i}&\longmapsto 
    K(x_1,\ldots,x_{i-1},1-\sum_{j\neq i}x_j,x_{i+1},\ldots,x_n)\cr
\noalign{\smallskip}
\hbox{\rm or}\qquad K_0 : \Bbb R^n&\longrightarrow\Bbb R\cr
\hbox{\rm where}\qquad x&\longmapsto K(x+(1-{\bf 1}^\top x){\bf1}/n)+
                         \half({\bf1}^\top x)^2.\cr}
                                                  $$
What is happening here is that our choice of basis for evaluating
the Hessian corresponds to our choice of basis for integrating which
was made back at the start of Section (b) of \bwii. The $K_0$
representation projects onto $M$ and adds a strictly convex term which
is perpendicular to $M$. This representation is more natural, though cumbersome
to calculate with, and can be shown equivalent to any of the others by
verifying that the change of basis matrix has determinant one.
Thus the Hessian (in the $K_n$ realisation) and its determinant are
given by
$$
  H_{K_n}(x)_{ij}={\pi\over x_i^2}\d_{ij}+{\pi_n\over x_n^2}\qquad
  \hbox{\rm and}\qquad\det(H_K(x))=
  \Bigl(\prod_{i\in S}{\pi_i\over x_i^2}\Bigr)\sum_{j\in S}{x_j^2\over\pi_j}.
                                                  $$
The normalisation of the eigenvector $z_i(x)$ is that $\|z\|_\pi=1$,
so it is given by
$$
  z_i(x)={x_i\over\pi_i}\Bigl(\sum_{j\in S}{x_j^2\over\pi_j}\Bigr)^{-1/2}.
                                                  $$
(The corresponding vector for the Ces\`aro case is
$\sqrt{x_i/\pi_i}(\sum_j\sqrt{\pi_jx_j})$.)
We can now calculate the residual functions using Stirling's formula
$$
  \Gamma(x)=\sqrt{2\pi}x^{x-\half}e^{-x}(1+O(x^{-1})),
                                                  $$
where $|O(x^{-1})|\leq K/x$ as $x\rightarrow\infty$ for some constant $K$.
It is thus discovered that 
the residual functions $r_i^\l(x)$ can be calculated and are found to
be independent of both $i$ and $x$, and are of size $1+O(\l)$.

\proc{\pnum{Hypothesis}{pxchyp}} We recall from Theorem~C of \bwii\
that in the set-up of \zxdcfour\ with the
exponential discount ($m_t=e^{-t}$), the density $f^\l$ satisfies
the vector differential equation
$$
  \L f^\l=-\l^{-1}Qf^\l,\enum{zdcha}
                                                  $$
where $\L$ is the matrix differential operator $\L=\diag(\sum_{j\neq i}
(\partial_j-\partial_i)x_j)_{i\in S}$. Here we have changed the
domain of $f^\l$ from a subset of $\Bbb R^{n-1}$ equivalent to $M$,
to a neighbourhood of $M$ in $\Bbb R^n$ by extension. The operator
$\L$ is invariant to the extension chosen. If we discount $f^\l$ by
the known large-deviation rate function $K$, that is by defining
$g^\l$ by
$$
  \displaylines{f^\l(x)=e^{-K(x)/\l}(2\pi\l)^{-(n-1)/2}g^\l(x)\cr
  \rlap{\hbox{\rm then}}\hfill \L g^\l=\l^{-1}R\bigl(\nabla K(x)\bigr)g^\l.
  \hfill\cr}
                                                  $$
This compares with equation \zdce\ which said that
$$
  \partial_t\psi=\l^{-1}R(m_tv)\psi,
                                                  $$
where $\psi$ is the discount of $\phi$ as defined by \zdcd. The matrix
$R(v)$ has a simple eigenvalue 0 and all other eigenvalues have positive
real part. We saw that $\psi$ tended to a multiple of the 0-eigenvector
of $R(m_tv)$ as $\l$ went to 0, and also that $g^\l$ tended (in some
sense) to $z(x)$, which was the 0-eigenvector of $R(\nabla K(x))$.
We can formulate an analogue of \zdcha\ for the general discount case
as follows.

Let us write $A_{\l,t}$ for $\l M_t^{-1}\int_0^\infty\t_tm(\l s)\d_{X_s}\,ds$,
where $M_t:=\int_0^\infty\t_tm(s)\,ds$, and $f^{\l,t}_i$ for the density
of $A_{\l,t}$ if $X$ starts in state $i$. Then $A_{\l,t}$ will satisfy
the large-deviation property with rate function $K_t$, where
$$
  K_t=\eta_t^*\quad\hbox{\rm where}\quad\eta_t(v):=\int_0^\infty
  \d(M_t^{-1}\t_tm(s)v)\,ds,
                                                  $$
and we write $f^{\l,t}$ as
$$
  \displaylines{
  f^{\l,t}(x)=e^{-K_t(x)/\l}(2\pi\l)^{-(n-1)/2}g^{\l,t}(x).\cr
  \rlap{\hbox{\rm Then}}\hfill M_t^{-1}m_t\L g^{\l,t}+\partial_t g^{\l,t}
  =\l^{-1}R\bigl(M_t^{-1}m_t\nabla K_t(x)\bigr)g^{\l,t}.\hfill\cr}
                                                  $$
Again \zxdcfour\ tells us that $g^{\l,t}$ tends (in some sense) to the
0-eigenvector, $z$, of the matrix $R$. We hypothesize that the convergence
is in fact pointwise.

\newpage
\centerline{\bf REFERENCES}
\vskip0.7truein
\input ref
\end
% End of Text