Markov Chain and Queuing System
Zijing Hu
October 15, 2022
Contents
1 Discrete-Time Markov Chain (DTMC)
2 Continuous-Time Markov Chain (CTMC)
3 Poisson Process
4 Queuing System
*This note is based on ECEN 755: Stochastic Systems by Dr. P. R. Kumar, TAMU.
1 Discrete-Time Markov Chain (DTMC)
Definition 1.1. A discrete-time Markov chain on a countable set $S$ is a stochastic process satisfying the Markov property
$$\Pr(x(t) = i_t \mid x(t-1) = i_{t-1}, \ldots, x(0) = i_0) = \Pr(x(t) = i_t \mid x(t-1) = i_{t-1})$$
for any $i_0, \ldots, i_t \in S$ and $t \in \mathbb{N}$.
We consider only time-homogeneous Markov chains, in which the transition probabilities are time-invariant (note that $P^0 = I$):
$$p_{ij} = p_{ij}(t) = \Pr(x(t+1) = j \mid x(t) = i)$$
For a discrete-time, time-homogeneous Markov chain on $S$ we thus have
$$\Pr(x(t) = i_t, \ldots, x(0) = i_0) = p_{i_{t-1}, i_t} \cdots p_{i_0, i_1} \cdot \pi(i_0),$$
where we use the notation
$$\pi(i_0) = \Pr(x(0) = i_0)$$
for the initial distribution $\boldsymbol{\pi}(0)$. We also have $\boldsymbol{\pi}(t+k) = \boldsymbol{\pi}(t)P^k$. Any probability vector $\boldsymbol{\pi}(t) = (\pi_i(t))_{i \in S}$ and two-dimensional array of probabilities $P = (p_{ij})_{i,j \in S}$ with $\sum_{j \in S} p_{ij} = 1$ for all $i \in S$ defines the distribution of a time-homogeneous Markov chain on $S$ through this identity.
Definition 1.2. If $\boldsymbol{\pi} = \boldsymbol{\pi}P$, we say $\boldsymbol{\pi}$ is a steady-state distribution for the Markov chain.
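Numerically, the steady-state equation $\boldsymbol{\pi} = \boldsymbol{\pi}P$ together with the normalization $\sum_i \pi_i = 1$ is just a linear system. A minimal sketch in Python (the 3-state matrix below is a made-up example, not from the lecture):

```python
# A minimal sketch: solve pi = pi P together with sum(pi) = 1 by stacking the
# constraints into one (overdetermined) linear system.
import numpy as np

P = np.array([[0.9, 0.1, 0.0],   # hypothetical 3-state transition matrix
              [0.4, 0.5, 0.1],
              [0.0, 0.2, 0.8]])

n = P.shape[0]
A = np.vstack([(P - np.eye(n)).T, np.ones(n)])   # pi (P - I) = 0 and pi 1 = 1
b = np.zeros(n + 1)
b[-1] = 1.0
pi = np.linalg.lstsq(A, b, rcond=None)[0]

print(pi)        # the steady-state distribution
print(pi @ P)    # equals pi, confirming pi = pi P
```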
Definition 1.3. For a Markov chain with state space $S$, consider a pair of states $(i, j)$. We say that $j$ is reachable from $i$, denoted by $i \to j$, if there exists an integer $n \geq 0$ such that $p_{ij}(n) > 0$.
Definition 1.4. If $j$ is reachable from $i$, and $i$ is reachable from $j$, then the states $i$ and $j$ are said to communicate, denoted by $i \leftrightarrow j$.
The relation defined by communication satisfies the following conditions:
1. All states communicate with themselves: $p_{ii}(0) = 1 > 0$.
2. Symmetry: if $i \leftrightarrow j$, then $j \leftrightarrow i$.
3. Transitivity: if $i \leftrightarrow k$ and $k \leftrightarrow j$, then $i \leftrightarrow j$.
Proposition 1.5. For each Markov chain, there exists a unique decomposition of the state space $S$ into a sequence of disjoint subsets $C_1, C_2, \ldots$,
$$S = \bigcup_{i=1}^{\infty} C_i,$$
in which each subset has the property that all states within it communicate. Each such subset is called a communication class of the Markov chain.
Example 1.6. Here is an example of a DTMC:
[Figure: a transition diagram on the thirteen states A through M.]
There are six communication classes: $C_1 = \{A, B, C\}$, $C_2 = \{D, E\}$, $C_3 = \{F\}$, $C_4 = \{G\}$, $C_5 = \{H\}$, and $C_6 = \{I, J, K, L, M\}$. Thus, we can draw the hierarchy of these communication classes:
[Figure: the hierarchy of the six communication classes.]
Here $C_4$, $C_2$, and $C_5$ are called transient communication classes and $C_1$, $C_3$, and $C_6$ are called closed communication classes.
Definition 1.7. A communication class $C \subseteq S$ is said to be closed if there are no $i \in C$ and $j \in C^c$ such that $i \to j$. Otherwise, it is said to be transient.
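Communication classes are exactly the strongly connected components of the transition graph, and closedness can be checked by looking for transitions that leave a class. A small sketch under that view (the transition matrix is hypothetical, not from the lecture):

```python
# A sketch: find communication classes of a hypothetical chain via boolean
# transitive closure, then test each class for closedness.
import numpy as np

P = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.5, 0.0],
              [0.3, 0.3, 0.4]])   # state 2 leaks into {0, 1}: it is transient

n = P.shape[0]
reach = (P > 0) | np.eye(n, dtype=bool)      # one-step reachability, and i -> i
for _ in range(n):                           # transitive closure
    reach = reach | ((reach.astype(int) @ reach.astype(int)) > 0)

comm = reach & reach.T                       # i <-> j
classes = {frozenset(np.flatnonzero(comm[i]).tolist()) for i in range(n)}
for c in sorted(classes, key=min):
    inside = sorted(c)
    outside = [j for j in range(n) if j not in c]
    # closed iff no positive transition leaves the class
    closed = not (P[np.ix_(inside, outside)] > 0).any() if outside else True
    print(inside, "closed" if closed else "transient")
```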
Theorem 1.8. If $T$ is the set of all transient states, then $\Pr(x(t) \in T \mid x(0) \in T) \leq c\, r^t$, where $0 < r < 1$.
Lemma 1.9. Suppose $i \to j$ and let $N_{ij} = \{t : p_{ij}(t) > 0\}$. Then $N_{ii}$ is closed under addition.
Theorem 1.10. If a set of non-negative integers is closed under addition, then it contains all but a finite number of the multiples of its gcd (greatest common divisor).
Define $N_{ii} = \{t : p_{ii}(t) > 0\}$ and $d_i = \gcd(N_{ii})$. Then $N_{ii}$ contains $k d_i$ except for a finite number of $k$'s. Note that $d_i$ is called the period of the communication class, and the period is a class property.
Lemma 1.11. $i \leftrightarrow j \Rightarrow d_i = d_j$.

Proof. Suppose $i \xrightarrow{n} j \xrightarrow{m} i$. Then $n + m \in N_{ii}$. With Theorem 1.10, we have $i \xrightarrow{n} j \xrightarrow{k d_j} j \xrightarrow{m} i$, so $n + k d_j + m \in N_{ii}$, and likewise $i \xrightarrow{n} j \xrightarrow{(k+1) d_j} j \xrightarrow{m} i$, so $n + (k+1) d_j + m \in N_{ii}$, for all sufficiently large $k$. Since $d_i$ divides every element of $N_{ii}$, it divides the difference of these two elements, i.e., $d_i \mid d_j$. Similarly, we have $d_j \mid d_i$, which shows $d_i = d_j$.
Definition 1.12. (1) A Markov chain is irreducible if all states communicate with each other; (2)
A communication class is aperiodic if its period = 1; (3) A Markov chain is called regular if it is
irreducible and aperiodic.
Example 1.13. Aperiodic communication classes. [Figure: examples of communication classes with period 1.]
Let $T$ be the set of all transient states and let $C_1, \ldots, C_n$ be the closed communication classes. We can write the block structure of $P^t$ for a general Markov chain as
$$P^t = \begin{array}{c|ccccc}
 & C_1 & C_2 & \cdots & C_n & T \\ \hline
C_1 & \times & 0 & \cdots & 0 & 0 \\
C_2 & 0 & \times & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
C_n & 0 & 0 & \cdots & \times & 0 \\
T & \times & \times & \cdots & \times & \times
\end{array}$$
As $t$ goes to infinity, the matrix becomes
$$\lim_{t\to\infty} P^t = \begin{array}{c|ccccc}
 & C_1 & C_2 & \cdots & C_n & T \\ \hline
C_1 & \times & 0 & \cdots & 0 & 0 \\
C_2 & 0 & \times & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
C_n & 0 & 0 & \cdots & \times & 0 \\
T & \times & \times & \cdots & \times & 0
\end{array}$$
i.e., the probability of remaining among the transient states vanishes.
Theorem 1.14. Let $P$ be the transition matrix of a regular Markov chain. Then we have:
(1) $\lim_{n\to\infty} P^n = A$, where every row of $A$ equals $\boldsymbol{\alpha}$;
(2) $PA = AP$ and $\boldsymbol{\alpha}P = \boldsymbol{\alpha}$;
(3) $\boldsymbol{\alpha}$ is the unique invariant probability vector;
(4) for any $\boldsymbol{\pi}(0)$, we have $\lim_{t\to\infty} \boldsymbol{\pi}(t) = \boldsymbol{\alpha}$.
Proof. Taking any $x$, we can show that the difference between the maximum element $M_t$ and the minimum element $m_t$ of $xP^t$ goes to 0. Suppose that all elements of $P$ satisfy $p_{ij} \geq \beta > 0$. Then
$$M_{t+1} \leq (1-\beta)M_t + \beta m_t$$
$$m_{t+1} \geq \beta M_t + (1-\beta)m_t$$
$$\Rightarrow M_t - m_t \leq (1-2\beta)^t (M_0 - m_0),$$
which proves (1). Proof by contradiction proves (2) and (3).
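A quick numerical illustration of Theorem 1.14 on a hypothetical two-state chain: powers of $P$ converge to a matrix $A$ whose rows all equal $\boldsymbol{\alpha}$.

```python
# A sketch: matrix powers of a regular chain converge to A = [alpha; alpha].
import numpy as np

P = np.array([[0.2, 0.8],
              [0.6, 0.4]])       # hypothetical regular chain
A = np.linalg.matrix_power(P, 50)
print(A)                         # both rows agree: each row is alpha
alpha = A[0]
print(alpha @ P)                 # alpha P = alpha (invariant)
```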
Definition 1.15. Consider a sequence $\{x_n\}$.
(1) Suppose that $\lim_{n\to\infty} x_n = x$. We say the sequence $\{x_n\}$ converges to $x$.
(2) Suppose that $\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} x_i = x$. Then the sequence $\{x_n\}$ is Cesàro-summable to $x$.
(3) Take any $\gamma \in (0, 1)$ and define
$$u_n = \sum_{i=0}^{n} \binom{n}{i} \gamma^{n-i} (1-\gamma)^i x_i.$$
Suppose that $\lim_{n\to\infty} u_n = x$. Then the sequence $\{x_n\}$ is Euler-summable to $x$.
Facts
- If $\{x_n\}$ converges to $x$, then it is Cesàro-summable and Euler-summable to $x$.
- If $\{x_n\}$ is Cesàro-summable and Euler-summable, then the two sums must be the same. (Kemeny, J. G., Snell, J. L., Finite Markov Chains, Princeton, 1960, p. 12)
Theorem 1.16. Let $P$ be the transition matrix of an irreducible Markov chain with period $d > 1$. Then:
(1) $\{P^n\}$ is Euler-summable for all $\gamma \in (0, 1)$;
(2) $\{P^n\}$ is Cesàro-summable.
Proof. (1)
$$\sum_{i=0}^{n}\binom{n}{i}\gamma^{n-i}(1-\gamma)^i P^i = [\gamma I + (1-\gamma)P]^n,$$
which is the transition matrix of a regular Markov chain and converges to $A$ as $n$ goes to infinity. We also have
$$\boldsymbol{\alpha}[\gamma I + (1-\gamma)P] = \boldsymbol{\alpha} \iff \boldsymbol{\alpha}P = \boldsymbol{\alpha}.$$
(2) We can show that
$$\lim_{n\to\infty}\frac{\sum_{i=0}^{d-1} P^{nd+i}}{d} = B \;\Rightarrow\; \lim_{m\to\infty}\frac{\sum_{i=0}^{m} P^i}{m+1} = B.$$
Now we show an example with $d = 3$. With rows and columns indexed by $C_1, C_2, C_3$, $P$ can be written in block form as
$$P = \begin{pmatrix} 0 & \times & 0 \\ 0 & 0 & \times \\ \times & 0 & 0 \end{pmatrix}$$
and we also have
$$P^2 = \begin{pmatrix} 0 & 0 & \times \\ \times & 0 & 0 \\ 0 & \times & 0 \end{pmatrix}, \qquad
P^3 = \begin{pmatrix} \times & 0 & 0 \\ 0 & \times & 0 \\ 0 & 0 & \times \end{pmatrix}
= \begin{pmatrix} R_1 & 0 & 0 \\ 0 & R_2 & 0 \\ 0 & 0 & R_3 \end{pmatrix},$$
where the $R_i$ are transition matrices of regular Markov chains. Then we have
$$\lim_{n\to\infty} P^{nd} = \lim_{n\to\infty}\begin{pmatrix} R_1^{n} & 0 & 0 \\ 0 & R_2^{n} & 0 \\ 0 & 0 & R_3^{n} \end{pmatrix}
= \begin{pmatrix} A_1 & 0 & 0 \\ 0 & A_2 & 0 \\ 0 & 0 & A_3 \end{pmatrix}.$$
Thus,
$$\lim_{n\to\infty}\frac{P^{nd} + P^{nd+1} + P^{nd+2}}{3} = B.$$
Definition 1.17 (Sample path behavior of a Markov chain). We define the jump times as
$$T_0 = 0, \qquad T_{k+1} = \min\{t > T_k : x(t) \neq x(T_k)\}$$
and the holding times as
$$\tau_k = T_{k+1} - T_k - 1.$$
Define the jump chain as
$$x^J_k = x(T_k).$$
[Figure: a sample path $x(t)$ versus $t$ with jump times $T_0, \ldots, T_5$ marked; in the picture, $\tau_1 = 1$ and $\tau_4 = 3$.]
Theorem 1.18. (1) $\{x^J_k\}$ is a Markov chain with transition probabilities $P^J$, where
$$p^J_{ii} = 0, \qquad p^J_{ij} = \frac{p_{ij}}{1 - p_{ii}}.$$
(2) Given $x^J_n$, $\tau_n$ is geometrically distributed:
$$\Pr\left(\tau_n = l \mid x^J_n = i\right) = p_{ii}^l (1 - p_{ii}).$$
(3) Given $x^J_n$, $\tau_n$ and $x^J_{n+1}$ are independent of each other.
Proof.
$$\Pr\left(x^J_{k+1} = j \text{ and } \tau_k = l \mid x^J_k = i\right) = p_{ij} \cdot p_{ii}^l = \frac{p_{ij}}{1 - p_{ii}} \cdot p_{ii}^l (1 - p_{ii}) = \Pr\left(x^J_{k+1} = j \mid x^J_k = i\right)\Pr\left(\tau_k = l \mid x^J_k = i\right)$$
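A simulation sketch of Theorem 1.18 (the lazy two-state chain below is a made-up example): extract the holding times in state 0 from a long sample path and compare their mean with the geometric prediction $E[\tau \mid i] = p_{ii}/(1 - p_{ii})$.

```python
# A sketch: holding times of a simulated DTMC are geometric, per Theorem 1.18.
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])                    # hypothetical lazy chain

x, path = 0, [0]
for _ in range(200_000):
    x = rng.choice(2, p=P[x])
    path.append(x)
path = np.array(path)

# holding times in state 0: run lengths of consecutive 0's, minus 1
change = np.flatnonzero(np.diff(path) != 0)   # indices where the state jumps
runs = np.diff(np.concatenate([[-1], change]))  # lengths of completed runs
states = path[change]                         # the state of each completed run
tau0 = runs[states == 0] - 1
print(tau0.mean(), P[0, 0] / (1 - P[0, 0]))   # both near 7/3
```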
Geometric distribution
A random variable $z$ is said to be geometrically distributed with parameter $q \in (0, 1)$ if $\Pr(z = t) = q^t(1-q)$. We then have $\Pr(z \geq T) = \sum_{t=T}^{\infty} q^t(1-q) = q^T$ and $\Pr(z \geq s + t \mid z \geq s) = q^t$. We say $z$ is memoryless.
Finite versus countably infinite state Markov chains
Same: $\to$, $\leftrightarrow$, and the hierarchy of communicating classes.
Different: there may be no closed communicating class; even if there is a closed communicating class, it may not have an invariant distribution vector.
A warm-up example
Suppose that $f(t) = \frac{1}{t^\alpha}$ where $\alpha > 0$. Then we have
$$\int_1^{+\infty} f(t)\,dt = \int_1^{+\infty} \frac{1}{t^\alpha}\,dt = \begin{cases} \frac{1}{\alpha-1} & \text{if } \alpha > 1 \\ +\infty & \text{otherwise} \end{cases}$$
We also have
$$\sum_{n=2}^{+\infty} \frac{1}{n^\alpha} < \int_1^{+\infty} f(t)\,dt < \sum_{n=1}^{+\infty} \frac{1}{n^\alpha},$$
which suggests that
$$\sum_{n=1}^{+\infty} \frac{1}{n^\alpha} \begin{cases} < +\infty & \text{if } \alpha > 1 \\ = +\infty & \text{otherwise} \end{cases}$$
Now let $\tau_i = \min\{t \geq 1 : x(t) = i\}$ and $M_i = E[\tau_i \mid x(0) = i]$. We can show that $M_i$ could be either finite or infinite:
Suppose that $\tilde p_n = 1/n^2$ and $\sum_{n=1}^{+\infty} \tilde p_n = c$. Let $p_n = \tilde p_n / c$ be the probability of returning to state $i$ in $n$ steps after starting from state $i$. Then we have
$$M_i = \sum_{n=1}^{+\infty} n p_n = \sum_{n=1}^{+\infty} \frac{1}{cn} = +\infty.$$
Suppose instead that $\tilde p_n = 1/n^3$ and $\sum_{n=1}^{+\infty} \tilde p_n = d$. Let $p_n = \tilde p_n / d$. Then we have
$$M_i = \sum_{n=1}^{+\infty} n p_n = \sum_{n=1}^{+\infty} \frac{1}{dn^2} < +\infty.$$
Definition 1.19. We say state $i$ is (1) transient if $\Pr(\tau_i < +\infty \mid x(0) = i) < 1$; (2) null recurrent if $\Pr(\tau_i < +\infty \mid x(0) = i) = 1$ and $M_i = +\infty$; (3) positive recurrent if $\Pr(\tau_i < +\infty \mid x(0) = i) = 1$ and $M_i < +\infty$.
[Figure: a chain on states $1, 2, 3, 4, 5, \ldots$ with the indicated probability-1 transitions back toward state 1.]
Here is an example of a recurrent state. Define $p_i = \Pr(x(t+1) = i \mid x(t) = 1)$ where $i = 2, 3, \ldots$. Thus, the mean return time is $M_1 = \sum_{n=2}^{+\infty} n p_n$. State 1 is null recurrent if $M_1 = +\infty$ or positive recurrent if $M_1 < +\infty$.
Definition 1.20. Define the probability that the first passage time from state $i$ to state $j$ equals $n$ as
$$f_{ij}(n) = \Pr\left(x(1) \neq j, x(2) \neq j, \ldots, x(n-1) \neq j, x(n) = j \mid x(0) = i\right).$$
State $i$ is transient if and only if $\sum_{n=1}^{+\infty} f_{ii}(n) < 1$, i.e., $\Pr(\tau_i < +\infty \mid x(0) = i) < 1$.
State $i$ is recurrent if and only if $\sum_{n=1}^{+\infty} f_{ii}(n) = 1$, i.e., $\Pr(\tau_i < +\infty \mid x(0) = i) = 1$.
Note that $p_{ij}(0) = \delta_{ij}$ but $f_{ij}(0) = 0$ for all $i$ and $j$. We can compute $p_{ij}(n)$ and $f_{ij}(n)$ for $n > 0$ as follows:
$$p_{ij}(n) = \delta_{n0}\,\delta_{ij} + \sum_{m=0}^{n} f_{ij}(m)\, p_{jj}(n - m)$$
$$f_{ij}(n) = p_{ij}(n) - \delta_{n0}\,\delta_{ij} - \sum_{m=0}^{n-1} f_{ij}(m)\, p_{jj}(n - m)$$
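The recursion above can be run directly: compute $p_{ii}(n)$ by matrix powers and peel off $f_{ii}(n)$ term by term. A sketch for a hypothetical two-state chain:

```python
# A sketch: first-passage probabilities f_00(n) from the recursion above.
import numpy as np

P = np.array([[0.5, 0.5],
              [0.3, 0.7]])       # hypothetical chain
N = 20
p00 = [np.linalg.matrix_power(P, n)[0, 0] for n in range(N + 1)]

f00 = [0.0] * (N + 1)            # f_ij(0) = 0 by convention
for n in range(1, N + 1):
    f00[n] = p00[n] - sum(f00[m] * p00[n - m] for m in range(1, n))

print(sum(f00))   # approaches 1 as N grows: state 0 is recurrent here
```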
Definition 1.21. The (one-sided, or unilateral) Z-transform is defined as
$$X(z) = \mathcal{Z}\{x(n)\} = \sum_{n=0}^{+\infty} x(n)\, z^n.$$
Theorem 1.22. Let $P_{ii}(z)$ and $F_{ii}(z)$ be the Z-transforms of $p_{ii}(n)$ and $f_{ii}(n)$, respectively. We have
$$P_{ii}(z) = \frac{1}{1 - F_{ii}(z)}.$$
Proof.
$$\begin{aligned}
P_{ii}(z) &= p_{ii}(0) + \sum_{n=1}^{+\infty} z^n \sum_{m=1}^{n} f_{ii}(m)\, p_{ii}(n-m) \\
&= 1 + \sum_{m=0}^{+\infty}\sum_{n=m}^{+\infty} f_{ii}(m)\, p_{ii}(n-m)\, z^n && \text{given that } f_{ii}(0) = 0 \\
&= 1 + \sum_{m=0}^{+\infty}\sum_{n=m}^{+\infty} \left[f_{ii}(m)\, z^m\right]\left[p_{ii}(n-m)\, z^{n-m}\right] \\
&= 1 + \left[\sum_{m=0}^{+\infty} f_{ii}(m)\, z^m\right]\left[\sum_{n=0}^{+\infty} p_{ii}(n)\, z^n\right] && \text{let } n \to n - m \\
&= 1 + P_{ii}(z) F_{ii}(z)
\end{aligned}$$
When can you interchange a limit and an integral/summation? The rules below hold for both integrals and summations.
Example 1.23. Let $f_n(x) = 1$ for $x \in [n-1, n]$ and $f_n(x) = 0$ otherwise. Then we have
$$\int_0^{+\infty} f_n(x)\,dx = 1 \;\;\forall n \;\Rightarrow\; \lim_{n\to+\infty}\int_0^{+\infty} f_n(x)\,dx = 1,$$
$$\lim_{n\to+\infty} f_n(x) = 0 \;\;\forall x \;\Rightarrow\; \int_0^{+\infty}\lim_{n\to+\infty} f_n(x)\,dx = 0.$$
Lemma 1.24 (Fatou's Lemma). Suppose that $f_n(x) \geq 0$ for all $x$ and $\lim_{n\to\infty} f_n(x) = f(x)$. Then we have
$$\liminf_{n\to\infty}\int f_n(x)\,dx \geq \int f(x)\,dx.$$
Theorem 1.25 (Monotone Convergence Theorem). If $f_{n-1}(x) \leq f_n(x)$ for all $x$ (i.e., $f_n(x)$ is monotonically increasing in $n$) and $\lim_{n\to+\infty} f_n(x) = f(x)$, then
$$\lim_{n\to+\infty}\int f_n(x)\,dx = \int f(x)\,dx.$$
Theorem 1.26 (Dominated Convergence Theorem). Suppose that $|f_n(x)| \leq g(x)$ for all $n, x$ and $\int g(x)\,dx < +\infty$. Then
$$\lim_{n\to+\infty}\int f_n(x)\,dx = \int \lim_{n\to+\infty} f_n(x)\,dx.$$
Lemma 1.27. State $i$ is transient $\iff \sum_{n=1}^{+\infty} p_{ii}(n) < +\infty$.
Proof. Use Theorem 1.22 and Theorem 1.25:
$$\lim_{z\to 1^-}\sum_{n=0}^{+\infty} p_{ii}(n)\, z^n = \lim_{z\to 1^-}\frac{1}{1 - \sum_{n=0}^{+\infty} f_{ii}(n)\, z^n}$$
$$\sum_{n=0}^{+\infty} p_{ii}(n)\lim_{z\to 1^-} z^n = \frac{1}{1 - \sum_{n=0}^{+\infty} f_{ii}(n)\lim_{z\to 1^-} z^n}$$
$$\sum_{n=0}^{+\infty} p_{ii}(n) = \frac{1}{1 - \sum_{n=0}^{+\infty} f_{ii}(n)}$$
Thus, we have
$$\sum_{n=0}^{+\infty} p_{ii}(n) < +\infty \iff \sum_{n=0}^{+\infty} f_{ii}(n) < 1 \iff \text{state } i \text{ is transient.}$$
Lemma 1.28. State $i$ is recurrent $\iff \sum_{n=0}^{+\infty} p_{ii}(n) = +\infty$.
Proof. Let $E_n = 1$ if $x(n) = i$ and $E_n = 0$ otherwise. Then
$$E\left[\sum_{n=0}^{+\infty} E_n\right] = E\left[\#\text{ of returns to } i\right].$$
Using Theorem 1.25 we have
$$E\left[\sum_{n=0}^{+\infty} E_n\right] = \sum_{n=0}^{+\infty} E[E_n] = \sum_{n=0}^{+\infty} p_{ii}(n).$$
The expected number of returns is infinite if and only if state $i$ is recurrent, which gives the claim.
Lemma 1.29. Recurrence is a class property: if $i \leftrightarrow j$ and $i$ is recurrent, then $j$ is recurrent.

Proof. Suppose $i \leftrightarrow j$ and $j$ is recurrent. Pick $k$ and $l$ with $p_{ij}(k) > 0$ and $p_{ji}(l) > 0$. Then we have
$$p_{ii}(n) \geq p_{ij}(k)\, p_{jj}(n - k - l)\, p_{ji}(l)$$
$$\Rightarrow \sum_{n=0}^{+\infty} p_{ii}(n) \geq p_{ij}(k)\left[\sum_{n=0}^{+\infty} p_{jj}(n - k - l)\right] p_{ji}(l) = +\infty.$$
Theorem 1.30 (a special case of Blackwell's Renewal Theorem, for an aperiodic Markov chain).
$$\lim_{n\to+\infty} p_{ii}(n) = \frac{1}{\sum_{n=0}^{+\infty} n f_{ii}(n)} = \frac{1}{M_i}$$
Lemma 1.31. For a regular Markov chain, we have
$$\lim_{n\to+\infty} p_{ji}(n) = \frac{1}{\sum_{n=0}^{+\infty} n f_{ii}(n)} = \frac{1}{M_i}.$$
Lemma 1.32. Positive recurrence is a class property.

Proof. Suppose that $j$ is a positive recurrent state and $i \leftrightarrow j$. Using Lemma 1.29, we know that $i$ is recurrent. We can find $p_{ij}(k) > 0$ for some $k$ and $p_{ji}(l) > 0$ for some $l$. Then we have
$$p_{ii}(n) \geq p_{ij}(k)\, p_{jj}(n-k-l)\, p_{ji}(l)$$
$$\Rightarrow \frac{1}{M_i} = \lim_{n\to+\infty} p_{ii}(n) \geq p_{ij}(k)\left[\lim_{n\to+\infty} p_{jj}(n-k-l)\right] p_{ji}(l) = \frac{1}{M_j}\, p_{ij}(k)\, p_{ji}(l) > 0.$$
Theorem 1.33. For a regular, positive recurrent Markov chain, let $\lim_{n\to+\infty} p_{ii}(n) = \pi_i$, where $\pi_i > 0$. Then $\pi_j = \sum_i \pi_i p_{ij}$ and $\sum_i \pi_i = 1$, and these results are unique.
Proof. Step 1. Given that $p_{ij}(n+1) = \sum_k p_{ik}(n)\, p_{kj}$, we take limits on both sides and use Fatou's Lemma (Lemma 1.24):
$$\pi_j = \lim_{n\to+\infty} p_{ij}(n+1) = \lim_{n\to+\infty}\sum_k p_{ik}(n)\, p_{kj} \geq \sum_k \lim_{n\to+\infty} p_{ik}(n)\, p_{kj} = \sum_k \pi_k\, p_{kj}.$$
Step 2.
$$\pi_j \geq \sum_k \pi_k\, p_{kj} \geq \sum_k\left(\sum_l \pi_l\, p_{lk}\right) p_{kj} = \sum_l \pi_l \sum_k p_{lk}\, p_{kj} = \sum_l \pi_l\, p_{lj}(2)$$
$$\Rightarrow \pi_j \geq \sum_k \pi_k\, p_{kj} \geq \sum_k \pi_k\, p_{kj}(2) \geq \cdots \geq \sum_k \pi_k\, p_{kj}(n) \geq \cdots$$
Suppose that $\pi_j > \sum_k \pi_k\, p_{kj}(n)$ for some $n$. Then it leads to a contradiction:
$$\sum_j \pi_j > \sum_j\sum_k \pi_k\, p_{kj}(n) = \sum_k \pi_k \sum_j p_{kj}(n) = \sum_k \pi_k.$$
Hence
$$\pi_j = \sum_k \pi_k\, p_{kj}(n), \quad \forall n.$$
Step 3.
$$\pi_j = \sum_k \pi_k \lim_{n\to+\infty} p_{kj}(n) = \pi_j \sum_k \pi_k \;\Rightarrow\; \sum_k \pi_k = 1.$$
Definition 1.34. Let $V(t)$ be a stochastic process with $E|V(t)| < +\infty$ and let $\mathcal{F}_t = \{V(0), V(1), \ldots, V(t)\}$.
Martingale: $E[V(t+1) \mid \mathcal{F}_t] = V(t)$.
Supermartingale: $E[V(t+1) \mid \mathcal{F}_t] \leq V(t)$.
Submartingale: $E[V(t+1) \mid \mathcal{F}_t] \geq V(t)$.
Theorem 1.35 (Supermartingale Convergence Theorem). Suppose $V(t) \geq 0$ is a supermartingale. Then
$$\lim_{t\to+\infty} V(t) = V \quad \text{(a random variable), almost surely.}$$
Theorem 1.36 (Submartingale Convergence Theorem). Suppose (1) $V(t) \geq 0$ is a submartingale and (2) $\limsup_t E|V(t)| < +\infty$. Then
$$\lim_{t\to+\infty} V(t) = V \quad \text{(a random variable)}, \qquad E|V| < +\infty.$$
Theorem 1.37 (L2 Super/Sub/Martingale Convergence Theorem). Suppose (1) $V(t) \geq 0$ is a super/sub/martingale and (2) $E[V^2(t)] \leq C$ for all $t$. Then
$$\lim_{t\to+\infty} V(t) = V \quad \text{(a random variable)}, \qquad E[V(t)] \to E[V], \qquad E|V(t) - V| \to 0.$$
Lemma 1.38 (Borel–Cantelli Lemma). Suppose that $E(t)$ represents some event and we have
$$\sum_{t=0}^{+\infty} \Pr[E(t)] < +\infty.$$
Then $E(t)$ happens only finitely many times with probability 1.
Compare this with a Markov chain: suppose $x(0) = i$, $\Pr[x(t) = i \mid x(0) = i] = p_{ii}(t)$, and $\sum_{t=0}^{+\infty} p_{ii}(t) < +\infty$. Then $x(t) = i$ holds for only finitely many $t$.
Definition 1.39. A Lyapunov function for an autonomous dynamical system $\dot y = g(y)$, where $g: \mathbb{R}^n \to \mathbb{R}^n$, with an equilibrium point at $y = 0$, is a scalar function $V: \mathbb{R}^n \to \mathbb{R}$ that is continuous, has continuous first derivatives, is strictly positive for $y \neq 0$, and for which the time derivative $\dot V = \nabla V \cdot g$ is non-positive.
Example 1.40. Consider the Lyapunov function
$$V(x_1(t), x_2(t)) = \frac{1}{2}\left(x_1^2(t) + x_2^2(t)\right).$$
(a) Suppose that an autonomous dynamical system is defined as follows:
$$\frac{dx_1(t)}{dt} = x_2(t), \qquad \frac{dx_2(t)}{dt} = -x_1(t).$$
We have
$$\frac{dV}{dt} = x_1(t)\frac{dx_1(t)}{dt} + x_2(t)\frac{dx_2(t)}{dt} = 0.$$
Hence $x_1^2(t) + x_2^2(t) = x_1^2(0) + x_2^2(0)$.
(b) Suppose that an autonomous dynamical system is defined as follows:
$$\frac{dx_1(t)}{dt} = x_2(t) - x_1(t)\left(x_1^2(t) + x_2^2(t)\right), \qquad \frac{dx_2(t)}{dt} = -x_1(t) - x_2(t)\left(x_1^2(t) + x_2^2(t)\right).$$
We have $\dot V = -(x_1^2 + x_2^2)^2 = -4V^2$. Hence
$$\int_0^t \frac{dV}{V^2} = -\int_0^t 4\,dt \;\Rightarrow\; V(t) = \frac{1}{\frac{1}{V(0)} + 4t} \;\Rightarrow\; V(t) \to 0 \text{ as } t \to +\infty.$$
Definition 1.41. Let $\mathcal{F}_t$ denote the history up to time $t$. $\tau$ is a stopping time if the event $\{\tau \leq t\}$ is known at time $t$ (i.e., is determined by $\mathcal{F}_t$).
Theorem 1.42. Suppose $\{V_t\}$ is a martingale with respect to $\{\mathcal{F}_t\}$. Let $\tau$ be a stopping time of $\{\mathcal{F}_t\}$. Then $\{V_{\min(\tau, t)}, \mathcal{F}_t\}$ is a martingale.
Lemma 1.43. Suppose $V(\cdot)$ is a Lyapunov function and
$$E[V(x(t+1)) \mid x(t)] \leq V(x(t)) - 1(x(t) \neq 0).$$
Let $\tau_{i0}$ be the first time the chain hits state 0 when starting from state $i$. Then
$$E[\tau_{i0}] \leq V(i).$$

Proof. Suppose that $x(0) = i$. We have
$$E\left[E[V(x(t+1)) \mid x(t)]\right] \leq E[V(x(t))] - E[1(x(t) \neq 0)]$$
$$\Rightarrow E[1(x(t) \neq 0)] \leq E[V(x(t))] - E[V(x(t+1))].$$
Summing over $t$, the right side telescopes, and since $V \geq 0$,
$$E[\tau_{i0}] = \sum_{t=0}^{+\infty} E[1(x(t) \neq 0)] \leq V(x(0)) = V(i).$$
Lemma 1.44. Define $V(\cdot)$ as in Lemma 1.43. Suppose $\sum_i p_{0i} V(i) < +\infty$. Then the Markov chain is positive recurrent.

Proof.
$$E[\tau_{00}] = p_{00} + \sum_{i \neq 0} p_{0i}\left(1 + E[\tau_{i0}]\right) \leq 1 + \sum_{i \neq 0} p_{0i} V(i) < +\infty.$$
Theorem 1.45 (Foster's Theorem). Let $X = (X_n)_{n \in \mathbb{Z}_+}$ be a DTMC on a countable, irreducible state space $\mathcal{X}$. If $L: \mathcal{X} \to \mathbb{R}_+$ is a function with $E[L(X_0)] < \infty$ and such that, for some $K > k \geq 0$ and some $\epsilon > 0$, the set $\mathcal{X}(k) = \{x \in \mathcal{X} : L(x) \leq k\}$ is finite, and
$$E[L(X_n) \mid X_{n-1}] < K \quad \text{if } L(X_{n-1}) \leq k,$$
$$E[L(X_n) - L(X_{n-1}) \mid X_{n-1}] < -\epsilon \quad \text{if } L(X_{n-1}) > k,$$
then $X$ is positive recurrent.
2 Continuous-Time Markov Chain (CTMC)
Definition 2.1. A continuous-time Markov chain is a stochastic process satisfying the Markov property
$$\Pr(x(t) = j \mid x(u) \text{ for } u \leq s) = \Pr(x(t) = j \mid x(s)), \quad s \leq t.$$
Hence we write
$$\Pr(x(t) = j \mid x(s) = i) = p_{ij}(t, s).$$
For time-homogeneous cases, we have
$$p_{ij}(t, s) = p_{ij}(t - s).$$
Theorem 2.2 (Chapman–Kolmogorov equation).
$$p_{ij}(t, s) = \sum_k p_{ik}(t, u)\, p_{kj}(u, s), \quad \forall u \in (s, t).$$
For time-homogeneous cases, we have $P(s + t) = P(s)P(t)$ (the semigroup property), where $P(t) = [p_{ij}(t)]$.
Definition 2.3. The infinitesimal generator of a CTMC is defined as the one-sided derivative
$$Q = \lim_{t\to 0^+}\frac{P(t) - P(0)}{t} = \lim_{t\to 0^+}\frac{P(t) - I}{t}.$$
$Q$ is a real matrix independent of $t$.
Properties of $Q$ (for $i \neq j$):
$$q_{ii} = \lim_{t\to 0^+}\frac{p_{ii}(t) - 1}{t} \leq 0, \qquad q_{ij} = \lim_{t\to 0^+}\frac{p_{ij}(t) - 0}{t} \geq 0,$$
$$\text{row sums of } Q = \lim_{t\to 0^+}\frac{\sum_k p_{ik}(t) - 1}{t} = 0.$$
Using the Chapman–Kolmogorov equation, differentiating $P(s + t) = P(s)P(t)$ gives
$$\frac{d}{ds}P(s + t) = P'(s)P(t), \qquad \frac{d}{dt}P(s + t) = P(s)P'(t).$$
Setting $s = 0$ in the first yields $P'(t) = QP(t)$ (the backward differential equation); setting $t = 0$ in the second yields $P'(s) = P(s)Q$ (the forward differential equation). The solution of these equations is
$$P(t) = e^{tQ} = I + tQ + \frac{1}{2!}t^2 Q^2 + \frac{1}{3!}t^3 Q^3 + \cdots$$
Combined with the initial state, we have
$$\boldsymbol{\pi}(t) = \boldsymbol{\pi}(0)P(t)$$
$$\boldsymbol{\pi}'(t) = \boldsymbol{\pi}(0)P(t)Q = \boldsymbol{\pi}(t)Q$$
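Numerically, $\boldsymbol{\pi}(t) = \boldsymbol{\pi}(0)e^{tQ}$ can be evaluated with a matrix exponential. A sketch with a hypothetical 3-state generator (rows sum to zero, off-diagonal entries non-negative), using scipy.linalg.expm:

```python
# A sketch: transient distribution pi(t) = pi(0) expm(t Q).
import numpy as np
from scipy.linalg import expm

Q = np.array([[-2.0,  2.0,  0.0],     # hypothetical generator matrix
              [ 1.0, -3.0,  2.0],
              [ 0.0,  4.0, -4.0]])
pi0 = np.array([1.0, 0.0, 0.0])       # start in state 0

for t in (0.1, 1.0, 10.0):
    print(t, pi0 @ expm(t * Q))       # approaches the steady state as t grows
```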
Interpretation of $\boldsymbol{\pi}'(t) = \boldsymbol{\pi}(t)Q$: for $j \neq i$ and small $t$, we have (1) $q_{ij} \approx p_{ij}(t)/t$ = rate of probability flow from $i$ into $j$, and (2) $-q_{ii} = \sum_{j \neq i} q_{ij}$ = total rate of probability flow out of $i$. Thus,
$$\pi_i'(t) = \underbrace{\pi_i(t)\, q_{ii}}_{-\,\text{(total rate of flow out of } i)} + \sum_{j \neq i}\underbrace{\pi_j(t)\, q_{ji}}_{\text{rate of flow from } j \text{ into } i}.$$
Definition 2.4. $\boldsymbol{\pi}(t)$ is a steady state if $\boldsymbol{\pi}'(t) = 0 \iff \boldsymbol{\pi}(t)Q = 0$.
DTMC and CTMC
Same: $\to$, $\leftrightarrow$, closed communicating classes, transience.
Different: aperiodicity (periodicity plays no role in continuous time).
Note that for any CTMC we can sample at times $\{t, 2t, \ldots, kt, \ldots\}$ to get a DTMC:
$$P(kt) = P(t)P(t)\cdots = P^k(t).$$
Definition 2.5. Let $x(0) = i$. The leave time $T'$ and return time $T$ are defined as
$$T' = \min\{t \geq 0 : x(t) \neq i\},$$
$$T = \min\{t \geq T' : x(t) = i\}.$$
Note that the definition of the return time for a CTMC differs from the DTMC case because one has to first leave the state before returning.
[Figure: a CTMC sample path $x(t)$ versus $t$ with jump times $T_0, \ldots, T_5$ marked.]
Definition 2.6. We say state $i$ is (1) transient if $\Pr(T < +\infty \mid x(0) = i) < 1$; (2) null recurrent if $\Pr(T < +\infty \mid x(0) = i) = 1$ and $E[T] = +\infty$; (3) positive recurrent if $\Pr(T < +\infty \mid x(0) = i) = 1$ and $E[T] < +\infty$.
Sample Path Behavior of a CTMC
Assume that the sample paths (1) have only isolated jumps (so the next jump is meaningful) and (2) are right continuous.
Definition 2.7. The jump times are defined as
$$T_n = \min\{t > T_{n-1} : x(t) \neq x(T_{n-1})\}.$$
The holding times are defined as
$$\tau_n = T_{n+1} - T_n.$$
Theorem 2.8. Sample the Markov chain right after its jump times: $x(T_n)$ (with $x(T_{n+1}) \neq x(T_n)$). We have:
1. $\{x(T_n) : n \geq 0\}$ is a DTMC.
2. $\tau_n$ is exponentially distributed, $\Pr(\tau_n \leq t) = 1 - e^{-\lambda t}$ with $\lambda = -q_{ii}$, so $E[\tau_n] = -1/q_{ii}$ (supposing that $x(T_n) = i$).
3. $\tau_n$ and $x(T_{n+1})$ are independent given $x(T_n)$.
Proof. We can sample the Markov chain every $h$ seconds. Denote the sampled sequence as $x(kh)$, $k = 0, 1, 2, \ldots$, and the sampled states right after jumps as $\{x^{h,J}_n\}$. As $h \to 0^+$, we have $(x^{h,J}_0, x^{h,J}_1, x^{h,J}_2, \ldots) \to (x(T_0), x(T_1), x(T_2), \ldots)$. Then, as $h \to 0^+$,
$$p^{h,J}_{ij} = \frac{p_{ij}(h)}{1 - p_{ii}(h)} = \frac{h\cdot\frac{p_{ij}(h) - 0}{h}}{h\sum_{k\neq i}\frac{p_{ik}(h) - 0}{h}} = \frac{h q_{ij} + o(h)}{\sum_{k\neq i}\left[h q_{ik} + o(h)\right]} \to \frac{q_{ij}}{\sum_{k\neq i} q_{ik}}.$$
Suppose that $x(T_n) = i$. Given that $P(t) = e^{tQ}$, we have $\tau_n \sim \exp(-q_{ii})$, which is memoryless.
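Theorem 2.8 also gives the standard way to simulate a CTMC: hold in state $i$ for an $\exp(-q_{ii})$ time, then jump according to the jump chain. A sketch (the generator is the same hypothetical one used above):

```python
# A simulation sketch: jump-chain + exponential holding times.
import numpy as np

rng = np.random.default_rng(1)
Q = np.array([[-2.0,  2.0,  0.0],
              [ 1.0, -3.0,  2.0],
              [ 0.0,  4.0, -4.0]])      # hypothetical generator

def simulate(Q, x0, t_end):
    t, x, events = 0.0, x0, []
    while t < t_end:
        rate = -Q[x, x]
        hold = rng.exponential(1.0 / rate)       # holding time ~ Exp(-q_ii)
        events.append((x, hold))
        jump_p = Q[x].clip(min=0.0) / rate       # q_ij / (-q_ii), 0 on diagonal
        x = rng.choice(len(Q), p=jump_p)
        t += hold
    return events

occ = np.zeros(3)
for x, hold in simulate(Q, 0, 1000.0):
    occ[x] += hold                               # time spent in each state
print(occ / occ.sum())   # estimates pi, the fraction of time per state
```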
Definition 2.9 (Fraction of time spent in state $i$). Let $\tau^{(i)}_k$ and $T^{(i)}_k$ denote the sampled holding times and jump times for state $i$, respectively. Then
$$\pi_i = \lim_{t\to+\infty}\frac{1}{t}\int_0^t 1(x(s) = i)\,ds = \lim_{K\to+\infty}\frac{\sum_{k=1}^K \tau^{(i)}_k}{\sum_{k=1}^K \left(T^{(i)}_{k+1} - T^{(i)}_k\right)} = \frac{1}{|q_{ii}|\, M_{ii}},$$
where
$$M_{ii} = E\left[T^{(i)}_{k+1} - T^{(i)}_k\right] = \frac{1}{|q_{ii}|\, \pi_i}.$$
Note that for a positive recurrent Markov chain we have $M_{ii} < +\infty$ and $\boldsymbol{\pi}Q = 0$; for a null recurrent Markov chain we have $M_{ii} = +\infty$. What does $\boldsymbol{\pi}Q = 0$ mean?
$$0 = \sum_i \pi_i q_{ij} = \sum_{i\neq j}\pi_i q_{ij} + \pi_j q_{jj} = \sum_{i\neq j}\pi_i q_{ij} - \pi_j\sum_{i\neq j} q_{ji}.$$
So the influx and outflux of probability rates at $j$ are equal.
3 Poisson Process
[Figure: the Poisson process as a pure-birth chain $0 \to 1 \to 2 \to \cdots$ with rate $\lambda$ on every transition.]
Consider a CTMC with the infinitesimal generator
$$Q = \begin{pmatrix} -\lambda & \lambda & & \\ & -\lambda & \lambda & \\ & & -\lambda & \lambda \\ & & & \ddots \end{pmatrix}$$
Suppose that $N(0) = 0$ ($N(t)$ and $x(t)$ are equivalent, but for a Poisson process we prefer to use $N(t)$). Given $\boldsymbol{\pi}'(t) = \boldsymbol{\pi}(t)Q$, we have
$$\frac{d\pi_0(t)}{dt} = \pi_0(t)\, q_{00} = -\lambda\pi_0(t)$$
$$\frac{d\pi_1(t)}{dt} = \pi_0(t)\, q_{01} + \pi_1(t)\, q_{11} = \lambda\pi_0(t) - \lambda\pi_1(t)$$
$$\cdots$$
$$\frac{d\pi_n(t)}{dt} = \pi_{n-1}(t)\, q_{n-1,n} + \pi_n(t)\, q_{n,n} = \lambda\pi_{n-1}(t) - \lambda\pi_n(t)$$
Now we solve the above equations with $\boldsymbol{\pi}(0) = (1, 0, 0, \ldots)$:
$$\pi_n(t) = \frac{(\lambda t)^n}{n!}e^{-\lambda t} \;\Rightarrow\; \boldsymbol{\pi}(t) = \left(e^{-\lambda t},\, \lambda t e^{-\lambda t},\, \ldots,\, \frac{(\lambda t)^n}{n!}e^{-\lambda t},\, \ldots\right)$$
Another way to solve for $\boldsymbol{\pi}(t)$ is to use the z-transform $\Pi(z, t) = \sum_{n=0}^{+\infty} z^n \pi_n(t)$:
$$\frac{\partial\Pi(z, t)}{\partial t} = -\lambda\pi_0(t) + \lambda\sum_{n\geq 1} z^n\left[\pi_{n-1}(t) - \pi_n(t)\right] = (z\lambda - \lambda)\Pi(z, t)$$
$$\Rightarrow \Pi(z, t) = e^{(z\lambda - \lambda)t} = e^{-\lambda t} + z\lambda t e^{-\lambda t} + \cdots + z^n\frac{(\lambda t)^n}{n!}e^{-\lambda t} + \cdots$$
Now we have
$$\Pr(N(t) - N(s) = k) = \Pr(N(t - s) = k) = e^{-\lambda(t-s)}\frac{[\lambda(t-s)]^k}{k!}.$$
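A quick simulation check of this increment law (the rate and window below are arbitrary choices): arrival gaps are i.i.d. $\exp(\lambda)$, so counts in a window of length $t - s$ should average $\lambda(t - s)$.

```python
# A sketch: Poisson increments from exponential interarrival gaps.
import numpy as np

rng = np.random.default_rng(2)
lam, t0, t1, reps = 3.0, 2.0, 5.0, 20_000
counts = []
for _ in range(reps):
    arrivals = np.cumsum(rng.exponential(1.0 / lam, size=100))  # arrival times
    counts.append(np.sum((arrivals > t0) & (arrivals <= t1)))
print(np.mean(counts), lam * (t1 - t0))   # both near 9
```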
Theorem 3.1 (Superposition of Poisson processes). If $\{N_1(t) : t \geq 0\}$ and $\{N_2(t) : t \geq 0\}$ are two independent Poisson processes with respective rates $\lambda_1$ and $\lambda_2$, then $\{N_1(t) + N_2(t) : t \geq 0\}$ is a Poisson process with rate $\lambda_1 + \lambda_2$. The Poisson process $\{N_1(t) + N_2(t) : t \geq 0\}$ is called the superposition of $\{N_1(t) : t \geq 0\}$ and $\{N_2(t) : t \geq 0\}$.
Theorem 3.2 (Decomposition of a Poisson process). Consider a Poisson process $\{N(t) : t \geq 0\}$ with rate $\lambda$. Suppose that each time an event occurs it is classified as either a type I event, with probability $p$, or a type II event, with probability $1 - p$. Let $N_1(t)$ and $N_2(t)$ denote the numbers of type I and type II events occurring in $[0, t]$, respectively; note that $N(t) = N_1(t) + N_2(t)$. Then $\{N_1(t) : t \geq 0\}$ and $\{N_2(t) : t \geq 0\}$ are independent Poisson processes with respective rates $\lambda_1 = \lambda p$ and $\lambda_2 = \lambda(1 - p)$.
4 Queuing System
[Figure: a birth-death chain on states $0, 1, 2, \ldots$ with birth rates $\lambda_0, \lambda_1, \ldots$ and death rates $\mu_1, \mu_2, \ldots$.]
Definition 4.1. In a birth-death process, let $x(t)$ denote the population, $\lambda_i$ the birth rate, and $\mu_i$ the death rate; the rates may vary with the state. Then we have
$$Q = \begin{pmatrix}
-\lambda_0 & \lambda_0 & & \\
\mu_1 & -(\lambda_1 + \mu_1) & \lambda_1 & \\
 & \mu_2 & -(\lambda_2 + \mu_2) & \lambda_2 \\
 & & \ddots & \ddots
\end{pmatrix}$$
How do we solve for $\boldsymbol{\pi}$, the steady-state probability distribution? One way is to fix $\pi_0$ and solve for $\boldsymbol{\pi}$ recursively. But there is a simpler way. Recall that in steady state, the probability flows between the two sets of any partition of the state space are in balance. Cutting between states $i$ and $i+1$ gives $\mu_{i+1}\pi_{i+1} = \lambda_i\pi_i$, and hence
$$\pi_n = \pi_0\prod_{i=0}^{n-1}\frac{\lambda_i}{\mu_{i+1}}, \quad n \geq 1.$$
Given that $\sum_{i=0}^{+\infty}\pi_i = 1$, we have
$$\pi_0 = \frac{1}{1 + \sum_{n=1}^{+\infty}\prod_{i=0}^{n-1}\frac{\lambda_i}{\mu_{i+1}}}, \qquad \pi_n = \frac{\prod_{i=0}^{n-1}\frac{\lambda_i}{\mu_{i+1}}}{1 + \sum_{n=1}^{+\infty}\prod_{i=0}^{n-1}\frac{\lambda_i}{\mu_{i+1}}}.$$
The CTMC is positive recurrent if and only if $\sum_{n=1}^{+\infty}\prod_{i=0}^{n-1}\frac{\lambda_i}{\mu_{i+1}} < +\infty$.
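The product-form solution truncates naturally for numerical work when the tail of the sum is negligible. A sketch with hypothetical constant rates (which reduces to the M/M/1 geometric distribution discussed next):

```python
# A sketch: truncated birth-death stationary distribution.
import numpy as np

n_max = 200
lam = np.full(n_max, 1.0)    # birth rates lambda_0, ..., lambda_{n_max - 1}
mu  = np.full(n_max, 2.0)    # death rates mu_1, ..., mu_{n_max}

w = np.concatenate([[1.0], np.cumprod(lam / mu)])  # prod_{i<n} lambda_i/mu_{i+1}
pi = w / w.sum()
print(pi[:5])   # for constant rates this is geometric: (1 - rho) rho^n
```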
Definition 4.2. An M/M/1 queue represents the queue length in a system having a single server (the "1"), where arrivals are determined by a Poisson process $\mathrm{Poiss}(\lambda)$ (the first "M") and job service times have an exponential distribution $\exp(\mu)$ (the second "M").
From the birth-death process, we have $\pi_i = \rho^i\pi_0$, where $\rho = \lambda/\mu$ is the utilization of the server. To ensure the system has a steady state, we need $\rho < 1$. Note that
$$\rho = \frac{1/\mu}{1/\lambda} = \frac{\text{mean service time}}{\text{mean interarrival time}}.$$
We also have
$$\pi_0 = \frac{1}{\sum_{i=0}^{+\infty}\rho^i} = 1 - \rho = \Pr(\text{the server is idle}), \qquad \rho = \Pr(\text{the server is busy}).$$
Let $L$ denote the mean number of customers/items in the system. Then
$$L = \sum_{i=0}^{+\infty} i\,\pi_i = \sum_{i=0}^{+\infty} i\,\rho^i(1-\rho) = \rho(1-\rho)\frac{d}{d\rho}\sum_{i=1}^{+\infty}\rho^i = \frac{\rho}{1-\rho}.$$
Theorem 4.3 (Little's Theorem). Let $W$ denote the mean delay. Then $L = \lambda W$.

Proof. Let
$$\alpha(t) = \#\text{ of arrivals in } [0, t], \qquad \delta(t) = \#\text{ of departures in } [0, t],$$
$$A(t) = \int_0^t\left(\alpha(s) - \delta(s)\right)ds = \text{customer/item-minutes accumulated by time } t.$$
Then, as $t$ goes to infinity, we have
$$L = \frac{A(t)}{t} = \frac{A(t)}{\alpha(t)}\cdot\frac{\alpha(t)}{t} = W\lambda.$$
Little's Theorem can be applied to model either the wait for service or the mean total time spent. Denote the mean time waiting for service as $W_s$ and the mean total time spent as $W$. We have $W_s = W - \frac{1}{\mu}$. Therefore, $L_q = \lambda W_s = \frac{\rho}{1-\rho} - \rho$, where $L_q$ is the mean number of customers in the queue (not yet receiving service).
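Putting the M/M/1 formulas together with Little's Theorem (the rates below are hypothetical):

```python
# M/M/1 bookkeeping for hypothetical rates, using the formulas above.
lam, mu = 3.0, 4.0
rho = lam / mu            # utilization; we need rho < 1
L   = rho / (1 - rho)     # mean number in system
W   = L / lam             # mean total time, by Little's Theorem
W_s = W - 1 / mu          # mean wait before service begins
L_q = lam * W_s           # mean queue length = rho/(1-rho) - rho
print(rho, L, W, W_s, L_q)   # 0.75, 3.0, 1.0, 0.75, 2.25
```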
Example 4.4 (M/M/c Queue).
[Figure: a birth-death chain with arrival rate $\lambda$ everywhere and service rates $\mu, 2\mu, \ldots, c\mu$, constant at $c\mu$ beyond state $c$.]
The system has a steady state if and only if $\rho < c$. We also have
$$\pi_i = \frac{\rho^i}{i!}\pi_0 \quad\text{if } i \leq c, \qquad \pi_i = \frac{\rho^i}{c!}\cdot\frac{1}{c^{i-c}}\pi_0 \quad\text{if } i > c.$$
Example 4.5 (M/M/∞ Queue). In this system, we have $W = \frac{1}{\mu}$ and $L = \frac{\lambda}{\mu}$.
[Figure: a birth-death chain with arrival rate $\lambda$ everywhere and service rate $n\mu$ in state $n$.]
Example 4.6 (M/M/c/c Queue: Erlang Model of a Telephone Exchange). The last "c" represents the capacity of the system.
[Figure: a birth-death chain truncated at state $c$, with arrival rate $\lambda$ and service rates $\mu, 2\mu, \ldots, c\mu$.]
Let $\lambda$, $\frac{1}{\mu}$, and $c$ be the arrival rate, the mean length of a phone call, and the number of lines, respectively. Then the probability that the system is busy, $\pi_c$, is
$$\pi_c = \frac{\rho^c}{c!}\pi_0.$$
Given that $\sum_{i=0}^{c}\pi_i = 1$, we have
$$\pi_0 = \left(\sum_{i=0}^{c}\frac{\rho^i}{i!}\right)^{-1}, \qquad \pi_c = \frac{\rho^c/c!}{\sum_{i=0}^{c}\rho^i/i!}.$$
$\pi_c$ is called the Erlang B formula or Erlang loss formula.
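For numerical evaluation, $\rho^c/c!$ overflows quickly; the well-known stable recursion $B(0) = 1$, $B(c) = \rho B(c-1)/(c + \rho B(c-1))$ computes the same quantity. A sketch:

```python
# Erlang B (loss) formula via its standard numerically stable recursion.
def erlang_b(rho: float, c: int) -> float:
    b = 1.0                           # B(0) = 1
    for k in range(1, c + 1):
        b = rho * b / (k + rho * b)   # B(k) from B(k-1)
    return b

# e.g. c = 10 lines, offered load rho = lambda/mu = 7 Erlangs
print(erlang_b(7.0, 10))   # blocking probability pi_c, about 0.079
```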
Arriving Customer's Viewpoint
Let $\pi^a_i(n) = \Pr(\text{the } n\text{-th arriving customer sees } i \text{ customers upon arrival})$ and $\pi^a_i = \lim_{n\to+\infty}\pi^a_i(n)$.
Example 4.7 (Birth-death process). Let $N(t)$ be the number of customers in the system at time $t$, and assume there is a steady state. Then
$$\pi^a_i = \Pr(N(t) = i \mid \text{an arrival in } (t, t+h)) = \frac{\Pr(\text{arrival in } (t, t+h) \mid N(t) = i)\Pr(N(t) = i)}{\sum_{j=0}^{+\infty}\Pr(\text{arrival in } (t, t+h) \mid N(t) = j)\Pr(N(t) = j)} = \frac{\lambda_i h\,\pi_i}{\sum_{j=0}^{+\infty}\lambda_j h\,\pi_j} = \frac{\lambda_i\pi_i}{\sum_{j=0}^{+\infty}\lambda_j\pi_j}.$$
If $\lambda_i = \lambda$ for all $i$, we have $\pi^a_i = \pi_i$.
Theorem 4.8 (PASTA property: Poisson arrivals see time averages). If a Markov chain has only unit jumps and a steady state exists, then, asymptotically, the arriving customers' viewpoint is equivalent to the departing customers' viewpoint.
Example 4.9 (M/G/1). "G" refers to a general distribution $B(t)$ of the service time. Let
$$X_n = \text{the number of customers in the system just after the } n\text{-th service completion},$$
$$A_n = \text{the number of customers who arrive between the } n\text{-th and } (n+1)\text{-th service completions}.$$
Then $X_{n+1} = X_n + A_{n+1} - 1$ if $X_n \geq 1$; otherwise, $X_{n+1} = A_{n+1}$. This is called an embedded Markov chain. Its transition probabilities are built from
$$k_a = \Pr(A = a) = \int_0^{+\infty}\frac{(\lambda t)^a e^{-\lambda t}}{a!}\,dB(t).$$
Then we have
$$P = \begin{pmatrix}
k_0 & k_1 & k_2 & k_3 & \cdots \\
k_0 & k_1 & k_2 & k_3 & \cdots \\
 & k_0 & k_1 & k_2 & \cdots \\
 & & k_0 & k_1 & \cdots \\
 & & & \ddots &
\end{pmatrix}$$
We can rewrite the recursion as $X_{n+1} = X_n + A_{n+1} - 1(X_n > 0)$. Let $n \to +\infty$ and assume the steady state exists. Taking expectations,
$$E[X] = E[X] + E[A] - E[1(X > 0)] \;\Rightarrow\; E[A] = E[1(X > 0)].$$
Squaring the recursion and taking expectations (using $X \cdot 1(X > 0) = X$, $1(X > 0)^2 = 1(X > 0)$, and the independence of $A_{n+1}$ from $X_n$),
$$E[X^2] = E[X^2] + E[A^2] + E[1(X > 0)] + 2E[X]E[A] - 2E[X] - 2E[A]\,E[1(X > 0)]$$
$$\Rightarrow 0 = E[A^2] + E[A] + 2E[X]E[A] - 2E[X] - 2E[A]^2.$$
Given that
$$E[X] = L, \qquad E[A] = \int_0^{+\infty} \lambda t\,dB(t) = \frac{\lambda}{\mu} = \rho, \qquad E[A^2] = \int_0^{+\infty} \left(\lambda^2 t^2 + \lambda t\right)dB(t) = \rho^2 + \rho + \lambda^2\,\mathrm{Var}_B$$
(given a service time $t$, $A$ is Poisson with mean $\lambda t$), solving for $L$ yields the Pollaczek–Khinchine formula:
$$L = \rho + \frac{\rho^2 + \lambda^2\,\mathrm{Var}_B}{2(1 - \rho)}.$$
This shows that it is the variance of the service time that causes delay. If the service time is deterministic, $\frac{1}{\mu}$, then $\mathrm{Var}_B = 0$ and we get the least delay:
$$L = \rho + \frac{\rho^2}{2(1 - \rho)}.$$
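A small numeric check of the formula (hypothetical rates): with exponential service ($\mathrm{Var}_B = 1/\mu^2$) it recovers the M/M/1 value $\rho/(1-\rho)$, while deterministic service ($\mathrm{Var}_B = 0$) gives the minimum.

```python
# Pollaczek-Khinchine mean number in system for two service-time variances.
lam, mu = 3.0, 4.0
rho = lam / mu

def pk_L(var_b: float) -> float:
    return rho + (rho**2 + lam**2 * var_b) / (2 * (1 - rho))

print(pk_L(1 / mu**2))   # exponential service: rho/(1-rho) = 3.0
print(pk_L(0.0))         # deterministic service: the minimum, 1.875
```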
Example 4.10 (G/M/1). "G" refers to a general distribution $A(t)$ of the time between two arrivals. Let
$$X_n = \text{the number of customers in the system just prior to the } n\text{-th arrival},$$
$$B_n = \text{the number of customers who are served during the } n\text{-th interarrival time}.$$
Then $X_{n+1} = X_n + 1 - B_n$ and
$$b_m = \Pr(B = m) = \int_0^{+\infty}\frac{(\mu t)^m e^{-\mu t}}{m!}\,dA(t).$$
Then we have
$$P = \begin{pmatrix}
\sum_{k=1}^{+\infty} b_k & b_0 & & & \\
\sum_{k=2}^{+\infty} b_k & b_1 & b_0 & & \\
\sum_{k=3}^{+\infty} b_k & b_2 & b_1 & b_0 & \\
\sum_{k=4}^{+\infty} b_k & b_3 & b_2 & b_1 & b_0 \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}$$
Example 4.11 (G/G/1, i.e., G′/G″/1). Let
$$W_n = \text{the waiting time before service begins for the } n\text{-th customer},$$
$$S_n = \text{the service time of the } n\text{-th customer}, \quad S_n \sim G',$$
$$T_n = \text{the interarrival time between the } n\text{-th and } (n+1)\text{-th customers}, \quad T_n \sim G''.$$
Then $W_{n+1} = \max\{0, W_n + S_n - T_n\}$ (the Lindley equation), and we have
$$\Pr(W_{n+1} \leq t) = F_{n+1}(t) = \Pr(W_n + S_n - T_n \leq t)$$
$$\Rightarrow F(t) = \int_{-\infty}^{t} F(t - x)\,dU(x),$$
where $U(x) = \Pr(S_n - T_n \leq x)$. The Wiener–Hopf method can be used to solve this equation.