diff --git a/lectures/stationary_densities.md b/lectures/stationary_densities.md index 8f4e6cde..5914f00c 100644 --- a/lectures/stationary_densities.md +++ b/lectures/stationary_densities.md @@ -34,7 +34,7 @@ tags: [hide-output] ## Overview -In a [previous lecture](https://python-intro.quantecon.org/finite_markov.html), we learned about finite Markov chains, a relatively elementary class of stochastic dynamic models. +In {doc}`intermediate:finite_markov`, we learned about finite Markov chains, a relatively elementary class of stochastic dynamic models. The present lecture extends this analysis to continuous (i.e., uncountable) state Markov chains. @@ -42,12 +42,12 @@ Most stochastic dynamic models studied by economists either fit directly into th In this lecture, our focus will be on continuous Markov models that -* evolve in discrete-time +* evolve in discrete time * are often nonlinear The fact that we accommodate nonlinear models here is significant, because linear stochastic models have their own highly developed toolset, as we'll -see {doc}`later on `. +see later in {doc}`arma`. The question that interests us most is: Given a particular stochastic dynamic model, how will the state of the system evolve over time? @@ -96,7 +96,7 @@ Once we've built some intuition we'll cover the general case. ### Definitions and basic properties -In our [lecture on finite Markov chains](https://python-intro.quantecon.org/finite_markov.html), we studied discrete-time Markov chains that evolve on a finite state space $S$. +In our lecture {doc}`intermediate:finite_markov`, we studied discrete-time Markov chains that evolve on a finite state space $S$. In this setting, the dynamics of the model are described by a stochastic matrix --- a nonnegative square matrix $P = P[i, j]$ such that each row $P[i, \cdot]$ sums to one. @@ -127,7 +127,7 @@ The family of discrete distributions $P[i, \cdot]$ will be replaced by a family Analogous to the finite state case, $p(x, \cdot)$ is to be understood as the distribution (density) of $X_{t+1}$ given $X_t = x$. -More formally, a *stochastic kernel on* $S$ is a function $p \colon S \times S \to \mathbb R$ with the property that +More formally, a **stochastic kernel on** $S$ is a function $p \colon S \times S \to \mathbb R$ with the property that 1. $p(x, y) \geq 0$ for all $x, y \in S$ 1. $\int p(x, y) dy = 1$ for all $x \in S$ @@ -274,7 +274,7 @@ p(x, y) where $\phi$ is the density of $A_{t+1}$. (Regarding the state space $S$ for this model, a natural choice is $(0, \infty)$ --- in which case -$\sigma(x) = s f(x)$ is strictly positive for all $s$ as required) +$\sigma(x) = s f(x)$ is strictly positive for all $x$ as required) ### Distribution dynamics @@ -322,7 +322,7 @@ defined by (\psi P)(y) = \int p(x,y) \psi(x) dx ``` -This operator is usually called the *Markov operator* corresponding to $p$ +This operator is usually called the **Markov operator** corresponding to $p$. ```{note} Unlike most operators, we write $P$ to the right of its argument, @@ -368,7 +368,7 @@ $y$. Another possibility is to discretize the model, but this introduces errors of unknown size. -A nicer alternative in the present setting is to combine simulation with an elegant estimator called the *look-ahead* estimator. +A nicer alternative in the present setting is to combine simulation with an elegant estimator called the **look-ahead** estimator. Let's go over the ideas with reference to the growth model {ref}`discussed above `, the dynamics of which we repeat here for convenience: @@ -412,12 +412,12 @@ where $p$ is the growth model stochastic kernel in {eq}`statd_sssk`. What is the justification for this slightly surprising estimator? -The idea is that, by the strong [law of large numbers](https://python-intro.quantecon.org/lln_clt.html#lln-ksl), +The idea is that, by the strong [law of large numbers](https://python.quantecon.org/lln_clt.html#lln-ksl), $$ \frac{1}{n} \sum_{i=1}^n p(k_{t-1}^i, y) \to -\mathbb E p(k_{t-1}^i, y) +\mathbb E \left[ p(k_{t-1}^i, y) \right] = \int p(x, y) \psi_{t-1}(x) \, dx = \psi_t(y) $$ @@ -462,13 +462,20 @@ The following code is an example of usage for the stochastic growth model {ref}` (stoch_growth)= ```{code-cell} python3 +--- +mystnb: + figure: + caption: Density of $k_1$ (lighter) to $k_T$ (darker) + name: fig-statd-density-sequence +--- # == Define parameters == # s = 0.2 δ = 0.1 -a_σ = 0.4 # A = exp(B) where B ~ N(0, a_σ) +a_σ = 0.4 # A = exp(B) where B ~ N(0, a_σ**2) α = 0.4 # We set f(k) = k**α ψ_0 = beta(5, 5, scale=0.5) # Initial distribution ϕ = lognorm(a_σ) +rng = np.random.default_rng(1234) def p(x, y): @@ -480,12 +487,12 @@ def p(x, y): return ϕ.pdf((y - (1 - δ) * x) / d) / d n = 10000 # Number of observations at each date t -T = 30 # Compute density of k_t at 1,...,T+1 +T = 30 # Compute density of k_t at 1,...,T # == Generate matrix s.t. t-th column is n observations of k_t == # k = np.empty((n, T)) -A = ϕ.rvs((n, T)) -k[:, 0] = ψ_0.rvs(n) # Draw first column from initial distribution +A = ϕ.rvs((n, T), random_state=rng) +k[:, 0] = ψ_0.rvs(n, random_state=rng) # Draw first column from initial distribution for t in range(T-1): k[:, t+1] = s * A[:, t] * k[:, t]**α + (1 - δ) * k[:, t] @@ -500,7 +507,6 @@ greys.reverse() for ψ, g in zip(laes, greys): ax.plot(ygrid, ψ(ygrid), color=g, lw=2, alpha=0.6) ax.set_xlabel('capital') -ax.set_title(f'Density of $k_1$ (lighter) to $k_T$ (darker) for $T={T}$') plt.show() ``` @@ -531,7 +537,7 @@ We can, however, construct a fairly general theory using distribution functions. ### Example and definitions -To illustrate the issues, recall that Hopenhayn and Rogerson {cite}`HopenhaynRogerson1993` study a model of firm dynamics where individual firm productivity follows the exogenous process +To illustrate the issues, recall that {cite:t}`HopenhaynRogerson1993` study a model of firm dynamics where individual firm productivity follows the exogenous process $$ X_{t+1} = a + \rho X_t + \xi_{t+1}, @@ -557,7 +563,7 @@ puts positive probability mass on 0 and 1. Hence it cannot be represented as a density. -What we can do instead is use cumulative distribution functions (cdfs). +What we can do instead is use cumulative distribution functions (CDFs). To this end, set @@ -566,7 +572,7 @@ G(x, y) := \mathbb P \{ h(a + \rho x + \xi_{t+1}) \leq y \} \qquad (0 \leq x, y \leq 1) $$ -This family of cdfs $G(x, \cdot)$ plays a role analogous to the stochastic kernel in the density case. +This family of CDFs $G(x, \cdot)$ plays a role analogous to the stochastic kernel in the density case. The distribution dynamics in {eq}`statd_fdd` are then replaced by @@ -576,13 +582,13 @@ The distribution dynamics in {eq}`statd_fdd` are then replaced by F_{t+1}(y) = \int G(x,y) F_t(dx) ``` -Here $F_t$ and $F_{t+1}$ are cdfs representing the distribution of the current state and next period state. +Here $F_t$ and $F_{t+1}$ are CDFs representing the distribution of the current state and next period state. The intuition behind {eq}`statd_fddc` is essentially the same as for {eq}`statd_fdd`. ### Computation -If you wish to compute these cdfs, you cannot use the look-ahead estimator as before. +If you wish to compute these CDFs, you cannot use the look-ahead estimator as before. Indeed, you should not use any density estimator, since the objects you are estimating/computing are not densities. @@ -591,7 +597,7 @@ One good option is simulation as before, combined with the [empirical distributi ## Stability -In our [lecture](https://python-intro.quantecon.org/finite_markov.html) on finite Markov chains, we also studied stationarity, stability and ergodicity. +In our lecture {doc}`intermediate:finite_markov`, we also studied stationarity, stability and ergodicity. Here we will cover the same topics for the continuous case. @@ -603,7 +609,7 @@ The general case is relatively similar --- references are given below. Analogous to [the finite case](https://python.quantecon.org/finite_markov.html#stationary-distributions), given a stochastic kernel $p$ and corresponding Markov operator as defined in {eq}`def_dmo`, a density $\psi^*$ on $S$ is called -*stationary* for $P$ if it is a fixed point of the operator $P$. +**stationary** for $P$ if it is a fixed point of the operator $P$. In other words, @@ -640,9 +646,9 @@ With additional conditions, we can also get a unique stationary density ($\psi \ ``` This combination of existence, uniqueness and global convergence in the sense -of {eq}`statd_dca` is often referred to as *global stability*. +of {eq}`statd_dca` is often referred to as **global stability**. -Under very similar conditions, we get *ergodicity*, which means that +Under very similar conditions, we get **ergodicity**, which means that ```{math} :label: statd_lln @@ -689,6 +695,7 @@ Here is such a figure. (statd_egs)= ```{figure} /_static/lecture_specific/stationary_densities/solution_statd_ex2.png +Density sequences from four different initial conditions ``` All sequences are converging towards the same limit, regardless of their initial condition. @@ -818,6 +825,7 @@ Try running at `n = 10, 100, 1000, 10000` to get an idea of the speed of converg ϕ = norm() n = 500 θ = 0.8 +rng = np.random.default_rng(1234) # == Frequently used constants == # d = np.sqrt(1 - θ**2) δ = θ / d @@ -830,14 +838,15 @@ def p(x, y): "Stochastic kernel for the TAR model." return ϕ.pdf((y - θ * np.abs(x)) / d) / d -Z = ϕ.rvs(n) +Z = ϕ.rvs(n, random_state=rng) X = np.empty(n) +X[0] = 0 for t in range(n-1): X[t+1] = θ * np.abs(X[t]) + d * Z[t] ψ_est = LAE(p, X) k_est = gaussian_kde(X) -fig, ax = plt.subplots(figsize=(10, 7)) +fig, ax = plt.subplots() ys = np.linspace(-3, 3, 200) ax.plot(ys, ψ_star(ys), 'b-', lw=2, alpha=0.6, label='true') ax.plot(ys, ψ_est(ys), 'g-', lw=2, alpha=0.6, label='look-ahead estimate') @@ -882,10 +891,11 @@ Here's one program that does the job # == Define parameters == # s = 0.2 δ = 0.1 -a_σ = 0.4 # A = exp(B) where B ~ N(0, a_σ) +a_σ = 0.4 # A = exp(B) where B ~ N(0, a_σ**2) α = 0.4 # f(k) = k**α ϕ = lognorm(a_σ) +rng = np.random.default_rng(1234) def p(x, y): "Stochastic kernel, vectorized in x. Both x and y must be positive." @@ -906,8 +916,8 @@ for i in range(4): # == Generate matrix s.t. t-th column is n observations of k_t == # k = np.empty((n, T)) - A = ϕ.rvs((n, T)) - k[:, 0] = ψ_0.rvs(n) + A = ϕ.rvs((n, T), random_state=rng) + k[:, 0] = ψ_0.rvs(n, random_state=rng) for t in range(T-1): k[:, t+1] = s * A[:,t] * k[:, t]**α + (1 - δ) * k[:, t] @@ -938,22 +948,23 @@ To illustrate, let's generate three artificial data sets and compare them with a The three data sets we will use are: $$ -\{ X_1, \ldots, X_n \} \sim LN(0, 1), \;\; +\{ X_1, \ldots, X_n \} \sim \mathrm{LN}(0, 1), \;\; \{ Y_1, \ldots, Y_n \} \sim N(2, 1), \;\; \text{ and } \; -\{ Z_1, \ldots, Z_n \} \sim N(4, 1), \; +\{ Z_1, \ldots, Z_n \} \sim N(4, 1) $$ Here is the code and figure: ```{code-cell} python3 n = 500 -x = np.random.randn(n) # N(0, 1) -x = np.exp(x) # Map x to lognormal -y = np.random.randn(n) + 2.0 # N(2, 1) -z = np.random.randn(n) + 4.0 # N(4, 1) +rng = np.random.default_rng(1234) +x = rng.standard_normal(n) # N(0, 1) +x = np.exp(x) # Map x to lognormal +y = rng.standard_normal(n) + 2.0 # N(2, 1) +z = rng.standard_normal(n) + 4.0 # N(4, 1) -fig, ax = plt.subplots(figsize=(10, 6.6)) +fig, ax = plt.subplots() ax.boxplot([x, y, z]) ax.set_xticks((1, 2, 3)) ax.set_ylim(-2, 14) @@ -1010,6 +1021,7 @@ J = 8 θ = 0.9 d = np.sqrt(1 - θ**2) δ = θ / d +rng = np.random.default_rng(1234) fig, axes = plt.subplots(J, 1, figsize=(10, 4*J)) initial_conditions = np.linspace(8, 0, J) @@ -1018,9 +1030,9 @@ X = np.empty((k, n)) for j in range(J): axes[j].set_ylim(-4, 8) - axes[j].set_title(f'time series from t = {initial_conditions[j]}') + axes[j].set_title(f'time series from $X_0 = {initial_conditions[j]:.1f}$') - Z = np.random.randn(k, n) + Z = rng.standard_normal((k, n)) X[:, 0] = initial_conditions[j] for t in range(1, n): X[:, t] = θ * np.abs(X[:, t-1]) + d * Z[:, t]