diff --git a/tutorials/cont_traits/figures/ou_gm.png b/tutorials/cont_traits/figures/ou_gm.png index 4af3d3b2..f3895a52 100644 Binary files a/tutorials/cont_traits/figures/ou_gm.png and b/tutorials/cont_traits/figures/ou_gm.png differ diff --git a/tutorials/cont_traits/figures/relaxed_ou_gm.png b/tutorials/cont_traits/figures/relaxed_ou_gm.png index 9951e3ec..3756ac81 100644 Binary files a/tutorials/cont_traits/figures/relaxed_ou_gm.png and b/tutorials/cont_traits/figures/relaxed_ou_gm.png differ diff --git a/tutorials/cont_traits/figures/rjoubm_gm.png b/tutorials/cont_traits/figures/rjoubm_gm.png new file mode 100644 index 00000000..053e4144 Binary files /dev/null and b/tutorials/cont_traits/figures/rjoubm_gm.png differ diff --git a/tutorials/cont_traits/modules/relaxed_OU_parameter_estimation.md b/tutorials/cont_traits/modules/relaxed_OU_parameter_estimation.md index d79ba763..ebde24cb 100644 --- a/tutorials/cont_traits/modules/relaxed_OU_parameter_estimation.md +++ b/tutorials/cont_traits/modules/relaxed_OU_parameter_estimation.md @@ -1,6 +1,6 @@ {% section Relaxing the OU Model %} -Under a simple Ornstein-Uhlenbeck (OU) model, the optimal of a continuous character is determined by a single rate parameter, $\theta$. Many evolutionary questions are related to how the optimal phenotype changes among lineages. In a Bayesian setting, we can specify a ''relaxed'' OU prior model that allows the optimal phenotype to vary over the phylogeny. +Under a simple Ornstein-Uhlenbeck (OU) model, the optimal of a continuous character is determined by a single location parameter, $\theta$. Many evolutionary questions are related to how the optimal phenotype changes among lineages. In a Bayesian setting, we can specify a ''relaxed'' OU prior model that allows the optimal phenotype to vary over the phylogeny. Here, we will use the ''random local clock'' model, similar to the one described by {% citet Uyeda2014 %} in the software `bayOU`. In this model, we assume that each branch in the phylogeny either does or does not have a optimum shift. When there is no shift on a branch, the optimum on the branch is inherited directly from its ancestral branch; when there _is_ a shift, the ancestral optimum is shifted by an amount that is drawn from a specified prior distribution. We specify a prior probability, $p$, that a given branch experiences a shift. For a tree with $n$ branches, the expected number of shifts is $E(k) = n \times p$. To control the number of shifts, we specify a prior on the _expected number of shifts_, $E(k)$, and then calculate the prior probability for a shift on a particular branch, $p = E(k) / n$. The graphical model shows the relationship between the branch-specific optima and priors (fig_ou_relaxed_gm). @@ -22,7 +22,8 @@ We begin by deciding which of the traits to use. Here, we assume we are analyzin trait <- 1 ``` -Now, we read in the (time-calibrated) tree corresponding. +Now, we read in the (time-calibrated) tree corresponding and specify the tree model. +In this tutorial, we assume the tree is known without error. We create a constant node for the tree that corresponds to the observed phylogeny. ``` T <- readTrees("data/primates_tree.nex")[1] ``` @@ -52,29 +53,22 @@ monitors = VectorMonitors() {% subsection Specifying the model %} -{% subsubsection Tree model %} +{% subsubsection Diffusion parameter %} -In this tutorial, we assume the tree is known without area. We create a constant node for the tree that corresponds to the observed phylogeny. +The stochastic rate of evolution is controlled by the rate parameter $\sigma^2$, which represents the diffusion variance per unit time. We draw the rate parameter from a log-normal distribution, where the median is the non-phylogenetic across-species variance scaled by the root age. We use $2*H \approx 1.1748$ as the spread parameter such that the 95% interval span $\approx 2$ orders of magnitude. ``` -tree <- T -``` - -{% subsubsection Rate parameter %} - -The stochastic rate of evolution is controlled by the rate parameter, $\sigma^2$. We draw the rate parameter from a loguniform prior. This prior is uniform on the log scale, which means that it is represents ignorance about the _order of magnitude_ of the rate. As before, we provide a scale move that proposes changes to the parameter during MCMC. - -``` -sigma2 ~ dnLoguniform(1e-3, 1) +root_age := tree.rootAge() +sigma2 ~ dnLognormal(ln(data.var(trait) / root_age), 1.1748) moves.append( mvScale(sigma2, weight=1.0) ) ``` -{% subsubsection Adaptation parameter %} +{% subsubsection Attraction rate parameter %} -The rate of adaptation toward the optimum is determined by the parameter $\alpha$. We draw $\alpha$ from an exponential prior distribution, and place a scale proposal on it. This parameter is assumed to be constant across the tree (even though the optimum will vary). We specify the mean of the exponential prior distribution on $\alpha$ to be half the root age divided by $\ln(2)$, which means that we expect a phylogenetic half life of half the tree age. +The rate of attraction toward the optimum is determined by the parameter $\alpha$. We draw $\alpha$ from a log-normal prior distribution, and place a scale proposal on it. This parameter is assumed to be constant across the tree (even though the optimum will vary). We specify the median of the log-normal prior distribution on $\alpha$ to be $\ln(2)$ divided by half the root age, which means that we expect a phylogenetic half life of half the tree age. We specify the spread parameter to be $2*H \approx 1.1748$ , such that the 95% interval spans $\approx 2$ orders of magnitude. ``` root_age := tree.rootAge() -alpha ~ dnExponential( abs(root_age / 2.0 / ln(2.0)) ) +alpha ~ dnLognormal( abs(ln(2.0) / (0.5 * root_age)), 1.1748 ) moves.append( mvScale(alpha, weight=1.0) ) ``` @@ -92,7 +86,7 @@ expected_number_of_shifts <- 5 shift_probability <- expected_number_of_shifts / nbranches ``` -Next, we specify the prior distribution on the size of shifts (when they occur). We draw each rate shift from a uniform distribution. +Next, we specify the prior distribution on the size of shifts (when they occur). We draw each rate shift from a normal distribution. ``` shift_distribution = dnNormal(0, 0.587) ``` diff --git a/tutorials/cont_traits/modules/simple_OU_model_selection.md b/tutorials/cont_traits/modules/simple_OU_model_selection.md index 4e3db57f..befa6b28 100644 --- a/tutorials/cont_traits/modules/simple_OU_model_selection.md +++ b/tutorials/cont_traits/modules/simple_OU_model_selection.md @@ -14,12 +14,20 @@ $$ where $P( \text{OU model} \mid X)$ and $P( \text{OU model})$ are the posterior probability and prior probability of the OU model, respectively. +{% figure fig_rjoubm_gm %} + +{% figcaption %} +The graphical model representation of the mixture model for Ornstein-Uhlenbeck (OU) and Brownin motion (BM) processes using a reversible-jump algorithm. +For more information about graphical model representations see {% citet Hoehna2014b %}. +{% endfigcaption %} +{% endfigure %} + {% subsubsection Reversible-jump between OU and BM models %} To enable rjMCMC, we simply have to place a reversible-jump prior on the relevant parameter, $\alpha$. We can modify the prior on `alpha` so that it takes either a constant value of 0, or is drawn from a prior distribution. Finally, we specify a prior probability on the OU model of `p = 0.5`. ``` -alpha ~ dnReversibleJumpMixture(0.0, dnExponential( abs(root_age / 2.0 / ln(2.0)) ), 0.5) +alpha ~ dnReversibleJumpMixture(0.0, dnLognormal( abs(ln(2.0) / (0.5 * root_age)), 1.1748 ), 0.5) ``` We then provide a reversible-jump proposal on `alpha` that proposes changes between the two models. ``` diff --git a/tutorials/cont_traits/modules/simple_OU_parameter_estimation.md b/tutorials/cont_traits/modules/simple_OU_parameter_estimation.md index f5b3f75e..2f652972 100644 --- a/tutorials/cont_traits/modules/simple_OU_parameter_estimation.md +++ b/tutorials/cont_traits/modules/simple_OU_parameter_estimation.md @@ -22,8 +22,9 @@ trait <- 1 ``` Now, we read in the (time-calibrated) tree corresponding. +In this tutorial, we assume the tree is known without error. We create a constant node for the tree that corresponds to the observed phylogeny. ``` -T <- readTrees("data/primates_tree.nex")[1] +tree <- readTrees("data/primates_tree.nex")[1] ``` Next, we read in the character data for the same dataset. @@ -45,22 +46,16 @@ moves = VectorMoves() monitors = VectorMonitors() ``` -{% subsection Specifying the model %} +{% subsection Specifying the OU model %} -{% subsubsection Tree model %} -In this tutorial, we assume the tree is known without area. We create a constant node for the tree that corresponds to the observed phylogeny. +{% subsubsection Diffusion parameter %} -``` -tree <- T -``` - -{% subsubsection Rate parameter %} - -The stochastic rate of evolution is controlled by the rate parameter, $\sigma^2$. We draw the rate parameter from a loguniform prior. This prior is uniform on the log scale, which means that it is represents ignorance about the _order of magnitude_ of the rate. +The stochastic rate of evolution is controlled by the rate parameter $\sigma^2$, which represents the diffusion variance per unit time. We draw the rate parameter from a log-normal distribution, where the median is the non-phylogenetic across-species variance scaled by the root age. We use $2*H \approx 1.1748$ as the spread parameter such that the 95% interval span $\approx 2$ orders of magnitude. ``` -sigma2 ~ dnLoguniform(1e-3, 1) +root_age := tree.rootAge() +sigma2 ~ dnLognormal(ln(data.var(trait) / root_age), 1.1748) ``` In order to estimate the posterior distribution of $\sigma^2$, we must provide an MCMC proposal mechanism that operates on this node. Because $\sigma^2$ is a rate parameter, and must therefore be positive, we use a scaling move called `mvScale`. @@ -68,15 +63,25 @@ In order to estimate the posterior distribution of $\sigma^2$, we must provide a moves.append( mvScale(sigma2, weight=2.0) ) ``` -{% subsubsection Adaptation parameter %} +{% subsubsection Attraction rate parameter %} -The rate of adaptation toward the optimum is determined by the parameter $\alpha$. We draw $\alpha$ from an exponential prior distribution, and place a scale proposal on it. We specify the mean of the exponential prior distribution on $\alpha$ to be half the root age divided by $\ln(2)$, which means that we expect a phylogenetic half life of half the tree age. +The rate of attraction toward the optimum is determined by the parameter $\alpha$. We draw $\alpha$ from a log-normal prior distribution, and place a scale proposal on it. We specify the median of the log-normal prior distribution on $\alpha$ to be $\ln(2)$ divided by half the root age, which means that we expect a phylogenetic half life of half the tree age. We specify the spread parameter to be $2*H \approx 1.1748$ , such that the 95% interval spans $\approx 2$ orders of magnitude. ``` -root_age := tree.rootAge() -alpha ~ dnExponential( abs(root_age / 2.0 / ln(2.0)) ) +alpha ~ dnLognormal( abs(ln(2.0) / (0.5 * root_age)), 1.1748 ) moves.append( mvScale(alpha, weight=2.0) ) ``` +{% aside Alternative: Prior distributions on phylogenetic half-life and/or stationary variance%} + +Alternatively, you can draw the phylogenetic half-life and/or stationary variance from some prior distributions, and transform them to $\alpha$ and $\sigma^2$ respectively using the equations below. +``` +alpha := ln(2) / t_half +sigma2 := alpha * 2 * vy +``` + +{% endaside %} + + {% subsubsection Optimum %} We draw the optimal value from a vague uniform prior ranging from -10 to 10 (you should change this prior if your character is outside of this range). Because this parameter can be positive or negative, we use a slide move to propose changes during MCMC. @@ -98,7 +103,7 @@ moves.append( avmvn_move ) {% subsubsection Assessing the phylogenetic half-life and decrease in variance due to selection %} -For our OU model, we are going to add two variables which are transformations primarily of the strength of selection $\alpha$. +For our OU model, we are going to add three variables which are transformations of the rate of attraction $\alpha$ and/or the diffusion variance $\sigma^2$. First, we add the phylogenetic half-life $t_{1/2} = \ln(2)/\alpha$, which represents the expected time needed for a trait to cover half the distance between root state and the selective optimum $\theta$. ``` t_half := ln(2) / alpha @@ -108,6 +113,10 @@ Second, we add the metric $p_{th}$ which represents the percent decrease in trai ``` p_th := 1 - (1 - exp(-2.0*alpha*root_age)) / (2.0*alpha*root_age) ``` +Lastly, we add the stationary variance which represents the trait variance at stationarity. Recall that unlike the Brownian motion process, the OU process is stationary, i.e., the variance does not increase indefinitely +``` +vy := sigma2 / ( 2 * alpha ) +``` {% subsubsection Ornstein-Uhlenbeck model %} diff --git a/tutorials/cont_traits/scripts/mcmc_OU.Rev b/tutorials/cont_traits/scripts/mcmc_OU.Rev index 6990b2e0..0a011914 100644 --- a/tutorials/cont_traits/scripts/mcmc_OU.Rev +++ b/tutorials/cont_traits/scripts/mcmc_OU.Rev @@ -4,7 +4,7 @@ # Ornstein-Uhlenbeck model # # -# authors: Michael R. May and Sebastian Höhna +# authors: Michael R. May, Sebastian Höhna, and Priscilla Lau # ################################################################################ @@ -16,8 +16,8 @@ # Datasets: Female Mass, Tail Length – Body Length Residuals, Body Length – Mass Residuals, Maximum Age, Sexual Dimorphism, Geographic Range Size, Latitudinal Midpoint, Distance to Continental Centroid, Population Density, Home Range Size, Group Size, Gestation Duration, Litter size trait <- 1 -### Read in the trees -T <- readTrees("data/primates_tree.nex")[1] +### Read in the trees and specify the tree model +tree <- readTrees("data/primates_tree.nex")[1] ### Read in the character data data <- readContinuousCharacterData("data/primates_cont_traits.nex") @@ -29,22 +29,16 @@ moves = VectorMoves() monitors = VectorMonitors() -########################## -# Specify the tree model # -########################## - -tree <- T - ########################## # Specify the rate model # ########################## root_age := tree.rootAge() -sigma2 ~ dnLoguniform(1e-3, 1) +sigma2 ~ dnLognormal( ln( data.var(trait) / root_age ), 1.1748 ) moves.append( mvScale(sigma2, weight=2.0) ) -alpha ~ dnExponential( abs(root_age / 2.0 / ln(2.0)) ) +alpha ~ dnLognormal( abs(ln(2.0) / (0.5 * root_age)), 1.1748 ) moves.append( mvScale(alpha, weight=2.0) ) theta ~ dnUniform(-10, 10) @@ -61,6 +55,7 @@ moves.append( avmvn_move ) root_age := tree.rootAge() t_half := ln(2) / alpha p_th := 1 - (1 - exp(-2.0*alpha*root_age)) / (2.0*alpha*root_age) +vy := sigma2 / ( 2 * alpha ) ########################## diff --git a/tutorials/cont_traits/scripts/mcmc_OU_RJ.Rev b/tutorials/cont_traits/scripts/mcmc_OU_RJ.Rev index 44f9c9ab..2571b359 100644 --- a/tutorials/cont_traits/scripts/mcmc_OU_RJ.Rev +++ b/tutorials/cont_traits/scripts/mcmc_OU_RJ.Rev @@ -3,7 +3,7 @@ # RevBayes Example: Model selection between OU and BM # # -# authors: Michael R. May and Sebastian Höhna +# authors: Michael R. May, Sebastian Höhna, and Priscilla Lau # ################################################################################ @@ -14,8 +14,8 @@ ### Select the trait to analyze trait <- 1 -### Read in the trees -T <- readTrees("data/primates_tree.nex")[1] +### Read in the trees and specify the tree model +tree <- readTrees("data/primates_tree.nex")[1] ### Read in the character data data <- readContinuousCharacterData("data/primates_cont_traits.nex") @@ -26,22 +26,17 @@ data.includeCharacter( trait ) moves = VectorMoves() monitors = VectorMonitors() -########################## -# Specify the tree model # -########################## - -tree <- T ########################## # Specify the rate model # ########################## -sigma2 ~ dnLoguniform(1e-3, 1e1) +sigma2 ~ dnLognormal( ln( data.var(trait) / root_age ), 1.1748 ) moves.append( mvScale(sigma2, weight=1.0) ) root_age := tree.rootAge() -alpha ~ dnReversibleJumpMixture(0.0, dnExponential( abs(root_age / 2.0 / ln(2.0)) ), 0.5) +alpha ~ dnReversibleJumpMixture(0.0, dnLognormal( abs(ln(2.0) / (0.5 * root_age)), 1.1748 ), 0.5) moves.append( mvRJSwitch(alpha, weight=1.0) ) moves.append( mvScale(alpha, weight=1.0) ) diff --git a/tutorials/cont_traits/scripts/mcmc_OU_prior.Rev b/tutorials/cont_traits/scripts/mcmc_OU_prior.Rev index aebe2400..920e3016 100644 --- a/tutorials/cont_traits/scripts/mcmc_OU_prior.Rev +++ b/tutorials/cont_traits/scripts/mcmc_OU_prior.Rev @@ -16,8 +16,8 @@ # Datasets: Female Mass, Tail Length – Body Length Residuals, Body Length – Mass Residuals, Maximum Age, Sexual Dimorphism, Geographic Range Size, Latitudinal Midpoint, Distance to Continental Centroid, Population Density, Home Range Size, Group Size, Gestation Duration, Litter size trait <- 1 -### Read in the trees -T <- readTrees("data/primates_tree.nex")[1] +### Read in the trees and specify the tree model +tree <- readTrees("data/primates_tree.nex")[1] ### Read in the character data data <- readContinuousCharacterData("data/primates_cont_traits.nex") @@ -29,22 +29,16 @@ moves = VectorMoves() monitors = VectorMonitors() -########################## -# Specify the tree model # -########################## - -tree <- T - ########################## # Specify the rate model # ########################## root_age := tree.rootAge() -sigma2 ~ dnLoguniform(1e-3, 1) +sigma2 ~ dnLognormal( ln( data.var(trait) / root_age ), 1.1748 ) moves.append( mvScale(sigma2, weight=2.0) ) -alpha ~ dnExponential( abs(root_age / 2.0 / ln(2.0)) ) +alpha ~ dnLognormal( abs(ln(2.0) / (0.5 * root_age)), 1.1748 ) moves.append( mvScale(alpha, weight=2.0) ) theta ~ dnUniform(-10, 10) @@ -60,6 +54,7 @@ moves.append( avmvn_move ) # some useful variable transformations to monitor t_half := ln(2) / alpha p_th := 1 - (1 - exp(-2.0*alpha*root_age)) / (2.0*alpha*root_age) +vy := sigma2 / ( 2 * alpha ) ########################## diff --git a/tutorials/cont_traits/scripts/mcmc_relaxed_OU.Rev b/tutorials/cont_traits/scripts/mcmc_relaxed_OU.Rev index 3a5f09c1..a3c8f61c 100644 --- a/tutorials/cont_traits/scripts/mcmc_relaxed_OU.Rev +++ b/tutorials/cont_traits/scripts/mcmc_relaxed_OU.Rev @@ -4,7 +4,7 @@ # Ornstein-Uhlenbeck model. # # -# authors: Michael R. May and Sebastian Höhna +# authors: Michael R. May, Sebastian Höhna, and Priscilla Lau # ################################################################################ @@ -15,8 +15,8 @@ ### Select the trait to analyze trait <- 1 -### Read in the trees -T <- readTrees("data/primates_tree.nex")[1] +### Read in the trees and specify the tree model +tree <- readTrees("data/primates_tree.nex")[1] ntips <- T.ntips() nbranches <- 2 * ntips - 2 @@ -29,23 +29,18 @@ data.includeCharacter( trait ) moves = VectorMoves() monitors = VectorMonitors() -########################## -# Specify the tree model # -########################## - -tree <- T ########################## # Specify the rate model # ########################## -# specify the rate parameter -sigma2 ~ dnLoguniform(1e-3, 1) +# specify the diffusion parameter +root_age := tree.rootAge() +sigma2 ~ dnLognormal( ln( data.var(trait) / root_age ), 1.1748 ) moves.append( mvScale(sigma2, weight=1.0) ) -# specify the strength parameter -root_age := tree.rootAge() -alpha ~ dnExponential( abs(root_age / 2.0 / ln(2.0)) ) +# specify the attraction rate parameter +alpha ~ dnLognormal( abs(ln(2.0) / (0.5 * root_age)), 1.1748 ) moves.append( mvScale(alpha, weight=1.0) ) # specify theta at the root of the tree diff --git a/tutorials/cont_traits/simple_ou.md b/tutorials/cont_traits/simple_ou.md index 21e3a769..73172c71 100644 --- a/tutorials/cont_traits/simple_ou.md +++ b/tutorials/cont_traits/simple_ou.md @@ -1,7 +1,7 @@ --- title: Simple Ornstein-Uhlenbeck Models subtitle: Estimating optima under Ornstein-Uhlenbeck evolution -authors: Michael R. May and Sebastian Höhna +authors: Michael R. May, Sebastian Höhna, and Priscilla Lau level: 6 order: 1.7 prerequisites: