Skip to contents

Introduction

This vignette will show you how to get started with the likelihoodTools package. The likelihoodTools package provides a set of tools to help you work with the likelihood package (Murphy 2023).

We use the dataset and the example included in the likelihood to illustrate how to use the likelihoodTools package.

We will show you how to create a model function, compute the MLE of the parameters (both steps from likelihood package), and then how to use the likelihoodTools package to:

  • extract the main information from the MLE results
  • plot residuals vs. predicted values
  • plot the residuals vs. the observed values.

Maximum Likelihood Estimation of the models parameters

As indicated in the likelihood package the first step is to formulate the model function, and then compute the MLE of the parameters.

data(crown_rad)
dataset <- crown_rad
modelfun <- function (a, b, DBH) {a + b * DBH}

Then, compute them MLE of the parameters using the anneal function from the likelihood package.

results <- anneal(model = modelfun,
                  par = list(a = 0, b = 0),
                  var = list(DBH = "DBH", x = "Radius", mean = "predicted",
                             sd = 0.815585, log = TRUE),
                  source_data = dataset,
                  par_lo = list(a = 0, b = 0),
                  par_hi = list(a = 50, b = 50),
                  pdf = dnorm,
                  dep_var = "Radius",
                  max_iter = 20000,
                  show_display = FALSE)

Extract the main information from the MLE results

The likellihood::anneal() function returns a list object with information on the annealing run. This list-type object contains several information (see likellihood::anneal() for more details). The mle_format() function from likelihoodTools package allows extract the main information from the MLE results in a more user-friendly way.

mle_format(results, yvar = "Radius")
#>   max_likeli n_params  aic_cor      aic        R2     slope      rmse
#> 1  -119.7453        2 243.6156 243.4906 0.4450001 0.9999627 0.8151813

You can plot the table with more style using thekableExtra (Zhu 2024) package.

m <- mle_format(results, yvar = "Radius")

# Rename the columns
colnames(m) <- c(
  "Maximum Likelihood", 
  "# Params.", 
  "AIC~cor~", 
  "AIC", 
  "R^2^", 
  "Slope", 
  "RMSE"
)

# Print the results in a table
library(kableExtra)
kable(m, escape = FALSE, caption = "MLE Results", digits = 3) |> 
  kable_styling(full_width = F, position = "center", bootstrap_options = c("striped", "hover"))
MLE Results
Maximum Likelihood # Params. AICcor AIC R2 Slope RMSE
-119.745 2 243.616 243.491 0.445 1 0.815

Plot the residuals vs. predicted values

The mle_plot_residuals() function allows you to plot the residuals (observed - residuals) values vs. predicted values. The predicted values are obtained from the model with the parameters values estimated by maximum likelihood estimation using simulated annealing.

p_residuals <- mle_plot_residuals(results, yvar = "Radius")

Plot the predicted vs. observed values

The mle_plot_observed() function allows you to plot the predicted values vs. observed values. The predicted values are obtained from the model with the parameters values estimated by maximum likelihood estimation using simulated annealing.

p_observed <- mle_plot_observed(results, yvar = "Radius")

Customize the plot

If you don’t like the default plot, you can customize it using ggplot2 (Wickham 2016) parameters. For example, you can increase the size of the points, change the color, change axis titles, etc.

p_observed +
  ggplot2::geom_point(size = 2, color = "red") +
  ggplot2::labs(x = "Observed Radius", y = "Predicted Radius") +
  ggplot2::theme(
    axis.title.x = ggplot2::element_text(size = 13, face = "bold"),
    axis.title.y = ggplot2::element_text(size = 13, face = "bold")
  )

Plot the distribution of the residuals

Maybe the user could be interested in plot the distribution of the residuals. We provided a code to do this (we will implement it as function).

library(ggplot2)
library(dplyr)

# compute the residuals
d <- results$source_data |> 
  dplyr::mutate(residuals = Radius - predicted)

# plot the histogram
p_residuals_dist <- ggplot(data = d, aes(x = residuals)) +
  geom_histogram(aes(y = after_stat(density), x = residuals), 
                   fill = "gray", col = "white", bins = 30) +
  theme_bw() +
  theme(panel.grid = element_blank()) +
  xlab("Residuals") +
  ylab("Density")

p_residuals_dist 

Now, we can add the distribution to the histogram. For those, we used the MASS package (Venables and Ripley 2002).

library(MASS)

# estimate the distribution of the residuals  
dist <- MASS::fitdistr(d$residuals, "normal") 
params <- as.data.frame(t(dist$estimate))


p_residuals_dist <- p_residuals_dist  + stat_function(fun = dnorm, 
              args = list(mean = params$mean, 
                          sd = params$sd), 
              color = "black", size = 1) 
#> Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
#>  Please use `linewidth` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
p_residuals_dist 

More customization

Maybe the user is interested in arrange the residuals plots, the observed vs.  predicted and the error distribution plots in a single plot. There are several packages as patchwork (Pedersen 2024) or cowplot (Wilke 2024) that allow you to do this. Here, we provide a simple code to do this.

library(patchwork)


(p_observed + p_residuals + p_residuals_dist +  
    plot_annotation(tag_levels = "a")) & 
  theme(axis.title = element_text(size = 14, face = "bold"), 
        axis.text = element_text(size = 11),
        plot.tag = element_text(size = 14, face = "bold"))

References

Murphy, Lora. 2023. likelihood: Methods for Maximum Likelihood Estimation. https://CRAN.R-project.org/package=likelihood.
Pedersen, Thomas Lin. 2024. patchwork: The Composer of Plots. https://CRAN.R-project.org/package=patchwork.
Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with s. Fourth. New York: Springer. https://www.stats.ox.ac.uk/pub/MASS4/.
Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
Wilke, Claus O. 2024. Cowplot: Streamlined Plot Theme and Plot Annotations for ’Ggplot2’. https://CRAN.R-project.org/package=cowplot.
Zhu, Hao. 2024. kableExtra: Construct Complex Table with kable and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.