Last updated: 2023-07-26


Knit directory: ms_mariposas_pheno/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20230601) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 7c33a30. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    data/raw_data/.DS_Store

Untracked files:
    Untracked:  data/best_models_prec.csv
    Untracked:  data/best_models_prec.xlsx
    Untracked:  data/best_models_temp.csv
    Untracked:  data/best_models_temp.xlsx
    Untracked:  data/doy_medio_sp.csv
    Untracked:  data/doy_medio_sp.xlsx
    Untracked:  data/models_prec_scaled.csv
    Untracked:  data/models_prec_scaled.xlsx
    Untracked:  data/models_tmed_scaled.csv
    Untracked:  data/models_tmed_scaled.xlsx

Unstaged changes:
    Modified:   analysis/climate_sensibility.Rmd
    Modified:   data/models_prec.csv
    Modified:   data/models_prec.xlsx
    Modified:   data/models_tmed.csv
    Modified:   data/models_tmed.xlsx
    Modified:   data/selected_species.csv
    Modified:   data/transectos_climate.csv

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/prepare_climate.Rmd) and HTML (docs/prepare_climate.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 7c33a30 ajpelu 2023-07-26 add orcid
html 32f6fa3 ajpelu 2023-07-26 Build site.
Rmd 00ea503 ajpelu 2023-07-26 climate data
Rmd 4e72a27 ajpelu 2023-07-11 prepare climate data
html baaf783 ajpelu 2023-07-10 Build site.
Rmd 1a4104b ajpelu 2023-07-10 prepare data climate with new data
html a834041 ajpelu 2023-06-26 Build site.
Rmd 3600071 ajpelu 2023-06-26 get data from 1st oct to sep+1
Rmd 03899a4 ajpelu 2023-06-04 add plots climate
html 59c4576 ajpelu 2023-06-01 Build site.
Rmd 2f9f824 ajpelu 2023-06-01 change name
html b4c84e5 ajpelu 2023-06-01 Build site.
Rmd 55a33ae ajpelu 2023-06-01 add prepare climate data

Introduction

library(here)
library(tidyverse)
library(purrr)
library(runner)
library(kableExtra)
library(DT)
  • We used climate data from REDIAM (500 x 500 grid covering the whole of Andalusia). The gridded data are derived from meteorological stations. For each pixel, monthly maximum, minimum and median temperatures are available from 1971 to 2021, and monthly rainfall is available from 1951 to 2022.

  • For each transect we extracted the data of all pixels that intersect the transect (see code/get_climate_transect.R). We then averaged each variable by transect and month.
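As an illustration of the averaging step, the following base R sketch uses a toy table with one row per pixel. The column names (tm2_2007_01_cog, p_2007_01_cog) and the values are hypothetical and only mimic a variable_year_month_suffix layout; the real layer names may differ.

```r
# Toy wide table: one row per pixel, one column per climate layer.
# Column names and values are hypothetical.
pixels <- data.frame(
  transectid      = c(1, 1, 2),
  tm2_2007_01_cog = c(2.0, 4.0, 5.0),
  p_2007_01_cog   = c(30, 50, 10)
)

value_cols <- setdiff(names(pixels), "transectid")

# Long format: one row per pixel and layer
long <- data.frame(
  transectid = rep(pixels$transectid, times = length(value_cols)),
  name       = rep(value_cols, each = nrow(pixels)),
  value      = unlist(pixels[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Split the layer name on runs of non-alphanumeric characters,
# which is the default separator of tidyr::separate()
parts <- do.call(rbind, strsplit(long$name, "[^[:alnum:]]+"))
long$var   <- parts[, 1]
long$year  <- parts[, 2]
long$month <- parts[, 3]

# Average all pixels of the same transect, variable, year and month
avg <- aggregate(value ~ transectid + var + year + month,
                 data = long, FUN = mean)
avg
```

Transect 1 is touched by two pixels in this toy table, so its values are the means of those two pixels; transect 2 keeps its single pixel value.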

files <- list.files("data/raw_data/climate_transects", pattern = "transect.csv", full.names = TRUE)

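# Read one transect file and reshape it to long format
custom_f <- function(x) { 
  out <- 
    read_csv(x) |> 
    dplyr::select(-ID) |> 
    pivot_longer(-c("transectid","x", "y")) |> 
    # the column names are split into variable, year, month and a "cog" token
    separate(name, into = c("var", "year", "month", "cog")) |> 
    dplyr::select(-cog, -x, -y)
  return(out)
}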


raw <- files |> 
  map_df(~custom_f(.)) |> 
  mutate(var = recode(var, 
                      "tm2" = "tmin", 
                      "tm3" = "tmax"))


climate_transect <- raw |> 
  group_by(transectid, var, year, month) |> 
  summarise(avg_transect = mean(value)) |> 
  mutate(date = as.Date(paste(year, month, "01", sep="-"), format="%Y-%m-%d")) |> ungroup() |> 
  filter(date >= as.Date("2007-01-01", format="%Y-%m-%d")) |>
  mutate(month_names = strftime(date, '%b')) 

write_csv(climate_transect, "data/raw_data/climate_transects/climate_transect_all_avg.csv")
climate_transect <- read_csv("data/raw_data/climate_transects/climate_transect_all_avg.csv")

A preview of the table:

Compute month, bi-month and three-month data

For each transect, we selected the monthly data and then computed the two-month and three-month averages.

Prepare hydrological months

aux <- climate_transect |> 
  mutate(across(c(month, year), as.numeric)) |> # convert to numeric 
  mutate(
    hydro_month = case_when(
      month >= 10 ~ (month - 9), 
      TRUE ~ month + 3), 
    hydro_year = case_when(
      month >= 10 ~ (year + 1), 
      TRUE ~ year), 
    name_m = case_when(
      month >= 10 ~ paste0(month_names, "_pre"), 
      TRUE ~ month_names
    )
  ) 

# |> 
#   filter(hydro_year > 2007)
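The mapping from calendar to hydrological months can be checked on a few dates with a small base R sketch. The helper name to_hydro() is hypothetical; it only restates the two case_when() rules above (October is hydrological month 1 of the following hydrological year).

```r
# Hypothetical helper restating the case_when() rules:
# Oct-Dec become months 1-3 of the next hydrological year,
# Jan-Sep become months 4-12 of the current one.
to_hydro <- function(year, month) {
  is_pre <- month >= 10
  data.frame(
    year        = year,
    month       = month,
    hydro_month = ifelse(is_pre, month - 9, month + 3),
    hydro_year  = ifelse(is_pre, year + 1, year)
  )
}

# October 2007, January 2008 and September 2008
example <- to_hydro(year = c(2007, 2008, 2008), month = c(10, 1, 9))
example
```

All three dates fall in hydrological year 2008, as hydrological months 1, 4 and 12 respectively.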

Temperature data

Monthly data

We selected the temperature data from the climate_transect data frame. For each variable (i.e. tmax, tmed, tmin) we pivoted the data to obtain the monthly temperature for each transect (transectid) and year (from 2007 to 2021). The resulting data frame is called d1mt (data-1-month-temperature).

d1mt <- aux |> 
  filter(var != "p") |> 
  dplyr::select(-month, -date, -month_names, -year, -hydro_month) |> 
  pivot_wider(values_from = avg_transect, 
              names_from = name_m) 
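To make the shape of the pivoted table concrete, here is a minimal base R sketch using reshape() instead of pivot_wider(). The values are invented; only the column names follow the code above.

```r
# Toy long table for a single transect (values are made up)
long <- data.frame(
  transectid   = 1,
  hydro_year   = c(2008, 2008, 2009, 2009),
  name_m       = c("Dec_pre", "Jan", "Dec_pre", "Jan"),
  avg_transect = c(7.5, 6.0, 8.0, 6.5)
)

# One row per transect and hydrological year, one column per month
wide <- reshape(long,
                idvar = c("transectid", "hydro_year"),
                timevar = "name_m",
                direction = "wide")

# reshape() prefixes the new columns with "avg_transect."; drop the prefix
names(wide) <- sub("^avg_transect\\.", "", names(wide))
wide
```

Because the rows are keyed by hydro_year, the Dec_pre column of a given row holds the December that precedes that row's January.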

Explore the results

Bimonthly data

To aggregate temperature over 2-month windows, we used the runner() function of the runner package, which applies a function over a rolling window. For temperature we computed the 2-month average. As with the monthly data, we pivoted the data frame to obtain, for each variable, transect and year, the temperature value (in this case the 2-month average). Each variable name combines the two averaged months, e.g. JanFeb. The resulting dataset is called d2mt (data-2-month-temperature).

aux2 <- aux |> 
  filter(var != "p") |> 
  group_by(var, transectid) |> # keep rolling windows within each transect
  mutate(
    avg_two_months = runner::runner(
      x = avg_transect,
      k = 2,
      f = mean,
      na_pad = TRUE
    )) |> 
  mutate(
    month_names2 = runner::runner(
      x = name_m, 
      k = 2, 
      f = paste, 
      na_pad = TRUE, 
      collapse="")
  ) |> 
  ungroup()

d2mt <- aux2 |> 
  filter(var != "p") |> 
  dplyr::select(-month, -date, -month_names, -year, -hydro_month, -avg_transect, -name_m) |> 
  filter(!(is.na(month_names2))) |>
  pivot_wider(values_from = avg_two_months, 
              names_from = month_names2) 
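The rolling-window logic can be sketched in base R without the runner package. The helper names roll_trailing() and roll_label() and the toy values are hypothetical. The sketch assumes that an incomplete window should yield NA (as na_pad = TRUE requests) and computes the windows separately within each transect, so that the first month of one transect is never combined with the last month of another.

```r
# Hypothetical helper: apply f over a trailing window of k values,
# returning NA while the window is still incomplete
roll_trailing <- function(x, k, f) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n >= k) {
    for (i in k:n) out[i] <- f(x[(i - k + 1):i])
  }
  out
}

# Hypothetical helper: concatenate the k month names of each window
roll_label <- function(x, k) {
  n <- length(x)
  out <- rep(NA_character_, n)
  if (n >= k) {
    for (i in k:n) out[i] <- paste(x[(i - k + 1):i], collapse = "")
  }
  out
}

# Toy data: two transects, three months each (values are made up)
temps <- data.frame(
  transectid = c(1, 1, 1, 2, 2, 2),
  name_m     = rep(c("Jan", "Feb", "Mar"), times = 2),
  value      = c(10, 12, 14, 20, 22, 24),
  stringsAsFactors = FALSE
)

# ave() applies the function within each transect and keeps the row order
temps$avg_two <- ave(temps$value, temps$transectid,
                     FUN = function(v) roll_trailing(v, 2, mean))
temps$window <- ave(temps$name_m, temps$transectid,
                    FUN = function(v) roll_label(v, 2))
temps
```

The first row of each transect has no complete window and stays NA, which is the kind of row that the filter on is.na(month_names2) removes.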

See the results

Some tests were performed to check the results:

d2008 <- aux |>
  filter(var == "tmax") |>
  filter(transectid == 14) |>
  filter(hydro_year == 2008)

# test 1
janfeb <- mean(as.vector(d2008[4:5, "avg_transect"]$avg_transect))

test_aux2 <- aux2 |>
  filter(var == "tmax") |>
  filter(transectid == 14) |>
  filter(hydro_year %in% c(2007, 2008))

identical(janfeb, (test_aux2 |> 
  filter(hydro_year == 2008) |> 
  filter(month_names2 == "JanFeb") |> pull(avg_two_months)))
[1] TRUE
# test 2
Dec_pre_jan <- mean(as.vector(d2008[3:4, "avg_transect"]$avg_transect))  

identical(Dec_pre_jan, (test_aux2 |> 
  filter(hydro_year == 2008) |> 
  filter(month_names2 == "Dec_preJan") |> pull(avg_two_months)))
[1] TRUE
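A note on the comparison used in these tests: identical() requires exact equality of doubles. That works here because both sides average the same two values, but when a check compares numbers obtained through different arithmetic, a tolerance-based comparison with all.equal() is usually safer. A minimal illustration with values unrelated to the climate data:

```r
a <- 0.1 + 0.2
b <- 0.3

# Exact comparison fails because of floating-point rounding
exact <- identical(a, b)

# all.equal() compares within a small numeric tolerance
close <- isTRUE(all.equal(a, b))

c(exact = exact, close = close)
```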

Trimonthly data

To aggregate temperature over 3-month windows, we again used the runner() function of the runner package. As with the monthly data, we pivoted the data frame to obtain, for each variable, transect and year, the temperature value (in this case the 3-month average). Each variable name combines the three averaged months, e.g. JanFebMar. The resulting dataset is called d3mt (data-3-month-temperature).

aux3 <- aux |> 
  filter(var != "p") |>
  group_by(var, transectid) |> # keep rolling windows within each transect
  mutate(
    avg_three_months = runner::runner(
      x = avg_transect,
      k = 3,
      f = mean,
      na_pad = TRUE
    )
  ) |> 
  mutate(
    month_names3 = runner::runner(
      x = name_m, 
      k = 3, 
      f = paste, 
      na_pad = TRUE, 
      collapse="")
  ) |> ungroup()

d3mt <- aux3 |> 
  filter(var != "p") |> 
  dplyr::select(-month, -date, 
                -month_names, -year, 
                -hydro_month, -avg_transect, -name_m) |> 
  filter(!(is.na(month_names3))) |> 
  pivot_wider(values_from = avg_three_months, 
              names_from = month_names3)

See the results:

We generated a data frame with the monthly, 2-month and 3-month data:

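# Join on the shared identifier columns
climate_temp <- d1mt |> 
  inner_join(d2mt, by = c("transectid", "var", "hydro_year")) |> 
  inner_join(d3mt, by = c("transectid", "var", "hydro_year"))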
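The join can be pictured with a small base R sketch in which merge() plays the role of inner_join(). The two toy tables and their values are hypothetical; the key columns are the identifier columns shared by d1mt, d2mt and d3mt.

```r
# Toy monthly and 2-month tables (values are made up)
monthly <- data.frame(
  transectid = c(1, 2), var = "tmed", hydro_year = 2008,
  Jan = c(8.1, 6.4)
)
bimonthly <- data.frame(
  transectid = c(1, 2), var = "tmed", hydro_year = 2008,
  JanFeb = c(8.6, 7.0)
)

keys <- c("transectid", "var", "hydro_year")

# merge() keeps only rows whose keys appear in both tables (all = FALSE)
wide <- merge(monthly, bimonthly, by = keys)
wide
```

Each row keeps its identifiers once and gains the columns of both tables, which is the layout of climate_temp.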

Rainfall data

As in the previous section, we computed monthly, 2-month and 3-month data for rainfall, but here we used cumulative precipitation rather than the average, so the value for JanFeb is the total rainfall of January and February.
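The difference between a cumulative and an averaged window is easy to see on three made-up monthly totals. This base R sketch uses stats::filter() with unit weights as a stand-in for the rolling sum; it is an illustration under those assumptions, not the code used in the analysis.

```r
# Made-up monthly rainfall totals (mm)
rain <- c(Jan = 30, Feb = 50, Mar = 10)

# Trailing sums: sides = 1 uses the current and previous values only,
# so positions without a complete window are NA
sum2 <- as.numeric(stats::filter(rain, rep(1, 2), sides = 1))
sum3 <- as.numeric(stats::filter(rain, rep(1, 3), sides = 1))

# The 2-month average, for comparison
mean2 <- sum2 / 2

data.frame(month = names(rain), sum2, mean2, sum3)
```

For February the cumulative JanFeb value is 80 mm, whereas the average would be 40 mm.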

auxp <- aux |> 
  filter(var == "p")


d1mp <- aux |> 
  filter(var == "p") |> 
  dplyr::select(-month, -date, -month_names, -year, -hydro_month) |> 
  pivot_wider(values_from = avg_transect, 
              names_from = name_m) 


aux2p <- aux |> 
  filter(var == "p") |> 
  group_by(transectid) |> # keep rolling windows within each transect
  mutate(
    sum_two_months = runner(
      x = avg_transect,
      k = 2,
      f = sum,
      na_pad = TRUE
    )) |> 
  mutate(
    month_names2 = runner(
      x = name_m, 
      k = 2, 
      f = paste, 
      na_pad = TRUE, 
      collapse="")
  ) |> ungroup()


d2mp <- aux2p |> 
  dplyr::select(-month, -date, -month_names, 
                -year, -hydro_month, -avg_transect, -name_m) |> 
  filter(!(is.na(month_names2))) |> 
  pivot_wider(values_from = sum_two_months, 
              names_from = month_names2) 

aux3p <- aux |> 
  filter(var == "p") |> 
  group_by(transectid) |> # keep rolling windows within each transect
  mutate(
    sum_three_months = runner(
      x = avg_transect,
      k = 3,
      f = sum, # cumulative rainfall, not the average
      na_pad = TRUE
    )
  ) |> 
  mutate(
    month_names3 = runner(
      x = name_m, 
      k = 3, 
      f = paste, 
      na_pad = TRUE, 
      collapse="")
  ) |> ungroup()

d3mp <- aux3p |> 
  dplyr::select(-month, -date, -month_names, 
                -year, -hydro_month, -avg_transect, -name_m) |> 
  filter(!(is.na(month_names3))) |> 
  pivot_wider(values_from = sum_three_months, 
              names_from = month_names3) 
    

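# Join on the shared identifier columns
climate_prec <- d1mp |> 
  inner_join(d2mp, by = c("transectid", "var", "hydro_year")) |> 
  inner_join(d3mp, by = c("transectid", "var", "hydro_year"))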

Generated data

The temperature and rainfall data were combined into a single data frame and exported.

climate_data  <- bind_rows(
  climate_temp, climate_prec)
# Filter erroneous date windows
climate_data <- climate_data |> 
  dplyr::select(-SepOct_pre, -AugSepOct_pre, -SepOct_preNov_pre)
climate_data |>
  mutate(across(-c(transectid, var, hydro_year), ~round(., digits = 2))) |>
  DT::datatable()
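Windows such as SepOct_pre combine a month without the _pre suffix with a following _pre month, so they straddle two hydrological years, which is presumably why they are dropped above. The base R sketch below flags such labels automatically. The function name crosses_hydro_year() is hypothetical, and the pattern assumes English three-letter month abbreviations as produced by strftime(date, '%b') in an English locale.

```r
# Hypothetical helper: TRUE when a "_pre" month follows a month
# without the suffix inside the same window label
crosses_hydro_year <- function(label) {
  tokens <- regmatches(label, gregexpr("[A-Z][a-z]{2}(_pre)?", label))[[1]]
  is_pre <- grepl("_pre$", tokens)
  # diff() is 1 exactly where the sequence goes from non-pre to pre
  any(diff(is_pre) == 1)
}

labels <- c("Jan", "Dec_preJan", "Oct_preNov_pre",
            "SepOct_pre", "AugSepOct_pre", "SepOct_preNov_pre")

flagged <- vapply(labels, crosses_hydro_year, logical(1))
names(flagged)[flagged]
```

On these labels the sketch flags the same three windows that are removed with select(), while Dec_preJan is kept because both months belong to the same hydrological year.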

sessionInfo()
R version 4.2.3 (2023-03-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur ... 10.16

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DT_0.26          kableExtra_1.3.4 runner_0.4.2     lubridate_1.9.2 
 [5] forcats_1.0.0    stringr_1.5.0    dplyr_1.1.0      purrr_1.0.1     
 [9] readr_2.1.4      tidyr_1.3.0      tibble_3.1.8     ggplot2_3.4.1   
[13] tidyverse_2.0.0  here_1.0.1       workflowr_1.7.0 

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10       svglite_2.1.0     getPass_0.2-2     ps_1.7.1         
 [5] rprojroot_2.0.3   digest_0.6.31     utf8_1.2.2        R6_2.5.1         
 [9] evaluate_0.19     httr_1.4.4        pillar_1.8.1      rlang_1.1.0      
[13] rstudioapi_0.14   whisker_0.4       callr_3.7.3       jquerylib_0.1.4  
[17] rmarkdown_2.19    webshot_0.5.4     htmlwidgets_1.6.2 bit_4.0.4        
[21] munsell_0.5.0     compiler_4.2.3    httpuv_1.6.8      xfun_0.39        
[25] pkgconfig_2.0.3   systemfonts_1.0.4 htmltools_0.5.4   tidyselect_1.2.0 
[29] fansi_1.0.3       viridisLite_0.4.1 crayon_1.5.2      tzdb_0.3.0       
[33] withr_2.5.0       later_1.3.0       grid_4.2.3        jsonlite_1.8.4   
[37] gtable_0.3.1      lifecycle_1.0.3   git2r_0.30.1      magrittr_2.0.3   
[41] scales_1.2.1      vroom_1.6.3       cli_3.6.0         stringi_1.7.8    
[45] cachem_1.0.6      fs_1.6.2          promises_1.2.0.1  xml2_1.3.3       
[49] bslib_0.4.2       ellipsis_0.3.2    generics_0.1.3    vctrs_0.6.0      
[53] tools_4.2.3       bit64_4.0.5       glue_1.6.2        crosstalk_1.2.0  
[57] hms_1.1.2         processx_3.7.0    parallel_4.2.3    fastmap_1.1.0    
[61] yaml_2.3.7        timechange_0.1.1  colorspace_2.0-3  rvest_1.0.3      
[65] knitr_1.41        sass_0.4.5