Downscale CMIP5 grid data
Usage
downscaleCMIP5(
rcp,
mod,
indir,
uerra,
outdir,
vars = NULL,
cmip5_vars = NULL,
lon_lim = c(-11, 12),
lat_lim = c(28, 44),
years = 1991:2100,
dictionary = system.file("extdata", "CMIP5_dictionary.csv", package = "dsclim"),
method = "GLM",
family_link = stats::gaussian(link = "identity"),
local.predictors = NULL,
global.vars = NULL,
spatial.predictors = NULL,
extended.predictors = NULL,
combined.only = TRUE,
globalAttributes = NULL
)
Arguments
- rcp
A character string indicating the scenario to be downscaled: either the historical experiment or a Representative Concentration Pathway (RCP). Possible values are c("historical", "rcp2.6", "rcp4.5", "rcp6.0", "rcp8.5").
- mod
A character string indicating the General Circulation Model that is going to be downscaled. Possible values are c("CESM1-CAM5", "CSIRO-Mk3-6-0", "IPSL-CM5A-MR").
- indir
A character string indicating the directory path to look for the CMIP5 data. This directory should contain a folder named after the RCP that is going to be loaded. The RCP folder should in turn contain a folder named after the General Circulation Model that is going to be loaded.
- uerra
A climate4R grid object to be used as the predictand of the downscaling process, usually the historical high-resolution dataset. This object is created with the loadUerra or the loadGridData functions.
- outdir
A character string indicating the directory path to save the downscaled datasets.
- vars
A character vector indicating the CMIP5 variables to be used as predictors. If NULL (default value), the function uses all available variables: c("tas", "tasmax", "tasmin", "hurs", "ps", "pr", "clt", "sfcWind"), which represent mean surface temperature (tas), maximum surface temperature (tasmax), minimum surface temperature (tasmin), relative humidity (hurs), surface pressure (ps), precipitation (pr), total cloudiness (clt), and surface wind speed (sfcWind). These are the names of the CMIP5 variables in their standardized format (so that they match those from other datasets, e.g. the TraCE21ka and UERRA datasets).
- cmip5_vars
A character vector indicating the variables to be downloaded. If NULL (default value), the function downscales all the available variables: c("tas", "tasmax", "tasmin", "hurs", "ps", "pr", "cld", "wss"), which represent mean surface temperature (tas), maximum surface temperature (tasmax), minimum surface temperature (tasmin), relative humidity (hurs), surface pressure (ps), precipitation (pr), total cloudiness (cld), and surface wind speed (wss). These are the names of the CMIP5 variables in their original format (as coded in the original CMIP5 datasets).
- lon_lim
A numeric vector (length = 2) with the longitudinal extent that is going to be downscaled.
- lat_lim
A numeric vector (length = 2) with the latitudinal extent that is going to be downscaled.
- years
A numeric vector with the sequence of years that are going to be downscaled. By default, the whole future period in the CMIP5 dataset (1991-2100) is downscaled.
- dictionary
A data dictionary (as in loadGridData) that is going to be used to load the CMIP5 data. By default, it uses an internal dictionary that standardizes CMIP5, TraCE21ka, and UERRA data to the same format.
- method
A character string indicating the downscaling method to be used. More details can be read at downscaleTrain.
- family_link
A character string indicating the family link to be used if method = "GLM".
- local.predictors
Defaults to NULL, in which case it is not used. Otherwise, a named list of arguments in the form argument = value, with the following arguments:
  - vars: Names of the variables in x to be used as local predictors.
  - fun: Optional. Aggregation function for the selected local neighbours. The aggregation function is specified as a list, indicating the name of the aggregation function in first place (as character), and other optional arguments to be passed to the aggregation function. For instance, to compute the average skipping missing values: fun = list(FUN = "mean", na.rm = TRUE). Defaults to NULL, meaning that no aggregation is performed.
  - n: Integer. Number of nearest neighbours to use. If a single value is introduced, and there is more than one variable in vars, the same value is used for all variables. Otherwise, this should be a vector of the same length as vars to indicate a different number of nearest neighbours for different variables.
- global.vars
An optional character vector with the short names of the variables of the input x multigrid to be retained as global predictors (use the getVarNames helper if not sure about variable names). This argument just produces a call to subsetGrid, but it is included here for better flexibility in downscaling experiments (predictor screening...). For instance, it allows using some specific variables contained in x as local predictors and the remaining ones, specified in subset.vars, as either raw global predictors or to construct the combined PC.
- spatial.predictors
Defaults to NULL, in which case it is not used. Otherwise, a named list of arguments in the form argument = value, with the arguments to be passed to prinComp to perform Principal Component Analysis of the predictors grid (x). See Details on principal component analysis of predictors.
- extended.predictors
This is a parameter related to the extreme learning machine and reservoir computing framework, where input data is randomly projected into a new space of size n. Defaults to NULL, in which case it is not used. Otherwise, a named list of arguments in the form argument = value, with the following arguments:
  - n: A numeric value. Indicates the size of the random nonlinear dimension where the input data is projected.
  - module: A numeric value (optional). Indicates the size of the mask's module. Belongs to a specific type of ELM called RF-ELM.
- combined.only
Optional, and only used if spatial.predictors parameters are passed. Should the combined PC be used as the only global predictor? Defaults to TRUE. Otherwise, the combined PC constructed with the which.combine argument in prinComp is appended to the PCs of the remaining variables within the grid.
- globalAttributes
Optional. A list of global attributes included in the NetCDF file. Same format as varAttributes.
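The list-valued predictor arguments described above (local.predictors and extended.predictors) can be assembled as plain R lists before calling the function. The following is a minimal sketch: the variable choices and the values given to n and module are illustrative assumptions, not recommendations from the package, and the do.call() line only demonstrates what a fun specification expresses, not how the package evaluates it internally.

```r
# Illustrative values only: the variables and neighbour counts are assumptions.
local_pred <- list(
  vars = c("tas", "pr"),                    # standardized variable names
  fun  = list(FUN = "mean", na.rm = TRUE),  # average, skipping missing values
  n    = c(4, 1)                            # one neighbour count per variable in vars
)

# Hypothetical sizes for the random projection (n) and the mask's module.
extended_pred <- list(n = 100, module = 10)

# What the `fun` specification expresses: call the named function with the
# extra arguments. This is a demonstration, not the package's internal code.
neighbours <- c(1, NA, 3)
aggregated <- do.call(local_pred$fun$FUN, c(list(neighbours), local_pred$fun[-1]))
```

These objects would then be passed as local.predictors = local_pred and extended.predictors = extended_pred.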
Value
The function returns the string "Done" when completed successfully. However, the real output of the function is the output directory with the downscaled data, stored in the same folder structure as the original dataset (Output/CMIP5/RCP/GCM/var). By default, the data are saved yearly (one file for each year) in two different formats: netCDF (nc) and tab-separated text (columns: x, y, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12; the numbered columns represent the 12 months of the year).
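The tab-separated layout described above can be read back with base R. The sketch below writes a tiny stand-in file with the documented columns (x, y, 1 to 12) to a temporary location and reads it again; the values and the file name are made up for illustration, since the names of the real output files are not given here.

```r
# Stand-in data with the documented column layout: x, y, then months 1..12.
months <- matrix(c(5:16, 6:17), nrow = 2, byrow = TRUE,
                 dimnames = list(NULL, as.character(1:12)))
demo <- data.frame(x = c(-3.5, -3.4), y = c(40.1, 40.1), months,
                   check.names = FALSE)

# Hypothetical file; real files live under the output directory.
tmp <- tempfile(fileext = ".txt")
write.table(demo, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

# check.names = FALSE keeps the month columns named "1" to "12"
# instead of being rewritten to syntactic names such as "X1".
monthly <- read.delim(tmp, check.names = FALSE)
july <- monthly[["7"]]
```

Selecting a month by its column name, as in monthly[["7"]], avoids relying on column positions.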
Examples
if (FALSE) { # \dontrun{
  # Load the high-resolution predictand
  uerra <- dsclim::loadUerra(
    "Data/UERRA/UERRA-HARMONIE/2m_temperature/latlon/1961-90_2m_temperature.nc",
    "tas"
  )

  # Downscale one scenario and model against the predictand
  dsclim::downscaleCMIP5(
    "rcp6.0",
    "CESM1-CAM5",
    "Data/CMIP5/",
    uerra,
    "Output/CMIP5/",
    lon_lim = c(-11, 12),
    lat_lim = c(28, 44),
    years = 1991:2100,
    method = "GLM",
    family_link = "gaussian"
  )
} # }
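Because indir is expected to contain an RCP folder that in turn contains a GCM folder, it can help to check that layout before starting a long run. The sketch below uses a hypothetical helper, cmip5_input_dir(), which is not part of dsclim, and a temporary directory standing in for "Data/CMIP5/". It assumes the folder names match the rcp and mod strings exactly.

```r
# Hypothetical helper (not part of dsclim): the folder implied by the
# description of `indir`, i.e. <indir>/<rcp>/<mod>.
cmip5_input_dir <- function(indir, rcp, mod) {
  file.path(indir, rcp, mod)
}

# Temporary directory standing in for "Data/CMIP5/".
indir <- file.path(tempdir(), "CMIP5")
expected <- cmip5_input_dir(indir, "rcp6.0", "CESM1-CAM5")

# Create the layout for the demonstration; with real data it already exists.
dir.create(expected, recursive = TRUE, showWarnings = FALSE)

# Fail early with a clear message if the expected folder is missing.
if (!dir.exists(expected)) {
  stop("Missing input folder: ", expected)
}
```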