Model an imputed dataset

Usage

model_impute(
  object,
  model_fun,
  rhs,
  model_args = list(),
  extractor,
  extractor_args = list(),
  filter = list(),
  mutate = list(),
  ...,
  timeout = 600
)

# S4 method for class 'ANY'
model_impute(
  object,
  model_fun,
  rhs,
  model_args = list(),
  extractor,
  extractor_args = list(),
  filter = list(),
  mutate = list(),
  ...,
  timeout = 600
)

# S4 method for class 'aggregatedImputed'
model_impute(
  object,
  model_fun,
  rhs,
  model_args = list(),
  extractor,
  extractor_args = list(),
  filter = list(),
  mutate = list(),
  ...,
  timeout = 600
)

Arguments

object

The imputed dataset.

model_fun

The function to apply to each imputation set, or a string with the name of that function. Include the package name when the function is not in one of the base R packages, for example "glm" or "INLA::inla".

rhs

The right-hand side of the model formula, for example "0 + factor(Year)".

model_args

An optional list of arguments to pass to the model function.

extractor

A function which returns a matrix or data.frame. The first column should contain the estimate, the second the standard error of the estimate.

extractor_args

An optional list of arguments to pass to the extractor function.
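
As an illustration of the shape the extractor is expected to return, here is a minimal sketch that uses only base R. The function name `extract_coef` and its `pattern` argument are hypothetical and are not part of the package. The sketch assumes that the entries of `extractor_args` are handed to the extractor as additional named arguments.

```r
# Hypothetical extractor with one extra argument, `pattern`.
# Returns a two-column matrix: estimate first, standard error second.
extract_coef <- function(model, pattern = NULL) {
  coefs <- summary(model)$coefficients[
    , c("Estimate", "Std. Error"), drop = FALSE
  ]
  if (!is.null(pattern)) {
    # keep only the coefficients whose name matches the pattern
    coefs <- coefs[grepl(pattern, rownames(coefs)), , drop = FALSE]
  }
  coefs
}

# Stand-alone check on a built-in dataset, outside of model_impute()
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
cyl_only <- extract_coef(fit, pattern = "cyl")
```

Under the assumption above, the same selection inside model_impute() would presumably be written as `extractor = extract_coef, extractor_args = list(pattern = "cyl")`.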

filter

An optional argument to filter the aggregated dataset. It is either a function which takes the Covariate slot as an argument, or a list which will be passed to the .dots argument of dplyr::filter(). You can filter on the covariates in the aggregated dataset. In addition, you can filter on Imputation_min and Imputation_max, which hold the lowest and highest value of the imputations for each row of the data.
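
The function form of the filter can be sketched in base R as follows. The name `keep_recent` and the toy data.frame are hypothetical. The sketch assumes that the Covariate slot behaves like a data.frame with one row per aggregated observation and that the function should return the rows to keep. Neither assumption is confirmed by this page.

```r
# Hypothetical filter function: keep the observations from year 5 onwards
# in the first period.
keep_recent <- function(covariate) {
  covariate[covariate$Year >= 5 & covariate$Period == 1, , drop = FALSE]
}

# Toy stand-in for the Covariate slot, used only to try out the function
toy <- data.frame(
  Year = c(3, 5, 5, 7, 9),
  Period = c(1, 1, 2, 1, 2)
)
kept <- keep_recent(toy)
```

Such a function would then be supplied as `filter = keep_recent`.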

mutate

An optional argument to alter the aggregated dataset. It is passed to the .dots argument of dplyr::mutate(). This is mainly useful for simple conversions, e.g. factors to numbers and vice versa.

...

Currently ignored.

timeout

Maximum duration allowed for fitting a single imputation model in seconds. Defaults to 600 seconds (10 minutes).

Examples

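# Generate an example dataset with 10 years and 50 sites
dataset <- generate_data(n_year = 10, n_site = 50, n_run = 1)
# Set 50 randomly selected counts to missing
dataset$Count[sample(nrow(dataset), 50)] <- NA
# Fit the imputation model and impute the missing counts
model <- lm(Count ~ Year + factor(Period) + factor(Site), data = dataset)
imputed <- impute(data = dataset, model = model)
# Sum the imputed counts for each combination of year and period
aggr <- aggregate_impute(imputed, grouping = c("Year", "Period"), fun = sum)
# Return the estimate and its standard error for every coefficient
extractor <- function(model) {
  summary(model)$coefficients[, c("Estimate", "Std. Error")]
}
# Model the aggregated imputations with one coefficient per year
model_impute(
  object = aggr,
  model_fun = lm,
  rhs = "0 + factor(Year)",
  extractor = extractor
)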
#> # A tibble: 10 × 5
#>    Parameter      Estimate    SE   LCL   UCL
#>    <fct>             <dbl> <dbl> <dbl> <dbl>
#>  1 factor(Year)1     1460.  216. 1036. 1883.
#>  2 factor(Year)2     1460.  216. 1037. 1883.
#>  3 factor(Year)3     1569.  216. 1145. 1993.
#>  4 factor(Year)4     1808.  216. 1385. 2231.
#>  5 factor(Year)5     1671.  216. 1247. 2094.
#>  6 factor(Year)6     1546.  217. 1121. 1970.
#>  7 factor(Year)7     1681.  216. 1257. 2104.
#>  8 factor(Year)8     1633.  216. 1209. 2056.
#>  9 factor(Year)9     1798.  216. 1375. 2220.
#> 10 factor(Year)10    1426.  216. 1003. 1849.