Checks if a dataset conforms to a given set of rules
Usage
check_data(
  x,
  rules,
  xname = deparse(substitute(x)),
  stop_on_fail = FALSE,
  stop_on_warn = FALSE,
  stop_on_error = FALSE
)
Arguments
- x
a dataset, either a data.frame, dplyr::tibble, data.table::data.table, arrow::arrow_table, arrow::open_dataset, or dplyr::tbl (SQL connection)
- rules
a list of rules
- xname
optional, a name for the x variable (only used for errors)
- stop_on_fail
if TRUE, throw an error with stop() when any of the rules fail
- stop_on_warn
if TRUE, throw an error with stop() when a warning occurs during code execution
- stop_on_error
if TRUE, throw an error with stop() when an error occurs during code execution
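As a minimal sketch of how stop_on_fail might be used in a script, the call below is wrapped in tryCatch() so that a failing rule does not abort the session. The variable name outcome and the "all rules passed" string are illustrative only, and the exact wording of the error message is not shown because it is not documented here.

```r
library(dataverifyr)

rs <- ruleset(
  rule(mpg > 10),
  rule(cyl %in% c(4, 6)) # fails for the cars with 8 cylinders
)

# With stop_on_fail = TRUE a failing rule is expected to raise an error;
# tryCatch() converts that error into its message string.
outcome <- tryCatch(
  {
    check_data(mtcars, rs, stop_on_fail = TRUE)
    "all rules passed" # illustrative marker, only reached when no rule fails
  },
  error = function(e) conditionMessage(e)
)
outcome
```

In a non-interactive pipeline the tryCatch() wrapper would usually be dropped, so that the error stops the run.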
Examples
rs <- ruleset(
  rule(mpg > 10),
  rule(cyl %in% c(4, 6)), # missing 8
  rule(qsec >= 14.5 & qsec <= 22.9)
)
rs
#> <Verification Ruleset with 3 elements>
#> [1] 'Rule for: mpg' matching `mpg > 10` (allow_na: FALSE)
#> [2] 'Rule for: cyl' matching `cyl %in% c(4, 6)` (allow_na: FALSE)
#> [3] 'Rule for: qsec' matching `qsec >= 14.5 & qsec <= 22.9` (allow_na: FALSE)
check_data(mtcars, rs)
#>              name                        expr allow_na negate tests pass fail
#> 1:  Rule for: mpg                    mpg > 10    FALSE  FALSE    32   32    0
#> 2:  Rule for: cyl            cyl %in% c(4, 6)    FALSE  FALSE    32   18   14
#> 3: Rule for: qsec qsec >= 14.5 & qsec <= 22.9    FALSE  FALSE    32   32    0
#>    warn error              time
#> 1:            0.0052380562 secs
#> 2:            0.0027456284 secs
#> 3:            0.0002403259 secs
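The result is a table with one row per rule, so it can be subset like any data.frame. The following sketch, which assumes only the columns shown in the output above (name, tests, fail), picks out the rules that did not pass; the names res and failed are illustrative.

```r
library(dataverifyr)

rs <- ruleset(
  rule(mpg > 10),
  rule(cyl %in% c(4, 6)), # missing 8
  rule(qsec >= 14.5 & qsec <= 22.9)
)

res <- check_data(mtcars, rs)

# Keep only the rules with at least one failing row
failed <- res[res$fail > 0, ]
failed$name

# Share of tested rows that failed, per failing rule
failed$fail / failed$tests
```

For the mtcars example only the cyl rule is returned, with 14 of 32 rows failing.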