These functions calculate the checksum (digest; hash value) of one or more files. They can be used to verify file integrity.

Usage

checksum(files, hash_fun = c("xxh64", "md5", "sha256"))

xxh64sum(files)

md5sum(files)

sha256sum(files)

Arguments

files

Character vector of file path(s). File path(s) can be absolute or relative.

hash_fun

String that defines the hash function. See Usage for allowed values; defaults to the first.
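
The "defaults to the first" behaviour is what base R's match.arg() gives for an argument declared as a vector of choices. The sketch below illustrates that idiom only. It assumes, without confirmation from the source, that checksum() resolves hash_fun in a comparable way, and pick_hash_sketch() is a hypothetical name that is not part of n2khab.

```r
# Hypothetical helper, not part of n2khab: it only resolves the argument.
pick_hash_sketch <- function(hash_fun = c("xxh64", "md5", "sha256")) {
  # With the argument left at its default vector, match.arg() returns the
  # first element; otherwise it checks the value against the allowed ones.
  match.arg(hash_fun)
}

default_choice <- pick_hash_sketch()
explicit_choice <- pick_hash_sketch("sha256")

# A value outside the allowed set raises an error, caught here for display.
bad_choice <- tryCatch(pick_hash_sketch("sha1"), error = function(e) "error")
```

Here default_choice is "xxh64", explicit_choice is "sha256", and bad_choice is "error".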

Value

Named character vector with the same length as files and with the file names as names.
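
A named vector of this shape lends itself to integrity checks: store it once, then compare it with freshly computed checksums later. The sketch below shows the idea under stated assumptions. To run without n2khab, it uses base R's tools::md5sum() as a stand-in for checksum(); hash_now() and verify_sketch() are hypothetical helpers, not package functions.

```r
# Stand-in for checksum(): MD5 via base R, with file names as names.
hash_now <- function(paths) {
  stats::setNames(unname(tools::md5sum(paths)), basename(paths))
}

# Hypothetical helper: compare current checksums with recorded ones,
# matching the entries by file name.
verify_sketch <- function(current, recorded) {
  current[names(recorded)] == recorded
}

# Create a file with known binary content and record its checksum.
dir <- tempfile()
dir.create(dir)
path <- file.path(dir, "data.txt")
writeBin(charToRaw("abc"), path)
recorded <- hash_now(path)

# Unchanged file: the comparison holds.
ok_before <- verify_sketch(hash_now(path), recorded)

# Modified file: the comparison fails.
writeBin(charToRaw("abd"), path)
ok_after <- verify_sketch(hash_now(path), recorded)
```

With n2khab attached, hash_now() could presumably be replaced by a call to checksum() with the desired hash_fun.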

Details

Both cryptographic (md5, sha256) and non-cryptographic (xxh64) hash functions are available. They come either from the OpenSSL library (through the openssl package) or from the implementation embedded in the digest package.

The functions xxh64sum(), md5sum() and sha256sum() are simple shortcuts to checksum() with the appropriate hash function preset. Their names were chosen to match those used by xxHash and GNU coreutils.

The cryptographic algorithms use the OpenSSL implementation and stream-hash the binary contents of connections to the respective files. The hash object that the openssl package returns for a binary stream is then converted into a regular hash string. Note that n2khab will mask tools::md5sum(), which is a standalone implementation.
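
The stream-hashing and conversion steps can be sketched as follows. This is not the n2khab source: it is a minimal illustration that assumes the openssl package is installed (the relevant part is skipped otherwise) and it collapses the raw bytes by hand, which may differ from how n2khab does the conversion. The explicit tools:: prefix reaches the base R implementation even when md5sum() is masked.

```r
# Write a small file with known binary content.
path <- tempfile()
writeBin(charToRaw("some text\n"), path)

# Base R's standalone implementation, called with an explicit namespace.
md5_base <- unname(tools::md5sum(path))

if (requireNamespace("openssl", quietly = TRUE)) {
  # Passing a connection makes openssl stream-hash the file contents.
  # The result is a classed raw vector, not a plain string.
  h <- openssl::md5(file(path))

  # Collapse the raw bytes into one hexadecimal string.
  md5_openssl <- paste(as.character(unclass(h)), collapse = "")

  # Both implementations hash the same bytes, so they should agree.
  stopifnot(identical(md5_openssl, md5_base))
}
```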

See also

Other functions regarding file management for N2KHAB projects: download_zenodo(), fileman_folders(), fileman_up(), locate_n2khab_data()

Examples

# creating two different temporary files:
file1 <- tempfile()
file2 <- tempfile()
files <- c(file1, file2)
file.create(files)
#> [1] TRUE TRUE
con <- file(file2)
writeLines("some text", con)
close(con)

# computing alternative checksums:
checksum(files)
#>   file1ca53262a5fb   file1ca553e11694 
#> "ef46db3751d8e999" "b563efb2061ae502" 
xxh64sum(files)
#>   file1ca53262a5fb   file1ca553e11694 
#> "ef46db3751d8e999" "b563efb2061ae502" 
md5sum(files)
#>                   file1ca53262a5fb                   file1ca553e11694 
#> "d41d8cd98f00b204e9800998ecf8427e" "4d93d51945b88325c213640ef59fc50b" 
sha256sum(files)
#>                                                   file1ca53262a5fb 
#> "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855" 
#>                                                   file1ca553e11694 
#> "a23e5fdcd7b276bdd81aa1a0b7b963101863dd3f61ff57935f8c5ba462681ea6" 

if (FALSE) {
# This will error, as the two extra tempfile() paths do not exist as files:
files <- c(file1, file2, tempfile(), tempfile())
checksum(files)
}