fpmin, fpmax, and
prange are added as parallel equivalents of base R
pmin, pmax, and range operations. These
functions compute element-wise minimum, maximum, and range (max - min)
across multiple vectors. They support integer, logical, and double
types, with automatic type promotion. Implemented by @SebKrantz.
Update copyright date in C files
Fix note on CRAN regarding Rf_isFrame
iif tests for new version of R.
nswitch. Thanks to Sebastian Krantz for raising an
issue.
Update copyright date in C files
Fix note on CRAN regarding SETLENGTH
"kit.nThread" is defined before setting it to
1L
Update configure file to extend support for GCC
Correct warnings in NEWS.Rd (strong)
Correct typo in funique.Rd thanks to @davidbudzynski
Function pprod now returns double output even if
inputs are integer - in line with base::prod - to avoid
integer overflows.
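The overflow that motivates this change can be reproduced in base R alone. The sketch below is illustrative: the vectors `x` and `y` are made-up inputs, and the guarded call at the end assumes that `kit::pprod` accepts them as written.

```r
# Integer multiplication in R overflows to NA (with a warning),
# whereas base::prod always returns a double.
x <- c(100000L, 3L)
y <- c(100000L, 4L)

int_product <- suppressWarnings(x * y)  # integer arithmetic: first element overflows
dbl_product <- as.numeric(x) * y        # double arithmetic: no overflow

print(int_product)
print(dbl_product)
print(typeof(prod(2L, 3L)))

# Optional comparison, only run when kit is installed (assumed call).
if (requireNamespace("kit", quietly = TRUE)) {
  print(typeof(kit::pprod(x, y)))
}
```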
Update configure file
Function pcountNA is equivalent to
pcount(..., value = NA).
Function pcountNA and
pcount(..., value = NA) allow NA counting with
mixed data types (including data.frame).
pcountNA also supports list-vectors as inputs and counts
empty or NULL elements as NA.
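As a rough base R reference for what row-wise NA counting across mixed types means, the sketch below sums `is.na` over two vectors of different types. The inputs are invented for illustration, and the guarded kit calls assume the signatures quoted above.

```r
# Row-wise NA count across vectors of different types, in base R.
a <- c(1, NA, 3)
b <- c("x", NA, NA)

na_count <- Reduce(`+`, lapply(list(a, b), is.na))
print(na_count)

# Optional comparison, only run when kit is installed (assumed calls).
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::pcountNA(a, b))
  print(kit::pcount(a, b, value = NA))
}
```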
Functions panyv, panyNA,
pallv and pallNA are added as efficient
wrappers around pcount and pcountNA. They are
parallel equivalents of scalar functions base::anyNA and
anyv, allv and allNA in the
‘collapse’ R package.
Functions pfirst and plast are added to
efficiently obtain the row-wise first and last non-missing value or
non-empty element of lists. They are parallel equivalents to the
(column-wise) ffirst and flast functions in
the ‘collapse’ R package. Implemented by @SebKrantz.
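A minimal base R sketch of the row-wise "first/last non-missing value" idea follows. The helpers `first_non_missing` and `last_non_missing` are hypothetical names written for this illustration, not part of kit, and the guarded calls assume `pfirst` and `plast` take vectors through `...`.

```r
# Hypothetical base R helpers illustrating the row-wise semantics.
first_non_missing <- function(...) {
  cols <- list(...)
  out <- cols[[1]]
  for (col in cols[-1]) {
    fill <- is.na(out)       # positions still missing
    out[fill] <- col[fill]   # take the value from the next vector
  }
  out
}
last_non_missing <- function(...) do.call(first_non_missing, rev(list(...)))

a  <- c(1, NA, NA)
b  <- c(10, 20, NA)
c3 <- c(NA, 200, 300)

first_vals <- first_non_missing(a, b, c3)
last_vals  <- last_non_missing(a, b, c3)
print(first_vals)
print(last_vals)

# Optional comparison, only run when kit is installed (assumed calls).
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::pfirst(a, b, c3))
  print(kit::plast(a, b, c3))
}
```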
Functions psum/pprod/pmean also support logical
vectors as input. Implemented by @SebKrantz.
charToFact was not returning proper results.
Thanks to @alex-raw
for raising an issue.
Function pprod now returns double output even if
inputs are integer - in line with base::prod - to avoid
integer overflows.
C compiler warnings on CRAN R-devel caused by compilation with -Wstrict-prototypes are now fixed. Declaring functions without prototypes is deprecated in all versions of C. Thanks to Sebastian Krantz for the PR.
pcount now supports data.frame.
pcount now works with specific NA values,
i.e. NA_real_, NA_character_ etc.
psum, pmean, pprod,
pany and pall now support lists. Thanks to
Sebastian Krantz for the request and code suggestion.
topn should now work for ALTREP objects. Thanks
to @ben-schwen for
raising an issue.
Function funique now preserves the attributes if the
input is a data.table, tibble or similar
object. Thanks to Sebastian Krantz for the request.
Function topn now defaults to base R
order for large values of n. Please see updated
documentation for more information ?kit::topn.
Function charToFact gains a new argument
addNA=TRUE to be used to include (or not) NA
in levels of the output.
Functions shareData, getData and
clearData are implemented to share data objects between R
sessions. These functions are experimental and might change in the
future. Feedback is welcome. Please see ?kit::shareData for
more information.
A few calloc calls at C level have been replaced
by the R C API function Calloc to avoid valgrind
errors/warnings in Travis CI.
Errors reported by rchk on CRAN have been
fixed.
Function charToFact gains a new argument
decreasing=FALSE to be used to order levels of the output
in decreasing or increasing order.
Function topn gains a new argument
index=TRUE to return either the indices (TRUE) or the
values (FALSE) of the input vector.
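The difference between returning indices and returning values can be sketched with base R `order`. The vector below is made up, and the guarded calls assume the `topn` arguments named in this file (`n`, `decreasing`, `index`).

```r
# Top-n by index versus by value, using base R as the reference.
vec <- c(5, 2, 9, 7, 1)
n <- 3L

top_idx <- order(vec, decreasing = TRUE)[seq_len(n)]  # positions of the n largest
top_val <- vec[top_idx]                               # the n largest values

print(top_idx)
print(top_val)

# Optional comparison, only run when kit is installed (assumed calls).
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::topn(vec, n, decreasing = TRUE, index = TRUE))
  print(kit::topn(vec, n, decreasing = TRUE, index = FALSE))
}
```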
funique and psort on some platforms.
pmean, pprod and
psum will result in an error if used with factors.
Documentation has been updated.
Functions funique and fduplicated gain
an additional argument fromLast=FALSE to indicate whether
the search should start from the end or the beginning PR#11.
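Base R `duplicated` and `unique` have the same `fromLast` argument, so they give a convenient reference for what the direction of the search changes. The example vector is invented, and the guarded kit calls assume the argument behaves like its base R counterpart.

```r
# Effect of fromLast on duplicate detection, using base R as the reference.
x <- c("a", "b", "a", "c", "b")

dup_forward  <- duplicated(x)                   # later repeats are flagged
dup_backward <- duplicated(x, fromLast = TRUE)  # earlier repeats are flagged
uniq_backward <- unique(x, fromLast = TRUE)     # keeps the last occurrence of each value

print(dup_forward)
print(dup_backward)
print(uniq_backward)

# Optional comparison, only run when kit is installed (assumed calls).
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::fduplicated(x, fromLast = TRUE))
  print(kit::funique(x, fromLast = TRUE))
}
```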
Functions pall, pany,
pmean, pprod and psum accept
data.frame as input PR#15. Please see
documentation for more information.
Function charToFact is equivalent to base R
as.factor but is much quicker and only converts character
vector to factor. Note that it is parallelised. For more details and
benchmark please see ?kit::charToFact.
Function psort is experimental and equivalent to
base R sort but is only for character vector. It can sort
by “C locale” or by “R session locale”. For more details and benchmark
please see ?kit::psort.
A few OpenMP directives were missing for functions
vswitch and nswitch for character vectors.
These have been added in PR#12.
Function funique was not preserving attributes for
character, logical and complex vectors/data.frames. Thanks to Sebastian
Krantz (@SebKrantz)
for bringing that to my attention. This has been fixed in PR#13.
Functions funique and uniqLen should
now be faster for factor and logical vectors
PR#14.
Function uniqLen(x) is equivalent to base R
length(unique(x)) and uniqueN in package data.table.
Function uniqLen, implemented in C, supports vectors,
data.frame and matrix. It should be faster
than these functions. For more details and benchmark please see
?kit::uniqLen.
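The base R expressions that uniqLen is described as matching can be written out as follows. The inputs are illustrative, and the guarded calls assume `uniqLen` takes a single vector or data.frame.

```r
# Number of unique elements (vector) or unique rows (data.frame) in base R.
x <- c(3, 1, 3, 2, 1)
df <- data.frame(a = c(1, 1, 2), b = c("x", "x", "y"))

n_unique_vec  <- length(unique(x))
n_unique_rows <- nrow(unique(df))

print(n_unique_vec)
print(n_unique_rows)

# Optional comparison, only run when kit is installed (assumed calls).
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::uniqLen(x))
  print(kit::uniqLen(df))
}
```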
Function vswitch now supports mixed encoding and
gains an additional argument checkEnc=TRUE. Thanks to
Xianying Tan (@shrektan) for the request and review PR#7.
Function nswitch is a nested version of function
vswitch and also supports mixed encoding. Please
see ?kit::nswitch for further details. Thanks to Xianying
Tan (@shrektan) for
the request and review PR#10.
Small algorithmic improvement for functions
fduplicated, funique and
countOccur for vectors,
data.frame and matrix.
A tests folder has been added to the source package to track coverage and bugs.
nif has been split into two distinct
functions at C level: one has its arguments evaluated lazily and
is for R users, and the other one (nifInternalR) is not lazy and is
intended for usage at C level.
Function countOccur(x), implemented in C, is
comparable to base R function table. It
returns a data.frame and is between 3 and 50 times faster.
For more details, please see ?kit::countOccur.
Functions funique and fduplicated now
support matrices. Additionally, these two functions should also have
better performance compared to the previous release.
Function topn has an additional argument
hasna=TRUE to indicate whether the data contains
NA values or not. If the data does not contain
NA values, the function should be faster.
data.frame
and matrix as well as do other operations. These functions
are not exported or visible to the user but might become available and
callable at C level in the future.
Function fpos was not properly handling
NaN and NA for complex and double. This should
now be fixed. The function has also been changed in case the ‘needle’
and ‘haystack’ are vectors so that a vector is returned.
Functions funique and fduplicated were
not properly handling data containing POSIX data. This has
now been fixed.
Functions fduplicated(x) and
funique(x), implemented in C, are comparable to
base R functions duplicated and
unique. For more details, please see
?kit::funique.
Functions psum and pprod now have
better performance for types double and complex.
Function count(x, y) now checks that x
and y have the same class and levels. So does
pcount.
Function pmean was not callable at C level because
of a typo. This is now fixed.
Function count(x, value), implemented in C,
counts the number of times an element value occurs in
a vector or in a list x. For more details, please see
?kit::count.
Functions pmean(..., na.rm=FALSE),
pall(..., na.rm=FALSE), pany(..., na.rm=FALSE)
and pcount(..., value), implemented in C, are similar to
the already available functions psum and pprod.
These functions respectively apply base R functions mean,
all and any element-wise. For more details,
benchmarks and help, please see ?kit::pmean.
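Element-wise application of mean, all and any across several vectors can be sketched in base R as below. The inputs are invented, and the guarded kit calls assume the signatures listed above; whether their results match these reference values in every NA case is not checked here.

```r
# Element-wise (row-wise) mean, all, any and value count in base R.
a <- c(1, 2, NA)
b <- c(3, 4, 5)
m <- cbind(a, b)

mean_keep <- rowMeans(m)                # NA propagates
mean_drop <- rowMeans(m, na.rm = TRUE)  # NA ignored

p <- c(TRUE, TRUE, FALSE)
q <- c(TRUE, FALSE, FALSE)
all_res <- p & q
any_res <- p | q

count_res <- rowSums(m == 3, na.rm = TRUE)  # how often the value 3 occurs per row

print(mean_keep); print(mean_drop)
print(all_res); print(any_res); print(count_res)

# Optional comparison, only run when kit is installed (assumed calls).
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::pmean(a, b, na.rm = TRUE))
  print(kit::pall(p, q))
  print(kit::pany(p, q))
  print(kit::pcount(a, b, value = 3))
}
```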
Fix Solaris Unicode warnings for NEWS file.
Benchmarks have been moved from the NEWS file to each function Rd file.
Fix some NA edge cases for pprod and
psum so these functions behave more like base R functions
prod and sum.
Fix installation errors for versions of R (<3.5.0).
Function
fpos(needle, haystack, all=TRUE, overlap=TRUE), implemented
in C, is inspired by the base function which when used in the
form which(x == y, arr.ind=TRUE). Function
fpos returns the index(es) or position(s) of a
matrix/vector within a larger matrix/vector. Please see
?kit::fpos for more details.
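The base R form that inspired fpos looks like this for a scalar needle. The matrix is a made-up example, and the guarded call assumes `fpos` accepts a length-one numeric needle of the same type as the haystack.

```r
# Locating a value inside a matrix with which(..., arr.ind = TRUE).
m <- matrix(c(1, 2, 3, 2, 1, 2), nrow = 2)

pos <- which(m == 2, arr.ind = TRUE)  # one row per match: (row, col)
print(pos)

# Optional comparison, only run when kit is installed (assumed call).
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::fpos(2, m))
}
```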
Function
iif(test, yes, no, na=NULL, tprom=FALSE, nThread=getOption("kit.nThread")),
originally contributed as fifelse in package data.table, was
moved to package kit to be developed independently. Unlike the current
version of fifelse, iif allows type promotion
like base function ifelse. For further details about the
differences with fifelse, as well as
hutils::if_else and dplyr::if_else, please see
?kit::iif.
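Type promotion in base `ifelse` can be seen with an integer `yes` and a double `no`. The guarded call below assumes that `tprom=TRUE` is what enables the same promotion in `iif`; see ?kit::iif for the authoritative behaviour.

```r
# Base ifelse promotes integer and double branches to double.
test <- c(TRUE, FALSE, NA)

base_res <- ifelse(test, 1L, 2.5)
print(base_res)
print(typeof(base_res))

# Optional comparison, only run when kit is installed (assumed call).
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::iif(test, 1L, 2.5, tprom = TRUE))
}
```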
Function nif(..., default=NULL), implemented in C,
is inspired by SQL CASE WHEN. It is comparable to dplyr function
case_when; however, it evaluates its arguments in a lazy way
(i.e. only when needed). Function nif was originally
contributed as function fcase in the data.table
package but then moved to package kit so its development may resume
independently. Please see ?kit::nif for more
details.
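A CASE WHEN style mapping can be written in base R with nested `ifelse`, which serves as a reference here. The guarded call assumes `nif` takes alternating condition and value arguments followed by `default`, as the signature above suggests.

```r
# CASE WHEN x < 0 THEN 'neg' WHEN x = 0 THEN 'zero' ELSE 'pos' END
x <- c(-2, 0, 5)

base_res <- ifelse(x < 0, "neg", ifelse(x == 0, "zero", "pos"))
print(base_res)

# Optional comparison, only run when kit is installed (assumed call).
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::nif(x < 0, "neg", x == 0, "zero", default = "pos"))
}
```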
Functions pprod(..., na.rm=FALSE) and
psum(..., na.rm=FALSE), implemented in C, are inspired by
base functions pmin and pmax. These new
functions work only for integer, double and complex types and do not
recycle vectors. Please see ?kit::psum for more
details.
Function setlevels(x, old, new, skip_absent=FALSE),
implemented in C, may be used to set levels of a factor object. Please
see ?kit::setlevels for more details.
Function topn(vec, n=6L, decreasing=TRUE),
implemented in C, returns the top largest or smallest n
values for a given numeric vector vec. It is inspired by
dplyr::top_n and equivalent to base functions order and
sort in specific cases as shown in the documentation. Please see
?kit::topn for more details.
Function
vswitch(x, values, outputs, default=NULL, nThread=getOption("kit.nThread")),
implemented in C, is a vectorised version of base R
function switch. This function can also be seen as a
particular case of function nif. Please see
?kit::vswitch for more details.
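A vectorised switch can be approximated in base R with `match`, which is used below as a reference. The inputs are illustrative, and the guarded call assumes the `vswitch` signature shown above with a `default` of the same type as `outputs`.

```r
# Map each element of x to an output via a lookup, with a default for misses.
x <- c("a", "c", "z")
values <- c("a", "b", "c")
outputs <- c(1, 2, 3)

base_res <- outputs[match(x, values)]  # NA where x is not found in values
base_res[is.na(base_res)] <- 0         # apply the default
print(base_res)

# Optional comparison, only run when kit is installed (assumed call).
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::vswitch(x, values, outputs, default = 0))
}
```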