Faster, lighter, cleaner analysis of biological sequences - tidysq

Dominik Rafacz

tidysq is an R package for the analysis of biological sequences. Thanks to its function-oriented, high-level interface, the package is intuitive to use even for less fluent R users. Implementing key functions in C++ and packing sequences into bits greatly increase both the processing speed and the volume of data the package can handle. During the presentation I will show how the package fits into the tidyverse, enabling, for example, integration with the dplyr package and keeping the code clear and readable even for beginners. I will also compare its performance and ease of use with Biostrings and ape, two other packages for managing biological sequences.
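
To illustrate the general idea behind sequence bit packing, here is a minimal C++ sketch. It is not tidysq's actual implementation: it assumes a four-letter DNA alphabet (A, C, G, T) stored at 2 bits per letter, and the function names `encode_base`, `pack_dna` and `unpack_dna` are hypothetical. The real package may use a different number of bits per letter and a different layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: map a nucleotide to a 2-bit code.
// Assumes the alphabet is exactly A, C, G, T (no gaps or ambiguity codes).
inline std::uint8_t encode_base(char base) {
    switch (base) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  throw std::invalid_argument("unsupported letter");
    }
}

// Pack four nucleotides into each byte, lowest bits first.
std::vector<std::uint8_t> pack_dna(const std::string& seq) {
    std::vector<std::uint8_t> packed((seq.size() + 3) / 4, 0);
    for (std::size_t i = 0; i < seq.size(); ++i) {
        const unsigned shift = 2 * (i % 4);
        packed[i / 4] |= static_cast<std::uint8_t>(encode_base(seq[i]) << shift);
    }
    return packed;
}

// Recover the original string; the length is stored separately because
// the last byte may be only partially filled.
std::string unpack_dna(const std::vector<std::uint8_t>& packed, std::size_t length) {
    static const char letters[] = {'A', 'C', 'G', 'T'};
    std::string seq(length, 'A');
    for (std::size_t i = 0; i < length; ++i) {
        const unsigned shift = 2 * (i % 4);
        seq[i] = letters[(packed[i / 4] >> shift) & 0x3];
    }
    return seq;
}
```

Under these assumptions, a sequence takes roughly a quarter of the memory of a one-byte-per-letter string, which is the kind of saving that lets larger datasets fit in memory.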