Tools for measuring similarity among documents and detecting passages that have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality-sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
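As a rough illustration of the first two ideas in the description (shingled n-gram tokenization and a similarity function), here is a minimal conceptual sketch in Python. It is not the package's R API: the function names `shingle` and `jaccard_similarity` and the two sample sentences are assumptions made only for this example.

```python
def shingle(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text.

    Hypothetical helper for illustration; lowercases and splits on whitespace.
    """
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two made-up documents that differ in a single word.
doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

score = jaccard_similarity(shingle(doc_a), shingle(doc_b))
print(score)
```

Because the shingles overlap, a single changed word alters three of the seven trigrams in each document, so the score falls well below 1 even though the sentences are nearly identical. Minhash and locality-sensitive hashing, also named in the description, are ways to approximate this kind of score without comparing every pair of documents directly.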
| Version: | 0.1.5 | 
| Depends: | R (≥ 3.1.1) | 
| Imports: | assertthat (≥ 0.1), digest (≥ 0.6.8), dplyr (≥ 0.8.0), NLP (≥ 0.1.8), Rcpp (≥ 0.12.0), RcppProgress (≥ 0.1), stringr (≥ 1.0.0), tibble (≥ 3.0.1), tidyr (≥ 0.3.1) | 
| LinkingTo: | BH, Rcpp, RcppProgress | 
| Suggests: | testthat (≥ 0.11.0), knitr (≥ 1.11), rmarkdown (≥ 0.8), covr | 
| Published: | 2020-05-15 | 
| DOI: | 10.32614/CRAN.package.textreuse | 
| Author: | Lincoln Mullen | 
| Maintainer: | Lincoln Mullen <lincoln at lincolnmullen.com> | 
| BugReports: | https://github.com/ropensci/textreuse/issues | 
| License: | MIT + file LICENSE | 
| URL: | https://docs.ropensci.org/textreuse, https://github.com/ropensci/textreuse | 
| NeedsCompilation: | yes | 
| Materials: | README, NEWS | 
| In views: | NaturalLanguageProcessing | 
| CRAN checks: | textreuse results | 
| Reference manual: | textreuse.html , textreuse.pdf | 
| Vignettes: | Text alignment (source, R code); Introduction to the textreuse packages (source, R code); Minhash and locality-sensitive hashing (source, R code); Pairwise comparisons for document similarity (source, R code) | 
| Package source: | textreuse_0.1.5.tar.gz | 
| Windows binaries: | r-devel: textreuse_0.1.5.zip, r-release: textreuse_0.1.5.zip, r-oldrel: textreuse_0.1.5.zip | 
| macOS binaries: | r-release (arm64): textreuse_0.1.5.tgz, r-oldrel (arm64): textreuse_0.1.5.tgz, r-release (x86_64): textreuse_0.1.5.tgz, r-oldrel (x86_64): textreuse_0.1.5.tgz | 
| Old sources: | textreuse archive | 
| Reverse suggests: | textrank | 
Please use the canonical form https://CRAN.R-project.org/package=textreuse to link to this page.