Package: udpipe
Type: Package
Title: Tokenization, Parts of Speech Tagging, Lemmatization and
        Dependency Parsing with the 'UDPipe' 'NLP' Toolkit
Version: 0.8.12
Maintainer: Jan Wijffels <jwijffels@bnosac.be>
Authors@R: c(
    person('Jan', 'Wijffels', role = c('aut', 'cre', 'cph'), email = 'jwijffels@bnosac.be'), 
    person('BNOSAC', role = 'cph'), 
    person("Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic", role = 'cph'), 
    person('Milan Straka', role = c('ctb', 'cph'), email = 'straka@ufal.mff.cuni.cz'), 
    person('Jana Straková', role = c('ctb', 'cph'), email = 'strakova@ufal.mff.cuni.cz'))
Description: This natural language processing toolkit provides language-agnostic
    'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency
    parsing' of raw text. Besides text parsing, the package also allows you to train
    annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided
    at <https://universaldependencies.org/format.html>. The techniques are explained
    in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0
    with UDPipe', available at <doi:10.18653/v1/K17-3009>. 
    The toolkit also provides functions for common manipulations of texts
    enriched with the output of the parser, namely functions and algorithms
    for collocations, token co-occurrence, document-term matrix handling,
    term frequency-inverse document frequency calculations,
    information retrieval metrics (Okapi BM25), handling of multi-word expressions,
    keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns),
    sentiment scoring and semantic similarity analysis.
License: MPL-2.0
URL: https://bnosac.github.io/udpipe/en/index.html,
        https://github.com/bnosac/udpipe
Encoding: UTF-8
Depends: R (>= 2.10)
Imports: Rcpp (>= 0.11.5), data.table (>= 1.9.6), Matrix, methods,
        stats
LinkingTo: Rcpp
VignetteBuilder: knitr
Suggests: knitr, rmarkdown, topicmodels, lattice, parallel
RoxygenNote: 7.1.2
NeedsCompilation: yes
Packaged: 2025-09-04 10:25:57 UTC; jwijffels
Author: Jan Wijffels [aut, cre, cph],
  BNOSAC [cph],
  Institute of Formal and Applied Linguistics, Faculty of Mathematics and
    Physics, Charles University in Prague, Czech Republic [cph],
  Milan Straka [ctb, cph],
  Jana Straková [ctb, cph]
Repository: CRAN
Date/Publication: 2025-09-04 15:50:02 UTC
Built: R 4.5.2; x86_64-w64-mingw32; 2025-11-01 01:55:06 UTC; windows
Archs: x64
