* using log directory 'd:/Rcompile/CRANpkg/local/4.0/HadoopStreaming.Rcheck'
* using R version 4.0.5 (2021-03-31)
* using platform: x86_64-w64-mingw32 (64-bit)
* using session charset: ISO8859-1
* checking for file 'HadoopStreaming/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'HadoopStreaming' version '0.2'
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking whether package 'HadoopStreaming' can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking use of S3 registration ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... [4s] NOTE
hsTableReader: no visible global function definition for 'object.size'
hsWriteTable: no visible global function definition for 'write.table'
Undefined global functions or variables:
  object.size write.table
Consider adding
  importFrom("utils", "object.size", "write.table")
to your NAMESPACE file.
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking examples ... [1s] ERROR
Running examples in 'HadoopStreaming-Ex.R' failed
The error most likely occurred in:

> ### Name: HadoopStreaming-package
> ### Title: Functions facilitating Hadoop streaming with R.
> ### Aliases: HadoopStreaming-package HadoopStreaming
> ### Keywords: package
>
> ### ** Examples
>
> ## STEP 1: MAKE A CONNECTION
>
> ## To read from STDIN (used for deployment in Hadoop streaming and for command line testing)
> con = file(description="stdin",open="r")
>
> ## Reading from a text string: useful for very small test examples
> str <- "Key1\tVal1\nKey2\tVal2\nKey3\tVal3\n"
> cat(str)
Key1    Val1
Key2    Val2
Key3    Val3
> con <- textConnection(str, open = "r")
>
> ## Reading from a file: useful for testing purposes during development
> cat(str,file="datafile.txt") # write datafile.txt data in str
> con <- file("datafile.txt",open="r")
>
> ## To get the first few lines of a file (also very useful for testing)
> numlines = 2
> con <- pipe(paste("head -n",numlines,'datafile.txt'), "r")
>
> ## STEP 2: Write map and reduce scripts, call them mapper.R and
> ## reducer.R. Alternatively, write a single script taking command line
> ## flags specifying whether it should run as a mapper or reducer. The
> ## hsCmdLineArgs function can assist with this.
> ## Writing #!/usr/bin/env Rscript can make an R file executable from the command line.
>
> ## STEP 3a: Running on command line with separate mappers and reducers
> ## cat inputFile | Rscript mapper.R | sort | Rscript reducer.R
> ## OR
> ## cat inputFile | R --vanilla --slave -f mapper.R | sort | R --vanilla --slave -f reducer.R
>
> ## STEP 3b: Running on command line with the recommended single file
> ## approach using Rscript and the hsCmdLineArgs for argument parsing.
> ## cat inputFile | ./mapReduce.R --mapper | sort | ./mapReduce.R --reducer
>
> ## STEP 3c: Running in Hadoop -- Assuming mapper.R and reducer.R can
> ## run on each computer in the cluster:
> ## $HADOOP_HOME/bin/hadoop $HADOOP_HOME/contrib/streaming/hadoop-0.19.0-streaming.jar \
> ##   -input inpath -output outpath -mapper \
> ##   "R --vanilla --slave -f mapper.R" -reducer "R --vanilla --slave -f reducer.R" \
> ##   -file ./mapper.R -file ./reducer.R
>
> ## STEP 3d: Running in Hadoop, with the recommended single file method:
> ## $HADOOP_HOME/bin/hadoop $HADOOP_HOME/contrib/streaming/hadoop-0.19.0-streaming.jar \
> ##   -input inpath -output outpath -mapper \
> ##   "mapReduce.R --mapper" -reducer "mapReduce.R --reducer" \
> ##   -file ./mapReduce.R
>
>
>
> cleanEx()
Warning in .Internal(gc(verbose, reset, full)) :
  closing unused connection 5 (datafile.txt)
Warning in .Internal(gc(verbose, reset, full)) :
  closing unused connection 3 (stdin)
Error: connections left open:
  head -n 2 datafile.txt (pipe)
Execution halted
head: write error
* checking PDF version of manual ... OK
* DONE

Status: 1 ERROR, 1 NOTE
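
Both flagged items have small, mechanical remedies. The NOTE asks for an explicit import of two utils functions; the ERROR arises because the package example opens several connections (ending with the head -n 2 pipe) without closing them, which R CMD check treats as an error. A minimal sketch of both fixes follows: the importFrom() directive is quoted verbatim from the check output above, while the close()/unlink() cleanup is an assumed correction to the example, not code taken from the package.

  ## In NAMESPACE: declare the imports the NOTE flags as undefined.
  importFrom("utils", "object.size", "write.table")

  ## In the Rd example: close each connection once it is no longer
  ## needed, and remove the temporary test file, so that no connections
  ## remain open when the example finishes.
  con <- file("datafile.txt", open = "r")
  ## ... read from con ...
  close(con)

  numlines <- 2
  con <- pipe(paste("head -n", numlines, "datafile.txt"), "r")
  ## ... read from con ...
  close(con)
  unlink("datafile.txt")  # tidy up the file written by cat(str, file=...)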