Advanced R Programming

Introduction to R Packages

In R, functions and datasets are stored in packages. The contents of a package become available only after the package is loaded with the library() function.

To see which R packages are installed, run the following command (without arguments):

> library()

To load a particular installed package, pass the package name as the argument to the library() function, that is,

> library(MASS)

If the computer is connected to the internet and a required package is not installed, the install.packages() function can be used to install it. To update already installed packages, use the update.packages() function. The search() function shows which packages are currently attached to the search path.
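As a rough sketch of this workflow, the snippet below installs a package only when it is missing, attaches it, and then inspects the search path. MASS is used only because it appears in the earlier example. The install step assumes an internet connection and a configured CRAN mirror, and the update call is left commented out because it can take a while.

```r
pkg <- "MASS"  # example package name; substitute any package you need

# Install only if the package is not already available
# (assumes an internet connection and a configured CRAN mirror).
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

library(MASS)             # attach the package
attached <- search()      # packages and environments on the search path
print(attached)

# update.packages(ask = FALSE)  # uncomment to update all installed packages
```

After library(MASS) runs, the entry "package:MASS" should appear in the output of search().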

R packages can be classified as standard (base) packages and contributed packages. The standard (or base) packages are considered part of the R source code. They contain the basic functions that allow R to work, as well as datasets and standard statistical and graphical functions. The standard packages are automatically available in any R installation, that is, you don’t need to install them.
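One way to check which packages count as base packages on a given machine is to filter the installed packages by priority. This is a small sketch; the exact list depends on the R version in use.

```r
# Names of the base packages in this R installation
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# Packages attached automatically at start-up (may differ by configuration)
print(getOption("defaultPackages"))
```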

The contributed packages are written by many different authors. These packages implement specialized statistical methods or give access to datasets and hardware. Some contributed packages (the recommended packages) are distributed with every binary distribution of R; most are available for download from CRAN and other repositories such as Bioconductor.

R packages can have a namespace. Namespaces (i) allow the package writer to hide functions and data that are meant only for internal use, (ii) prevent functions from breaking when a user picks a name that clashes with one in the package, and (iii) provide a way to refer to an object within a particular package.

For example, in R the t() function is the transpose function. A user can define their own t() function. Namespaces prevent the user’s definition from taking precedence and breaking every function that tries to transpose a matrix.
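The behaviour can be tried out with a short sketch. The function pkg_like() below is a hypothetical stand-in for a package function: its enclosing environment is set to the base namespace by hand, which is an illustration device rather than something one would normally do.

```r
m <- matrix(1:6, nrow = 2)

# A user-defined t() masks base::t() for code typed by the user...
t <- function(x) "my own t()"
mine <- t(m)

# ...but the real transpose is still reachable through the namespace.
real <- base::t(m)

# Stand-in for a package function: its enclosing environment is the base
# namespace, so the t() it calls is resolved there, not in the user's workspace.
pkg_like <- function(x) t(x)
environment(pkg_like) <- asNamespace("base")
from_pkg <- pkg_like(m)

print(mine)
print(dim(real))
print(identical(from_pkg, real))

rm(t)  # remove the masking definition
```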

Two operators work with namespaces: the double colon operator :: and the triple colon operator :::. The double colon operator selects definitions from a particular namespace. For example, the t() function is available as base::t because it is defined in the base package. Only functions that are exported from a package can be retrieved with the double colon operator.

The triple colon operator acts like the double colon operator, but it also allows access to hidden objects. The getAnywhere() function can be used to search for an object across multiple packages.
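The following sketch contrasts the two operators using the stats package. The choice of sd() and t.test.default is only for illustration: sd() is exported, while t.test.default is an S3 method that stats keeps internal.

```r
# Exported function: reachable with the double colon operator
s <- stats::sd(c(2, 4, 4, 4, 5, 5, 7, 9))

# t.test.default is not exported, so it needs the triple colon operator
hidden <- stats:::t.test.default
exported <- "t.test.default" %in% getNamespaceExports("stats")

# getAnywhere() looks in all loaded namespaces and on the search path
found <- getAnywhere("t.test.default")
print(found$where)
```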

Note: Packages are often interdependent, and loading one package may cause other packages to be loaded automatically. The colon operators also cause automatic loading of the associated package. When a package with a namespace is loaded automatically, it is not added to the search list.

Debugging Tools in R Language

The built-in debugging tools (functions) in the R statistical computing environment are traceback(), debug(), browser(), trace(), and recover(). The purpose of the debugging tools is to help the programmer find unforeseen (unexpected) problems quickly and efficiently.

The R system has two main ways of reporting a problem in executing a function: a warning message and an error. The purpose of a warning is to tell the user (programmer) that “something unusual happened during the execution of the function, but the function was nevertheless able to execute to completion”. An error, in contrast, stops the execution of the function. Writing robust code (code that checks for input errors) is important for larger programs.

> log(-1)     # produces a warning and returns NaN
> # named print_message so that it does not mask base::message()
> print_message <- function(x) {
      if (x > 0)
          print("Hello")
      else
          print("Goodbye")
  }

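The call log(-1) results in a warning, not a fatal error: it still returns a value (NaN). In contrast, calling the function defined above with a missing value (NA) results in an error, because the condition x > 0 cannot be evaluated as TRUE or FALSE. When an error occurs, the first thing one should do is print the call stack (the sequence of function calls that led to the error). The traceback() function can be used for this; it prints the list of functions that were called before the error occurred. However, this can be uninteresting if the error occurred in a top-level function.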
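To see how R actually reports each kind of problem, the following sketch catches a warning and an error separately. The helper print_message() is a hypothetical name used here for a function like the one in the example above.

```r
# Hypothetical helper mirroring the example above
print_message <- function(x) {
  if (x > 0) print("Hello") else print("Goodbye")
}

# A warning: the call still completes and returns a value
res <- withCallingHandlers(
  log(-1),
  warning = function(w) {
    cat("warning caught:", conditionMessage(w), "\n")
    invokeRestart("muffleWarning")
  }
)
print(res)

# An error: execution of the function stops, so no value is returned
err <- tryCatch(
  print_message(NA),
  error = function(e) conditionMessage(e)
)
print(err)
```

Here res holds NaN, while err holds the text of the error raised because NA > 0 is neither TRUE nor FALSE.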

The traceback() Function

The traceback() function prints the call stack of the last uncaught error, that is, the sequence of function calls that led to the error, with the most recent call listed first.
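A small sketch for trying this out is given below. The functions f(), g(), and h() are hypothetical. Because traceback() only reports uncaught errors, it is shown as a comment for interactive use; the runnable part catches the error and inspects the call that raised it instead.

```r
# Three hypothetical nested functions; the error is raised in the innermost one
f <- function(x) g(x)
g <- function(x) h(x)
h <- function(x) stop("something went wrong in h()")

# Interactive use:
#   f(1)          # fails with an error
#   traceback()   # lists the calls that led to the error, innermost first

# Non-interactive check: catch the error and look at the call that raised it
err <- tryCatch(f(1), error = function(e) e)
failing_call <- deparse(conditionCall(err))
print(failing_call)
```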

The debug() Function

The debug() function takes a function as its argument and flags that function for debugging, so that one can step through it line by line to identify the specific location of a bug. To unflag a function, the undebug() function is used. A function flagged for debugging does not execute in the usual way; rather, each statement in the function is executed one at a time, and the user controls when each statement gets executed. After a statement is executed, the function suspends and the user is free to interact with the environment.
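The flagging and unflagging can be sketched as follows. The function center() is a hypothetical example, and the stepping itself is only described in a comment because it requires an interactive session.

```r
# Hypothetical function to step through
center <- function(x) {
  m <- mean(x)
  x - m
}

debug(center)                 # flag the function for debugging
flagged <- isdebugged(center)

# In an interactive session, calling center(1:5) now would open the browser;
# 'n' runs the next statement, 'c' continues, and 'Q' quits.

undebug(center)               # remove the flag
unflagged <- !isdebugged(center)

print(c(flagged = flagged, unflagged = unflagged))
```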

The browser() Function

The browser() function can be used to suspend execution of a function so that the user can browse the local environment.
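A minimal sketch of placing browser() inside a function is shown below. The function scale_to_unit() and its pause argument are hypothetical; the guard with interactive() is an added precaution so that the code also runs unattended.

```r
# Hypothetical function; 'pause' is an illustrative argument, not part of R
scale_to_unit <- function(x, pause = FALSE) {
  rng <- range(x)
  if (pause && interactive()) {
    browser()   # execution stops here; inspect x and rng, then type 'c'
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

out <- scale_to_unit(c(10, 15, 20))
print(out)
```

Calling scale_to_unit(c(10, 15, 20), pause = TRUE) in an interactive session suspends the function at the browser() line with x and rng available for inspection.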

The trace() Function

The trace() function is very useful for making minor modifications to functions “on the fly” without having to modify the functions and re-source them. It is especially useful if you need to track down an error that occurs in a base function.

> trace("mean", quote(if (any(is.nan(x))) { browser() }), print = FALSE)

The trace() function copies the original function code into a temporary location and replaces the original function with a new function containing the inserted code.
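The same mechanism can be tried without browser() by inserting code that merely counts calls. This is a sketch under the assumption that it is run at the top level of an R session; square() and counter are hypothetical names.

```r
# Hypothetical function to trace
square <- function(x) x^2

# Counter kept in an environment so the inserted code can update it
counter <- new.env()
counter$n <- 0

# Insert code at the start of square() without editing its definition
invisible(trace("square",
                tracer = quote(assign("n", counter$n + 1, envir = counter)),
                print = FALSE))
a <- square(3)
b <- square(4)
calls_while_traced <- counter$n

# Restore the original function
invisible(untrace("square"))
d <- square(5)
calls_after_untrace <- counter$n

print(c(traced = calls_while_traced, after = calls_after_untrace))
```

The counter increases only while the function is traced; after untrace() the original definition is back in place.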

The recover() Function

The recover() function can help to “jump up” to a higher level in the function call stack.

> options(error=recover)

The error option tells R what to do when a function must halt execution.
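Since this setting changes how every later error is handled, it can be convenient to remember the previous value and restore it afterwards. A small sketch:

```r
# Switch on recover() as the error handler, remembering the old setting
old <- options(error = recover)
recover_active <- identical(getOption("error"), recover)

# In an interactive session, an error now shows a numbered list of the active
# function calls; entering a number browses that call's environment, 0 exits.

# Restore the previous error handler
options(old)
restored <- identical(getOption("error"), old$error)

print(c(recover_active = recover_active, restored = restored))
```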

Namespaces in R Language

In the R language, packages can have namespaces, and currently all of the base and recommended packages have one except the datasets package. Understanding the use of namespaces is vital if one plans to submit a package to CRAN, because CRAN requires that the package plays nicely with other packages submitted to CRAN.

Namespaces ensure that other packages will not interfere with your code and that the package works regardless of the environment in which it is run. In the R language, the namespace environment is the internal interface of the package. It includes all objects in the package, both exported and non-exported, to ensure that every function can find every other function in the package.
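The difference between the namespace environment and the exported interface can be inspected directly. The sketch below uses the stats package, with t.test.default chosen as an example of an object that is present in the namespace but not exported.

```r
ns <- asNamespace("stats")                 # the namespace environment

all_objs <- ls(ns, all.names = TRUE)       # exported and internal objects
exported <- getNamespaceExports("stats")   # the public interface only

# t.test.default lives in the namespace but is not exported
internal_example <- "t.test.default"
print(internal_example %in% all_objs)
print(internal_example %in% exported)
print(c(all = length(all_objs), exported = length(exported)))
```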

For example, plyr and Hmisc both provide a function named summarize(). If the plyr package is loaded first and then Hmisc, summarize() will refer to the Hmisc version. However, if the packages are loaded in the opposite order, summarize() will refer to the plyr version.

To avoid confusion, one can explicitly refer to the specific function, for example,

> Hmisc::summarize()

and

> plyr::summarize()

Now, the order in which the packages are loaded does not matter.

Namespaces do three things:

  • Namespaces allow the package writer to hide functions and data that are meant only for internal use,
  • Namespaces prevent functions from breaking when a user (or other package writers) picks a name that clashes with one in the package, and
  • Namespaces provide a way to refer to an object within a particular package

Namespace Operators

In R language, two operators work with namespaces.

  • Double-Colon Operator
    The double-colon operator :: selects definitions from a particular namespace. The transpose function t() will always be available as base::t because it is defined in the base package. Only functions that are exported from the package can be retrieved in this way.
  • Triple-Colon Operator
    The triple-colon operator ::: acts like the double-colon operator but also allows access to hidden objects. Users are more likely to use the getAnywhere() function, which searches multiple packages.

Packages are often interdependent, and loading one may cause others to be automatically loaded. The colon operators will also cause automatic loading of the associated package. When packages with namespaces are loaded automatically they are not added to the search list.
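The difference between a loaded namespace and an attached package can be checked with a short sketch. It uses the tools package, which ships with R, and assumes that tools has not been attached with library() earlier in the session.

```r
# Calling a function through :: loads the tools namespace if necessary...
ext <- tools::file_ext("report.csv")

# ...but does not put the package on the search list
tools_loaded   <- "tools" %in% loadedNamespaces()
tools_attached <- "package:tools" %in% search()

print(ext)
print(c(loaded = tools_loaded, attached = tools_attached))
```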
