Formal Arguments in R: Quick Guide

Introduction to Formal Arguments in R

Formal arguments in R are the named parameters listed in a function’s definition, that is, the names that appear between the parentheses after the function keyword. They act as placeholders for the values supplied when the function is called. The formals() function returns the formal arguments of a function as an object of class pairlist, which can be thought of as something similar to a list, with one important difference:

is.null(pairlist())
is.null(list())

That is, a pairlist of length zero is NULL, while a list of length zero is not.
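As a minimal sketch of what formals() returns, the hypothetical function add_two() below (not part of the original text) has one argument without a default and one with a default:

```r
# add_two() is a made-up function used only for illustration
add_two <- function(x, y = 2) {
  x + y
}

fmls <- formals(add_two)

names(fmls)   # names of the formal arguments: "x" "y"
class(fmls)   # "pairlist"
fmls$y        # default value of y: 2

# A function without arguments has no formals, so the result is NULL
is.null(formals(function() NULL))   # TRUE
```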

Specifying Formal Arguments Positions

When a function is called, its arguments can be supplied by position or by name, and positional matching can be mixed with matching by name. The following calls are equivalent.

mean(x = 1:5, trim = 0.1)
mean(1:5, trim = 0.1)
mean(x = 1:5, 0.1)
mean(1:5, 0.1)
mean(trim = 0.1, x = 1:5)
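
The matching rule can be checked on a small user-defined function. In this sketch, power() is a hypothetical helper; arguments given by name are matched first, and the remaining ones are then matched by position:

```r
# power() is a made-up function used only for illustration
power <- function(base, exponent = 2) {
  base^exponent
}

results <- c(
  power(base = 3, exponent = 2),
  power(3, exponent = 2),
  power(base = 3, 2),
  power(3, 2),
  power(exponent = 2, base = 3),
  power(exponent = 2, 3)   # 3 fills the only unmatched formal, base
)

all(results == 9)   # TRUE
```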

Functions Formals Default Values

A formal argument may also be written as symbol = default. Unless the caller specifies otherwise, the argument then takes its default value. For example, the default method of mean() also has a third argument, na.rm, that defaults to FALSE; as a result, passing a vector with NA values to mean() returns NA.

mean(c(1, 2, NA))

By specifying na.rm = TRUE, we get the mean of all non-missing elements of the vector x:

mean(c(1, 2, NA), na.rm = TRUE)

We can create a version of mean() that defaults na.rm to TRUE by simply modifying the formals of mean.default():

formals(mean.default)$na.rm <- TRUE
mean(c(1, 2, NA))

Now we have a copy of mean.default() in our globalenv:

exists("mean.default", envir = globalenv())

Also, notice that the copy keeps the base namespace as its enclosing environment:

environment(mean.default)
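
Because the modified copy masks the original only while it exists, the change is easy to undo. A minimal sketch, assuming the code is run at the top level of an R session:

```r
formals(mean.default)$na.rm <- TRUE   # creates a modified copy in the current environment
with_copy <- mean(c(1, 2, NA))        # the copy is used: 1.5

rm(mean.default)                      # remove the copy; the base version is untouched
without_copy <- mean(c(1, 2, NA))     # NA again

with_copy
without_copy
```

A wrapper such as function(x, ...) mean(x, na.rm = TRUE, ...) gives the same default without masking a base function.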

The ... Argument in a Function

The ... argument of a function is special and can contain any number of symbol = value arguments. The ... argument is transformed by R into a list that is simply added to the formal list:

h <- function(x, ...) {
  0
}

formals(h)
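
Inside the function body, the contents of ... can be captured with list(...). A short sketch, using a hypothetical function show_extras():

```r
# show_extras() is a made-up function used only for illustration
show_extras <- function(x, ...) {
  extras <- list(...)   # collect everything matched by ...
  names(extras)         # names of the extra arguments
}

names(formals(show_extras))      # "x" "..."
show_extras(1, a = 2, b = "z")   # "a" "b"
```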

The ... argument can be used if the number of arguments is unknown. Suppose one wants to define a function that counts the number of rows of any given number of data frames. One can write:

count_rows <- function(...) {
  dfs <- list(...)    # collect all data frames passed to the function
  lapply(dfs, nrow)   # number of rows of each one
}

count_rows(airquality, cars)

By effectively using formal arguments in R Language, one can create reusable and adaptable functions that make the R code more concise and efficient.

https://itfeature.com

https://gmstat.com

Customize R Session

The .Rprofile file is used to customize the R session every time you start it up. The R profile script (.Rprofile) can be created in the home directory, and it is executed whenever a new R session starts. One can use it to pre-load libraries, set global options, or define custom functions. In this article, we will discuss how to customize R sessions.

Want to make R work your way? This guide covers how to customize an R session for maximum efficiency and comfort. It is for power users looking to automate session setup, R users who want a more efficient workflow, and team leads who need consistent R environments across projects. The reader of this post will learn how to:

Set startup options (default working directory, memory limits)
Customize your .Rprofile for automatic configurations
Manage environment variables for consistent behavior
Personalize RStudio settings (themes, shortcuts, pane layouts)
Automate repetitive tasks with .First() and .Last() functions

What is .Rprofile?

The .Rprofile is an R script that runs automatically at startup, letting the user

  • Set default options
  • Load frequently used packages
  • Define custom functions
  • Configure environment variables

Customize an R Session

The R profile script (.Rprofile) file can be used to

  1. Change R’s defaults
  2. Define handy command-line functions
  3. Automatically load your favorite packages

On start-up, R looks for profile files in the following places:

1) R Home Directory: R.home() returns the directory in which R is installed. The site-wide profile, Rprofile.site, lives in its etc subdirectory.
2) User’s Home Directory: path.expand("~") returns the user’s home directory.
3) R Current Working Directory: getwd() returns R’s current working directory.

Only one user-level .Rprofile is run: a file in the current working directory takes precedence over the one in the home directory.
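
To see which of these files exist on a given machine, the locations can be checked directly. This is a small sketch; the names site, project, and user are labels chosen here for readability:

```r
candidates <- c(
  site    = file.path(R.home("etc"), "Rprofile.site"),
  project = file.path(getwd(), ".Rprofile"),
  user    = file.path(path.expand("~"), ".Rprofile")
)

found <- file.exists(candidates)    # logical vector, one entry per location
names(found) <- names(candidates)   # label each result
found
```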

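A typical .Rprofile might contain settings such as the following:

# Set default CRAN mirror
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Limit printed output length
options(max.print = 1000)

# Customize error behavior (utils:: is needed because only base is loaded when .Rprofile runs)
options(error = utils::recover)
options(warn = 1) # Print warnings as they occur

# Set default working directory
setwd("~/my_default_project")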

Modifying R Default Settings

One can make a few minor modifications to R’s default settings. For example, the default prompt is >, and numbers are printed in the console with up to seven significant digits. The following settings will:

  1. Replace the default R prompt
  2. Reduce the number of printed significant digits from 7 to 4. Note: This does not reduce the precision with which these numbers are internally processed and stored.
  3. Suppress the stars that indicate the significance of p-values at the conventional levels (show.signif.stars = FALSE).

options(prompt = "Imdad> ", digits = 4, show.signif.stars = FALSE)
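
options() invisibly returns the previous values of the options it changes, which makes it easy to try a setting and restore it afterwards. A short sketch that also illustrates the note about precision:

```r
x <- 1 / 3

old <- options(digits = 4)         # old holds the previous value of digits
print(x)                           # printed with 4 significant digits
unchanged <- identical(x, 1 / 3)   # TRUE: the stored value is not rounded

options(old)                       # restore the previous setting
restored <- getOption("digits") == old$digits
restored                           # TRUE
```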

Edit Profile using usethis Package

  • Use the usethis::edit_r_profile() function (from the usethis package) to easily edit your .Rprofile.
  • Remember not to include sensitive information (like API keys) directly in the script. Consider using a separate .Renviron file for such cases.
  • If you have both a project-specific .Rprofile and a user-level one, source the user profile at the beginning of your project’s .Rprofile.
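
The last two points can be sketched as follows. MY_API_KEY, get_api_key(), and source_user_profile() are hypothetical names used only for illustration; .Renviron itself holds plain NAME=value lines rather than R code:

```r
# Read a secret from an environment variable (for example one set in .Renviron)
get_api_key <- function(var = "MY_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set")
  }
  key
}

# Intended for the top of a project-level .Rprofile:
# run the user-level profile first, if there is one
source_user_profile <- function(path = path.expand("~/.Rprofile")) {
  found <- file.exists(path)
  if (found) {
    source(path)
  }
  invisible(found)   # TRUE if a user profile was sourced
}
```

Keeping the key in an environment variable means the script can be shared or committed without exposing the secret.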

Customizing RStudio

The appearance of RStudio can be customized through its Global Options:

  1. Editor Theme: Tools > Global Options > Appearance
  2. Pane Layout: Tools > Global Options > Pane Layout
  3. Fonts and Zoom: Tools > Global Options > Appearance

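Keyboard shortcuts are another popular customization of RStudio. They can be changed interactively via Tools > Modify Keyboard Shortcuts. The snippet below sketches an options-based setup in .Rprofile; the option name is not among RStudio’s documented settings, so confirm that your RStudio version honours it before relying on it: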

# Add to .Rprofile to set shortcuts
options(rstudio.keyboard.shortcuts = list(
  "runCurrentLine" = "Ctrl+Enter",
  "knitDocument" = "Ctrl+Shift+K"
))

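RStudio Addins can also be used to extend RStudio. An addin is an R function that lives in a package and is registered in the package’s inst/rstudio/addins.dcf file: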

# In R/addins.R
#' @export
helloAddin <- function() {
  message("Hello from your custom addin!")
}

Advanced Session Customization

The .First() and .Last() functions allow more advanced session customization: .First() runs at the start of a session and .Last() runs at the end.

.First <- function() {
  # Runs at startup
  message("Welcome back ", Sys.getenv("USER"), "!")
  if(interactive()) {
    library(tidyverse)
    library(here)
  }
}

.Last <- function() {
  # Runs at session end
  if(interactive()) {
    message("\nGoodbye at ", date(), "\n")
    save.image(".workspace.RData")
  }
}

Summary

In summary, the .Rprofile script allows the user to customize the R environment by setting options, loading libraries, and defining functions that should be available in every session.


Lexical Scoping in R Language

Introduction to Lexical Scoping

Scoping in the R language is the set of rules that govern how R looks up the value of a symbol. For example:

x <- 10

In this example, scoping is the set of rules that R applies to go from symbol $x$ to its value 10.

Types of Scoping

R has two types of scoping:

  1. Lexical scoping: implemented automatically at the language level
  2. Dynamic scoping: used in select functions to save typing during interactive analysis.

Lexical scoping looks up symbol values based on how functions were nested when they were created, not how they are nested when they are called. To figure out where the value of a variable will be looked up, you only need to look at the function’s definition.
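
A small sketch of this rule, using two hypothetical functions: inner_fn() is defined at the top level but called from inside outer_fn(), which has its own x.

```r
x <- "top level"

inner_fn <- function() {
  x                        # free variable: looked up where inner_fn was defined
}

outer_fn <- function() {
  x <- "inside outer_fn"   # local x, visible only inside outer_fn
  inner_fn()
}

outer_fn()   # "top level"
```

Under dynamic scoping, the call would instead see the x of its caller.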

Basic Principles of Lexical Scoping in R Language

There are four basic principles behind R’s implementation of lexical scoping: name masking, functions vs variables, a fresh start, and dynamic lookup.

Name Masking

The following example will illustrate the basic principle of lexical scoping

f <- function(){
      x <- 1
      y <- 2
      c(x, y)
}

f()

If a name is not defined inside a function, R will look one level up.

x <- 2
g <- function(){
       y <- 1
       c(x,y)
}
g()

The same rules apply if a function is defined inside another function: look inside the current function, then where the function was defined, and so on, all the way up to the global environment, and then on to other loaded packages.

x <- 1
h <- function() {
  y <- 2
  i <- function() {
    z <- 3
    c(x, y, z)
  }
  i()
}

h()
rm(x, h)

The same rules apply to closures, functions created by other functions. The following function, j(), returns a function.

j <- function(x) {
  y <- 2
  function() {
    c(x, y)
  }
}
k <- j(1)
k()
rm(j, k)

How does R know what the value of y is after j() has returned? It works because k preserves the environment in which it was defined and because that environment includes the value of y.

Functions vs Variables

Finding functions works the same way as finding variables:

l <- function(x) {
  x + 1
}
m <- function() {
  l <- function(x) {
    x * 2
  }
  l(10)
}
m()

If you are using a name in a context where it’s obvious that you want a function (e.g. f(3)), R will ignore objects that are not functions while it is searching. In the following example, n takes on a different value depending on whether R is looking for a function or a variable.

n <- function(x) {
  x / 2
}
o <- function() {
  n <- 10
  n(n)   # the call uses the function n defined above; the argument is the local value 10
}
o()

Fresh Start

Consider the following questions: (i) What happens to the values between invocations of a function? (ii) What will happen the first time you run this function? (iii) What will happen the second time? (If you have not seen exists() before: it returns TRUE if there is a variable of that name, and FALSE otherwise.)

j <- function() {
  if (!exists("a")) {
    a <- 1
  } else {
    a <- a + 1
  }
  print(a)
}
j()

From the above example, you might be surprised that it returns the same value, 1, every time. This is because every time a function is called, a new environment is created to host its execution. A function has no way to tell what happened the last time it was run; each invocation is completely independent (but see mutable states).
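
The parenthetical remark about mutable state can be illustrated with a closure. This is a sketch, not part of the original example: make_counter() is a hypothetical helper whose returned function updates a variable in its enclosing environment with <<-, so the value survives between calls.

```r
# make_counter() is a made-up function used only for illustration
make_counter <- function() {
  count <- 0              # lives in the environment created by this call
  function() {
    count <<- count + 1   # modify count in the enclosing environment
    count
  }
}

counter <- make_counter()
first  <- counter()   # 1
second <- counter()   # 2
```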

Dynamic Lookup

Lexical scoping determines where to look for values, not when to look for them. R looks for values when the function is run, not when it’s created. This means that the output of a function can be different depending on objects outside its environment:

f <- function() {
       x
}
x <- 15
f()

x <- 20
f()

You generally want to avoid this behavior because it means the function is no longer self-contained.
One way to detect this problem is the findGlobals() function from codetools. This function lists all the external dependencies of a function:

f <- function() x + 1
codetools::findGlobals(f)


Another way to try and solve the problem would be to manually change the environment of the function to the emptyenv(), an environment that contains absolutely nothing:

environment(f) <- emptyenv()

This doesn’t work because R relies on lexical scoping to find everything, even the + operator. It’s never possible to make a function completely self-contained because you must always rely on functions defined in base R or other packages.
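
The failure can be observed directly. In this sketch, f is written without braces so that + is the first function R has to look up, and the error is caught with tryCatch() so that the message can be inspected:

```r
f <- function() x + 1
environment(f) <- emptyenv()

# Calling f() fails: even the + function cannot be found from the empty environment
msg <- tryCatch(f(), error = function(e) conditionMessage(e))
msg
```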

Since all standard operators in R are functions, you can override them with your alternatives.

`(` <- function(e1) {
  if (is.numeric(e1) && runif(1) < 0.1) {
    e1 + 1
  } else {
    e1
  }
}
replicate(50, (1 + 2))

A pernicious bug is introduced: 10% of the time, 1 will be added to any numeric calculation inside parentheses. This is another good reason to regularly restart with a clean R session!
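
Short of restarting, the override can be undone by removing the masking definition. A deterministic sketch, assuming it is run at the top level of a session (this variant always adds 1, so the effect is easy to see):

```r
`(` <- function(e1) {
  e1 + 1        # deliberately wrong: every parenthesized value is incremented
}
broken <- (1)   # 2, because the override is found before the base version

rm("(")         # delete the override from the current environment
fixed <- (1)    # 1, the base behaviour is back
```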

Bound Symbol or Variable

If a symbol is bound to a function argument, it is called a bound symbol or variable. If a symbol is not bound to a function argument, it is called a free symbol or variable.

If a free variable is looked up in the environment in which the function is called, the scoping is said to be dynamic. If a free variable is looked up in the environment in which the function was originally defined, the scoping is said to be static or lexical. R, like Scheme and Common Lisp, is lexically scoped, whereas in S and S-PLUS free variables are resolved among the global variables.

y = 20
foo = function() {
  y = 10  # local to foo; captured by the closure returned below
  function(x) {
    x + y
  }
}
bar = foo()

foo() returns an anonymous function, that is, a function that has no name.

bar, like foo, is a function in the global environment. The anonymous function that computes $x + y$, however, is created in the environment of the foo() call, not in the global environment. foo() returns this function, which is then bound to bar in the global environment.
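
Calling bar() shows which y is used. A brief sketch repeating the example above (written with <- rather than =, which is equivalent here):

```r
y <- 20
foo <- function() {
  y <- 10
  function(x) {
    x + y
  }
}
bar <- foo()

result <- bar(5)   # 15: y is found in the environment of the foo() call, not the global one
captured_y <- get("y", envir = environment(bar))   # 10
```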
