Substitution Modes

John Mount

2023-08-19

The substitution modes

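wrapr::let() now has three substitution implementations: language substitution (subsMethod='langsubs'), string substitution (subsMethod='stringsubs'), and substitute substitution (subsMethod='subsubs').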

The semantics of the three methods can be illustrated by showing the effects of substituting the variable name “y” for “X” and the function “sin” for “F” in the somewhat complicated block of statements:

  {
    d <- data.frame("X" = "X", X2 = "XX", d = X*X, .X = X_)
    X <- list(X = d$X, X2 = d$"X", v1 = `X`, v2 = ` X`, F(1:2))
    X$a
    "X"$a
    X = function(X, ...) { X + 1 }
  }

This block contains a lot of different examples and corner cases.

Language substitution (subsMethod='langsubs')

library("wrapr")

let(
  c(X = 'y', F = 'sin'), 
  {
    d <- data.frame("X" = "X", X2 = "XX", d = X*X, .X = X_)
    X <- list(X = d$X, X2 = d$"X", v1 = `X`, v2 = ` X`, F(1:2))
    X$a
    "X"$a
    X = function(X, ...) { X + 1 }
  },
  eval = FALSE, subsMethod = 'langsubs')
## {
##     d <- data.frame(y = "X", X2 = "XX", d = y * y, .X = X_)
##     y <- list(y = d$y, X2 = d$y, v1 = y, v2 = ` X`, sin(1:2))
##     y$a
##     "X"$a
##     y = function(y, ...) {
##         y + 1
##     }
## }

Notice that the substitution replaced all symbol-like uses of “X”, and only these. This includes some uses that were quoted: the argument name "X" in data.frame(), the d$"X" access, and the back-ticked `X`.
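The idea behind language substitution can be sketched in base R by walking the parse tree and replacing only symbols and argument names. This is a minimal illustration under stated assumptions, not wrapr's actual implementation: swap_symbols() is a hypothetical helper, and it ignores cases the real method handles or may handle differently, such as d$"X", function formals, and empty arguments.

```r
# Hypothetical helper (not part of wrapr): replace symbols and argument
# names in a quoted expression according to a named character vector.
swap_symbols <- function(e, alias) {
  if (is.name(e)) {
    key <- as.character(e)
    if (key %in% names(alias)) {
      return(as.name(alias[[key]]))
    }
    return(e)
  }
  if (is.call(e)) {
    # Recurse into the function position and every argument.
    out <- as.call(lapply(as.list(e), swap_symbols, alias = alias))
    # Argument names are not symbols in the tree, so rename them separately.
    nms <- names(e)
    if (!is.null(nms)) {
      hit <- nms %in% names(alias)
      nms[hit] <- unname(alias[nms[hit]])
      names(out) <- nms
    }
    return(out)
  }
  # Constants (strings, numbers, ...) are left untouched.
  e
}

expr <- quote(list(X = d$X, s = "X", .X = X_, F(1:2)))
out <- swap_symbols(expr, c(X = "y", F = "sin"))
print(out)
# list(y = d$y, s = "X", .X = X_, sin(1:2))
```

Because the walk only ever compares whole symbols, names such as .X and X_ and the string constant "X" are never candidates for replacement.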

String substitution (subsMethod='stringsubs')

let(
  c(X = 'y', F = 'sin'), 
  {
    d <- data.frame("X" = "X", X2 = "XX", d = X*X, .X = X_)
    X <- list(X = d$X, X2 = d$"X", v1 = `X`, v2 = ` X`, F(1:2))
    X$a
    "X"$a
    X = function(X, ...) { X + 1 }
  },
  eval = FALSE, subsMethod = 'stringsubs')
## expression({
##     d <- data.frame(y = "y", X2 = "XX", d = y * y, .y = X_)
##     y <- list(y = d$y, X2 = d$y, v1 = y, v2 = ` y`, sin(1:2))
##     y$a
##     "y"$a
##     y = function(y, ...) {
##         y + 1
##     }
## })

Notice that string substitution has a few flaws: it rewrote variable names in which “X” merely follows a word boundary (.X became .y, and ` X` became ` y`). Substitution also occurred inside string constants ("X" became "y"), which, as we have seen with quoted names, can sometimes be considered a good thing.
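The over-reach is what one would expect from a word-boundary regular expression applied to source text. The following base R sketch reproduces the same pattern of hits and misses; it assumes a plain gsub() with \b anchors and is only an illustration, not a copy of wrapr's code.

```r
# Source text containing the same kinds of corner cases as the block above.
src <- 'list(.X = X_, v2 = ` X`, s = "X", X2 = "XX")'

# Replace "X" wherever it sits between two word boundaries.
# "." " " and quotes are non-word characters, so .X, ` X` and "X" all match;
# "_" and digits are word characters, so X_ and X2 do not.
out <- gsub("\\bX\\b", "y", src, perl = TRUE)
cat(out, "\n")
# list(.y = X_, v2 = ` y`, s = "y", X2 = "XX")
```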

These situations are all avoidable as both the code inside the let-block and the substitution targets are chosen by the programmer, so they can be chosen to be simple and mutually consistent. We suggest “ALL_CAPS” style substitution targets as they jump out as being macro targets. But, of course, it is better to have stricter control on substitution.
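As a small illustration of the “ALL_CAPS” convention, the sketch below uses the default substitution method and evaluates the block. The data frame d and the column names x and x_plus_1 are made up for this example.

```r
library("wrapr")

d <- data.frame(x = c(1, 2, 3))

# INPUTCOL and RESULTCOL are stand-ins that are easy to spot in the code;
# the actual column names are supplied as strings in the mapping.
res <- let(
  c(INPUTCOL = "x", RESULTCOL = "x_plus_1"),
  {
    d$RESULTCOL <- d$INPUTCOL + 1
    d
  })

print(res$x_plus_1)
```

Since no other name in the block resembles INPUTCOL or RESULTCOL, the choice of substitution method matters much less here than in the corner-case block.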

Think of the language substitution implementation as a lower bound on a perfect implementation (cautious, with a few corner cases to get coverage) and string substitution as an upper bound on a perfect implementation (aggressive, with a few over-reaches).

Substitute substitution (subsMethod='subsubs')

let(c(X = 'y', F = 'sin'), 
    {
      d <- data.frame("X" = "X", X2 = "XX", d = X*X, .X = X_)
      X <- list(X = d$X, X2 = d$"X", v1 = `X`, v2 = ` X`, F(1:2))
      X$a
      "X"$a
      X = function(X, ...) { X + 1 }
    },
    eval = FALSE, subsMethod = 'subsubs')
## {
##     d <- data.frame(X = "X", X2 = "XX", d = y * y, .X = X_)
##     y <- list(X = d$y, X2 = d$X, v1 = y, v2 = ` X`, sin(1:2))
##     y$a
##     "X"$a
##     y = function(X, ...) {
##         y + 1
##     }
## }

Notice that base::substitute() doesn’t rewrite the left-hand sides of argument bindings (or the names of function formals). This is why I originally didn’t consider using this implementation. Rewriting the left-hand sides of such bindings is critical in expressions such as dplyr::mutate(RESULTCOL = INPUTCOL + 1). Also, base::substitute() doesn’t special-case the d$"X" situation (but that really isn’t very important).
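The left-hand-side limitation can be seen with base R alone. The sketch below applies substitute() to a quoted call shaped like the mutate() example; the replacement names res and x are made up for illustration, and the call is only quoted, never evaluated, so dplyr is not needed.

```r
# A quoted call; mutate() is never run, only rewritten.
expr <- quote(mutate(d, RESULTCOL = INPUTCOL + 1))

# Map both placeholder names to new symbols.
env <- list(RESULTCOL = as.name("res"), INPUTCOL = as.name("x"))

# do.call() lets substitute() work on the stored expression
# rather than on the literal symbol `expr`.
out <- do.call(substitute, list(expr, env))
print(out)
# mutate(d, RESULTCOL = x + 1)
```

INPUTCOL is a symbol in the parse tree and is replaced, while RESULTCOL is only an argument name and is left as it was.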

Conclusion

wrapr::let(), when used prudently, is a safe and powerful tool.