Question

Why do I get different outputs for calling "replicate" with and without pipe "%>%" in R?

I want to generate a character vector with 20 elements, each of which is a random string.

So I generate a random string with the following code:

sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) %>% paste0(collapse = "")

and I generate the vector with this:

replicate(n = 20, expr = sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) %>% paste0(collapse = ""))

(In other words, I pass the first code chunk to replicate as its expr argument.) The output is the following, which is exactly what I want:

 [1] "DyHpcnruLfKHOsvy"                                                                          
 [2] "lQqwOkKD"                                                                                  
 [3] "XddKDCtOJqZHxAgHqreDwWSBQkDCBdwFclHMFhuCzXXwb"                                             
 [4] "bmuiUUdsnHJsxIEyeLClvbLGBbfEgXFVsScrWxiTcZxPNTTwAJGZDVgJzDKUG"                             
 [5] "yBDZYKxPXXGFwmlPWNMQuUJsfRXsBoQhuVXnYfMNkHFpmAgSRafGBzkKu"                                 
 [6] "LLuNdUoayRwtLqRJKrnxERpMmlntghfkjUqPkxMMubUozsLbPOFESOqtAWKoojOrttVCQlIYkyGRglr"           
 [7] "KuydhJOVZNNDrrMLDeWda"                                                                     
 [8] "ItwNtPGIQDsqCRBoVUCkClgHzCUiYRAiHIQRqpGBpfzRXgmWFArRtmnWhtciPgLlqrVs"                      
 [9] "BqqEjCpUHLzOlsmqiAOchAKysbtUCzce"                                                          
[10] "JJzdyoFqFnZOeLAABK"                                                                        
[11] "bakCawEaOkMspowlFUsAMjAbMxNxguHAHLomiGtenMuENNuPElGwqdqNdVS"                               
[12] "OEtuDejCDVfDwGjKbjWSCsicrRmqGGpWyqMfaGGPNkJhJMbgUtkjbcwitLqVojCERLxTWaCNFRltxgiwdJAbUtoksW"
[13] "crVVVzIyWbAlfyFndgipAZZJLcMqtEtZtBpbisbyAUWsKTJLiwyNvyVPPuoxOkafEeLARYDEOqEoh"             
[14] "QgAZkacEMBbGebUCToXFTLSqqlYhqpbdsPYvIrwJhfpgDcPiJlfiATEEDrYahyXgxLEVXvsbQ"                 
[15] "jHSYxhskNMxYnnbGQLQgTJKsuRXEeDpiPlonDABrXxivwepNNvZGrugSfHoMi"                             
[16] "CdCDpUjlUyLwiujvcLcxpNZjtxUMTMVYxnjCEQqbJQOXZJeTLHXQRbHaIsOIDmKeyNainhphvwEAHscCAhOjUsqQe" 
[17] "XvoelRDEYrxMfffBjRzmFPrLRjayCLRFVpWxzjcIxkRZQiPutModt"                                     
[18] "FNjvlFdyrRTVDWvnXVWjckCDFUkxnbUfkqYDNIPZVMOfjUejEKiuhhTXdi"                                
[19] "qsQDQtaVyoNVHtNNltPqLEuNGDxiscsOsXZfhaUNdBCoSwcouhhpwFhfCcqYPPFXrnjKqnlEknuKsaWVizaIacMiT" 
[20] "ykGqCILONPhzEABAuNjtEjzXxeFLnybwZdVEbDdzDQoDKmIiLvZNhoEEEvYJS" 

But here is the problem: when I try to pipe the first code chunk into replicate using magrittr's pipe (%>%), I get the same string repeated 20 times!

This is how I do it:

sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) %>% paste0(collapse = "") %>% replicate(n = 20, expr = ., simplify = "array")

I also tried wrapping the first chunk in parentheses:

(sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) %>% paste0(collapse = "")) %>% replicate(n = 20, expr = ., simplify = "array")

But, in both cases, this is the undesirable output:


 [1] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [2] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [3] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [4] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [5] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [6] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [7] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [8] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [9] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[10] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[11] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[12] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[13] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[14] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[15] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[16] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[17] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[18] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[19] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[20] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"

What do you think the problem is here?


Solution


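The two calls differ in how many times sample is actually evaluated. magrittr's %>% evaluates its left-hand side once and passes the resulting value to the right-hand side, whereas replicate re-evaluates whatever expression it is given n times.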

In the first call there is only one operation: replicate evaluates its expression, here the random sample plus paste0, 20 times. In other words, replicate itself runs sample 20 times.

replicate(
  n = 20, 
  expr = sample(
    x = c(letters, LETTERS), size = sample.int(100, 1), replace = T
  ) %>% paste0(collapse = "")
)
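To see that replicate re-evaluates its expr argument on every iteration, one can count the evaluations. The sketch below uses a made-up helper, tick(), and a counter environment; both names are hypothetical and not part of the original code.

```r
# Hypothetical helper for illustration: counts how often it has been called.
counter <- new.env()
counter$n <- 0
tick <- function() {
  counter$n <- counter$n + 1
  counter$n
}

# replicate() receives the unevaluated call tick() and evaluates it 5 times,
# so each element of the result comes from a fresh call.
direct <- replicate(n = 5, expr = tick())
print(direct)
print(counter$n)
```

After this runs, counter$n is 5, one evaluation per replication, which is why the first call yields 20 different strings.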

In the second call there are two operations: the random sample is run once, and its result is then passed to replicate, which simply repeats (i.e., replicates) that already-computed value 20 times. replicate never runs sample itself.

sample(
  x = c(letters, LETTERS), 
  size = sample.int(100, 1), 
  replace = T
) %>% replicate(n = 20, expr = ., simplify = "array")
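Roughly speaking, the piped version behaves like the following base R code, where the left-hand side is evaluated once and stored before replicate ever sees it. This is only a sketch of the observable behavior, not of magrittr's internals, and the names one_string and repeated are made up for illustration.

```r
# Step 1: the left-hand side is evaluated once, yielding a plain character value.
one_string <- paste0(
  sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = TRUE),
  collapse = ""
)

# Step 2: replicate() only sees the stored value, so it repeats it unchanged.
repeated <- replicate(n = 20, expr = one_string)

# All 20 elements are the same string.
print(length(unique(repeated)))
```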

Interestingly, this behavior differs from that of base R's native pipe |>, introduced in R 4.1.0, with its right-hand-side placeholder _, added in R 4.2.0:

sample(
  x = c(letters, LETTERS), 
  size = sample.int(100, 1), 
  replace = T
) |> replicate(n = 20, expr = _, simplify = "array")

Here |> is purely a syntactic rewrite into a nested call, so sample ends up inside replicate and is run 20 times, much like your first call. deparse(substitute(...)), a technique shown by @Dirk, reveals the rewrite:

deparse(
  substitute(
    sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) |> 
      replicate(n = 20, expr = _, simplify = "array")
  )
)
[1] "replicate(n = 20, expr = sample(x = c(letters, LETTERS), size = sample.int(100, "
[2] "    1), replace = T), simplify = \"array\")"  
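If the goal is simply to keep the string-building step readable without relying on pipe semantics, one possible approach is to wrap it in a small function and let replicate call that function. The name random_string and its max_len parameter are assumptions introduced here for illustration; they do not appear in the question.

```r
# Hypothetical helper: builds one random string of 1 to max_len letters.
random_string <- function(max_len = 100) {
  n_chars <- sample.int(max_len, 1)
  chars <- sample(x = c(letters, LETTERS), size = n_chars, replace = TRUE)
  paste0(chars, collapse = "")
}

# replicate() evaluates the call random_string() anew on each iteration,
# so each element is drawn independently.
strings <- replicate(n = 20, expr = random_string())
print(strings)
```

Because the expression handed to replicate is a function call rather than a precomputed value, each of the 20 elements is generated separately.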
2024-06-30
Parfait