statar

Matthieu Gomez

2023-08-19

sum_up = summarize

sum_up prints detailed summary statistics. It corresponds to Stata's summarize.

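library(dplyr)   # tibble(), group_by(), %>%
library(statar)  # sum_up(), tab()

N <- 100
df <- tibble(
  id = 1:N,
  v1 = sample(5, N, TRUE),
  v2 = sample(1e6, N, TRUE)
)
# summary statistics for the variables in df
sum_up(df)
# only the variables whose names start with "v"; d = TRUE asks for detailed output
df %>% sum_up(starts_with("v"), d = TRUE)
# summary statistics within each group of v1
df %>% group_by(v1) %>% sum_up()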
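
For comparison, the kind of by-group table that Stata's summarize reports (observations, mean, standard deviation, min, max) can be written out by hand in plain dplyr. This is only a rough sketch for one variable, not how sum_up is implemented; the result name by_group and the column names obs, missing, v2_mean, v2_sd, v2_min and v2_max are made up for illustration.

```r
library(dplyr)

N <- 100
df <- tibble(
  id = 1:N,
  v1 = sample(5, N, TRUE),
  v2 = sample(1e6, N, TRUE)
)

# Hand-written summary of v2 within each group of v1 (illustrative names)
by_group <- df %>%
  group_by(v1) %>%
  summarise(
    obs     = n(),
    missing = sum(is.na(v2)),
    v2_mean = mean(v2, na.rm = TRUE),
    v2_sd   = sd(v2, na.rm = TRUE),
    v2_min  = min(v2, na.rm = TRUE),
    v2_max  = max(v2, na.rm = TRUE)
  )
by_group
```

Every extra variable or statistic needs another line in summarise, which is the repetition a single sum_up call avoids.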

tab = tabulate

tab prints distinct rows with their count. Compared to the dplyr function count, this command adds frequency, percent, and cumulative percent.

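N <- 1e2
K <- 10
df <- tibble(
  id = sample(c(NA, 1:5), N/K, TRUE),
  v1 = sample(1:5, N/K, TRUE)
)
# one row per distinct value of id, missing values included
tab(df, id)
# same, ignoring rows where id is missing
tab(df, id, na.rm = TRUE)
# one row per distinct (id, v1) combination
tab(df, id, v1)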
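
To make the comparison with count concrete, the extra columns can be approximated with count followed by mutate. This is a minimal sketch under stated assumptions, not tab's actual implementation or output format: the column names percent and cum_percent are chosen here for illustration, and the seed is only there to make the sample reproducible.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, for reproducibility only
df <- tibble(id = sample(c(NA, 1:5), 10, TRUE))

# count() returns one row per distinct id with its frequency n;
# percent and cum_percent are derived from n (illustrative names)
out <- df %>%
  count(id) %>%
  mutate(
    percent     = 100 * n / sum(n),
    cum_percent = cumsum(percent)
  )
out
```

The last value of cum_percent is 100 by construction, since the percentages are shares of the total row count.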

join = merge

join is a wrapper around the dplyr join functions, with two added features