Getting started

Colin Fay

2019-03-20

Getting started with {tidystringdist}

About {tidystringdist}

{tidystringdist} is a package that extends the {stringdist} package with tidy data principles.

The idea is to perform string distance calculation and combine it with functions for data manipulation and visualisation from the tidyverse framework.

Installing tidystringdist

You can install the latest stable version from GitHub with:
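A minimal sketch of that installation, assuming the package is hosted in the ColinFay/tidystringdist repository and that {devtools} is used as the installer (both are assumptions, not stated above):

```r
# Assumption: {devtools} is the installer; uncomment the next line if it is missing
# install.packages("devtools")

# Assumption: the repository path is ColinFay/tidystringdist
devtools::install_github("ColinFay/tidystringdist")
```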

Or the dev version from GitHub:
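A hedged sketch for the development version, assuming it lives on a branch named "dev" of the same (assumed) ColinFay/tidystringdist repository. Check the repository for the actual branch name before running this:

```r
# Assumption: the development code is on a branch called "dev";
# `ref` is the install_github() argument that selects a branch, tag or commit
devtools::install_github("ColinFay/tidystringdist", ref = "dev")
```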

{tidystringdist} basic workflow

tidy_comb()

The tidy_comb() and tidy_comb_all() functions build a data.frame of string pairs, returning all the possible combinations from a single vector, from a column of a data.frame, or from two vectors:
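As an illustration, here is a small sketch limited to tidy_comb_all(), applied first to a character vector and then to a data.frame column. The example strings and the use of the built-in iris dataset are illustrative choices, and the result columns are assumed to be named V1 and V2:

```r
library(tidystringdist)

# All pairs from a character vector (illustrative strings)
pairs_vec <- tidy_comb_all(c("Albertine", "Gilberte", "Odette"))
pairs_vec

# Pairs built from a data.frame column, here the Species column of iris
pairs_df <- tidy_comb_all(iris, Species)
pairs_df
```

Either result can then be passed on to the distance computation step.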

Compute string distance

Once you’ve got this data.frame, you can use tidy_stringdist() to compute string distance. This function takes a data.frame, the two columns containing the strings, and one or more stringdist methods.
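A minimal sketch of that step, assuming the exported function is named tidy_stringdist() and that, when no columns are given, it reads the V1 and V2 columns produced by tidy_comb_all(). The input strings are illustrative:

```r
library(tidystringdist)

# Build the data.frame of string pairs (illustrative strings)
proust <- tidy_comb_all(c("Albertine", "Gilberte", "Odette", "Charles"))

# Assumption: with no further arguments, the V1 and V2 columns are compared
# and one distance column is added per method
dist_all <- tidy_stringdist(proust)
dist_all
```

The result keeps one row per pair, so it can be filtered or plotted with the usual tidyverse tools.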

By default, the call computes all the methods. You can restrict the computation to specific methods with the method argument:
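For example, a sketch restricting the computation to two methods. The method names "osa" and "jw" are {stringdist} method names, and the function name tidy_stringdist() and the input strings are assumptions:

```r
library(tidystringdist)

# Build the data.frame of string pairs (illustrative strings)
proust <- tidy_comb_all(c("Albertine", "Gilberte", "Odette", "Charles"))

# Only compute the optimal string alignment ("osa") and Jaro-Winkler ("jw") distances
dist_sel <- tidy_stringdist(proust, method = c("osa", "jw"))
dist_sel
```

Limiting the methods keeps the output narrow, which is convenient when only one or two distances feed the rest of the pipeline.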