Preference Elicitation on Motor Trends Dataset

John Lepird

2022-04-24

This vignette demonstrates how to use the prefeR package on a real dataset. The mtcars dataset, which ships with base R, provides just such an opportunity.

1974 Motor Trends Car Data (first five of 32 rows)

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2

If we want to give a user a list of their top five most preferred cars from the mtcars dataset, there are three approaches we could take:

  1. Have our user manually rank all options.
  2. Have the user provide weights for the desirability of different car features, and compute a weighted score for each option.
  3. Have the user compare a small number of alternatives, and derive their weights from those comparisons.

Option #1 quickly becomes an enormous burden on the user as the number of alternatives increases. Option #2 is difficult for the user to do consistently and to replicate: what exactly does it mean if the weight assigned to horsepower is double the weight assigned to fuel efficiency?
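To make option #2 concrete, it amounts to scoring each car by a weighted sum of its attributes. A minimal base-R sketch with made-up, purely illustrative weights (none of this uses prefeR):

w <- c(mpg = 1, cyl = 0, disp = 0, hp = 2, drat = 0, wt = 0,
       qsec = -3, vs = 0, am = 0, gear = 0, carb = 0)  # hand-picked, arbitrary weights
scores <- as.matrix(mtcars) %*% w                      # weighted value of each car
head(sort(scores[, 1], decreasing = TRUE), 5)          # top five under these weights

Choosing those numbers sensibly, and doing so consistently, is exactly the difficulty described above.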

Option #3 is the approach enabled by the prefeR package. To begin, we create a preference elicitation object and give it our data:

library(prefeR)
p <- prefEl(data = mtcars)
p
## Preference elicitation object with:
##  32 observations of 11 variables.
## And the following preferences:
##  No strict preferences.
##  No indifference preferences.

Now we can add in our Bayesian priors for the weights. Although it is difficult to determine weights exactly, one usually has a ballpark estimate of what they should be, and often one knows the sign of a weight with certainty: all else being equal, everyone prefers a more fuel-efficient car. The prefeR package contains three built-in priors:

  1. Normal(mu = 0, sigma = 1): a normal distribution with mean mu and standard deviation sigma.
  2. Exp(mu = 1): an exponential distribution with mean mu; a negative mu flips the distribution, encoding certainty that the weight is negative.
  3. Flat(): an uninformative prior that treats all weight values as equally likely.

We can now add in our priors for our mtcars attributes.

p$priors <- c(Exp(1),   # MPG
              Normal(), # number of cylinders (Normal() = Normal(0, 1))
              Normal(), # displacement
              Exp(2),   # horsepower
              Normal(), # rear axle ratio
              Normal(), # weight
              Exp(-3),  # quarter mile time
              Normal(), # engine type
              Normal(), # transmission type
              Normal(), # number of gears
              Normal()  # number of carburetors
)
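The priors are supplied in the same order as the columns of the data (mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb); when in doubt, names(mtcars) confirms that order:

names(mtcars)  # the column order that the vector of priors above must follow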

Now, we can add in our user’s preferences:

p$addPref("Pontiac Firebird" %>% "Fiat 128")  # prefer a cool sports car 
p$addPref("Mazda RX4 Wag" %<% "Mazda RX4")    # prefer not to have the station wagon
p$addPref("Merc 280" %=% "Merc 280C")         # indifferent about C-option
p
## Preference elicitation object with:
##  32 observations of 11 variables.
## And the following preferences:
##  Pontiac Firebird preferred to Fiat 128
##  Mazda RX4 preferred to Mazda RX4 Wag
##  Merc 280 indifferent to Merc 280C

Now, we can infer what our attribute weights should be:

p$infer()
##        mpg        cyl       disp         hp       drat         wt       qsec 
##  0.2220478  0.3330885  0.3583347  2.6082377 -0.4364433 -0.1464981 -0.9751220 
##         vs         am       gear       carb 
## -0.2016490  0.1358719  0.5794767  0.2578508
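Saving the result makes it easy to see which attributes carry the most weight for this user; for the run above, horsepower dominates, followed by quarter mile time:

w <- p$infer()
sort(abs(w), decreasing = TRUE)[1:3]  # largest-magnitude weights: hp, qsec, gear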

And we can get our top five cars:

p$rank()[1:5]
##     Maserati Bora    Ford Pantera L        Duster 360        Camaro Z28 
##          976.4051          808.1425          759.2586          755.6060 
## Chrysler Imperial 
##          747.0812
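These scores are consistent with a simple linear utility: each car's raw attribute values dotted with the inferred weights. A quick sanity check (assuming no internal rescaling of the data) should closely reproduce the ranking above:

scores <- as.matrix(mtcars) %*% p$infer()      # linear utility of every car
head(sort(scores[, 1], decreasing = TRUE), 5)  # should closely match p$rank()[1:5]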

Finally, we can figure out which pair of cars we should ask the user to compare next:

p$suggest()
## [1] "Valiant"            "Cadillac Fleetwood"