Google authentication types for R

Mark Edmondson

2023-04-11

Other more modern libraries

googleAuthR was one of my first R packages and has enjoyed 7+ years of being my key workhorse for Google authentication, but in the meantime more modern packages have been released that may be more suited to your needs - consider gargle (that googleAuthR heavily depends upon now for authentication rather than its own home spun functions) and firebase for alternatives. The roles/functionality of googleAuthR aside authentication such as batching, package creation etc. will eventually be also superseded by smaller packages. However, googleAuthR is still in active use and will be supported in a maintenance mode.

Version 2.0

Version 2.0 removed googleAuthR shiny modules in favour of gar_shiny_*. If you need those older legacy functions depend on googleAuthR == 1.4.1 or before.

Quick user based authentication

Once setup, then you should go through the Google login flow in your browser when you run this command:

library(googleAuthR)
# starts auth process with defaults
gar_auth()
#>The googleAuthR package is requesting access to your Google account. Select a 
#> pre-authorised account or enter '0' to obtain a new token. Press Esc/Ctrl + C to abort.

#> 1: mark@work.com
#> 2: home@home.com

The authentication cache token is kept at a global level as per the gargle library documentation - see there for more details.

You can also specify your email to avoid the interactive menu:

gar_auth(email = "your@email.com")

These functions are usually wrapped in package specific functions when used in other packages, such as googleAnalyticsR::ga_auth()

Client options

Most libraries will set the appropriate options for you, otherwise you will need to supply them from the Google Cloud console, in its APIs & services > Credentials section ( https://console.cloud.google.com/apis/credentials ).

You will need as a minimum:

If creating your own library you can choose to supply some or all of the above to the end-user, as an end-user you may need to set some of the above (most usually your own user authentication).

Multiple authentication tokens

googleAuthR > 1.0.0

Authentication cache tokens are kept at a global level on your computer. When you authenticate the first time with a new client.id, scope or email then you will go through the authentication process in the browser, however the next time it wil be cached and be a lot quicker.

# switching between auth scopes
# first time new scope manual auth, then auto if supplied email   
gar_auth(email = "your@email.com", 
         scopes = "https://www.googleapis.com/auth/drive")
         
# ... query Google Drive functions ...

gar_auth(email = "your@email.com", 
         scopes = "https://www.googleapis.com/auth/bigquery")
         
# ..query BigQuery functions ...

Setting the client via Google Cloud client JSON

To avoid keeping track of which client_id/secret to use, Google offers a client ID JSON file you can download from the Google Cloud console here - https://console.cloud.google.com/apis/credentials. Make sure the client ID type is Desktop for desktop applications.

You can use this to set the client details before your first authentication. The above example would then be:

library(googleAuthR)
library(googleAnalyticsR)
library(searchConsoleR)

# set the scopes required
scopes = c("https://www.googleapis.com/auth/analytics", 
          "https://www.googleapis.com/auth/webmasters")
                                        
# set the client
gar_set_client("client-id.json", scopes = scopes)

# authenticate and go through the OAuth2 flow first time
gar_auth()
                                        
# can run Google Analytics API calls:
ga_account_list()

# and run Search Console API calls:
list_websites()

You can also place the file location of your client ID JSON in the GAR_CLIENT_JSON environment argument, where it will look for it by default:

# .Renviron
GAR_CLIENT_JSON="~/path/to/clientjson.json"

Then you just need to supply the scopes:

gar_set_client(scopes = "https://www.googleapis.com/auth/webmasters")

Authentication with no browser

Refer to this gargle article on how to authenticate in a non-interactive manner

Authentication with a JSON file via Service Accounts

You can also authenticate single users via a server side JSON file rather than going through the online OAuth2 flow. The end user could supply this JSON file, or you can upload your own JSON file to your applications. This is generally more secure if you know its only one user on the service, such as for Cloud services.

This involves downloading a secret JSON key with the authentication details. More details are available from Google here: Using OAuth2.0 for Server to Server Applications[https://developers.google.com/identity/protocols/oauth2/service-account]

To use, go to your Project in the Google Developement Console and select JSON Key type. Save the JSON file to your computer and supply the file location to the function gar_auth_service()

Roles

Roles all start with roles/* e.g. roles/editor - a list of predefined roles are here or you can see roles within your GCP console here.

Creating service account and a key

From R

The gar_service_create() and related functions let you create service accounts from a user OAuth2 login. The user requires permission iam.serviceAccounts.create for the project. Most often the user is an Owner/Editor.

The workflow for authenticating with a new key from R is:

  • Creating a new service account Id
  • Giving that service account any roles it needs to operate
  • Downloading a one-time only JSON key file that authenticates that account Id
  • Using that key file in gar_auth_service() or otherwise with the correct scopes to use the API

See this related Google help article or creating service accounts.

The above workflow is encapsulated within gar_service_provision() which will run through them for you if you supply it with your GCP projects Client Id (another JSON file that identifies your project.)

WebUI

Navigate to the JSON file from the Google Developer Console via: Credentials > New credentials > Service account Key > Select service account > Key type = JSON

If you are using the JSON file, you must ensure:

  • The service email has access to the resource you are trying to fetch (for example a Google Analytics View)
  • You have set the scopes to the correct API
  • The Google Project has the API turned on

An example using a service account JSON file for authentication is shown below:

library(googleAuthR)
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/urlshortner")
service_token <- gar_auth_service(json_file="~/location/of/the/json/secret.json")
analytics_url <- function(shortUrl, 
                          timespan = c("allTime", "month", "week","day","twoHours")){
  
  timespan <- match.arg(timespan)
  
  f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url",
                         "GET",
                         pars_args = list(shortUrl = "shortUrl",
                                          projection = "FULL"),
                         data_parse_function = function(x) { 
                           a <- x$analytics 
                           return(a[timespan][[1]])
                         })
  
  f(pars_arguments = list(shortUrl = shortUrl))
}
analytics_url("https://goo.gl/2FcFVQbk")

Another example is from the searchConsoleR library - in this case we avoid using scr_auth() to authenticate via the JSON, which has had the service email added to the Search Console web property as a user.

library(googleAuthR)
library(searchConsoleR)
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/webmasters") 

gar_auth_service("auth.json")

list_websites()

Authentication within Shiny

If you want to create a Shiny app just using your data, refer to the non-interactive authentication article on gargle

If you want to make a multi-user Shiny app, where users login to their own Google account and the app works with their data, googleAuthR provides the below functions to help make the Google login process as easy as possible.

Types of Shiny Authentication

There are now these types of logins available, which suit different needs:

Shiny Modules

googleAuthR uses Shiny Modules. This means less code and the ability to have multiple login buttons on the same app.

To use modules, you need to use the functions ending with _UI in your ui.R, then call the id you set there server side with the callModule(moduleName, "id") syntax. See the examples below.

Shiny Authentication Examples

Remember that client IDs and secrets will need to be created for the examples below. You need to pick a clientID for web applications, not “Other” as is used for offline googleAuthR functions.

URL redirects

In some platforms the URL you are authenticating from will not match the Docker container the script is running in (e.g. shinyapps.io or a kubernetes cluster) - in that case you can manually set it via options(googleAuthR.redirect = http://your-shiny-url.com). In other circumstances the Shiny app should be able to detect this itself.

gar_shiny_* functions example

This uses the most modern gar_shiny_* family of functions to create authentication. The app lists the files you have stored in Google Drive.

library(shiny)
library(googleAuthR)
gar_set_client(scopes = "https://www.googleapis.com/auth/drive")

fileSearch <- function(query) {
  gar_api_generator("https://www.googleapis.com/drive/v3/files/",
                    "GET",
                    pars_args=list(q=query),
                    data_parse_function = function(x) x$files)()
}

## ui.R
ui <- fluidPage(title = "googleAuthR Shiny Demo",
                textInput("query", 
                          label = "Google Drive query", 
                          value = "mimeType != 'application/vnd.google-apps.folder'"),
                tableOutput("gdrive")
)

## server.R
server <- function(input, output, session){
  
  # create a non-reactive access_token as we should never get past this if not authenticated
  gar_shiny_auth(session)
  
  output$gdrive <- renderTable({
    req(input$query)
    
    # no need for with_shiny()
    fileSearch(input$query)
    
  })
}

shinyApp(gar_shiny_ui(ui, login_ui = gar_shiny_login_ui), server)

googleSignIn module example

This module is suitable if you don’t need to authenticate APIs in your app, you just would like a login. You can then reach the user email, id, name or avatar to decide which content you want to show with durther logic within your Shiny app.

You only need to set the client.id for this login, as no secrets are being created.

library(shiny)
library(googleAuthR)

options(googleAuthR.webapp.client_id = "1080525199262-qecndq7frddi66vr35brgckc1md5rgcl.apps.googleusercontent.com")

ui <- fluidPage(
    
    titlePanel("Sample Google Sign-In"),
    
    sidebarLayout(
      sidebarPanel(
        googleSignInUI("demo")
      ),
      
      mainPanel(
        with(tags, dl(dt("Name"), dd(textOutput("g_name")),
                      dt("Email"), dd(textOutput("g_email")),
                      dt("Image"), dd(uiOutput("g_image")) ))
      )
    )
  )

server <- function(input, output, session) {
  
  sign_ins <- shiny::callModule(googleSignIn, "demo")
  
  output$g_name = renderText({ sign_ins()$name })
  output$g_email = renderText({ sign_ins()$email })
  output$g_image = renderUI({ img(src=sign_ins()$image) })
  
}

# Run the application 
shinyApp(ui = ui, server = server)

Auto-authentication

Auto-authentication can be performed upon a package load.

This requires the setup of environment variables either in your .Renviron file or via Sys.setenv() to point to a previously created authentication file. This file can be either a .httr-oauth file created via gar_auth() or a Google service account JSON downloaded from the Google API console.

This file will then be used for authentication via gar_auth_auto. You can call this function yourself in scripts or R sessions, but its main intention is to be called in the .onAttach function via gar_attach_auth_auto, so that you will authenticate right after you load the library via library(yourlibrary)

An example from googleCloudStorageR is shown below:

.onAttach <- function(libname, pkgname){

  googleAuthR::gar_attach_auto_auth("https://www.googleapis.com/auth/devstorage.full_control",
                                    environment_var = "GCS_AUTH_FILE")
}

..which calls an environment variable set in ~/.Renvion:

GCS_AUTH_FILE="/Users/mark/auth/my_auth_file.json"

Authentication on Google Cloud

Use googleAuthR::gar_gce_auth() to authenticate reusing the service keys of the Google Compute Engine instance (or other compute service).

Authentication on Kubernetes via workload identity

Workload identity is a way of federating the authentication of a service key to Kubernetes without needing to download a service key.

Its the “right” way to do authentication on K8s and other places if possible since it involves not downloading keys which is a potential security risk.

  1. Following the docs you create a service account as normal and give it permissions and scopes needed to say upload to BigQuery, as you would before. eg. my-service-key@my-project.iam.gserviceaccount.com with https://www.googleapis.com/auth/bigquery scopes
  2. Instead of downloading a JSON key, you instead federate that permission by adding a policy binding to another service account within Kubernetes
  3. Create the service account within Kubernetes, ideally within a new namespace:
# create namespace
kubectl create namespace my-namespace
# Create Kubernetes service account
kubectl create serviceaccount --namespace my-namespace bq-service-account 
  1. Bind that Kubernetes service account to the service account outside of kubernetes you created in step 1, and assign it an annotation
# Create IAM policy binding between k8s SA and GSA
gcloud iam service-accounts add-iam-policy-binding my-service-key@my-project.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:my-project.svc.id.goog[my-namespace/bq-service-account]"
# Annotate k8s SA
kubectl annotate serviceaccount bq-service-account \
    --namespace my-namespace \
    iam.gke.io/gcp-service-account=my-service-key@my-project.iam.gserviceaccount.com

This key will now be available to add to pods within the cluster. For Airflow, you can pass them in using the GKEPodOperator(...., namespace='my-namespace', service_account_name='bq-service-account')

  1. When calling the gargle::gce_credentials() within R, you need first make sure its using the right internal kubernetes endpoint (options(gargle.gce.use_ip = TRUE)) and then call the service email that is not “default”. gargle:::list_service_accounts() was helpful in debugging (maybe export this?)
# code within the Docker container
library(bigQueryR)

options(gargle.gce.use_ip = TRUE)
googleAuthR::gar_gce_auth("my-service-key@my-project.iam.gserviceaccount.com")

... do authenticated stuff...

Revoking Authentication

For local use, call gar_deauth() to de-authenticate a session. To avoid cache tokens being reused delete them from the gargle cache folder, usually ~/.R/gargle/gargle-oauth/

For service level accounts delete the JSON file.

For a Shiny app, a cookie is left by Google that will mean a faster login next time a user uses the app with no Authorization screen that they get the first time through. To force this every time, activate the parameter revoke=TRUE within the googleAuth function.