2. PERSONAL DATA AND DATA SUBJECTS
2.2 DIAGNOSTIC DATA
As explained in Section 1.2, Google collects Diagnostic Data in multiple ways. Sections 2.2 to 2.4 discuss how Privacy Company obtained access to Diagnostic Data in the context of this DPIA, and contain an overview of the contents of such Diagnostic Data.
Though Google provides extensive documentation about the existence and contents of the logs that it makes available for administrators, there is very little public documentation about other Diagnostic Data Google collects, such as telemetry data, or other data Google collects on its servers about the use of G Suite (Enterprise) for Education applications.
2.2.1 Audit logs and visual reports
Google stores Diagnostic Data about the use of its cloud services in log files. Google makes some of these logs available to admins in so-called audit logs. There is no public documentation about what Google records in its system-generated logs, and which part of that data it makes available to admins.
The audit logs provide some information about the Diagnostic Data Google collects. Another source of information used for this report is the interception of traffic from the installed apps. This is discussed below, in Section 2.3.
In G Suite Enterprise for Education, Google makes 19 kinds of audit logs available through the Google Admin Console.114 These are: Admin, Login, SAML, LDAP, Drive, Calendar, Context-Aware Access, Devices, Password Vault, Token, Groups, Hangouts Chat, Google+, Voice, Hangouts Meet, User Accounts, Access Transparency and Rules.115 Additionally, admins can use a separate Email Log Search. Administrators of G Suite for Education only have access to three logs: CalAudit, DriveAudit and LoginAudit. The absence of the other logs in the ‘free’ G Suite for Education does not mean that Google does not collect the same data.
Google also makes these logs available through its API, so that administrators can obtain automated, near real-time access to end-user activities.
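The automated access described above can be illustrated with a minimal sketch that only builds the request URL for one audit log. The base URL and path follow the public Admin SDK Reports API documentation as understood at the time of writing; the helper name activity_url and the example account are hypothetical. No request is sent, and the OAuth 2.0 authorisation that an administrator would need is left out.

```python
from urllib.parse import quote, urlencode

# Base URL of the Admin SDK Reports API (v1), as given in Google's public
# documentation. Treat it as an assumption and verify before use.
BASE = "https://admin.googleapis.com/admin/reports/v1"


def activity_url(application, user_key="all", **query):
    """Build the URL for listing audit log activities (hypothetical helper).

    application: the audit log to read, e.g. "drive", "login" or "token".
    user_key:    "all" for every end-user, or one account's email address.
    query:       optional filters such as eventName or maxResults.
    """
    path = (
        f"{BASE}/activity/users/{quote(user_key, safe='')}"
        f"/applications/{quote(application, safe='')}"
    )
    return f"{path}?{urlencode(query)}" if query else path


# Example: download events of one (invented) end-user in the Drive audit log.
url = activity_url(
    "drive", user_key="student@example.edu", eventName="download", maxResults=50
)
print(url)
```

The sketch shows that a single end-user can be selected by email address, which is what makes the audit logs individual Diagnostic Data rather than aggregated statistics.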
Google additionally provides four types of visual reports:
1. Activity log files (activities of end-users and administrators)
2. Customer Usage Metrics (aggregated properties and statistics for all end-users, across an entire Enterprise domain)
3. User Usage Metrics (individual Diagnostic Data): “The end-user usage report returns G Suite service usage information for a particular end-user in your domain. These reports can be customized and filtered for specific usage information. The default and maximum time period for each report is the last 450 days”; and
4. Entities Usage Metrics (only about the use of Google+)
The logs and reports show that Google logs personal data at a granular level about individual end-user actions in three different categories: application usage (such as Gmail or Docs), file access (any activity related to the opening, changing, saving and sharing of files) and access to third-party services using the Google credentials (Cloud Identity).
114 Google, understand audit logs, URL:
115 The following 5 logs were empty, because the functionality was not tested: SAML, LDAP, Context-Aware Access, Voice and Password Vault.
Figures 15 and 16: Google list of different audit logs and reports API
The Drive audit log contains a log of actions taken with documents in Google Drive. This includes actions like uploading, downloading, viewing and editing a file. For each action, the log records the filename (including a link to the actual document in Drive), the username, the owner of the file, a timestamp and the IP address of the computer performing the action.
This log file contains file and path names, in combination with the email address of the end-user.
Table 7: Drive Audit log
Item name            Name of document with URL (path name)
Event description    User name and executed action, such as ‘edited’, ‘viewed’ or ‘downloaded’
User                 User name and link to the account of the user who executed the actions
Date                 Timestamp with timezone
Event name           For example: view, download or edit
Item ID              Unique identifier for the document
Item type            For example: Google Docs or Slides
Owner                Email address of the owner of the document
(Prior) visibility   Whether a document is visible or accessible
IP                   Full IP address
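To show how one logged Drive action maps onto the columns of Table 7, the following sketch flattens a single activity record. The record is invented for illustration. Its layout (id, actor, ipAddress, events, parameters) and the parameter names doc_id, doc_title, doc_type and owner are assumptions based on the public Reports API documentation, not data observed in the test tenant. The function name drive_rows is hypothetical.

```python
# Illustrative record only: the structure is an assumption based on the
# public Reports API documentation, and every value is invented.
sample_activity = {
    "id": {"time": "2020-03-02T09:15:27.000Z", "applicationName": "drive"},
    "actor": {"email": "student@example.edu"},
    "ipAddress": "192.0.2.10",
    "events": [
        {
            "name": "download",
            "parameters": [
                {"name": "doc_id", "value": "hypothetical-doc-id"},
                {"name": "doc_title", "value": "Essay draft"},
                {"name": "doc_type", "value": "document"},
                {"name": "owner", "value": "teacher@example.edu"},
            ],
        }
    ],
}


def drive_rows(activity):
    """Return one flat dict per event, keyed like the columns of Table 7."""
    rows = []
    for event in activity.get("events", []):
        # Turn the list of name/value pairs into a lookup table.
        params = {p["name"]: p.get("value") for p in event.get("parameters", [])}
        rows.append(
            {
                "Item name": params.get("doc_title"),
                "User": activity["actor"]["email"],
                "Date": activity["id"]["time"],
                "Event name": event["name"],
                "Item ID": params.get("doc_id"),
                "Item type": params.get("doc_type"),
                "Owner": params.get("owner"),
                "IP": activity.get("ipAddress"),
            }
        )
    return rows


rows = drive_rows(sample_activity)
for row in rows:
    print(row)
```

Each resulting row combines a document title, the acting end-user's email address and a full IP address, which is the combination of identifiers discussed above.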
In G Suite Enterprise for Education the Token audit log is available. This contains a log of authentication tokens that applications and websites use to access the Google Account. For each event the type (creation, use or revocation), user account, the application or website, the end-user’s IP address and timestamp are logged. Thus Google collects information on the use of websites and apps by an end-user with a corporate G Suite authentication token.
Many websites and apps accept easy sign-in with a Google Account. This is convenient for end-users, because they do not have to remember separate credentials for each website or app. It is reasonable to expect that end-users will frequently use their Cloud Identity Google Account for single sign-on services. In the G Suite Enterprise environment, a side effect is that the Token audit log allows administrators to see which websites and apps end-users have logged in to with their Google Account.
“G Suite audit and reporting help administrators track important activities. Log-in activity for third-party apps is included so administrators have a complete picture in one place.”116
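The side effect described above can be made concrete with a short sketch that groups Token audit log events per end-user. The sample events are invented. The event name "authorize" and the parameter name "app_name" are assumptions about how the Token log is represented in the Reports API, and the function name apps_per_user is hypothetical.

```python
from collections import defaultdict

# Invented sample events. The event name "authorize" and the parameter
# "app_name" are assumptions about the Token audit log format.
sample_token_activities = [
    {
        "actor": {"email": "student@example.edu"},
        "events": [
            {
                "name": "authorize",
                "parameters": [{"name": "app_name", "value": "Example Quiz App"}],
            }
        ],
    },
    {
        "actor": {"email": "student@example.edu"},
        "events": [
            {
                "name": "authorize",
                "parameters": [{"name": "app_name", "value": "Example News Site"}],
            }
        ],
    },
    {
        "actor": {"email": "teacher@example.edu"},
        "events": [
            {
                "name": "revoke",
                "parameters": [{"name": "app_name", "value": "Example Quiz App"}],
            }
        ],
    },
]


def apps_per_user(activities):
    """Map each end-user to the sorted list of apps they authorised."""
    seen = defaultdict(set)
    for activity in activities:
        email = activity["actor"]["email"]
        for event in activity.get("events", []):
            if event.get("name") != "authorize":
                continue  # only count token creation, not use or revocation
            for param in event.get("parameters", []):
                if param.get("name") == "app_name":
                    seen[email].add(param.get("value"))
    return {email: sorted(apps) for email, apps in seen.items()}


overview = apps_per_user(sample_token_activities)
print(overview)
```

A few lines of grouping are enough to turn the log into a per-person overview of visited third-party services.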
A third example of logging is shown in the reports that provide an overview of activities of one end-user in one application, for example the use of Gmail.117
The Gmail usage reports provide aggregated information about one specific individual’s email behaviour, such as the total number of emails sent and received in the last 450 days, and the last time they accessed their mail through webmail, POP or IMAP.
116 Google, Google Identity Services for work, URL: https://storage.googleapis.com/gfw-touched-accounts-pdfs/google-identity-takeaway.pdf
117 Google Gmail Parameters, URL: https://developers.google.com/admin-sdk/reports/v1/appendix/usage/user/gmail
Table 8: Overview of individual end-user actions in Gmail
is_gmail_enabled             boolean   If true, the end-user's Gmail service is enabled
num_emails_exchanged         integer   The total number of emails exchanged. This is the total of num_emails_sent plus num_emails_received
num_emails_received          integer   The number of emails received by the end-user
num_emails_sent              integer   The number of emails sent by the end-user
num_spam_emails_received     integer   The number of emails received by the end-user that were marked as spam
timestamp_last_access        integer   Last access timestamp
timestamp_last_imap          integer   Last IMAP access timestamp
timestamp_last_interaction   integer   Last interactive access timestamp
timestamp_last_pop           integer   Last POP access timestamp
timestamp_last_webmail       integer   Last web access timestamp
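The parameters in Table 8 can be read from a user usage report as sketched below. The sample report is invented. The "gmail:" prefix and the intValue, boolValue and datetimeValue keys are assumptions based on the public Reports API documentation, and the function name gmail_parameters is hypothetical. The sketch also checks the relation stated in the table, namely that num_emails_exchanged equals num_emails_sent plus num_emails_received.

```python
# Invented sample report. The "gmail:" prefix and the intValue, boolValue and
# datetimeValue keys are assumptions based on the public API documentation.
sample_report = {
    "date": "2020-03-01",
    "entity": {"type": "USER", "userEmail": "student@example.edu"},
    "parameters": [
        {"name": "gmail:is_gmail_enabled", "boolValue": True},
        {"name": "gmail:num_emails_sent", "intValue": "12"},
        {"name": "gmail:num_emails_received", "intValue": "30"},
        {"name": "gmail:num_emails_exchanged", "intValue": "42"},
        {
            "name": "gmail:timestamp_last_webmail",
            "datetimeValue": "2020-03-01T08:00:00.000Z",
        },
    ],
}


def gmail_parameters(report):
    """Return the Gmail parameters of one usage report as a plain dict."""
    values = {}
    for param in report.get("parameters", []):
        name = param["name"]
        if not name.startswith("gmail:"):
            continue  # skip parameters of other applications
        key = name[len("gmail:"):]
        if "intValue" in param:
            values[key] = int(param["intValue"])  # integers arrive as strings
        elif "boolValue" in param:
            values[key] = param["boolValue"]
        else:
            values[key] = param.get("datetimeValue") or param.get("stringValue")
    return values


params = gmail_parameters(sample_report)
consistent = (
    params["num_emails_exchanged"]
    == params["num_emails_sent"] + params["num_emails_received"]
)
print(sample_report["entity"]["userEmail"], params, consistent)
```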
Google also creates aggregated statistics about Gmail usage in the Customer Usage Metrics.118 These statistics contain much more information about email behaviour, such as the number of encrypted inbound and outbound emails, and the number of inbound spam emails. Such information can be useful for administrators who want to change their security policy, for example to ban unencrypted email. These logs can also inform an administrator if a particular end-user suddenly receives a lot of spam. Without these user-specific reports, it would require more effort to retrieve this information from the general Email logs.119
118 Google G Suite Admin SDK, Reports API, Gmail Parameters, URL: https://developers.google.com/admin-sdk/reports/v1/appendix/usage/customer/gmail
119 See Google, Email Log Search, URL:
Figure 17: G Suite Reports API: export Gmail actions