Best practices for safely exporting your valuable historical data
This is the sixth blog in our seven-part Google Analytics 4 Migration Guide series. The series focuses on how businesses can accelerate their move to GA4 ahead of looming sunset deadlines — to gain a competitive edge by mastering the GA4 platform’s next-generation data-driven marketing capabilities. You can read the full Guide here.
As the sunset of Universal Analytics looms, the most pressing concern for businesses is the likelihood that they will lose access to their historical data in UA properties by the end of 2022. Best practice is to begin planning for this well in advance by setting up data export to a data warehousing solution. Here are tips and best practices for safely exporting your historical data as you prepare to fully migrate to Google Analytics 4:
Exporting data to BigQuery
The easiest and the most flexible option for Google Analytics 360 customers is building a data export to BigQuery using the native integration between the tools. As soon as you have a billing account in Google Cloud Platform and you can create a project linked to this account, you can follow a straightforward integration flow and begin streaming data into BigQuery within 24 hours.
Additionally, previously collected data from 13 months before the date of the integration (or 10 billion hits, whichever is smaller) will show up in BigQuery shortly.
This step alone would let Analytics 360 customers collect 31 months of data before the full stop of UA data collection. Taking early action to set up this BigQuery data export allows for building a more detailed and robust data perspective to be stored on Google’s servers.
If you are not an enterprise customer of Analytics, you’d probably have to rely on one of third-party solutions for connecting your Google Analytics to BigQuery (e.g., Supermetrics or any other). However, these solutions most likely will not offer an option to export historical data as the native integration does.
Exporting data using the Google Analytics Reporting API
What about organizations with data that can’t be exported to BigQuery — either because the data is too old (older than 13 months) or isn’t covered by the third-party connector you’ve selected? There is an option to manually export all that data in CSV. Fortunately, you can work around this tedious process by utilizing the Google Analytics Reporting API.
This API tool enables you to request a subset of data from GA using service, client-side or server applications. While it likely requires assistance from an experienced developer who can run code in an application, once properly configured it could significantly speed up the process of extracting historical data from GA. The result of a query can be downloaded to Excel, Google spreadsheet, or exported to BigQuery to keep all Analytics data in one place.
Alternatively, you can also use the API for fast editing of Google Analytics data in an online document using the Google Analytics Spreadsheet Add-On. Generally, you should limit detailed data export to a few previous years and keep only high-level data from an earlier period. However, this may differ depending on how you’re planning to use the data going forward.
Addressing Limitations of the API and Add-On
The main limitation of the API and add-on is that it does not give access to raw data, including client identifiers (i.e., client_id) which are generously provided by the BigQuery native integration. Also, since the data is reported in a plain format (vs. nested data in BigQuery), you’ll need to make sure you’re combining dimensions and metrics that are available in a single query — similar to how it should be done when building reports in the Analytics UI.
Because of this limitation, you may want to build reports including user- and session-based data separately from pageview, event, and transaction data. Each of these require a separate reporting request and will liklely result in different datasets. On the other hand, exporting specific data (e.g., transactions, events, or pageviews) may be enriched with some session- or user-level attributes for more granular analysis if required — for example, adding traffic source or audience information. You can use a Google tool for reference to determine precisely what dimensions and metrics can be used together.
Another limitation of the API is the number of dimensions (9) and metrics (10) that can be added to a single report. As you can see, exporting previously collected data requires proper planning and designing a course of actions.
Key questions to guide data export planning
As a summary, setting up historical data export will require additional planning and analysis regarding data collection time frame, tools, and expected results:
- Has some data been collected previously in a data warehousing tool?
- Is there a need to export additional historical data (beyond the data that was collected previously or that is covered by BigQuery)?
- Is BigQuery available for data export?
- If BigQuery is not available, what third-party solution can be used for setting up export?
- What historical data (that is not covered by BigQuery and was not collected before) is consumed by stakeholders? How far back is the data needed?
- What level of detail is required? Would only high-level KPIs suffice for the business?
- How can you group the data in separate datasets? Will those groupings provide you with the data segmentation that you need?
The following is a simplified decision-making tree regarding the methods and tools to execute data export:
Next Up: Migrating Reports to GA4 — and Why You Should Fast-Track Your Migration
In the next and final installment of our seven-part series, we’ll cover best practices for migrating existing UA reports into GA4 — and how you can leverage new GA4 functionality to address the reporting needs of your business. We’ll conclude the series with a business case for why you can gain a valuable competitive advantage by fast-tracking your organization’s migration to GA4. We encourage you to check out the full Google Analytics 4 Migration Guide here: