Using GitHub Actions to archive Netlify Analytics data
Update November 2021: Netlify introduced a new V2 API and deprecated the V1 API. It is still not official, but the deprecation announcement explicitly mentioned external users of this API. The action has been updated and released for V2.
I recently started to take a closer look at the traffic this blog gets. Since it runs on Netlify and I really like their platform, I have been a happy user of Netlify Analytics for a while now. The big drawback for me is that the UI only shows the last month of analytics. I wanted to observe trends over a longer period, so I set out to archive this data somehow.
My first Google search for an official analytics API was not successful, since Netlify currently does not expose analytics data via its official API. But Netlify staff suggested using the unofficial API, like others have done before. Raymond Camden and Jim Nielsen did a great job explaining how it works. So I took a look at my browser's dev tools and started working with that.
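If you are curious what talking to that API looks like, here is a rough sketch as a workflow step. Treat the endpoint path and parameters as assumptions gleaned from the dev tools; since the API is unofficial, Netlify may change them at any time.

- name: Fetch pageviews from the unofficial API (endpoint is an assumption)
  env:
    NETLIFY_TOKEN: ${{ secrets.NETLIFY_TOKEN }}
    NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE }}
  run: |
    # The API expects timestamps in milliseconds since the epoch
    FROM=$(( ($(date +%s) - 30 * 24 * 3600) * 1000 ))
    TO=$(( $(date +%s) * 1000 ))
    # Path as observed in the browser's dev tools; not an official contract
    curl -s -H "Authorization: Bearer $NETLIFY_TOKEN" \
      "https://analytics.services.netlify.com/v2/$NETLIFY_SITE_ID/pageviews?from=$FROM&to=$TO&resolution=day"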
I don’t want to care about servers, databases and the like for my blog; that’s why it lives on GitHub and Netlify. So I thought: let’s try a scheduled GitHub Action for storing that data. So here it is, the GitHub Action: https://github.com/marketplace/actions/netlify-analytics-collector, with this workflow:
name: Test run

on:
  schedule:
    - cron: '55 23 * * *'
  workflow_dispatch:

jobs:
  export-run:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - uses: docker://ghcr.io/niklasmerz/netlify-analytics-collector:2.0.0
        with:
          netlify-token: ${{ secrets.NETLIFY_TOKEN }}
          netlify-site-id: ${{ secrets.NETLIFY_SITE }}
      - uses: actions/upload-artifact@v2
        with:
          name: exports
          path: '*.csv'

  sheet-upload:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - uses: docker://ghcr.io/niklasmerz/netlify-analytics-collector:2.0.0
        with:
          netlify-token: ${{ secrets.NETLIFY_TOKEN }}
          netlify-site-id: ${{ secrets.NETLIFY_SITE }}
          days: 0
          disable-header: true
      - uses: niklasmerz/csv-to-google-spreadsheet@master
        with:
          csv_path: pageviews.csv
          spreadsheet_id: ${{ secrets.google_spreadsheet_id }}
          worksheet: 0
          append_content: true
          google_service_account_email: ${{ secrets.google_service_account_email }}
          google_service_account_private_key: ${{ secrets.google_service_account_private_key }}
      - uses: niklasmerz/csv-to-google-spreadsheet@master
        with:
          csv_path: visitors.csv
          spreadsheet_id: ${{ secrets.google_spreadsheet_id }}
          worksheet: 1
          append_content: true
          google_service_account_email: ${{ secrets.google_service_account_email }}
          google_service_account_private_key: ${{ secrets.google_service_account_private_key }}
      - uses: niklasmerz/csv-to-google-spreadsheet@master
        with:
          csv_path: bandwidth.csv
          spreadsheet_id: ${{ secrets.google_spreadsheet_id }}
          worksheet: 2
          append_content: true
          google_service_account_email: ${{ secrets.google_service_account_email }}
          google_service_account_private_key: ${{ secrets.google_service_account_private_key }}
It looks complex at first but is pretty simple to use and customize. This workflow is set to run daily and consists of two jobs. One archives the last 30 days of data as CSV files, attached to the Action run as a ZIP artifact. The other job updates a Google Sheet if you set one up. I like using a Google Sheet that gets updated daily to create some graphs and have an up-to-date source of all the data. Replace the secrets with your own and it should be ready to go. More info is in the action's repo.
The second workflow creates a release each month with a ZIP file of your analytics data. This way you keep a backup history of your data.
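A minimal sketch of such a monthly job could look like the following, using softprops/action-gh-release as one possible release action and a date-based tag name; see the action's repo for the real thing.

name: Monthly release

on:
  schedule:
    - cron: '0 1 1 * *'  # first day of every month

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - uses: docker://ghcr.io/niklasmerz/netlify-analytics-collector:2.0.0
        with:
          netlify-token: ${{ secrets.NETLIFY_TOKEN }}
          netlify-site-id: ${{ secrets.NETLIFY_SITE }}
      - name: Zip the CSV exports
        id: zip
        run: |
          TAG="analytics-$(date +%Y-%m)"
          zip "$TAG.zip" *.csv
          echo "tag=$TAG" >> "$GITHUB_OUTPUT"
      - uses: softprops/action-gh-release@v1
        with:
          tag_name: ${{ steps.zip.outputs.tag }}
          files: '*.zip'
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}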
These workflows are just an example. You can customize them to your needs with other actions. For example, I added an action that sends a Slack notification every time the workflow fails, so I can check it and not lose any data.
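Such a failure notification can be as small as one extra step at the end of each job. Here is a sketch using 8398a7/action-slack, which is one option among several; the webhook secret name is up to you:

- uses: 8398a7/action-slack@v3
  if: failure()  # run this step only when an earlier step failed
  with:
    status: ${{ job.status }}
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}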
I really like having this action run right in the repo where my blog lives, but it can run locally too.
Let’s hope the API does not change, or better, that we get an official API to use some day.