

Web Analytics with Cloudfront and GoAccess
source link: https://isthisit.nz/posts/2021/web-analytics-with-cloudfront-and-goaccess/
Here is yet another answer to the perennial question: who is accessing my website, and what pages are they looking at? A long time ago this website ran Google Analytics, but I was put off by its privacy impact on users. I’ve since switched to another client-side tracking solution, GoatCounter, which doesn’t uniquely track users and is open source.
Client-side tracking solutions like these both slow the website down for users and completely miss the ever-growing share of visitors who run ad and tracking blockers (as they rightly should!).
Because this is a static site hosted on AWS S3 and served through the Cloudfront CDN, I wasn’t sure what ‘server side’ log data was available. I did some research and found that Cloudfront logs not only provide the information I’m interested in, but can also be parsed and viewed offline with the open source tool GoAccess.
This post details how to set up Cloudfront CDN logging, download and combine those logs, and then use GoAccess to get an interactive dashboard.
Enable Cloudfront Logging
Assuming you’ve already set up a Cloudfront distribution for your site, you just need to enable standard logging. Cloudfront will write access logs of every user request into a bucket you specify. Logs are written every ~20 minutes in a gzip compressed format. This site gets low traffic, and around three months of access logs use about 15 MB of storage.
After enabling the setting, check the bucket in an hour or so to confirm that logs are being written.
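One way to do that check from the terminal is to list the bucket and count the compressed log files. This is only a sketch: `count_log_files` is a hypothetical helper name, and the bucket name is the same placeholder used elsewhere in this post. It assumes the `aws` CLI is installed and configured.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads an `aws s3 ls` listing on stdin and prints
# how many entries end in .gz (the compressed Cloudfront log files).
count_log_files() {
  # grep -c exits non-zero when the count is 0, so `|| true` keeps the
  # function from failing on an empty bucket.
  grep -c '\.gz$' || true
}

# Example usage (needs AWS credentials, so it is left commented out):
# aws s3 ls s3://your-cloudfront-log-bucket/ --recursive | count_log_files
```

A count that grows between two runs suggests logging is working.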
Download and Compress Logs
We run GoAccess on a single log file, yet Cloudfront produces thousands of log files in ~20 minute increments. We must download all log files and combine them into a single file. Here’s a simple bash script which does it for us.
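#!/usr/bin/env bash
# Stop at the first failing command.
set -euo pipefail
# Download every log file from the bucket into the current directory.
aws s3 sync s3://your-cloudfront-log-bucket/ .
# Concatenated gzip files form a valid gzip stream, so merge first.
cat *.gz > combined.log.gz
# Decompress to combined.log; -f overwrites one left by an earlier run.
gzip -df combined.log.gz
rm *.gz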
This script assumes the aws CLI tool is installed and configured locally. You’ll also need to install GoAccess for the next command.
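Because the script above deletes the downloaded files, every run downloads the whole bucket again. A possible variation, sketched here under the assumption that you keep the logs in a local directory such as `logs/`, decompresses each file into the combined log and leaves the originals in place, so `aws s3 sync` only fetches new files next time. The function name `combine_logs` and the directory name are illustrative, not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decompress every .gz file in a directory into one
# plain-text log, keeping the downloaded files so later syncs are incremental.
combine_logs() {
  local log_dir="$1" out_file="$2" f
  : > "$out_file"                    # start from an empty output file
  for f in "$log_dir"/*.gz; do
    [ -e "$f" ] || continue          # the glob matched nothing
    gzip -dc "$f" >> "$out_file"     # -c writes to stdout, original is kept
  done
}

# Example usage (needs AWS credentials, so it is left commented out):
# aws s3 sync s3://your-cloudfront-log-bucket/ logs/
# combine_logs logs combined.log
```

The trade-off is disk space: the compressed logs stay on disk alongside the combined file.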
GoAccess and Cloudfront
With our single log we can run GoAccess to generate the HTML analytics report.
goaccess combined.log --log-format=CLOUDFRONT -o report.html
Or run it interactively in the terminal. See the GoAccess man page for more detail.
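If the combined log covers more time than you want in a report, you can trim it before handing it to GoAccess. The sketch below relies on the Cloudfront standard log layout: tab-separated fields with the date (YYYY-MM-DD) first, and header lines starting with `#`. The function name `logs_since` and the file name `recent.log` are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print header lines plus every entry dated on or
# after the given YYYY-MM-DD date. ISO dates sort correctly as strings.
logs_since() {
  local since="$1" log_file="$2"
  awk -F '\t' -v since="$since" '/^#/ || $1 >= since' "$log_file"
}

# Example usage (needs GoAccess installed, so it is left commented out):
# logs_since 2021-03-01 combined.log > recent.log
# goaccess recent.log --log-format=CLOUDFRONT -o report.html
```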