Three Ways to Count the Objects in an AWS S3 Bucket
source link: https://fuzzyblog.io/blog/aws/2019/10/24/three-ways-to-count-the-objects-in-an-aws-s3-bucket.html
Oct 24, 2019
![IMG_7776.jpeg](https://fuzzyblog.io/blog/assets/IMG_7776.jpeg)
AWS S3, the "Simple Storage Service", is the classic AWS service. It was one of the first to launch, the first one I ever used, and it seemingly lies at the very heart of almost everything AWS does.
Given that S3 is essentially a filesystem, a logical thing to want is a count of the files (objects, in S3 terms) in a bucket. Below are three ways to get one.
Method 1: aws s3 ls
S3 is fundamentally a filesystem, and you can just call ls on it. Yep – ls in the cloud. *blink*
aws s3 ls s3://adl-ohi/ --recursive --summarize | grep "Total Objects:"
Total Objects: 444803
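If you would rather pull that number into a script than read it off the grep output, here is a minimal Python sketch. It assumes the aws CLI is installed and configured, and that the summary line keeps the "Total Objects: N" form shown above. The function names `parse_total_objects` and `count_with_cli` are made up for illustration.

```python
import re
import subprocess


def parse_total_objects(cli_output):
    """Pull N out of the 'Total Objects: N' line; return None if it is missing."""
    match = re.search(r"Total Objects:\s*(\d+)", cli_output)
    return int(match.group(1)) if match else None


def count_with_cli(bucket):
    """Run the same aws s3 ls command as above (assumes the aws CLI is on PATH)."""
    result = subprocess.run(
        ["aws", "s3", "ls", f"s3://{bucket}/", "--recursive", "--summarize"],
        capture_output=True, text=True, check=True,
    )
    return parse_total_objects(result.stdout)

# Example call (needs AWS credentials): count_with_cli("adl-ohi")
```

Keeping the parsing separate from the subprocess call means the regex can be checked against saved output without touching AWS.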
Method 2: aws s3api
And since S3 is a modern filesystem, it actually has an API that you can call. Yep – a JSON API. *blink* *blink*
aws s3api list-objects --bucket adl-ohi --output json --query "[length(Contents[])]"
[
448444
]
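The same listing call is available from Python. What follows is a minimal sketch, not the method used in this post: it assumes boto3 is installed and credentials are configured, and it goes through the `list_objects_v2` paginator because a single list call returns at most 1,000 keys. `count_pages` and `count_bucket` are hypothetical helper names.

```python
def count_pages(pages):
    """Sum the number of keys across list_objects_v2 response pages."""
    # A page with no objects has no "Contents" key, hence the default.
    return sum(len(page.get("Contents", [])) for page in pages)


def count_bucket(bucket):
    """Count every object in a bucket, one page (up to 1,000 keys) at a time."""
    import boto3  # imported here so count_pages can be used without boto3

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    return count_pages(paginator.paginate(Bucket=bucket))

# Example call (needs AWS credentials): count_bucket("adl-ohi")
```

Because the pages are consumed one at a time, memory use stays flat even for buckets with hundreds of thousands of objects.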
Method 3: A Python Example
Naturally, you can just run code to do all this. I started with an example from the Stack Overflow link below that was written for boto and upgraded it to boto3. (As someone who is still a Python novice, I feel pretty good about doing this successfully; I remember when Ruby went through the same AWS v2 to v3 transition, and it sucked there too.) I also learned how to dynamically introspect methods from Python objects as part of this debugging cycle.
#!/usr/local/bin/python
import sys

import boto3

# The bucket name is the first command-line argument.
s3 = boto3.resource('s3')
s3bucket = s3.Bucket(sys.argv[1])

size = 0
totalCount = 0
# Walk every object in the bucket, tallying the count and the size in bytes.
for key in s3bucket.objects.all():
    totalCount += 1
    size += key.size

print('total size:')
print("%.3f GB" % (size*1.0/1024/1024/1024))
print('total count:')
print(totalCount)
which gives output like this:
python3 scratch/count_s3.py adl-ohi
total size:
0.298 GB
total count:
486468
Note: I have a live upload happening on another machine, so the count grows between runs (which is why the three totals above differ), and that's actually fine.
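The method introspection mentioned earlier can be done with the built-ins `dir()`, `getattr()`, and `callable()`. This is a sketch of the general technique rather than the exact steps used during that debugging session. `public_methods` is a hypothetical helper, and `Demo` is a stand-in class so the example runs without AWS access.

```python
def public_methods(obj):
    """Return the sorted names of obj's public methods."""
    cls = type(obj)
    # Look names up on the class, so properties are not evaluated on the instance.
    return [
        name for name in dir(obj)
        if not name.startswith("_") and callable(getattr(cls, name, None))
    ]


class Demo:
    """Stand-in class; in practice you might pass a boto3 bucket resource."""
    size = 0

    def load(self):
        pass

    def delete(self):
        pass


print(public_methods(Demo()))  # ['delete', 'load']
```

Plain data attributes such as `size` are filtered out because they are not callable.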
References