22

Django models 详解之聚合查询(aggregate)与分组查询(annotate)

 3 years ago
source link: https://rollingstarky.github.io/2020/12/22/django-models-aggregate-and-annotate/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Django models 详解之聚合查询(aggregate)与分组查询(annotate)

发表于 2020-12-22

| 分类于 Python

| 0

| 阅读次数: 3

字数统计: 7.6k

|

阅读时长 ≈ 0:08

一、测试代码及数据

models.py 代码

from django.db import models


class Author(models.Model):
name = models.CharField(max_length=100)
age = models.IntegerField()


class Book(models.Model):
name = models.CharField(max_length=300)
price = models.DecimalField(max_digits=10, decimal_places=2)
authors = models.ManyToManyField(Author)
pubdate = models.DateField()

测试数据

authors:

[
{
"id": 1,
"name": "路人甲",
"age": 10
},
{
"id": 2,
"name": "路人乙",
"age": 18
},
{
"id": 3,
"name": "路人丙",
"age": 28
},
{
"id": 4,
"name": "路人丁",
"age": 50
}
]

books:

[
{
"id": 1,
"name": "人之初",
"price": "38.80",
"pubdate": "2020-12-01",
"authors": [
1
]
},
{
"id": 2,
"name": "性本善",
"price": "28.40",
"pubdate": "2020-06-01",
"authors": [
2
]
},
{
"id": 3,
"name": "性相近",
"price": "15.20",
"pubdate": "2019-10-01",
"authors": [
3
]
},
{
"id": 4,
"name": "习相远",
"price": "35.20",
"pubdate": "2019-07-01",
"authors": [
4
]
},
{
"id": 5,
"name": "苟不教",
"price": "5.20",
"pubdate": "2018-07-01",
"authors": [
1,
3,
4
]
},
{
"id": 6,
"name": "性乃迁",
"price": "55.20",
"pubdate": "2018-12-01",
"authors": [
2,
3,
4
]
},
{
"id": 7,
"name": "教之道",
"price": "33.20",
"pubdate": "2018-12-23",
"authors": [
2,
3
]
},
{
"id": 8,
"name": "贵以专",
"price": "27.20",
"pubdate": "2017-12-23",
"authors": [
1,
4
]
}
]

二、常用聚合操作

获取所有书籍的数量:

>>> Book.objects.count()
8

获取由路人甲参与著作的所有书籍的数量:

>>> Book.objects.filter(authors__name__contains='路人甲').count()
3

获取所有书籍的平均价格:

>>> Book.objects.all().aggregate(Avg('price'))
{'price__avg': Decimal('29.800000')}

获取所有书籍中的最高价格:

>>> Book.objects.all().aggregate(Max('price'))
{'price__max': Decimal('55.20')}
涉及到一对多或多对多关系的聚合查询

计算每一位作者各自参与写作了多少本书:

>>> from django.db.models import Count
>>> authors=Author.objects.annotate(num_books=Count('book'))
>>> authors
<QuerySet [<Author: Author object (1)>, <Author: Author object (2)>, <Author: Author object (3)>, <Author: Author object (4)>]>
>>> authors[0].num_books
3
>>> authors.values_list('name', 'num_books')
<QuerySet [('路人甲', 3), ('路人乙', 3), ('路人丙', 4), ('路人丁', 4)]>

即作者包含路人甲的书籍有3本,以此类推。

计算每一位作者各自参与写作的书籍数量,根据书籍出版年份是否在2020年以前分界:

>>> from django.db.models import Q
>>> before_2020 = Count('book', filter=Q(book__pubdate__lt='2020-01-01'))
>>> after_2020 = Count('book', filter=Q(book__pubdate__gt='2020-01-01'))
>>> authors = Author.objects.annotate(before_2020=before_2020).annotate(after_2020=after_2020)
>>> authors
<QuerySet [<Author: Author object (1)>, <Author: Author object (2)>, <Author: Author object (3)>, <Author: Author object (4)>]>
>>> authors[0].before_2020
2
>>> authors.values_list('name', 'before_2020', 'after_2020')
<QuerySet [('路人甲', 2, 1), ('路人乙', 2, 1), ('路人丙', 4, 0), ('路人丁', 4, 0)]>

即作者包含路人甲的书籍,2020年以前出版的有2本,2020年以后出版的有1本。以此类推。

获取每一位作者各自参与著作的书籍数量,将输出结果按书籍数量由大到小的顺序排序:

>>> authors = Author.objects.annotate(num_books=Count('book')).order_by('-num_books')
>>> authors
<QuerySet [<Author: Author object (3)>, <Author: Author object (4)>, <Author: Author object (1)>, <Author: Author object (2)>]>
>>> authors.values_list('name', 'num_books')
<QuerySet [('路人丙', 4), ('路人丁', 4), ('路人甲', 3), ('路人乙', 3)]>

三、aggregate

在聚合查询中,Django 支持通过 aggregate() 方法从整个 QuerySet 中计算出一个汇总数据。如获取所有书籍的平均价格:

>>> from django.db.models import Avg
>>> Book.objects.all().aggregate(Avg('price'))
{'price__avg': Decimal('29.800000')}

上述语句中的 all() 可以省略。aggregate() 的参数表示我们想要做聚合计算的那一列数据,其中的 'price' 即表示 Book 模型的 price 字段。

aggregate() 对于 QuerySet 来说是一种终止语句,会返回字典形式的键值对作为计算结果。其中的键会根据聚合的字段自动生成,也可以手动指定:

>>> Book.objects.all().aggregate(average_price=Avg('price'))
{'average_price': Decimal('29.800000')}

如果想要同时完成多个聚合查询操作,可以为 aggregate() 添加多个参数:

>> from django.db.models import Avg, Max, Min
>>> Book.objects.aggregate(Avg('price'), Max('price'), Min('price'))
{'price__avg': Decimal('29.800000'), 'price__max': Decimal('55.20'), 'price__min': Decimal('5.20')}

四、annotate

借助 annotate() 方法,Django 可以从 QuerySet 的每一个对象中计算出对应的独立的汇总数据。比如想获得 Book 模型中每一本书的作者的数量:

>>> from django.db.models import Count
>>> q = Book.objects.annotate(num_authors=Count('authors'))
>>> q
<QuerySet [<Book: Book object (1)>, <Book: Book object (2)>, <Book: Book object (3)>, <Book: Book object (4)>, <Book: Book object (5)>, <Book: Book object (6)>, <Book: Book object (7)>, <Book: Book object (8)>]>
>>> q[0].num_authors
1
>>> q.values_list('name', 'num_authors')
<QuerySet [('人之初', 1), ('性本善', 1), ('性相近', 1), ('习相远', 1), ('苟不教', 3), ('性乃迁', 3), ('教之道', 2), ('贵以专', 2)]>

不同于 aggregate()annotate() 对于 QuerySet 来说并不是终止语句,annotate() 方法的输出结果仍是 QuerySet 对象。该对象可以继续执行被 QuerySet 支持的任意操作,如 filter()order_by() 等,甚至另一个 annotate()

五、join & aggregate

某些情况下,你想要聚合的字段并不属于当前正在查询的模型,而是属于关联于当前模型的另一个模型。在对这些字段进行聚合查询时,Django 允许使用与 filter() 中相同的用于指定关联字段的双下划线语法。

比如想要获取每一位作者所著书籍的价格区间:

>>> from django.db.models import Max, Min
>>> authors = Author.objects.annotate(min_price=Min('book__price'), max_price=Max('book__price'))
>>> authors.values_list('name', 'min_price', 'max_price')
<QuerySet [('路人甲', Decimal('5.20'), Decimal('38.80')), ('路人乙', Decimal('28.40'), Decimal('55.20')), ('路人丙', Decimal('5.20'), Decimal('55.20')), ('路人丁', Decimal('5.20'), Decimal('55.20'))]>

即作者为路人甲的书籍中,最低的价格为 5.20,最高的价格为 38.80。

六、filter() 或 order_by() 应用到 annotate()

如查找所有多人合著(作者数量大于 1)的书籍列表:

>>> books = Book.objects.annotate(num_authors=Count('authors')).filter(num_authors__gt=1)
>>> books
<QuerySet [<Book: Book object (5)>, <Book: Book object (6)>, <Book: Book object (7)>, <Book: Book object (8)>]>
>>> books.values_list('name', 'num_authors')
<QuerySet [('苟不教', 3), ('性乃迁', 3), ('教之道', 2), ('贵以专', 2)]>

根据作者数量对全部书籍进行排序:

>>> books = Book.objects.annotate(num_authors=Count('authors')).order_by('num_authors')
>>> books
<QuerySet [<Book: Book object (2)>, <Book: Book object (4)>, <Book: Book object (1)>, <Book: Book object (3)>, <Book: Book object (8)>, <Book: Book object (7)>, <Book: Book object (5)>, <Book: Book object (6)>]>
>>> books.values_list('name', 'num_authors')
<QuerySet [('性本善', 1), ('习相远', 1), ('人之初', 1), ('性相近', 1), ('教之道', 2), ('贵以专', 2), ('苟不教', 3), ('性乃迁', 3)]>

Django 官方文档 —— Aggregation


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK