A Python Generator Pitfall

 3 years ago
source link: https://chenjiehua.me/python/python-generator-trick.html

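When you write an iterator in Python, someone will probably recommend a generator and list a pile of its benefits. Today, though, I want to share a generator pitfall...

Suppose we want to write a normalization function. Here is a simple example: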

Python
def normalize(numbers):
    # Express each value as a percentage of the total.
    total = sum(numbers)
    result = []
    for v in numbers:
        p = 100.0 * v / total
        result.append(p)
    return result

def main():
    data = [12, 20, 35, 9]
    print(normalize(data))

With ordinary input, the output is exactly what we expect:

Default
[15.789473684210526, 26.31578947368421, 46.05263157894737, 11.842105263157896]

But...

What happens if a generator is passed in as numbers? Let's try it:

Python
def gen_data(data):
    # Yield the values one at a time.
    for v in data:
        yield v

def main():
    data = [12, 20, 35, 9]
    gen = gen_data(data)
    print(normalize(gen))

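Guess what it prints:

Default
[]

That's right, an empty list! What is going on? Maybe the generator is broken inside the for loop, so let's try printing its values directly: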

Python
def main():
    data = [12, 20, 35, 9]
    gen = gen_data(data)
    for v in gen:
        print(v)

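Output:

Default
12
20
35
9

No problem there!

So...

With the generator itself ruled out, the problem must be in the normalize function. Let's step through it with a breakpoint: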

Python
def normalize(numbers):
    total = sum(numbers)      # nothing wrong here: total = 76
    result = []
    for v in numbers:         # the for loop is skipped entirely
        p = 100.0 * v / total
        result.append(p)
    return result

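Think about it: how does a generator work? Each call to next() fetches the next element, until StopIteration is raised.

That explains it. sum() has already iterated through the whole generator, so by the time the for loop runs there is nothing left to iterate over.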
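The exhaustion described above can be reproduced by driving the generator by hand. This is only a small sketch: gen_data is the function from this article, while the variable names (first, rest, exhausted, leftover) are mine and purely illustrative.

```python
def gen_data(data):
    # Same generator function as in the article.
    for v in data:
        yield v


gen = gen_data([12, 20, 35, 9])

first = next(gen)   # pulls the first value out of the generator
rest = list(gen)    # consumes everything that is left

try:
    next(gen)       # nothing left, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

# Looping over an exhausted generator simply runs zero times.
leftover = [v for v in gen]

print(first, rest, exhausted, leftover)
```

A for loop catches StopIteration silently, which is why the second pass in normalize produces no error, just no iterations.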

Python
def main():
    data = [12, 20, 35, 9]
    gen = gen_data(data)
    t1 = sum(gen)   # consumes the generator
    t2 = sum(gen)   # nothing left to add up
    print(t1, t2)

Output:

Default
76 0

The fix...

Here is one simple solution:

Python
class GenData(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for v in self.data:
            yield v

As for why this works, I'll leave that for you to think about.
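If you would like a hint, here is a small experiment to try. It is a sketch, not the only way to look at it: normalize, gen_data and GenData follow the article (normalize is shortened to a list comprehension), and the names same_iterator and fresh_iterators are mine. It compares what iter() hands back for a plain generator and for a GenData instance.

```python
def normalize(numbers):
    total = sum(numbers)                          # first pass over numbers
    return [100.0 * v / total for v in numbers]   # second pass over numbers


def gen_data(data):
    for v in data:
        yield v


class GenData(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # Every call builds a brand-new generator over self.data.
        for v in self.data:
            yield v


data = [12, 20, 35, 9]

gen = gen_data(data)
same_iterator = iter(gen) is iter(gen)   # a generator returns itself

container = GenData(data)
fresh_iterators = iter(container) is not iter(container)

result = normalize(GenData(data))
print(same_iterator, fresh_iterators)
print(result)
```

Both sum() and the for loop call iter() on their argument, so each of them gets its own pass over a GenData instance, while a plain generator is shared between them.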

