
Concurrent.futures in Python

source link: https://jdhao.github.io/2020/12/29/python_concurrent_futures/


executor.map() or executor.submit()

There are two main ways to use an executor for parallel computing: the first is via executor.map(), and the second is via executor.submit() combined with concurrent.futures.as_completed().

Here is a simple example to demonstrate this:

import time
from contextlib import contextmanager
from concurrent.futures import ThreadPoolExecutor
import concurrent.futures


@contextmanager
def report_time(des):
    # Context manager that reports the wall-clock time of the enclosed block.
    start = time.time()
    yield
    end = time.time()

    print(f"Time for {des}: {end-start}")


def square(x):
    time.sleep(0.5)  # simulate a slow task
    return x*x


def main():
    num = 10

    with report_time("using executor.map"):
        # with ThreadPoolExecutor(max_workers=10) as executor:
        with ThreadPoolExecutor() as executor:
            # map() returns a lazy iterator; results keep the input order
            res = executor.map(square, range(num))
        res = list(res)

    with report_time('using executor.submit'):
        with ThreadPoolExecutor() as executor:
            # submit() returns a Future object for each call
            my_futures = [executor.submit(square, x) for x in range(num)]
            res = []
            # as_completed() yields futures in the order they finish
            for future in concurrent.futures.as_completed(my_futures):
                res.append(future.result())
            print(res)


if __name__ == "__main__":
    main()

Note that executor.map() returns an iterator rather than a plain list, and the order of the results matches the order of the arguments we pass to the function we want to run in parallel.

executor.submit(), on the other hand, returns a Future object for each call, and we can later retrieve the function's return value from it via future.result(). Unlike map(), we iterate over concurrent.futures.as_completed(my_futures), which yields each future as soon as it has finished. Whichever future finishes first is returned first, so the order of the results is no longer guaranteed. We may get the following result for res:

[4, 9, 1, 16, 25, 49, 36, 0, 64, 81]
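
If we still want executor.submit() but need the results in input order, a minimal sketch of one common pattern is to keep a mapping from each future back to its input index (reusing the square function from the example above):

import time
import concurrent.futures
from concurrent.futures import ThreadPoolExecutor


def square(x):
    time.sleep(0.5)
    return x*x


num = 10
with ThreadPoolExecutor() as executor:
    # Remember each future's original position so we can restore the input order.
    future_to_idx = {executor.submit(square, x): i for i, x in enumerate(range(num))}
    ordered = [None] * num
    for future in concurrent.futures.as_completed(future_to_idx):
        ordered[future_to_idx[future]] = future.result()

print(ordered)  # [0, 1, 4, 9, ...] restored to input order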

max_workers

For concurrent.futures.ThreadPoolExecutor(), there is a max_workers parameter to specify the maximum number of threads to use. According to the official doc, it defaults to min(32, os.cpu_count() + 4) on Python 3.8 and later, and to os.cpu_count() * 5 on Python 3.5 through 3.7.
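
To see what the default works out to on a given machine, here is a minimal sketch that simply evaluates the documented Python 3.8+ formula:

import os

# Evaluate the documented Python 3.8+ default: min(32, os.cpu_count() + 4).
cpu_count = os.cpu_count() or 1
default_workers = min(32, cpu_count + 4)
print(f"CPU count: {cpu_count}, default max_workers: {default_workers}")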

In some cases, the default max_workers may be so large that it causes serious issues. For example, when I used ThreadPoolExecutor() with the default parameters to make web requests, my code ran for a few requests and then hung indefinitely without any progress. I had to reduce max_workers to about 50 for the code to run smoothly. So in real projects, we should tune the value of max_workers to fit our needs.
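
For illustration, a minimal sketch of capping max_workers explicitly when fetching URLs; the urls list and the use of the requests library here are just placeholders for the example:

import concurrent.futures
import requests

# Hypothetical list of URLs; replace with real targets.
urls = [f"https://example.com/page/{i}" for i in range(200)]


def fetch(url):
    # A timeout prevents a stuck request from blocking a worker forever.
    return requests.get(url, timeout=10).status_code


# Cap the pool size explicitly instead of relying on the default.
with concurrent.futures.ThreadPoolExecutor(max_workers=50) as executor:
    futures = [executor.submit(fetch, url) for url in urls]
    for future in concurrent.futures.as_completed(futures):
        print(future.result())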

Author jdhao

LastMod 2020-12-29

License CC BY-NC-ND 4.0
