Concurrent.futures in Python
source link: https://jdhao.github.io/2020/12/29/python_concurrent_futures/
executor.map() or executor.submit()
There are two main ways to use an executor for parallel computing: the first is via executor.map(), and the second is via executor.submit() combined with concurrent.futures.as_completed().
Here is a simple example to demonstrate this:
```python
import time
from contextlib import contextmanager
from concurrent.futures import ThreadPoolExecutor
import concurrent.futures


@contextmanager
def report_time(des):
    start = time.time()
    yield
    end = time.time()
    print(f"Time for {des}: {end-start}")


def square(x):
    time.sleep(0.5)
    return x*x


def main():
    num = 10
    with report_time("using executor.map"):
        # with ThreadPoolExecutor(max_workers=10) as executor:
        with ThreadPoolExecutor() as executor:
            res = executor.map(square, range(num))
            res = list(res)

    with report_time('using executor.submit'):
        with ThreadPoolExecutor() as executor:
            my_futures = [executor.submit(square, x) for x in range(num)]
            res = []
            for future in concurrent.futures.as_completed(my_futures):
                res.append(future.result())
            print(res)


if __name__ == "__main__":
    main()
```
Note that executor.map() returns an iterator rather than a plain list, and the order of the results matches the order of the arguments passed to the function we run in parallel.

executor.submit(), on the other hand, returns a future object, and we can later retrieve the function's return value from that future. Unlike map(), we then use concurrent.futures.as_completed(my_futures) to make sure the functions are actually executed and to collect their results. Whichever future finishes first is returned first, so there is no longer any guarantee about the result order. We may get the following result for res:
[4, 9, 1, 16, 25, 49, 36, 0, 64, 81]
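If we want the flexibility of submit() but still need results in input order, one option (a sketch, not from the original article) is to map each future back to its argument and sort afterwards:

```python
import time
import concurrent.futures
from concurrent.futures import ThreadPoolExecutor


def square(x):
    time.sleep(0.1)
    return x * x


# Map each future back to its input so results can be re-ordered later.
with ThreadPoolExecutor() as executor:
    future_to_arg = {executor.submit(square, x): x for x in range(10)}
    results = {}
    for future in concurrent.futures.as_completed(future_to_arg):
        results[future_to_arg[future]] = future.result()

# Sort by the original argument to restore input order.
ordered = [results[x] for x in sorted(results)]
print(ordered)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

This keeps the early-completion behavior of as_completed() while recovering a deterministic output order at the end.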
max_workers
concurrent.futures.ThreadPoolExecutor() takes a max_workers parameter that specifies the maximum number of threads to use. According to the official doc, the default is min(32, os.cpu_count() + 4) since Python 3.8, and os.cpu_count() * 5 for Python versions below 3.8 and above 3.5.
In some cases, the default max_workers may be too large and cause serious issues. For example, when I used ThreadPoolExecutor() with default parameters to make web requests, my code ran for a few requests and then hung indefinitely without making any progress. I had to reduce max_workers to about 50 for the code to run smoothly. So in real projects, we should tune the value of max_workers to fit our needs.
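As a minimal sketch, capping the pool size explicitly looks like this (the cap of 50 is the value that happened to work for my web-request workload; it is workload-specific, not a general recommendation):

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Choose an explicit cap instead of relying on the version-dependent default.
# 50 is an illustrative ceiling; tune it for your own workload.
max_workers = min(50, (os.cpu_count() or 1) + 4)

with ThreadPoolExecutor(max_workers=max_workers) as executor:
    futures = [executor.submit(pow, 2, n) for n in range(8)]
    print([f.result() for f in futures])  # [1, 2, 4, 8, 16, 32, 64, 128]
```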