Multithreading vs. Multiprocessing with OpenCV in Python
I have the following function:
import cv2

def Upscale(path_to_image):
    img = cv2.imread(path_to_image)
    sr = cv2.dnn_superres.DnnSuperResImpl_create()
    path = 'LapSRN_x8.pb'
    sr.readModel(path)
    sr.setModel('lapsrn', 8)
    result = sr.upsample(img)
    cv2.imwrite(f'C:\\Users\\user\\Desktop\\PyStuff\\images\\{path_to_image}_resized.png', result)
    return result
This function takes an image path, upscales the image, writes the result to a folder, and returns the resulting image array. There is a list of PNG image paths, which is of the following form:
file_list = ['page-1.png','page-2.png',...etc]
I attempted to use multithreading to make the process faster, as each image takes 95 sec to complete (there are hundreds of images). The code for this is the following:
import cv2
import tqdm
from concurrent.futures import ThreadPoolExecutor, as_completed

res = []
with ThreadPoolExecutor(max_workers=10) as executor:
    future_to_response = {
        executor.submit(Upscale, f'C:\\Users\\rturedi\\Desktop\\DPI_proj\\images\\{i}'): i for i in file_list
    }
    t = tqdm.tqdm(total=len(future_to_response))
    for future in as_completed(future_to_response):
        res.append(future.result())
        t.update()  # advance the progress bar as each future completes

for i in range(len(res)):
    cv2.imwrite(f'{i}.png', res[i])
The above code multithreads the process, then runs a simple loop over the list I am appending to in order to store the images. This takes just as long as running each image consecutively (the multithreading does not make the process faster).
I attempted to fix this by instead using multiprocessing, and the code for this is as follows:
import multiprocessing

res = []
for i in file_list:
    p = multiprocessing.Process(target=Upscale(f'C:\\Users\\rturedi\\Desktop\\DPI_proj\\images\\{i}'))
    res.append(p)
    p.start()
However, this takes just as long (it does not decrease the time each image takes), and furthermore, my res array is not a list of image arrays; instead it contains:
[<Process name='Process-54' pid=28028 parent=27048 stopped exitcode=1>,
<Process name='Process-55' pid=18272 parent=27048 stopped exitcode=1>,
<Process name='Process-56' pid=23116 parent=27048 stopped exitcode=1>,
<Process name='Process-57' pid=5536 parent=27048 stopped exitcode=1>,
<Process name='Process-58' pid=14496 parent=27048 stopped exitcode=1>,
<Process name='Process-59' pid=16964 parent=27048 stopped exitcode=1>,
<Process name='Process-60' pid=14832 parent=27048 stopped exitcode=1>,
<Process name='Process-61' pid=19584 parent=27048 stopped exitcode=1>,
<Process name='Process-62' pid=20244 parent=27048 stopped exitcode=1>,
<Process name='Process-63' pid=28768 parent=27048 stopped exitcode=1>,
<Process name='Process-64' pid=16164 parent=27048 stopped exitcode=1>,
<Process name='Process-65' pid=21196 parent=27048 stopped exitcode=1>]
Does anyone have an idea of how I can accomplish making this process faster with either multithreading or multiprocessing?
I think multiprocessing will help once you are actually running the function for multiple images at the same time.
Right now your for loop passes one image at a time, and because target=Upscale(...) calls the function immediately, the work still happens serially in the parent process (which is also why each Process exits with exitcode=1).
Try the following approach and compare the time it takes with your original approach.
from multiprocessing import Process

if __name__ == '__main__':
    p1 = Process(target=Upscale, args=(file_list[0:len(file_list)//2],))
    p1.start()
    p2 = Process(target=Upscale, args=(file_list[len(file_list)//2:len(file_list)],))
    p2.start()
    p1.join()
    p2.join()
Inside the Upscale function, loop through the list of images and perform the same tasks as before (a sketch of what this could look like is below).
Also note that the function's arguments are passed separately through the args tuple of multiprocessing.Process, rather than by calling the function yourself.
P.S. This is my first time answering; let me know if something is not clear.
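For reference, here is a minimal sketch of the modified Upscale, assuming it now receives a list of paths; the model file and setModel parameters are taken from the question, and writing next to the input path is just a placeholder:

import cv2

def Upscale(image_paths):
    # load the super-resolution model once per process instead of once per image
    sr = cv2.dnn_superres.DnnSuperResImpl_create()
    sr.readModel('LapSRN_x8.pb')
    sr.setModel('lapsrn', 8)
    for path_to_image in image_paths:
        img = cv2.imread(path_to_image)
        result = sr.upsample(img)
        # write the upscaled image next to the input (placeholder output path)
        cv2.imwrite(f'{path_to_image}_resized.png', result)

Loading the model once per process also avoids repeating the readModel call for every image, which is wasted work in the original function.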
For the output, you can try the following approach; I picked it up online from https://superfastpython.com/multiprocessing-return-value-from-process/
# example of returning a variable from a process using a Value
from random import random
from time import sleep
from multiprocessing import Value
from multiprocessing import Process

# function to execute in a child process
def task(variable):
    # generate some data
    data = random()
    # block, to simulate computational effort
    print(f'Generated {data}', flush=True)
    sleep(data)
    # return data via the shared value
    variable.value = data

# protect the entry point
if __name__ == '__main__':
    # create shared variable
    variable = Value('f', 0.0)
    # create a child process
    process = Process(target=task, args=(variable,))
    # start the process
    process.start()
    # wait for the process to finish
    process.join()
    # report return value
    print(f'Returned: {variable.value}')
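A shared Value only holds a single number, so it does not directly fit returning image arrays. If you want the upscaled arrays back in the parent process, one alternative (not from the answer above, just a sketch) is multiprocessing.Pool, which pickles each worker's return value back to the parent automatically; the model file is the one from the question, and the function name upscale_one and the worker count are placeholders:

import cv2
from multiprocessing import Pool

def upscale_one(path_to_image):
    # each worker process loads the model when it runs this function
    sr = cv2.dnn_superres.DnnSuperResImpl_create()
    sr.readModel('LapSRN_x8.pb')
    sr.setModel('lapsrn', 8)
    img = cv2.imread(path_to_image)
    return sr.upsample(img)

if __name__ == '__main__':
    file_list = ['page-1.png', 'page-2.png']  # placeholder list from the question
    with Pool(processes=4) as pool:
        # map distributes the paths across workers and collects the returned arrays
        res = pool.map(upscale_one, file_list)
    for i, result in enumerate(res):
        cv2.imwrite(f'{i}.png', result)

Note that shipping large arrays back through pickling has some overhead, so if you only need the files on disk, writing inside the worker (as in the original function) and returning nothing is cheaper.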