First, a baseline timing. Using this code to generate 5 random matrices:
import numpy as np

n, t = 1000, 5
with open("data", "w") as f:
    for _ in range(t):
        # savetxt needs the target file as its first argument
        np.savetxt(f, np.random.randint(1000, size=(n, n)))
Your code takes about 10 seconds to execute on my machine.
Second, some small stylistic fixes. Your try...except...else block just duplicates what Python does for every line anyway:
- It tries to execute the line.
- If this fails, it raises an appropriate error.
- Otherwise go on with the next line.
So you can just remove that. In any case, you should never have a bare except unless you are really sure what it means. It also catches KeyboardInterrupt, so you will not be able to abort the process using CTRL+C, for example.
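To illustrate the point about CTRL+C, here is a minimal sketch. The function names are made up for this example, and KeyboardInterrupt is raised by hand as a stand-in for actually pressing CTRL+C:

```python
def swallow_everything(action):
    """Bare except: catches everything, including KeyboardInterrupt."""
    try:
        action()
    except:  # deliberately bare, to show the problem
        return "swallowed"
    return "ok"


def swallow_errors_only(action):
    """except Exception: ordinary errors only, CTRL+C still propagates."""
    try:
        action()
    except Exception:
        return "swallowed"
    return "ok"


def press_ctrl_c():
    # Stand-in for the user pressing CTRL+C while the code runs.
    raise KeyboardInterrupt


# With the bare except, the interrupt is silently eaten.
print(swallow_everything(press_ctrl_c))

# With except Exception, the interrupt reaches the caller.
try:
    swallow_errors_only(press_ctrl_c)
except KeyboardInterrupt:
    print("interrupt reached the caller")
```

If you do need a catch-all, except Exception is usually the safer choice, because KeyboardInterrupt and SystemExit do not inherit from Exception.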
You should also rename your function from printImage to print_image, to adhere to Python's official style guide, PEP8.
Then, to improve the speed. This is quite tricky (as you might have discovered yourself already). These are the things I tried, but which failed to improve the time:
1. Create the figure only once and reuse it. This was recommended here.
2. Get rid of the GUI drawing completely, by using the underlying objects. This was recommended here.
3. Use pandas.read_csv with a hack to read an iterator, as shown here.
What did make a huge difference is combining the second item above (dropping the GUI drawing) with the multiprocessing module:
import itertools as it
import multiprocessing
import time

from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from matplotlib.figure import Figure


def timeit(func):
    def wrapper(*args, **kwargs):
        # time.clock() was removed in Python 3.8; perf_counter() replaces it
        starting_time = time.perf_counter()
        result = func(*args, **kwargs)
        ending_time = time.perf_counter()
        print('Duration: {}'.format(ending_time - starting_time))
        return result
    return wrapper


@timeit
def print_image_multiprocessing(n, t):
    print("print_image_multiprocessing")
    with multiprocessing.Pool(4) as p, open("data", "r") as f:
        # one list of n parsed rows per image, read lazily from the file
        items_gen = ([list(map(float, line.split()))
                      for line in it.islice(f, n)]
                     for _ in range(t))
        # t is passed along because the workers need it for the progress line
        p.starmap(print_image_no_gui, zip(items_gen, it.count(), it.repeat(t)))


def print_image_no_gui(items, i, t):
    # build the figure directly, without going through pyplot (no GUI)
    fig = Figure()
    ax = fig.add_subplot(111)
    image = ax.imshow(items, interpolation='nearest')
    image.set_cmap('hot')
    ax.axis('off')
    image.axes.get_xaxis().set_visible(False)
    image.axes.get_yaxis().set_visible(False)
    canvas = FigureCanvas(fig)
    canvas.print_figure('mult%05d' % i + '.png', bbox_inches='tight',
                        pad_inches=0, dpi=300)
    print(str(i) + '/' + str(t))


if __name__ == "__main__":
    print_image_multiprocessing(1000, 5)
With the 4 workers this takes less than 3 seconds on my machine, compared to 10 seconds for your code. For 10 images it needs 6 seconds instead of 14. Not ideal, but I don't see any other obvious improvements.
Note that I removed the tight_layout call, because it raises a warning per plot. According to this issue here, one can work around it by using fig.set_tight_layout(True), but this then raises a different warning, complaining about the axes not being drawable. Since you remove the axes anyway, removing that call does no harm.
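In case the items_gen expression in print_image_multiprocessing looks opaque: it relies on it.islice consuming the open file n lines at a time, so each pass of the outer generator yields the next matrix. Here is a small sketch of that idea in isolation, using an in-memory io.StringIO with made-up numbers as a stand-in for the data file:

```python
import io
import itertools as it

# Hypothetical stand-in for the "data" file: 2 matrices of 3 rows each.
fake_file = io.StringIO("1 2\n3 4\n5 6\n7 8\n9 10\n11 12\n")
n, t = 3, 2

# Each pass of the outer generator pulls the next n lines from the file
# and parses them into a list of rows.
blocks = ([list(map(float, line.split())) for line in it.islice(fake_file, n)]
          for _ in range(t))

matrices = list(blocks)
print(matrices[0])  # first block of n rows
print(matrices[1])  # second block of n rows
```

Because the generator is lazy, the file has to stay open until it has been consumed, which is why the starmap call sits inside the with block.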