This is an interesting example from G. Lanaro's book, Python High Performance. The program is a simple simulator that describes the movement of particles based on their positions and angular velocities (the x, y, and ang_v attributes of a Particle class). The bottleneck of the program is the evolve method, which is first implemented in pure Python; the author then shows how the performance of this specific method can be improved using NumPy operations.
My problem is that when I compare the runtime of the NumPy version (evolve_numpy) with the pure Python version (evolve), I can hardly see any improvement; in fact, the pure Python version seems to be slightly faster.
I wrote all of this code myself, based on what I learned in the book.
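For reference, the Particle and ParticleSimualtor classes are just simple containers; here is a rough sketch of them, reconstructed from how they are used below (the complete code is in the gist linked at the end):

class Particle:
    def __init__(self, x, y, ang_v):
        self.x = x          # position
        self.y = y
        self.ang_v = ang_v  # angular velocity

class ParticleSimualtor:
    def __init__(self, particles):
        self.particles = particles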
The initial version of evolve is as follows:
def evolve(self, dt):
    timestep = 0.00001
    nstep = int(dt / timestep)
    for i in range(nstep):
        for p in self.particles:
            t_x_ang = timestep * p.ang_v
            norm = (p.x ** 2 + p.y ** 2) ** 0.5
            p.x, p.y = (p.x - t_x_ang * p.y / norm,
                        p.y + t_x_ang * p.x / norm)
            # repeat for all particles
        # repeat for all time-steps
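Note that with dt = 0.1 (as in the benchmark below) and timestep = 0.00001, nstep is about 10,000, so the pure Python version executes roughly 10,000 × nparts iterations of the inner loop body per call, all in interpreted code.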
Here is the NumPy version of the same method:
def evolve_numpy(self, dt):
    # assumes that numpy has been imported as np at module level
    timestep = 0.00001
    nsteps = int(dt / timestep)
    # pack positions and angular velocities into arrays once, up front
    r_i = np.array([[p.x, p.y] for p in self.particles])
    ang_vel_i = np.array([p.ang_v for p in self.particles])
    for i in range(nsteps):
        # unit vector perpendicular to each position: (-y, x) / |r|
        norm_i = np.sqrt((r_i ** 2).sum(axis=1))
        v_i = r_i[:, [1, 0]]
        v_i[:, 0] *= -1
        v_i /= norm_i[:, np.newaxis]
        # displacement for this time-step, applied to all particles at once
        d_i = timestep * ang_vel_i[:, np.newaxis] * v_i
        r_i += d_i
    # copy the results back into the particle objects
    for i, p in enumerate(self.particles):
        p.x, p.y = r_i[i]
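One detail worth noting in evolve_numpy: r_i[:, [1, 0]] uses fancy (integer-array) indexing, which returns a new array rather than a view, so the in-place sign flip and division only modify v_i and leave r_i untouched. A quick check:

import numpy as np

r_i = np.array([[1.0, 2.0],
                [3.0, 4.0]])
v_i = r_i[:, [1, 0]]   # fancy indexing copies the swapped columns (y, x)
v_i[:, 0] *= -1        # each row of v_i is now (-y, x), perpendicular to (x, y)
print(r_i)             # unchanged: [[1. 2.] [3. 4.]]
print(v_i)             # [[-2.  1.] [-4.  3.]]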
To test the runtime of both versions, a benchmark2 function is defined as follows:
from random import uniform

def benchmark2(nparts, m):
    particles = [Particle(uniform(-1, 1), uniform(-1, 1), uniform(-1, 1))
                 for i in range(nparts)]
    sim = ParticleSimualtor(particles)
    if m == 1:
        sim.evolve(0.1)
    elif m == 2:
        sim.evolve_numpy(0.1)
    else:
        raise Exception('not a valid method')
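The two variants can then be timed with a simple wrapper along these lines (the particle count here is only illustrative, not necessarily the one used for the measurements below):

import time

nparts = 10_000  # illustrative size

start = time.perf_counter()
benchmark2(nparts, 1)
print('pure Python:', time.perf_counter() - start, 's')

start = time.perf_counter()
benchmark2(nparts, 2)
print('NumPy:', time.perf_counter() - start, 's')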
You can access the complete code here: https://gist.github.com/OmidMiresmaeili/2891c4f1ce95f65cfab8ed0469e427d4
My results are as follows: 1 min 40 s for the NumPy version versus 1 min 29 s for the pure Python version.
But the author claims (and I understand the reasoning) that the NumPy version runs much faster than the initial version of evolve, which uses no NumPy operations:
This is a figure from page 71 of the book. It shows the runtime for different particle counts (the nparts argument of the benchmark2 function). The author then concludes:
The plot shows that both the implementations scale linearly with particle size, but the runtime in the pure Python version grows much faster than the NumPy version; at greater sizes, we have a greater NumPy advantage. In general, when using NumPy, you should try to pack things into large arrays and group the calculations using the broadcasting feature.
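To reproduce that comparison, the same timing can be repeated over a range of particle counts, roughly like this (the sizes are just examples):

import time

for nparts in (1_000, 2_000, 5_000, 10_000):
    results = []
    for m in (1, 2):                     # 1 = pure Python, 2 = NumPy
        start = time.perf_counter()
        benchmark2(nparts, m)
        results.append(time.perf_counter() - start)
    print(f'nparts={nparts}: pure Python {results[0]:.1f} s, NumPy {results[1]:.1f} s')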
Now I'm not sure whether there is an error in the book or whether I'm not understanding this correctly. I would appreciate any input you can give me.