Profiling Python Code

pprofile

To get a line-by-line profile of your code, you can use the pprofile tool, for a granular, thread-aware analysis for deterministic or statistical profiling (pure Python). The form of pprofile is:

$ pprofile some_python_executable arg1 ...

After the tool finishes, it prints annotated code of each file involved in the execution.

By default, pprofile profiling is deterministic, which although it slows down the code, produces a very complete profile. You can also use pprofile in a statistical manner, which uses much less time:

$ pprofile --statistic .01 code.py

With the statistic option, you also need to specify the period of time between sampling. In this example, a period of 0.01 seconds was used.

Be careful when using the statistic option because, if the sample time is too long, you can miss a number of computations, and the output will incorrectly record zero percent activity. Conversely, to get a better estimation of the time spent in certain portions of the code, you have to reduce the time between samples to the point of almost deterministic profiling.

The deterministic pprofile sample output in Listing 3 uses the same code as the previous cProfile example. I cut out sections of the output because it is very extensive. One thing I want to point out is the increase in execution time by about a factor of 10 (i.e., it ran 10 times slower than without profiling).

Listing 3: pprofile Output

Command line: md_002.py
Total duration: 1662.48s
File: md_002.py
File duration: 1661.74s (99.96%)
Line #|      Hits|         Time| Time per hit|      %|Source code
------+----------+-------------+-------------+-------+-----------
     1|         0|            0|            0|  0.00%|# md test code
     2|         0|            0|            0|  0.00%|
     3|         2|  3.50475e-05|  1.75238e-05|  0.00%|import platform
     4|         1|  2.19345e-05|  2.19345e-05|  0.00%|from time import clock
(call)|         1|  2.67029e-05|  2.67029e-05|  0.00%|# :1009 _handle_fromlist
     5|         1|  2.55108e-05|  2.55108e-05|  0.00%|import numpy as np
(call)|         1|     0.745732|     0.745732|  0.04%|# :978 _find_and_load
     6|         1|  2.57492e-05|  2.57492e-05|  0.00%|from sys import exit
(call)|         1|   1.7643e-05|   1.7643e-05|  0.00%|# :1009 _handle_fromlist
     7|         1|  7.86781e-06|  7.86781e-06|  0.00%|import time
...
   234|         0|            0|            0|  0.00%|        #  Compute the potential energy and forces
   235|  12525000|      51.0831|  4.07849e-06|  3.07%|        for j in range(0, p_num):
   236|  12500000|      51.6473|  4.13179e-06|  3.11%|            if (i != j):
   237|         0|            0|            0|  0.00%|                #  Compute RIJ, the displacement vector
   238|  49900000|      210.704|  4.22253e-06| 12.67%|                for k in range(0, d_num):
   239|  37425000|      177.055|  4.73093e-06| 10.65%|                    rij[k] = pos[k,i] - pos[k,j]
   240|         0|            0|            0|  0.00%|                # end for
   241|         0|            0|            0|  0.00%|
   242|         0|            0|            0|  0.00%|                #  Compute D and D2, a distance and a truncated distance
   243|  12475000|      50.5158|  4.04936e-06|  3.04%|                d = 0.0
   244|  49900000|      209.465|   4.1977e-06| 12.60%|                for k in range(0, d_num):
   245|  37425000|      175.823|  4.69801e-06| 10.58%|                    d = d + rij[k] ** 2
   246|         0|            0|            0|  0.00%|                # end for
   247|  12475000|      78.9422|  6.32803e-06|  4.75%|                d = np.sqrt(d)
   248|  12475000|      64.7463|  5.19008e-06|  3.89%|                d2 = min(d, np.pi / 2.0)
   249|         0|            0|            0|  0.00%|
   250|         0|            0|            0|  0.00%|                #  Attribute half of the total potential energy to particle J
   251|  12475000|      84.7846|  6.79636e-06|  5.10%|                potential = potential + 0.5 * np.sin(d2) * np.sin(d2)
   252|         0|            0|            0|  0.00%|
   253|         0|            0|            0|  0.00%|                #  Add particle J's contribution to the force on particle I.
   254|  49900000|       227.88|  4.56674e-06| 13.71%|                for k in range(0, d_num):
   255|  37425000|      244.374|  6.52971e-06| 14.70%|                    force[k,i] = force[k,i] - rij[k] * np.sin(2.0 * d2) / d
   256|         0|            0|            0|  0.00%|                # end for
   257|         0|            0|            0|  0.00%|            # end if
   258|         0|            0|            0|  0.00%|
   259|         0|            0|            0|  0.00%|        # end for
...

Line-by-Line Function Profiling

The useful pprofile analyzes your entire code line by line. It can also do deterministic and statistical profiling. If you want to focus on a specific function within your code, line_profiler and kernprof can help. The line_profiler module performs line-by-line profiling of functions, and the kernprof script allows you to run either line_profiler or standard Python profilers such as cProfile.

To have kernprof run line_profiler, enter,

$ kernprof -l script_to_profile.py

which produces a binary file, script_to_profile.py.lprof. To “decode” the data, you can enter the command

$ python3 -m line_profiler script_to_profile.py.lprof > results.txt

and look at the results.txt file.

To get line_profiler to profile only certain functions, put an @profile decorator before the function declaration. The output is the elapsed time for the routine. The percentage of time, which is something I tend to check first, is relative to the total time for the function (be sure to remember that). The example in Listing 4 is output for some example code discussed in the next section.

Listing 4: Profiling a Function

Total time: 0.365088 s
File: ./md_002.py
Function: update at line 126
 
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   126                                           @profile
   127                                           def update(d_num, p_num, rmass, dt, pos, vel, acc, force):
   128                                           
   129                                               # Update
   130                                           
   131                                               #  Update positions
   132       200        196.0      1.0      0.1      for i in range(0, d_num):
   133     75150      29671.0      0.4      8.1          for j in range(0, p_num):
   134     75000     117663.0      1.6     32.2              pos[i,j] = pos[i,j] + vel[i,j]*dt + 0.5 * acc[i,j]*dt*dt
   135                                                   # end for
   136                                               # end for
   137                                           
   138                                               #  Update velocities
   139       200         99.0      0.5      0.0      for i in range(0, d_num):
   140     75150      29909.0      0.4      8.2          for j in range(0, p_num):
   141     75000     100783.0      1.3     27.6              vel[i,j] = vel[i,j] + 0.5*dt*( force[i,j] * rmass + acc[i,j] )
   142                                                   # end for
   143                                               # end for
   144                                           
   145                                               #  Update accelerations.
   146       200         95.0      0.5      0.0      for i in range(0, d_num):
   147     75150      29236.0      0.4      8.0          for j in range(0, p_num):
   148     75000      57404.0      0.8     15.7              acc[i,j] = force[i,j]*rmass
   149                                                   # end for
   150                                               # end for
   151                                           
   152        50         32.0      0.6      0.0      return pos, vel, acc