Profiling
*********

*or: What Did My Application Do in the Afternoon?*

This how-to explains how you can profile your Grok application or
Grok/Zope itself. This way you get an idea of the bottlenecks of your
application, or in other words: where it spends its time.

.. note:: This text is an updated version of my how-to as it can be
          seen on the old Grok site (no, I won't give a link to that
          slightly outdated stuff).

Profiling with Grok
===================

When it comes to a web framework like Grok, profiling is not as easy
as with commandline or desktop applications. You normally want to
trigger certain requests and see in which parts of your code how much
time or memory was spent.

Here is how you can do this.

Prerequisites
=============

We assume you know how to create a Grok project (your application) and
have one already at hand.

Installing a profiler
=====================

There are some profiling tools available with every Python
installation. Usually, they are started together with some desktop or
commandline application, wrapping it in a way to measure internal calls
and other things interesting for developers.

With web frameworks, however, we often only want to check certain
requests. Luckily, with `Paste`_ and `WSGI`_ we have a mechanism that
can serve in a similar way: we can define a (WSGI_) pipeline of apps
and filters that wrap each other, similar to the approach of regular
non-web applications. We also have a profiler readily available that
*is* a WSGI_ application itself and can profile things when an
HTTP request comes in: `repoze.profile`_ is even configurable over a
web frontend.

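The filter idea can be sketched in plain Python: a profiling filter is
just a WSGI application that wraps another one. The following toy
middleware (all names are invented for illustration; repoze.profile_
is far more featureful) profiles every request with :mod:`cProfile`
and prints the most expensive calls:

```python
import cProfile
import io
import pstats


class ProfilingMiddleware:
    """A toy WSGI filter that profiles each request it passes on."""

    def __init__(self, app):
        self.app = app
        # One profiler for the whole lifetime: data accumulates over
        # requests, like the values in the repoze.profile frontend.
        self.profiler = cProfile.Profile()

    def __call__(self, environ, start_response):
        # Run the wrapped application under the profiler.
        result = self.profiler.runcall(self.app, environ, start_response)
        # Report the five most expensive calls by cumulative time.
        out = io.StringIO()
        stats = pstats.Stats(self.profiler, stream=out)
        stats.sort_stats('cumulative').print_stats(5)
        print(out.getvalue())
        return result


def demo_app(environ, start_response):
    """A minimal WSGI application standing in for the Grok app."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello, profiler!']


# The 'pipeline': the profiler wraps the application.
app = ProfilingMiddleware(demo_app)
```

This is, in spirit, what the paster pipeline configured later in this
how-to expresses declaratively.
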
Install repoze.profile_
=======================

In the ``buildout.cfg`` of your project add the repoze.profile_ egg to
the list of eggs of your application. Look out for a section named
``[app]``, which could read like this or similar::

    ...
    [app]
    recipe = zc.recipe.egg
    eggs = cptapp
           z3c.evalexception>=2.0
           Paste
           PasteScript
           PasteDeploy
           repoze.profile
    interpreter = python-console
    ...

Here the added ``repoze.profile`` line is important.

Now run::

    $ bin/buildout

to fetch the egg from the net if it is not already available and to
make it known to the generated scripts.

Create a ``profiler.ini``
=========================

To make use of the new egg we must tell paster about it. This is done
by an appropriate initialization file which we create now::

    # file: profiler.ini
    [app:myapp]
    use = egg:sample

    [filter:myprofiler]
    use = egg:repoze.profile#profile

    [pipeline:main]
    pipeline = myprofiler myapp

    [server:main]
    use = egg:Paste#http
    host = 127.0.0.1
    port = 8080

    [DEFAULT]
    # set the name of the zope.conf file
    zope_conf = %(here)s/zope.conf

Here we tell paster (the HTTP server) in the ``pipeline`` to filter
each web request, passing it first to repoze.profile_, which then calls
our own ``sample`` application. This way the profiler can start its
job and measure all calls done by the subsequent application call.

It is crucial that you use the name of *your* project egg here in the
``[app:myapp]`` section. In the sample config above we assume that the
project (application) is called ``sample``, so that the egg with our
application is called the same.

.. note:: Modern Grok projects put their paster configuration files as
          *templates* into the local ``etc/`` directory and do not
          call them ``debug.ini``, ``profile.ini``, etc. but
          ``debug.ini.in``, ``profile.ini.in``, etc. (note the trailing
          ``.in``). This is not covered here.

Put this new file in the same directory as where your ``zope.conf``
lives (*not* ``zope.conf.in``). For newer projects this is
``parts/etc/``.

Start Profiling
===============

With the given setup we can start profiling by::

    $ bin/paster serve parts/etc/profiler.ini

If your ``profiler.ini`` file resides elsewhere, you must use a
different location, of course.

.. note:: You don't *have* to create a separate profiler config. You
          can, of course, update an existing ``deploy.ini`` or
          ``debug.ini``. But as profiling takes more time than calls
          without profiling, it is a good idea to enable profiling
          only on request, i.e. when starting paster with a
          respective configuration.

The server will start as usual and you can do everything you like with
it.

Browsing the Profiler
=====================

To get to the profiler, enter the following URL:

    http://localhost:8080/__profile__

This brings us to the profiler web frontend. If you have browsed your
instance before, you will get some values about the timings of the last
requests. If not, then browse a bit around your application to collect
some data. The data is collected 'in the background' during each request
and added to the values already collected.

The result might look like this:

.. image:: screenshot01.png

As you can see, there are a few options you can play with. You can
filter the results by name, sort them by time or other stats, enable
display of full paths, etc.

When you stop paster, all collected data is removed. On the next start
you can collect new data.

Profiling a certain view
========================

Say we want to profile the performance of the ``index`` view created
by our application. To do this, we first have to install an
instance of our application. Let's call it simply ``app``.

So, create an instance of your application under the name ``app`` in
the grok UI admin interface.

Now we can access

    http://localhost:8080/app

and the usual index page will (hopefully) appear.

If we go back to the profiler, however, we will see the summed-up
values of *all* requests we did up to now - including all the actions
in the admin interface etc. that we are not interested in.

We therefore clear the current data by clicking on ``clear`` on the
profiler page.

Now we access the page we want to examine by going to the above URL
directly.

When we now go back to the profiler, we only see the values of the
last request. That's the data we are interested in.

Profiling mass requests
=======================

Very often a single request to a view does not give us reliable data:
too many factors can influence the request, making its values not very
representative. What we often want are *many* requests and the average
values resulting from them.

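Conceptually, such a mass request boils down to firing many GET
requests, a few at a time, and averaging the timings. A minimal sketch
in Python (a toy stand-in for the ``ab`` tool introduced below; the
function names are invented for illustration):

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_get(url):
    """Fetch *url* once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()
    return time.perf_counter() - start


def benchmark(url, n=100, concurrency=4):
    """Fire *n* GET requests, at most *concurrency* at a time,
    and return the mean request time in seconds."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = list(pool.map(timed_get, [url] * n))
    return sum(timings) / len(timings)
```

For example, ``benchmark('http://127.0.0.1:8080/app/@@index', n=1000)``
would correspond roughly to the ``ab`` invocation shown next.
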
This means for our view: we want to do several hundred requests to
the same view. But as we are lazy, we don't want to press the reload
button several hundred or even a thousand times. Luckily there are
tools available which can do that for us.

One of these tools is the benchmarking tool ``ab`` from the
Apache project. On Ubuntu systems it is installed automatically if
you have the Apache webserver installed.

With ``ab`` (Apache benchmarking) we can trigger 1,000 requests to our
index page with one command::

    $ ab -n1000 -c4 http://127.0.0.1:8080/app/@@index

This will give us 1,000 requests, of which at most four are triggered
concurrently, to the URL http://127.0.0.1:8080/app/@@index. Please
don't do this on foreign machines!

The result might look like this::

    Benchmarking 127.0.0.1 (be patient)
    Completed 100 requests
    Completed 200 requests
    Completed 300 requests
    Completed 400 requests
    Completed 500 requests
    Completed 600 requests
    Completed 700 requests
    Completed 800 requests
    Completed 900 requests
    Finished 1000 requests


    Server Software:        PasteWSGIServer/0.5
    Server Hostname:        127.0.0.1
    Server Port:            8080

    Document Path:          /app/@@index
    Document Length:        198 bytes

    Concurrency Level:      4
    Time taken for tests:   38.297797 seconds
    Complete requests:      1000
    Failed requests:        0
    Write errors:           0
    Total transferred:      448000 bytes
    HTML transferred:       198000 bytes
    Requests per second:    26.11 [#/sec] (mean)
    Time per request:       153.191 [ms] (mean)
    Time per request:       38.298 [ms] (mean, across all concurrent requests)
    Transfer rate:          11.41 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    0   0.0      0       0
    Processing:    94  152  17.3    151     232
    Waiting:       86  151  17.3    150     231
    Total:         94  152  17.3    151     232

    Percentage of the requests served within a certain time (ms)
      50%    151
      66%    153
      75%    156
      80%    158
      90%    176
      95%    189
      98%    203
      99%    215
     100%    232 (longest request)

These benchmarking results can also be interesting. But we want to know
more about the functions called during this mass request and how much
time each of them consumed. This can be seen if we now go back to the
browser and open

    http://localhost:8080/__profile__

again.

.. note:: ``ab``, of course, is a simple, crude method of stress
          testing. Although it provides many options, you might want
          to look for more sophisticated methods to generate tons of
          requests. ``multi-mechanize`` is such a tool that can kill
          your server with the same power as ``ab`` but does it
          smarter.

Turning the Data into a Graph
=============================

All this is very nice, but sometimes a picture tells more than a
thousand words. So let's turn all this data into a graph.

As repoze.profile_ cannot do this for us, we have to 'export' the
collected data first.

Exporting Profiling Data
========================

The web frontend provided by repoze.profile_ is very nice for ad-hoc
analysis. But sometimes we want to have the data 'exported' to
process it further with other tools or simply to archive the results.

Luckily we can do so by grabbing the file ``wsgi.prof``, which contains
all the data presented in the web interface. repoze.profile_ creates
this file in the top of the project directory while it is working; in
fact, the profiler stores all collected data in that ``wsgi.prof``
file for its own use as well.

Be careful: when you click ``clear`` in the web interface, the file
will vanish. Also stopping ``paster`` will make it disappear. So copy
it to some secure location where we can process the data further while
the web server is still running (and after you did all the requests
you want to analyze).

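Such a safety copy can be scripted in a few lines; a sketch (the
timestamped backup naming is just an example):

```python
import os
import shutil
import time

# Copy the profile data to a timestamped file before paster removes it
# (the naming scheme here is just an example).
if os.path.exists('wsgi.prof'):
    backup = 'wsgi.prof.%s' % time.strftime('%Y%m%d-%H%M%S')
    shutil.copy('wsgi.prof', backup)
```
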
Because repoze.profile_ makes use of the standard Python profiler from
the :mod:`profile` or :mod:`cProfile` module, the data in the
``wsgi.prof`` file conforms to the output generated by these profilers.

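Because the format is plain :mod:`pstats` data, you can also inspect
the exported file with a few lines of Python, independent of the web
frontend (the sort key and count here are just examples):

```python
import os
import pstats

PROFILE_FILE = 'wsgi.prof'  # the file written by repoze.profile

if os.path.exists(PROFILE_FILE):
    # Show the ten functions with the highest cumulative time.
    stats = pstats.Stats(PROFILE_FILE)
    stats.strip_dirs().sort_stats('cumulative').print_stats(10)
```
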
Converting the Data into dot-format
===================================

One of the more advanced tools to create graphs from profiling
information is ``dot`` from graphviz_. To make use of it, we first
have to convert the data in the ``wsgi.prof`` file into something
``dot``-compatible.

There is a tool available which can do the job for us, a Python
script named ``GProf2Dot``, which is available at:

    http://code.google.com/p/jrfonseca/wiki/Gprof2Dot

Download the script from:

    http://jrfonseca.googlecode.com/svn/trunk/gprof2dot/gprof2dot.py

We can now turn our profiling data into a ``dot`` graph by doing::

    $ python gprof2dot.py -f pstats -o wsgi.prof.dot wsgi.prof

This will turn our input file ``wsgi.prof`` of format `pstats`
(=Python stats) into a ``dot``-file named ``wsgi.prof.dot``.

Converting the dot file into Graphics
=====================================

Now we can do the last step and turn our dot file into a nice graphics
file. For this we need, of course, the ``dot`` program, which on
Ubuntu systems can be easily installed doing::

    $ sudo apt-get install graphviz

Afterwards we do the final transformation by::

    $ dot -Tpng -o mygraph.png wsgi.prof.dot

This will generate a PNG file where we can see in terms of graphs
where a request (or thousands thereof) spends most of its
time. ``dot`` can generate **huge** graphs. A tiny snippet of a sample
is shown below.

.. image:: screenshot02.png

All the used tools (``ab``, ``dot``, ``gprof2dot``) provide a huge
bunch of options you might want to explore further. This way we can
generate more or less complete graphs (leaving out functions of little
impact), colours, etc.

In the end you hopefully know more about your application and where it
spends its time.

.. _`repoze.profile`: http://pypi.python.org/pypi/repoze.profile
.. _`Paste`: http://pythonpaste.org/
.. _`WSGI`: http://www.python.org/peps/pep-0333.html
.. _`graphviz`: http://www.graphviz.org/
.. _`GProf2Dot`: http://code.google.com/p/jrfonseca/wiki/Gprof2Dot