Profiling
*********

*or: What Did My Application Do in the Afternoon?*

This how-to explains how you can profile your Grok application or
Grok/Zope itself. This way you get an idea of the bottlenecks of your
application, or in other words: where it spends its time.

.. note:: This text is an updated version of my how-to as it can be
          seen on the old Grok site (no, I won't give a link to that
          slightly outdated stuff).

Profiling with Grok
===================

When it comes to a web framework like Grok, profiling is not as easy
as with commandline or desktop applications. You normally want to
trigger certain requests and see in which parts of your code how much
time or memory was spent.

Here is how you can do this.

Prerequisites
=============

We assume you know how to create a Grok project (your application) and
have one already at hand.

Installing a profiler
=====================

There are some profiling tools available with every Python
installation. Usually, they are started together with some desktop or
commandline application, wrapping it in a way to measure internal calls
and other things interesting for developers.

With web frameworks, however, we often only want to check certain
requests. Luckily, with `Paste`_ and `WSGI`_ we have a mechanism that
can serve in a similar way: we can define a (WSGI_) pipeline of apps
and filters that wrap each other, similar to the approach of regular
non-web applications. There is also a profiler already available that
*is* a WSGI_ application itself and can profile things when an
HTTP request comes in: `repoze.profile`_ is even configurable over a
web frontend.

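To get a feeling for what such a filter does, here is a minimal,
purely illustrative sketch of a profiling WSGI middleware. This is
*not* how repoze.profile_ is implemented; it just shows the general
idea using the standard library profiler, with a made-up
``ProfilingMiddleware`` class::

  # Illustrative sketch only -- repoze.profile does this (and much more).
  import cProfile
  import io
  import pstats


  class ProfilingMiddleware(object):
      """Wrap a WSGI app and accumulate profiling data over requests."""

      def __init__(self, app):
          self.app = app
          self.profiler = cProfile.Profile()

      def __call__(self, environ, start_response):
          # Run the wrapped WSGI application under the profiler; the
          # timings of all requests accumulate in self.profiler.
          return self.profiler.runcall(self.app, environ, start_response)

      def stats(self):
          # Render the collected statistics, sorted by cumulative time.
          stream = io.StringIO()
          stats = pstats.Stats(self.profiler, stream=stream)
          stats.sort_stats('cumulative').print_stats(20)
          return stream.getvalue()

repoze.profile_ plugs this kind of filter into the WSGI pipeline we
define below and adds the web frontend on top.
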
Install repoze.profile_
=======================

In the ``buildout.cfg`` of your project add the repoze.profile_ egg to the
list of eggs of your application. Look out for a section named ``[app]``,
which could read like this or similar::

  ...
  [app]
  recipe = zc.recipe.egg
  eggs = cptapp
         z3c.evalexception>=2.0
         Paste
         PasteScript
         PasteDeploy
         repoze.profile
  interpreter = python-console
  ...

The added ``repoze.profile`` entry is the important part here.

Now run::

  $ bin/buildout

to fetch the egg from the net if it is not already available and to
make it known to the generated scripts.

Create a ``profiler.ini``
=========================

To make use of the new egg we must tell paster about it. This is done
by an appropriate initialization file we create now::

  # file: profiler.ini
  [app:myapp]
  use = egg:sample

  [filter:myprofiler]
  use = egg:repoze.profile#profile

  [pipeline:main]
  pipeline = myprofiler myapp

  [server:main]
  use = egg:Paste#http
  host = 127.0.0.1
  port = 8080

  [DEFAULT]
  # set the name of the zope.conf file
  zope_conf = %(here)s/zope.conf

Here we tell paster (the HTTP server) in the ``pipeline`` to filter
each web request, passing it first to repoze.profile_, which then calls
our own ``sample`` application. This way the profiler can start its
job and measure all calls done by the following application call.

It is crucial that you use the name of *your* project egg here in the
``[app:myapp]`` section. In the sample config above we assume that the
project (application) is called ``sample``, so that the egg with our
application is called the same.

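The ``[filter:myprofiler]`` section also accepts a couple of options
to tune the profiler. The option names below are the ones documented
for repoze.profile_; take them as an illustration and check the
documentation of the version you actually installed::

  [filter:myprofiler]
  use = egg:repoze.profile#profile
  # file in which the raw profiling data is kept (see below)
  log_filename = wsgi.prof
  # ignore the very first request, which often includes start-up costs
  discard_first_request = true
  # URL path under which the profiler web frontend is reachable
  path = /__profile__
  # throw the collected data away when the server shuts down
  flush_at_shutdown = true

Setting ``flush_at_shutdown`` to ``false``, for instance, should keep
the collected data across restarts of paster.
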
.. note:: Modern Grok projects put their paster configuration files as
          *templates* into the local ``etc/`` directory and do not
          call them ``debug.ini``, ``profile.ini``, etc. but
          ``debug.ini.in``, ``profile.ini.in``, etc. (note the trailing
          ``.in``). This is not covered here.

Put this new file in the same directory as where your ``zope.conf``
lives (*not* ``zope.conf.in``). For newer projects this is
``parts/etc/``.

Start Profiling
===============

With the given setup we can start profiling by::

  $ bin/paster serve parts/etc/profiler.ini

If your ``profiler.ini`` file resides elsewhere, you must use a
different location, of course.

.. note:: You don't *have* to create a separate profiler config. You
          can, of course, update an existing ``deploy.ini`` or
          ``debug.ini``. But as profiling takes more time than calls
          without profiling, it is a good idea to enable profiling
          only on request, i.e. when starting paster with a
          respective configuration.

The server will start as usual and you can do everything you like with
it.

Browsing the Profiler
=====================

To get to the profiler, enter the following URL:

  http://localhost:8080/__profile__

This brings us to the profiler web frontend. If you have browsed your
instance before, you will get some values about the timings of the last
requests. If not, then browse a bit over your application to collect
some data. The data is collected in the background during each request
and added to the values already collected.

The result might look like this:

.. image:: screenshot01.png

As you can see, there are a few options you can play with. You can
filter the results by name, sort them by time or other stats, enable
display of full paths, etc.

When you stop paster, all collected data is removed. On the next start
you can collect new data.

Profiling a certain view
========================

Say we want to profile the performance of the ``index`` view created
by our application. To do this, we first have to install an
instance of our application. Let's call it simply ``app``.

So, create an instance of your application under the name ``app`` in
the Grok UI admin interface.

Now we can access

  http://localhost:8080/app

and the usual index page will (hopefully) appear.

If we go back to the profiler, however, we will see the summed-up
values of *all* requests we did up to now, including all the actions
in the admin interface etc. that we are not interested in.

We therefore clear the current data by clicking on ``clear`` on the
profiler page.

Now we access the page we want to examine by going to the above
URL directly.

When we now go back to the profiler, we only see the values of the
last request. That's the data we are interested in.

Profiling mass requests
=======================

Very often a single request to a view does not give us reliable data:
too many factors can influence a single request and make its values
unrepresentative. What we often want instead are *many* requests and
the average values over all of them.

This means for our view: we want to do several hundred requests to
the same view. But as we are lazy, we don't want to press the reload
button several hundred or even thousand times. Luckily there are tools
available which can do that for us.

One of these tools is the Apache benchmarking tool ``ab`` from the
Apache project. On Ubuntu systems it is installed automatically if
you have the Apache webserver installed.

With ``ab`` (Apache benchmarking) we can trigger 1,000 requests to our
index page with one command::

  $ ab -n1000 -c4 http://127.0.0.1:8080/app/@@index

This will give us 1,000 requests, of which at most four are triggered
concurrently, to the URL http://127.0.0.1:8080/app/@@index. Please
don't do this on foreign machines!

The result might look like this::

  Benchmarking 127.0.0.1 (be patient)
  Completed 100 requests
  Completed 200 requests
  Completed 300 requests
  Completed 400 requests
  Completed 500 requests
  Completed 600 requests
  Completed 700 requests
  Completed 800 requests
  Completed 900 requests
  Finished 1000 requests


  Server Software:        PasteWSGIServer/0.5
  Server Hostname:        127.0.0.1
  Server Port:            8080

  Document Path:          /app/@@index
  Document Length:        198 bytes

  Concurrency Level:      4
  Time taken for tests:   38.297797 seconds
  Complete requests:      1000
  Failed requests:        0
  Write errors:           0
  Total transferred:      448000 bytes
  HTML transferred:       198000 bytes
  Requests per second:    26.11 [#/sec] (mean)
  Time per request:       153.191 [ms] (mean)
  Time per request:       38.298 [ms] (mean, across all concurrent requests)
  Transfer rate:          11.41 [Kbytes/sec] received

  Connection Times (ms)
                min  mean[+/-sd] median   max
  Connect:        0    0   0.0      0       0
  Processing:    94  152  17.3    151     232
  Waiting:       86  151  17.3    150     231
  Total:         94  152  17.3    151     232

  Percentage of the requests served within a certain time (ms)
    50%    151
    66%    153
    75%    156
    80%    158
    90%    176
    95%    189
    98%    203
    99%    215
   100%    232 (longest request)

These benchmarking results can be interesting in themselves. But we
want to know more about the functions called during this mass request
and how much time each of them took. This can be seen if we now go
back to the browser and open

  http://localhost:8080/__profile__

again.

.. note:: ``ab``, of course, is a rather crude method for stress
          testing. Although it provides many options, you might want
          to look for more sophisticated methods to generate tons of
          requests. ``multi-mechanize`` is such a tool that can kill
          your server with the same power as ``ab`` but does it in a
          smarter way.

Turning the Data into a Graph
=============================

All this is very nice, but sometimes a picture tells more than a
thousand words. So let's turn all this data into some graph.

As repoze.profile_ cannot do this for us, we have to 'export' the
collected data first.

Exporting Profiling Data
========================

The web frontend provided by repoze.profile_ is very nice for ad-hoc
analysis. But sometimes we want to have the data 'exported' to
process it further with other tools or simply to archive the results.

Luckily we can do so by grabbing the file ``wsgi.prof``, which contains
all the data presented in the web interface. This file is created by
repoze.profile_ during operation and lives in the top-level directory
of the project. In fact the profiler stores all collected data in that
``wsgi.prof`` file for its own usage as well.

Be careful: when you click ``clear`` in the web interface, the
file will vanish. Stopping ``paster`` will also make it disappear. So
copy it to some secure location where we can process the data further
while the web server is still running (and after you did all the
requests you want to analyze).

Because repoze.profile_ makes use of the standard Python profiler in
the :mod:`profile` or :mod:`cProfile` module, the data in the
``wsgi.prof`` file conforms to the output generated by these profilers.

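This also means we can inspect an exported copy of ``wsgi.prof``
ourselves with the standard library's :mod:`pstats` module,
independently of the web frontend. A small sketch, assuming the copy
lies in the current directory::

  # Inspect the exported profile data with the standard library.
  import pstats

  stats = pstats.Stats('wsgi.prof')
  # Drop long directory names and sort by cumulative time.
  stats.strip_dirs().sort_stats('cumulative')
  # Show the 20 most expensive functions.
  stats.print_stats(20)
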
Converting the Data into dot-format
===================================

One of the more advanced tools to create graphs from profiling
information is ``dot`` from graphviz_. To make use of it, we first
have to convert the data in the ``wsgi.prof`` file into something
``dot``-compatible.

There is a tool available which can do the job for us: a Python
script named ``GProf2Dot``, which is available at:

  http://code.google.com/p/jrfonseca/wiki/Gprof2Dot

Download the script from:

  http://jrfonseca.googlecode.com/svn/trunk/gprof2dot/gprof2dot.py

We can now turn our profiling data into a ``dot`` graph by doing::

  $ python gprof2dot.py -f pstats -o wsgi.prof.dot wsgi.prof

This will turn our input file ``wsgi.prof`` of format `pstats`
(= Python stats) into a ``dot`` file named ``wsgi.prof.dot``.

Converting the dot file into Graphics
=====================================

Now we can do the last step and turn our dot file into a nice graphics
file. For this we need, of course, the ``dot`` programme, which on Ubuntu
systems is part of the ``graphviz`` package and can be easily installed
by doing::

  $ sudo apt-get install graphviz

Afterwards we do the final transformation by::

  $ dot -Tpng -omygraph.png wsgi.prof.dot

This will generate a PNG file where we can see, in terms of graphs,
where a request (or thousands thereof) spends most of its
time. ``dot`` can generate **huge** graphs. A tiny snippet of a sample
is shown below.

.. image:: screenshot02.png

All the used tools (``ab``, ``dot``, ``gprof2dot``) provide a huge
bunch of options you might want to explore further. This way we can
generate more or less complete graphs (leaving out functions of little
impact), colours, etc.

In the end you hopefully know more about your application and where it
spent its time.

.. _`repoze.profile`: http://pypi.python.org/pypi/repoze.profile
.. _`Paste`: http://pythonpaste.org/
.. _`WSGI`: http://www.python.org/peps/pep-0333.html
.. _`graphviz`: http://www.graphviz.org/
.. _`GProf2Dot`: http://code.google.com/p/jrfonseca/wiki/Gprof2Dot