1 |
efrain |
1 |
<html>
|
|
|
2 |
<title>
|
|
|
3 |
XHProf Documentation (Draft)
|
|
|
4 |
</title>
|
|
|
5 |
<body>
|
|
|
6 |
<h3>XHProf Documentation (Draft)</h3>
|
|
|
7 |
|
|
|
8 |
<b>Contents</b>
|
|
|
9 |
<ol>
|
|
|
10 |
<li><a href="#introduction">Introduction</a></li>
|
|
|
11 |
|
|
|
12 |
<li><a href="#overview">XHProf Overview</a></li>
|
|
|
13 |
|
|
|
14 |
<li><a href="#installation">Installing the XHProf extension</a></li>
|
|
|
15 |
|
|
|
16 |
<li><a href="#using_extension">Profiling using XHProf</a></li>
|
|
|
17 |
|
|
|
18 |
<li><a href="#ui_setup">Setting up the XHProf UI</a></li>
|
|
|
19 |
|
|
|
20 |
<li><a href="#production_notes">Notes on using XHProf in production</a></li>
|
|
|
21 |
|
|
|
22 |
<li><a href="#sampling_mode">Lightweight Sampling Mode</a>
|
|
|
23 |
|
|
|
24 |
<li><a href="#misc">Additional features</a></li>
|
|
|
25 |
|
|
|
26 |
<li><a href="#dependencies">Dependencies</a></li>
|
|
|
27 |
|
|
|
28 |
<li><a href="#credits">Acknowledgements</a></li>
|
|
|
29 |
</ol>
|
|
|
30 |
|
|
|
31 |
<ol>
|
|
|
32 |
<li><a name="introduction"><h2>Introduction</h2></a>
|
|
|
33 |
|
|
|
34 |
<p>XHProf is a hierarchical profiler for PHP. It reports
|
|
|
35 |
function-level call counts and <a href="#inclusive">inclusive</a> and
|
|
|
36 |
<a href="#exclusive">exclusive</a> metrics such as wall (elapsed)
|
|
|
37 |
time, CPU time and memory usage. A function's profile can be broken
|
|
|
38 |
down by callers or callees. The raw data collection component is
|
|
|
39 |
implemented in C as a PHP Zend extension called
|
|
|
40 |
<code><b>xhprof</b></code>. XHProf has a simple HTML based user
|
|
|
41 |
interface (written in PHP). The browser based UI for viewing profiler
|
|
|
42 |
results makes it easy to view results or to share results with peers.
|
|
|
43 |
A callgraph image view is also supported.
|
|
|
44 |
|
|
|
45 |
<p>XHProf reports can often be helpful in understanding the structure
|
|
|
46 |
of the code being executed. The hierarchical nature of the reports can
|
|
|
47 |
be used to determine, for example, what chain of calls led to a
|
|
|
48 |
particular function getting called.
|
|
|
49 |
|
|
|
50 |
<p>XHProf supports ability to compare two runs (a.k.a. "diff" reports)
|
|
|
51 |
or aggregate data from multiple runs. Diff and aggregate reports, much
|
|
|
52 |
like single run reports, offer "flat" as well as "hierarchical" views
|
|
|
53 |
of the profile.
|
|
|
54 |
|
|
|
55 |
<p>XHProf is a light-weight instrumentation based profiler. During the
|
|
|
56 |
data collection phase, it keeps track of call counts and inclusive
|
|
|
57 |
metrics for arcs in the dynamic callgraph of a program. It computes
|
|
|
58 |
exclusive metrics in the reporting/post processing phase. XHProf
|
|
|
59 |
handles recursive functions by detecting cycles in the callgraph at
|
|
|
60 |
data collection time itself and avoiding the cycles by giving unique
|
|
|
61 |
depth qualified names for the recursive invocations.
|
|
|
62 |
</p>
|
|
|
63 |
|
|
|
64 |
<p>XHProf's light-weight nature and aggregation capabilities make it
|
|
|
65 |
well suited for collecting "function-level" performance statistics
|
|
|
66 |
from production environments. [See <a
|
|
|
67 |
href="#production_notes">additional notes</a> for use in production.]
|
|
|
68 |
|
|
|
69 |
<ul>
|
|
|
70 |
<hr>
|
|
|
71 |
|
|
|
72 |
<p>XHProfLive (not part of the open source kit), for example, is a
|
|
|
73 |
system-wide performance monitoring system in use at Facebook that is
|
|
|
74 |
built on top of XHProf. XHProfLive continually gathers function-level
|
|
|
75 |
profiler data from production tier by running a sample of page
|
|
|
76 |
requests under XHProf. XHProfLive then aggregates the profile data
|
|
|
77 |
corresponding to individual requests by various dimensions such as
|
|
|
78 |
time, page type, and can help answer a variety of questions such as:
|
|
|
79 |
What is the function-level profile for a specific page? How expensive
|
|
|
80 |
is function "foo" across all pages, or on a specific page? What
|
|
|
81 |
functions regressed most in the last hour/day/week? What is the
|
|
|
82 |
historical trend for execution time of a page/function? and so on.
|
|
|
83 |
|
|
|
84 |
<hr>
|
|
|
85 |
</ul>
|
|
|
86 |
|
|
|
87 |
|
|
|
88 |
<p>Originally developed at Facebook, XHProf was open sourced in Mar, 2009.</p>
|
|
|
89 |
|
|
|
90 |
|
|
|
91 |
</ul>
|
|
|
92 |
|
|
|
93 |
|
|
|
94 |
<li><a name="overview"><h2>XHProf Overview</h2></a>
|
|
|
95 |
|
|
|
96 |
<p>XHProf provides:
|
|
|
97 |
|
|
|
98 |
<ul>
|
|
|
99 |
<li><b>Flat profile</b> (<a href="sample-flat-view.jpg" >screenshot</a>)
|
|
|
100 |
|
|
|
101 |
<p>Function-level summary information such as number of calls,
|
|
|
102 |
inclusive/exclusive wall time, memory usage, and CPU time.
|
|
|
103 |
|
|
|
104 |
<p><li><b>Hierarchical profile (Parent/Child View)</b>
|
|
|
105 |
(<a href="sample-parent-child-view.jpg" >screenshot</a>)
|
|
|
106 |
|
|
|
107 |
<p>For each function, it provides a breakdown of calls and times per
|
|
|
108 |
parent (caller) & child (callee), such as:
|
|
|
109 |
|
|
|
110 |
<ul>
|
|
|
111 |
|
|
|
112 |
<li> what functions call a particular function and how many times?
|
|
|
113 |
|
|
|
114 |
<li> what functions does a particular function call?
|
|
|
115 |
|
|
|
116 |
<li> The total time spent under a function when called from a particular parent.
|
|
|
117 |
|
|
|
118 |
</ul>
|
|
|
119 |
|
|
|
120 |
<p><li><b>Diff Reports</b>
|
|
|
121 |
|
|
|
122 |
<p>You may want to compare data from two XHProf runs for various
|
|
|
123 |
reasons-- to figure out what's causing a regression between one
|
|
|
124 |
version of the code base to another, to evaluate the performance
|
|
|
125 |
improvement of a code change you are making, and so on.
|
|
|
126 |
|
|
|
127 |
<p>A diff report takes two runs as input and provides both flat
|
|
|
128 |
function-level diff information, and hierarchical information
|
|
|
129 |
(breakdown of diff by parent/children functions) for each function.
|
|
|
130 |
|
|
|
131 |
<p>The "flat" view (<a href="sample-diff-report-flat-view.jpg"
|
|
|
132 |
>sample screenshot</a>) in the diff report points out the top
|
|
|
133 |
regressions & improvements.
|
|
|
134 |
|
|
|
135 |
<p>Clicking on functions in the "flat" view of the diff report, leads
|
|
|
136 |
to the "hierarchical" (or parent/child) diff view of a function (<a href="sample-diff-report-parent-child-view.jpg"
|
|
|
137 |
>sample screenshot</a>). We can get a
|
|
|
138 |
breakdown of the diff by parent/children functions.
|
|
|
139 |
|
|
|
140 |
|
|
|
141 |
<p><li><b>Callgraph View</b> (<a href="sample-callgraph-image.jpg"
|
|
|
142 |
>sample screenshot</a>)
|
|
|
143 |
|
|
|
144 |
<p>The profile data can also be viewed as a callgraph. The callgraph
|
|
|
145 |
view highlights the critical path of the program.
|
|
|
146 |
|
|
|
147 |
|
|
|
148 |
<p><li><b>Memory Profile</b>
|
|
|
149 |
|
|
|
150 |
<p>XHProf's memory profile mode helps track functions that
|
|
|
151 |
allocate lots of memory.
|
|
|
152 |
|
|
|
153 |
<p>It is worth clarifying that that XHProf doesn't strictly track each
|
|
|
154 |
allocation/free operation. Rather it uses a more simplistic
|
|
|
155 |
scheme. It tracks the increase/decrease in the amount of memory
|
|
|
156 |
allocated to PHP between each function's entry and exit. It also
|
|
|
157 |
tracks increase/decrease in the amount of <b>peak</b> memory allocated to
|
|
|
158 |
PHP for each function.
|
|
|
159 |
|
|
|
160 |
<li>XHProf tracks <code>include, include_once, require and
|
|
|
161 |
require_once</code> operations as if they were functions. The name of
|
|
|
162 |
the file being included is used to generate the name for these <a
|
|
|
163 |
href="#include_funcs">"fake" functions</a>.
|
|
|
164 |
|
|
|
165 |
|
|
|
166 |
</ul>
|
|
|
167 |
|
|
|
168 |
<a name="Terminology"></a><h2>Terminology</h2>
|
|
|
169 |
<ol>
|
|
|
170 |
|
|
|
171 |
<a name="inclusive"></a><li><b>Inclusive Time (or Subtree Time)</b>:
|
|
|
172 |
Includes time spent in the function as well as in descendant functions
|
|
|
173 |
called from a given function.
|
|
|
174 |
|
|
|
175 |
<a name="exclusive"></a><li><b>Exclusive Time/Self Time</b>: Measures
|
|
|
176 |
time spent in the function itself. Does not include time in descendant
|
|
|
177 |
functions.
|
|
|
178 |
|
|
|
179 |
<li><b>Wall Time</b>: a.k.a. Elapsed time or wall clock time.
|
|
|
180 |
|
|
|
181 |
<li><b>CPU Time</b>: CPU time in user space + CPU time in kernel space
|
|
|
182 |
|
|
|
183 |
</ol>
|
|
|
184 |
<a name="Naming_convention_for_special_functions"></a><h2>Naming convention for special functions</h2>
|
|
|
185 |
|
|
|
186 |
<ol>
|
|
|
187 |
<p><li><code><b>main()</b></code>: a fictitious function that is at the root of the call graph.
|
|
|
188 |
|
|
|
189 |
<a name="include_funcs"></a>
|
|
|
190 |
<p><li><code><b>load::<filename></b></code>
|
|
|
191 |
and <code><b>run_init::<filename></b></code>:
|
|
|
192 |
|
|
|
193 |
<p>XHProf tracks PHP <code>include/require</code> operations as
|
|
|
194 |
function calls.
|
|
|
195 |
|
|
|
196 |
<p>For example, an <b>include "lib/common.php";</b> operation will
|
|
|
197 |
result in two XHProf function entries:
|
|
|
198 |
|
|
|
199 |
<ul>
|
|
|
200 |
|
|
|
201 |
<li> <code><b>load::lib/common.php</b></code> - This represents the work done by the
|
|
|
202 |
interpreter to compile/load the file. [Note: If you are using a PHP
|
|
|
203 |
opcode cache like APC, then the compile only happens on a cache miss
|
|
|
204 |
in APC.]
|
|
|
205 |
|
|
|
206 |
<li> <code><b>run_init::lib/common.php</b></code> - This represents
|
|
|
207 |
initialization code executed at the file scope as a result of the
|
|
|
208 |
include operation.
|
|
|
209 |
|
|
|
210 |
</ul>
|
|
|
211 |
|
|
|
212 |
<p><li><code><b>foo@<n></b></code>: Implies that this is a
|
|
|
213 |
recursive invocation of <code>foo()</code>, where <code><n></code> represents
|
|
|
214 |
the recursion depth. The recursion may be direct (such as due to
|
|
|
215 |
<code>foo()</code> --> <code>foo()</code>), or indirect (such as
|
|
|
216 |
due to </code>foo()</code> --> <code>goo()</code> --> foo()).
|
|
|
217 |
|
|
|
218 |
</ol>
|
|
|
219 |
|
|
|
220 |
|
|
|
221 |
<a name="Limitations"></a><h2>Limitations</h2>
|
|
|
222 |
|
|
|
223 |
<p>True hierarchical profilers keep track of a full call stack at
|
|
|
224 |
every data gathering point, and are later able to answer questions
|
|
|
225 |
like: what was the cost of the 3rd invokation of foo()? or what was
|
|
|
226 |
the cost of bar() when the call stack looked like
|
|
|
227 |
a()->b()->bar()?
|
|
|
228 |
|
|
|
229 |
</p>
|
|
|
230 |
|
|
|
231 |
<p>XHProf keeps track of only 1-level of calling context and is
|
|
|
232 |
therefore only able to answer questions about a function looking
|
|
|
233 |
either 1-level up or 1-level down. It turns out that in practice this
|
|
|
234 |
is sufficient for most use cases.
|
|
|
235 |
</p>
|
|
|
236 |
|
|
|
237 |
<p>To make this more concrete, take for instance the following
|
|
|
238 |
example.
|
|
|
239 |
</p>
|
|
|
240 |
|
|
|
241 |
<pre>
|
|
|
242 |
Say you have:
|
|
|
243 |
1 call from a() --> c()
|
|
|
244 |
1 call from b() --> c()
|
|
|
245 |
50 calls from c() --> d()
|
|
|
246 |
</pre>
|
|
|
247 |
|
|
|
248 |
<p>While XHProf can tell you that d() was called from c() 50 times, it
|
|
|
249 |
cannot tell you how many of those calls were triggered due to a()
|
|
|
250 |
vs. b(). [We could speculate that perhaps 25 were due to a() and 25
|
|
|
251 |
due to b(), but that's not necessarily true.]
|
|
|
252 |
</p>
|
|
|
253 |
|
|
|
254 |
<p>In practice however, this isn't a very big limitation.
|
|
|
255 |
</p>
|
|
|
256 |
|
|
|
257 |
<li><a name="installation"><h2>Installing the XHProf Extension</h2></a>
|
|
|
258 |
|
|
|
259 |
<p> The extension lives in the "extension/" sub-directory.
|
|
|
260 |
|
|
|
261 |
<ul><hr>
|
|
|
262 |
|
|
|
263 |
<p><b>Note:</b> A windows port hasn't been implemented yet. We have
|
|
|
264 |
tested <code>xhprof</code> on <b>Linux/FreeBSD</b> so far.
|
|
|
265 |
|
|
|
266 |
<p>Version 0.9.2 and above of XHProf is also expected to work on <b>Mac
|
|
|
267 |
OS</b>. [We have tested on Mac OS 10.5.]
|
|
|
268 |
|
|
|
269 |
<p><b>Note:</b> XHProf uses the RDTSC instruction (time stamp counter)
|
|
|
270 |
to implement a really low overhead timer for elapsed time. So at the
|
|
|
271 |
moment <code>xhprof</code> only works on <b>x86</b> architecture.
|
|
|
272 |
Also, since RDTSC values may not be synchronized across CPUs,
|
|
|
273 |
<code>xhprof</code> binds the program to a single CPU during the
|
|
|
274 |
profiling period.
|
|
|
275 |
|
|
|
276 |
<p>XHProf's RDTSC based timer functionality doesn't work correctly if
|
|
|
277 |
<b>SpeedStep</b> technology is turned on. This technology is available on
|
|
|
278 |
some Intel processors. [Note: Mac desktops and laptops typically have
|
|
|
279 |
SpeedStep turned on by default. To use XHProf, you'll need to disable
|
|
|
280 |
SpeedStep.]
|
|
|
281 |
|
|
|
282 |
<hr></ul>
|
|
|
283 |
|
|
|
284 |
<p> The steps
|
|
|
285 |
below should work for Linux/Unix environments.
|
|
|
286 |
|
|
|
287 |
|
|
|
288 |
<pre>
|
|
|
289 |
% cd <xhprof_source_directory>/extension/
|
|
|
290 |
% phpize
|
|
|
291 |
% ./configure --with-php-config=<path to php-config>
|
|
|
292 |
% make
|
|
|
293 |
% make install
|
|
|
294 |
% make test
|
|
|
295 |
</pre>
|
|
|
296 |
|
|
|
297 |
|
|
|
298 |
<p><a name="ini_file"></a><b>php.ini file</b>: You can update your
|
|
|
299 |
php.ini file to automatically load your extension. Add the following
|
|
|
300 |
to your php.ini file.
|
|
|
301 |
|
|
|
302 |
<pre>
|
|
|
303 |
[xhprof]
|
|
|
304 |
extension=xhprof.so
|
|
|
305 |
;
|
|
|
306 |
; directory used by default implementation of the iXHProfRuns
|
|
|
307 |
; interface (namely, the XHProfRuns_Default class) for storing
|
|
|
308 |
; XHProf runs.
|
|
|
309 |
;
|
|
|
310 |
xhprof.output_dir=<directory_for_storing_xhprof_runs>
|
|
|
311 |
</pre>
|
|
|
312 |
|
|
|
313 |
|
|
|
314 |
<li><a name="using_extension"><h2>Profiling using XHProf</h2></a>
|
|
|
315 |
|
|
|
316 |
<p>Test generating raw profiler data using a sample test program like:
|
|
|
317 |
|
|
|
318 |
<ul>
|
|
|
319 |
<p><b>foo.php</b>
|
|
|
320 |
<pre>
|
|
|
321 |
<?php
|
|
|
322 |
|
|
|
323 |
function bar($x) {
|
|
|
324 |
if ($x > 0) {
|
|
|
325 |
bar($x - 1);
|
|
|
326 |
}
|
|
|
327 |
}
|
|
|
328 |
|
|
|
329 |
function foo() {
|
|
|
330 |
for ($idx = 0; $idx < 2; $idx++) {
|
|
|
331 |
bar($idx);
|
|
|
332 |
$x = strlen("abc");
|
|
|
333 |
}
|
|
|
334 |
}
|
|
|
335 |
|
|
|
336 |
// start profiling
|
|
|
337 |
<b>xhprof_enable();</b>
|
|
|
338 |
|
|
|
339 |
// run program
|
|
|
340 |
foo();
|
|
|
341 |
|
|
|
342 |
// stop profiler
|
|
|
343 |
<b>$xhprof_data = xhprof_disable();</b>
|
|
|
344 |
|
|
|
345 |
// display raw xhprof data for the profiler run
|
|
|
346 |
print_r($xhprof_data);
|
|
|
347 |
</pre>
|
|
|
348 |
</ul>
|
|
|
349 |
|
|
|
350 |
<p><b>Run the above test program:</b>
|
|
|
351 |
|
|
|
352 |
<pre>
|
|
|
353 |
% php -dextension=xhprof.so foo.php
|
|
|
354 |
</pre>
|
|
|
355 |
|
|
|
356 |
<p><b>You should get an output like:</b>
|
|
|
357 |
|
|
|
358 |
<pre>
|
|
|
359 |
Array
|
|
|
360 |
(
|
|
|
361 |
[foo==>bar] => Array
|
|
|
362 |
(
|
|
|
363 |
[ct] => 2 # 2 calls to bar() from foo()
|
|
|
364 |
[wt] => 27 # inclusive time in bar() when called from foo()
|
|
|
365 |
)
|
|
|
366 |
|
|
|
367 |
[foo==>strlen] => Array
|
|
|
368 |
(
|
|
|
369 |
[ct] => 2
|
|
|
370 |
[wt] => 2
|
|
|
371 |
)
|
|
|
372 |
|
|
|
373 |
[bar==>bar@1] => Array # a recursive call to bar()
|
|
|
374 |
(
|
|
|
375 |
[ct] => 1
|
|
|
376 |
[wt] => 2
|
|
|
377 |
)
|
|
|
378 |
|
|
|
379 |
[main()==>foo] => Array
|
|
|
380 |
(
|
|
|
381 |
[ct] => 1
|
|
|
382 |
[wt] => 74
|
|
|
383 |
)
|
|
|
384 |
|
|
|
385 |
[main()==>xhprof_disable] => Array
|
|
|
386 |
(
|
|
|
387 |
[ct] => 1
|
|
|
388 |
[wt] => 0
|
|
|
389 |
)
|
|
|
390 |
|
|
|
391 |
[main()] => Array # fake symbol representing root
|
|
|
392 |
(
|
|
|
393 |
[ct] => 1
|
|
|
394 |
[wt] => 83
|
|
|
395 |
)
|
|
|
396 |
|
|
|
397 |
)
|
|
|
398 |
</pre>
|
|
|
399 |
|
|
|
400 |
<p><b>Note:</b> The raw data only contains "inclusive" metrics. For
|
|
|
401 |
example, the wall time metric in the raw data represents inclusive
|
|
|
402 |
time in microsecs. Exclusive times for any function are computed
|
|
|
403 |
during the analysis/reporting phase.
|
|
|
404 |
|
|
|
405 |
<p><b>Note:</b> By default only call counts & elapsed time is profiled.
|
|
|
406 |
You can optionally also profile CPU time and/or memory usage. Replace,
|
|
|
407 |
|
|
|
408 |
<ul><pre>
|
|
|
409 |
<b>xhprof_enable();</b>
|
|
|
410 |
</pre></ul>
|
|
|
411 |
in the above program with, for example:
|
|
|
412 |
<ul><pre>
|
|
|
413 |
<b>xhprof_enable(XHPROF_FLAGS_CPU + XHPROF_FLAGS_MEMORY)</b>;
|
|
|
414 |
</pre></ul>
|
|
|
415 |
|
|
|
416 |
<p><b>You should now get an output like:</b>
|
|
|
417 |
|
|
|
418 |
<pre>
|
|
|
419 |
Array
|
|
|
420 |
(
|
|
|
421 |
[foo==>bar] => Array
|
|
|
422 |
(
|
|
|
423 |
[ct] => 2 # number of calls to bar() from foo()
|
|
|
424 |
[wt] => 37 # time in bar() when called from foo()
|
|
|
425 |
[cpu] => 0 # cpu time in bar() when called from foo()
|
|
|
426 |
[mu] => 2208 # change in PHP memory usage in bar() when called from foo()
|
|
|
427 |
[pmu] => 0 # change in PHP peak memory usage in bar() when called from foo()
|
|
|
428 |
)
|
|
|
429 |
|
|
|
430 |
[foo==>strlen] => Array
|
|
|
431 |
(
|
|
|
432 |
[ct] => 2
|
|
|
433 |
[wt] => 3
|
|
|
434 |
[cpu] => 0
|
|
|
435 |
[mu] => 624
|
|
|
436 |
[pmu] => 0
|
|
|
437 |
)
|
|
|
438 |
|
|
|
439 |
[bar==>bar@1] => Array
|
|
|
440 |
(
|
|
|
441 |
[ct] => 1
|
|
|
442 |
[wt] => 2
|
|
|
443 |
[cpu] => 0
|
|
|
444 |
[mu] => 856
|
|
|
445 |
[pmu] => 0
|
|
|
446 |
)
|
|
|
447 |
|
|
|
448 |
[main()==>foo] => Array
|
|
|
449 |
(
|
|
|
450 |
[ct] => 1
|
|
|
451 |
[wt] => 104
|
|
|
452 |
[cpu] => 0
|
|
|
453 |
[mu] => 4168
|
|
|
454 |
[pmu] => 0
|
|
|
455 |
)
|
|
|
456 |
|
|
|
457 |
[main()==>xhprof_disable] => Array
|
|
|
458 |
(
|
|
|
459 |
[ct] => 1
|
|
|
460 |
[wt] => 1
|
|
|
461 |
[cpu] => 0
|
|
|
462 |
[mu] => 344
|
|
|
463 |
[pmu] => 0
|
|
|
464 |
)
|
|
|
465 |
|
|
|
466 |
[main()] => Array
|
|
|
467 |
(
|
|
|
468 |
[ct] => 1
|
|
|
469 |
[wt] => 139
|
|
|
470 |
[cpu] => 0
|
|
|
471 |
[mu] => 5936
|
|
|
472 |
[pmu] => 0
|
|
|
473 |
)
|
|
|
474 |
|
|
|
475 |
)
|
|
|
476 |
</pre>
|
|
|
477 |
|
|
|
478 |
<p><b>Skipping builtin functions during profiling</b>
|
|
|
479 |
|
|
|
480 |
<p>By default PHP builtin functions (such as <code>strlen</code>) are
|
|
|
481 |
profiled. If you do not want to profile builtin functions (to either
|
|
|
482 |
reduce the overhead of profiling further or size of generated raw
|
|
|
483 |
data), you can use the <code><b>XHPROF_FLAGS_NO_BUILTINS</b></code>
|
|
|
484 |
flag as in for example:
|
|
|
485 |
|
|
|
486 |
<ul><pre>
|
|
|
487 |
// do not profile builtin functions
|
|
|
488 |
<b>xhprof_enable(XHPROF_FLAGS_NO_BUILTINS)</b>;
|
|
|
489 |
</pre></ul>
|
|
|
490 |
|
|
|
491 |
|
|
|
492 |
<p><b>Ignoring specific functions during profiling (0.9.2 or higher)</b>
|
|
|
493 |
|
|
|
494 |
<p>Starting with release 0.9.2 of xhprof, you can tell XHProf to
|
|
|
495 |
ignore a specified list of functions during profiling. This allows you
|
|
|
496 |
to ignore, for example, functions used for indirect function calls
|
|
|
497 |
such as <code>call_user_func</code> and
|
|
|
498 |
<code>call_user_func_array</code>. These intermediate functions
|
|
|
499 |
unnecessarily complicate the call hierarchy and make the XHProf
|
|
|
500 |
reports harder to interpret since they muddle the parent-child
|
|
|
501 |
relationship for functions called indirectly.
|
|
|
502 |
|
|
|
503 |
<p> To specify the list of functions to be ignored during profiling
|
|
|
504 |
use the 2nd (optional) argument to <code>xhprof_enable</code>.
|
|
|
505 |
For example,
|
|
|
506 |
|
|
|
507 |
<pre>
|
|
|
508 |
<ul><pre>
|
|
|
509 |
<b>
|
|
|
510 |
// elapsed time profiling; ignore call_user_func* during profiling
|
|
|
511 |
xhprof_enable(0,
|
|
|
512 |
array('ignored_functions' => array('call_user_func',
|
|
|
513 |
'call_user_func_array')));
|
|
|
514 |
</b>
|
|
|
515 |
or,
|
|
|
516 |
<b>
|
|
|
517 |
// elapsed time + memory profiling; ignore call_user_func* during profiling
|
|
|
518 |
xhprof_enable(XHPROF_FLAGS_MEMORY,
|
|
|
519 |
array('ignored_functions' => array('call_user_func',
|
|
|
520 |
'call_user_func_array')));
|
|
|
521 |
</b>
|
|
|
522 |
</pre></ul>
|
|
|
523 |
|
|
|
524 |
|
|
|
525 |
</li>
|
|
|
526 |
|
|
|
527 |
<li><a name="ui_setup"><h2>Setting up XHProf UI</h2></a>
|
|
|
528 |
|
|
|
529 |
|
|
|
530 |
<ol>
|
|
|
531 |
|
|
|
532 |
<li><b>PHP source structure</b>
|
|
|
533 |
<p>The XHProf UI is implemented in PHP. The code resides in two
|
|
|
534 |
subdirectories, <code>xhprof_html/</code> and <code>xhprof_lib/</code>.
|
|
|
535 |
|
|
|
536 |
<p>The <code>xhprof_html</code> directory contains the 3 top-level PHP pages.
|
|
|
537 |
|
|
|
538 |
<ul>
|
|
|
539 |
<li><code>index.php</code>: For viewing a single run or diff report.
|
|
|
540 |
<li><code>callgraph.php</code>: For viewing a callgraph of a XHProf run as an image.
|
|
|
541 |
<li><code>typeahead.php</code>: Used implicitly for the function typeahead form
|
|
|
542 |
on a XHProf report.
|
|
|
543 |
</ul>
|
|
|
544 |
|
|
|
545 |
<p>The <code>xhprof_lib</code> directory contains supporting code for
|
|
|
546 |
display as well as analysis (computing flat profile info, computing
|
|
|
547 |
diffs, aggregating data from multiple runs, etc.).
|
|
|
548 |
|
|
|
549 |
<li><p><b>Web server config: </b> You'll need to make sure that the
|
|
|
550 |
<code>xhprof_html/</code> directory is accessible from your web server, and that
|
|
|
551 |
your web server is setup to serve PHP scripts.
|
|
|
552 |
|
|
|
553 |
<li><p><b>Managing XHProf Runs</b>
|
|
|
554 |
|
|
|
555 |
<p>Clients have flexibility in how they save the XHProf raw data
|
|
|
556 |
obtained from an XHProf run. The XHProf UI layer exposes an interface
|
|
|
557 |
iXHProfRuns (see xhprof_lib/utils/xhprof_runs.php) that clients can
|
|
|
558 |
implement. This allows the clients to tell the UI layer how to fetch
|
|
|
559 |
the data corresponding to a XHProf run.
|
|
|
560 |
|
|
|
561 |
<p>The XHProf UI libaries come with a default file based
|
|
|
562 |
implementation of the iXHProfRuns interface, namely
|
|
|
563 |
"XHProfRuns_Default" (also in xhprof_lib/utils/xhprof_runs.php).
|
|
|
564 |
This default implementation stores runs in the directory specified by
|
|
|
565 |
<a href="#ini_file"><b>xhprof.output_dir</b></a> INI parameter.
|
|
|
566 |
|
|
|
567 |
<p>A XHProf run must be uniquely identified by a namespace and a run
|
|
|
568 |
id.
|
|
|
569 |
|
|
|
570 |
|
|
|
571 |
|
|
|
572 |
<p><b>a) Saving XHProf data persistently</b>:
|
|
|
573 |
|
|
|
574 |
<p>Assuming you are using the default implementation
|
|
|
575 |
<code><b>XHProfRuns_Default</b></code> of the
|
|
|
576 |
<code><b>iXHProfRuns</b></code> interface, a typical XHProf run
|
|
|
577 |
followed by the save step might look something like:
|
|
|
578 |
|
|
|
579 |
|
|
|
580 |
<pre>
|
|
|
581 |
// start profiling
|
|
|
582 |
xhprof_enable();
|
|
|
583 |
|
|
|
584 |
// run program
|
|
|
585 |
....
|
|
|
586 |
|
|
|
587 |
// stop profiler
|
|
|
588 |
$xhprof_data = xhprof_disable();
|
|
|
589 |
|
|
|
590 |
//
|
|
|
591 |
// Saving the XHProf run
|
|
|
592 |
// using the default implementation of iXHProfRuns.
|
|
|
593 |
//
|
|
|
594 |
include_once $XHPROF_ROOT . "/xhprof_lib/utils/xhprof_lib.php";
|
|
|
595 |
include_once $XHPROF_ROOT . "/xhprof_lib/utils/xhprof_runs.php";
|
|
|
596 |
|
|
|
597 |
$xhprof_runs = new <b>XHProfRuns_Default()</b>;
|
|
|
598 |
|
|
|
599 |
// Save the run under a namespace "xhprof_foo".
|
|
|
600 |
//
|
|
|
601 |
// **NOTE**:
|
|
|
602 |
// By default save_run() will automatically generate a unique
|
|
|
603 |
// run id for you. [You can override that behavior by passing
|
|
|
604 |
// a run id (optional arg) to the save_run() method instead.]
|
|
|
605 |
//
|
|
|
606 |
<b>$run_id = $xhprof_runs->save_run($xhprof_data, "xhprof_foo");</b>
|
|
|
607 |
|
|
|
608 |
echo "---------------\n".
|
|
|
609 |
"Assuming you have set up the http based UI for \n".
|
|
|
610 |
"XHProf at some address, you can view run at \n".
|
|
|
611 |
"http://<xhprof-ui-address>/index.php?run=$run_id&source=xhprof_foo\n".
|
|
|
612 |
"---------------\n";
|
|
|
613 |
|
|
|
614 |
</pre>
|
|
|
615 |
|
|
|
616 |
<p>The above should save the run as a file in the directory specified
|
|
|
617 |
by the <code><b>xhprof.output_dir</b></code> INI parameter. The file's
|
|
|
618 |
name might be something like
|
|
|
619 |
<b><code>49bafaa3a3f66.xhprof_foo</code></b>; the two parts being the
|
|
|
620 |
run id ("49bafaa3a3f66") and the namespace ("xhprof_foo"). [If you
|
|
|
621 |
want to create/assign run ids yourself (such as a database sequence
|
|
|
622 |
number, or a timestamp), you can explicitly pass in the run id to the
|
|
|
623 |
<code>save_run</code> method.
|
|
|
624 |
|
|
|
625 |
|
|
|
626 |
<p><b>b) Using your own implementation of iXHProfRuns</b>
|
|
|
627 |
|
|
|
628 |
<p> If you decide you want your XHProf runs to be stored differently
|
|
|
629 |
(either in a compressed format, in an alternate place such as DB,
|
|
|
630 |
etc.) database, you'll need to implement a class that implements the
|
|
|
631 |
iXHProfRuns() interface.
|
|
|
632 |
|
|
|
633 |
<p> You'll also need to modify the 3 main PHP entry pages (index.php,
|
|
|
634 |
callgraph.php, typeahead.php) in the "xhprof_html/" directory to use
|
|
|
635 |
the new class instead of the default class <code>XHProfRuns_Default</code>.
|
|
|
636 |
Change this line in the 3 files.
|
|
|
637 |
|
|
|
638 |
<pre>
|
|
|
639 |
$xhprof_runs_impl = new XHProfRuns_Default();
|
|
|
640 |
</pre>
|
|
|
641 |
|
|
|
642 |
<p>You'll also need to "include" the file that implements your class in
|
|
|
643 |
the above files.
|
|
|
644 |
|
|
|
645 |
|
|
|
646 |
<li><p><b>Accessing runs from UI</b>
|
|
|
647 |
|
|
|
648 |
<p><b>a) Viewing a Single Run Report</b>
|
|
|
649 |
|
|
|
650 |
<p>To view the report for run id say <run_id> and namespace
|
|
|
651 |
<namespace> use a URL of the form:
|
|
|
652 |
|
|
|
653 |
<p><code>
|
|
|
654 |
http://<xhprof-ui-address>/index.php?run=<run_id>&source=<namespace>
|
|
|
655 |
</code>
|
|
|
656 |
|
|
|
657 |
<p>For example,
|
|
|
658 |
<p><code>
|
|
|
659 |
http://<xhprof-ui-address>/index.php?run=49bafaa3a3f66&source=xhprof_foo
|
|
|
660 |
</code>
|
|
|
661 |
|
|
|
662 |
|
|
|
663 |
<p><b>b) Viewing a Diff Report</b>
|
|
|
664 |
|
|
|
665 |
<p>To view the report for run ids say <run_id1> and
|
|
|
666 |
<run_id2> in namespace <namespace> use a URL of the form:
|
|
|
667 |
|
|
|
668 |
<p><code>
|
|
|
669 |
http://<xhprof-ui-address>/index.php?<b>run1=<run_id1>&run2=<run_id2></b>&source=<namespace>
|
|
|
670 |
</code>
|
|
|
671 |
|
|
|
672 |
<p><b>c) Aggregate Report</b>
|
|
|
673 |
|
|
|
674 |
<p>You can also specify a set of run ids for which you want an aggregated view/report.
|
|
|
675 |
|
|
|
676 |
<p>Say you have three XHProf runs with ids 1, 2 & 3 in namespace
|
|
|
677 |
"benchmark". To view an aggregate report of these runs:
|
|
|
678 |
|
|
|
679 |
<ul><p><code>
|
|
|
680 |
http://<xhprof-ui-address>/index.php?<b>run=1,2,3</b>&source=benchmark
|
|
|
681 |
</code></p></ul>
|
|
|
682 |
|
|
|
683 |
<p><b>Weighted aggregations</b>: Further suppose that the above three runs
|
|
|
684 |
correspond to three types of programs p1.php, p2.php and p3.php that
|
|
|
685 |
typically occur in a mix of 20%, 30%, 50% respectively. To view an
|
|
|
686 |
aggregate report that corresponds to a weighted average of these runs
|
|
|
687 |
using:
|
|
|
688 |
|
|
|
689 |
<ul><p><code>
|
|
|
690 |
http://<xhprof-ui-address>/index.php?<b>run=1,2,3&wts=20,30,50</b>&source=benchmark
|
|
|
691 |
</code></p></ul>
|
|
|
692 |
|
|
|
693 |
|
|
|
694 |
</ol>
|
|
|
695 |
|
|
|
696 |
<li><a name="production_notes"><h2>Notes on using XHProf in production</h2></a>
|
|
|
697 |
|
|
|
698 |
<p>Some observations/guidelines. Your mileage may vary:
|
|
|
699 |
|
|
|
700 |
<ul>
|
|
|
701 |
|
|
|
702 |
<li>CPU timer (getrusage) on Linux has high overheads. It is also
|
|
|
703 |
coarse grained (millisec accuracy rather than microsec level) to be
|
|
|
704 |
useful at function level. Therefore, the skew in reported numbers
|
|
|
705 |
when using XHPROF_FLAGS_CPU mode tends to be higher.
|
|
|
706 |
|
|
|
707 |
<p>We recommend using elapsed time + memory profiling mode in
|
|
|
708 |
production. [Note: The additional overhead of memory profiling
|
|
|
709 |
mode is really low.]
|
|
|
710 |
|
|
|
711 |
<p><ul><pre><b>
|
|
|
712 |
// elapsed time profiling (default) + memory profiling
|
|
|
713 |
xhprof_enable(XHPROF_FLAGS_MEMORY);
|
|
|
714 |
</b></pre>
|
|
|
715 |
</ul></p>
|
|
|
716 |
|
|
|
717 |
|
|
|
718 |
<li>Profiling a random sample of pages/requests works well in capturing
|
|
|
719 |
data that is representative of your production workload.
|
|
|
720 |
|
|
|
721 |
<p>To profile say 1/10000 of your requests, instrument the beginning of
|
|
|
722 |
your request processing with something along the lines of:
|
|
|
723 |
|
|
|
724 |
<p><ul><pre><code>
|
|
|
725 |
if (mt_rand(1, 10000) == 1) {
|
|
|
726 |
xhprof_enable(XHPROF_FLAGS_MEMORY);
|
|
|
727 |
$xhprof_on = true;
|
|
|
728 |
}
|
|
|
729 |
</code></pre></ul></p>
|
|
|
730 |
|
|
|
731 |
<p>At the end of the request (or in a request shutdown function), you might
|
|
|
732 |
then do something like:
|
|
|
733 |
|
|
|
734 |
<p><ul><pre><code>
|
|
|
735 |
if ($xhprof_on) {
|
|
|
736 |
// stop profiler
|
|
|
737 |
$xhprof_data = xhprof_disable();
|
|
|
738 |
|
|
|
739 |
// save $xhprof_data somewhere (say a central DB)
|
|
|
740 |
...
|
|
|
741 |
}
|
|
|
742 |
</code></pre></ul></p>
|
|
|
743 |
|
|
|
744 |
<p> You can then rollup/aggregate these individual profiles by time
|
|
|
745 |
(e.g., 5 minutely/hourly/daily basis), page/request type,or other
|
|
|
746 |
dimensions using <a href="#xhprof_aggregate_runs"><code><b>xhprof_aggregate_runs()</b></code></a>.
|
|
|
747 |
|
|
|
748 |
</ul>
|
|
|
749 |
|
|
|
750 |
|
|
|
751 |
<li><a name="sampling_mode"><h2>Lightweight Sampling Mode</h2></a>
|
|
|
752 |
|
|
|
753 |
<p>The xhprof extension also provides a very light weight <b>sampling
|
|
|
754 |
mode</b>. The sampling interval is 0.1 secs. Samples record the full
|
|
|
755 |
function call stack. The sampling mode can be useful if an extremely
|
|
|
756 |
low overhead means of doing performance monitoring and diagnostics is
|
|
|
757 |
desired.
|
|
|
758 |
|
|
|
759 |
<p> The relevant functions exposed by the extension for using the
|
|
|
760 |
sampling mode are <code><b>xhprof_sample_enable()</b></code> and
|
|
|
761 |
<code><b>xhprof_sample_disable()</b></code>.
|
|
|
762 |
|
|
|
763 |
|
|
|
764 |
<p>[<b>TBD</b>: more detailed documentation on sampling mode.]
|
|
|
765 |
|
|
|
766 |
|
|
|
767 |
<li><a name="misc"><h2>Additional Features</h2></a></li>
|
|
|
768 |
|
|
|
769 |
<p>The <code><b>xhprof_lib/utils/xhprof_lib.php</b></code> file contains
|
|
|
770 |
additional library functions that can be used for manipulating/
|
|
|
771 |
aggregating XHProf runs.
|
|
|
772 |
|
|
|
773 |
<p>For example:
|
|
|
774 |
|
|
|
775 |
<ul>
|
|
|
776 |
|
|
|
777 |
<a name="xhprof_aggregate_runs"></a>
|
|
|
778 |
<p><li><code><b>xhprof_aggregate_runs()</b></code>:
|
|
|
779 |
can be used to aggregate multiple XHProf runs into a single run. This
|
|
|
780 |
can be helpful for building a system-wide "function-level" performance
|
|
|
781 |
monitoring tool using XHProf. [For example, you might to roll up
|
|
|
782 |
XHProf runs sampled from production periodically to generate hourly,
|
|
|
783 |
daily, reports.]
|
|
|
784 |
|
|
|
785 |
<p><li><code><b>xhprof_prune_run()</b></code>: Aggregating large number of
|
|
|
786 |
XHProf runs (especially if they correspond to different types of
|
|
|
787 |
programs) can result in the callgraph size becoming too large. You can
|
|
|
788 |
use <code>xhprof_prune_run</code> function to prune the callgraph data
|
|
|
789 |
by editing out subtrees that account for a very small portion of the
|
|
|
790 |
total time.
|
|
|
791 |
|
|
|
792 |
</ul>
|
|
|
793 |
|
|
|
794 |
|
|
|
795 |
<ol>
|
|
|
796 |
|
|
|
797 |
</ol>
|
|
|
798 |
|
|
|
799 |
<li><a name="dependencies"><h2>Dependencies</h2></a></li>
|
|
|
800 |
|
|
|
801 |
<ul>
|
|
|
802 |
<li><b>JQuery Javascript</b>: For tooltips and function name
|
|
|
803 |
typeahead, we make use of JQuery's javascript libraries. JQuery is
|
|
|
804 |
available under both a MIT and GPL licencse
|
|
|
805 |
(http://docs.jquery.com/Licensing). The relevant JQuery code, used by
|
|
|
806 |
XHProf, is in the <code>xhprof_html/jquery</code> subdirectory.
|
|
|
807 |
|
|
|
808 |
<li><b>dot (image generation utility):</b> The callgraph image
|
|
|
809 |
visualization ([View Callgraph]) feature relies on the presence of
|
|
|
810 |
Graphviz "dot" utility in your path. "dot" is a utility to
|
|
|
811 |
draw/generate an image for a directed graph.
|
|
|
812 |
|
|
|
813 |
</ul>
|
|
|
814 |
<li><a name="credits"><h2>Acknowledgements</h2></a>
|
|
|
815 |
|
|
|
816 |
<p>The HTML-based navigational interface for browsing profiler results
|
|
|
817 |
is inspired by that of a similar tool that exists for Oracle's stored
|
|
|
818 |
procedure language, PL/SQL. But that's where the similarity ends; the
|
|
|
819 |
internals of the profiler itself are quite different.
|
|
|
820 |
|
|
|
821 |
</li>
|
|
|
822 |
|
|
|
823 |
</ol>
|
|
|
824 |
|
|
|
825 |
</body>
|
|
|
826 |
</html>
|