AutorÃa | Ultima modificación | Ver Log |
<html><title>XHProf Documentation (Draft)</title><body><h3>XHProf Documentation (Draft)</h3><b>Contents</b><ol><li><a href="#introduction">Introduction</a></li><li><a href="#overview">XHProf Overview</a></li><li><a href="#installation">Installing the XHProf extension</a></li><li><a href="#using_extension">Profiling using XHProf</a></li><li><a href="#ui_setup">Setting up the XHProf UI</a></li><li><a href="#production_notes">Notes on using XHProf in production</a></li><li><a href="#sampling_mode">Lightweight Sampling Mode</a><li><a href="#misc">Additional features</a></li><li><a href="#dependencies">Dependencies</a></li><li><a href="#credits">Acknowledgements</a></li></ol><ol><li><a name="introduction"><h2>Introduction</h2></a><p>XHProf is a hierarchical profiler for PHP. It reportsfunction-level call counts and <a href="#inclusive">inclusive</a> and<a href="#exclusive">exclusive</a> metrics such as wall (elapsed)time, CPU time and memory usage. A function's profile can be brokendown by callers or callees. The raw data collection component isimplemented in C as a PHP Zend extension called<code><b>xhprof</b></code>. XHProf has a simple HTML based userinterface (written in PHP). The browser based UI for viewing profilerresults makes it easy to view results or to share results with peers.A callgraph image view is also supported.<p>XHProf reports can often be helpful in understanding the structureof the code being executed. The hierarchical nature of the reports canbe used to determine, for example, what chain of calls led to aparticular function getting called.<p>XHProf supports ability to compare two runs (a.k.a. "diff" reports)or aggregate data from multiple runs. Diff and aggregate reports, muchlike single run reports, offer "flat" as well as "hierarchical" viewsof the profile.<p>XHProf is a light-weight instrumentation based profiler. During thedata collection phase, it keeps track of call counts and inclusivemetrics for arcs in the dynamic callgraph of a program. It computesexclusive metrics in the reporting/post processing phase. XHProfhandles recursive functions by detecting cycles in the callgraph atdata collection time itself and avoiding the cycles by giving uniquedepth qualified names for the recursive invocations.</p><p>XHProf's light-weight nature and aggregation capabilities make itwell suited for collecting "function-level" performance statisticsfrom production environments. [See <ahref="#production_notes">additional notes</a> for use in production.]<ul><hr><p>XHProfLive (not part of the open source kit), for example, is asystem-wide performance monitoring system in use at Facebook that isbuilt on top of XHProf. XHProfLive continually gathers function-levelprofiler data from production tier by running a sample of pagerequests under XHProf. XHProfLive then aggregates the profile datacorresponding to individual requests by various dimensions such astime, page type, and can help answer a variety of questions such as:What is the function-level profile for a specific page? How expensiveis function "foo" across all pages, or on a specific page? Whatfunctions regressed most in the last hour/day/week? What is thehistorical trend for execution time of a page/function? and so on.<hr></ul><p>Originally developed at Facebook, XHProf was open sourced in Mar, 2009.</p></ul><li><a name="overview"><h2>XHProf Overview</h2></a><p>XHProf provides:<ul><li><b>Flat profile</b> (<a href="sample-flat-view.jpg" >screenshot</a>)<p>Function-level summary information such as number of calls,inclusive/exclusive wall time, memory usage, and CPU time.<p><li><b>Hierarchical profile (Parent/Child View)</b>(<a href="sample-parent-child-view.jpg" >screenshot</a>)<p>For each function, it provides a breakdown of calls and times perparent (caller) & child (callee), such as:<ul><li> what functions call a particular function and how many times?<li> what functions does a particular function call?<li> The total time spent under a function when called from a particular parent.</ul><p><li><b>Diff Reports</b><p>You may want to compare data from two XHProf runs for variousreasons-- to figure out what's causing a regression between oneversion of the code base to another, to evaluate the performanceimprovement of a code change you are making, and so on.<p>A diff report takes two runs as input and provides both flatfunction-level diff information, and hierarchical information(breakdown of diff by parent/children functions) for each function.<p>The "flat" view (<a href="sample-diff-report-flat-view.jpg">sample screenshot</a>) in the diff report points out the topregressions & improvements.<p>Clicking on functions in the "flat" view of the diff report, leadsto the "hierarchical" (or parent/child) diff view of a function (<a href="sample-diff-report-parent-child-view.jpg">sample screenshot</a>). We can get abreakdown of the diff by parent/children functions.<p><li><b>Callgraph View</b> (<a href="sample-callgraph-image.jpg">sample screenshot</a>)<p>The profile data can also be viewed as a callgraph. The callgraphview highlights the critical path of the program.<p><li><b>Memory Profile</b><p>XHProf's memory profile mode helps track functions thatallocate lots of memory.<p>It is worth clarifying that that XHProf doesn't strictly track eachallocation/free operation. Rather it uses a more simplisticscheme. It tracks the increase/decrease in the amount of memoryallocated to PHP between each function's entry and exit. It alsotracks increase/decrease in the amount of <b>peak</b> memory allocated toPHP for each function.<li>XHProf tracks <code>include, include_once, require andrequire_once</code> operations as if they were functions. The name ofthe file being included is used to generate the name for these <ahref="#include_funcs">"fake" functions</a>.</ul><a name="Terminology"></a><h2>Terminology</h2><ol><a name="inclusive"></a><li><b>Inclusive Time (or Subtree Time)</b>:Includes time spent in the function as well as in descendant functionscalled from a given function.<a name="exclusive"></a><li><b>Exclusive Time/Self Time</b>: Measurestime spent in the function itself. Does not include time in descendantfunctions.<li><b>Wall Time</b>: a.k.a. Elapsed time or wall clock time.<li><b>CPU Time</b>: CPU time in user space + CPU time in kernel space</ol><a name="Naming_convention_for_special_functions"></a><h2>Naming convention for special functions</h2><ol><p><li><code><b>main()</b></code>: a fictitious function that is at the root of the call graph.<a name="include_funcs"></a><p><li><code><b>load::<filename></b></code>and <code><b>run_init::<filename></b></code>:<p>XHProf tracks PHP <code>include/require</code> operations asfunction calls.<p>For example, an <b>include "lib/common.php";</b> operation willresult in two XHProf function entries:<ul><li> <code><b>load::lib/common.php</b></code> - This represents the work done by theinterpreter to compile/load the file. [Note: If you are using a PHPopcode cache like APC, then the compile only happens on a cache missin APC.]<li> <code><b>run_init::lib/common.php</b></code> - This representsinitialization code executed at the file scope as a result of theinclude operation.</ul><p><li><code><b>foo@<n></b></code>: Implies that this is arecursive invocation of <code>foo()</code>, where <code><n></code> representsthe recursion depth. The recursion may be direct (such as due to<code>foo()</code> --> <code>foo()</code>), or indirect (such asdue to </code>foo()</code> --> <code>goo()</code> --> foo()).</ol><a name="Limitations"></a><h2>Limitations</h2><p>True hierarchical profilers keep track of a full call stack atevery data gathering point, and are later able to answer questionslike: what was the cost of the 3rd invokation of foo()? or what wasthe cost of bar() when the call stack looked likea()->b()->bar()?</p><p>XHProf keeps track of only 1-level of calling context and istherefore only able to answer questions about a function lookingeither 1-level up or 1-level down. It turns out that in practice thisis sufficient for most use cases.</p><p>To make this more concrete, take for instance the followingexample.</p><pre>Say you have:1 call from a() --> c()1 call from b() --> c()50 calls from c() --> d()</pre><p>While XHProf can tell you that d() was called from c() 50 times, itcannot tell you how many of those calls were triggered due to a()vs. b(). [We could speculate that perhaps 25 were due to a() and 25due to b(), but that's not necessarily true.]</p><p>In practice however, this isn't a very big limitation.</p><li><a name="installation"><h2>Installing the XHProf Extension</h2></a><p> The extension lives in the "extension/" sub-directory.<ul><hr><p><b>Note:</b> A windows port hasn't been implemented yet. We havetested <code>xhprof</code> on <b>Linux/FreeBSD</b> so far.<p>Version 0.9.2 and above of XHProf is also expected to work on <b>MacOS</b>. [We have tested on Mac OS 10.5.]<p><b>Note:</b> XHProf uses the RDTSC instruction (time stamp counter)to implement a really low overhead timer for elapsed time. So at themoment <code>xhprof</code> only works on <b>x86</b> architecture.Also, since RDTSC values may not be synchronized across CPUs,<code>xhprof</code> binds the program to a single CPU during theprofiling period.<p>XHProf's RDTSC based timer functionality doesn't work correctly if<b>SpeedStep</b> technology is turned on. This technology is available onsome Intel processors. [Note: Mac desktops and laptops typically haveSpeedStep turned on by default. To use XHProf, you'll need to disableSpeedStep.]<hr></ul><p> The stepsbelow should work for Linux/Unix environments.<pre>% cd <xhprof_source_directory>/extension/% phpize% ./configure --with-php-config=<path to php-config>% make% make install% make test</pre><p><a name="ini_file"></a><b>php.ini file</b>: You can update yourphp.ini file to automatically load your extension. Add the followingto your php.ini file.<pre>[xhprof]extension=xhprof.so;; directory used by default implementation of the iXHProfRuns; interface (namely, the XHProfRuns_Default class) for storing; XHProf runs.;xhprof.output_dir=<directory_for_storing_xhprof_runs></pre><li><a name="using_extension"><h2>Profiling using XHProf</h2></a><p>Test generating raw profiler data using a sample test program like:<ul><p><b>foo.php</b><pre><?phpfunction bar($x) {if ($x > 0) {bar($x - 1);}}function foo() {for ($idx = 0; $idx < 2; $idx++) {bar($idx);$x = strlen("abc");}}// start profiling<b>xhprof_enable();</b>// run programfoo();// stop profiler<b>$xhprof_data = xhprof_disable();</b>// display raw xhprof data for the profiler runprint_r($xhprof_data);</pre></ul><p><b>Run the above test program:</b><pre>% php -dextension=xhprof.so foo.php</pre><p><b>You should get an output like:</b><pre>Array([foo==>bar] => Array([ct] => 2 # 2 calls to bar() from foo()[wt] => 27 # inclusive time in bar() when called from foo())[foo==>strlen] => Array([ct] => 2[wt] => 2)[bar==>bar@1] => Array # a recursive call to bar()([ct] => 1[wt] => 2)[main()==>foo] => Array([ct] => 1[wt] => 74)[main()==>xhprof_disable] => Array([ct] => 1[wt] => 0)[main()] => Array # fake symbol representing root([ct] => 1[wt] => 83))</pre><p><b>Note:</b> The raw data only contains "inclusive" metrics. Forexample, the wall time metric in the raw data represents inclusivetime in microsecs. Exclusive times for any function are computedduring the analysis/reporting phase.<p><b>Note:</b> By default only call counts & elapsed time is profiled.You can optionally also profile CPU time and/or memory usage. Replace,<ul><pre><b>xhprof_enable();</b></pre></ul>in the above program with, for example:<ul><pre><b>xhprof_enable(XHPROF_FLAGS_CPU + XHPROF_FLAGS_MEMORY)</b>;</pre></ul><p><b>You should now get an output like:</b><pre>Array([foo==>bar] => Array([ct] => 2 # number of calls to bar() from foo()[wt] => 37 # time in bar() when called from foo()[cpu] => 0 # cpu time in bar() when called from foo()[mu] => 2208 # change in PHP memory usage in bar() when called from foo()[pmu] => 0 # change in PHP peak memory usage in bar() when called from foo())[foo==>strlen] => Array([ct] => 2[wt] => 3[cpu] => 0[mu] => 624[pmu] => 0)[bar==>bar@1] => Array([ct] => 1[wt] => 2[cpu] => 0[mu] => 856[pmu] => 0)[main()==>foo] => Array([ct] => 1[wt] => 104[cpu] => 0[mu] => 4168[pmu] => 0)[main()==>xhprof_disable] => Array([ct] => 1[wt] => 1[cpu] => 0[mu] => 344[pmu] => 0)[main()] => Array([ct] => 1[wt] => 139[cpu] => 0[mu] => 5936[pmu] => 0))</pre><p><b>Skipping builtin functions during profiling</b><p>By default PHP builtin functions (such as <code>strlen</code>) areprofiled. If you do not want to profile builtin functions (to eitherreduce the overhead of profiling further or size of generated rawdata), you can use the <code><b>XHPROF_FLAGS_NO_BUILTINS</b></code>flag as in for example:<ul><pre>// do not profile builtin functions<b>xhprof_enable(XHPROF_FLAGS_NO_BUILTINS)</b>;</pre></ul><p><b>Ignoring specific functions during profiling (0.9.2 or higher)</b><p>Starting with release 0.9.2 of xhprof, you can tell XHProf toignore a specified list of functions during profiling. This allows youto ignore, for example, functions used for indirect function callssuch as <code>call_user_func</code> and<code>call_user_func_array</code>. These intermediate functionsunnecessarily complicate the call hierarchy and make the XHProfreports harder to interpret since they muddle the parent-childrelationship for functions called indirectly.<p> To specify the list of functions to be ignored during profilinguse the 2nd (optional) argument to <code>xhprof_enable</code>.For example,<pre><ul><pre><b>// elapsed time profiling; ignore call_user_func* during profilingxhprof_enable(0,array('ignored_functions' => array('call_user_func','call_user_func_array')));</b>or,<b>// elapsed time + memory profiling; ignore call_user_func* during profilingxhprof_enable(XHPROF_FLAGS_MEMORY,array('ignored_functions' => array('call_user_func','call_user_func_array')));</b></pre></ul></li><li><a name="ui_setup"><h2>Setting up XHProf UI</h2></a><ol><li><b>PHP source structure</b><p>The XHProf UI is implemented in PHP. The code resides in twosubdirectories, <code>xhprof_html/</code> and <code>xhprof_lib/</code>.<p>The <code>xhprof_html</code> directory contains the 3 top-level PHP pages.<ul><li><code>index.php</code>: For viewing a single run or diff report.<li><code>callgraph.php</code>: For viewing a callgraph of a XHProf run as an image.<li><code>typeahead.php</code>: Used implicitly for the function typeahead formon a XHProf report.</ul><p>The <code>xhprof_lib</code> directory contains supporting code fordisplay as well as analysis (computing flat profile info, computingdiffs, aggregating data from multiple runs, etc.).<li><p><b>Web server config: </b> You'll need to make sure that the<code>xhprof_html/</code> directory is accessible from your web server, and thatyour web server is setup to serve PHP scripts.<li><p><b>Managing XHProf Runs</b><p>Clients have flexibility in how they save the XHProf raw dataobtained from an XHProf run. The XHProf UI layer exposes an interfaceiXHProfRuns (see xhprof_lib/utils/xhprof_runs.php) that clients canimplement. This allows the clients to tell the UI layer how to fetchthe data corresponding to a XHProf run.<p>The XHProf UI libaries come with a default file basedimplementation of the iXHProfRuns interface, namely"XHProfRuns_Default" (also in xhprof_lib/utils/xhprof_runs.php).This default implementation stores runs in the directory specified by<a href="#ini_file"><b>xhprof.output_dir</b></a> INI parameter.<p>A XHProf run must be uniquely identified by a namespace and a runid.<p><b>a) Saving XHProf data persistently</b>:<p>Assuming you are using the default implementation<code><b>XHProfRuns_Default</b></code> of the<code><b>iXHProfRuns</b></code> interface, a typical XHProf runfollowed by the save step might look something like:<pre>// start profilingxhprof_enable();// run program....// stop profiler$xhprof_data = xhprof_disable();//// Saving the XHProf run// using the default implementation of iXHProfRuns.//include_once $XHPROF_ROOT . "/xhprof_lib/utils/xhprof_lib.php";include_once $XHPROF_ROOT . "/xhprof_lib/utils/xhprof_runs.php";$xhprof_runs = new <b>XHProfRuns_Default()</b>;// Save the run under a namespace "xhprof_foo".//// **NOTE**:// By default save_run() will automatically generate a unique// run id for you. [You can override that behavior by passing// a run id (optional arg) to the save_run() method instead.]//<b>$run_id = $xhprof_runs->save_run($xhprof_data, "xhprof_foo");</b>echo "---------------\n"."Assuming you have set up the http based UI for \n"."XHProf at some address, you can view run at \n"."http://<xhprof-ui-address>/index.php?run=$run_id&source=xhprof_foo\n"."---------------\n";</pre><p>The above should save the run as a file in the directory specifiedby the <code><b>xhprof.output_dir</b></code> INI parameter. The file'sname might be something like<b><code>49bafaa3a3f66.xhprof_foo</code></b>; the two parts being therun id ("49bafaa3a3f66") and the namespace ("xhprof_foo"). [If youwant to create/assign run ids yourself (such as a database sequencenumber, or a timestamp), you can explicitly pass in the run id to the<code>save_run</code> method.<p><b>b) Using your own implementation of iXHProfRuns</b><p> If you decide you want your XHProf runs to be stored differently(either in a compressed format, in an alternate place such as DB,etc.) database, you'll need to implement a class that implements theiXHProfRuns() interface.<p> You'll also need to modify the 3 main PHP entry pages (index.php,callgraph.php, typeahead.php) in the "xhprof_html/" directory to usethe new class instead of the default class <code>XHProfRuns_Default</code>.Change this line in the 3 files.<pre>$xhprof_runs_impl = new XHProfRuns_Default();</pre><p>You'll also need to "include" the file that implements your class inthe above files.<li><p><b>Accessing runs from UI</b><p><b>a) Viewing a Single Run Report</b><p>To view the report for run id say <run_id> and namespace<namespace> use a URL of the form:<p><code>http://<xhprof-ui-address>/index.php?run=<run_id>&source=<namespace></code><p>For example,<p><code>http://<xhprof-ui-address>/index.php?run=49bafaa3a3f66&source=xhprof_foo</code><p><b>b) Viewing a Diff Report</b><p>To view the report for run ids say <run_id1> and<run_id2> in namespace <namespace> use a URL of the form:<p><code>http://<xhprof-ui-address>/index.php?<b>run1=<run_id1>&run2=<run_id2></b>&source=<namespace></code><p><b>c) Aggregate Report</b><p>You can also specify a set of run ids for which you want an aggregated view/report.<p>Say you have three XHProf runs with ids 1, 2 & 3 in namespace"benchmark". To view an aggregate report of these runs:<ul><p><code>http://<xhprof-ui-address>/index.php?<b>run=1,2,3</b>&source=benchmark</code></p></ul><p><b>Weighted aggregations</b>: Further suppose that the above three runscorrespond to three types of programs p1.php, p2.php and p3.php thattypically occur in a mix of 20%, 30%, 50% respectively. To view anaggregate report that corresponds to a weighted average of these runsusing:<ul><p><code>http://<xhprof-ui-address>/index.php?<b>run=1,2,3&wts=20,30,50</b>&source=benchmark</code></p></ul></ol><li><a name="production_notes"><h2>Notes on using XHProf in production</h2></a><p>Some observations/guidelines. Your mileage may vary:<ul><li>CPU timer (getrusage) on Linux has high overheads. It is alsocoarse grained (millisec accuracy rather than microsec level) to beuseful at function level. Therefore, the skew in reported numberswhen using XHPROF_FLAGS_CPU mode tends to be higher.<p>We recommend using elapsed time + memory profiling mode inproduction. [Note: The additional overhead of memory profilingmode is really low.]<p><ul><pre><b>// elapsed time profiling (default) + memory profilingxhprof_enable(XHPROF_FLAGS_MEMORY);</b></pre></ul></p><li>Profiling a random sample of pages/requests works well in capturingdata that is representative of your production workload.<p>To profile say 1/10000 of your requests, instrument the beginning ofyour request processing with something along the lines of:<p><ul><pre><code>if (mt_rand(1, 10000) == 1) {xhprof_enable(XHPROF_FLAGS_MEMORY);$xhprof_on = true;}</code></pre></ul></p><p>At the end of the request (or in a request shutdown function), you mightthen do something like:<p><ul><pre><code>if ($xhprof_on) {// stop profiler$xhprof_data = xhprof_disable();// save $xhprof_data somewhere (say a central DB)...}</code></pre></ul></p><p> You can then rollup/aggregate these individual profiles by time(e.g., 5 minutely/hourly/daily basis), page/request type,or otherdimensions using <a href="#xhprof_aggregate_runs"><code><b>xhprof_aggregate_runs()</b></code></a>.</ul><li><a name="sampling_mode"><h2>Lightweight Sampling Mode</h2></a><p>The xhprof extension also provides a very light weight <b>samplingmode</b>. The sampling interval is 0.1 secs. Samples record the fullfunction call stack. The sampling mode can be useful if an extremelylow overhead means of doing performance monitoring and diagnostics isdesired.<p> The relevant functions exposed by the extension for using thesampling mode are <code><b>xhprof_sample_enable()</b></code> and<code><b>xhprof_sample_disable()</b></code>.<p>[<b>TBD</b>: more detailed documentation on sampling mode.]<li><a name="misc"><h2>Additional Features</h2></a></li><p>The <code><b>xhprof_lib/utils/xhprof_lib.php</b></code> file containsadditional library functions that can be used for manipulating/aggregating XHProf runs.<p>For example:<ul><a name="xhprof_aggregate_runs"></a><p><li><code><b>xhprof_aggregate_runs()</b></code>:can be used to aggregate multiple XHProf runs into a single run. Thiscan be helpful for building a system-wide "function-level" performancemonitoring tool using XHProf. [For example, you might to roll upXHProf runs sampled from production periodically to generate hourly,daily, reports.]<p><li><code><b>xhprof_prune_run()</b></code>: Aggregating large number ofXHProf runs (especially if they correspond to different types ofprograms) can result in the callgraph size becoming too large. You canuse <code>xhprof_prune_run</code> function to prune the callgraph databy editing out subtrees that account for a very small portion of thetotal time.</ul><ol></ol><li><a name="dependencies"><h2>Dependencies</h2></a></li><ul><li><b>JQuery Javascript</b>: For tooltips and function nametypeahead, we make use of JQuery's javascript libraries. JQuery isavailable under both a MIT and GPL licencse(http://docs.jquery.com/Licensing). The relevant JQuery code, used byXHProf, is in the <code>xhprof_html/jquery</code> subdirectory.<li><b>dot (image generation utility):</b> The callgraph imagevisualization ([View Callgraph]) feature relies on the presence ofGraphviz "dot" utility in your path. "dot" is a utility todraw/generate an image for a directed graph.</ul><li><a name="credits"><h2>Acknowledgements</h2></a><p>The HTML-based navigational interface for browsing profiler resultsis inspired by that of a similar tool that exists for Oracle's storedprocedure language, PL/SQL. But that's where the similarity ends; theinternals of the profiler itself are quite different.</li></ol></body></html>