<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><!--
        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
              This file is generated from xml source: DO NOT EDIT
        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
      -->
<title>Apache Performance Tuning - Apache HTTP Server</title>
<link href="/style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />
<link href="/style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />
<link href="/style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" /><link rel="stylesheet" type="text/css" href="/style/css/prettify.css" />
<script src="/style/scripts/prettify.min.js" type="text/javascript">
</script>

<link href="/images/favicon.ico" rel="shortcut icon" /></head>
<body id="manual-page"><div id="page-header">
<p class="menu"><a href="/mod/">Modules</a> | <a href="/mod/directives.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="/glossary.html">Glossary</a> | <a href="/sitemap.html">Sitemap</a></p>
<p class="apache">Apache HTTP Server Version 2.4</p>
<img alt="" src="/images/feather.gif" /></div>
<div class="up"><a href="./"><img title="&lt;-" alt="&lt;-" src="/images/left.gif" /></a></div>
<div id="path">
<a href="http://www.apache.org/">Apache</a> &gt; <a href="http://httpd.apache.org/">HTTP Server</a> &gt; <a href="http://httpd.apache.org/docs/">Documentation</a> &gt; <a href="../">Version 2.4</a> &gt; <a href="./">Miscellaneous Documentation</a></div><div id="page-content"><div id="preamble"><h1>Apache Performance Tuning</h1>
<div class="toplang">
<p><span>Available Languages: </span><a href="/en/misc/perf-tuning.html" title="English"> en </a>
|
<a href="/fr/misc/perf-tuning.html" hreflang="fr" rel="alternate" title="Fran&#231;ais"> fr </a> |
<a href="/ko/misc/perf-tuning.html" hreflang="ko" rel="alternate" title="Korean"> ko </a> |
<a href="/tr/misc/perf-tuning.html" hreflang="tr" rel="alternate" title="T&#252;rk&#231;e"> tr </a></p>
</div>


    <p>Apache 2.x is a general-purpose webserver, designed to
    provide a balance of flexibility, portability, and performance.
    Although it has not been designed specifically to set benchmark
    records, Apache 2.x is capable of high performance in many
    real-world situations.</p>

    <p>Compared to Apache 1.3, release 2.x contains many additional
    optimizations to increase throughput and scalability. Most of
    these improvements are enabled by default. However, there are
    compile-time and run-time configuration choices that can
    significantly affect performance. This document describes the
    options that a server administrator can configure to tune the
    performance of an Apache 2.x installation.
    Some of these
    configuration options enable the httpd to better take advantage
    of the capabilities of the hardware and OS, while others allow
    the administrator to trade functionality for speed.</p>

  </div>
<div id="quickview"><ul id="toc"><li><img alt="" src="/images/down.gif" /> <a href="#hardware">Hardware and Operating System Issues</a></li>
<li><img alt="" src="/images/down.gif" /> <a href="#runtime">Run-Time Configuration Issues</a></li>
<li><img alt="" src="/images/down.gif" /> <a href="#compiletime">Compile-Time Configuration Issues</a></li>
<li><img alt="" src="/images/down.gif" /> <a href="#trace">Appendix: Detailed Analysis of a Trace</a></li>
</ul><ul class="seealso"><li><a href="#comments_section">Comments</a></li></ul></div>
<div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
<div class="section">
<h2><a name="hardware" id="hardware">Hardware and Operating System Issues</a></h2>



    <p>The single biggest hardware issue affecting webserver
    performance is RAM. A webserver should never have to swap,
    as swapping increases the latency of each request beyond the point
    that users consider "fast enough". This causes users to hit
    stop and reload, further increasing the load. You can, and
    should, control the <code class="directive"><a href="/mod/mpm_common.html#maxrequestworkers">MaxRequestWorkers</a></code> setting so that your server
    does not spawn so many children that it starts swapping.
    The procedure
    for doing this is simple: determine the size of your average Apache
    process by looking at your process list via a tool such as
    <code>top</code>, and divide this into your total available memory,
    leaving some room for other processes.</p>

    <p>Beyond that the rest is mundane: get a fast enough CPU, a
    fast enough network card, and fast enough disks, where "fast
    enough" is something that needs to be determined by
    experimentation.</p>

    <p>Operating system choice is largely a matter of local
    concerns. But some guidelines that have proven generally
    useful are:</p>

    <ul>
      <li>
        <p>Run the latest stable release and patchlevel of the
        operating system that you choose. Many OS suppliers have
        introduced significant performance improvements to their
        TCP stacks and thread libraries in recent years.</p>
      </li>

      <li>
        <p>If your OS supports a <code>sendfile(2)</code> system
        call, make sure you install the release and/or patches
        needed to enable it. (With Linux, for example, this means
        using Linux 2.4 or later. For early releases of Solaris 8,
        you may need to apply a patch.)
        On systems where it is
        available, <code>sendfile</code> enables Apache 2 to deliver
        static content faster and with lower CPU utilization.</p>
      </li>
    </ul>

  </div><div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
<div class="section">
<h2><a name="runtime" id="runtime">Run-Time Configuration Issues</a></h2>



    <table class="related"><tr><th>Related Modules</th><th>Related Directives</th></tr><tr><td><ul><li><code class="module"><a href="/mod/mod_dir.html">mod_dir</a></code></li><li><code class="module"><a href="/mod/mpm_common.html">mpm_common</a></code></li><li><code class="module"><a href="/mod/mod_status.html">mod_status</a></code></li></ul></td><td><ul><li><code class="directive"><a href="/mod/core.html#allowoverride">AllowOverride</a></code></li><li><code class="directive"><a href="/mod/mod_dir.html#directoryindex">DirectoryIndex</a></code></li><li><code class="directive"><a href="/mod/core.html#hostnamelookups">HostnameLookups</a></code></li><li><code class="directive"><a href="/mod/core.html#enablemmap">EnableMMAP</a></code></li><li><code class="directive"><a href="/mod/core.html#enablesendfile">EnableSendfile</a></code></li><li><code class="directive"><a href="/mod/core.html#keepalivetimeout">KeepAliveTimeout</a></code></li><li><code class="directive"><a href="/mod/prefork.html#maxspareservers">MaxSpareServers</a></code></li><li><code class="directive"><a href="/mod/prefork.html#minspareservers">MinSpareServers</a></code></li><li><code class="directive"><a href="/mod/core.html#options">Options</a></code></li><li><code class="directive"><a href="/mod/mpm_common.html#startservers">StartServers</a></code></li></ul></td></tr></table>

    <h3><a name="dns" id="dns">HostnameLookups and other DNS considerations</a></h3>



      <p>Prior to Apache 1.3, <code class="directive"><a href="/mod/core.html#hostnamelookups">HostnameLookups</a></code> defaulted to <code>On</code>.
      This adds latency to every request because it requires a
      DNS lookup to complete before the request is finished. Since
      Apache 1.3 this setting has defaulted to <code>Off</code>. If you need
      to have addresses in your log files resolved to hostnames, use the
      <code class="program"><a href="/programs/logresolve.html">logresolve</a></code>
      program that comes with Apache, or one of the numerous log
      reporting packages which are available.</p>

      <p>It is recommended that you do this sort of postprocessing of
      your log files on some machine other than the production web
      server machine, so that this activity does not adversely affect
      server performance.</p>

      <p>If you use any <code><code class="directive"><a href="/mod/mod_access_compat.html#allow">Allow</a></code> from domain</code> or <code><code class="directive"><a href="/mod/mod_access_compat.html#deny">Deny</a></code> from domain</code>
      directives (i.e., using a hostname, or a domain name, rather than
      an IP address) then you will pay for
      two DNS lookups (a reverse, followed by a forward lookup
      to make sure that the reverse is not being spoofed). For best
      performance, therefore, use IP addresses, rather than names, when
      using these directives, if possible.</p>

      <p>Note that it's possible to scope the directives, such as
      within a <code>&lt;Location /server-status&gt;</code> section.
      In this case the DNS lookups are only performed on requests
      matching the criteria.
      Here's an example which disables lookups
      except for <code>.html</code> and <code>.cgi</code> files:</p>

      <pre class="prettyprint lang-config">HostnameLookups off
&lt;Files ~ "\.(html|cgi)$"&gt;
    HostnameLookups on
&lt;/Files&gt;</pre>


      <p>Even so, if you need DNS names only in some CGIs, you
      could consider doing the <code>gethostbyname</code> call in the
      specific CGIs that need it.</p>



    <h3><a name="symlinks" id="symlinks">FollowSymLinks and SymLinksIfOwnerMatch</a></h3>



      <p>Wherever in your URL-space you do not have an <code>Options
      FollowSymLinks</code>, or you do have an <code>Options
      SymLinksIfOwnerMatch</code>, Apache will have to issue extra
      system calls to check up on symlinks: one extra call per
      filename component. For example, if you had:</p>

      <pre class="prettyprint lang-config">DocumentRoot /www/htdocs
&lt;Directory /&gt;
    Options SymLinksIfOwnerMatch
&lt;/Directory&gt;</pre>


      <p>and a request is made for the URI <code>/index.html</code>,
      then Apache will perform <code>lstat(2)</code> on
      <code>/www</code>, <code>/www/htdocs</code>, and
      <code>/www/htdocs/index.html</code>. The results of these
      <code>lstats</code> are never cached, so they will occur on
      every single request. If you really desire the symlinks
      security checking you can do something like this:</p>

      <pre class="prettyprint lang-config">DocumentRoot /www/htdocs
&lt;Directory /&gt;
    Options FollowSymLinks
&lt;/Directory&gt;

&lt;Directory /www/htdocs&gt;
    Options -FollowSymLinks +SymLinksIfOwnerMatch
&lt;/Directory&gt;</pre>


      <p>This at least avoids the extra checks for the
      <code class="directive"><a href="/mod/core.html#documentroot">DocumentRoot</a></code> path.
      Note that you'll need to add similar sections if you
      have any <code class="directive"><a href="/mod/mod_alias.html#alias">Alias</a></code> or
      <code class="directive"><a href="/mod/mod_rewrite.html#rewriterule">RewriteRule</a></code> paths
      outside of your document root. For highest performance,
      and no symlink protection, set <code>FollowSymLinks</code>
      everywhere, and never set <code>SymLinksIfOwnerMatch</code>.</p>



    <h3><a name="htaccess" id="htaccess">AllowOverride</a></h3>



      <p>Wherever in your URL-space you allow overrides (typically
      <code>.htaccess</code> files) Apache will attempt to open
      <code>.htaccess</code> for each filename component. For
      example, if you had:</p>

      <pre class="prettyprint lang-config">DocumentRoot /www/htdocs
&lt;Directory /&gt;
    AllowOverride all
&lt;/Directory&gt;</pre>


      <p>and a request is made for the URI <code>/index.html</code>,
      then Apache will attempt to open <code>/.htaccess</code>,
      <code>/www/.htaccess</code>, and
      <code>/www/htdocs/.htaccess</code>. The solutions are similar
      to the previous case of <code>Options FollowSymLinks</code>.
      For highest performance use <code>AllowOverride None</code>
      everywhere in your filesystem.</p>



    <h3><a name="negotiation" id="negotiation">Negotiation</a></h3>



      <p>If at all possible, avoid content negotiation if you're
      really interested in every last ounce of performance. In
      practice, though, the benefits of negotiation outweigh the performance
      penalties. There's one case where you can speed up the server.
      Instead of using a wildcard such as:</p>

      <pre class="prettyprint lang-config">DirectoryIndex index</pre>


      <p>Use a complete list of options:</p>

      <pre class="prettyprint lang-config">DirectoryIndex index.cgi index.pl index.shtml index.html</pre>


      <p>where you list the most common choice first.</p>

      <p>Also note that explicitly creating a <code>type-map</code>
      file provides better performance than using
      <code>MultiViews</code>, as the necessary information can be
      determined by reading this single file, rather than having to
      scan the directory for files.</p>

    <p>If your site needs content negotiation consider using
    <code>type-map</code> files, rather than the <code>Options
    MultiViews</code> directive to accomplish the negotiation. See the
    <a href="/content-negotiation.html">Content Negotiation</a>
    documentation for a full discussion of the methods of negotiation,
    and instructions for creating <code>type-map</code> files.</p>



    <h3>Memory-mapping</h3>



      <p>In situations where Apache 2.x needs to look at the contents
      of a file being delivered--for example, when doing server-side-include
      processing--it normally memory-maps the file if the OS supports
      some form of <code>mmap(2)</code>.</p>

      <p>On some platforms, this memory-mapping improves performance.
      However, there are cases where memory-mapping can hurt the performance
      or even the stability of the httpd:</p>

      <ul>
        <li>
          <p>On some operating systems, <code>mmap</code> does not scale
          as well as <code>read(2)</code> when the number of CPUs increases.
          On multiprocessor Solaris servers, for example, Apache 2.x sometimes
          delivers server-parsed files faster when <code>mmap</code> is disabled.</p>
        </li>

        <li>
          <p>If you memory-map a file located on an NFS-mounted filesystem
          and a process on another NFS client machine deletes or truncates
          the file, your process may get a bus error the next time it tries
          to access the mapped file content.</p>
        </li>
      </ul>

      <p>For installations where either of these factors applies, you
      should use <code>EnableMMAP off</code> to disable the memory-mapping
      of delivered files. (Note: This directive can be overridden on
      a per-directory basis.)</p>



    <h3>Sendfile</h3>



      <p>In situations where Apache 2.x can ignore the contents of the file
      to be delivered -- for example, when serving static file content --
      it normally uses the kernel sendfile support for the file if the OS
      supports the <code>sendfile(2)</code> operation.</p>

      <p>On most platforms, using sendfile improves performance by eliminating
      separate read and send mechanics. However, there are cases where using
      sendfile can harm the stability of the httpd:</p>

      <ul>
        <li>
          <p>Some platforms may have broken sendfile support that the build
          system did not detect, especially if the binaries were built on
          another box and moved to such a machine with broken sendfile support.</p>
        </li>
        <li>
          <p>With an NFS-mounted filesystem, the kernel may be unable
          to reliably serve the network file through its own cache.</p>
        </li>
      </ul>

      <p>For installations where either of these factors applies, you
      should use <code>EnableSendfile off</code> to disable sendfile
      delivery of file contents.
      (Note: This directive can be overridden
      on a per-directory basis.)</p>



    <h3><a name="process" id="process">Process Creation</a></h3>



      <p>Prior to Apache 1.3 the <code class="directive"><a href="/mod/prefork.html#minspareservers">MinSpareServers</a></code>, <code class="directive"><a href="/mod/prefork.html#maxspareservers">MaxSpareServers</a></code>, and <code class="directive"><a href="/mod/mpm_common.html#startservers">StartServers</a></code> settings all had drastic effects on
      benchmark results. In particular, Apache required a "ramp-up"
      period in order to reach a number of children sufficient to serve
      the load being applied. After the initial spawning of
      <code class="directive"><a href="/mod/mpm_common.html#startservers">StartServers</a></code> children,
      only one child per second would be created to satisfy the
      <code class="directive"><a href="/mod/prefork.html#minspareservers">MinSpareServers</a></code>
      setting. So a server being accessed by 100 simultaneous
      clients, using the default <code class="directive"><a href="/mod/mpm_common.html#startservers">StartServers</a></code> of <code>5</code>, would take on
      the order of 95 seconds to spawn enough children to handle
      the load. This works fine in practice on real-life servers,
      because they aren't restarted frequently, but it does really
      poorly on benchmarks which might only run for ten minutes.</p>

      <p>The one-per-second rule was implemented in an effort to
      avoid swamping the machine with the startup of new children. If
      the machine is busy spawning children it can't service
      requests. But it has such a drastic effect on the perceived
      performance of Apache that it had to be replaced. As of Apache
      1.3, the code will relax the one-per-second rule.
      It will spawn
      one, wait a second, then spawn two, wait a second, then spawn
      four, and it will continue exponentially until it is spawning
      32 children per second. It will stop whenever it satisfies the
      <code class="directive"><a href="/mod/prefork.html#minspareservers">MinSpareServers</a></code>
      setting.</p>

      <p>This appears to be responsive enough that it's almost
      unnecessary to twiddle the <code class="directive"><a href="/mod/prefork.html#minspareservers">MinSpareServers</a></code>, <code class="directive"><a href="/mod/prefork.html#maxspareservers">MaxSpareServers</a></code> and <code class="directive"><a href="/mod/mpm_common.html#startservers">StartServers</a></code> knobs. When more than 4 children are
      spawned per second, a message will be emitted to the
      <code class="directive"><a href="/mod/core.html#errorlog">ErrorLog</a></code>. If you
      see a lot of these errors then consider tuning these settings.
      Use the <code class="module"><a href="/mod/mod_status.html">mod_status</a></code> output as a guide.</p>

    <p>Related to process creation is process death induced by the
    <code class="directive"><a href="/mod/mpm_common.html#maxconnectionsperchild">MaxConnectionsPerChild</a></code>
    setting. By default this is <code>0</code>,
    which means that there is no limit to the number of connections
    handled per child. If your configuration currently has this set
    to some very low number, such as <code>30</code>, you may want to bump this
    up significantly. If you are running SunOS or an old version of
    Solaris, limit this to <code>10000</code> or so because of memory leaks.</p>

    <p>When keep-alives are in use, children will be kept busy
    doing nothing waiting for more requests on the already open
    connection. The default <code class="directive"><a href="/mod/core.html#keepalivetimeout">KeepAliveTimeout</a></code> of <code>5</code>
    seconds attempts to minimize this effect.
    The tradeoff here is
    between network bandwidth and server resources. In no event
    should you raise this above about <code>60</code> seconds, as <a href="http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-95-4.html">
    most of the benefits are lost</a>.</p>



  </div><div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
<div class="section">
<h2><a name="compiletime" id="compiletime">Compile-Time Configuration Issues</a></h2>



    <h3>Choosing an MPM</h3>



      <p>Apache 2.x supports pluggable concurrency models, called
      <a href="/mpm.html">Multi-Processing Modules</a> (MPMs).
      When building Apache, you must choose an MPM to use. There
      are platform-specific MPMs for some platforms:
      <code class="module"><a href="/mod/mpm_netware.html">mpm_netware</a></code>,
      <code class="module"><a href="/mod/mpmt_os2.html">mpmt_os2</a></code>, and <code class="module"><a href="/mod/mpm_winnt.html">mpm_winnt</a></code>. For
      general Unix-type systems, there are several MPMs from which
      to choose. The choice of MPM can affect the speed and scalability
      of the httpd:</p>

      <ul>

        <li>The <code class="module"><a href="/mod/worker.html">worker</a></code> MPM uses multiple child
        processes with many threads each. Each thread handles
        one connection at a time. Worker generally is a good
        choice for high-traffic servers because it has a smaller
        memory footprint than the prefork MPM.</li>

        <li>The <code class="module"><a href="/mod/event.html">event</a></code> MPM is threaded like the
        Worker MPM, but is designed to allow more requests to be
        served simultaneously by passing off some processing work
        to supporting threads, freeing up the main threads to work
        on new requests.</li>

        <li>The <code class="module"><a href="/mod/prefork.html">prefork</a></code> MPM uses multiple child
        processes with one thread each.
        Each process handles
        one connection at a time. On many systems, prefork is
        comparable in speed to worker, but it uses more memory.
        Prefork's threadless design has advantages over worker
        in some situations: it can be used with non-thread-safe
        third-party modules, and it is easier to debug on platforms
        with poor thread debugging support.</li>

      </ul>

      <p>For more information on these and other MPMs, please
      see the MPM <a href="/mpm.html">documentation</a>.</p>



    <h3><a name="modules" id="modules">Modules</a></h3>



        <p>Since memory usage is such an important consideration in
        performance, you should attempt to eliminate modules that you are
        not actually using. If you have built the modules as <a href="/dso.html">DSOs</a>, eliminating modules is a simple
        matter of commenting out the associated <code class="directive"><a href="/mod/mod_so.html#loadmodule">LoadModule</a></code> directive for that module.
        This allows you to experiment with removing modules, and seeing
        if your site still functions in their absence.</p>

        <p>If, on the other hand, you have modules statically linked
        into your Apache binary, you will need to recompile Apache in
        order to remove unwanted modules.</p>

        <p>An associated question that arises here is which
        modules you need, and which ones you don't. The answer
        will vary from one web site to another. However, the
        <em>minimal</em> list of modules which you can get by with tends
        to include <code class="module"><a href="/mod/mod_mime.html">mod_mime</a></code>, <code class="module"><a href="/mod/mod_dir.html">mod_dir</a></code>,
        and <code class="module"><a href="/mod/mod_log_config.html">mod_log_config</a></code>. <code>mod_log_config</code> is,
        of course, optional, as you can run a web site without log
        files.
        This is, however, not recommended.</p>



    <h3>Atomic Operations</h3>



      <p>Some modules, such as <code class="module"><a href="/mod/mod_cache.html">mod_cache</a></code> and
      recent development builds of the worker MPM, use APR's
      atomic API. This API provides atomic operations that can
      be used for lightweight thread synchronization.</p>

      <p>By default, APR implements these operations using the
      most efficient mechanism available on each target
      OS/CPU platform. Many modern CPUs, for example, have
      an instruction that does an atomic compare-and-swap (CAS)
      operation in hardware. On some platforms, however, APR
      defaults to a slower, mutex-based implementation of the
      atomic API in order to ensure compatibility with older
      CPU models that lack such instructions. If you are
      building Apache for one of these platforms, and you plan
      to run only on newer CPUs, you can select a faster atomic
      implementation at build time by configuring Apache with
      the <code>--enable-nonportable-atomics</code> option:</p>

      <div class="example"><p><code>
        ./buildconf<br />
        ./configure --with-mpm=worker --enable-nonportable-atomics=yes
      </code></p></div>

      <p>The <code>--enable-nonportable-atomics</code> option is
      relevant for the following platforms:</p>

      <ul>

        <li>Solaris on SPARC<br />
            By default, APR uses mutex-based atomics on Solaris/SPARC.
            If you configure with <code>--enable-nonportable-atomics</code>,
            however, APR generates code that uses a SPARC v8plus opcode for
            fast hardware compare-and-swap. If you configure Apache with
            this option, the atomic operations will be more efficient
            (allowing for lower CPU utilization and higher concurrency),
            but the resulting executable will run only on UltraSPARC
            chips.
        </li>

        <li>Linux on x86<br />
            By default, APR uses mutex-based atomics on Linux.
            If you
            configure with <code>--enable-nonportable-atomics</code>,
            however, APR generates code that uses a 486 opcode for fast
            hardware compare-and-swap. This will result in more efficient
            atomic operations, but the resulting executable will run only
            on 486 and later chips (and not on 386).
        </li>

      </ul>



    <h3>mod_status and ExtendedStatus On</h3>



      <p>If you include <code class="module"><a href="/mod/mod_status.html">mod_status</a></code> and you also set
      <code>ExtendedStatus On</code> when building and running
      Apache, then on every request Apache will perform two calls to
      <code>gettimeofday(2)</code> (or <code>times(2)</code>
      depending on your operating system), and (pre-1.3) several
      extra calls to <code>time(2)</code>. This is all done so that
      the status report contains timing indications. For highest
      performance, set <code>ExtendedStatus off</code> (which is the
      default).</p>



    <h3>accept Serialization - multiple sockets</h3>



    <div class="warning"><h3>Warning:</h3>
      <p>This section has not been fully updated
      to take into account changes made in the 2.x version of the
      Apache HTTP Server. Some of the information may still be
      relevant, but please use it with care.</p>
    </div>

      <p>This discusses a shortcoming in the Unix socket API. Suppose
      your web server uses multiple <code class="directive"><a href="/mod/mpm_common.html#listen">Listen</a></code> statements to listen on either multiple
      ports or multiple addresses. In order to test each socket
      to see if a connection is ready Apache uses
      <code>select(2)</code>. <code>select(2)</code> indicates that a
      socket has <em>zero</em> or <em>at least one</em> connection
      waiting on it. Apache's model includes multiple children, and
      all the idle ones test for new connections at the same time.
      A
      naive implementation looks something like this (these examples
      do not match the code; they're contrived for pedagogical
      purposes):</p>

      <pre class="prettyprint lang-c">for (;;) {
    for (;;) {
        fd_set accept_fds;

        FD_ZERO (&amp;accept_fds);
        for (i = first_socket; i &lt;= last_socket; ++i) {
            FD_SET (i, &amp;accept_fds);
        }
        rc = select (last_socket+1, &amp;accept_fds, NULL, NULL, NULL);
        if (rc &lt; 1) continue;
        new_connection = -1;
        for (i = first_socket; i &lt;= last_socket; ++i) {
            if (FD_ISSET (i, &amp;accept_fds)) {
                new_connection = accept (i, NULL, NULL);
                if (new_connection != -1) break;
            }
        }
        if (new_connection != -1) break;
    }
    process_the(new_connection);
}</pre>


      <p>But this naive implementation has a serious starvation problem.
      Recall that multiple children execute this loop at the same
      time, and so multiple children will block at
      <code>select</code> when they are in between requests. All
      those blocked children will awaken and return from
      <code>select</code> when a single request appears on any socket
      (the number of children which awaken varies depending on the
      operating system and timing issues). They will all then fall
      down into the loop and try to <code>accept</code> the
      connection. But only one will succeed (assuming there's still
      only one connection ready); the rest will be <em>blocked</em>
      in <code>accept</code>. This effectively locks those children
      into serving requests from that one socket and no other
      sockets, and they'll be stuck there until enough new requests
      appear on that socket to wake them all up. This starvation
      problem was first documented in <a href="http://bugs.apache.org/index/full/467">PR#467</a>. There
      are at least two solutions.</p>

      <p>One solution is to make the sockets non-blocking.
      In this
      case the <code>accept</code> won't block the children, and they
      will be allowed to continue immediately. But this wastes CPU
      time. Suppose you have ten idle children in
      <code>select</code>, and one connection arrives. Then nine of
      those children will wake up, try to <code>accept</code> the
      connection, fail, and loop back into <code>select</code>,
      accomplishing nothing. Meanwhile none of those children are
      servicing requests that occurred on other sockets until they
      get back up to the <code>select</code> again. Overall this
      solution does not seem very fruitful unless you have as many
      idle CPUs (in a multiprocessor box) as you have idle children,
      which is not a very likely situation.</p>

      <p>Another solution, the one used by Apache, is to serialize
      entry into the inner loop. The loop looks like this
      (differences highlighted):</p>

      <pre class="prettyprint lang-c">for (;;) {
    <strong>accept_mutex_on ();</strong>
    for (;;) {
        fd_set accept_fds;

        FD_ZERO (&amp;accept_fds);
        for (i = first_socket; i &lt;= last_socket; ++i) {
            FD_SET (i, &amp;accept_fds);
        }
        rc = select (last_socket+1, &amp;accept_fds, NULL, NULL, NULL);
        if (rc &lt; 1) continue;
        new_connection = -1;
        for (i = first_socket; i &lt;= last_socket; ++i) {
            if (FD_ISSET (i, &amp;accept_fds)) {
                new_connection = accept (i, NULL, NULL);
                if (new_connection != -1) break;
            }
        }
        if (new_connection != -1) break;
    }
    <strong>accept_mutex_off ();</strong>
    process_the(new_connection);
}</pre>


      <p><a id="serialize" name="serialize">The functions</a>
      <code>accept_mutex_on</code> and <code>accept_mutex_off</code>
      implement a mutual exclusion semaphore. Only one child can have
      the mutex at any time. There are several choices for
      implementing these mutexes. The choice is defined in
      <code>src/conf.h</code> (pre-1.3) or
      <code>src/include/ap_config.h</code> (1.3 or later).
      Some
      architectures do not have any locking choice made; on these
      architectures it is unsafe to use multiple
      <code class="directive"><a href="/mod/mpm_common.html#listen">Listen</a></code>
      directives.</p>

      <p>The <code class="directive"><a href="/mod/core.html#mutex">Mutex</a></code> directive can
      be used to change the mutex implementation of the
      <code>mpm-accept</code> mutex at run-time. Special considerations
      for different mutex implementations are documented with that
      directive.</p>

      <p>Another solution that has been considered but never
      implemented is to partially serialize the loop -- that is, let
      in a certain number of processes. This would only be of
      interest on multiprocessor boxes where it's possible multiple
      children could run simultaneously, and the serialization
      actually doesn't take advantage of the full bandwidth. This is
      a possible area of future investigation, but priority remains
      low because highly parallel web servers are not the norm.</p>

      <p>Ideally you should run servers without multiple
      <code class="directive"><a href="/mod/mpm_common.html#listen">Listen</a></code>
      statements if you want the highest performance.
      But read on.</p>



    <h3>accept Serialization - single socket</h3>



      <p>The above is fine and dandy for multiple socket servers, but
      what about single socket servers? In theory they shouldn't
      experience any of these same problems because all children can
      just block in <code>accept(2)</code> until a connection
      arrives, and no starvation results. In practice this hides
      almost the same "spinning" behaviour discussed above in the
      non-blocking solution. The way that most TCP stacks are
      implemented, the kernel actually wakes up all processes blocked
      in <code>accept</code> when a single connection arrives.
    One of those processes gets the connection and returns to user-space;
    the rest spin in the kernel and go back to sleep when they
    discover there's no connection for them. This spinning is
    hidden from the user-land code, but it's there nonetheless.
    This can result in the same load-spiking wasteful behaviour
    that a non-blocking solution to the multiple sockets case
    can.</p>

    <p>For this reason we have found that many architectures behave
    more "nicely" if we serialize even the single socket case. So
    this is actually the default in almost all cases. Crude
    experiments under Linux (2.0.30 on a dual Pentium Pro 166
    with 128 MB RAM) have shown that the serialization of the single
    socket case causes less than a 3% decrease in requests per
    second over unserialized single-socket. But unserialized
    single-socket showed an extra 100 ms latency on each request.
    This latency is probably a wash on long haul lines, and only an
    issue on LANs. If you want to override the single socket
    serialization you can define
    <code>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</code> and then
    single-socket servers will not serialize at all.</p>



    <h3>Lingering Close</h3>



    <p>As discussed in <a href="http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-connection-00.txt">
    draft-ietf-http-connection-00.txt</a> section 8, in order for
    an HTTP server to <strong>reliably</strong> implement the
    protocol it needs to shut down each direction of the
    communication independently (recall that a TCP connection is
    bi-directional and each half is independent of the other).</p>

    <p>When this feature was added to Apache it caused a flurry of
    problems on various versions of Unix because of
    shortsightedness. The TCP specification does not state that the
    <code>FIN_WAIT_2</code> state has a timeout, but it doesn't prohibit it.
    On systems without the timeout, Apache 1.2 leaves many sockets
    stuck forever in the <code>FIN_WAIT_2</code> state. In many cases this
    can be avoided by simply upgrading to the latest TCP/IP patches
    supplied by the vendor. In cases where the vendor has never
    released patches (<em>e.g.</em>, SunOS4 -- although folks with
    a source license can patch it themselves) we have decided to
    disable this feature.</p>

    <p>There are two ways of implementing this feature. One is the socket
    option <code>SO_LINGER</code>. But as fate would have it, this
    has never been implemented properly in most TCP/IP stacks. Even
    on those stacks with a proper implementation (<em>e.g.</em>,
    Linux 2.0.31) this method proves to be more expensive (in CPU time)
    than the next solution.</p>

    <p>For the most part, Apache implements this in a function
    called <code>lingering_close</code> (in
    <code>http_main.c</code>). The function looks roughly like
    this:</p>

    <pre class="prettyprint lang-c">void lingering_close (int s)
{
    char junk_buffer[2048];

    /* shutdown the sending side */
    shutdown (s, 1);

    /* bound the total time spent lingering to 30 seconds */
    signal (SIGALRM, lingering_death);
    alarm (30);

    for (;;) {
        select (s for reading, 2 second timeout);
        if (error) break;
        if (s is ready for reading) {
            if (read (s, junk_buffer, sizeof (junk_buffer)) <= 0) {
                break;
            }
            /* just toss away whatever is here */
        }
    }

    close (s);
}</pre>


    <p>This naturally adds some expense at the end of a connection,
    but it is required for a reliable implementation. As HTTP/1.1
    becomes more prevalent, and all connections are persistent,
    this expense will be amortized over more requests. If you want
    to play with fire and disable this feature you can define
    <code>NO_LINGCLOSE</code>, but this is not recommended at all.
    In particular, as HTTP/1.1 pipelined persistent connections
    come into use, <code>lingering_close</code> is an absolute
    necessity (and <a href="http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html">
    pipelined connections are faster</a>, so you want to support
    them).</p>



    <h3>Scoreboard File</h3>



    <p>Apache's parent and children communicate with each other
    through something called the scoreboard. Ideally this should be
    implemented in shared memory. For those operating systems that
    we either have access to, or have been given detailed ports
    for, it typically is implemented using shared memory. The rest
    default to using an on-disk file. The on-disk file is not only
    slow but also unreliable (and has fewer features). Peruse the
    <code>src/main/conf.h</code> file for your architecture and
    look for either <code>USE_MMAP_SCOREBOARD</code> or
    <code>USE_SHMGET_SCOREBOARD</code>. Defining one of those two
    (as well as their companions <code>HAVE_MMAP</code> and
    <code>HAVE_SHMGET</code> respectively) enables the supplied
    shared memory code. If your system has another type of shared
    memory, edit the file <code>src/main/http_main.c</code> and add
    the hooks necessary to use it in Apache. (Please send us back a
    patch, too.)</p>

    <div class="note">Historical note: The Linux port of Apache didn't start to
    use shared memory until version 1.2 of Apache. This oversight
    resulted in really poor and unreliable behaviour of earlier
    versions of Apache on Linux.</div>



    <h3>DYNAMIC_MODULE_LIMIT</h3>



    <p>If you have no intention of using dynamically loaded modules
    (you probably don't if you're reading this and tuning your
    server for every last ounce of performance) then you should add
    <code>-DDYNAMIC_MODULE_LIMIT=0</code> when building your
    server.
    This will save RAM that's allocated only for supporting
    dynamically loaded modules.</p>



  </div><div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
<div class="section">
<h2><a name="trace" id="trace">Appendix: Detailed Analysis of a Trace</a></h2>



    <p>Here is a system call trace of Apache 2.0.38 with the worker MPM
    on Solaris 8. This trace was collected using:</p>

    <div class="example"><p><code>
    truss -l -p <var>httpd_child_pid</var>
    </code></p></div>

    <p>The <code>-l</code> option tells truss to log the ID of the
    LWP (lightweight process, Solaris' form of kernel-level thread)
    that invokes each system call.</p>

    <p>Other systems may have different system call tracing utilities
    such as <code>strace</code>, <code>ktrace</code>, or <code>par</code>.
    They all produce similar output.</p>

    <p>In this trace, a client has requested a 10KB static file
    from the httpd. Traces of non-static requests or requests
    with content negotiation look wildly different (and quite ugly
    in some cases).</p>

    <div class="example"><pre>/67: accept(3, 0x00200BEC, 0x00200C0C, 1) (sleeping...)
/67: accept(3, 0x00200BEC, 0x00200C0C, 1) = 9</pre></div>

    <p>In this trace, the listener thread is running within LWP #67.</p>

    <div class="note">Note the lack of <code>accept(2)</code> serialization. On this
    particular platform, the worker MPM uses an unserialized accept by
    default unless it is listening on multiple ports.</div>

    <div class="example"><pre>/65: lwp_park(0x00000000, 0) = 0
/67: lwp_unpark(65, 1) = 0</pre></div>

    <p>Upon accepting the connection, the listener thread wakes up
    a worker thread to do the request processing.
    In this trace,
    the worker thread that handles the request is mapped to LWP #65.</p>

    <div class="example"><pre>/65: getsockname(9, 0x00200BA4, 0x00200BC4, 1) = 0</pre></div>

    <p>In order to implement virtual hosts, Apache needs to know
    the local socket address used to accept the connection. It
    is possible to eliminate this call in many situations (such
    as when there are no virtual hosts, or when
    <code class="directive"><a href="/mod/mpm_common.html#listen">Listen</a></code> directives
    are used which do not have wildcard addresses). But
    no effort has yet been made to do these optimizations.</p>

    <div class="example"><pre>/65: brk(0x002170E8) = 0
/65: brk(0x002190E8) = 0</pre></div>

    <p>The <code>brk(2)</code> calls allocate memory from the heap.
    It is rare to see these in a system call trace, because the httpd
    uses custom memory allocators (<code>apr_pool</code> and
    <code>apr_bucket_alloc</code>) for most request processing.
    In this trace, the httpd has just been started, so it must
    call <code>malloc(3)</code> to get the blocks of raw memory
    with which to create the custom memory allocators.</p>

    <div class="example"><pre>/65: fcntl(9, F_GETFL, 0x00000000) = 2
/65: fstat64(9, 0xFAF7B818) = 0
/65: getsockopt(9, 65535, 8192, 0xFAF7B918, 0xFAF7B910, 2190656) = 0
/65: fstat64(9, 0xFAF7B818) = 0
/65: getsockopt(9, 65535, 8192, 0xFAF7B918, 0xFAF7B914, 2190656) = 0
/65: setsockopt(9, 65535, 8192, 0xFAF7B918, 4, 2190656) = 0
/65: fcntl(9, F_SETFL, 0x00000082) = 0</pre></div>

    <p>Next, the worker thread puts the connection to the client (file
    descriptor 9) in non-blocking mode. The <code>setsockopt(2)</code>
    and <code>getsockopt(2)</code> calls are a side-effect of how
    Solaris' libc handles <code>fcntl(2)</code> on sockets.</p>

    <div class="example"><pre>/65: read(9, " G E T / 1 0 k . h t m".., 8000) = 97</pre></div>

    <p>The worker thread reads the request from the client.</p>

    <div class="example"><pre>/65: stat("/var/httpd/apache/httpd-8999/htdocs/10k.html", 0xFAF7B978) = 0
/65: open("/var/httpd/apache/httpd-8999/htdocs/10k.html", O_RDONLY) = 10</pre></div>

    <p>This httpd has been configured with <code>Options FollowSymLinks</code>
    and <code>AllowOverride None</code>. Thus it doesn't need to
    <code>lstat(2)</code> each directory in the path leading up to the
    requested file, nor check for <code>.htaccess</code> files.
    It simply calls <code>stat(2)</code> to verify that the file
    1) exists and 2) is a regular file, not a directory.</p>

    <div class="example"><pre>/65: sendfilev(0, 9, 0x00200F90, 2, 0xFAF7B53C) = 10269</pre></div>

    <p>In this example, the httpd is able to send the HTTP response
    header and the requested file with a single <code>sendfilev(2)</code>
    system call. Sendfile semantics vary among operating systems. On some other
    systems, it is necessary to do a <code>write(2)</code> or
    <code>writev(2)</code> call to send the headers before calling
    <code>sendfile(2)</code>.</p>

    <div class="example"><pre>/65: write(4, " 1 2 7 . 0 . 0 . 1 - ".., 78) = 78</pre></div>

    <p>This <code>write(2)</code> call records the request in the
    access log. Note that one thing missing from this trace is a
    <code>time(2)</code> call. Unlike Apache 1.3, Apache 2.x uses
    <code>gettimeofday(3)</code> to look up the time.
    On some operating
    systems, like Linux or Solaris, <code>gettimeofday</code> has an
    optimized implementation that doesn't require as much overhead
    as a typical system call.</p>

    <div class="example"><pre>/65: shutdown(9, 1, 1) = 0
/65: poll(0xFAF7B980, 1, 2000) = 1
/65: read(9, 0xFAF7BC20, 512) = 0
/65: close(9) = 0</pre></div>

    <p>The worker thread does a lingering close of the connection.</p>

    <div class="example"><pre>/65: close(10) = 0
/65: lwp_park(0x00000000, 0) (sleeping...)</pre></div>

    <p>Finally the worker thread closes the file that it has just delivered
    and blocks until the listener assigns it another connection.</p>

    <div class="example"><pre>/67: accept(3, 0x001FEB74, 0x001FEB94, 1) (sleeping...)</pre></div>

    <p>Meanwhile, the listener thread is able to accept another connection
    as soon as it has dispatched this connection to a worker thread (subject
    to some flow-control logic in the worker MPM that throttles the listener
    if all the available workers are busy).
    Though it isn't apparent from
    this trace, the next <code>accept(2)</code> can (and usually does, under
    high load conditions) occur in parallel with the worker thread's handling
    of the just-accepted connection.</p>

  </div></div>
<div class="top"><a href="#page-header"><img src="/images/up.gif" alt="top" /></a></div><div id="footer">
<p class="apache">Copyright 2014 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
<p class="menu"><a href="/mod/">Modules</a> | <a href="/mod/directives.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="/glossary.html">Glossary</a> | <a href="/sitemap.html">Sitemap</a></p></div><script type="text/javascript"><!--//--><![CDATA[//><!--
if (typeof(prettyPrint) !== 'undefined') {
    prettyPrint();
}
//--><!]]></script>
</body></html>