1<?xml version="1.0" encoding="ISO-8859-1"?>
2<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><!--
4        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
5              This file is generated from xml source: DO NOT EDIT
6        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
7      -->
8<title>Advanced Techniques with mod_rewrite - Apache HTTP Server</title>
9<link href="/style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />
10<link href="/style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />
11<link href="/style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" /><link rel="stylesheet" type="text/css" href="/style/css/prettify.css" />
12<script src="/style/scripts/prettify.js" type="text/javascript">
13</script>
14
15<link href="/images/favicon.ico" rel="shortcut icon" /></head>
16<body id="manual-page"><div id="page-header">
17<p class="menu"><a href="/mod/">Modules</a> | <a href="/mod/directives.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="/glossary.html">Glossary</a> | <a href="/sitemap.html">Sitemap</a></p>
18<p class="apache">Apache HTTP Server Version 2.2</p>
19<img alt="" src="/images/feather.gif" /></div>
20<div class="up"><a href="./"><img title="&lt;-" alt="&lt;-" src="/images/left.gif" /></a></div>
21<div id="path">
22<a href="http://www.apache.org/">Apache</a> &gt; <a href="http://httpd.apache.org/">HTTP Server</a> &gt; <a href="http://httpd.apache.org/docs/">Documentation</a> &gt; <a href="../">Version 2.2</a> &gt; <a href="./">Rewrite</a></div><div id="page-content"><div id="preamble"><h1>Advanced Techniques with mod_rewrite</h1>
23<div class="toplang">
24<p><span>Available Languages: </span><a href="/en/rewrite/avoid.html" title="English">&nbsp;en&nbsp;</a></p>
25</div>
26
27
28<p>This document supplements the <code class="module"><a href="/mod/mod_rewrite.html">mod_rewrite</a></code>
29<a href="/mod/mod_rewrite.html">reference documentation</a>. It provides
30a few advanced techniques and tricks using mod_rewrite.</p>
31
32<div class="warning">Note that many of these examples won't work unchanged in your
33particular server configuration, so it's important that you understand
34them, rather than merely cutting and pasting the examples into your
35configuration.</div>
36
37</div>
<div id="quickview"><ul id="toc"><li><img alt="" src="/images/down.gif" /> <a href="#sharding">URL-based sharding across multiple backends</a></li>
39<li><img alt="" src="/images/down.gif" /> <a href="#on-the-fly-content">On-the-fly Content-Regeneration</a></li>
40<li><img alt="" src="/images/down.gif" /> <a href="#load-balancing">Load Balancing</a></li>
41<li><img alt="" src="/images/down.gif" /> <a href="#autorefresh">Document With Autorefresh</a></li>
42<li><img alt="" src="/images/down.gif" /> <a href="#structuredhomedirs">Structured Userdirs</a></li>
43<li><img alt="" src="/images/down.gif" /> <a href="#redirectanchors">Redirecting Anchors</a></li>
44<li><img alt="" src="/images/down.gif" /> <a href="#time-dependent">Time-Dependent Rewriting</a></li>
45<li><img alt="" src="/images/down.gif" /> <a href="#setenvvars">Set Environment Variables Based On URL Parts</a></li>
46</ul><h3>See also</h3><ul class="seealso"><li><a href="/mod/mod_rewrite.html">Module documentation</a></li><li><a href="intro.html">mod_rewrite introduction</a></li><li><a href="remapping.html">Redirection and remapping</a></li><li><a href="access.html">Controlling access</a></li><li><a href="vhosts.html">Virtual hosts</a></li><li><a href="proxy.html">Proxying</a></li><li><a href="rewritemap.html">Using RewriteMap</a></li><li><a href="avoid.html">When not to use mod_rewrite</a></li></ul><ul class="seealso"><li><a href="#comments_section">Comments</a></li></ul></div>
47<div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
48<div class="section">
<h2><a name="sharding" id="sharding">URL-based sharding across multiple backends</a></h2>
50
51  
52
53  <dl>
54    <dt>Description:</dt>
55
56    <dd>
      <p>A common technique for distributing the burden of
      server load or storage space is called "sharding".
      When using this method, a front-end server uses the
      URL to consistently "shard" users or objects to separate
      backend servers.</p>
62    </dd>
63
64    <dt>Solution:</dt>
65
66    <dd>
      <p>A mapping from users to target servers is maintained in an
      external map file. It looks like this:</p>
69
70<div class="example"><p><code>
71user1  physical_host_of_user1<br />
72user2  physical_host_of_user2<br />
73:      :
74</code></p></div>
75
  <p>We put this into a <code>map.users-to-hosts</code> file. The
    aim is to map:</p>
78
79<div class="example"><p><code>
80/u/user1/anypath
81</code></p></div>
82
83  <p>to</p>
84
85<div class="example"><p><code>
http://physical_host_of_user1/u/user1/anypath
87</code></p></div>
88
      <p>so that not every URL path needs to be valid on every backend
      physical host. The following ruleset does this for us with the help
      of the map file, assuming that server0 is a default server which
      will be used if a user has no entry in the map:</p>
93
94<div class="example"><p><code>
95RewriteEngine on<br />
96<br />
97RewriteMap      users-to-hosts   txt:/path/to/map.users-to-hosts<br />
98<br />
99RewriteRule   ^/u/<strong>([^/]+)</strong>/?(.*)   http://<strong>${users-to-hosts:$1|server0}</strong>/u/$1/$2
100</code></p></div>
101    </dd>
102  </dl>
103
104  <p>See the <code class="directive"><a href="/mod/mod_rewrite.html#rewritemap">RewriteMap</a></code>
105  documentation for more discussion of the syntax of this directive.</p>
106
107</div><div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
108<div class="section">
109<h2><a name="on-the-fly-content" id="on-the-fly-content">On-the-fly Content-Regeneration</a></h2>
110
111  
112
113  <dl>
114    <dt>Description:</dt>
115
116    <dd>
117      <p>We wish to dynamically generate content, but store it
118      statically once it is generated. This rule will check for the
119      existence of the static file, and if it's not there, generate
120      it. The static files can be removed periodically, if desired (say,
121      via cron) and will be regenerated on demand.</p>
122    </dd>
123
124    <dt>Solution:</dt>
125
126    <dd>
      <p>This is done via the following ruleset:</p>
128
129<pre class="prettyprint lang-config">
# This example is valid in per-directory context only
RewriteCond %{REQUEST_URI}   !-U
RewriteRule ^(.+)\.html$          /regenerate_page.cgi   [PT,L]
133</pre>
134
135
136    <p>The <code>-U</code> operator determines whether the test string
137    (in this case, <code>REQUEST_URI</code>) is a valid URL. It does
138    this via a subrequest. In the event that this subrequest fails -
139    that is, the requested resource doesn't exist - this rule invokes
140    the CGI program <code>/regenerate_page.cgi</code>, which generates
141    the requested resource and saves it into the document directory, so
142    that the next time it is requested, a static copy can be served.</p>
143
    <p>In this way, documents that are infrequently updated can be served in
    static form. If documents need to be refreshed, they can be deleted
    from the document directory, and they will then be regenerated the
    next time they are requested.</p>
148    </dd>
149  </dl>
150
151</div><div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
152<div class="section">
153<h2><a name="load-balancing" id="load-balancing">Load Balancing</a></h2>
154
155  
156
157  <dl>
158    <dt>Description:</dt>
159
160    <dd>
161      <p>We wish to randomly distribute load across several servers
162      using mod_rewrite.</p>
163    </dd>
164
165    <dt>Solution:</dt>
166
167    <dd>
168      <p>We'll use <code class="directive"><a href="/mod/mod_rewrite.html#rewritemap">RewriteMap</a></code> and a list of servers
169      to accomplish this.</p>
170
171<div class="example"><p><code>
172RewriteEngine on<br />
173RewriteMap lb rnd:/path/to/serverlist.txt<br />
174<br />
175RewriteRule ^/(.*) http://${lb:servers}/$1 [P,L]
176</code></p></div>
177
178<p><code>serverlist.txt</code> will contain a list of the servers:</p>
179
180<div class="example"><p><code>
181## serverlist.txt<br />
182<br />
183servers one.example.com|two.example.com|three.example.com<br />
184</code></p></div>
185
186<p>If you want one particular server to get more of the load than the
187others, add it more times to the list.</p>
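
<p>As a rough sketch of that weighting, assuming the same three example
hosts as above, listing <code>one.example.com</code> twice should give it
about two of every four requests on average:</p>

<pre class="prettyprint lang-config">
## serverlist.txt (illustrative weighting; the host names are assumptions)

servers one.example.com|one.example.com|two.example.com|three.example.com
</pre>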
188
189   </dd>
190
   <dt>Discussion:</dt>
192   <dd>
193<p>Apache comes with a load-balancing module -
194<code class="module"><a href="/mod/mod_proxy_balancer.html">mod_proxy_balancer</a></code> - which is far more flexible and
195featureful than anything you can cobble together using mod_rewrite.</p>
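
<p>For comparison, a minimal sketch of a similar three-server setup using
mod_proxy_balancer might look like the following. The balancer name
<code>mycluster</code> and the <code>loadfactor</code> value are
illustrative assumptions, and mod_proxy, mod_proxy_http and
mod_proxy_balancer are assumed to be loaded:</p>

<pre class="prettyprint lang-config">
# Hypothetical balancer name; replace the members with your own backends
&lt;Proxy balancer://mycluster&gt;
    BalancerMember http://one.example.com
    # loadfactor=2 gives this member about twice the share of requests
    BalancerMember http://two.example.com loadfactor=2
    BalancerMember http://three.example.com
&lt;/Proxy&gt;

ProxyPass / balancer://mycluster/
</pre>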
196   </dd>
197  </dl>
198
199</div><div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
200<div class="section">
201<h2><a name="autorefresh" id="autorefresh">Document With Autorefresh</a></h2>
202
203  
204
205  <dl>
206    <dt>Description:</dt>
207
208    <dd>
209      <p>Wouldn't it be nice, while creating a complex web page, if
210      the web browser would automatically refresh the page every
211      time we save a new version from within our editor?
212      Impossible?</p>
213    </dd>
214
215    <dt>Solution:</dt>
216
217    <dd>
218      <p>No! We just combine the MIME multipart feature, the
219      web server NPH feature, and the URL manipulation power of
220      <code class="module"><a href="/mod/mod_rewrite.html">mod_rewrite</a></code>. First, we establish a new
221      URL feature: Adding just <code>:refresh</code> to any
222      URL causes the 'page' to be refreshed every time it is
223      updated on the filesystem.</p>
224
225<div class="example"><p><code>
226RewriteRule   ^(/[uge]/[^/]+/?.*):refresh  /internal/cgi/apache/nph-refresh?f=$1
227</code></p></div>
228
229      <p>Now when we reference the URL</p>
230
231<div class="example"><p><code>
232/u/foo/bar/page.html:refresh
233</code></p></div>
234
235      <p>this leads to the internal invocation of the URL</p>
236
237<div class="example"><p><code>
238/internal/cgi/apache/nph-refresh?f=/u/foo/bar/page.html
239</code></p></div>
240
241      <p>The only missing part is the NPH-CGI script. Although
242      one would usually say "left as an exercise to the reader"
243      ;-) I will provide this, too.</p>
244
245<div class="example"><pre>
#!/sw/bin/perl
##
##  nph-refresh -- NPH/CGI script for auto refreshing pages
##  Copyright (c) 1997 Ralf S. Engelschall, All Rights Reserved.
##
$| = 1;

#   split the QUERY_STRING variable
#   caution: eval on request data is unsafe outside a trusted
#   development setup
@pairs = split( /&amp;/, $ENV{'QUERY_STRING'} );
foreach $pair (@pairs) {
    ( $name, $value ) = split( /=/, $pair );
    $name =~ tr/A-Z/a-z/;
    $name = 'QS_' . $name;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    eval "\$$name = \"$value\"";
}
$QS_s = 1    if ( $QS_s eq '' );
$QS_n = 3600 if ( $QS_n eq '' );
if ( $QS_f eq '' ) {
    print "HTTP/1.0 200 OK\n";
    print "Content-type: text/html\n\n";
    print "&lt;b&gt;ERROR&lt;/b&gt;: No file given\n";
    exit(0);
}
if ( !-f $QS_f ) {
    print "HTTP/1.0 200 OK\n";
    print "Content-type: text/html\n\n";
    print "&lt;b&gt;ERROR&lt;/b&gt;: File $QS_f not found\n";
    exit(0);
}

sub print_http_headers_multipart_begin {
    print "HTTP/1.0 200 OK\n";
    $bound = "ThisRandomString12345";
    print "Content-type: multipart/x-mixed-replace;boundary=$bound\n";
    &amp;print_http_headers_multipart_next;
}

sub print_http_headers_multipart_next {
    print "\n--$bound\n";
}

sub print_http_headers_multipart_end {
    print "\n--$bound--\n";
}

sub displayhtml {
    local ($buffer) = @_;
    $len = length($buffer);
    print "Content-type: text/html\n";
    print "Content-length: $len\n\n";
    print $buffer;
}

sub readfile {
    local ($file) = @_;
    local ( *FP, $size, $buffer, $bytes );
    ( $x, $x, $x, $x, $x, $x, $x, $size ) = stat($file);
    $size = sprintf( "%d", $size );
    open(FP, "&lt;$file");
    $bytes = sysread( FP, $buffer, $size );
    close(FP);
    return $buffer;
}

$buffer = &amp;readfile($QS_f);
&amp;print_http_headers_multipart_begin;
&amp;displayhtml($buffer);

#   return the modification time of a file
sub mystat {
    local ($file) = $_[0];
    local ($mtime);

    ( $x, $x, $x, $x, $x, $x, $x, $x, $x, $mtime ) = stat($file);
    return $mtime;
}

$mtimeL = &amp;mystat($QS_f);
$mtime  = $mtimeL;
for ( $n = 0 ; $n &lt; $QS_n ; $n++ ) {
    while (1) {
        $mtime = &amp;mystat($QS_f);
        if ( $mtime ne $mtimeL ) {
            $mtimeL = $mtime;
            sleep(2);
            $buffer = &amp;readfile($QS_f);
            &amp;print_http_headers_multipart_next;
            &amp;displayhtml($buffer);
            sleep(5);
            $mtimeL = &amp;mystat($QS_f);
            last;
        }
        sleep($QS_s);
    }
}

&amp;print_http_headers_multipart_end;

exit(0);

##EOF##
347</pre></div>
348    </dd>
349  </dl>
350
351</div><div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
352<div class="section">
353<h2><a name="structuredhomedirs" id="structuredhomedirs">Structured Userdirs</a></h2>
354
355  
356
357  <dl>
358    <dt>Description:</dt>
359
360    <dd>
361      <p>Some sites with thousands of users use a
362      structured homedir layout, <em>i.e.</em> each homedir is in a
363      subdirectory which begins (for instance) with the first
364      character of the username. So, <code>/~larry/anypath</code>
365      is <code>/home/<strong>l</strong>/larry/public_html/anypath</code>
366      while <code>/~waldo/anypath</code> is
367      <code>/home/<strong>w</strong>/waldo/public_html/anypath</code>.</p>
368    </dd>
369
370    <dt>Solution:</dt>
371
372    <dd>
373      <p>We use the following ruleset to expand the tilde URLs
374      into the above layout.</p>
375
376<div class="example"><p><code>
377RewriteEngine on<br />
378RewriteRule   ^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*)  /home/<strong>$2</strong>/$1/public_html$3
379</code></p></div>
380    </dd>
381  </dl>
382
383</div><div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
384<div class="section">
385<h2><a name="redirectanchors" id="redirectanchors">Redirecting Anchors</a></h2>
386
387  
388
389  <dl>
390    <dt>Description:</dt>
391
392    <dd>
393    <p>By default, redirecting to an HTML anchor doesn't work,
394    because mod_rewrite escapes the <code>#</code> character,
395    turning it into <code>%23</code>. This, in turn, breaks the
396    redirection.</p>
397    </dd>
398
399    <dt>Solution:</dt>
400
401    <dd>
402      <p>Use the <code>[NE]</code> flag on the
403      <code>RewriteRule</code>. NE stands for No Escape.
404      </p>
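
      <p>A minimal sketch, assuming a hypothetical old page that should
      redirect to a section of a new one (the paths and the anchor name
      are illustrative only):</p>

<pre class="prettyprint lang-config">
# Without NE, the client would be sent to /new.html%23section2
RewriteEngine on
RewriteRule ^/old\.html$ /new.html#section2 [NE,R]
</pre>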
405    </dd>
406
407    <dt>Discussion:</dt>
408    <dd>This technique will of course also work with other
409    special characters that mod_rewrite, by default, URL-encodes.</dd>
410  </dl>
411
412</div><div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
413<div class="section">
414<h2><a name="time-dependent" id="time-dependent">Time-Dependent Rewriting</a></h2>
415
416  
417
418  <dl>
419    <dt>Description:</dt>
420
421    <dd>
422      <p>We wish to use mod_rewrite to serve different content based on
423      the time of day.</p>
424    </dd>
425
426    <dt>Solution:</dt>
427
428    <dd>
429      <p>There are a lot of variables named <code>TIME_xxx</code>
430      for rewrite conditions. In conjunction with the special
431      lexicographic comparison patterns <code>&lt;STRING</code>,
432      <code>&gt;STRING</code> and <code>=STRING</code> we can
433      do time-dependent redirects:</p>
434
435<div class="example"><p><code>
436RewriteEngine on<br />
437RewriteCond   %{TIME_HOUR}%{TIME_MIN} &gt;0700<br />
438RewriteCond   %{TIME_HOUR}%{TIME_MIN} &lt;1900<br />
439RewriteRule   ^foo\.html$             foo.day.html [L]<br />
440RewriteRule   ^foo\.html$             foo.night.html
441</code></p></div>
442
443      <p>This provides the content of <code>foo.day.html</code>
444      under the URL <code>foo.html</code> from
445      <code>07:01-18:59</code> and at the remaining time the
446      contents of <code>foo.night.html</code>.</p>
447
      <div class="warning"><code class="module"><a href="/mod/mod_cache.html">mod_cache</a></code>, intermediate proxies
      and browsers may each cache responses and cause either page to be
      shown outside of the configured time window.
      <code class="module"><a href="/mod/mod_expires.html">mod_expires</a></code> may be used to control this
      effect. You are, of course, much better off simply serving the
      content dynamically, and customizing it based on the time of day.</div>
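
      <p>As a hedged sketch of limiting that caching with mod_expires,
      the following assumes the two file names from the example above
      and an arbitrarily chosen one-minute lifetime. It shortens how long
      a cached copy may be reused, but does not remove the effect
      entirely:</p>

<pre class="prettyprint lang-config">
# Requires mod_expires; the one-minute lifetime is an assumption
&lt;FilesMatch "^foo\.(day|night)\.html$"&gt;
    ExpiresActive On
    ExpiresDefault "access plus 1 minute"
&lt;/FilesMatch&gt;
</pre>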
454
455    </dd>
456  </dl>
457
458</div><div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
459<div class="section">
460<h2><a name="setenvvars" id="setenvvars">Set Environment Variables Based On URL Parts</a></h2>
461
462  
463
464  <dl>
465    <dt>Description:</dt>
466
467    <dd>
      <p>At times, we want to maintain some kind of status when we
      perform a rewrite. For example, we may want to make a note that
      we've done that rewrite, so that we can check later to see if a
      request came via that rewrite. One way to do this is by setting an
      environment variable.</p>
473    </dd>
474
475    <dt>Solution:</dt>
476
477    <dd>
478      <p>Use the [E] flag to set an environment variable.</p>
479
480<div class="example"><p><code>
481RewriteEngine on<br />
482RewriteRule   ^/horse/(.*)   /pony/$1 [E=<strong>rewritten:1</strong>]
483</code></p></div>
484
485    <p>Later in your ruleset you might check for this environment
486    variable using a RewriteCond:</p>
487
488<div class="example"><p><code>
489RewriteCond %{ENV:rewritten} =1
490</code></p></div>
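
    <p>Put together, a sketch of the two pieces in server context might
    look like this. The <code>/pony/legacy.html</code> and
    <code>/pony/moved.html</code> paths are hypothetical and only serve
    to show a rule that applies solely to requests that arrived via
    <code>/horse/</code>:</p>

<pre class="prettyprint lang-config">
RewriteEngine on
RewriteRule   ^/horse/(.*)   /pony/$1 [E=rewritten:1]

# Only requests that were rewritten by the rule above match this condition
RewriteCond   %{ENV:rewritten} =1
RewriteRule   ^/pony/legacy\.html$   /pony/moved.html [L]
</pre>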
491
492    </dd>
493  </dl>
494
495</div></div>
496<div class="bottomlang">
497<p><span>Available Languages: </span><a href="/en/rewrite/avoid.html" title="English">&nbsp;en&nbsp;</a></p>
498</div><div class="top"><a href="#page-header"><img src="/images/up.gif" alt="top" /></a></div><div class="section"><h2><a id="comments_section" name="comments_section">Comments</a></h2><div class="warning"><strong>Notice:</strong><br />This is not a Q&amp;A section. Comments placed here should be pointed towards suggestions on improving the documentation or server, and may be removed again by our moderators if they are either implemented or considered invalid/off-topic. Questions on how to manage the Apache HTTP Server should be directed at either our IRC channel, #httpd, on Freenode, or sent to our <a href="http://httpd.apache.org/lists.html">mailing lists</a>.</div>
499<script type="text/javascript"><!--//--><![CDATA[//><!--
500var comments_shortname = 'httpd';
501var comments_identifier = 'http://httpd.apache.org/docs/2.2/rewrite/avoid.html';
502(function(w, d) {
503    if (w.location.hostname.toLowerCase() == "httpd.apache.org") {
504        d.write('<div id="comments_thread"><\/div>');
505        var s = d.createElement('script');
506        s.type = 'text/javascript';
507        s.async = true;
508        s.src = 'https://comments.apache.org/show_comments.lua?site=' + comments_shortname + '&page=' + comments_identifier;
509        (d.getElementsByTagName('head')[0] || d.getElementsByTagName('body')[0]).appendChild(s);
510    }
511    else { 
512        d.write('<div id="comments_thread">Comments are disabled for this page at the moment.<\/div>');
513    }
514})(window, document);
515//--><!]]></script></div><div id="footer">
516<p class="apache">Copyright 2013 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
517<p class="menu"><a href="/mod/">Modules</a> | <a href="/mod/directives.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="/glossary.html">Glossary</a> | <a href="/sitemap.html">Sitemap</a></p></div><script type="text/javascript"><!--//--><![CDATA[//><!--
518if (typeof(prettyPrint) !== 'undefined') {
519    prettyPrint();
520}
521//--><!]]></script>
522</body></html>