<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><!--
        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
              This file is generated from xml source: DO NOT EDIT
        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
      -->
<title>mod_unique_id - Apache HTTP Server</title>
<link href="/style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />
<link href="/style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />
<link href="/style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" /><link rel="stylesheet" type="text/css" href="/style/css/prettify.css" />
<script src="/style/scripts/prettify.min.js" type="text/javascript">
</script>

<link href="/images/favicon.ico" rel="shortcut icon" /></head>
<body>
<div id="page-header">
<p class="menu"><a href="/mod/">Modules</a> | <a href="/mod/directives.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="/glossary.html">Glossary</a> | <a href="/sitemap.html">Sitemap</a></p>
<p class="apache">Apache HTTP Server Version 2.4</p>
<img alt="" src="/images/feather.gif" /></div>
<div class="up"><a href="./"><img title="&lt;-" alt="&lt;-" src="/images/left.gif" /></a></div>
<div id="path">
<a href="http://www.apache.org/">Apache</a> &gt; <a href="http://httpd.apache.org/">HTTP Server</a> &gt; <a href="http://httpd.apache.org/docs/">Documentation</a> &gt; <a href="../">Version 2.4</a> &gt; <a href="./">Modules</a></div>
<div id="page-content">
<div id="preamble"><h1>Apache Module mod_unique_id</h1>
<div class="toplang">
<p><span>Available Languages: </span><a href="/en/mod/mod_unique_id.html" title="English">&nbsp;en&nbsp;</a> |
<a href="/fr/mod/mod_unique_id.html" hreflang="fr" rel="alternate" title="Fran&ccedil;ais">&nbsp;fr&nbsp;</a> |
<a href="/ja/mod/mod_unique_id.html" hreflang="ja" rel="alternate" title="Japanese">&nbsp;ja&nbsp;</a> |
<a href="/ko/mod/mod_unique_id.html" hreflang="ko" rel="alternate" title="Korean">&nbsp;ko&nbsp;</a></p>
</div>
<table class="module"><tr><th><a href="module-dict.html#Description">Description:</a></th><td>Provides an environment variable with a unique
identifier for each request</td></tr>
<tr><th><a href="module-dict.html#Status">Status:</a></th><td>Extension</td></tr>
<tr><th><a href="module-dict.html#ModuleIdentifier">Module&nbsp;Identifier:</a></th><td>unique_id_module</td></tr>
<tr><th><a href="module-dict.html#SourceFile">Source&nbsp;File:</a></th><td>mod_unique_id.c</td></tr></table>
<h3>Summary</h3>


    <p>This module provides a magic token for each request which is
    guaranteed to be unique across "all" requests under very
    specific conditions. The identifier is unique even
    across multiple machines in a properly configured cluster of
    machines. The environment variable <code>UNIQUE_ID</code> is
    set to the identifier for each request. Unique identifiers are
    useful for various reasons which are beyond the scope of this
    document.</p>
</div>
<div id="quickview"><h3 class="directives">Directives</h3>
<p>This module provides no
            directives.</p>
<h3>Topics</h3>
<ul id="topics">
<li><img alt="" src="/images/down.gif" /> <a href="#theory">Theory</a></li>
</ul><ul class="seealso"><li><a href="#comments_section">Comments</a></li></ul></div>
<div class="top"><a href="#page-header"><img alt="top" src="/images/up.gif" /></a></div>
<div class="section">
<h2><a name="theory" id="theory">Theory</a></h2>


    <p>First a brief recap of how the Apache server works on Unix
    machines. (This feature currently isn't supported on Windows NT.)
    On Unix machines, Apache creates several children, and each child
    processes requests one at a time. Each child can serve multiple
    requests in its lifetime. For the purpose of this discussion,
    the children don't share any data with each other. We'll refer
    to the children as <dfn>httpd processes</dfn>.</p>

    <p>Your website has one or more machines under your
    administrative control; together we'll call them a cluster of
    machines. Each machine can run multiple instances of
    Apache. All of these collectively are considered "the
    universe", and with certain assumptions we'll show that in this
    universe we can generate unique identifiers for each request
    without extensive communication between machines in the
    cluster.</p>

    <p>The machines in your cluster should satisfy these
    requirements. (Even if you have only one machine you should
    synchronize its clock with NTP.)</p>

    <ul>
      <li>The machines' times are synchronized via NTP or another
      network time protocol.</li>

      <li>The machines' hostnames all differ, such that the module
      can look up each machine's hostname and receive a
      different IP address for each machine in the cluster.</li>
    </ul>

    <p>As far as operating system assumptions go, we assume that
    pids (process ids) fit in 32 bits. If the operating system uses
    more than 32 bits for a pid, the fix is trivial but must be
    made in the code.</p>

    <p>Given those assumptions, at a single point in time we can
    distinguish any httpd process on any machine in the cluster from
    all other httpd processes. The machine's IP address and the pid
    of the httpd process are sufficient to do this. An httpd process
    can handle multiple requests simultaneously if you use a
    multi-threaded MPM. To identify threads, we use a thread
    index that Apache httpd uses internally. So in order to
    generate unique identifiers for requests we need only
    distinguish between different points in time.</p>

    <p>To distinguish time we will use a Unix timestamp (seconds
    since January 1, 1970 UTC) and a 16-bit counter. The timestamp
    has only one-second granularity, so the counter is used to
    represent up to 65536 values during a single second. The
    quadruple <em>( ip_addr, pid, time_stamp, counter )</em> is
    sufficient to enumerate 65536 requests per second per httpd
    process. Pids, however, are reused over time, and
    the counter is also used to alleviate this issue.</p>

    <p>When an httpd child is created, the counter is initialized
    with ( current microseconds divided by 10 ) modulo 65536 (this
    formula was chosen to eliminate some variance problems with the
    low-order bits of the microsecond timers on some systems). When
    a unique identifier is generated, the time stamp used is the
    time the request arrived at the web server. The counter is
    incremented every time an identifier is generated (and allowed
    to roll over).</p>
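
    <p>The counter arithmetic described above can be sketched as follows.
    This is a minimal illustration in Python, not the module's C code: the
    function names are hypothetical, and it assumes that "current
    microseconds" means the microsecond part of the current time (0 to
    999999).</p>

```python
import time

COUNTER_MODULUS = 65536  # the counter is 16 bits wide


def initial_counter(microseconds: int) -> int:
    """Seed value: (microseconds divided by 10) modulo 65536."""
    return (microseconds // 10) % COUNTER_MODULUS


def next_counter(counter: int) -> int:
    """Advance the counter by one, rolling over after 65535."""
    return (counter + 1) % COUNTER_MODULUS


if __name__ == "__main__":
    # Assumption: the microsecond part of "now" stands in for the
    # moment at which the httpd child is created.
    usec = (time.time_ns() // 1000) % 1_000_000
    counter = initial_counter(usec)
    print(counter, next_counter(counter))
```

    <p>Dividing by 10 discards the lowest decimal digit of the microsecond
    value, which is the part the text describes as unreliable on some
    systems.</p>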

    <p>The kernel generates a pid for each process as it forks the
    process, and pids are allowed to roll over (they're 16 bits on
    many Unixes, but newer systems have expanded to 32 bits). So
    over time the same pid will be reused. However, unless it is
    reused within the same second, it does not destroy the
    uniqueness of our quadruple. That is, we assume the system does
    not spawn 65536 processes in a one-second interval (it may even
    be 32768 processes on some Unixes, but even this isn't likely
    to happen).</p>

    <p>Suppose that time repeats itself for some reason. That is,
    suppose that the system's clock is wrong and it revisits a
    past time (or it is too far forward, is reset correctly, and
    then revisits the future time). In this case we can easily show
    that we can get pid and time stamp reuse. The choice of
    initializer for the counter is intended to help defeat this.
    Note that we really want a random number to initialize the
    counter, but there aren't any readily available random numbers on most
    systems (<em>i.e.</em>, you can't use rand() because you need
    to seed the generator, and can't seed it with the time because
    time, at least at one-second resolution, has repeated itself).
    This is not a perfect defense.</p>

    <p>How good a defense is it? Suppose that one of your machines
    serves at most 500 requests per second (which is a very
    reasonable upper bound at this writing, because systems
    generally do more than just shovel out static files). To do
    that it will require a number of children which depends on how
    many concurrent clients you have. But we'll be pessimistic and
    suppose that a single child is able to serve 500 requests per
    second. There are 1000 possible starting counter values such
    that two sequences of 500 requests overlap. So there is a 1.5%
    chance (1000 out of 65536) that if time (at one-second resolution)
    repeats itself this child will repeat a counter value, and uniqueness
    will be broken. This was a very pessimistic example, and with real
    world values it's even less likely to occur. If your system is
    such that it's still likely to occur, then perhaps you should
    make the counter 32 bits (by editing the code).</p>
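
    <p>The estimate above can be reproduced with a few lines of Python.
    This is only a sketch of the arithmetic: the function name is
    hypothetical, and it follows the text in counting roughly twice the
    request rate as the number of overlapping starting values.</p>

```python
COUNTER_VALUES = 65536  # distinct values of a 16-bit counter


def overlap_probability(requests_per_second: int) -> float:
    """Approximate chance that two runs of counter values overlap.

    Follows the text: about 2 * n starting values make two sequences
    of n consecutive counter values share at least one value.
    """
    overlapping_starts = 2 * requests_per_second
    return overlapping_starts / COUNTER_VALUES


if __name__ == "__main__":
    print(f"{overlap_probability(500):.1%}")  # prints 1.5%
```

    <p>With a lower, more realistic per-child request rate the same
    function gives a proportionally smaller figure.</p>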

    <p>You may be concerned about the clock being "set back" when
    daylight saving time ends. However, this isn't an issue because
    the times used here are UTC, which "always" goes forward. Note
    that x86-based Unixes may need proper configuration for this to
    be true: they should be configured to assume that the
    motherboard clock is on UTC and compensate appropriately. But
    even so, if you're running NTP then your UTC time will be
    correct very shortly after reboot.</p>


    <p>The <code>UNIQUE_ID</code> environment variable is
    constructed by encoding the 144-bit (32-bit IP address, 32-bit
    pid, 32-bit time stamp, 16-bit counter, 32-bit thread index)
    tuple using the
    alphabet <code>[A-Za-z0-9@-]</code> in a manner similar to MIME
    base64 encoding, producing 24 characters. The MIME base64
    alphabet is actually <code>[A-Za-z0-9+/]</code>; however,
    <code>+</code> and <code>/</code> need to be specially encoded
    in URLs, which makes them less desirable. All values are
    encoded in network byte ordering so that the encoding is
    comparable across architectures of different byte ordering. The
    actual ordering of the encoding is: time stamp, IP address,
    pid, counter. This ordering has a purpose, but it should be
    emphasized that applications should not dissect the encoding.
    Applications should treat the entire encoded
    <code>UNIQUE_ID</code> as an opaque token, which can be
    compared against other <code>UNIQUE_ID</code>s for equality
    only.</p>
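
    <p>To make the size arithmetic concrete, the following Python sketch
    packs five fields of the stated widths in network byte order and
    encodes them with the alphabet given above. It illustrates why 144
    bits become 24 characters; it is not the module's implementation. The
    function name is hypothetical, placing the thread index last is an
    assumption (the text lists only the first four fields), and the output
    may not match the module's byte for byte.</p>

```python
import base64
import struct


def encode_unique_id(stamp: int, ip_addr: int, pid: int,
                     counter: int, thread_index: int) -> str:
    """Encode the fields as 24 characters from [A-Za-z0-9@-]."""
    # "!" selects network (big-endian) byte order.
    # I is an unsigned 32-bit field, H an unsigned 16-bit field.
    # Assumption: the thread index follows the counter.
    raw = struct.pack("!IIIHI", stamp, ip_addr, pid, counter, thread_index)
    # 18 bytes are 144 bits; base64 emits one character per 6 bits,
    # so the result is 24 characters with no padding.
    # altchars replaces "+" with "@" and "/" with "-".
    return base64.b64encode(raw, altchars=b"@-").decode("ascii")


if __name__ == "__main__":
    # Example values only: a time stamp, 127.0.0.1, a pid, a counter
    # value and a thread index.
    print(encode_unique_id(1400000000, 0x7F000001, 1234, 42, 7))
```

    <p>Because the time stamp comes first and is packed big-endian, the
    leading characters of the token change slowly over time. Applications
    should still treat the token as opaque.</p>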

    <p>The ordering was chosen such that it's possible to change
    the encoding in the future without worrying about collision
    with an existing database of <code>UNIQUE_ID</code>s. The new
    encodings should also keep the time stamp as the first element,
    and can otherwise use the same alphabet and bit length. Since
    the time stamps are essentially an increasing sequence, it's
    sufficient to have a <em>flag second</em> in which all machines
    in the cluster stop serving any request, and stop using the old
    encoding format. Afterwards they can resume serving requests and begin
    issuing the new encodings.</p>

    <p>We believe this is a relatively portable solution to this
    problem. The identifiers
    generated have essentially an infinite lifetime because future
    identifiers can be made longer as required. Essentially no
    communication is required between machines in the cluster (only
    NTP synchronization is required, which is low overhead), and no
    communication between httpd processes is required (the
    communication is implicit in the pid value assigned by the
    kernel). In very specific situations the identifier can be
    shortened, but more information needs to be assumed (for
    example the 32-bit IP address is overkill for any site, but
    there is no portable shorter replacement for it).</p>
</div>
</div>
<div class="top"><a href="#page-header"><img src="/images/up.gif" alt="top" /></a></div><div class="section"><h2><a id="comments_section" name="comments_section">Comments</a></h2><div class="warning"><strong>Notice:</strong><br />This is not a Q&amp;A section. Comments placed here should be pointed towards suggestions on improving the documentation or server, and may be removed again by our moderators if they are either implemented or considered invalid/off-topic. Questions on how to manage the Apache HTTP Server should be directed at either our IRC channel, #httpd, on Freenode, or sent to our <a href="http://httpd.apache.org/lists.html">mailing lists</a>.</div>
<script type="text/javascript"><!--//--><![CDATA[//><!--
var comments_shortname = 'httpd';
var comments_identifier = 'http://httpd.apache.org/docs/2.4/mod/mod_unique_id.html';
(function(w, d) {
    if (w.location.hostname.toLowerCase() == "httpd.apache.org") {
        d.write('<div id="comments_thread"><\/div>');
        var s = d.createElement('script');
        s.type = 'text/javascript';
        s.async = true;
        s.src = 'https://comments.apache.org/show_comments.lua?site=' + comments_shortname + '&page=' + comments_identifier;
        (d.getElementsByTagName('head')[0] || d.getElementsByTagName('body')[0]).appendChild(s);
    }
    else { 
        d.write('<div id="comments_thread">Comments are disabled for this page at the moment.<\/div>');
    }
})(window, document);
//--><!]]></script></div><div id="footer">
<p class="apache">Copyright 2014 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
<p class="menu"><a href="/mod/">Modules</a> | <a href="/mod/directives.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="/glossary.html">Glossary</a> | <a href="/sitemap.html">Sitemap</a></p></div><script type="text/javascript"><!--//--><![CDATA[//><!--
if (typeof(prettyPrint) !== 'undefined') {
    prettyPrint();
}
//--><!]]></script>
</body></html>