<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
 "http://www.w3.org/TR/REC-html40/loose.dtd">
<HTML>
 <HEAD>
  <TITLE>Common Gateway Interface - 1.1 *Draft 03* [http://cgi-spec.golux.com/draft-coar-cgi-v11-03-clean.html]
  </TITLE>
<!--#if expr="$HTTP_USER_AGENT != /Lynx/" -->
 <!--#set var="GUI" value="1" -->
<!--#endif -->
  <LINK HREF="mailto:Ken.Coar@Golux.Com" rev="revised">
  <LINK REL="STYLESHEET" HREF="cgip-style-rfc.css" TYPE="text/css">
  <META name="latexstyle" content="rfc">
  <META name="author" content="Ken A L Coar">
  <META name="institute" content="IBM Corporation">
  <META name="date" content="25 June 1999">
  <META name="expires" content="Expires 31 December 1999">
  <META name="document" content="INTERNET-DRAFT">
  <META name="file" content="<draft-coar-cgi-v11-03.txt>">
  <META name="group" content="INTERNET-DRAFT">
<!--
  There are a lot of BNF fragments in this document. To make it work
  in all possible browsers (including Lynx, which is used to turn it
  into text/plain), we handle these by using PREformatted blocks with
  a universal internal margin of 2, inside one-level DL blocks.
  -->
 </HEAD>
 <BODY>
  <!--
  HTML doesn't do paper pagination, so we need to fake it out. Basing
  our formatting upon RFC2068, there are four (4) lines of header and
  four (4) lines of footer for each page.

<DIV ALIGN="CENTER">
 <PRE>




Coar, et al.            CGI/1.1 Specification                 May, 1998
INTERNET-DRAFT         Expires 1 December 1998                 [Page 2]


 </PRE>
</DIV>
  -->
  <!--
  The following weirdness wrt non-breaking spaces is to get Lynx
  (which is barely TABLE-aware) to line the left/right justified
  text up properly.
  -->
  <DIV ALIGN="CENTER">
   <TABLE WIDTH="100%" CELLPADDING=0 CELLSPACING=0>
    <TR VALIGN="TOP">
     <TD ALIGN="LEFT">
      INTERNET-DRAFT
     </TD>
     <TD ALIGN="RIGHT">
      Ken A L Coar
     </TD>
    </TR>
    <TR VALIGN="TOP">
     <TD ALIGN="LEFT">
      draft-coar-cgi-v11-03.{html,txt}
     </TD>
     <TD ALIGN="RIGHT">
      IBM Corporation
     </TD>
    </TR>
    <TR VALIGN="TOP">
     <TD ALIGN="LEFT">

     </TD>
     <TD ALIGN="RIGHT">
      D.R.T. Robinson
     </TD>
    </TR>
    <TR VALIGN="TOP">
     <TD ALIGN="LEFT">

     </TD>
     <TD ALIGN="RIGHT">
      E*TRADE UK Ltd.
     </TD>
    </TR>
    <TR VALIGN="TOP">
     <TD ALIGN="LEFT">

     </TD>
     <TD ALIGN="RIGHT">
      25 June 1999
     </TD>
    </TR>
   </TABLE>
  </DIV>

  <H1 ALIGN="CENTER">
   The WWW Common Gateway Interface
   <BR>
   Version 1.1
  </H1>

<!--#include virtual="I-D-statement" -->

  <H2>
   <A NAME="Abstract">
    Abstract
   </A>
  </H2>
  <P>
  The Common Gateway Interface (CGI) is a simple interface for running
  external programs, software or gateways under an information server
  in a platform-independent manner. Currently, the supported information
  servers are HTTP servers.
  </P>
  <P>
  The interface has been in use by the World-Wide Web since 1993. This
  specification defines the
  "current practice" parameters of the
  'CGI/1.1' interface developed and documented at the U.S. National
  Center for Supercomputing Applications [NCSA-CGI].
  This document also defines the use of the CGI/1.1 interface
  on the Unix and AmigaDOS(tm) systems.
  </P>
  <P>
  Discussion of this draft occurs on the CGI-WG mailing list; see the
  project Web page at
  <SAMP>&lt;URL:<A HREF="http://CGI-Spec.Golux.Com/"
  >http://CGI-Spec.Golux.Com/</A>&gt;</SAMP>
  for details on the mailing list and the status of the project.
  </P>

<!--#if expr="$GUI" -->
  <H2>
   Revision History
  </H2>
  <P>
  The revision history of this draft is being maintained using Web-based
  GUI notation, such as struck-through characters and colour-coded
  sections. The following legend describes how to determine the origin
  of a particular revision according to the colour of the text:
  </P>
  <DL COMPACT>
   <DT>Black
   </DT>
   <DD>Revision 00, released 28 May 1998
   </DD>
   <DT>Green
   </DT>
   <DD>Revision 01, released 28 December 1998
    <BR>
    Major structure change: Section 4, "Request Metadata (Meta-Variables)"
    was moved entirely under <A HREF="#7.0">Section 7</A>, "Data Input to the
    CGI Script."
    Due to the size of this change, it is noted here and the text in its
    former location does <EM>not</EM> appear as struckthrough. This has
    caused major <A HREF="#6.0">sections 5</A> and following to decrement
    by one. Other
    large text movements are likewise not marked up. References to RFC
    1738 were changed to 2396 (1738's replacement).
   </DD>
   <DT>Red
   </DT>
   <DD>Revision 02, released 2 April, 1999
    <BR>
    Added text to <A HREF="#8.3">section 8.3</A> defining correct handling
    of HTTP/1.1
    requests using "chunked" Transfer-Encoding. Labelled metavariable
    names in <A HREF="#8.0">section 8</A> with the appropriate detail section
    numbers.
    Clarified allowed usage of <SAMP>Status</SAMP> and
    <SAMP>Location</SAMP> response header fields. Included new
    Internet-Draft language.
   </DD>
   <DT>Fuchsia
   </DT>
   <DD>Revision 03, released 25 June 1999
    <BR>
    Changed references from "HTTP" to "Protocol-Specific" for the listing of
    things like HTTP_ACCEPT. Changed 'entity-body' and 'content-body' to
    'message-body.' Added a note that response headers must comply with
    requirements of the protocol level in use. Added a lot of stuff about
    security (section 11).
    Clarified a bunch of productions. Pointed out
    that zero-length and omitted values are indistinguishable in this
    specification. Clarified production describing order of fields in
    script response header. Clarified issues surrounding encoding of
    data. Acknowledged additional contributors, and changed one of
    the authors' addresses.
   </DD>
  </DL>
<!--#endif -->

  <H2>
   <A NAME="Contents">
    Table of Contents
   </A>
  </H2>
  <DIV ALIGN="CENTER">
   <PRE>
 1 Introduction..............................................<A
     HREF="#1.0"
    >TBD</A>
  1.1 Purpose................................................<A
     HREF="#1.1"
    >TBD</A>
  1.2 Requirements...........................................<A
     HREF="#1.2"
    >TBD</A>
  1.3 Specifications.........................................<A
     HREF="#1.3"
    >TBD</A>
  1.4 Terminology............................................<A
     HREF="#1.4"
    >TBD</A>
 2 Notational Conventions and Generic Grammar................<A
     HREF="#2.0"
    >TBD</A>
  2.1 Augmented BNF..........................................<A
     HREF="#2.1"
    >TBD</A>
  2.2 Basic Rules............................................<A
     HREF="#2.2"
    >TBD</A>
 3 Protocol Parameters.......................................<A
     HREF="#3.0"
    >TBD</A>
  3.1 URL Encoding...........................................<A
     HREF="#3.1"
    >TBD</A>
  3.2 The Script-URI.........................................<A
     HREF="#3.2"
    >TBD</A>
 4 Invoking the Script.......................................<A
     HREF="#4.0"
    >TBD</A>
 5 The CGI Script Command Line...............................<A
     HREF="#5.0"
    >TBD</A>
 6 Data Input to the CGI Script..............................<A
     HREF="#6.0"
    >TBD</A>
  6.1 Request Metadata (Metavariables).......................<A
     HREF="#6.1"
    >TBD</A>
   6.1.1 AUTH_TYPE...........................................<A
     HREF="#6.1.1"
    >TBD</A>
   6.1.2 CONTENT_LENGTH......................................<A
     HREF="#6.1.2"
    >TBD</A>
   6.1.3 CONTENT_TYPE........................................<A
     HREF="#6.1.3"
    >TBD</A>
   6.1.4 GATEWAY_INTERFACE...................................<A
     HREF="#6.1.4"
    >TBD</A>
   6.1.5 Protocol-Specific Metavariables.....................<A
     HREF="#6.1.5"
    >TBD</A>
   6.1.6 PATH_INFO...........................................<A
     HREF="#6.1.6"
    >TBD</A>
   6.1.7 PATH_TRANSLATED.....................................<A
     HREF="#6.1.7"
    >TBD</A>
   6.1.8 QUERY_STRING........................................<A
     HREF="#6.1.8"
    >TBD</A>
   6.1.9 REMOTE_ADDR.........................................<A
     HREF="#6.1.9"
    >TBD</A>
   6.1.10 REMOTE_HOST........................................<A
     HREF="#6.1.10"
    >TBD</A>
   6.1.11 REMOTE_IDENT.......................................<A
     HREF="#6.1.11"
    >TBD</A>
   6.1.12 REMOTE_USER........................................<A
     HREF="#6.1.12"
    >TBD</A>
   6.1.13 REQUEST_METHOD.....................................<A
     HREF="#6.1.13"
    >TBD</A>
   6.1.14 SCRIPT_NAME........................................<A
     HREF="#6.1.14"
    >TBD</A>
   6.1.15 SERVER_NAME........................................<A
     HREF="#6.1.15"
    >TBD</A>
   6.1.16 SERVER_PORT........................................<A
     HREF="#6.1.16"
    >TBD</A>
   6.1.17 SERVER_PROTOCOL....................................<A
     HREF="#6.1.17"
    >TBD</A>
   6.1.18 SERVER_SOFTWARE....................................<A
     HREF="#6.1.18"
    >TBD</A>
  6.2 Request Message-Bodies................................<A
     HREF="#6.2"
    >TBD</A>
 7 Data Output from the CGI Script...........................<A
     HREF="#7.0"
    >TBD</A>
  7.1 Non-Parsed Header Output...............................<A
     HREF="#7.1"
    >TBD</A>
  7.2 Parsed Header
Output...................................<A
     HREF="#7.2"
    >TBD</A>
   7.2.1 CGI header fields...................................<A
     HREF="#7.2.1"
    >TBD</A>
    7.2.1.1 Content-Type.....................................<A
     HREF="#7.2.1.1"
    >TBD</A>
    7.2.1.2 Location.........................................<A
     HREF="#7.2.1.2"
    >TBD</A>
    7.2.1.3 Status...........................................<A
     HREF="#7.2.1.3"
    >TBD</A>
    7.2.1.4 Extension header fields..........................<A
     HREF="#7.2.1.4"
    >TBD</A>
   7.2.2 HTTP header fields..................................<A
     HREF="#7.2.2"
    >TBD</A>
 8 Server Implementation.....................................<A
     HREF="#8.0"
    >TBD</A>
  8.1 Requirements for Servers...............................<A
     HREF="#8.1"
    >TBD</A>
   8.1.1 Script-URI..........................................<A
     HREF="#8.1.1"
    >TBD</A>
   8.1.2 Request Message-body Handling.......................<A
     HREF="#8.1.2"
    >TBD</A>
   8.1.3 Required Metavariables..............................<A
     HREF="#8.1.3"
    >TBD</A>
   8.1.4 Response Compliance.................................<A
     HREF="#8.1.4"
    >TBD</A>
  8.2 Recommendations for Servers............................<A
     HREF="#8.2"
    >TBD</A>
  8.3 Summary of Metavariables...............................<A
     HREF="#8.3"
    >TBD</A>
 9 Script Implementation.....................................<A
     HREF="#9.0"
    >TBD</A>
  9.1 Requirements for Scripts...............................<A
     HREF="#9.1"
    >TBD</A>
  9.2 Recommendations for Scripts............................<A
     HREF="#9.2"
    >TBD</A>
10 System Specifications....................................<A
     HREF="#10.0"
    >TBD</A>
 10.1 AmigaDOS..............................................<A
     HREF="#10.1"
    >TBD</A>
 10.2 Unix..................................................<A
     HREF="#10.2"
    >TBD</A>
11
Security Considerations..................................<A
     HREF="#11.0"
    >TBD</A>
 11.1 Safe Methods..........................................<A
     HREF="#11.1"
    >TBD</A>
 11.2 HTTP Header Fields Containing Sensitive Information...<A
     HREF="#11.2"
    >TBD</A>
 11.3 Script Interference with the Server...................<A
     HREF="#11.3"
    >TBD</A>
 11.4 Data Length and Buffering Considerations..............<A
     HREF="#11.4"
    >TBD</A>
 11.5 Stateless Processing..................................<A
     HREF="#11.5"
    >TBD</A>
12 Acknowledgments..........................................<A
     HREF="#12.0"
    >TBD</A>
13 References...............................................<A
     HREF="#13.0"
    >TBD</A>
14 Authors' Addresses.......................................<A
     HREF="#14.0"
    >TBD</A>
   </PRE>
  </DIV>

  <H2>
   <A NAME="1.0">
    1. Introduction
   </A>
  </H2>

  <H3>
   <A NAME="1.1">
    1.1. Purpose
   </A>
  </H3>
  <P>
  Together the HTTP [<A HREF="#[3]">3</A>,<A HREF="#[8]">8</A>] server
  and the CGI script are responsible
  for servicing a client
  request by sending back responses. The client
  request comprises a Universal Resource Identifier (URI)
  [<A HREF="#[1]">1</A>], a
  request method, and various ancillary
  information about the request
  provided by the transport mechanism.
  </P>
  <P>
  The CGI defines the abstract parameters, known as
  metavariables,
  which describe the client's
  request. Together with a
  concrete programmer interface this specifies a platform-independent
  interface between the script and the HTTP server.
  </P>

  <H3>
   <A NAME="1.2">
    1.2. Requirements
   </A>
  </H3>
  <P>
  This specification uses the same words as RFC 1123
  [<A HREF="#[5]">5</A>] to define the
  significance of each particular requirement. These are:
  </P><!--#if expr="!
$GUI" -->
  <P></P><!--#endif -->
  <DL>
   <DT><EM>MUST</EM>
   </DT>
   <DD>
    <P>
    This word or the adjective 'required' means that the item is an
    absolute requirement of the specification.
    </P>
   </DD>
   <DT><EM>SHOULD</EM>
   </DT>
   <DD>
    <P>
    This word or the adjective 'recommended' means that there may
    exist valid reasons in particular circumstances to ignore this
    item, but the full implications should be understood and the case
    carefully weighed before choosing a different course.
    </P>
   </DD>
   <DT><EM>MAY</EM>
   </DT>
   <DD>
    <P>
    This word or the adjective 'optional' means that this item is
    truly optional. One vendor may choose to include the item because
    a particular marketplace requires it or because it enhances the
    product, for example; another vendor may omit the same item.
    </P>
   </DD>
  </DL>
  <P>
  An implementation is not compliant if it fails to satisfy one or more
  of the 'must' requirements for the protocols it implements. An
  implementation that satisfies all of the 'must' and all of the
  'should' requirements for its features is said to be 'unconditionally
  compliant'; one that satisfies all of the 'must' requirements but not
  all of the 'should' requirements for its features is said to be
  'conditionally compliant.'
  </P>

  <H3>
   <A NAME="1.3">
    1.3. Specifications
   </A>
  </H3>
  <P>
  Not all of the functions and features of the CGI are defined in the
  main part of this specification. The following phrases are used to
  describe the features which are not specified:
  </P>
  <DL>
   <DT><EM>system defined</EM>
   </DT>
   <DD>
    <P>
    The feature may differ between systems, but must be the same for
    different implementations using the same system. A system will
    usually identify a class of operating-systems.
    Some systems are
    defined in
    <A HREF="#10.0"
    >section 10</A> of this document.
    New systems may be defined
    by new specifications without revision of this document.
    </P>
   </DD>
   <DT><EM>implementation defined</EM>
   </DT>
   <DD>
    <P>
    The behaviour of the feature may vary from implementation to
    implementation, but a particular implementation must document its
    behaviour.
    </P>
   </DD>
  </DL>

  <H3>
   <A NAME="1.4">
    1.4. Terminology
   </A>
  </H3>
  <P>
  This specification uses many terms defined in the HTTP/1.1
  specification [<A HREF="#[8]">8</A>]; however, the following terms are
  used here in a
  sense which may not accord with their definitions in that document,
  or with their common meaning.
  </P>

  <DL>
   <DT><EM>metavariable</EM>
   </DT>
   <DD>
    <P>
    A named parameter that carries information from the server to the
    script. It is not necessarily a variable in the operating-system's
    environment, although that is the most common implementation.
    </P>
   </DD>

   <DT><EM>script</EM>
   </DT>
   <DD>
    <P>
    The software which is invoked by the server <EM>via</EM> this
    interface. It
    need not be a standalone program, but could be a
    dynamically-loaded or shared library, or even a subroutine in the
    server. It <EM>may</EM> be a set of statements
    interpreted at run-time, as the term 'script' is frequently
    understood, but that is not a requirement and within the context
    of this specification the term has the broader definition stated.
    </P>
   </DD>
   <DT><EM>server</EM>
   </DT>
   <DD>
    <P>
    The application program which invokes the script in order to service
    requests.
    </P>
   </DD>
  </DL>

  <H2>
   <A NAME="2.0">
    2. Notational Conventions and Generic Grammar
   </A>
  </H2>

  <H3>
   <A NAME="2.1">
    2.1.
    Augmented BNF
   </A>
  </H3>
  <P>
  All of the mechanisms specified in this document are described in
  both prose and an augmented Backus-Naur Form (BNF) similar to that
  used by RFC 822 [<A HREF="#[6]">6</A>]. This augmented BNF contains
  the following constructs:
  </P>
  <DL>
   <DT>name = definition
   </DT>
   <DD>
    <P>
    The name of a rule is separated from its
    definition by the equal character ("="). Whitespace is only
    significant in that continuation lines of a definition are
    indented.
    </P>
   </DD>
   <DT>"literal"
   </DT>
   <DD>
    <P>
    Quotation marks (") surround literal text, except for a literal
    quotation mark, which is surrounded by angle-brackets ("&lt;" and "&gt;").
    Unless stated otherwise, the text is case-sensitive.
    </P>
   </DD>
   <DT>rule1 | rule2
   </DT>
   <DD>
    <P>
    Alternative rules are separated by a vertical bar ("|").
    </P>
   </DD>
   <DT>(rule1 rule2 rule3)
   </DT>
   <DD>
    <P>
    Elements enclosed in parentheses are treated as a single element.
    </P>
   </DD>
   <DT>*rule
   </DT>
   <DD>
    <P>
    A rule preceded by an asterisk ("*") may have zero or more
    occurrences. A rule preceded by an integer followed by an asterisk
    must occur at least the specified number of times.
    </P>
   </DD>
   <DT>[rule]
   </DT>
   <DD>
    <P>
    An element enclosed in square
    brackets ("[" and "]") is optional.
    </P>
   </DD>
  </DL>

  <H3>
   <A NAME="2.2">
    2.2. Basic Rules
   </A>
  </H3>
  <P>
  The following rules are used throughout this specification to
  describe basic parsing constructs.
  </P><!--#if expr="!
$GUI" -->
  <P></P><!--#endif -->
  <PRE>
  alpha         = lowalpha | hialpha
  alphanum      = alpha | digit
  lowalpha      = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h"
                  | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p"
                  | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x"
                  | "y" | "z"
  hialpha       = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H"
                  | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P"
                  | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X"
                  | "Y" | "Z"
  digit         = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7"
                  | "8" | "9"
  hex           = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a"
                  | "b" | "c" | "d" | "e" | "f"
  escaped       = "%" hex hex
  OCTET         = &lt;any 8-bit sequence of data&gt;
  CHAR          = &lt;any US-ASCII character (octets 0 - 127)&gt;
  CTL           = &lt;any US-ASCII control character
                  (octets 0 - 31) and DEL (127)&gt;
  CR            = &lt;US-ASCII CR, carriage return (13)&gt;
  LF            = &lt;US-ASCII LF, linefeed (10)&gt;
  SP            = &lt;US-ASCII SP, space (32)&gt;
  HT            = &lt;US-ASCII HT, horizontal tab (9)&gt;
  NL            = CR | LF
  LWSP          = SP | HT | NL
  tspecial      = "(" | ")" | "@" | "," | ";" | ":" | "\" | &lt;"&gt;
                  | "/" | "[" | "]" | "?" | "&lt;" | "&gt;" | "{" | "}"
                  | SP | HT | NL
  token         = 1*&lt;any CHAR except CTLs or tspecials&gt;
  quoted-string = ( &lt;"&gt; *qdtext &lt;"&gt; ) | ( "&lt;" *qatext "&gt;")
  qdtext        = &lt;any CHAR except &lt;"&gt; and CTLs but including LWSP&gt;
  qatext        = &lt;any CHAR except "&lt;", "&gt;" and CTLs but
                  including LWSP&gt;
  mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
  unreserved    = alphanum | mark
  reserved      = ";" | "/" | "?" | ":" | "@" | "&amp;" | "=" |
                  "$" | ","
  uric          = reserved | unreserved | escaped
  </PRE>
  <P>
  Note that newline (NL) need not be a single character, but can be a
  character sequence.
  </P>

  <H2>
   <A NAME="3.0">
    3. Protocol Parameters
   </A>
  </H2>

  <H3>
   <A NAME="3.1">
    3.1. URL Encoding
   </A>
  </H3>
  <P>
  Some variables and constructs used here are described as being
  'URL-encoded'.
  This encoding is described in section
  2 of RFC
  2396
  [<A HREF="#[4]">4</A>].
  </P>
  <P>
  An alternate "shortcut" encoding for representing the space
  character exists and is in common use. Scripts MUST be prepared to
  recognise both '+' and '%20' as an encoded space in a
  URL-encoded value.
  </P>
  <P>
  Note that some unsafe characters may have different semantics if
  they are encoded. The definition of which characters are unsafe
  depends on the context.
  For example, the following two URLs do not
  necessarily refer to the same resource:
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  http://somehost.com/somedir%2Fvalue
  http://somehost.com/somedir/value
  </PRE>
  <P>
  See section
  2 of RFC
  2396 [<A HREF="#[4]">4</A>]
  for authoritative treatment of this issue.
  </P>

  <H3>
   <A NAME="3.2">
    3.2. The Script-URI
   </A>
  </H3>
  <P>
  The 'Script-URI' is defined as the URI of the resource identified
  by the metavariables. Often,
  this URI will be the same as
  the URI requested by the client (the 'Client-URI'); however, it need
  not be. Instead, it could be a URI invented by the server, and so it
  can only be used in the context of the server and its CGI interface.
  </P>
  <P>
  The Script-URI has the syntax of generic-RL as defined in section 2.1
  of RFC 1808 [<A HREF="#[7]">7</A>], with the exception that object
  parameters and
  fragment identifiers are not permitted:
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  &lt;scheme&gt;://&lt;host&gt;:&lt;port&gt;/&lt;path&gt;?&lt;query&gt;
  </PRE>
  <P>
  The various components of the
  Script-URI
  are defined by some of the
  metavariables (see
  <A HREF="#6.1">section 6.1</A>
  below):
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  script-uri = protocol "://" SERVER_NAME ":" SERVER_PORT enc-script
               enc-path-info "?"
               QUERY_STRING
  </PRE>
  <P>
  where 'protocol' is obtained
  from SERVER_PROTOCOL, 'enc-script' is a
  URL-encoded version of SCRIPT_NAME and 'enc-path-info' is a
  URL-encoded version of PATH_INFO. See
  <A HREF="#6.1.6">section 6.1.6</A> for more information about the PATH_INFO
  metavariable.
  </P>
  <P>
  Note that the scheme and the protocol are <EM>not</EM> identical;
  for instance, a resource accessed <EM>via</EM> an SSL mechanism
  may have a Client-URI with a scheme of "<SAMP>https</SAMP>"
  rather than "<SAMP>http</SAMP>". CGI/1.1 provides no means
  for the script to reconstruct this, and therefore
  the Script-URI includes the base protocol used.
  </P>

  <H2>
   <A NAME="4.0">
    4. Invoking the Script
   </A>
  </H2>
  <P>
  The
  script is invoked in a system defined manner. Unless specified
  otherwise, the file containing the script will be invoked as an
  executable program.
  </P>

  <H2>
   <A NAME="5.0">
    5. The CGI Script Command Line
   </A>
  </H2>
  <P>
  Some systems support a method for supplying an array of strings to
  the CGI script. This is only used in the case of an 'indexed' query.
  This is identified by a "GET" or "HEAD" HTTP request with a URL
  query
  string not containing any unencoded "=" characters. For such a
  request,
  servers SHOULD parse the search string
  into words, using the following rules:
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  search-string = search-word *( "+" search-word )
  search-word   = 1*schar
  schar         = xunreserved | escaped | xreserved
  xunreserved   = alpha | digit | xsafe | extra
  xsafe         = "$" | "-" | "_" | "."
  xreserved     = ";" | "/" | "?" | ":" | "@" | "&amp;"
  </PRE>
  <P>
  After parsing, each word is URL-decoded, optionally encoded in a
  system defined manner,
  and then the argument list is set to the list
  of words.
  </P>
  <P>
  If the server cannot create any part of the argument list, then the
  server SHOULD NOT generate any command line information. For example, the
  number of arguments may be greater than operating system or server
  limitations permit, or one of the words may not be representable as an
  argument.
  </P>
  <P>
  Scripts SHOULD check to see if the QUERY_STRING value contains an
  unencoded "=" character, and SHOULD NOT use the command line arguments
  if it does.
  </P>

  <H2>
   <A NAME="6.0">
    6. Data Input to the CGI Script
   </A>
  </H2>
  <P>
  Information about a request comes from two different sources: the
  request header, and any associated
  message-body.
  Servers MUST
  make portions of this information available to
  scripts.
  </P>

  <H3>
   <A NAME="6.1">
    6.1. Request Metadata
    (Metavariables)
   </A>
  </H3>
  <P>
  Each CGI server
  implementation MUST define a mechanism
  to pass data about the request from
  the server to the script.
  The metavariables containing these
  data
  are accessed by the script in a system
  defined manner.
  The
  representation of the characters in the
  metavariables is
  system defined.
  </P>
  <P>
  This specification does not distinguish between the representation of
  null values and missing ones. Whether null or missing values
  (such as a query component of "?" or "", respectively) are represented
  by undefined metavariables or by metavariables with values of "" is
  implementation-defined.
  </P>
  <P>
  Case is not significant in the
  metavariable
  names, in that there cannot be two
  different variables
  whose names differ in case only. Here they are
  shown using a canonical representation of capitals plus underscore
  ("_").
  The actual representation of the names is system defined; for
  a particular system the representation MAY be defined differently
  than this.
  </P>
  <P>
  Metavariable
  values MUST be
  considered case-sensitive except as noted
  otherwise.
  </P>
  <P>
  The canonical
  metavariables
  defined by this specification are:
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  AUTH_TYPE
  CONTENT_LENGTH
  CONTENT_TYPE
  GATEWAY_INTERFACE
  PATH_INFO
  PATH_TRANSLATED
  QUERY_STRING
  REMOTE_ADDR
  REMOTE_HOST
  REMOTE_IDENT
  REMOTE_USER
  REQUEST_METHOD
  SCRIPT_NAME
  SERVER_NAME
  SERVER_PORT
  SERVER_PROTOCOL
  SERVER_SOFTWARE
  </PRE>
  <P>
  Metavariables with names beginning with the protocol name (<EM>e.g.</EM>,
  "HTTP_ACCEPT") are also canonical in their description of request header
  fields. The number and meaning of these fields may change independently
  of this specification. (See also <A HREF="#6.1.5">section 6.1.5</A>.)
  </P>

  <H4>
   <A NAME="6.1.1">
    6.1.1. AUTH_TYPE
   </A>
  </H4>
  <P>
  This variable is specific to requests made
  <EM>via</EM> the
  "<CODE>http</CODE>"
  scheme.
  </P>
  <P>
  If the Script-URI
  required access authentication for external
  access, then the server
  MUST set
  the value of
  this variable
  from the '<SAMP>auth-scheme</SAMP>' token in
  the request's "<SAMP>Authorization</SAMP>" header
  field.
  Otherwise
  it is
  set to NULL.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  AUTH_TYPE   = "" | auth-scheme
  auth-scheme = "Basic" | "Digest" | token
  </PRE>
  <P>
  HTTP access authentication schemes are described in section 11 of the
  HTTP/1.1 specification [<A HREF="#[8]">8</A>]. The auth-scheme is
  not case-sensitive.
  </P>
  <P>
  Servers
  MUST
  provide this metavariable
  to scripts if the request
  header included an "<SAMP>Authorization</SAMP>" field
  that was authenticated.
  </P>

  <H4>
   <A NAME="6.1.2">
    6.1.2. CONTENT_LENGTH
   </A>
  </H4>
  <P>
  This
  metavariable
  is set to the
  size of the message-body
  entity attached to the request, if any, in decimal
  number of octets. If no data are attached, then this
  metavariable
  is either NULL or not
  defined. The syntax is
  the same as for
  the HTTP "<SAMP>Content-Length</SAMP>" header field (section 14.14, HTTP/1.1
  specification [<A HREF="#[8]">8</A>]).
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  CONTENT_LENGTH = "" | 1*digit
  </PRE>
  <P>
  Servers MUST provide this metavariable
  to scripts if the request
  was accompanied by a
  message-body entity.
  </P>

  <H4>
   <A NAME="6.1.3">
    6.1.3. CONTENT_TYPE
   </A>
  </H4>
  <P>
  If the request includes a
  message-body,
  CONTENT_TYPE is set
  to
  the Internet Media Type
  [<A HREF="#[9]">9</A>] of the attached
  entity if the type was provided <EM>via</EM>
  a "<SAMP>Content-Type</SAMP>" field in the
  request header, or if the server can determine it in the absence
  of a supplied "<SAMP>Content-Type</SAMP>" field. The syntax is the
  same as for the HTTP
  "<SAMP>Content-Type</SAMP>" header field.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  CONTENT_TYPE = "" | media-type
  media-type   = type "/" subtype *( ";" parameter)
  type         = token
  subtype      = token
  parameter    = attribute "=" value
  attribute    = token
  value        = token | quoted-string
  </PRE>
  <P>
  The type, subtype,
  and parameter attribute names are not
  case-sensitive. Parameter values MAY be case-sensitive.
  Media types and their use in HTTP are described
  in section 3.7 of the
  HTTP/1.1 specification [<A HREF="#[8]">8</A>].
  </P>
  <P>
  Example:
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  application/x-www-form-urlencoded
  </PRE>
  <P>
  There is no default value for this variable. If and only if it is
  unset, then the script MAY attempt to determine the media type from
  the data received. If the type remains unknown, then
  the script MAY choose to either assume a
  content-type of
  <SAMP>application/octet-stream</SAMP>
  or reject the request with a 415 ("Unsupported Media Type")
  error. See <A HREF="#7.2.1.3">section 7.2.1.3</A>
  for more information about returning error status values.
  </P>
  <P>
  Servers MUST provide this metavariable
  to scripts if
  a "<SAMP>Content-Type</SAMP>" field was present
  in the original request header. If the server receives a request
  with an attached entity but no "<SAMP>Content-Type</SAMP>"
  header field, it MAY attempt to
  determine the correct datatype, or it MAY omit this
  metavariable when
  communicating the request information to the script.
  </P>

  <H4>
   <A NAME="6.1.4">
    6.1.4. GATEWAY_INTERFACE
   </A>
  </H4>
  <P>
  This
  metavariable
  is set to
  the dialect of CGI being used
  by the server to communicate with the script.
  Syntax:
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  GATEWAY_INTERFACE = "CGI" "/" major "." minor
  major             = 1*digit
  minor             = 1*digit
  </PRE>
  <P>
  Note that the major and minor numbers are treated as separate
  integers and hence each may be
  more than a single
  digit. Thus CGI/2.4 is a lower version than CGI/2.13 which in turn
  is lower than CGI/12.3.
   Leading zeros in either
   the major or the minor number MUST be ignored by scripts and
   SHOULD NOT be generated by servers.
  </P>
  <P>
   This document defines the 1.1 version of the CGI interface
   ("CGI/1.1").
  </P>
  <P>
   Servers MUST provide this metavariable to scripts.
  </P>

  <H4>
   <A NAME="6.1.5">
    6.1.5. Protocol-Specific Metavariables
   </A>
  </H4>
  <P>
   These metavariables are specific to the protocol
   <EM>via</EM> which the request is made.
   Interpretation of these variables depends on the value of
   the SERVER_PROTOCOL metavariable (see
   <A HREF="#6.1.17">section 6.1.17</A>).
  </P>
  <P>
   Metavariables with names beginning with "HTTP_" contain
   values from the request header, if the scheme used was HTTP.
   Each HTTP header field name is converted to upper case, has all
   occurrences of "-" replaced with "_",
   and has "HTTP_" prepended to form the metavariable name.
   Similar transformations are applied for other protocols.
   The header data MAY be presented as sent
   by the client, or MAY be rewritten in ways which do not change its
   semantics. If multiple header fields with the same field-name are received
   then the server MUST rewrite them as though they
   had been received as a single header field having the same
   semantics before being represented in a metavariable.
   Similarly, a header field that is received on more than one line
   MUST be merged into a single line. The server MUST, if necessary,
   change the representation of the data (for example, the character
   set) to be appropriate for a CGI metavariable.
   <!-- ###NOTE: See if 2068 describes this thoroughly, and
        point there if so.
   -->
  </P>
  <P>
   Servers are not required to create metavariables for all
   the request header fields that they receive. In particular,
   they MAY decline to make available any
   header fields carrying authentication information, such as
   "<SAMP>Authorization</SAMP>", or
   which are available to the script
   <EM>via</EM> other metavariables,
   such as "<SAMP>Content-Length</SAMP>" and "<SAMP>Content-Type</SAMP>".
  </P>

  <H4>
   <A NAME="6.1.6">
    6.1.6. PATH_INFO
   </A>
  </H4>
  <P>
   The PATH_INFO metavariable specifies
   a path to be interpreted by the CGI script. It identifies the
   resource or sub-resource to be returned by the CGI
   script, and it is derived from the portion
   of the URI path following the script name but preceding
   any query data.
   The syntax and semantics are similar to a decoded HTTP URL
   'path' token (defined in RFC 2396
   [<A HREF="#[4]">4</A>]), with the exception
   that a PATH_INFO of "/"
   represents a single void path segment.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  PATH_INFO = "" | ( "/" path )
  path      = segment *( "/" segment )
  segment   = *pchar
  pchar     = &lt;any CHAR except "/"&gt;
  </PRE>
  <P>
   The PATH_INFO string is the trailing part of the &lt;path&gt; component of
   the Script-URI
   (see <A HREF="#3.2">section 3.2</A>)
   that follows the SCRIPT_NAME portion of the path.
  </P>
  <P>
   Servers MAY impose their own restrictions and
   limitations on what values they will accept for PATH_INFO, and MAY
   reject or edit any values they
   consider objectionable before passing them to the script.
  </P>
  <P>
   Servers MUST make this URI component available
   to CGI scripts.
   The PATH_INFO
   value is case-sensitive, and the
   server MUST preserve the case of the PATH_INFO element of the URI
   when making it available to scripts.
  </P>

  <H4>
   <A NAME="6.1.7">
    6.1.7. PATH_TRANSLATED
   </A>
  </H4>
  <P>
   PATH_TRANSLATED is derived by taking any path-info component of the
   request URI (see
   <A HREF="#6.1.6">section 6.1.6</A>), decoding it
   (see <A HREF="#3.1">section 3.1</A>), parsing it as a URI in its own
   right, and performing any virtual-to-physical
   translation appropriate to map it onto the
   server's document repository structure.
   If the request URI includes no path-info
   component, the PATH_TRANSLATED metavariable SHOULD NOT be defined.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  PATH_TRANSLATED = *CHAR
  </PRE>
  <P>
   For a request such as the following:
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  http://somehost.com/cgi-bin/somescript/this%2eis%2ethe%2epath%2einfo
  </PRE>
  <P>
   the PATH_INFO component would be decoded, and the result
   parsed as though it were a request for the following:
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  http://somehost.com/this.is.the.path.info
  </PRE>
  <P>
   This would then be translated to a
   location in the server's document repository,
   perhaps a filesystem path something like this:
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  /usr/local/www/htdocs/this.is.the.path.info
  </PRE>
  <P>
   The result of the translation is the value of PATH_TRANSLATED.
  </P>
  <P>
   The value of PATH_TRANSLATED may or may not map to a valid
   repository location.
   Servers MUST preserve the case of the path-info
   segment if and only if the underlying
   repository supports case-sensitive names.
   If the repository
   is only case-aware, case-preserving, or case-blind
   with regard to document names,
   servers are not required to preserve the
   case of the original segment through the translation.
  </P>
  <P>
   The translation
   algorithm the server uses to derive PATH_TRANSLATED is
   implementation defined; CGI scripts which use this variable may
   suffer limited portability.
  </P>
  <P>
   Servers SHOULD provide this metavariable
   to scripts if and only if the request URI includes a
   path-info component.
  </P>

  <H4>
   <A NAME="6.1.8">
    6.1.8. QUERY_STRING
   </A>
  </H4>
  <P>
   A URL-encoded string; the &lt;query&gt; part of the Script-URI.
   (See <A HREF="#3.2">section 3.2</A>.)
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  QUERY_STRING = query-string
  query-string = *uric
  </PRE>
  <P>
   The URL syntax for a query
   string is described in section 3 of RFC 2396
   [<A HREF="#[4]">4</A>].
  </P>
  <P>
   Servers MUST supply this value to scripts.
   The QUERY_STRING value is case-sensitive.
   If the Script-URI does not include a query component,
   the QUERY_STRING metavariable MUST be defined as an empty string ("").
  </P>

  <H4>
   <A NAME="6.1.9">
    6.1.9. REMOTE_ADDR
   </A>
  </H4>
  <P>
   The IP address of the client
   sending the request to the server. This
   is not necessarily that of the user agent
   (such as if the request came through a proxy).
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  REMOTE_ADDR = hostnumber
  hostnumber  = ipv4-address | ipv6-address
  </PRE>
  <P>
   The definitions of <SAMP>ipv4-address</SAMP> and <SAMP>ipv6-address</SAMP>
   are provided in Appendix B of RFC 2373 [<A HREF="#[13]">13</A>].
  </P>
  <P>
   Servers MUST supply this value to scripts.
  </P>

  <H4>
   <A NAME="6.1.10">
    6.1.10. REMOTE_HOST
   </A>
  </H4>
  <P>
   The fully qualified domain name of the
   client sending the request to
   the server, if available, otherwise NULL.
   (See <A HREF="#6.1.9">section 6.1.9</A>.)
   Fully qualified domain names take the form described in
   section 3.5 of RFC 1034 [<A HREF="#[10]">10</A>] and section 2.1 of
   RFC 1123 [<A HREF="#[5]">5</A>]. Domain names are not case sensitive.
  </P>
  <P>
   Servers SHOULD provide this information to scripts.
  </P>

  <H4>
   <A NAME="6.1.11">
    6.1.11. REMOTE_IDENT
   </A>
  </H4>
  <P>
   The identity information reported about the connection by an
   RFC 1413 [<A HREF="#[11]">11</A>] request to the remote agent, if
   available. Servers MAY choose not
   to support this feature, or not to request the data
   for efficiency reasons.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  REMOTE_IDENT = *CHAR
  </PRE>
  <P>
   The data returned
   may be used for authentication purposes, but the level
   of trust reposed in them should be minimal.
  </P>
  <P>
   Servers MAY supply this information to scripts if the
   RFC 1413 [<A HREF="#[11]">11</A>] lookup is performed.
  </P>

  <H4>
   <A NAME="6.1.12">
    6.1.12. REMOTE_USER
   </A>
  </H4>
  <P>
   If the request required authentication using the "Basic"
   mechanism (<EM>i.e.</EM>, the AUTH_TYPE
   metavariable is set
   to "Basic"), then the value of the REMOTE_USER
   metavariable is set to the
   user-ID supplied. In all other cases
   the value of this metavariable is undefined.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  REMOTE_USER = *OCTET
  </PRE>
  <P>
   This variable is specific to requests made <EM>via</EM> the
   HTTP protocol.
  </P>
  <P>
   Servers SHOULD provide this metavariable to scripts.
  </P>

  <H4>
   <A NAME="6.1.13">
    6.1.13. REQUEST_METHOD
   </A>
  </H4>
  <P>
   The REQUEST_METHOD metavariable is set to the
   method with which the request was made, as described in section
   5.1.1 of the HTTP/1.0 specification [<A HREF="#[3]">3</A>] and
   section 5.1.1 of the
   HTTP/1.1 specification [<A HREF="#[8]">8</A>].
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  REQUEST_METHOD   = http-method
  http-method      = "GET" | "HEAD" | "POST" | "PUT" | "DELETE"
                   | "OPTIONS" | "TRACE" | extension-method
  extension-method = token
  </PRE>
  <P>
   The method is case sensitive.
   CGI/1.1 servers MAY choose to process some methods
   directly rather than passing them to scripts.
  </P>
  <P>
   This variable is specific to requests made with HTTP.
  </P>
  <P>
   Servers MUST provide this metavariable to scripts.
  </P>

  <H4>
   <A NAME="6.1.14">
    6.1.14. SCRIPT_NAME
   </A>
  </H4>
  <P>
   The SCRIPT_NAME metavariable is
   set to a URL path that could identify the CGI script (rather than the
   script's output). The syntax and semantics are identical to a
   decoded HTTP URL 'path' token
   (see RFC 2396 [<A HREF="#[4]">4</A>]).
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  SCRIPT_NAME = "" | ( "/" [ path ] )
  </PRE>
  <P>
   The SCRIPT_NAME string is some leading part of the &lt;path&gt; component
   of the Script-URI derived in some
   implementation defined manner.
   No PATH_INFO or QUERY_STRING segments
   (see sections <A HREF="#6.1.6">6.1.6</A> and
   <A HREF="#6.1.8">6.1.8</A>) are included
   in the SCRIPT_NAME value.
  </P>
  <P>
   Servers MUST provide this metavariable to scripts.
  </P>

  <H4>
   <A NAME="6.1.15">
    6.1.15. SERVER_NAME
   </A>
  </H4>
  <P>
   The SERVER_NAME metavariable is set to the name of the
   server, as derived from the &lt;host&gt; part of the Script-URI
   (see <A HREF="#3.2">section 3.2</A>).
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  SERVER_NAME = hostname | hostnumber
  </PRE>
  <P>
   Servers MUST provide this metavariable to scripts.
  </P>

  <H4>
   <A NAME="6.1.16">
    6.1.16. SERVER_PORT
   </A>
  </H4>
  <P>
   The SERVER_PORT metavariable is set to the port on which the
   request was received, as used in the &lt;port&gt;
   part of the Script-URI.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  SERVER_PORT = 1*digit
  </PRE>
  <P>
   If the &lt;port&gt; portion of the Script-URI is blank, the actual
   port number upon which the request was received MUST be supplied.
  </P>
  <P>
   Servers MUST provide this metavariable to scripts.
  </P>

  <H4>
   <A NAME="6.1.17">
    6.1.17. SERVER_PROTOCOL
   </A>
  </H4>
  <P>
   The SERVER_PROTOCOL metavariable is set to the
   name and revision of the information protocol with which
   the request
   arrived. This is not necessarily the same as the protocol version used by
   the server in its response to the client.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  SERVER_PROTOCOL   = HTTP-Version | extension-version
                    | extension-token
  HTTP-Version      = "HTTP" "/" 1*digit "." 1*digit
  extension-version = protocol "/" 1*digit "." 1*digit
  protocol          = 1*( alpha | digit | "+" | "-" | "." )
  extension-token   = token
  </PRE>
  <P>
   'protocol' is a version of the &lt;scheme&gt; part of the
   Script-URI, but is
   not identical to it.
   For example, the scheme of a request may be
   "<SAMP>https</SAMP>" while the protocol remains "<SAMP>http</SAMP>".
   The protocol is not case sensitive, but
   by convention, 'protocol' is in upper case.
  </P>
  <P>
   A well-known extension token value is "INCLUDED",
   which signals that the current document is being included as part of
   a composite document, rather than being the direct target of the
   client request.
  </P>
  <P>
   Servers MUST provide this metavariable to scripts.
  </P>

  <H4>
   <A NAME="6.1.18">
    6.1.18. SERVER_SOFTWARE
   </A>
  </H4>
  <P>
   The SERVER_SOFTWARE metavariable is set to the
   name and version of the information server software answering the
   request (and running the gateway).
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  SERVER_SOFTWARE = 1*product
  product         = token [ "/" product-version ]
  product-version = token
  </PRE>
  <P>
   Servers MUST provide this metavariable to scripts.
  </P>

  <H3>
   <A NAME="6.2">
    6.2. Request Message-Bodies
   </A>
  </H3>
  <P>
   As there may be a data entity attached to the request, there MUST be
   a system-defined method for the script to read
   these data. Unless
   defined otherwise, this will be <EM>via</EM> the 'standard input' file
   descriptor.
  </P>
  <P>
   If the CONTENT_LENGTH value (see <A HREF="#6.1.2">section 6.1.2</A>)
   is non-NULL, the server MUST supply at least that many bytes to
   scripts on the standard input stream.
   Scripts are not obliged to read the data.
   Servers MAY signal an EOF condition after CONTENT_LENGTH bytes have been
   read, but are not obligated to do so. Therefore, scripts MUST NOT
   attempt to read more than CONTENT_LENGTH bytes, even if more data
   are available.
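  </P>
  <P>
   The reading rule above can be sketched as follows. This is a minimal
   illustration in Python, not part of the interface definition. The
   function name <SAMP>read_request_body</SAMP> and its parameters are
   hypothetical, and the sketch assumes that the message-body arrives on
   standard input and that an absent or empty CONTENT_LENGTH means no
   body is attached.
  </P>
  <PRE>
```python
import os
import re
import sys


def read_request_body(environ=None, stream=None):
    """Return the request message-body, reading at most CONTENT_LENGTH octets."""
    # Hypothetical helper: both parameters exist only to make the sketch testable.
    environ = os.environ if environ is None else environ
    stream = sys.stdin.buffer if stream is None else stream
    raw = environ.get("CONTENT_LENGTH", "")
    if raw == "":
        return b""  # NULL or undefined: treated here as "no body attached"
    if re.fullmatch(r"[0-9]+", raw) is None:
        raise ValueError("CONTENT_LENGTH is not 1*digit: " + repr(raw))
    remaining = int(raw)
    chunks = []
    while remaining != 0:
        chunk = stream.read(remaining)  # never ask for more than is left
        if not chunk:
            break  # the stream ended early; return what was received
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```
  </PRE>
  <P>
   A script written this way stops after CONTENT_LENGTH octets whether or
   not the server signals EOF, and it does not block waiting for data
   beyond the declared length.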
  </P>
  <P>
   For non-parsed header (NPH) scripts (see
   <A HREF="#7.1">section 7.1</A> below),
   servers SHOULD attempt to ensure that the data
   supplied to the script are precisely
   as supplied by the client and unaltered by the server.
  </P>
  <P>
   <A HREF="#8.1.2">Section 8.1.2</A> describes the requirements of
   servers with regard to requests that include message-bodies.
  </P>

  <H2>
   <A NAME="7.0">
    7. Data Output from the CGI Script
   </A>
  </H2>
  <P>
   There MUST be a system-defined method for the script to send data
   back to the server or client; a script MUST always return some data.
   Unless defined otherwise, this will be <EM>via</EM> the 'standard
   output' file descriptor.
  </P>
  <P>
   There are two forms of output that scripts can supply to servers: non-parsed
   header (NPH) output, and parsed header output.
   Servers MUST support parsed header
   output and MAY support NPH output. The method of
   distinguishing between the two
   types of output (or scripts) is implementation defined.
  </P>
  <P>
   Servers MAY implement a timeout period within which data must be
   received from scripts. If a server implementation defines such
   a timeout and receives no data from a script within the timeout
   period, the server MAY terminate the script process and SHOULD
   abort the client request with either a
   '504 Gateway Timeout' or a
   '500 Internal Server Error' response.
  </P>

  <H3>
   <A NAME="7.1">
    7.1. Non-Parsed Header Output
   </A>
  </H3>
  <P>
   Scripts using the NPH output form
   MUST return a complete HTTP response message, as described
   in Section 6 of the HTTP specifications
   [<A HREF="#[3]">3</A>,<A HREF="#[8]">8</A>].
   NPH scripts
   MUST use the SERVER_PROTOCOL variable to determine the appropriate format
   for a response.
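  </P>
  <P>
   As an illustration of the NPH rule above, the following Python sketch
   builds a complete response whose status line follows SERVER_PROTOCOL.
   The function name <SAMP>nph_response</SAMP>, its parameters, the choice
   of header fields, and the fallback to HTTP/1.0 for any protocol value
   other than "HTTP/1.1" are assumptions made for the example; they are
   not requirements of this specification.
  </P>
  <PRE>
```python
import os


def nph_response(body, environ=None, content_type="text/plain"):
    """Build a complete HTTP response (bytes) for a non-parsed header script."""
    # Hypothetical helper: 'environ' defaults to the process environment.
    environ = os.environ if environ is None else environ
    protocol = environ.get("SERVER_PROTOCOL", "HTTP/1.0").upper()
    if not protocol.startswith("HTTP/"):
        raise ValueError("this sketch only handles HTTP requests: " + protocol)
    # Assumption: answer in HTTP/1.1 only when the request used it.
    version = "HTTP/1.1" if protocol == "HTTP/1.1" else "HTTP/1.0"
    head = [
        version + " 200 OK",
        "Content-Type: " + content_type,
        "Content-Length: " + str(len(body)),
    ]
    if version == "HTTP/1.1":
        head.append("Connection: close")  # the script ends after one response
    # HTTP uses CR LF line endings and a blank line before the body.
    return ("\r\n".join(head) + "\r\n\r\n").encode("ascii") + body
```
  </PRE>
  <P>
   The script would write the returned bytes to standard output
   unchanged, since the server is not expected to add or repair header
   fields for NPH output.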
  </P>
  <P>
   Servers SHOULD attempt to ensure that the script output is sent
   directly to the client, with minimal
   internal and no transport-visible buffering.
  </P>

  <H3>
   <A NAME="7.2">
    7.2. Parsed Header Output
   </A>
  </H3>
  <P>
   Scripts using the parsed header output form MUST supply
   a CGI response message to the server as follows:
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  CGI-Response   = *optional-field CGI-Field *optional-field NL [ Message-Body ]
  optional-field = ( CGI-Field | HTTP-Field )
  CGI-Field      = Content-Type
                 | Location
                 | Status
                 | extension-header
  </PRE>
  <P><!-- ##### If HTTP defines x-headers, remove ours except x-cgi- -->
   The response comprises a header and a body, separated by a blank line.
   The body may be NULL.
   The header fields are either CGI header fields to be interpreted by
   the server, or HTTP header fields
   to be included in the response returned to the client
   if the request protocol is HTTP. At least one
   CGI-Field MUST be
   supplied, but no CGI field name may be used more than once
   in a response.
   If a body is supplied, then a "<SAMP>Content-Type</SAMP>"
   header field MUST be supplied by the script,
   otherwise the script MUST send a "<SAMP>Location</SAMP>"
   or "<SAMP>Status</SAMP>" header field. If a
   <SAMP>Location</SAMP> CGI-Field
   is returned, then the script MUST NOT supply
   any HTTP-Fields.
  </P>
  <P>
   Each header field in a CGI-Response MUST be specified on a single line;
   CGI/1.1 does not support continuation lines.
  </P>

  <H4>
   <A NAME="7.2.1">
    7.2.1. CGI header fields
   </A>
  </H4>
  <P>
   The CGI header fields have the generic syntax:
  </P><!--#if expr="! 
$GUI" -->
  <P></P><!--#endif -->
  <PRE>
  generic-field = field-name ":" [ field-value ] NL
  field-name    = token
  field-value   = *( field-content | LWSP )
  field-content = *( token | tspecial | quoted-string )
  </PRE>
  <P>
   The field-name is not case sensitive; a NULL field value is
   equivalent to the header field not being sent.
  </P>

  <H4>
   <A NAME="7.2.1.1">
    7.2.1.1. Content-Type
   </A>
  </H4>
  <P>
   The Internet Media Type [<A HREF="#[9]">9</A>] of the entity
   body, which is to be sent unmodified to the client.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  Content-Type = "Content-Type" ":" media-type NL
  </PRE>
  <P>
   This is actually an HTTP-Field
   rather than a CGI-Field, but
   it is listed here because of its importance in the CGI dialogue as
   a member of the "one of these is required" set of header fields.
  </P>

  <H4>
   <A NAME="7.2.1.2">
    7.2.1.2. Location
   </A>
  </H4>
  <P>
   This is used to specify to the server that the script is returning a
   reference to a document rather than an actual document.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  Location         = "Location" ":"
                     ( fragment-URI | rel-URL-abs-path ) NL
  fragment-URI     = URI [ "#" fragmentid ]
  URI              = scheme ":" *qchar
  fragmentid       = *qchar
  rel-URL-abs-path = "/" [ hpath ] [ "?" query-string ]
  hpath            = fpsegment *( "/" psegment )
  fpsegment        = 1*hchar
  psegment         = *hchar
  hchar            = alpha | digit | safe | extra
                   | ":" | "@" | "&amp;" | "="
  </PRE>
  <P>
   The Location
   value is either an absolute URI with optional fragment,
   as defined in RFC 1630 [<A HREF="#[1]">1</A>], or an absolute path
   within the server's URI space (<EM>i.e.</EM>,
   omitting the scheme and network-related fields) and optional
   query-string.
   If an absolute URI is returned by the script,
   then the server MUST generate a
   '302 redirect' HTTP response
   message unless the script has supplied an
   explicit Status response header field.
   Scripts returning an absolute URI MAY choose to
   provide a message-body. Servers MUST make any appropriate modifications
   to the script's output to ensure the response to the user-agent complies
   with the response protocol version.
   If the Location value is a path, then the server MUST generate
   the response that it would have produced in response to a request
   containing the URL
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  scheme "://" SERVER_NAME ":" SERVER_PORT rel-URL-abs-path
  </PRE>
  <P>
   Note: If the request was accompanied by a message-body
   (such as for a POST request), and the script
   redirects the request with a Location field, the
   message-body may not be
   available to the resource that is the target of the redirect.
  </P>

  <H4>
   <A NAME="7.2.1.3">
    7.2.1.3. Status
   </A>
  </H4>
  <P>
   The "<SAMP>Status</SAMP>" header field is used to indicate to the server what
   status code the server MUST use in the response message.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  Status        = "Status" ":" digit digit digit SP reason-phrase NL
  reason-phrase = *&lt;CHAR, excluding CTLs, NL&gt;
  </PRE>
  <P>
   The valid status codes are listed in section 6.1.1 of the HTTP/1.0
   specification [<A HREF="#[3]">3</A>]. If the SERVER_PROTOCOL is
   "HTTP/1.1", then the status codes defined in the HTTP/1.1
   specification [<A HREF="#[8]">8</A>] may
   be used. If the script does not return a "<SAMP>Status</SAMP>" header
   field, then "200 OK" SHOULD be assumed by the server.
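  </P>
  <P>
   The default described above can be sketched from the server's side as
   follows. This Python fragment is illustrative only: the function name
   <SAMP>status_from_headers</SAMP> is hypothetical, the input is assumed
   to be the script's header lines with line terminators already removed,
   and the interaction with a Location field (which implies a 302
   response) is deliberately left out. The pattern is slightly more
   lenient than the grammar in that it accepts a code with no
   reason-phrase.
  </P>
  <PRE>
```python
import re

# Three digits, optionally followed by a space and a reason-phrase.
STATUS_VALUE = re.compile(r"([0-9]{3})(?: (.*))?")


def status_from_headers(header_lines):
    """Return (code, reason) for a parsed-header response."""
    for line in header_lines:
        name, sep, value = line.partition(":")
        # Field names are not case sensitive.
        if sep and name.strip().lower() == "status":
            match = STATUS_VALUE.fullmatch(value.strip())
            if match is None:
                raise ValueError("malformed Status field: " + line)
            return int(match.group(1)), match.group(2) or ""
    return 200, "OK"  # no Status field supplied by the script
```
  </PRE>
  <P>
   With this sketch, a header block containing only a Content-Type field
   yields the pair (200, "OK"), while a block containing
   "Status: 404 Not Found" yields (404, "Not Found").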
  </P>
  <P>
   If a script is being used to handle a particular error or condition
   encountered by the server, such as a '404 Not Found' error, the script
   SHOULD use the "<SAMP>Status</SAMP>" CGI header field to propagate the error
   condition back to the client. <EM>E.g.</EM>, in the example mentioned it
   SHOULD include a "Status: 404 Not Found" in the
   header data returned to the server.
  </P>

  <H4>
   <A NAME="7.2.1.4">
    7.2.1.4. Extension header fields
   </A>
  </H4>
  <P>
   Scripts MAY include in their CGI response header additional fields
   not defined in this or the HTTP specification.
   These are called "extension" fields,
   and have the syntax of a <SAMP>generic-field</SAMP> as defined in
   <A HREF="#7.2.1">section 7.2.1</A>. The name of an extension field
   MUST NOT conflict with a field name defined in this or any other
   specification; extension field names SHOULD begin with "X-CGI-"
   to ensure uniqueness.
  </P>

  <H4>
   <A NAME="7.2.2">
    7.2.2. HTTP header fields
   </A>
  </H4>
  <P>
   The script MAY return any other header fields defined by the
   specification
   for the SERVER_PROTOCOL (HTTP/1.0 [<A HREF="#[3]">3</A>] or HTTP/1.1
   [<A HREF="#[8]">8</A>]).
   Servers MUST resolve conflicts between CGI header
   and HTTP header formats or names (see <A HREF="#8.0">section 8</A>).
  </P>

  <H2>
   <A NAME="8.0">
    8. Server Implementation
   </A>
  </H2>
  <P>
   This section defines the requirements that must be met by HTTP
   servers in order to provide a coherent and correct CGI/1.1
   environment in which scripts may function. It is intended
   primarily for server implementors, but it is useful for
   script authors to be familiar with the information as well.
  </P>

  <H3>
   <A NAME="8.1">
    8.1.
    Requirements for Servers
   </A>
  </H3>
  <P>
   In order to be considered CGI/1.1-compliant, a server must meet
   certain basic criteria and provide certain minimal functionality.
   The details of these requirements are described in the following sections.
  </P>

  <H3>
   <A NAME="8.1.1">
    8.1.1. Script-URI
   </A>
  </H3>
  <P>
   Servers MUST support the standard mechanism (described below) which
   allows script authors to determine
   what URL to use in documents
   which reference the script;
   specifically, what URL to use in order to
   achieve particular settings of the
   metavariables. This mechanism is as follows:
  </P>
  <P>
   The server
   MUST translate the header data from the CGI header field syntax to
   the HTTP
   header field syntax if these differ. For example, the character
   sequence for
   newline (such as Unix's ASCII NL) used by CGI scripts may not be the
   same as that used by HTTP (ASCII CR followed by LF). The server MUST
   also resolve any conflicts between header fields returned by the script
   and header fields that it would otherwise send itself.
  </P>

  <H3>
   <A NAME="8.1.2">
    8.1.2. Request Message-body Handling
   </A>
  </H3>
  <P>
   These are the requirements for server handling of message-bodies directed
   to CGI/1.1 resources:
  </P>
  <OL>
   <LI>The message-body the server provides to the CGI script MUST
    have any transfer encodings removed.
   </LI>
   <LI>The server MUST derive and provide a value for the CONTENT_LENGTH
    metavariable that reflects the length of the message-body after any
    transfer decoding.
   </LI>
   <LI>The server MUST leave intact any content-encodings of the message-body.
   </LI>
  </OL>

  <H3>
   <A NAME="8.1.3">
    8.1.3.
    Required Metavariables
   </A>
  </H3>
  <P>
   Servers MUST provide scripts with certain information and
   metavariables
   as described in <A HREF="#8.3">section 8.3</A>.
  </P>

  <H3>
   <A NAME="8.1.4">
    8.1.4. Response Compliance
   </A>
  </H3>
  <P>
   Servers MUST ensure that responses sent to the user-agent meet all
   requirements of the protocol level in effect. This may involve
   modifying, deleting, or augmenting any header
   fields and/or message-body supplied by the script.
  </P>

  <H3>
   <A NAME="8.2">
    8.2. Recommendations for Servers
   </A>
  </H3>
  <P>
   Servers SHOULD provide the "<SAMP>query</SAMP>" component of the script-URI
   as command-line arguments to scripts if it does not
   contain any unencoded '=' characters and the command-line arguments can
   be generated in an unambiguous manner.
   (See <A HREF="#5.0">section 5</A>.)
  </P>
  <P>
   Servers SHOULD set the AUTH_TYPE
   metavariable to the value of the
   '<SAMP>auth-scheme</SAMP>' token of the "<SAMP>Authorization</SAMP>"
   field if it was supplied as part of the request header.
   (See <A HREF="#6.1.1">section 6.1.1</A>.)
  </P>
  <P>
   Where applicable, servers SHOULD set the current working directory
   to the directory in which the script is located before invoking
   it.
  </P>
  <P>
   Servers MAY reject with error '404 Not Found'
   any requests that would result in
   an encoded "/" being decoded into PATH_INFO or SCRIPT_NAME, as this
   might represent a loss of information to the script.
  </P>
  <P>
   Although the server and the CGI script need not be consistent in
   their handling of URL paths (client URLs and the PATH_INFO data,
   respectively), server authors may wish to impose consistency.
   So the server implementation SHOULD define its behaviour for the
   following cases:
  </P>
  <OL>
   <LI>define any restrictions on allowed characters, in particular
    whether ASCII NUL is permitted;
   </LI>
   <LI>define any restrictions on allowed path segments, in particular
    whether non-terminal NULL segments are permitted;
   </LI>
   <LI>define the behaviour for <SAMP>"."</SAMP> or <SAMP>".."</SAMP> path
    segments; <EM>i.e.</EM>, whether they are prohibited, treated as
    ordinary path
    segments or interpreted in accordance with the relative URL
    specification [<A HREF="#[7]">7</A>];
   </LI>
   <LI>define any limits of the implementation, including limits on path or
    search string lengths, and limits on the volume of header data the server
    will parse.
   </LI><!-- ##### Move the field resolution/translation para below here -->
  </OL>
  <P>
   Servers MAY generate the Script-URI in
   any way from the client URI,
   or from any other data (but the behaviour SHOULD be documented).
  </P>
  <P>
   For non-parsed header (NPH) scripts (see
   <A HREF="#7.1">section 7.1</A>), servers SHOULD
   attempt to ensure that the script input comes directly from the
   client, with minimal buffering. For all scripts the data will be
   as supplied by the client.
  </P>

  <H3>
   <A NAME="8.3">
    8.3. Summary of Metavariables
   </A>
  </H3>
  <P>
   Servers MUST provide the following metavariables to
   scripts. See the individual descriptions for exceptions and semantics.
  </P><!--#if expr="! 
$GUI" -->
  <P></P><!--#endif -->
  <PRE>
  CONTENT_LENGTH (section <A HREF="#6.1.2">6.1.2</A>)
  CONTENT_TYPE (section <A HREF="#6.1.3">6.1.3</A>)
  GATEWAY_INTERFACE (section <A HREF="#6.1.4">6.1.4</A>)
  PATH_INFO (section <A HREF="#6.1.6">6.1.6</A>)
  QUERY_STRING (section <A HREF="#6.1.8">6.1.8</A>)
  REMOTE_ADDR (section <A HREF="#6.1.9">6.1.9</A>)
  REQUEST_METHOD (section <A HREF="#6.1.13">6.1.13</A>)
  SCRIPT_NAME (section <A HREF="#6.1.14">6.1.14</A>)
  SERVER_NAME (section <A HREF="#6.1.15">6.1.15</A>)
  SERVER_PORT (section <A HREF="#6.1.16">6.1.16</A>)
  SERVER_PROTOCOL (section <A HREF="#6.1.17">6.1.17</A>)
  SERVER_SOFTWARE (section <A HREF="#6.1.18">6.1.18</A>)
  </PRE>
  <P>
   Servers SHOULD define the following metavariables for scripts.
   See the individual descriptions for exceptions and semantics.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  AUTH_TYPE (section <A HREF="#6.1.1">6.1.1</A>)
  REMOTE_HOST (section <A HREF="#6.1.10">6.1.10</A>)
  </PRE>
  <P>
   In addition, servers SHOULD provide
   metavariables for all fields present
   in the HTTP request header, with the exception of those involved with
   access control. Servers MAY at their discretion provide
   metavariables for access control fields.
  </P>
  <P>
   Servers MAY define the following
   metavariables. See the individual
   descriptions for exceptions and semantics.
  </P><!--#if expr="! $GUI" -->
  <P></P><!--#endif -->
  <PRE>
  PATH_TRANSLATED (section <A HREF="#6.1.7">6.1.7</A>)
  REMOTE_IDENT (section <A HREF="#6.1.11">6.1.11</A>)
  REMOTE_USER (section <A HREF="#6.1.12">6.1.12</A>)
  </PRE>
  <P>
   Servers MAY
   at their discretion define additional implementation-specific
   extension metavariables
   provided their names do not
   conflict with defined header field names.
Implementation-specific 2139 metavariable names SHOULD 2140 be prefixed with "X_" (<EM>e.g.</EM>, 2141 "X_DBA") to avoid the potential for such conflicts. 2142 </P> 2143 2144 <H2> 2145 <A NAME="9.0"> 2146 9. 2147 Script Implementation 2148 </A> 2149 </H2> 2150 <P> 2151 This section defines the requirements and recommendations for scripts 2152 that are intended to function in a CGI/1.1 environment. It is intended 2153 primarily as a reference for script authors, but server implementors 2154 should be familiar with these issues as well. 2155 </P> 2156 2157 <H3> 2158 <A NAME="9.1"> 2159 9.1. Requirements for Scripts 2160 </A> 2161 </H3> 2162 <P> 2163 Scripts using the parsed-header method to communicate with servers 2164 MUST supply a response header to the server. 2165 (See <A HREF="#7.0">section 7</A>.) 2166 </P> 2167 <P> 2168 Scripts using the NPH method to communicate with servers MUST 2169 provide complete HTTP responses, and MUST use the value of the 2170 SERVER_PROTOCOL metavariable 2171 to determine the appropriate format. 2172 (See <A HREF="#7.1">section 7.1</A>.) 2173 </P> 2174 <P> 2175 Scripts MUST check the value of the REQUEST_METHOD 2176 metavariable in order 2177 to provide an appropriate response. 2178 (See <A HREF="#6.1.13">section 6.1.13</A>.) 2179 </P> 2180 <P> 2181 Scripts MUST be prepared to handle URL-encoded values in 2182 metavariables. 2183 In addition, they MUST recognise both "+" and "%20" in URL-encoded 2184 quantities as representing the space character. 2185 (See <A HREF="#3.1">section 3.1</A>.) 2186 </P> 2187 <P> 2188 Scripts MUST ignore leading zeros in the major and minor version numbers 2189 in the GATEWAY_INTERFACE 2190 metavariable value. (See 2191 <A HREF="#6.1.4">section 6.1.4</A>.) 2192 </P> 2193 <P> 2194 When processing requests that include a 2195 message-body, scripts 2196 MUST NOT read more than CONTENT_LENGTH bytes from the input stream. 2197 (See sections <A HREF="#6.1.2">6.1.2</A> and <A HREF="#6.2">6.2</A>.) 
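</P>
<P>
The last three requirements above can be illustrated with a minimal
Python sketch. It is not part of the interface definition: the function
names <SAMP>read_body</SAMP>, <SAMP>decode_form</SAMP> and
<SAMP>interface_version</SAMP> are hypothetical, and the sketch assumes
that the metavariables arrive as a mapping of strings and that the
message-body arrives on a binary input stream.
</P>
<PRE>
```python
from urllib.parse import parse_qs


def read_body(environ, stream):
    """Read at most CONTENT_LENGTH bytes from the input stream."""
    raw = environ.get("CONTENT_LENGTH", "")
    try:
        # A negative value is clamped to zero so that read() is never
        # asked to consume the whole stream.
        length = max(int(raw), 0)
    except ValueError:
        # Absent, empty or non-numeric: treat as "no message-body".
        length = 0
    return stream.read(length)


def decode_form(data):
    """Decode URL-encoded text; "+" and "%20" both become a space."""
    return parse_qs(data, keep_blank_values=True)


def interface_version(environ):
    """Return (major, minor) from GATEWAY_INTERFACE.

    Assumes a well-formed value such as "CGI/1.1"; int() discards any
    leading zeros, so "CGI/01.01" compares equal to "CGI/1.1".
    """
    _, _, version = environ.get("GATEWAY_INTERFACE", "").partition("/")
    major, _, minor = version.partition(".")
    return int(major), int(minor)
```
</PRE>
<P>
In a script running on a system that uses environment variables, these
helpers might be called as
<SAMP>read_body(os.environ, sys.stdin.buffer)</SAMP> and
<SAMP>interface_version(os.environ)</SAMP>; that calling convention is
an assumption of this sketch.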
2198 </P> 2199 2200 <H3> 2201 <A NAME="9.2"> 2202 9.2. Recommendations for Scripts 2203 </A> 2204 </H3> 2205 <P> 2206 Servers may interrupt or terminate script execution at any time 2207 and without warning, so scripts SHOULD be prepared to deal with 2208 abnormal termination. 2209 </P> 2210 <P> 2211 Scripts MUST 2212 reject requests 2213 made using methods that they do not support 2214 with error '405 Method Not Allowed'. 2215 If the script does 2216 not intend 2217 to process the PATH_INFO data, then it SHOULD reject the request with 2218 '404 Not 2219 Found' if PATH_INFO is not NULL. 2220 </P> 2221 <P> 2222 If a script is processing the output of a form, it SHOULD 2223 verify that the CONTENT_TYPE 2224 is "<SAMP>application/x-www-form-urlencoded</SAMP>" [<A HREF="#[2]">2</A>] 2225 or whatever other media type is expected. 2226 </P> 2227 <P> 2228 Scripts parsing PATH_INFO, 2229 PATH_TRANSLATED, or SCRIPT_NAME 2230 SHOULD be careful 2231 of void path segments ("<SAMP>//</SAMP>") and special path segments 2232 (<SAMP>"."</SAMP> and 2233 <SAMP>".."</SAMP>). Such segments SHOULD either be removed from the path before 2234 use in OS 2235 system calls, or the request SHOULD be rejected with 2236 '404 Not Found'. 2237 </P> 2238 <P> 2239 As it is impossible for 2240 scripts to determine the client URI that 2241 initiated a 2242 request without knowledge of the specific server in 2243 use, the script SHOULD NOT return "<SAMP>text/html</SAMP>" 2244 documents containing 2245 relative URL links without including a "<SAMP>&lt;BASE&gt;</SAMP>" 2246 tag in the document. 2247 </P> 2248 <P> 2249 When returning header fields, 2250 scripts SHOULD try to send the CGI 2251 header fields (see section 2252 <A HREF="#7.2">7.2</A>) as soon as possible, and 2253 SHOULD send them 2254 before any HTTP header fields. This may 2255 help reduce the server's memory requirements. 2256 </P> 2257 2258 <H2> 2259 <A NAME="10.0"> 2260 10. 
System Specifications 2261 </A> 2262 </H2> 2263 2264 <H3> 2265 <A NAME="10.1"> 2266 10.1. AmigaDOS 2267 </A> 2268 </H3> 2269 <P> 2270 The implementation of the CGI on an AmigaDOS operating system platform 2271 SHOULD use environment variables as the mechanism of providing 2272 request metadata to CGI scripts. 2273 </P> 2274 <DL> 2275 <DT><STRONG>Environment variables</STRONG> 2276 </DT> 2277 <DD> 2278 <P> 2279 These are accessed by the DOS library routine <SAMP>GetVar</SAMP>. The 2280 flags argument SHOULD be 0. Case is ignored, but upper case is 2281 recommended for compatibility with case-sensitive systems. 2282 </P> 2283 </DD> 2284 <DT><STRONG>The current working directory</STRONG> 2285 </DT> 2286 <DD> 2287 <P> 2288 The current working directory for the script is set to the directory 2289 containing the script. 2290 </P> 2291 </DD> 2292 <DT><STRONG>Character set</STRONG> 2293 </DT> 2294 <DD> 2295 <P> 2296 The US-ASCII character set is used for the definition of environment 2297 variable names and header 2298 field names; the newline (NL) sequence is LF; 2299 servers SHOULD also accept CR LF as a newline. 2300 </P> 2301 </DD> 2302 </DL> 2303 2304 <H3> 2305 <A NAME="10.2"> 2306 10.2. Unix 2307 </A> 2308 </H3> 2309 <P> 2310 The implementation of the CGI on a UNIX operating system platform 2311 SHOULD use environment variables as the mechanism of providing 2312 request metadata to CGI scripts. 2313 </P> 2314 <P> 2315 For Unix compatible operating systems, the following are defined: 2316 </P> 2317 <DL> 2318 <DT><STRONG>Environment variables</STRONG> 2319 </DT> 2320 <DD> 2321 <P> 2322 These are accessed by the C library routine <SAMP>getenv</SAMP>. 2323 </P> 2324 </DD> 2325 <DT><STRONG>The command line</STRONG> 2326 </DT> 2327 <DD> 2328 <P> 2329 This is accessed using the 2330 <SAMP>argc</SAMP> and <SAMP>argv</SAMP> 2331 arguments to <SAMP>main()</SAMP>. The words have any characters 2332 that 2333 are 'active' in the Bourne shell escaped with a backslash. 
2334 If the value of the QUERY_STRING 2335 metavariable 2336 contains an unencoded equals-sign '=', then the command line 2337 SHOULD NOT be used by the script. 2338 </P> 2339 </DD> 2340 <DT><STRONG>The current working directory</STRONG> 2341 </DT> 2342 <DD> 2343 <P> 2344 The current working directory for the script 2345 SHOULD be set to the directory 2346 containing the script. 2347 </P> 2348 </DD> 2349 <DT><STRONG>Character set</STRONG> 2350 </DT> 2351 <DD> 2352 <P> 2353 The US-ASCII character set is used for the definition of environment 2354 variable names and header field names; the newline (NL) sequence is LF; 2355 servers SHOULD also accept CR LF as a newline. 2356 </P> 2357 </DD> 2358 </DL> 2359 2360 <H2> 2361 <A NAME="11.0"> 2362 11. Security Considerations 2363 </A> 2364 </H2> 2365 2366 <H3> 2367 <A NAME="11.1"> 2368 11.1. Safe Methods 2369 </A> 2370 </H3> 2371 <P> 2372 As discussed in the security considerations of the HTTP 2373 specifications [<A HREF="#[3]">3</A>,<A HREF="#[8]">8</A>], the 2374 convention has been established that the 2375 GET and HEAD methods should be 'safe'; they should cause no 2376 side-effects and only have the significance of resource retrieval. 2377 </P> 2378 <P> 2379 CGI scripts are responsible for enforcing any HTTP security considerations 2380 [<A HREF="#[3]">3</A>,<A HREF="#[8]">8</A>] 2381 with respect to the protocol version level of the request and 2382 any side effects generated by the scripts on behalf of 2383 the server. Primary 2384 among these 2385 are the considerations of safe and idempotent methods. Idempotent 2386 requests are those that may be repeated an arbitrary number of times 2387 and produce side effects identical to a single request. 2388 </P> 2389 2390 <H3> 2391 <A NAME="11.2"> 2392 11.2. 
HTTP Header 2393 Fields Containing Sensitive Information 2394 </A> 2395 </H3> 2396 <P> 2397 Some HTTP header fields may carry sensitive information which the server 2398 SHOULD NOT pass on to the script unless explicitly configured to do 2399 so. For example, if the server protects the script using the 2400 "<SAMP>Basic</SAMP>" 2401 authentication scheme, then the client will send an 2402 "<SAMP>Authorization</SAMP>" 2403 header field containing a username and password. If the server, rather 2404 than the script, validates this information then the password SHOULD 2405 NOT be passed on to the script <EM>via</EM> the HTTP_AUTHORIZATION 2406 metavariable 2407 without careful consideration. 2408 This also applies to the 2409 Proxy-Authorization header field and the corresponding 2410 HTTP_PROXY_AUTHORIZATION 2411 metavariable. 2412 </P> 2413 2414 <H3> 2415 <A NAME="11.3"> 2416 11.3. Script 2417 Interference with the Server 2418 </A> 2419 </H3> 2420 <P> 2421 The most common implementation of CGI invokes the script as a child 2422 process using the same user and group as the server process. It 2423 SHOULD therefore be ensured that the script cannot interfere with the 2424 server process, its configuration, or documents. 2425 </P> 2426 <P> 2427 If the script is executed by calling a function linked in to the 2428 server software (either at compile-time or run-time) then precautions 2429 SHOULD be taken to protect the core memory of the server, or to 2430 ensure that untrusted code cannot be executed. 2431 </P> 2432 2433 <H3> 2434 <A NAME="11.4"> 2435 11.4. Data Length and Buffering Considerations 2436 </A> 2437 </H3> 2438 <P> 2439 This specification places no limits on the length of message-bodies 2440 presented to the script. Scripts should not assume that statically 2441 allocated buffers of any size are sufficient to contain the entire 2442 submission at one time. 
Use of a fixed length buffer without careful 2443 overflow checking may result in an attacker exploiting 'stack-smashing' 2444 or 'stack-overflow' vulnerabilities of the operating system. 2445 Scripts may spool large submissions to disk or other buffering media, 2446 but a rapid succession of large submissions may result in denial of 2447 service conditions. If the CONTENT_LENGTH of a message-body is larger 2448 than resource considerations allow, scripts should respond with an 2449 error status appropriate for the protocol version; potentially applicable 2450 status codes include '503 Service Unavailable' (HTTP/1.0 and HTTP/1.1), 2451 '413 Request Entity Too Large' (HTTP/1.1), and 2452 '414 Request-URI Too Long' (HTTP/1.1). 2453 </P> 2454 2455 <H3> 2456 <A NAME="11.5"> 2457 11.5. Stateless Processing 2458 </A> 2459 </H3> 2460 <P> 2461 The stateless nature of the Web makes each script execution and resource 2462 retrieval independent of all others even when multiple requests constitute a 2463 single conceptual Web transaction. Because of this, a script should not 2464 make any assumptions about the context of the user-agent submitting a 2465 request. In particular, scripts should examine data obtained from the client 2466 and verify that they are valid, both in form and content, before allowing 2467 them to be used for sensitive purposes such as input to other 2468 applications, commands, or operating system services. These uses 2469 include, but are not 2470 limited to: system call arguments, database writes, dynamically evaluated 2471 source code, and input to billing or other secure processes. It is important 2472 that applications be protected from invalid input regardless of whether 2473 the invalidity is the result of user error, logic error, or malicious action. 
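</P>
<P>
As an illustration of the validation described above, the following
Python sketch checks a client-supplied value against an allowlist
before it is passed to any sensitive operation. The names
<SAMP>SAFE_TOKEN</SAMP> and <SAMP>checked_token</SAMP> are hypothetical,
and the permitted characters and the 32-character limit are assumptions
chosen only for the example; a real script would derive them from what
the receiving application accepts.
</P>
<PRE>
```python
import re

# Hypothetical allowlist: ASCII letters, digits, "_" and "-", with an
# assumed maximum length of 32 characters.
SAFE_TOKEN = re.compile(r"[A-Za-z0-9_-]{1,32}")


def checked_token(value):
    """Return value unchanged if it is acceptable, else raise ValueError.

    The whole string must match, so shell metacharacters, path
    separators and "." or ".." segments are all rejected.
    """
    if not isinstance(value, str) or SAFE_TOKEN.fullmatch(value) is None:
        raise ValueError("rejected client-supplied value")
    return value
```
</PRE>
<P>
Rejecting anything outside a known-good set, rather than trying to
strip known-bad characters, treats user error, logic error and
malicious input in the same way.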
2474 </P> 2475 <P> 2476 Authors of scripts involved in multi-request transactions should be 2477 particularly cautious about validating the state information; 2478 undesirable effects may result from the substitution of dangerous 2479 values for portions of the submission which might otherwise be 2480 presumed safe. Subversion of this type occurs when alterations 2481 are made to data from a prior stage of the transaction that were 2482 not meant to be controlled by the client (<EM>e.g.</EM>, hidden 2483 HTML form elements, cookies, embedded URLs, <EM>etc.</EM>). 2484 </P> 2485 2486 <H2> 2487 <A NAME="12.0"> 2488 12. Acknowledgements 2489 </A> 2490 </H2> 2491 <P> 2492 This work is based on a draft published in 1997 by David R. Robinson, 2493 which in turn was based on the original CGI interface that arose out of 2494 discussions on the <EM>www-talk</EM> mailing list. In particular, 2495 Rob McCool, John Franks, Ari Luotonen, 2496 George Phillips and 2497 Tony Sanders deserve special recognition for their efforts in 2498 defining and implementing the early versions of this interface. 2499 </P> 2500 <P> 2501 This document has also greatly benefited from the comments and 2502 suggestions made by Chris Adie, Dave Kristol, 2503 Mike Meyer, David Morris, Jeremy Madea, 2504 Patrick M<SUP>c</SUP>Manus, Adam Donahue, 2505 Ross Patterson, and Harald Alvestrand. 2506 </P> 2507 2508 <H2> 2509 <A NAME="13.0"> 2510 13. References 2511 </A> 2512 </H2> 2513 <DL COMPACT> 2514 <DT><A NAME="[1]">[1]</A> 2515 </DT> 2516 <DD>Berners-Lee, T., 'Universal Resource Identifiers in WWW: A 2517 Unifying Syntax for the Expression of Names and Addresses of 2518 Objects on the Network as used in the World-Wide Web', RFC 1630, 2519 CERN, June 1994. 2520 <P> 2521 </P> 2522 </DD> 2523 <DT><A NAME="[2]">[2]</A> 2524 </DT> 2525 <DD>Berners-Lee, T. and Connolly, D., 'Hypertext Markup Language - 2526 2.0', RFC 1866, MIT/W3C, November 1995. 
2527 <P> 2528 </P> 2529 </DD> 2530 <DT><A NAME="[3]">[3]</A> 2531 </DT> 2532 <DD>Berners-Lee, T., Fielding, R. T. and Frystyk, H., 2533 'Hypertext Transfer Protocol -- HTTP/1.0', RFC 1945, MIT/LCS, 2534 UC Irvine, May 1996. 2535 <P> 2536 </P> 2537 </DD> 2538 2539 <DT><A NAME="[4]">[4]</A> 2540 </DT> 2541 <DD>Berners-Lee, T., Fielding, R., and Masinter, L., Editors, 2542 'Uniform Resource Identifiers (URI): Generic Syntax', RFC 2396, 2543 MIT, U.C. Irvine, Xerox Corporation, August 1998. 2544 <P> 2545 </P> 2546 </DD> 2547 2548 <DT><A NAME="[5]">[5]</A> 2549 </DT> 2550 <DD>Braden, R., Editor, 'Requirements for Internet Hosts -- 2551 Application and Support', STD 3, RFC 1123, IETF, October 1989. 2552 <P> 2553 </P> 2554 </DD> 2555 <DT><A NAME="[6]">[6]</A> 2556 </DT> 2557 <DD>Crocker, D.H., 'Standard for the Format of ARPA Internet Text 2558 Messages', STD 11, RFC 822, University of Delaware, August 1982. 2559 <P> 2560 </P> 2561 </DD> 2562 <DT><A NAME="[7]">[7]</A> 2563 </DT> 2564 <DD>Fielding, R., 'Relative Uniform Resource Locators', RFC 1808, 2565 UC Irvine, June 1995. 2566 <P> 2567 </P> 2568 </DD> 2569 <DT><A NAME="[8]">[8]</A> 2570 </DT> 2571 <DD>Fielding, R., Gettys, J., Mogul, J., Frystyk, H. and 2572 Berners-Lee, T., 'Hypertext Transfer Protocol -- HTTP/1.1', 2573 RFC 2068, UC Irvine, DEC, 2574 MIT/LCS, January 1997. 2575 <P> 2576 </P> 2577 </DD> 2578 <DT><A NAME="[9]">[9]</A> 2579 </DT> 2580 <DD>Freed, N. and Borenstein, N., 'Multipurpose Internet Mail 2581 Extensions (MIME) Part Two: Media Types', RFC 2046, Innosoft, 2582 First Virtual, November 1996. 2583 <P> 2584 </P> 2585 </DD> 2586 <DT><A NAME="[10]">[10]</A> 2587 </DT> 2588 <DD>Mockapetris, P., 'Domain Names - Concepts and Facilities', 2589 STD 13, RFC 1034, ISI, November 1987. 2590 <P> 2591 </P> 2592 </DD> 2593 <DT><A NAME="[11]">[11]</A> 2594 </DT> 2595 <DD>St. Johns, M., 'Identification Protocol', RFC 1413, US 2596 Department of Defense, February 1993. 
2597 <P> 2598 </P> 2599 </DD> 2600 <DT><A NAME="[12]">[12]</A> 2601 </DT> 2602 <DD>'Coded Character Set -- 7-bit American Standard Code for 2603 Information Interchange', ANSI X3.4-1986. 2604 <P> 2605 </P> 2606 </DD> 2607 <DT><A NAME="[13]">[13]</A> 2608 </DT> 2609 <DD>Hinden, R. and Deering, S., 2610 'IP Version 6 Addressing Architecture', RFC 2373, 2611 Nokia, Cisco Systems, 2612 July 1998. 2613 <P> 2614 </P> 2615 </DD> 2616 </DL> 2617 2618 <H2> 2619 <A NAME="14.0"> 2620 14. Authors' Addresses 2621 </A> 2622 </H2> 2623 <ADDRESS> 2624 <P> 2625 Ken A L Coar 2626 <BR> 2627 MeepZor Consulting 2628 <BR> 2629 7824 Mayfaire Crest Lane, Suite 202 2630 <BR> 2631 Raleigh, NC 27615-4875 2632 <BR> 2633 U.S.A. 2634 </P> 2635 <P> 2636 Tel: +1 (919) 254.4237 2637 <BR> 2638 Fax: +1 (919) 254.5250 2639 <BR> 2640 Email: 2641 <A 2642 HREF="mailto:Ken.Coar@Golux.Com" 2643 ><SAMP>Ken.Coar@Golux.Com</SAMP></A> 2644 </P> 2645 </ADDRESS> 2646 <ADDRESS> 2647 <P> 2648 David Robinson 2649 <BR> 2650 E*TRADE UK Ltd 2651 <BR> 2652 Mount Pleasant House 2653 <BR> 2654 2 Mount Pleasant 2655 <BR> 2656 Huntingdon Road 2657 <BR> 2658 Cambridge CB3 0RN 2659 <BR> 2660 UK 2661 </P> 2662 <P> 2663 Tel: +44 (1223) 566926 2664 <BR> 2665 Fax: +44 (1223) 506288 2666 <BR> 2667 Email: 2668 <A 2669 HREF="mailto:drtr@etrade.co.uk" 2670 ><SAMP>drtr@etrade.co.uk</SAMP></A> </P> 2671 </ADDRESS> 2672 2673 </BODY> 2674</HTML> 2675