This document describes the working of the GNU Wget Test Suite.

Install Instructions:
================================================================================

This Test Suite exploits the Parallel Test Harness available in GNU Autotools.
Since it uses features from a relatively recent version of Autotools, the
minimum required version of Automake has been bumped up to 1.11.
Run the './configure' command to generate the Makefile and then run 'make check'
to execute the Test Suite. Use the '-j n' option with 'make check' to execute
n tests simultaneously.

Structure:
================================================================================

    * server: This package contains custom, programmatically configurable
    servers (both HTTP and FTP) for testing Wget. The HTTP server runs an
    instance of Python's http.server module. The FTP server is to be
    implemented.

    * test: This package contains the test case classes for HTTP and FTP. The
    test case classes include methods for initializing and cleaning up the
    test environment.

    * Test-Proto.py: This is a prototype Test Case file. The file defines all
    the acceptable elements and their uses. Typically, one must copy this file
    and edit it for writing Test Cases.

    * exc: This package contains custom exception classes used in this test
    suite.

    * conf: This package contains the configuration classes with which the
    servers are configured.

    * misc: This package contains several helper modules used in this test
    suite.
        - colour_terminal.py: A custom module for printing coloured output to
        the terminal. Currently it only supports 4 colours in a *nix
        environment.
        - wget_file.py: Module which contains WgetFile, which is a file data
        container object.

Working:
================================================================================

The Test Files are valid Python scripts and their default file mode is 755.
A single Test must be invoked in the following manner:
$ python3 <Name of Test File> OR
$ ./<Name of Test File>
The script will then initialize the various elements and pass them to an object
of the respective Test Class. A directory with the name <Test name>-test will be
created and the PWD will be changed to this directory. The server is then
spawned with the required configuration elements. A blocking call to Wget is
made with the command line arguments specified in the Test Case along with the
list of URLs that it must download. The server is killed once Wget returns and
the following checks are used to determine the pass/fail status of the test:
    * Return Code: The Exit Code of Wget is matched against the expected Exit
    Code as mentioned in the Test Case File.
    * Downloaded Files: Check whether the expected downloaded files exist on
    disk.
    * File Content: Test whether the file contents were correctly downloaded by
    Wget and not corrupted mid-way.
    * Excess Files: Check to see whether any unexpected files were downloaded
    by Wget.

Exit Codes:
================================================================================

Following is a list of Exit Status Codes for the tests:
*   0   Test Successful
*  66   Errors/Warnings Reported by Thread Sanitizer (If built with -fsanitize)
*  77   Test Skipped
*  99   Hard Error
* 100   Test Failed

Tests are skipped when they are either not supported by the platform, or Wget
is not compiled with support for the feature under test. Skipping tests has not
yet been implemented.

Hard Errors occur when there are problems with the Environment code. Hard
Error reporting is currently not enabled and all errors are reported as
failures.

All exceptions should ideally be handled gracefully.
If you see any unhandled exceptions, please file a bug report at
<bug-wget@gnu.org>.

Environment Variables:
================================================================================

* SERVER_WAIT: Set this environment variable to the number of seconds the test
  should sleep between invoking the server and calling the Wget executable.
  This is used when one would like to test a different version of the
  executable or to run the test through external utilities like gdb and
  valgrind.
* NO_CLEANUP: Do not remove the temporary files created by the test.
  This will prevent the ${testname}-test directory from being deleted.
* VALGRIND_TESTS: If this variable is set, the test suite will execute all the
  tests through valgrind's memcheck tool.


File Structure:
================================================================================

The test case files are Python scripts. It is believed that Python is a simple
yet elegant language and should be easy for everyone to comprehend. This test
suite is written with the objective of making it easy to write new tests. The
structure has been kept as intuitive as possible and should not require much
effort to get accustomed to.

All Test Files MUST begin with the following lines:
#!/usr/bin/python3
from sys import exit
from WgetTest import {HTTPTest|FTPTest}
from misc.wget_file import WgetFile

It is recommended that a small description of the Test Case is provided next.
This would be very helpful to future contributors.
Next comes the constant TEST_NAME, which defines the name of the Test.

Each File in the Test must be represented as a WgetFile object. The WgetFile
Class has the following prototype:
WgetFile (str name, str contents, str timestamp, dict rules)
Only name is a mandatory parameter; one may pass only those parameters
that are required by the File object.
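
As a rough illustration of the prototype above, the sketch below builds two
File objects. The WgetFile class shown here is only a stand-in written for this
example so that the snippet runs on its own; it is not the class from
misc.wget_file, and its default values are assumptions. The file names, the
user name and the password are likewise invented.

```python
# Stand-in for misc.wget_file.WgetFile, written only for illustration.
# The real class may differ; the default values below are assumptions.
class WgetFile:
    def __init__(self, name, contents="", timestamp=None, rules=None):
        self.name = name
        self.contents = contents
        self.timestamp = timestamp
        # Use a fresh dictionary per object instead of a shared default.
        self.rules = rules if rules is not None else {}


# Only the name is mandatory.
index = WgetFile("index.html")

# Contents and rules are supplied; the timestamp is left out.
secret = WgetFile(
    "secret.txt",
    "Top secret contents",
    rules={"Authentication": {"Type": "Basic", "User": "alice", "Pass": "s3cret"}},
)
```

In an actual Test File the class would be imported from misc.wget_file, as in
the mandatory header lines, rather than defined locally.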

The timestamp string should be a valid Unix Timestamp as defined in RFC xxxx.
The rules object is a dictionary, with the Rule Name as the key and the Rule
Data as the value. In most cases, the Rule Data is another dictionary.

Various variables used consistently across all tests are:
    * WGET_OPTIONS: The command line string passed to Wget upon invocation. This
    string may contain URLs, as in the case where in-URL authentication is
    used. Variable names written as {{var_name}} will be replaced by the
    contents of the variable self.var_name before being passed to Wget.
    * WGET_URLS: This is a list of filenames which will be appended as the URLs
    to Wget during invocation. This is a list of lists, where WGET_URLS[0]
    represents the list of Filenames called from Server[0], WGET_URLS[1] is a
    list of files downloaded from Server[1], etc.
    * Files: This variable defines the files that exist in the Server's
    filesystem. The Files variable is a list of lists of WgetFile objects.
    This means that Files[0] is a list of WgetFile objects that lie on
    Server[0], Files[1] a list of files on Server[1] and so on.
    * Existing_Files: This is a list of files that already exist in the
    directory from which Wget is invoked.
    * ExpectedReturnCode: The Exit Code expected to be returned by Wget after
    the test.
    * ExpectedDownloadedFiles: A list of files that are expected in the local
    directory after Wget has finished executing. Files that already existed
    before Wget was launched are not included automatically and must be
    mentioned again.
    * Request_List: An unordered list of Requests that each server must receive.
    This too is a list of lists and follows the same convention as others above.
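
The {{var_name}} replacement described for WGET_OPTIONS can be sketched as
follows. This is a minimal illustration and not the suite's own code: the
helper expand_options, the FakeTest class and its attributes server_addr and
user are all hypothetical names chosen for the example.

```python
import re


class FakeTest:
    """Hypothetical stand-in for a test object; the attributes are invented."""

    def __init__(self):
        self.server_addr = "localhost:8080"
        self.user = "alice"


def expand_options(template, test_obj):
    """Replace every {{name}} in template with str(test_obj.name)."""
    def lookup(match):
        # match.group(1) is the identifier between the double braces.
        return str(getattr(test_obj, match.group(1)))

    return re.sub(r"\{\{(\w+)\}\}", lookup, template)


WGET_OPTIONS = "--user={{user}} http://{{server_addr}}/File1"
print(expand_options(WGET_OPTIONS, FakeTest()))
# prints: --user=alice http://localhost:8080/File1
```

A name that does not exist on the object raises AttributeError in this sketch,
which makes a misspelled placeholder visible immediately.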

Both the HTTPTest and FTPTest modules have the same prototype:
{
    name,
    pre_hook,
    test_options,
    post_hook,
    protocols
}
name should be a string and is usually set from the TEST_NAME variable;
pre_hook, test_options and post_hook should be Python dict objects; and
protocols should be a list of protocols, like [HTTP, HTTPS].

Valid File Rules:
================================================================================

This section lists the currently supported File Rules and their structure.

    * Authentication: Used when a File must require Authorization for access.
    The value for this key is the following dictionary:
        |-->Type        : Basic|Digest|Both|Both_inline
        |-->User        : <Username>
        --->Pass        : <Password>

    * ExpectHeader    : The following Headers MUST exist in every Request for
    the File. The value for this key is a dictionary object where each header
    is represented as:
        |-->Header Name : <Header Data>

    * RejectHeader    : This list of Headers must NEVER occur in a request. It
    uses the same value format as ExpectHeader.

    * SendHeader      : This list of Headers will be sent in EVERY response to a
    request for the respective file. It follows the same value format as
    ExpectHeader.

    * Response        : The HTTP Response Code to send to a request for this
    File. The value is an Integer that represents a valid HTTP Response Code.

Pre Test Hooks:
================================================================================

The Pre-Test Hooks are executed just after starting the server and just before
spawning an instance of Wget. These are usually used for setting up the
Test Environment and Server Rules. The currently supported Pre-Test Hooks are:

    * ServerFiles     : A list of WgetFile objects that must exist on the Server
    * LocalFiles      : A list of WgetFile objects that exist locally on disk
    before Wget is executed.
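
To make the rule and hook structures above more concrete, here is a small
sketch of a rules dictionary and a pre-test dictionary written as plain Python
data. The header values, user name and password are invented for the example,
the helper unknown_rules is hypothetical and not part of the suite, and
(name, rules) tuples stand in for the WgetFile objects a real Test File would
use.

```python
# Rule names listed in the "Valid File Rules" section.
KNOWN_RULES = {
    "Authentication",
    "ExpectHeader",
    "RejectHeader",
    "SendHeader",
    "Response",
}


def unknown_rules(rules):
    """Return the sorted rule names in `rules` that are not listed above."""
    return sorted(set(rules) - KNOWN_RULES)


# Rule Name -> Rule Data; most Rule Data values are dictionaries themselves.
file1_rules = {
    "Authentication": {"Type": "Basic", "User": "alice", "Pass": "s3cret"},
    "ExpectHeader": {"Accept": "*/*"},
    "RejectHeader": {"Cookie": "session=expired"},
    "SendHeader": {"Content-Disposition": "attachment; filename=File1"},
    "Response": 200,
}

# A real Test File would list WgetFile objects here; tuples stand in for them.
pre_test = {
    "ServerFiles": [("File1", file1_rules)],
    "LocalFiles": [],
}

print(unknown_rules(file1_rules))                        # prints: []
print(unknown_rules({"Bogus": None, "Response": 404}))   # prints: ['Bogus']
```

A check of this kind is only a convenience for catching misspelled rule names
while writing a test; whether the suite itself rejects unknown names is not
stated in this document.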

Since pre_test is a dictionary, one may not assume that the hooks will be
executed in the same order as they are defined.

Test Options:
================================================================================

The test_options dictionary defines the commands to be used when the Test is
executed. The currently supported options are:

    * Urls            : A list of the filenames that Wget must attempt to
    download. The complete URL will be created and passed to Wget
    automatically. (alias URLs)
    * WgetCommands    : A string consisting of the various command line switches
    sent to Wget upon invocation. Any data placed between {{ }} in this string
    will be replaced with the contents of self.<data> before being passed to
    Wget. This is particularly useful for getting the hostname and port for a
    file. While all Download URLs are passed to Urls, a notable exception is
    when in-URL authentication is used. In such a case, the URL is specified in
    the WgetCommands string.

Post-Test Hooks:
================================================================================

These hooks are executed as soon as the call to Wget returns. The post-test
hooks are usually used to run checks on the data, files downloaded, return code,
etc. The following hooks are currently supported:

    * ExpectedRetcode : This is an integer value of the Return Code with which
    Wget is expected to exit. (alias ExpectedRetCode)
    * ExpectedFiles   : This is a list of WgetFile objects of the files that
    must exist locally on disk in the Test directory.
    * FilesCrawled    : This requires a list of the Requests that the server is
    expected to receive. The order is unimportant since it will vary on the
    parallel-wget branch. This hook is used in tests for Recursive mode to
    ensure that the website is traversed correctly.
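
The kind of check performed for ExpectedFiles (missing files, differing
contents and excess files) can be sketched as below. This is an illustration
under simplifying assumptions and not the suite's implementation: the function
check_expected_files is hypothetical, expected files are given as a plain
name-to-contents dictionary instead of WgetFile objects, and a temporary
directory stands in for the <Test name>-test directory.

```python
import os
import tempfile


def check_expected_files(directory, expected):
    """Compare the files in `directory` with `expected` (name -> contents).

    Returns a list of problem descriptions; an empty list means success.
    """
    problems = []
    found = set(os.listdir(directory))
    for name, contents in expected.items():
        if name not in found:
            problems.append("missing file: " + name)
            continue
        with open(os.path.join(directory, name)) as handle:
            if handle.read() != contents:
                problems.append("contents differ: " + name)
    # Anything left over was not expected to be downloaded.
    for name in sorted(found - set(expected)):
        problems.append("unexpected file: " + name)
    return problems


with tempfile.TemporaryDirectory() as test_dir:
    with open(os.path.join(test_dir, "File1"), "w") as handle:
        handle.write("Hello")
    with open(os.path.join(test_dir, "stray.tmp"), "w") as handle:
        handle.write("")
    result = check_expected_files(test_dir, {"File1": "Hello", "File2": "World"})

print(result)
# prints: ['missing file: File2', 'unexpected file: stray.tmp']
```

Collecting every problem before reporting, rather than stopping at the first
one, makes a failing test easier to diagnose from a single run.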

Writing New Tests:
================================================================================

See Test-Proto.py for an example of how to write Test Case files. The
recommended method for writing new Test Case files is to copy Test-Proto.py and
modify it to one's needs.

In case you require any functionality that is not currently provided by the
Rules and Hooks listed above, you should implement a new class in the conf
package. The file name doesn't matter (though it's better to give it an
appropriate name). The new rule or hook class should look like this:
============================================
from conf import rule


@rule()
class MyNewRule:
    def __init__(self, rule_arg):
        self.rule_arg = rule_arg
        # your rule initialization code goes here
============================================
from conf import hook


@hook()
class MyNewHook:
    def __init__(self, hook_arg):
        self.hook_arg = hook_arg
        # your hook initialization code goes here

    def __call__(self, test_obj):
        # your hook code goes here
        pass
============================================

Once a new Test File is created, it must be added to the TESTS variable in
Makefile.am. This way the Test will be executed on running a 'make check'.
If a Test is expected to fail on the current master branch, then the Test should
also be added to the XFAIL_TESTS variable. This will allow expected failures to
pass through. If a test mentioned in the XFAIL_TESTS variable passes, it gets
red-flagged as an XPASS. Currently, tests expected to fail under valgrind are
not explicitly marked as XFAIL. Tests failing under valgrind must always be
considered a blocking error.

Remember to always name the Test correctly using the TEST_NAME variable. This
is essential since a directory with the Test Name is created and a name clash
can cause synchronization problems when the Parallel Test Harness is used.
One can use the following command on Unix systems to check for TEST_NAME
clashes (the sort step is needed because uniq only detects adjacent
duplicates):
$ grep -r -h "TEST_NAME =" . | cut -c13- | sort | uniq -c -d

Work Remaining:
================================================================================

Some amount of work still remains to be done.
    * Errors in server-side checks need to be handled more explicitly
    * Support parallel-wget branch
    * Support to spawn multiple servers is already in place. Need to handle
    multiple requests to a server simultaneously. Use ThreadingMixIn.
    * SSL Tests. Use xyne's HTTPS server implementation
    * Complete support for FTP Tests
    * IRI Support. This shouldn't require much effort

Requirements:
================================================================================

1. Python >= 3.0
2. Automake >= 1.11