Samba Architecture
------------------

First preliminary version Dan Shearer Nov 97
Quickly scrabbled together from odd bits of mail and memory. Please update.

This document gives a general overview of how Samba works
internally. The Samba Team has tried to come up with a model which is
the best possible compromise between elegance, portability, security
and the constraints imposed by the very messy SMB and CIFS
protocols.

It also tries to answer some of the frequently asked questions such as:

	* Is Samba secure when running on Unix? The xyz platform?
	  What about the root privileges issue?

	* Pros and cons of multithreading in various parts of Samba

	* Why not have a separate process for name resolution, WINS,
	  and browsing?


Multithreading and Samba
------------------------

People sometimes tout threads as a uniformly good thing. They are very
nice in their place but are quite inappropriate for smbd. nmbd is
another matter, and multi-threading it would be very nice.

The short version is that smbd is not multithreaded, and alternative
servers that take this approach under Unix (such as Syntax, at the
time of writing) suffer tremendous performance penalties and are less
robust. nmbd is not threaded either, but this is because it is not
possible to do it while keeping code consistent and portable across 35
or more platforms. (This drawback also applies to threading smbd.)

The longer version is that there are very good reasons for not making
smbd multi-threaded. Multi-threading would actually make Samba much
slower, less scalable, less portable and much less robust. The fact
that we use a separate process for each connection is one of Samba's
biggest advantages.
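
The robustness side of the process-per-connection argument can be
illustrated with a minimal sketch, assuming a POSIX system. This is not
Samba code: serve_in_child(), conn_handler, child_result, handler_ok()
and handler_crash() are hypothetical names made up for the example. Each
"client" is handled in its own child process, so a handler that crashes
takes down only that child. The parent waits for the child here only to
keep the sketch short; a real server would carry on accepting
connections instead.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-connection handler type; not a Samba API. */
typedef int (*conn_handler)(int client_id);

/* How the child that served one client ended. */
typedef struct {
    int exited; /* 1 if the child exited normally, 0 if a signal killed it */
    int code;   /* exit status if exited, otherwise the signal number */
} child_result;

/*
 * Run one client's handler in its own child process and report how the
 * child ended. Returns 0 on success, -1 if fork() or waitpid() failed.
 */
int serve_in_child(conn_handler handler, int client_id, child_result *res)
{
    pid_t pid;
    int status;

    assert(handler != NULL && res != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: serve this one client, then leave without returning
         * into the parent's code path. */
        _exit(handler(client_id) & 0xff);
    }

    /* Parent: collect the child, retrying if a signal interrupts us. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    if (WIFEXITED(status)) {
        res->exited = 1;
        res->code = WEXITSTATUS(status);
    } else if (WIFSIGNALED(status)) {
        res->exited = 0;
        res->code = WTERMSIG(status);
    } else {
        return -1;
    }
    return 0;
}

/* A well-behaved client session (illustrative only). */
int handler_ok(int client_id)
{
    (void)client_id;
    return 0;
}

/* A session that hits a fatal bug; abort() raises SIGABRT. */
int handler_crash(int client_id)
{
    (void)client_id;
    abort();
}
```

Calling serve_in_child() with handler_crash reports a child killed by
SIGABRT, and a later call with handler_ok still succeeds, because the
crash never touched the parent's address space. In a threaded server the
same fault would have ended every session at once.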

Threading smbd
--------------

A few problems that would arise from a threaded smbd are:

0) It's not just a matter of creating threads instead of processes:
   you must also check every variable to see whether it has to be
   thread specific (currently they would all be global).

1) If one thread dies (e.g. a seg fault) then all threads die. We can
immediately throw robustness out the window.

2) Many of the system calls we make are blocking. Non-blocking
equivalents of many calls are either not available or are awkward (and
slow) to use. So while we block in one thread all clients are
waiting. Imagine if one share is a slow NFS filesystem and the others
are fast: we will end up slowing all clients to the speed of NFS.

3) You can't run as a different uid in different threads. This means
we would have to switch uid/gid on _every_ SMB packet. It would be
horrendously slow.

4) The per-process file descriptor limit would mean that we could only
support a limited number of clients.

5) We couldn't use the system locking calls, as the locking context of
fcntl() is a process, not a thread.

Threading nmbd
--------------

This would be ideal, but gets sunk by portability requirements.

Andrew tried to write a test threads library for nmbd that used only
ANSI C constructs (using setjmp and longjmp). Unfortunately some OSes
defeat this by restricting longjmp to calling addresses that are
shallower than the current address on the stack (apparently AIX does
this). This makes a truly portable threads library impossible. So to
support all our current platforms we would have to code nmbd both with
and without threads, and as the real aim of threads is to make the
code clearer we would not have gained anything. (It is a myth that
threads make things faster.
Threading is like recursion: it can make
things clear but the same thing can always be done faster by some
other method.)

Chris tried to spec out a general design that would abstract threading
vs separate processes (vs other methods?) and make them accessible
through some general API. This doesn't work because of the data
sharing requirements of the protocol (packets in the future depending
on packets now, etc.). At least, the code would work but would be very
clumsy, and besides the fork() type model would never work on Unix.
(Is there an OS that it would work on, for nmbd?)

A fork() is cheap, but not nearly cheap enough to do on every UDP
packet that arrives. Having a pool of processes is possible but is
nasty to program cleanly due to the enormous amount of shared data (in
complex structures) between the processes. We can't rely on each
platform having a shared memory system.

nmbd Design
-----------

Originally Andrew used recursion to simulate a multi-threaded
environment, which used the stack enormously and made for really
confusing debugging sessions. Luke Leighton rewrote it to use a
queuing system that keeps state information on each packet. The
first version used a single structure which was shared by all the
pending states. Because this structure was initialised by adding
arguments, it got pretty messy as the functionality developed.
So, it was replaced with a higher-order function
and a pointer to a user-defined memory block. This suddenly
made things much simpler: large numbers of functions could be
made static, and modularised. This is the same principle as used
in NT's kernel, and achieves the same effect as threads, but in
a single process.

Then Jeremy rewrote nmbd. The packet data in nmbd isn't what's on the
wire. It's a nice format that is very amenable to processing but still
keeps the idea of a distinct packet.
See "struct packet_struct" in
nameserv.h. It has all the detail but none of the on-the-wire
mess. This makes it ideal for using in disk or memory-based databases
for browsing and WINS support.

nmbd now consists of a series of modules. It...


Samba Design and Security
-------------------------

Why Isn't nmbd Multiple Daemons?
--------------------------------
