
For the lack of a better place to put them, this file will contain
notes on some of the more intricate details of GEOM.

-----------------------------------------------------------------------
Locking of bio_children and bio_inbed

bio_children is used by g_std_done() and g_clone_bio() to keep track
of children cloned off a request.  g_clone_bio() increments the
bio_children counter each time it is called, and g_std_done()
increments bio_inbed on every call and, if the two counters are then
equal, calls g_io_deliver() on the parent bio.

The general assumption is that g_clone_bio() is called only in
the g_down thread, and g_std_done() only in the g_up thread, and
therefore the two fields do not generally need locking.  These
restrictions are not enforced by the code, but they should be
violated only with great care.

It is the responsibility of the class implementation to avoid the
following race condition:  A class intends to split a bio into two
children.  It clones the bio and requests I/O on the first child.
This I/O operation completes before the second child is cloned,
and g_std_done() sees both counters equal to 1 and finishes off
the bio.

There is no race present in the common case where the bio is split
into multiple parts in the class start method and the I/O is requested
on another GEOM class below:  There is only one g_down thread and
the class below will not get its start method run until we return
from our start method, and consequently the I/O cannot complete
prematurely.

In all other cases, this race needs to be mitigated, for instance
by cloning all children before I/O is requested on any of them.

Notice that cloning an "extra" child and calling g_std_done() on
it directly opens another race, since the assumption is that
g_std_done() is called only in the g_up thread.
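
The counter race described above can be illustrated with a small
userland model.  This is only a sketch: struct model_bio,
model_clone(), model_done() and deliveries_after_first_child() are
hypothetical names invented for this note, not kernel interfaces,
and the model is single-threaded; it only replays the two orderings
of events.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two counters; this is not struct bio. */
struct model_bio {
	unsigned children;	/* models bio_children: clones made */
	unsigned inbed;		/* models bio_inbed: clones completed */
	unsigned delivered;	/* times the parent was finished off */
};

/* Models the counting side of g_clone_bio(): one more child exists. */
static void
model_clone(struct model_bio *parent)
{
	parent->children++;
}

/*
 * Models the counting side of g_std_done(): one more child is done,
 * and if every child cloned so far is done, the parent is finished.
 */
static void
model_done(struct model_bio *parent)
{
	/* Sanity check: a child cannot complete before it is cloned. */
	assert(parent->inbed < parent->children);
	parent->inbed++;
	if (parent->inbed == parent->children)
		parent->delivered++;
}

/*
 * Split a parent into two children and report how many times the
 * parent has been delivered at the moment the first child completes.
 * Any value other than 0 means the parent was finished prematurely.
 */
static unsigned
deliveries_after_first_child(bool clone_all_first)
{
	struct model_bio parent = { 0, 0, 0 };

	model_clone(&parent);			/* child 1 */
	if (clone_all_first)
		model_clone(&parent);		/* child 2, cloned up front */
	model_done(&parent);			/* child 1 completes */
	return (parent.delivered);
}
```

In this model deliveries_after_first_child(false) returns 1, because
the counters are both 1 when the first child completes, while
deliveries_after_first_child(true) returns 0, because the second
clone has already raised the children count to 2.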

-----------------------------------------------------------------------
Statistics collection

Statistics collection can run at three levels controlled by the
"kern.geom.collectstats" sysctl.

At level zero, only the number of transactions started and completed
are counted, and this is only because GEOM internally uses the difference
between these two as a sanity check.

At level one we collect the full statistics.  Higher levels are
reserved for future use.  Statistics are collected independently
on both the provider and the consumer, because multiple consumers
can be active against the same provider at the same time.

The statistics collection falls in two parts:

The first and simpler part consists of g_io_request() timestamping
the struct bio when the request is first started and g_io_deliver()
updating the consumer's and provider's statistics based on fields in
the bio when it is completed.  There are no concurrency or locking
concerns in this part.  The statistics collected consist of the number
of requests, number of bytes, number of ENOMEM errors, number of
other errors and duration of the request for each of the three
major request types: BIO_READ, BIO_WRITE and BIO_DELETE.

The second part is trying to keep track of the "busy%".

If in g_io_request() we find that there are no outstanding requests
(based on the counters for scheduled and completed requests being
equal), we set a timestamp in the "wentbusy" field.  Since there
are no outstanding requests, and as long as there is only one thread
pushing the g_down queue, we cannot possibly conflict with
g_io_deliver() until we ship the current request down.

In g_io_deliver() we calculate the delta-T from wentbusy and add this
to the "bt" field, and set wentbusy to the current timestamp.  We
take care to do this before we increment the "requests completed"
counter, since that prevents g_io_request() from touching the
"wentbusy" timestamp concurrently.

The statistics data is made available to userland through the use
of a special allocator (in geom_stats.c) which, through a device,
allows userland to mmap(2) the pages containing the statistics data.
In order to indicate to userland when the data in a statistics
structure might be inconsistent, g_io_deliver() atomically sets a
flag "updating" and resets it when the structure is again consistent.
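
The busy-time bookkeeping described above can be sketched as a
userland model.  The names struct model_stats, model_request(),
model_deliver() and model_busy_ticks() are hypothetical and exist
only for this illustration; timestamps are plain integer ticks, and
the "updating" flag and all threading are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the busy-time fields; not the kernel struct. */
struct model_stats {
	uint64_t started;	/* requests scheduled */
	uint64_t completed;	/* requests completed */
	uint64_t wentbusy;	/* tick when busy time was last sampled */
	uint64_t bt;		/* accumulated busy ticks */
};

/* Models the g_io_request() side: note when we go from idle to busy. */
static void
model_request(struct model_stats *st, uint64_t now)
{
	if (st->started == st->completed)
		st->wentbusy = now;	/* nothing outstanding: went busy */
	st->started++;
}

/*
 * Models the g_io_deliver() side: account the time since wentbusy,
 * then move wentbusy forward.  The completed counter is bumped last,
 * mirroring the ordering described in the text.
 */
static void
model_deliver(struct model_stats *st, uint64_t now)
{
	assert(st->completed < st->started);
	st->bt += now - st->wentbusy;
	st->wentbusy = now;
	st->completed++;
}

/* Replays a short trace with one idle gap and returns the busy ticks. */
static uint64_t
model_busy_ticks(void)
{
	struct model_stats st = { 0, 0, 0, 0 };

	model_request(&st, 10);		/* idle -> busy at tick 10 */
	model_request(&st, 12);		/* already busy, wentbusy kept */
	model_deliver(&st, 15);		/* bt = 5 */
	model_deliver(&st, 20);		/* bt = 10, now idle */
	model_request(&st, 30);		/* idle -> busy at tick 30 */
	model_deliver(&st, 34);		/* bt = 14 */
	return (st.bt);
}
```

For this trace model_busy_ticks() returns 14: the spans from tick 10
to 20 and from tick 30 to 34 count as busy, and the idle gap from
tick 20 to 30 does not.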