History log of /freebsd-10.1-release/sbin/hastd/
Revision	Date	Author	Comments
272461	03-Oct-2014	gjb	Copy stable/10@r272459 to releng/10.1 as part of the 10.1-RELEASE process. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
270911	01-Sep-2014	ngie	MFC r270433: Garbage collect libl dependency The application links and runs without libl Approved by: rpaulo (mentor) Phabric: D673 Submitted by: trociny
270910	01-Sep-2014	ngie	MFC r270117: Add -ll to LDADD to fix "make checkdpadd" Phabric: D622 Approved by: rpaulo (mentor)
262192	18-Feb-2014	jhb	MFC 261517,261520: Convert the license on files where I am the sole copyright holder to 2 clause BSD licenses.
260006	28-Dec-2013	trociny	MFC r257155, r257582, r259191, r259192, r259193, r259194, r259195, r259196: r257155: Make hastctl list command output current queue sizes. Reviewed by: pjd r257582 (pjd): Correct alignment. r259191: For memsync replication, hio_countdown is used not only as an indication when a request can be moved to done queue, but also for detecting the current state of memsync request. This approach has problems, e.g. leaking a request if memsync ack from the secondary failed, or racy usage of write_complete, which should be called only once per write request, but for memsync can be entered by local_send_thread and ggate_send_thread simultaneously. So the following approach is implemented instead: 1) Use hio_countdown only for counting components we are waiting to complete, i.e. initially it is always 2 for any replication mode. 2) To distinguish between "memsync ack" and "memsync fin" responses from the secondary, add and use hio_memsyncacked field. 3) write_complete() in component threads is called only before releasing hio_countdown (i.e. before the hio may be returned to the done queue). 4) Add and use hio_writecount refcounter to detect when write_complete() can be called in memsync case. Reported by: Pete French petefrench ingresso.co.uk Tested by: Pete French petefrench ingresso.co.uk r259192: Add some macros to make the code more readable (no functional changes). r259193: Fix compiler warnings. r259194: In remote_send_thread, if sending a request fails don't take the request back from the receive queue -- it might already be processed by remote_recv_thread, which led to crashes like below: (primary) Unable to receive reply header: Connection reset by peer. (primary) Unable to send request (Connection reset by peer): WRITE(954662912, 131072). (primary) Disconnected from kopusha:7772. (primary) Increasing localcnt to 1. (primary) Assertion failed: (old > 0), function refcnt_release, file refcnt.h, line 62.
Taking the request back was not necessary (it would properly be processed by the remote_recv_thread) and only complicated things. r259195: Send wakeup to threads waiting on empty queue before releasing the lock to decrease spurious wakeups. Submitted by: davidxu r259196: Check remote protocol version only for the first connection (when it is actually sent by the remote node). Otherwise it generated confusing "Negotiated protocol version 1" debug messages when processing the second connection.
259073	07-Dec-2013	peter	Hoist all the mergeinfo up to the root in preparation for enforcing merges to the root only. All MFC's were rerecorded to the root. Going forward, if an MFC includes mergeinfo, it will need to be made to the root and committed from the root. Merges with --ignore-ancestry or diff | patch can go anywhere. The mergeinfo in HEAD is in a bad state from years of neglect and manual tampering and this was branched into 10.x. This confuses the coalescing code and prevents it from doing its job. Approved by: re (gjb, implicit)
257468	31-Oct-2013	trociny	MFC r257154: Merging local and remote bitmaps must be protected by hr_amp lock. This is believed to fix hastd crashes, which might occur during synchronization, triggered by the failed assertion: Assertion failed: (amp->am_memtab[ext] > 0), function activemap_write_complete, file activemap.c, line 351. Approved by: re (glebius)
256281	10-Oct-2013	gjb	Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
255717	19-Sep-2013	trociny	Fix comments. Approved by: re (marius) MFC after: 3 days
255716	19-Sep-2013	trociny	When updating the map of dirty extents, most recently used extents are kept dirty to reduce the number of on-disk metadata updates. The sequence of operations is: 1) acquire the activemap lock; 2) update in-memory map; 3) if the list of keepdirty extents is changed, update on-disk metadata; 4) release the lock. On-disk updates are not frequent in comparison with in-memory updates, while require much more time. So situations are possible when one thread is updating on-disk metadata and another one is waiting for the activemap lock just to update the in-memory map. Improve this by introducing additional, on-disk map lock: when in-memory map is updated and it is detected that the on-disk map needs update too, the on-disk map lock is acquired and the on-memory lock is released before flushing the map. Reported by: Yamagi Burmeister yamagi.org Tested by: Yamagi Burmeister yamagi.org Reviewed by: pjd Approved by: re (marius) MFC after: 2 weeks
255714	19-Sep-2013	trociny	Use cv_broadcast() instead of cv_signal() when waking up threads waiting on an empty queue as the queue may have several consumers. Before the fix the following scenario was possible: 2 threads are waiting on empty queue, 2 threads are inserting simultaneously. The first inserting thread detects that the queue is empty and is going to send the signal, but before it sends the second thread inserts too. When the first sends the signal only one of the waiting threads receives it while the other one may wait forever. The scenario above is believed to be the cause of the observed cases, when ggate_recv_thread() was getting stuck on taking free request, while the free queue was not empty. Reviewed by: pjd Tested by: Yamagi Burmeister yamagi.org Approved by: re (marius) MFC after: 2 weeks
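The wakeup rule from r255714 can be sketched with plain POSIX threads. This is only an illustration under assumptions: the `sketch_queue` type, its fixed capacity and the function names are hypothetical, and `pthread_cond_broadcast()` stands in for the `cv_broadcast()` call named in the log. The point it shows is that an insert into an empty queue wakes every waiter, so a second consumer is not left asleep when two producers insert back to back.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SKETCH_QUEUE_CAP 16	/* hypothetical fixed capacity */

struct sketch_queue {
	pthread_mutex_t	lock;
	pthread_cond_t	cond;
	int		items[SKETCH_QUEUE_CAP];
	size_t		head;	/* index of the oldest item */
	size_t		count;	/* number of queued items */
};

static void
sketch_queue_init(struct sketch_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->cond, NULL);
	q->head = 0;
	q->count = 0;
}

/* Insert one item; if the queue was empty, wake all waiting consumers. */
static void
sketch_queue_insert(struct sketch_queue *q, int item)
{
	int wasempty;

	pthread_mutex_lock(&q->lock);
	assert(q->count < SKETCH_QUEUE_CAP);
	wasempty = (q->count == 0);
	q->items[(q->head + q->count) % SKETCH_QUEUE_CAP] = item;
	q->count++;
	/*
	 * Broadcast rather than signal: several consumers may be waiting,
	 * and another producer may add a second item before any of them
	 * runs. Each woken consumer rechecks the count under the lock.
	 */
	if (wasempty)
		pthread_cond_broadcast(&q->cond);
	pthread_mutex_unlock(&q->lock);
}

/* Remove and return the oldest item, sleeping while the queue is empty. */
static int
sketch_queue_take(struct sketch_queue *q)
{
	int item;

	pthread_mutex_lock(&q->lock);
	while (q->count == 0)
		pthread_cond_wait(&q->cond, &q->lock);
	item = q->items[q->head];
	q->head = (q->head + 1) % SKETCH_QUEUE_CAP;
	q->count--;
	pthread_mutex_unlock(&q->lock);
	return (item);
}
```

The `while` loop around `pthread_cond_wait()` is what makes the broadcast safe: a consumer that wakes up and finds the queue already drained simply goes back to sleep.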
255219	05-Sep-2013	pjd	Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. 
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine a few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...); bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL); Providing several rights that belong to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that.
This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation
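The index-bit check described in r255219 can be illustrated with a small standalone sketch. It is not the kernel's code: the `SK_`/`sk_` names are hypothetical, and the layout is an assumption read off the description above (five one-hot index bits directly below the top two bits, with the lowest of them meaning array element 0). It shows why OR-ing rights from different array elements is detectable: the result carries two index bits instead of one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the macros quoted in the log. Assumption:
 * index bits occupy bits 57..61 and bit (57 + idx) marks element idx.
 */
#define SK_CAPRIGHT(idx, bit)	((1ULL << (57 + (idx))) | (bit))
#define SK_CAP_LOOKUP		SK_CAPRIGHT(0, 0x0000000000000400ULL)
#define SK_CAP_FCHMOD		SK_CAPRIGHT(0, 0x0000000000002000ULL)
#define SK_CAP_PDKILL		SK_CAPRIGHT(1, 0x0000000000000800ULL)

/*
 * Return the array element a right (or same-element alias) belongs to,
 * or -1 if the five index bits do not have exactly one bit set.
 */
static int
sk_right_index(uint64_t right)
{
	uint64_t idxbits = (right >> 57) & 0x1f;
	int idx;

	if (idxbits == 0 || (idxbits & (idxbits - 1)) != 0)
		return (-1);
	for (idx = 0; (idxbits & 1) == 0; idx++)
		idxbits >>= 1;
	return (idx);
}

/* Abort on an argument that mixes rights from different elements. */
static void
sk_right_check(uint64_t right)
{
	assert(sk_right_index(right) != -1);
}
```

Under these assumptions `sk_right_index(SK_CAP_FCHMOD | SK_CAP_LOOKUP)` yields 0, because both rights live in element 0, while `sk_right_index(SK_CAP_LOOKUP | SK_CAP_PDKILL)` yields -1 and would trip the assertion in `sk_right_check()`.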
252472	01-Jul-2013	trociny	Make hastctl(1) ('list' command) output a worker pid. Reviewed by: pjd MFC after: 3 days
252421	30-Jun-2013	schweikh	Correct some grammar.
252386	29-Jun-2013	ed	Don't let hastd use C11 atomics. Due to possible concerns about the stability of C11 atomics, use our existing atomics API instead. Requested by: pjd
251796	15-Jun-2013	ed	Let hastd use C11 atomics. C11 atomics now work on all the architectures. Have at least a single piece of software in our base system that uses C11 atomics. This somewhat makes it less likely that we break it because of LLVM imports, etc.
250914	22-May-2013	jkim	Improve compatibility with old flex and fix build with GCC.
250503	11-May-2013	trociny	Get rid of libl dependency. We needed it only to provide yywrap. But yywrap is not necessary when parsing a single hast.conf file. Suggested by: kib Reviewed by: pjd
249970	27-Apr-2013	ed	Partially revert my last change. I forgot that I still had a locally applied patch to my copy of Clang that needs to be pushed in before we should use C11 atomics.
249969	27-Apr-2013	ed	Use C11 <stdatomic.h> instead of our non-standard <machine/atomic.h>. Reviewed by: pjd
249657	19-Apr-2013	ed	Add the Clang specific -Wmissing-variable-declarations to WARNS=6. This compiler flag enforces that people either mark variables static or use an external declaration for the variable, similar to how -Wmissing-prototypes works for functions. Due to the fact that Yacc/Lex generate code that cannot trivially be changed to not warn because of this (lots of yy* variables), add a NO_WMISSING_VARIABLE_DECLARATIONS that can be used to turn off this specific compiler warning. Announced on: toolchain@
248297	14-Mar-2013	pjd	Now that ioctl(2) is allowed in capability mode and we can limit ioctls for the given descriptors, use Capsicum sandboxing for hastd in primary and secondary modes. Allow for DIOCGDELETE and DIOCGFLUSH ioctls on provider descriptor and for G_GATE_CMD_MODIFY, G_GATE_CMD_START, G_GATE_CMD_DONE and G_GATE_CMD_DESTROY on GEOM Gate descriptor. Sponsored by: The FreeBSD Foundation
248296	14-Mar-2013	pjd	Minor corrections.
248294	14-Mar-2013	pjd	Delete requests can be larger than MAXPHYS.
247281	25-Feb-2013	trociny	Add i/o error counters to hastd(8) and make hastctl(8) display them. This may be useful for detecting problems with HAST disks. Discussed with and reviewed by: pjd MFC after: 1 week
246922	17-Feb-2013	pjd	- Add support for 'memsync' mode. This is the fastest replication mode that's why it will now be the default. - Bump protocol version to 2 and add backward compatibility for version 1. - Allow to specify hosts by kern.hostid as well (in addition to hostname and kern.hostuuid) in configuration file. Sponsored by: Panzura Tested by: trociny
244538	21-Dec-2012	kevlo	Fix socket calls on error post-r243965. Submitted by: Garrett Cooper
242593	05-Nov-2012	pjd	Revert r228695. We use __func__ here as a format to distinguish between abort and assert. It would be cleaner to use NULL or "" here, but gcc complains in both cases.
238538	16-Jul-2012	trociny	Metaflush on/off values don't need quotes. Reviewed by: pjd MFC after: 3 days
238120	04-Jul-2012	pjd	Make use of GEOM Gate direct reads feature. This allows HAST to serve reads with native speed of the underlying provider. There are three situations when direct reads are not used: 1. Data is being synchronized and synchronization source is the secondary node, which means secondary node has more recent data and we should read from it. 2. Local read failed and we have to try to read from the secondary node. 3. Local component is unavailable and all I/O requests are served from the secondary node. Sponsored by: Panzura, http://www.panzura.com MFC after: 1 month
237931	01-Jul-2012	pjd	Check if there is cmsg at all. MFC after: 3 days
236919	11-Jun-2012	hselasky	Revert: r236909 Pointyhat: me
236909	11-Jun-2012	hselasky	Use the correct clock source when computing timeouts. MFC after: 1 week
236507	03-Jun-2012	pjd	Simplify the code by using snprlcat(). MFC after: 3 days
235873	24-May-2012	wblock	Fixes to man8 groff mandoc style, usage mistakes, or typos. PR: 168016 Submitted by: Nobuyuki Koganemaru Approved by: gjb MFC after: 3 days
235789	22-May-2012	bapt	Fix world after byacc import: - old yacc(1) used to magically append stdlib.h, while the new one doesn't - new yacc(1) declares yyparse by itself, fix redundant declaration of 'yyparse' Approved by: des (mentor)
235337	12-May-2012	gjb	General mdoc(7) and typo fixes. PR: 167804 Submitted by: Nobuyuki Koganemaru (kogane!jp.freebsd.org) MFC after: 3 days
233679	29-Mar-2012	trociny	If hastd is invoked with "-P pidfile" option always create pidfile regardless of whether -F (foreground) option is set or not. Also, if -P option is specified, ignore pidfile setting from configuration not only on start but on reload too. This fixes the issue when for hastd run with -P option reload caused the pidfile change. Reviewed by: pjd MFC after: 1 week
233392	23-Mar-2012	trociny	Fix typo. MFC after: 3 days
231525	11-Feb-2012	pjd	Nice range comparison. MFC after: 3 days
231016	05-Feb-2012	trociny	If a local write request is from the synchronization thread, when it is synchronizing data that is out of date on the local component, we should not send G_GATE_CMD_DONE acknowledge to the kernel. This fixes the issue, observed in async mode, when on synchronization from the remote component the worker terminated with "G_GATE_CMD_DONE failed" error. Reported by: Artem Kajalainen <artem kayalaynen ru> Reviewed by: pjd MFC after: 1 week
231015	05-Feb-2012	trociny	Fix the regression introduced in r226859: if the local component is out of date BIO_READ requests got lost instead of being sent to the remote component. Reviewed by: pjd MFC after: 1 week
230976	04-Feb-2012	pjd	Fix typo in comment. MFC after: 3 days
230515	24-Jan-2012	pjd	- Fix documentation to note that /etc/hast.conf is the default configuration file for hastd(8) and hastctl(8) and not hast.conf. - In copyright statement correct that this file is documentation, not software. - Bump date. MFC after: 3 days
230457	22-Jan-2012	pjd	Free memory that won't be used in child. MFC after: 1 week
230436	21-Jan-2012	pjd	Fix minor memory leak. MFC after: 3 days
230396	20-Jan-2012	pjd	Remove another unused token. MFC after: 3 days
230395	20-Jan-2012	pjd	Remove unused token 'port'. MFC after: 3 days
230092	13-Jan-2012	pjd	Style cleanups. MFC after: 3 days
229946	10-Jan-2012	pjd	- Fix a bug where pidfile was removed in SIGHUP when it hasn't changed in configuration file. - Log the fact that pidfile has changed. MFC after: 3 days
229945	10-Jan-2012	pjd	For functions that return -1 on failure check exactly for -1 and not for any negative number. MFC after: 3 days
229944	10-Jan-2012	pjd	Don't touch pidfiles when running in foreground. Before that change we would create an empty pidfile on start and check if it changed on SIGHUP. MFC after: 3 days
229778	07-Jan-2012	uqs	Spelling fixes for sbin/
229744	06-Jan-2012	pjd	fork(2) returns -1 on failure, not some random negative number. MFC after: 3 days
229699	06-Jan-2012	pjd	Constify argument. MFC after: 3 days
228712	19-Dec-2011	dim	Use NO_WCAST_ALIGN for usr.bin/hastctl and usr.bin/hastd; the alignment warnings in sbin/hastd/lzf.c are only emitted for i386 and amd64, and there they can be safely ignored. MFC after: 1 week
228696	18-Dec-2011	pjd	Use lex's standard way of not generating unused function. Inspired by: r228555 MFC after: 1 week
228695	18-Dec-2011	pjd	Don't use function name as format string. Detected by: clang MFC after: 1 week
228544	15-Dec-2011	pjd	Remove redundant assignment. Found by: Clang Static Analyzer MFC after: 1 week
228543	15-Dec-2011	pjd	Simplify code by changing functions types from int to void, as the functions always return 0. Found by: Clang Static Analyzer MFC after: 1 week
228542	15-Dec-2011	pjd	Remove redundant setting of the error variable. Found by: Clang Static Analyzer MFC after: 1 week
226861	27-Oct-2011	pjd	Remove redundant space. MFC after: 3 days
226859	27-Oct-2011	pjd	Implement 'async' mode for HAST. MFC after: 3 days
226857	27-Oct-2011	pjd	Minor cleanups. MFC after: 3 days
226856	27-Oct-2011	pjd	Reduce indentation. MFC after: 3 days
226855	27-Oct-2011	pjd	Improve comment so it doesn't suggest race is possible, but that we handle the race. MFC after: 3 days
226854	27-Oct-2011	pjd	- Eliminate the need for hio_nv. - Introduce hio_clear() function for clearing hio before returning it onto free queue. MFC after: 3 days
226852	27-Oct-2011	pjd	Minor cleanups. MFC after: 3 days
226851	27-Oct-2011	pjd	Delay resuid generation until first connection to secondary, not until first write. This way on first connection we will synchronize only the extents that were modified during the lifetime of primary node, not entire GEOM provider. MFC after: 3 days
226842	27-Oct-2011	pjd	Correct comments. MFC after: 3 days
226463	17-Oct-2011	pjd	Allow to specify pidfile in HAST configuration file. MFC after: 1 week
226462	17-Oct-2011	pjd	Remove redundant space. MFC after: 1 week
226461	17-Oct-2011	pjd	When path to the configuration file is relative, obtain full path, so we can always find the file, even after daemonizing and changing working directory to /. MFC after: 1 week
225835	28-Sep-2011	pjd	Correct typo. MFC after: 3 days
225832	28-Sep-2011	pjd	If the underlying provider doesn't support BIO_FLUSH, log it only once and don't bother trying in the future. MFC after: 3 days
225831	28-Sep-2011	pjd	Break a bit earlier. MFC after: 3 days
225830	28-Sep-2011	pjd	After every activemap change flush disk's write cache, so that write reordering won't make the actual write to be committed before marking the corresponding extent as dirty. It can be disabled in configuration file. If BIO_FLUSH is not supported by the underlying file system we log a warning and never send BIO_FLUSH again to that GEOM provider. MFC after: 3 days
225787	27-Sep-2011	pjd	Use PJDLOG_ASSERT() and PJDLOG_ABORT() everywhere instead of assert(). MFC after: 3 days
225786	27-Sep-2011	pjd	No need to wrap pjdlog functions around with KEEP_ERRNO() macro. MFC after: 3 days
225784	27-Sep-2011	pjd	- Convert some impossible conditions into assertions. - Add missing 'if' in comment. MFC after: 3 days
225783	27-Sep-2011	pjd	Correct two mistakes when converting asserts to PJDLOG_ASSERT()/PJDLOG_ABORT(). MFC after: 3 days
225782	27-Sep-2011	pjd	Prefer PJDLOG_ASSERT() and PJDLOG_ABORT() over assert() and abort(). pjdlog versions will log problem to syslog when application is running in background. MFC after: 3 days
225781	27-Sep-2011	pjd	No need to use KEEP_ERRNO() macro around pjdlog functions, as they don't modify errno. MFC after: 3 days
225773	27-Sep-2011	pjd	Ensure that pjdlog functions don't modify errno. MFC after: 3 days
223974	13-Jul-2011	trociny	Fix indentation. Approved by: pjd (mentor)
223780	05-Jul-2011	trociny	Remove useless initialization. Approved by: pjd (mentor) MFC after: 3 days
223655	28-Jun-2011	trociny	Check the returned value of activemap_write_complete() and update metadata on disk if needed. This should fix a potential case when extents are cleared in activemap but metadata is not updated on disk. Suggested by: pjd Approved by: pjd (mentor)
223654	28-Jun-2011	trociny	Make activemap_write_start/complete check the keepdirty list, when stating if we need to update activemap on disk. This makes keepdirty serve its purpose -- to reduce number of metadata updates. Discussed with: pjd Approved by: pjd (mentor)
223586	27-Jun-2011	pjd	Compile hastd and hastctl with capsicum support. X-MFC after: capsicum merge
223585	27-Jun-2011	pjd	Compile capsicum support only if HAVE_CAPSICUM is defined. MFC after: 3 days
223584	27-Jun-2011	pjd	Log a warning if we cannot sandbox using capsicum, but only under debug level 1. It would be too noisy to log it as a proper warning as CAPABILITIES are not compiled into GENERIC by default. MFC after: 3 days
223181	17-Jun-2011	trociny	In HAST we use two sockets - one for only sending the data and one for only receiving the data. In r220271 the unused directions were disabled using shutdown(2). Unfortunately, this broke automatic receive buffer sizing, which currently works only for connections in ESTABLISHED state. It was a root cause of the issue reported by users, when connection between primary and secondary could get stuck. Disable the code introduced in r220271 until the issue with automatic buffer sizing is resolved. Reported by: Daniel Kalchev <daniel@digsys.bg>, danger, sobomax Tested by: Daniel Kalchev <daniel@digsys.bg>, danger Approved by: pjd (mentor) MFC after: 1 week
223143	16-Jun-2011	sobomax	Revert r222688. Requested by: Mikolaj Golub
222688	04-Jun-2011	sobomax	Read from the socket using the same max buffer size as we use while sending. What happens otherwise is that the sender splits all the traffic into 32k chunks, while the receiver is waiting for the whole packet. Then for certain packet sizes, particularly 66607 bytes in my case, the communication gets stuck: the secondary is expecting to read one chunk of 66607 bytes, while the primary is sending two chunks of 32768 bytes and a third chunk of 1071. Probably due to TCP windowing and buffering the final chunk gets stuck somewhere, so neither server nor client can make any progress. This patch also protects from short reads, as according to the manual page there are some cases when MSG_WAITALL can give less data than expected. MFC after: 3 days
222467	29-May-2011	trociny	If READ from the local node failed we send the request to the remote node. There is no use in doing this for synchronization requests. Approved by: pjd (mentor) MFC after: 1 week
222228	23-May-2011	pjd	Keep statistics on number of BIO_READ, BIO_WRITE, BIO_DELETE and BIO_FLUSH requests as well as number of activemap updates. Number of BIO_WRITEs and activemap updates are especially interesting, because if those two are too close to each other, it means that your workload needs bigger number of dirty extents. Activemap should be updated as rarely as possible. MFC after: 1 week
222224	23-May-2011	pjd	To handle BIO_FLUSH and BIO_DELETE requests in secondary worker we need to use ioctl(2). This is why we can't use capsicum for now to sandbox secondary. Capsicum is still used to sandbox hastctl. MFC after: 1 week
222164	21-May-2011	pjd	Recognize HIO_FLUSH requests. MFC after: 1 week
222121	20-May-2011	pjd	Document IPv6 support. MFC after: 3 weeks
222120	20-May-2011	pjd	If no listen address is specified, bind by default to: tcp4://0.0.0.0:8457 tcp6://[::]:8457 MFC after: 3 weeks
222119	20-May-2011	pjd	Rename ipv4/ipv6 to tcp4/tcp6. MFC after: 3 weeks
222118	20-May-2011	pjd	Now that hell is fully frozen it is good time to add IPv6 support to HAST. MFC after: 3 weeks
222117	20-May-2011	pjd	Allow [ ] characters in strings. They might be used in IPv6 addresses. MFC after: 3 weeks
222116	20-May-2011	pjd	Rename tcp4 to tcp in preparation for IPv6 support. MFC after: 3 weeks
222115	20-May-2011	pjd	Rename proto_tcp4.c to proto_tcp.c in preparation for IPv6 support. MFC after: 2 weeks
222108	19-May-2011	pjd	In preparation for IPv6 support allow to specify multiple addresses to listen on. MFC after: 3 weeks
222087	18-May-2011	pjd	- Add support for AF_INET6 sockets for %S format character. - Use inet_ntop(3) instead of reimplementing it. - Use %hhu for unsigned char instead of casting it to unsigned int and using %u. MFC after: 1 week
221899	14-May-2011	pjd	Currently we are unable to use capsicum for the primary worker process, because we need to do ioctl(2)s, which are not permitted in the capability mode. What we do now is to chroot(2) to /var/empty, which restricts access to file system name space and we drop privileges to hast user and hast group. This still allows to access to other name spaces, like list of processes, network and sysvipc. To address that, use jail(2) instead of chroot(2). Using jail(2) will restrict access to process table, network (we use ip-less jails) and sysvipc (if security.jail.sysvipc_allowed is turned off). This provides much better separation. MFC after: 1 week
221898	14-May-2011	pjd	When using capsicum to sandbox, still use other methods first, just in case one of them has some problems.
221643	08-May-2011	pjd	Allow to specify remote as 'none' again which was broken by r219351, where 'none' was defined as a value for checksum. Reported by: trasz MFC after: 1 week
221632	08-May-2011	trociny	Fix isitme(), which is used to check if node-specific configuration belongs to our node, and was returning false positive if the first part of a node name matches short hostname. Approved by: pjd (mentor)
221078	26-Apr-2011	trociny	Add missing ifdef. This fixes build with NO_OPENSSL. Reported by: Pawel Tyll <ptyll@nitronet.pl> Approved by: pjd (mentor) MFC after: 1 week
221076	26-Apr-2011	trociny	Rename HASTCTL_ defines, which are used for conversation between main hastd process and workers, remove unused one and set different range of numbers. This is done in order not to confuse them with HASTCTL_CMD defines, used for conversation between hastctl and hastd, and to avoid bugs like the one fixed in r221075. Approved by: pjd (mentor) MFC after: 1 week
221075	26-Apr-2011	trociny	For conversation between hastctl and hastd we should use HASTCTL_CMD defines. Approved by: pjd (mentor) MFC after: 1 week
220899	20-Apr-2011	pjd	Correct comment. MFC after: 1 week
220898	20-Apr-2011	pjd	When we become primary, we connect to the remote and expect it to be in secondary role. It is possible that the remote node is primary, but only because there was a role change and it didn't finish cleaning up (unmounting file systems, etc.). If we detect such situation, wait for the remote node to switch the role to secondary before accepting I/Os. If we don't wait for it in that case, we will most likely cause split-brain. MFC after: 1 week
220890	20-Apr-2011	pjd	If we act in different role than requested by the remote node, log it as a warning and not an error. MFC after: 1 week
220889	20-Apr-2011	pjd	Timeout must be positive. MFC after: 1 week
220865	19-Apr-2011	pjd	Scenario: - We have two nodes connected and synchronized (local counters on both sides are 0). - We take secondary down and recreate it. - Primary connects to it and starts synchronization (but local counters are still 0). - We switch the roles. - Synchronization restarts but data is synchronized now from new primary (because local counters are 0) that doesn't have new data yet. To fix this issue we bump local counter on primary when we discover that connected secondary was recreated and has no data yet. Reported by: trociny Discussed with: trociny Tested by: trociny MFC after: 1 week
220744	17-Apr-2011	trociny	Remove hast_proto_recv(). It was used only in one place, where hast_proto_recv_hdr() may be used. This also fixes the issue (introduced by r220523) with hastctl, which crashed on assert in hast_proto_recv_data(). Suggested and approved by: pjd (mentor)
220573	12-Apr-2011	pjd	The replication mode that is currently supported is fullsync, not memsync. Correct this and print a warning if different replication mode is configured. MFC after: 1 week
220523	10-Apr-2011	trociny	In hast_proto_recv() remove unnecessary check. The size is checked later in hast_proto_recv_data(). Approved by: pjd (mentor) MFC after: 1 week
220522	10-Apr-2011	trociny	In hast_proto_recv_data() check that the size of the data to be received does not exceed the buffer size. Approved by: pjd (mentor) MFC after: 1 week
220521	10-Apr-2011	trociny	Fix a typo in comments. Approved by: pjd (mentor) MFC after: 3 days
220274	02-Apr-2011	pjd	Increase default timeout from 5 seconds to 20 seconds. 5 seconds is definitely too short under heavy load and I was experiencing those timeouts in my recent tests. MFC after: 1 week
220273	02-Apr-2011	pjd	Handle ENOBUFS on send(2) by retrying for a while and logging the problem. MFC after: 1 week
220272	02-Apr-2011	pjd	When we are operating on blocking socket and get EAGAIN on send(2) or recv(2) this means that request timed out. Translate the meaningless EAGAIN to ETIMEDOUT to give administrator a hint that he might need to increase timeout in configuration file. MFC after: 1 month
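The errno translation from r220272 is small enough to sketch. The function names here are hypothetical and this is not the hastd source; the sketch assumes the socket is blocking and has a send or receive timeout configured, since that is when a blocking call fails with EAGAIN after the timeout expires.

```c
#include <sys/types.h>
#include <sys/socket.h>

#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * On a blocking socket EAGAIN (or EWOULDBLOCK) can only mean that the
 * configured timeout expired, so report ETIMEDOUT instead. On a
 * non-blocking socket the original error is kept.
 */
static int
sketch_translate_errno(int error, bool blocking)
{
	if (blocking && (error == EAGAIN || error == EWOULDBLOCK))
		return (ETIMEDOUT);
	return (error);
}

/* Hypothetical wrapper: send on a blocking socket with the translation. */
static ssize_t
sketch_send(int fd, const void *buf, size_t len)
{
	ssize_t done;

	assert(buf != NULL);
	done = send(fd, buf, len, 0);
	if (done == -1)
		errno = sketch_translate_errno(errno, true);
	return (done);
}
```

A caller that logs the resulting errno then prints a timeout message, which is the hint to the administrator that the commit message describes.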
220271	02-Apr-2011	pjd	Declare directions for sockets between primary and secondary. In HAST we use two sockets - one for only sending the data and one for only receiving the data. MFC after: 1 month
220270	02-Apr-2011	pjd	Allow to disable sends or receives on a socket using shutdown(2) by interpreting NULL 'data' argument passed to proto_common_send() or proto_common_recv() as a will to do so. MFC after: 1 month
220266	02-Apr-2011	pjd	Handle the problem described in r220264 by using GEOM GATE queue of unlimited length. This should fix deadlocks reported by HAST users. MFC after: 1 week
220007	25-Mar-2011	pjd	Add mapsize to the header just before sending the packet. Before it could change later and we were sending invalid mapsize. Some time ago I added optimization where when nodes are connected for the first time and there were no writes to them yet, there is no initial full synchronization. This bug prevented it from working. MFC after: 1 week
220006	25-Mar-2011	pjd	Use timeout from configuration file not only when sending and receiving, but also when establishing connection. MFC after: 1 week
220005	25-Mar-2011	pjd	Use role2str() when setting process title. MFC after: 1 week
219900	23-Mar-2011	pjd	Don't create socketpair for connection forwarding between parent and secondary. Secondary doesn't need to connect anywhere. MFC after: 1 week
219887	22-Mar-2011	pjd	Add my copyright. MFC after: 1 week
219882	22-Mar-2011	trociny	After synchronization is complete we should make primary counters be equal to secondary counters: primary_localcnt = secondary_remotecnt primary_remotecnt = secondary_localcnt Previously it was done wrong and split-brain was observed after primary had synchronized up-to-date data from secondary. Approved by: pjd (mentor) MFC after: 1 week
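The counter assignment stated in r219882 can be written out as a tiny worked example. The struct and function names are hypothetical; only the two assignments come from the log entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for one node's pair of counters. */
struct sketch_counters {
	uint64_t localcnt;
	uint64_t remotecnt;
};

/*
 * After synchronization completes, mirror the secondary's counters on
 * the primary: each side's "local" is the other side's "remote".
 */
static void
sketch_sync_complete(struct sketch_counters *primary,
    const struct sketch_counters *secondary)
{
	assert(primary != NULL && secondary != NULL);
	primary->localcnt = secondary->remotecnt;
	primary->remotecnt = secondary->localcnt;
}
```

For instance, if the secondary holds localcnt 3 and remotecnt 1, the primary ends up with localcnt 1 and remotecnt 3, so the two nodes agree on each other's view.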
219879	22-Mar-2011	trociny	For requests that are sent only to remote component use the error from remote. Approved by: pjd (mentor) MFC after: 1 week
219873	22-Mar-2011	pjd	The proto API is a general purpose API, so don't use 'hast' in structures or function names. It can now be used outside of HAST. MFC after: 1 week
219864	22-Mar-2011	pjd	White space cleanups. MFC after: 1 week
219847	21-Mar-2011	pjd	When dropping privileges prefer capsicum over chroot+setgid+setuid. We can use capsicum for secondary worker processes and hastctl. When working as primary we drop privileges using chroot+setgid+setuid still as we need to send ioctl(2)s to ggate device, for which capsicum doesn't allow (yet). X-MFC after: capsicum is merged to stable/8
219844	21-Mar-2011	pjd	Initialize localcnt on first write. This fixes assertion when we create resource, set role to primary, do no writes, then set it to secondary and accept connection from primary. MFC after: 1 week
219843	21-Mar-2011	pjd	Fix typo. MFC after: 1 week
219837	21-Mar-2011	pjd	Before handling any events on descriptors check signals so we can update our info about worker processes if any of them was terminated in the meantime. This fixes the problem with 'hastctl status' running from a hook called on split-brain: 1. Secondary calls a hook and terminates. 2. Hook asks for resource status via 'hastctl status'. 3. The main hastd handles the status request by sending it to the secondary worker who is already dead, but because signals weren't checked yet he doesn't know that and we get EPIPE. MFC after: 1 week
219833	21-Mar-2011	pjd	Remove stale comment. Yes, it is valid to set role back to init. MFC after: 1 week
219832	21-Mar-2011	pjd	Increase debug level of "Checking hooks." message. MFC after: 1 week
219831	21-Mar-2011	pjd	Be pedantic and free nvout before exiting. MFC after: 1 week
219830	21-Mar-2011	pjd	Detect situation where resource internal identifier differs. This means that both nodes have separately managed resources that don't have the same data. MFC after: 1 week
219818	21-Mar-2011	pjd	In hast.conf we define the other node's address in 'remote' variable. This way we know how to connect to secondary node when we are primary. The same variable is used by the secondary node - it only accepts connections from the address stored in 'remote' variable. In cluster configurations it is common that each node has its individual IP address and there is one additional shared IP address which is assigned to primary node. It seems it is possible that if the shared IP address is from the same network as the individual IP address it might be chosen by the kernel as a source address for connection with the secondary node. Such connection will be rejected by secondary, as it doesn't come from primary node individual IP. Add 'source' variable that allows to specify source IP address we want to bind to before connecting to the secondary node. MFC after: 1 week
219817	21-Mar-2011	pjd	Log when we start hooks checking and when we execute a hook. MFC after: 1 week
219816	21-Mar-2011	pjd	Use snprlcat() instead of two strlcat(3)s. MFC after: 1 week
219815	21-Mar-2011	pjd	Add snprlcat() and vsnprlcat() - the functions I'm always missing. They work as a combination of snprintf(3) and strlcat(3) - the caller can append a string built based on the given format. MFC after: 1 week
219814	21-Mar-2011	pjd	When creating connection on behalf of primary worker, set pjdlog prefix to resource name and role, so that any logs related to that can be identified properly. MFC after: 1 week
219813	21-Mar-2011	pjd	If there is any traffic on one of our descriptors, we were not checking for long running hooks. Fix it by not using select(2) timeout to decide if we want to check hooks or not. MFC after: 1 week
219721	17-Mar-2011	trociny	For secondary, set 2 * HAST_KEEPALIVE seconds timeout for incoming connection so the worker will exit if it does not receive packets from the primary during this interval. Reported by: Christian Vogt <Christian.Vogt@haw-hamburg.de> Tested by: Christian Vogt <Christian.Vogt@haw-hamburg.de> Approved by: pjd (mentor) MFC after: 1 week
219669	15-Mar-2011	pjd	Remove #include needed for debugging. MFC after: 1 week
219482	11-Mar-2011	trociny	Make workers inherit debug level from the main process. Approved by: pjd (mentor) MFC after: 1 week
219385	07-Mar-2011	pjd	Unbreak the build. MFC after: 2 weeks
219372	07-Mar-2011	pjd	- Log size of data to synchronize in human readable form (using %N). - Log synchronization time (using %T). - Log synchronization speed in human readable form (using %N). MFC after: 2 weeks
219371	07-Mar-2011	pjd	Use %S to print IP address and port number. MFC after: 2 weeks
219370	07-Mar-2011	pjd	- Turn on printf extensions. - Load support for %T for printing time. - Add support for %N for printing number in human readable form. - Add support for %S for printing sockaddr structure (currently only AF_INET family is supported, as this is all we need in HAST). - Disable gcc compile-time format checking as this will no longer work. MFC after: 2 weeks
219369	07-Mar-2011	pjd	Provides three states for pjdlog_initialized, so we can also tell that this is first initialization ever. MFC after: 2 weeks
219354	06-Mar-2011	pjd	Allow to compress on-the-wire data using two algorithms: - HOLE - it simply turns all-zero blocks into few bytes header; it is extremely fast, so it is turned on by default; it is mostly intended to speed up initial synchronization where we expect many zeros; - LZF - very fast algorithm by Marc Alexander Lehmann, which shows very decent compression ratio and has BSD license. MFC after: 2 weeks
219351	06-Mar-2011	pjd	Allow to checksum on-the-wire data using either CRC32 or SHA256. MFC after: 2 weeks
218474	09-Feb-2011	pjd	When we decide to unlink socket file, sun_path must be set. If it is set, but there is problem unlinking the file, log a warning. MFC after: 1 week
218465	08-Feb-2011	pjd	Explicitly include <sys/types.h> as suggested by getpid(2) and don't rely on <sys/un.h> including what's needed. MFC after: 1 week
218464	08-Feb-2011	pjd	Unlink UNIX domain socket file only if: 1. The descriptor is the one we are listening on (not the one when we connect as a client and not the one which is created on accept(2)). 2. Descriptor was created by us (PID matches with the PID stored on bind(2)). Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
218376	06-Feb-2011	pjd	Now that we break the loop on fstat(2) failure we no longer need to satisfy gcc's imperfections. MFC after: 1 week
218375	06-Feb-2011	pjd	Add (void) cast before snprintf(3)s for which we are not interested in return values. MFC after: 1 week
218374	06-Feb-2011	pjd	Treat fstat(2) failure (different than EBADF) as fatal error. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
218373	06-Feb-2011	pjd	Open syslog when logging sysconf(3) failure. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
218370	06-Feb-2011	pjd	Close more descriptors that can be open if the worker process for the given resource is already running. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
218218	03-Feb-2011	pjd	Setup another socketpair between parent and child, so that primary sandboxed worker can ask the main privileged process to connect in worker's behalf and then we can migrate descriptor using this socketpair to worker. This is not really needed now, but will be needed once we start to use capsicum for sandboxing. MFC after: 1 week
218217	03-Feb-2011	pjd	Add missing locking after moving keepalive_send() to remote send thread in r214692. MFC after: 1 week
218214	03-Feb-2011	pjd	Let the caller log info about successful privilege drop. We don't want to log this in hastctl. MFC after: 1 week
218194	02-Feb-2011	pjd	- Rename proto_descriptor_{send,recv}() functions to proto_connection_{send,recv} and change them to return proto_conn structure. We don't operate directly on descriptors, but on proto_conns. - Add wrap method to wrap descriptor with proto_conn. - Remove methods to send and receive descriptors and implement this functionality as additional argument to send and receive methods. MFC after: 1 week
218193	02-Feb-2011	pjd	Add proto_connect_wait() to wait for connection to finish. If timeout argument to proto_connect() is -1, then the caller needs to use this new function to wait for connection. This change is in preparation for capsicum, where sandboxed worker wants to ask main process to connect in worker's behalf and pass descriptor to the worker. Because we don't want the main process to wait for the connection, it will start async connection and pass descriptor to the worker who will be responsible for waiting for the connection to finish. MFC after: 1 week
218192	02-Feb-2011	pjd	Allow to specify connection timeout by the caller. MFC after: 1 week
218191	02-Feb-2011	pjd	Move protocol allocation and deallocation to separate functions. MFC after: 1 week
218185	02-Feb-2011	pjd	Be prepared that hp_client or hp_server might be NULL now. MFC after: 1 week
218158	01-Feb-2011	pjd	Do not set socket send and receive buffer. It will be auto-tuned. Confirmed by: rwatson MFC after: 1 week
218148	31-Jan-2011	pjd	Fix build on ia64. I found no way how to use CMSG_NXTHDR() macro on ia64 without alignment warnings. MFC after: 1 week
218147	31-Jan-2011	pjd	Until I fix the build on ia64 comment out problematic lines. Those lines are part of the (for now) unused functions.
218139	31-Jan-2011	pjd	Implement two new functions for sending descriptor and receiving descriptor over UNIX domain sockets and socket pairs. This is in preparation for capsicum. MFC after: 1 week
218138	31-Jan-2011	pjd	- Use pjdlog for assertions and aborts as this will log assert/abort message to syslog if we run in background. - Asserts in proto.c that method we want to call is implemented and remove dummy methods from protocols implementation that are only there to abort the program with nice message. MFC after: 1 week
218132	31-Jan-2011	pjd	Rename pjdlog_verify() to pjdlog_abort() as it better describes what the function does and mark it with __dead2. MFC after: 1 week
218049	28-Jan-2011	pjd	Drop privileges in worker processes. Accepting connections and handshaking in secondary is still done before dropping privileges. It should be implemented by only accepting connections in privileged main process and passing connection descriptors to the worker, but is not implemented yet. MFC after: 1 week
218048	28-Jan-2011	pjd	Implement function that drops privileges by: - chrooting to /var/empty (user hast home directory), - setting groups to 'hast' (user hast primary group), - setting real group id, effective group id and saved group id to 'hast', - setting real user id, effective user id and saved user id to 'hast'. At the end verify that those operations were successful. MFC after: 1 week
218045	28-Jan-2011	pjd	Use newly added descriptors_assert() function to ensure only expected descriptors are open. MFC after: 1 week
218044	28-Jan-2011	pjd	Add function to assert that the only descriptors we have open are the ones we expect to be open. Also assert that they point at expected type. Because openlog(3) API is unable to tell us descriptor number it is using, we have to close syslog socket, remember assert message in local buffer and if we fail on assertion, reopen syslog socket and log the message. MFC after: 1 week
218043	28-Jan-2011	pjd	Close all unneeded descriptors after fork(2). MFC after: 1 week
218042	28-Jan-2011	pjd	Add comments to places where we treat errors as critical, but it is possible to handle them more gracefully. MFC after: 1 week
218041	28-Jan-2011	pjd	Add function to close all unneeded descriptors after fork(2). MFC after: 1 week
218040	28-Jan-2011	pjd	Initialize all global variables on pjdlog_init(). MFC after: 1 week
217969	27-Jan-2011	pjd	Remember created control connection so on fork(2) we can close it in child. Found with: procstat(1) MFC after: 1 week
217967	27-Jan-2011	pjd	Close the control socket before exiting, so it will be unlinked. MFC after: 1 week
217966	27-Jan-2011	pjd	Extend pjdlog_verify() to support the following additional macros: PJDLOG_RVERIFY() - always check expression and on false log the given message and exit. PJDLOG_RASSERT() - check expression when NDEBUG is not defined and on false log given message and exit. PJDLOG_ABORT() - log the given message and exit. MFC after: 1 week
217965	27-Jan-2011	pjd	Add functions to initialize/finalize pjdlog. This allows to open/close log file at will. MFC after: 1 week
217964	27-Jan-2011	pjd	Use my copyright for 2011 work. MFC after: 1 week
217962	27-Jan-2011	pjd	Add LOG_NDELAY flag to openlog(3) - we want descriptor to be immediately open so there are no surprises once we start chrooting or using capsicum. MFC after: 1 week
217961	27-Jan-2011	pjd	- Remove obvious NOTREACHED comment after abort() call. - Remove redundant newline at the end of the file. MFC after: 1 week
217958	27-Jan-2011	pjd	Remove __dead2 from pjdlog_verify() prototype, it does return sometimes. MFC after: 1 week
217784	24-Jan-2011	pjd	Don't open configuration file from worker process. Handle SIGHUP in the master process only and pass changes to the worker processes over control socket. This removes access to global namespace in preparation for capsicum sandboxing. MFC after: 2 weeks
217737	22-Jan-2011	pjd	Add missing logs. MFC after: 1 week
217732	22-Jan-2011	pjd	Add nv_assert() which allows to assert that the given name exists. MFC after: 1 week
217731	22-Jan-2011	pjd	Use more consistent function name with the others (pjdlogv_prefix_set() instead of pjdlog_prefix_setv()). MFC after: 1 week
217730	22-Jan-2011	pjd	Use int16 for error. MFC after: 1 week
217729	22-Jan-2011	pjd	- On primary worker reload, update hr_exec field. - Update comment. MFC after: 1 week
217312	12-Jan-2011	pjd	execve(2), not fork(2) resets signal handler to the default value (if it isn't ignored). Correct comment talking about that. Pointed out by: kib MFC after: 3 days
217308	12-Jan-2011	pjd	Add a note that when custom signal handler is installed for a signal, signal action is restored to default in child after fork(2). In this case there is no need to do anything with dummy SIGCHLD handler, because after fork(2) it will be automatically reverted to SIG_IGN. Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com MFC after: 3 days
217307	12-Jan-2011	pjd	Install default signal handlers before masking signals we want to handle. It is possible that the parent process ignores some of them and sigtimedwait() will never see them, even though they are masked. The most common situation for this to happen is boot process where init(8) ignores SIGHUP before starting to execute /etc/rc. This in turn caused hastd(8) to ignore SIGHUP. Reported by: trasz Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com MFC after: 3 days
216722	26-Dec-2010	pjd	Detect when resource is configured more than once. MFC after: 3 days
216721	26-Dec-2010	pjd	When node-specific configuration is missing in resource section, provide more useful information. Instead of: hastd: remote address not configured for resource foo Print the following: No resource foo configuration for this node (acceptable node names: freefall, freefall.freebsd.org, 44333332-4c44-4e31-4a30-313920202020). MFC after: 3 days
216494	16-Dec-2010	pjd	The 'ret' variable is of type ssize_t and we use proper format for it (%zd), so no (bogus) cast is needed. MFC after: 3 days
216479	16-Dec-2010	pjd	Improve problems logging. MFC after: 3 days
216478	16-Dec-2010	pjd	Don't ignore errors from remote requests. MFC after: 3 days
216477	16-Dec-2010	pjd	Log the fact of launching and include protocol version number. MFC after: 3 days
215676	22-Nov-2010	brucec	Don't generate input() since it's not used.
215332	15-Nov-2010	pjd	Move timeout.tv_sec initialization outside the loop - sigtimedwait(2) won't modify it. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
215331	15-Nov-2010	pjd	1. Exit when we cannot create incoming connection. 2. Improve logging to inform which connection can't be created. Submitted by: [1] Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
214692	02-Nov-2010	pjd	Send packets to remote node only via the send thread to avoid possible races - in this case a keepalive packet was sent from the wrong thread, which led to connection dropping because of a corrupted packet. Fix it by sending keepalive packets directly from the send thread. As a bonus we now send keepalive packets only when connection is idle. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
214284	24-Oct-2010	pjd	Before this change on first connect between primary and secondary we initialize all the data. This is huge waste of time and resources if there were no writes yet, as there is no real data to synchronize. Optimize this by sending "virgin" argument to secondary, which gives it a hint that synchronization is not needed. In the common case (where both nodes are configured at the same time) instead of synchronizing everything, we don't synchronize at all. MFC after: 1 week
214283	24-Oct-2010	pjd	Implement nv_exists() function that returns true if argument of the given name exists. MFC after: 3 days
214282	24-Oct-2010	pjd	Move all NV defines into nv.c, they are not used externally thus there is no need to make them visible from outside. MFC after: 3 days
214276	24-Oct-2010	pjd	Simplify code a bit. MFC after: 3 days
214275	24-Oct-2010	pjd	Plug memory leak. MFC after: 3 days
214274	24-Oct-2010	pjd	Plug memory leaks. Found with: valgrind MFC after: 3 days
214273	24-Oct-2010	pjd	Load geom_gate.ko module after parsing arguments. MFC after: 3 days
214119	20-Oct-2010	pjd	Use closefrom(2) instead of close(2) in a loop. MFC after: 1 week
213981	17-Oct-2010	pjd	Log correct connection when canceling half-open connection. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
213939	16-Oct-2010	pjd	Use one fprintf() instead of two. MFC after: 3 days
213938	16-Oct-2010	pjd	Clear signal mask before executing a hook. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
213580	08-Oct-2010	pjd	We can't zero out ggio request, as we have some fields in there we initialize once during start-up. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
213579	08-Oct-2010	pjd	We close the event socketpair early in the mainloop to prevent spamming with error messages, so when we clean up after child process, we have to check if the event socketpair is still there. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
213533	07-Oct-2010	pjd	Clear ggate structures before using them. We don't initialize all the fields and there can be some garbage from the stack. MFC after: 1 week
213531	07-Oct-2010	pjd	Log error message when we fail to destroy ggate provider. MFC after: 3 days
213530	07-Oct-2010	pjd	Start the guard thread first, so we can handle signals from the very beginning. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
213529	07-Oct-2010	pjd	Don't close local component on exit as we can hang waiting on g_waitidle. I'm unable to reproduce the race described in comment anymore and also the comment is incorrect - localfd represents local component from configuration file, eg. /dev/da0 and not HAST provider. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week
213430	04-Oct-2010	pjd	Decrease report interval to 5 seconds, as this also means we will check for signals every 5 seconds and not every 10 seconds as before. MFC after: 3 days
213429	04-Oct-2010	pjd	hook_check() is now only used to report about long-running hooks, so the argument is redundant, remove it. MFC after: 3 days
213428	04-Oct-2010	pjd	We can't mask ignored signal, so install dummy signal handler for SIGCHLD before masking it. This fixes bogus reports about hooks running for too long and other problems related to garbage-collecting child processes. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
213183	26-Sep-2010	pjd	Plug memory leak on fork(2) failure. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
213009	22-Sep-2010	pjd	Switch to sigprocmask(2) API also in the main process and secondary process. This way the primary process inherits signal mask from the main process, which fixes a race where signal is delivered to the primary process before configuring signal mask. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
213008	22-Sep-2010	pjd	Assert that descriptor numbers are sane. MFC after: 3 days
213007	22-Sep-2010	pjd	Fix possible deadlock where worker process sends an event to the main process while the main process sends control message to the worker process, but worker process hasn't started control thread yet, because it waits for reply from the main process. The fix is to start the control thread before sending any events. Reported and fix suggested by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
213006	22-Sep-2010	pjd	Fix descriptor leaks: when child exits, we have to close control and event socket pairs. We did that only in one case out of three. MFC after: 3 days
213004	22-Sep-2010	pjd	If we are unable to receive control message it is most likely because the main process died. Instead of entering infinite loop, terminate. MFC after: 3 days
213003	22-Sep-2010	pjd	Sort includes. MFC after: 3 days
212899	20-Sep-2010	pjd	Add __dead2 to functions that we know are going to exit. MFC after: 3 days
212052	31-Aug-2010	pjd	Include process PID in log messages. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 2 weeks
212051	31-Aug-2010	pjd	Correct error message. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 2 weeks
212049	31-Aug-2010	pjd	Forgot to add event.c and event.h in r212038. Pointed out by: pluknet <pluknet@gmail.com> MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
212046	31-Aug-2010	pjd	Mask only those signals that we want to handle. Suggested by: jilles MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
212038	30-Aug-2010	pjd	Because it is very hard to make fork(2) from threaded process safe (we are limited to async-signal safe functions in the child process), move all hooks execution to the main (non-threaded) process. Do it by maintaining connection (socketpair) between child and parent and sending events from the child to parent, so it can execute the hook. This is a step in the right direction for other reasons too. For example there is one less problem to drop privs in worker processes. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
212037	30-Aug-2010	pjd	We only want to know if descriptors are ready for reading. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
212036	30-Aug-2010	pjd	When someone gives NULL as data, assume this is because he wants to declare connection side only. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
212034	30-Aug-2010	pjd	Use pjdlog_exit() before fork(). MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
212033	30-Aug-2010	pjd	Constify arguments we can constify. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211984	30-Aug-2010	pjd	Execute hook when connection between the nodes is established or lost. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211983	30-Aug-2010	pjd	Execute hook when split-brain is detected. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211982	30-Aug-2010	pjd	Use sigtimedwait(2) for signals handling in primary process. This fixes various races and eliminates use of pthread* API in signal handler. Pointed out by: kib With help from: jilles MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211981	29-Aug-2010	pjd	- Move functionality responsible for checking one connection to separate function to make code more readable. - Be sure not to reconnect too often in case of signal delivery, etc. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211979	29-Aug-2010	pjd	Disconnect after logging errors. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211978	29-Aug-2010	pjd	- Call hook on role change. - Document new event. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211977	29-Aug-2010	pjd	Allow to run hooks from the main hastd process. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211976	29-Aug-2010	pjd	- Add hook_fini() which should be called after fork() from the main hastd process, once it starts to use hooks. - Add hook_check_one() in case the caller expects different child processes and once it can recognize it, it will pass pid and status to hook_check_one(). MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211975	29-Aug-2010	pjd	Implement mtx_destroy() and rw_destroy(). MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211899	27-Aug-2010	pjd	When SIGTERM or SIGINT is received, terminate worker processes. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211898	27-Aug-2010	pjd	When logging to stdout/stderr, flush after each log. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211897	27-Aug-2010	pjd	Correct when we log interrupted synchronization. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211896	27-Aug-2010	pjd	Check if no signals were delivered just before going to sleep. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211895	27-Aug-2010	pjd	Add hooks execution. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211887	27-Aug-2010	pjd	Document new 'exec' parameter. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211886	27-Aug-2010	pjd	Allow to execute specified program on various HAST events. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211885	27-Aug-2010	pjd	- Run hooks in background - don't block waiting for them to finish. - Keep all hooks we're running in a global list, so we can report when they finish and also report when they are running for too long. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211884	27-Aug-2010	pjd	When logging to stdout/stderr don't close those descriptors after fork(). MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211883	27-Aug-2010	pjd	Reduce indent where possible. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211882	27-Aug-2010	pjd	Implement keepalive mechanism inside HAST protocol so we can detect secondary node failures quickly for HAST resources that are rarely modified. Remove XXX from a comment now that the guard thread never sleeps infinitely. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211881	27-Aug-2010	pjd	- Remove redundant and incorrect 'old' word from debug message. - Log disconnects as warnings. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211880	27-Aug-2010	pjd	Don't increase number synchronized bytes in case of an error. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211879	27-Aug-2010	pjd	Log that synchronization was interrupted in a proper place. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211878	27-Aug-2010	pjd	We have sync_start() function to start synchronization, introduce sync_stop() function to stop it. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211877	27-Aug-2010	pjd	Add QUEUE_INSERT() and QUEUE_TAKE() macros that simplify the code a bit. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211876	27-Aug-2010	pjd	Add mtx_owned() implementation. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211875	27-Aug-2010	pjd	Make comment more readable. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
211452	18-Aug-2010	pjd	For some setups sending data in 128kB chunks makes communication very slow. No idea why. 32kB on the other hand seems to work properly everywhere. Reported by: Thomas Steen Rasmussen <thomas@gibfest.dk> MFC after: 3 weeks
211407	16-Aug-2010	pjd	The 'size' variable is there to limit how many bytes we want to copy from 'addr'. It is very likely that size of 'addr' is larger than 'size', so checking strlcpy() return value is bogus. MFC after: 3 weeks
211397	16-Aug-2010	joel	Fix typos, spelling, formatting and mdoc mistakes found by Nobuyuki while translating these manual pages. Minor corrections by me. Submitted by: Nobuyuki Koganemaru <n-kogane@syd.odn.ne.jp>
210892	05-Aug-2010	pjd	Document 'none' value for remote. Reviewed by: dougb MFC after: 1 month
210886	05-Aug-2010	pjd	Implement configuration reload on SIGHUP. This includes: - Load added resources. - Stop and forget removed resources. - Update modified resources in least intrusive way, ie. don't touch /dev/hast/<name> unless path to local component or provider name were modified. Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com MFC after: 1 month
210883	05-Aug-2010	pjd	Prepare configuration parsing code to be called multiple times: - Don't exit on errors if not requested. - Don't keep configuration in global variable, but allocate memory for configuration. - Call yyrestart() before yyparse() so that on error in configuration file we will start from the beginning next time and not from the place we left off. MFC after: 1 month
210882	05-Aug-2010	pjd	Make control_set_role() more public. We will need it soon. MFC after: 1 month
210881	05-Aug-2010	pjd	Allow to use 'none' keyword as remote address in case second cluster node is not setup yet. MFC after: 1 month
210880	05-Aug-2010	pjd	Reset signal handlers after fork(). MFC after: 1 month
210879	05-Aug-2010	pjd	- Use pjdlog_exitx() to log errors and exit instead of errx(). - Use 'unable to' (instead of 'cannot') consistently. MFC after: 1 month
210876	05-Aug-2010	pjd	Assert that various buffers are large enough. MFC after: 1 month
210875	05-Aug-2010	pjd	Problem with assertion is that it logs on stderr. Add two macros: PJDLOG_ASSERT() and PJDLOG_VERIFY() that will check the given condition and log the problem where appropriate. The difference between those two is that PJDLOG_VERIFY() always works and PJDLOG_ASSERT() can be turned off by defining NDEBUG. MFC after: 1 month
210873	05-Aug-2010	pjd	Keep $FreeBSD$ in __FBSDID() only for C files. MFC after: 1 month
210872	05-Aug-2010	pjd	Mark two more places that we won't reach. MFC after: 1 month
210870	05-Aug-2010	pjd	Now that TCP will be checked last we don't need any knowledge about other protocols. MFC after: 1 month
210869	05-Aug-2010	pjd	Add an argument to the proto_register() function which allows protocol to declare it is the default and be placed at the end of the queue so it is checked last. MFC after: 1 month
210702	31-Jul-2010	joel	Spelling fixes.
210368	22-Jul-2010	pjd	Actually, only the fullsync mode is implemented, not memsync mode. Correct manual page. MFC after: 3 days
209185	14-Jun-2010	pjd	Correct various log messages. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
209184	14-Jun-2010	pjd	Fix typos. MFC after: 3 days
209183	14-Jun-2010	pjd	Initialize gctl_seq for synchronization requests. Reported by: hiroshi@soupacific.com Analysed by: Mikolaj Golub <to.my.trociny@gmail.com> Tested by: hiroshi@soupacific.com, Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
209182	14-Jun-2010	pjd	Plug memory leak. Found by: Coverity Prevent CID: 7057 MFC after: 3 days
209181	14-Jun-2010	pjd	Plug memory leak. Found by: Coverity Prevent CID: 7056 MFC after: 3 days
209180	14-Jun-2010	pjd	Plug memory leak. Found by: Coverity Prevent CID: 7051 MFC after: 3 days
209179	14-Jun-2010	pjd	Plug memory leaks. Found by: Coverity Prevent CID: 7052, 7053, 7054, 7055 MFC after: 3 days
209177	14-Jun-2010	pjd	Remove macros that are not really needed. The idea was to have them in case we grow more descriptors, but I'll reconsider readding them once we get there. Passing (a = b) expression to FD_ISSET() is bad idea, as FD_ISSET() evaluates its argument twice. Found by: Coverity Prevent CID: 5243 MFC after: 3 days
209175	14-Jun-2010	pjd	Eliminate dead code. Found by: Coverity Prevent CID: 5158 MFC after: 3 days
208028	13-May-2010	uqs	mdoc: move remaining sections into consistent order This pertains mostly to FILES, HISTORY, EXIT STATUS and AUTHORS sections. Found by: mdocml lint run Reviewed by: ru
207390	29-Apr-2010	pjd	The default connection timeout is way too long. To make it shorter we have to make the socket non-blocking, call connect() and, if we get EINPROGRESS, wait using select(). Very complex, but I know of no other way to define a connection timeout for a given socket. Reported by: hiroshi@soupacific.com MFC after: 3 days
207372	29-Apr-2010	pjd	- Check if the worker process was killed by signal and restart it. - Improve logging. Pointed out by: Garrett Cooper <yanefbsd@gmail.com> MFC after: 3 days
207371	29-Apr-2010	pjd	Fix a problem where hastd would get stuck in recv(2) after sending a request to a secondary that died between send(2) and recv(2). Do it by adding a timeout to recv(2) for the primary incoming and outgoing sockets and the secondary outgoing socket. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> Tested by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
207348	28-Apr-2010	pjd	Restart worker thread only if the problem was temporary. In case of a persistent problem we don't want to loop forever. MFC after: 3 days
207347	28-Apr-2010	pjd	Mark temporary issues as such. MFC after: 3 days
207345	28-Apr-2010	pjd	Use WEXITSTATUS() to obtain real exit code. MFC after: 3 days
207343	28-Apr-2010	pjd	Don't assume that "resource" property is in metadata. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
207070	22-Apr-2010	pjd	Fix compilation with WITHOUT_CRYPT or WITHOUT_OPENSSL options. Reported by: Andrei V. Lavreniyuk <andy.lavr@reactor-xg.kiev.ua> MFC after: 3 days
206697	16-Apr-2010	pjd	Fix log size calculation which caused message truncation. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
206696	16-Apr-2010	pjd	Fix control socket leak when worker process exits. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days
206669	15-Apr-2010	pjd	Increase ggate queue size to maximum value. HAST was not able to withstand heavy random load. Reported by: Hiroyuki Yamagami MFC after: 3 days
205738	27-Mar-2010	pjd	Don't hold connection lock when doing reconnects as it makes I/Os wait for connection timeouts. Reported by: Kevin Day <toasty@dragondata.com>
204596	02-Mar-2010	uqs	Remove redundant WARNS?=6 overrides and inherit the WARNS setting from the toplevel directory. This does not change any WARNS level and survives a make universe. Approved by: ed (co-mentor)
204352	26-Feb-2010	ru	Fixed static linkage.
204177	21-Feb-2010	pjd	Changing proto_socketpair.c compilation and linking order revealed a problem - we should simply ignore proto_server() if the address doesn't start with socketpair://, and not abort.
204076	18-Feb-2010	pjd	Please welcome HAST - Highly Available Storage. HAST allows data to be stored transparently on two physically separated machines connected over a TCP/IP network. HAST works in a Primary-Secondary (Master-Backup, Master-Slave) configuration, which means that only one of the cluster nodes can be active at any given time. Only the Primary node is able to handle I/O requests to HAST-managed devices. Currently HAST is limited to two cluster nodes in total. HAST operates at the block level - it provides disk-like devices in the /dev/hast/ directory for use by file systems and/or applications. Working at the block level makes it transparent to file systems and applications. There is no difference between using a HAST-provided device and a raw disk, partition, etc. All of them are just regular GEOM providers in FreeBSD. For more information please consult the hastd(8), hastctl(8) and hast.conf(5) manual pages, as well as http://wiki.FreeBSD.org/HAST. Sponsored by: FreeBSD Foundation Sponsored by: OMCnet Internet Service GmbH Sponsored by: TransIP BV
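
Note on r207390: the connection-timeout technique that commit describes (make the socket non-blocking, call connect(), and wait with select() on EINPROGRESS) can be sketched roughly as follows. This is a minimal illustration written for this log, not the code committed to hastd; the function names connect_with_timeout() and connect_loopback() are hypothetical, and restoring blocking mode at the end is an assumption about what a caller would want.

```c
#include <sys/types.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from hastd): connect to addr, giving up
 * after timeout_sec seconds. Returns a connected, blocking descriptor on
 * success, or -1 with errno set on failure.
 */
int
connect_with_timeout(const struct sockaddr *addr, socklen_t addrlen,
    int timeout_sec)
{
	struct timeval tv;
	fd_set wfds;
	socklen_t errlen;
	int fd, flags, error, ret;

	assert(timeout_sec > 0);

	fd = socket(addr->sa_family, SOCK_STREAM, 0);
	if (fd == -1)
		return (-1);
	if (fd >= FD_SETSIZE) {
		/* select() cannot watch descriptors this large. */
		errno = EMFILE;
		goto fail;
	}

	/* Step 1: make the socket non-blocking. */
	flags = fcntl(fd, F_GETFL);
	if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		goto fail;

	/* Step 2: start connecting; EINPROGRESS means "not finished yet". */
	if (connect(fd, addr, addrlen) == -1) {
		if (errno != EINPROGRESS)
			goto fail;

		/* Step 3: wait until the socket is writable or we time out. */
		FD_ZERO(&wfds);
		FD_SET(fd, &wfds);
		tv.tv_sec = timeout_sec;
		tv.tv_usec = 0;
		ret = select(fd + 1, NULL, &wfds, NULL, &tv);
		if (ret == 0)
			errno = ETIMEDOUT;
		if (ret <= 0)	/* Timeout or error (EINTR is not retried). */
			goto fail;

		/* Step 4: fetch the final status of the connection attempt. */
		error = 0;
		errlen = sizeof(error);
		if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &errlen) == -1)
			goto fail;
		if (error != 0) {
			errno = error;
			goto fail;
		}
	}

	/* Assumption: callers want a blocking socket back. */
	if (fcntl(fd, F_SETFL, flags) == -1)
		goto fail;
	return (fd);

fail:
	error = errno;
	close(fd);
	errno = error;
	return (-1);
}

/* Hypothetical convenience wrapper: connect to 127.0.0.1:port. */
int
connect_loopback(unsigned short port, int timeout_sec)
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	return (connect_with_timeout((const struct sockaddr *)&sin,
	    sizeof(sin), timeout_sec));
}
```

A caller would use it as, for example, fd = connect_loopback(port, 5), treating -1 with errno set to ETIMEDOUT as "peer did not answer in time". The getsockopt(SO_ERROR) step matters because select() reports the socket as writable both when the connection succeeds and when it fails.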