#
295454 |
|
10-Feb-2016 |
jhb |
MFC 287442,287537,288944: Fix corruption of coredumps due to procstat notes changing size during coredump generation. The changes in r287442 required some reworking since the 'fo_fill_kinfo' file op does not exist in stable/10.
287442: Detect badly behaved coredump note helpers
Coredump notes depend on being able to invoke dump routines twice; once in a dry-run mode to get the size of the note, and another to actually emit the note to the corefile.
When a note helper emits a different length section the second time around than the length it requested the first time, the kernel produces a corrupt coredump.
NT_PROCSTAT_FILES output length, when packing kinfo structs, is tied to the length of filenames corresponding to vnodes in the process' fd table via vn_fullpath. As vnodes may move around during dump, this is racy.
So:
- Detect badly behaved notes in putnote() and pad underfilled notes.
- Add a fail point, debug.fail_point.fill_kinfo_vnode__random_path to exercise the NT_PROCSTAT_FILES corruption. It simply picks random lengths to expand or truncate paths to in fo_fill_kinfo_vnode().
- Add a sysctl, kern.coredump_pack_fileinfo, to allow users to disable kinfo packing for PROCSTAT_FILES notes. This should avoid both FILES note corruption and truncation, even if filenames change, at the cost of about 1 kiB in padding bloat per open fd. Document the new sysctl in core.5.
- Fix note_procstat_files to self-limit in the 2nd pass. Since sometimes this will result in a short write, pad up to our advertised size. This addresses note corruption, at the risk of sometimes truncating the last several fd info entries.
- Fix NT_PROCSTAT_FILES consumers libutil and libprocstat to grok the zero padding.
287537: Follow-up to r287442: Move sysctl to compiled-once file
Avoid duplicate sysctl nodes.
288944: Fix core corruption caused by race in note_procstat_vmmap
This fix is spiritually similar to r287442 and was discovered thanks to the KASSERT added in that revision.
NT_PROCSTAT_VMMAP output length, when packing kinfo structs, is tied to the length of filenames corresponding to vnodes in the process' vm map via vn_fullpath. As vnodes may move during coredump, this is racy.
We do not remove the race, only prevent it from causing coredump corruption.
- Add a sysctl, kern.coredump_pack_vmmapinfo, to allow users to disable kinfo packing for PROCSTAT_VMMAP notes. This avoids VMMAP corruption and truncation, even if names change, at the cost of up to PATH_MAX bytes per mapped object. The new sysctl is documented in core.5.
- Fix note_procstat_vmmap to self-limit in the second pass. This addresses corruption, at the cost of sometimes producing a truncated result.
- Fix PROCSTAT_VMMAP consumers libutil (and libprocstat, via copy-paste) to grok the new zero padding.
Approved by: re (gjb)
|
#
185548 |
|
02-Dec-2008 |
peter |
Merge user/peter/kinfo branch as of r185547 into head.
This changes struct kinfo_filedesc and kinfo_vmentry such that they are same on both 32 and 64 bit platforms like i386/amd64 and won't require sysctl wrapping.
Two new OIDs are assigned. The old ones are available under COMPAT_FREEBSD7 - but it isn't that simple. The superceded interface was never actually released on 7.x.
The other main change is to pack the data passed to userland via the sysctl. kf_structsize and kve_structsize are reduced for the copyout. If you have a process with 100,000+ sockets open, the unpacked records require a 132MB+ copyout. With packing, it is "only" ~35MB. (Still seriously unpleasant, but not quite as devastating). A similar problem exists for the vmentry structure - have lots and lots of shared libraries and small mmaps and its copyout gets expensive too.
My immediate problem is valgrind. It traditionally achieves this functionality by parsing procfs output, in a packed format. Secondly, when tracing 32 bit binaries on amd64 under valgrind, it uses a cross compiled 32 bit binary which ran directly into the differing data structures in 32 vs 64 bit mode. (valgrind uses this to track file descriptor operations and this therefore affected every single 32 bit binary)
I've added two utility functions to libutil to unpack the structures into a fixed record length and to make it a little more convenient to use.
|