#
f75764fe |
|
10-May-2024 |
John Baldwin <jhb@FreeBSD.org> |
md: Merge two switch statements in mdstart_vnode While here, use bp->bio_cmd instead of auio.uio_rw to drive read vs write behavior. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D45155
|
#
13a5a46c |
|
29-Apr-2024 |
Andrew Gallatin <gallatin@FreeBSD.org> |
Fix new users of MAXPHYS and hide it from the kernel namespace In cd8537910406, kib made maxphys a load-time tunable. This made the #define MAXPHYS in sys/param.h almost entirely obsolete, as it could now be overridden by kern.maxphys at boot time, or by opt_maxphys.h. However, decades of tradition have led to several new, incorrect, uses of MAXPHYS in other parts of the kernel, mostly by seasoned developers. I've corrected those uses here in a mechanical fashion, and verified that it fixes a bug in the md driver that I was experiencing. Since using MAXPHYS is such an easy mistake to make, it is best to hide it from the kernel namespace. So I've moved its definition to _maxphys.h, which is now included in param.h only for userspace. That brings up the fact that lots of userspace programs use MAXPHYS for different reasons, most of them probably wrong. Userspace consumers that really need to know the value of maxphys should probably be changed to use the kern.maxphys sysctl. But that's outside the scope of this change. Reviewed by: imp, jkim, kib, markj Fixes: 30038a8b4efc ("md: Get rid of the pbuf zone") Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D44986
|
#
29363fb4 |
|
23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags. Remove ancient SCCS tags from the tree, automated scripting, with two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
|
#
95ee2897 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: two-line .h pattern Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
#
58a46cfd |
|
08-Aug-2023 |
Mike Karels <karels@FreeBSD.org> |
md driver compat32: fix structure padding for arm, powerpc Because the 32-bit md_ioctl structure contains 64-bit members, arm and powerpc add padding to a multiple of 8. i386 doesn't do this. The md_ioctl32 definition was correct for amd64/i386 without padding, but wrong for arm64 and powerpc64. Make __packed__ conditional on __amd64__, and test for the expected size on non-amd64. Note that mdconfig is used in the ATF test suite. Note, I verified the structure size for powerpc, but was unable to test. MFC after: 1 week Reviewed by: jrtc27 Differential Revision: https://reviews.freebsd.org/D41339 Discussed with: jhibbits
|
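The padding difference described in the entry above can be reproduced with a toy structure. This is a minimal sketch, assuming a GCC- or Clang-compatible compiler on a 64-bit host; `struct demo_natural` and `struct demo_packed` are hypothetical stand-ins and do not reproduce the real `md_ioctl32` layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with a 64-bit member after an odd number of 32-bit ones. */
struct demo_natural {
    uint32_t version;
    uint32_t unit;
    uint32_t type;
    uint64_t mediasize;   /* 8-byte alignment on most 64-bit ABIs */
    uint32_t sectorsize;
};

/* Same members, packed: mimics an ABI that only aligns 64-bit members to 4. */
struct demo_packed {
    uint32_t version;
    uint32_t unit;
    uint32_t type;
    uint64_t mediasize;
    uint32_t sectorsize;
} __attribute__((__packed__));

/* Compile-time size check, in the spirit of the size test the commit mentions. */
_Static_assert(sizeof(struct demo_packed) == 24, "packed layout has no padding");

/* Bytes of padding the naturally aligned layout adds on this host. */
size_t
demo_padding_bytes(void)
{
    assert(sizeof(struct demo_natural) >= sizeof(struct demo_packed));
    return (sizeof(struct demo_natural) - sizeof(struct demo_packed));
}
```

On a host that aligns `uint64_t` to 8 bytes, the natural layout is rounded up to a multiple of 8, so a compat structure that is packed unconditionally would not match what a 32-bit arm or powerpc binary passes in.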
#
30038a8b |
|
23-May-2023 |
Mark Johnston <markj@FreeBSD.org> |
md: Get rid of the pbuf zone The zone is used solely to provide KVA for mapping BIOs so that we can pass mapped buffers to VOP_READ and VOP_WRITE. Currently we preallocate nswbuf/10 bufs for this purpose during boot. The intent was to limit KVA usage on 32-bit systems, but the preallocation means that we in fact consumed more KVA than needed unless one has more than nswbuf/10 (typically 25) vnode-backed MD devices in existence, which I would argue is the uncommon case. Meanwhile, all I/O to an MD is handled by a dedicated thread, so we can instead simply preallocate the KVA region at MD device creation time. Event: BSDCan 2023 Reviewed by: kib MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D40215
|
#
ad8feb1e |
|
19-Jan-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
md.c: another style fix Noted by: jkim Sponsored by: The FreeBSD Foundation MFC after: 3 days
|
#
6189672e |
|
18-Jan-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
Handle ERELOOKUP from VOP_FSYNC() in several other places We need to repeat the operation if the vnode was relocked. Reported and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D38114
|
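The "repeat the operation if the vnode was relocked" rule above amounts to a retry loop. The following is a rough userspace sketch, not the kernel code: `DEMO_ERELOOKUP`, `demo_fsync()` and `retry_on_relookup()` are hypothetical names, and the constant is only a stand-in for the kernel's ERELOOKUP.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ERELOOKUP (-5)   /* stand-in value for illustration only */

typedef int (*demo_op_t)(void *arg);

/* Call op until it stops asking to be restarted. */
int
retry_on_relookup(demo_op_t op, void *arg)
{
    int error;

    assert(op != NULL);
    do {
        error = op(arg);
    } while (error == DEMO_ERELOOKUP);
    return (error);
}

/*
 * Fake fsync: arg points to the number of times the "vnode" still gets
 * relocked; each relock forces one restart, then the call succeeds.
 */
int
demo_fsync(void *arg)
{
    int *relocks_left = arg;

    if (*relocks_left > 0) {
        (*relocks_left)--;
        return (DEMO_ERELOOKUP);
    }
    return (0);
}
```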
#
bb92cd7b |
|
24-Mar-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd)
|
#
cb28dfb2 |
|
17-Feb-2022 |
Aleksandr Fedorov <afedorov@FreeBSD.org> |
md(4): Add dummy support of the BIO_FLUSH command for malloc and swap backend. PR: 260200 Reported by: editor@callfortesting.org Reviewed by: vmaffione (mentor), markj Approved by: vmaffione (mentor), markj Differential Revision: https://reviews.freebsd.org/D34260
|
#
b9c92d63 |
|
09-Feb-2022 |
Kyle Evans <kevans@FreeBSD.org> |
Annotate geom_md with MODULE_VERSION This was missed in 74d6c131cbe2 where other geom modules were annotated with MODULE_VERSION. Again, the problem is the same: we can't detect that geom_md is loaded into the kernel without it. This was noticed in release builds on the cluster; mdconfig attempts to load geom_md because it can't detect it in the kernel, but the cluster config includes md(4) and does not build the kmod. This problem would have been masked on hosts with the kmod built, as the kmod attempts to register the g_md module and fails. With this commit, mdconfig would not even try to load it again. Reported by: re (cperciva) MFC after: 3 days
|
#
7e1d3eef |
|
25-Nov-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove the unused thread argument from NDINIT* See b4a58fbf640409a1 ("vfs: remove cn_thread") Bump __FreeBSD_version to 1400043.
|
#
3703c188 |
|
11-Sep-2021 |
Ka Ho Ng <khng@FreeBSD.org> |
md: Add MD_MUSTDEALLOC support This adds an option to detect if hole-punching is implemented by the underlying file system. If this flag is set, and if the underlying file system does not support hole-punching, md(4) fails BIO_DELETE requests with EOPNOTSUPP. Sponsored by: The FreeBSD Foundation Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D31883
|
#
47619b60 |
|
31-Aug-2021 |
Mark Johnston <markj@FreeBSD.org> |
md: Clamp to a multiple of the sector size when resizing We do this when creating md(4) devices, in kern_mdattach_locked(), but not when resizing the provider. Apply the same policy when resizing, as many GEOM classes do not expect to deal with providers for which pp->mediasize % pp->sectorsize != 0. Reported by: syzkaller MFC after: 1 week Sponsored by: The FreeBSD Foundation
|
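The clamping policy in the entry above is plain modular arithmetic. A minimal sketch, assuming a hypothetical helper `clamp_to_sector()` that is not part of the driver:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a requested media size down to a whole number of sectors so that
 * mediasize % sectorsize == 0 holds afterwards.
 */
int64_t
clamp_to_sector(int64_t mediasize, uint32_t sectorsize)
{
    assert(sectorsize > 0 && mediasize >= 0);
    return (mediasize - mediasize % sectorsize);
}
```

For example, a request for 1000 bytes with 512-byte sectors is clamped to 512, while 1024 is left unchanged.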
#
78267c2e |
|
19-Aug-2021 |
Ka Ho Ng <khng@FreeBSD.org> |
md: Replace BIO_DELETE emulation with vn_deallocate(9) Both zero-filling and/or deallocation can be done with vn_deallocate(9). Sponsored by: The FreeBSD Foundation Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D28899
|
#
69e18c9b |
|
30-Mar-2021 |
Alex Richardson <arichardson@FreeBSD.org> |
sys/dev/md: Drop unnecessary __GLOBL(mfs_root) LLVM12 complains if you change the symbol binding: error: mfs_root_end changed binding to STB_WEAK [-Werror,-Winline-asm] error: mfs_root changed binding to STB_WEAK [-Werror,-Winline-asm]
|
#
c4cceb1d |
|
04-Jan-2021 |
Mark Johnston <markj@FreeBSD.org> |
md: Fix a race in mdstart_swap() Release a grabbed page's busy state only after marking it as referenced. Otherwise there exists a narrow window where the page could be freed before the update. Before r356902 this was not a problem since the object lock was held. Discussed with: kib Sponsored by: The FreeBSD Foundation
|
#
795a009b |
|
27-Dec-2020 |
Mark Johnston <markj@FreeBSD.org> |
md: Set bio_completed properly in the face of errors Account for any residual bytes. This is only relevant for vnode-backed md(4) devices. Reviewed by: kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27738
|
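"Account for any residual bytes" in the entry above can be read as the following arithmetic. This is a sketch under that reading, with a hypothetical helper name; it is not copied from the driver.

```c
#include <assert.h>

/*
 * Bytes actually transferred for a request of `length` bytes when the
 * backing I/O left `resid` bytes undone (for example after an error).
 */
long
completed_bytes(long length, long resid)
{
    assert(resid >= 0 && resid <= length);
    return (length - resid);
}
```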
#
a7a7c306 |
|
23-Dec-2020 |
Mark Johnston <markj@FreeBSD.org> |
md: Fix a read-after-free in BIO_GETATTR handling g_handleattr_int() consumes the bio if the attribute matches, so when we check bp->bio_cmd bp may have been freed. Move GETATTR handling to a separate function to avoid the problem. We do not need to set bio_completed for such bios, g_handleattr_int() will handle it. Also remove the setting of bio_resid before the devstat_end_transaction_bio() call. All of the md(4) bio handlers set bio_resid already. Reported by: KASAN Reviewed by: kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27724
|
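The hazard described above comes from a helper that consumes (frees) the request when it matches. A userspace sketch of the safe shape, with the attribute handling isolated in its own function, might look like this; `struct demo_req`, `demo_handle_attr_int()` and the attribute names are all hypothetical and only imitate the behaviour the commit attributes to `g_handleattr_int()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request object standing in for a bio. */
struct demo_req {
    const char *attr;   /* attribute name being queried */
    int *result;        /* where a matching handler stores the answer */
};

struct demo_req *
demo_req_new(const char *attr, int *result)
{
    struct demo_req *req = malloc(sizeof(*req));

    assert(req != NULL);
    req->attr = attr;
    req->result = result;
    return (req);
}

/*
 * Consuming helper: on a match it stores the value, frees the request and
 * returns 1. On a mismatch it returns 0 and leaves the request untouched.
 */
int
demo_handle_attr_int(struct demo_req *req, const char *name, int value)
{
    if (strcmp(req->attr, name) != 0)
        return (0);
    *req->result = value;
    free(req);
    return (1);
}

/*
 * Dedicated attribute path. Returns 1 when the request was consumed; the
 * caller must not dereference req after that. Returns 0 when the caller
 * still owns req and has to complete it itself.
 */
int
demo_getattr(struct demo_req *req)
{
    if (demo_handle_attr_int(req, "demo::sectors", 63) ||
        demo_handle_attr_int(req, "demo::heads", 255))
        return (1);
    return (0);
}
```

The short-circuiting `||` matters here: once one helper has consumed the request, no later helper is called with the freed pointer, and the caller decides what to do from the return value alone.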
#
cd853791 |
|
27-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Make MAXPHYS tunable. Bump MAXPHYS to 1M. Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys. Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1. The +1 for pbufs allows several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (*). Overall, we save significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to desired high value. Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except the place which initializes maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embedded structures, get private MAXPHYS-like constant; their conversion is out of scope for this work. Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis, were either submitted by, or based on changes by mav. Suggested by: mav (*) Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27225
|
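The "+ 1" in the pbuf sizing above follows from counting the pages an unaligned buffer touches. A small worked sketch, assuming a 4096-byte page and using hypothetical names (`DEMO_PAGE_SIZE`, `pages_spanned()`):

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u   /* assumed page size for the illustration */

/* Number of pages touched by the byte range [addr, addr + len). */
uint64_t
pages_spanned(uint64_t addr, uint64_t len)
{
    uint64_t first, last;

    assert(len > 0);
    first = addr / DEMO_PAGE_SIZE;
    last = (addr + len - 1) / DEMO_PAGE_SIZE;
    return (last - first + 1);
}
```

With a 1 MiB transfer, a page-aligned buffer touches 256 pages, but the same length starting one byte into a page touches 257, which is why an array sized to exactly the page count of the maximum transfer is one slot short for unaligned buffers.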
#
e2a03adb |
|
12-Nov-2020 |
Mateusz Piotrowski <0mp@FreeBSD.org> |
Fix a typo in a license comment Approved by: kaktus (src)
|
#
1224a253 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
md: clean up empty lines in .c and .h files
|
#
3507b8d4 |
|
28-Jun-2020 |
Mark Johnston <markj@FreeBSD.org> |
Remove some redundant assignments and computations. Reported by: alc Reviewed by: alc, kib Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D25400
|
#
84242cf6 |
|
25-Jun-2020 |
Mark Johnston <markj@FreeBSD.org> |
Call swap_pager_freespace() from vm_object_page_remove(). All vm_object_page_remove() callers, except linux_invalidate_mapping_pages() in the LinuxKPI, free swap space when removing a range of pages from an object. The LinuxKPI case appears to be an unintentional omission that could result in leaked swap blocks, so unconditionally free swap space in vm_object_page_remove() to protect against similar bugs in the future. Reviewed by: alc, kib Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25329
|
#
7aaf252c |
|
28-Feb-2020 |
Jeff Roberson <jeff@FreeBSD.org> |
Convert a few trivial consumers to the new unlocked grab API. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D23847
|
#
d6e13f3b |
|
19-Jan-2020 |
Jeff Roberson <jeff@FreeBSD.org> |
Don't hold the object lock while calling getpages. The vnode pager does not want the object lock held. Moving this out allows further object lock scope reduction in callers. While here add some missing paging in progress calls and an assert. The object handle is now protected explicitly with pip. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D23033
|
#
b249ce48 |
|
03-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427
|
#
2c14385a |
|
03-Jan-2020 |
Mark Johnston <markj@FreeBSD.org> |
Fix a page leak in the md(4) swap I/O path. r356147 removed a vm_page_activate() call, but this is required to ensure that pages end up in the page queues in the first place. Restore the pre-r356157 logic. Now, without the page lock, the vm_page_active() check is racy, but this race is harmless. Reviewed by: alc, kib Reported and tested by: pho Differential Revision: https://reviews.freebsd.org/D23024
|
#
5a93d93e |
|
02-Jan-2020 |
Alexander Motin <mav@FreeBSD.org> |
Avoid duplicate I/O statistics accounting. Like geom_disk, free the provider statistics structure and point GEOM toward local statistics. This saves some CPU time. MFC after: 2 weeks
|
#
024932aa |
|
29-Dec-2019 |
Alexander Motin <mav@FreeBSD.org> |
Use atomic for start_count in devstat_start_transaction(). Combined with earlier nstart/nend removal it allows to remove several locks from request path of GEOM and few other places. It would be cool if we had more SMP-friendly statistics, but this helps too. Sponsored by: iXsystems, Inc.
|
#
9f5632e6 |
|
28-Dec-2019 |
Mark Johnston <markj@FreeBSD.org> |
Remove page locking for queue operations. With the previous reviews, the page lock is no longer required in order to perform queue operations on a page. It is also no longer needed in the page queue scans. This change effectively eliminates remaining uses of the page lock and also the false sharing caused by multiple pages sharing a page lock. Reviewed by: jeff Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D22885
|
#
a8081778 |
|
14-Dec-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Add a deferred free mechanism for freeing swap space that does not require an exclusive object lock. Previously swap space was freed on a best effort basis when a page that had valid swap was dirtied, thus invalidating the swap copy. This may be done inconsistently and requires the object lock which is not always convenient. Instead, track when swap space is present. The first dirty is responsible for deleting space or setting PGA_SWAP_FREE which will trigger background scans to free the swap space. Simplify the locking in vm_fault_dirty() now that we can reliably identify the first dirty. Discussed with: alc, kib, markj Differential Revision: https://reviews.freebsd.org/D22654
|
#
abd80ddb |
|
08-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: introduce v_irflag and make v_type smaller The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time. v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED. Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715
|
#
0f9e06e1 |
|
02-Dec-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Fix a few places that free a page from an object without busy held. This is tightening constraints on busy as a precursor to lockless page lookup and should largely be a NOP for these cases. Reviewed by: alc, kib, markj Differential Revision: https://reviews.freebsd.org/D22611
|
#
0012f373 |
|
14-Oct-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
(4/6) Protect page valid with the busy lock. Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21594
|
#
fee2a2fa |
|
09-Sep-2019 |
Mark Johnston <markj@FreeBSD.org> |
Change synchronization rules for vm_page reference counting. There are several mechanisms by which a vm_page reference is held, preventing the page from being freed back to the page allocator. In particular, holding the page's object lock is sufficient to prevent the page from being freed; holding the busy lock or a wiring is sufficient as well. These references are protected by the page lock, which must therefore be acquired for many per-page operations. This results in false sharing since the page locks are external to the vm_page structures themselves and each lock protects multiple structures. Transition to using an atomically updated per-page reference counter. The object's reference is counted using a flag bit in the counter. A second flag bit is used to atomically block new references via pmap_extract_and_hold() while removing managed mappings of a page. Thus, the reference count of a page is guaranteed not to increase if the page is unbusied, unmapped, and the object's write lock is held. As a consequence of this, the page lock no longer protects a page's identity; operations which move pages between objects are now synchronized solely by the objects' locks. The vm_page_wire() and vm_page_unwire() KPIs are changed. The former requires that either the object lock or the busy lock is held. The latter no longer has a return value and may free the page if it releases the last reference to that page. vm_page_unwire_noq() behaves the same as before; the caller is responsible for checking its return value and freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is introduced for use in pmap_extract_and_hold(). It fails if the page is concurrently being unmapped, typically triggering a fallback to the fault handler. vm_page_wire() no longer requires the page lock and vm_page_unwire() now internally acquires the page lock when releasing the last wiring of a page (since the page lock still protects a page's queue state). 
In particular, synchronization details are no longer leaked into the caller. The change excises the page lock from several frequently executed code paths. In particular, vm_object_terminate() no longer bounces between page locks as it releases an object's pages, and direct I/O and sendfile(SF_NOCACHE) completions no longer require the page lock. In these latter cases we now get linear scalability in the common scenario where different threads are operating on different files. __FreeBSD_version is bumped. The DRM ports have been updated to accommodate the KPI changes. Reviewed by: jeff (earlier version) Tested by: gallatin (earlier version), pho Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20486
|
#
dcb235ab |
|
16-Aug-2019 |
Brooks Davis <brooks@FreeBSD.org> |
md(4): remove the unused and unusable MDIOCLIST ioctl. It is unused, the ABI was broken in r322969, and it is broken by design (more than MDNPAD md devices can exist and there is no way to retrieve them with this interface). mdconfig(8) was converted to use libgeom to obtain this information in r157160 and any other consumers of MDIOCLIST should likewise be converted. Reviewed by: emaste Relnotes: yes Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D18936
|
#
7e1a6d47 |
|
31-Mar-2019 |
Kirk McKusick <mckusick@FreeBSD.org> |
When using the force option to shut down a memory-disk device, I/O operations already in its queue were not being properly drained. The GEOM framework does the queue draining, but the device driver needs to wait for the draining to happen. The waiting is done by adding a g_md_providergone() function to wait for the I/O operations to finish up. It is likely that every GEOM provider that implements orphaning attached GEOM consumers needs to use the "providergone" mechanism for this same reason, but some of them do not do so. Apparently Kenneth Merry (ken@) added the drain for just such races, but he missed adding it to some of the device drivers that needed it. Submitted by: Chuck Silvers Reviewed by: imp Tested by: Chuck Silvers MFC after: 1 week Sponsored by: Netflix
|
#
756a5412 |
|
14-Jan-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Allocate pager bufs from UMA instead of 80-ish mutex protected linked list. o In vm_pager_bufferinit() create pbuf_zone and start accounting on how many pbufs are we going to have set. In various subsystems that are going to utilize pbufs create private zones via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(), and sets a limit on created zone. After startup preallocate pbufs according to requirements of all pbuf zones. Subsystems that used to have a private limit with old allocator now have private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS, swap, vnode pager. The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9), aio(4). They should have their private limits, but changing that is out of scope of this commit. o Fetch tunable value of kern.nswbuf from init_param2() and while here move NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only this option. Default values aren't touched by this commit, but they probably should be reviewed wrt to modern hardware. This change removes a tight bottleneck from sendfile(2) operation, that uses pbufs in vnode pager. Other pagers also would benefit from faster allocation. Together with: gallatin Tested by: pho
|
#
c907940b |
|
22-Dec-2018 |
Bruce Evans <bde@FreeBSD.org> |
Fix devstat on md devices, second attempt. r341765 depends on g_io_deliver() finishing initialization of the bio, but g_io_deliver() actually destroys the bio. INVARIANTS makes the bug obvious by overwriting the bio with garbage. Restore the old order for calling devstat (except don't restore not calling it for the error case), and translate to the devstat KPI so that this order works. Reviewed by: kib
|
#
9e5ed859 |
|
21-Dec-2018 |
Bruce Evans <bde@FreeBSD.org> |
Use VOP_ADVISE() with POSIX_FADV_DONTNEED instead of IO_DIRECT to implement not double-caching for reads from vnode-backed md devices. Use VOP_ADVISE() similarly instead of !IO_DIRECT unsimilarly for writes. Add a "cache" option to mdconfig to allow changing the default of not caching. This depends on a recent commit to fix VOP_ADVISE(). A previous version had optimizations for sequential i/o's (merge the i/o's and only uncache for discontiguous i/o's and for full blocks), but optimizations and knowledge of block boundaries belong in VOP_ADVISE(). Read-ahead should also be handled better, by supporting it in md and discarding it in VOP_ADVISE(). POSIX_FADV_DONTNEED is ignored by zfs, but so is IO_DIRECT. POSIX_FADV_DONTNEED works better than IO_DIRECT if it is not ignored, since it only discards from the buffer cache immediately, while IO_DIRECT also discards from the page cache immediately. IO_DIRECT was not used for writes since it was claimed to be too slow, but most of the slowness for writes is from doing them synchronously by default. Non-synchronous writes still deadlock in many cases. IO_DIRECT only has a special implementation for ffs reads with DIRECTIO configured. Otherwise, if it is not ignored then it uses the buffer and page caches normally except for discarding everything after each i/o, and then it has much the same overheads as POSIX_FADV_DONTNEED. The overheads for reading with ffs and DIRECTIO were similar in tests of md. Reviewed by: kib
|
#
dac6a0d5 |
|
09-Dec-2018 |
Bruce Evans <bde@FreeBSD.org> |
Fix devstat on md devices. devstat_end_transaction() was called before the i/o was actually ended (by delivering it to GEOM), so at least the i/o length was messed up. It was always recorded as 0, so the average transaction size and the average transfer rate was always displayed as 0. devstat_end_transaction() was not called at all for the error case, so there were sometimes multiple starts per end. I didn't observe this in practice and don't know if it did much damage. I think it extended the length of the i/o to the next transaction. Reviewed by: kib
|
#
7b2c7b92 |
|
07-Jun-2018 |
Breno Leitao <leitao@FreeBSD.org> |
md: use prestaged mfs_root On PowerNV systems, the rootfs is passed through kexec, which loads the rootfs into memory and sets two fdt entries to describe where the file is located in memory; I need to pass this memory region to the md device as a mfs_root, but the current md driver does not support two things: * Just getting a pointer from an external (bootloader) memory. If I need to work around it, I would need to declare a static array and memcopy from this external memory to this static variable. * The size of the image. The usage of mfs_root_end, which is not a pointer, seems to be not possible for this prestaged scenario. This patch simply adds a new way to load mfs_root from memory. Differential Revision: https://reviews.freebsd.org/D15625 Approved by: kib, jhibbits (mentor)
|
#
6469bdcd |
|
06-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Move most of the contents of opt_compat.h to opt_global.h. opt_compat.h is mentioned in nearly 180 files. In-progress network driver compatibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options. Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h is created on all architectures. Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files. Reviewed by: kib, cem, jhb, jtl Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14941
|
#
026cac82 |
|
27-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Move 32-bit compat for md(4) ioctls into the md code. This is more correct in that ioctl commands have no meaning until they hit the handler associated with the file descriptor. Add support for MDIOCRESIZE_32 which was missed when it was added. Reviewed by: cem, kib, markj (various versions) Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14714
|
#
34a77b97 |
|
27-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Move uio enums to sys/_uio.h. Include _uio.h instead of uio.h in several headers to reduce header pollution. Fix a few places that relied on header pollution to get the uio.h header. I have not moved struct uio as many more things that use it rely on header pollution to get other definitions from uio.h. Reviewed by: cem, kib, markj Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14811
|
#
064c9c3d |
|
15-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Add a request structure and make the implementation use it. This allows compatibility translation to take place on the stack (md_ioctl is too big) and is more suitable as a public interface within the kernel than the kern_ioctl interface. Except for the initialization of the md_req from the md_ioctl (including detection of kernel md_file pointers) and the updating of the md_ioctl prior to return, this is a mechanical replacement of md_ioctl and mdio with md_req and mdr. Reviewed by: markj, cem, kib (assorted versions) Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14704
|
#
b65794ad |
|
15-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Move implementation of ioctls into kern_*() functions. Move locks from outside ioctl to the individual implementations. This is the first step of changing the implementations to act on a kernel-internal request struct rather than on struct md_ioctl and to removing the use of kern_ioctl in mountroot. Reviewed by: cem, kib, markj (prior version) Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14700
|
#
94598ac9 |
|
15-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Restore the behavior of returning the total number of units by unconditionally incrementing i in the loop; Reported by: cem MFC with: r330880 Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14685
|
#
8b9f77a1 |
|
13-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Don't overflow the kernel struct mdio in the MDIOCLIST ioctl. Always terminate the list with -1 and document the ioctl behavior. This preserves existing behavior as seen from userspace with the addition of the unconditional termination which will not be seen by working consumers of MDIOCLIST. Because this ioctl can only be performed by root (in default configurations) and is not used in the base system this bug is not deemed to warrant either a security advisory or an errata notice. Reviewed by: kib Obtained from: CheriBSD Discussed with: security-officer (gordon) MFC after: 3 days Security: kernel heap buffer overflow Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14685
|
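The bounded fill described in this entry, together with the follow-up that counts every unit, can be sketched roughly as below. This is an illustration, not the committed code: `DEMO_NPAD` stands in for MDNPAD with an arbitrary small value, and `demo_list_units()` is a hypothetical function.

```c
#include <assert.h>

#define DEMO_NPAD 8   /* hypothetical slot count for the illustration */

/*
 * pad[0] receives the total number of units. pad[1..] receive unit numbers
 * while space remains, and the list is always terminated with -1 without
 * ever writing past pad[DEMO_NPAD - 1].
 */
void
demo_list_units(const int *units, int nunits, int pad[DEMO_NPAD])
{
    int i = 1;

    assert(nunits >= 0);
    for (int u = 0; u < nunits; u++) {
        if (i < DEMO_NPAD - 1)
            pad[i] = units[u];
        i++;   /* count every unit, even those that do not fit */
    }
    pad[i < DEMO_NPAD - 1 ? i : DEMO_NPAD - 1] = -1;
    pad[0] = i - 1;
}
```

With more units than slots, the caller still learns the true total from `pad[0]`, while the stored list is truncated and terminated inside the array.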
#
f05c4956 |
|
09-Jan-2018 |
Jonathan T. Looney <jtl@FreeBSD.org> |
Fix backwards MD_VERIFY logic for md devices. If the MD_VERIFY flag is set, we should use O_VERIFY. If the MD_VERIFY flag is not set, we should not. Reviewed by: stevek Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D13814
|
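The corrected logic above is a one-way implication from the md flag to the open flag. A minimal sketch with made-up constants (`DEMO_MD_VERIFY`, `DEMO_O_VERIFY` and `DEMO_FREAD` are hypothetical values, not the kernel's):

```c
#include <assert.h>

#define DEMO_FREAD     0x0001   /* hypothetical base open flag */
#define DEMO_O_VERIFY  0x2000   /* hypothetical stand-in for O_VERIFY */
#define DEMO_MD_VERIFY 0x0100u  /* hypothetical stand-in for MD_VERIFY */

/* Request verification on open only when the md flag asks for it. */
int
demo_open_flags(unsigned int mdflags)
{
    int flags = DEMO_FREAD;

    if ((mdflags & DEMO_MD_VERIFY) != 0)
        flags |= DEMO_O_VERIFY;
    return (flags);
}

/* Usage: the verify bit appears if and only if the md flag is set. */
void
demo_open_flags_usage(void)
{
    assert((demo_open_flags(DEMO_MD_VERIFY) & DEMO_O_VERIFY) != 0);
    assert((demo_open_flags(0) & DEMO_O_VERIFY) == 0);
}
```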
#
5cf10fb9 |
|
20-Dec-2017 |
Ian Lepore <ian@FreeBSD.org> |
Add a new kernel config option, MD_ROOT_READONLY, which forces on the MD_READONLY flag for the md device automatically instantiated during kernel init for an mdroot filesystem. Note that there is specifically and by design no tunable or sysctl control over this feature. Without this option, you already have control over whether the mdroot fs is writeable using vfs.root.mountfrom.options from loader(8), the root_rw_mount rcvar, and by using "mount -u[rw] /" or equivalent on the fly. This option is being added to provide a way to make the mdroot fs truly immutable before userland code begins running. Differential Revision: https://reviews.freebsd.org/D13411
|
#
64de3fdd |
|
30-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
SPDX: use the Beerware identifier.
|
#
7282444b |
|
20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/dev: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
#
253f5a2e |
|
03-Oct-2017 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Make md(4) support GEOM::ident for vnode-backed disks. It's based on backing file device and inode numbers. This is useful for gmountver(8) regression tests. MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D12230
|
#
d9ccb9a8 |
|
02-Oct-2017 |
Alan Cox <alc@FreeBSD.org> |
When mdstart_swap() accesses a page that is already in the active queue, mark the page as referenced rather than calling vm_page_activate(). This allows the page's act_count to grow beyond ACT_INIT and better reflect its usage. (See also r324146, which modified a function used by tmpfs, uiomove_object_page(), to behave in the same way.) Reviewed by: kib, markj MFC after: 2 weeks
|
#
f7ca2bbe |
|
28-Aug-2017 |
Maxim Sobolev <sobomax@FreeBSD.org> |
Add ability to label md(4) devices. This feature comes from the fact that we rely heavily on memory-backed md(4) in our build process. However, if the build goes haywire the allocated resources (i.e. swap and memory-backed md(4)'s) need to be purged. It is extremely useful to have the ability to attach arbitrary labels to each of the virtual disks so that they can be identified and GC'ed if necessary. MFC after: 4 weeks Differential Revision: https://reviews.freebsd.org/D10457
|
#
0d48e7e8 |
|
13-Jun-2017 |
Mark Johnston <markj@FreeBSD.org> |
Don't call vm_pager_page_unswapped() when writing or deleting a dirty page. The swap space backing a clean page is released when it is first dirtied, so there's no need to attempt to release swap space when the page is already dirty. Reviewed by: alc MFC after: 1 week
|
#
01bc16bb |
|
13-Jun-2017 |
Mark Johnston <markj@FreeBSD.org> |
Free the request page if an I/O error occurs while reading from swap. After such a failure, the page is invalid, so there's no point in keeping it around. Moreover, such pages were not being inserted into the active queue, making them unreclaimable until a subsequent write or delete made them valid. Reported by: alc Reviewed by: alc (previous revision) MFC after: 1 week
|
#
cc2fe2b0 |
|
13-Jun-2017 |
Mark Johnston <markj@FreeBSD.org> |
Fix handling of subpage BIO_WRITE and BIO_DELETE requests on swap MDs. Such requests would previously mark the entire page as valid, which was incorrect since nothing guaranteed that the page's contents had been initialized. This change also modifies subpage BIO_DELETEs so that the entire page is marked dirty, rather than only a subrange. There is no benefit to creating partially dirty swap pages. Reviewed by: alc, kib (previous version) MFC after: 3 days
|
#
9a81ba0f |
|
31-May-2017 |
Stephen J. Kiernan <stevek@FreeBSD.org> |
Add MD_VERIFY option to enable O_VERIFY in open for vnode type. Add -o [no]verify option to mdconfig (and document in man page.) Implement GEOM attribute MNT::verified to ask md if the backing vnode is verified. Check for MNT::verified in cd9660 mount to flag the mount as MNT_VERIFIED if the underlying device has been verified. Reviewed by: rwatson Approved by: sjg (mentor) Obtained from: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D2902
|
#
fbbd9655 |
|
28-Feb-2017 |
Warner Losh <imp@FreeBSD.org> |
Renumber copyright clause 4 Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96
|
#
4d24901a |
|
19-Feb-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/dev: Replace zero with NULL for pointers. Makes things easier to read, plus architectures may set NULL to something different than zero. Found with: devel/coccinelle MFC after: 3 weeks
|
#
e895e7fc |
|
13-Feb-2017 |
Stephen J. Kiernan <stevek@FreeBSD.org> |
Fix typo where opening brace was needed. Reported by: Michael Butler Reviewed by: sjg Approved by: sjg (mentor)
|
#
d2e63913 |
|
13-Feb-2017 |
Stephen J. Kiernan <stevek@FreeBSD.org> |
For MD_PRELOAD type md(4) devices, if there is a file name in the preloaded meta-data, copy it into the softc structure. When returning md(4) device details to the caller, include the file name in any MD_PRELOAD type devices if it is set (first character is not NUL.) In mdconfig, for "preload" type md(4) devices, if there is file config available, print it in the file column of the output. Reviewed by: brooks Approved by: sjg (mentor) MFC after: 1 month Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D9529
|
#
c3d1c73f |
|
09-Mar-2016 |
Maxim Sobolev <sobomax@FreeBSD.org> |
For the MD_ROOT option don't inject /dev/md0 as root dev when ROOTDEVNAME is defined explicitly. It's kinda pointless and results in extra step in boot sequence which is not really needed, i.e.: md0: Embedded image 1331200 bytes at 0x8038b7b4 Trying to mount root from ufs:/dev/md0 []... Mounting from ufs:/dev/md0 failed with error 22. Trying to mount root from ufs:md0.uzip []... warning: no time-of-day clock registered, system time will not be set accurately start_init: trying /sbin/init
|
#
f4c1f0b9 |
|
02-Feb-2016 |
Adrian Chadd <adrian@FreeBSD.org> |
Fix MFS builds when both MD_ROOT_SIZE and MFS_IMAGE are specified MD_ROOT_SIZE and embed_mfs.sh were basically retired as part of https://reviews.freebsd.org/D2903 . However, when building a kernel with 'options MD_ROOT_SIZE' specified, this results in a non-working MFS, as within sys/dev/md/md.c we fall within the wrong # ifdef. This patch implements the following: * Allow kernels to be built without the MD_ROOT_SIZE option, which results in a kernel built as per D2903. * Allow kernels to be built with the MD_ROOT_SIZE option, which results in a kernel built similarly to the pre-D2903 way, with the following differences: * The MFS is now put in a separate section within the kernel (oldmfs, so it differs from the mfs section introduced by D2903). * embed_mfs.sh is changed, so it looks up the oldmfs section within the kernel, gets its size and offset, sees if the MFS will fit within the allocated oldmfs section and only if all is well does a dd of the MFS image into the kernel. Submitted by: Stanislav Galabov <sgalabov@gmail.com> Reviewed by: brooks, imp Differential Revision: https://reviews.freebsd.org/D5093
|
#
b0cd2017 |
|
16-Dec-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES(). o With new KPI consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With new interface it is easier to implement code protected from race conditions. Such arrayed requests for now should be preceded by a call to vm_pager_haspage() to make sure that request is possible. This could be improved later, making vm_pager_haspage() obsolete. Strengthening the promises on the business of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq(). o New KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by pager, and not controlled by the caller. This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault. Discussed with: kib, alc, jeff, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix
|
#
d5f998ba |
|
12-Dec-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
In md(4) over vnode, correct handling of the unaligned unmapped io requests which page alignment + size is greater than MAXPHYS. Right now md(4) over vnode would use the physical buffer of the size MAXPHYS to map a data of size MAXPHYS + page offset of the user buffer. This typically corrupts next pbuf, or, if the pbuf used was the last pbuf in the map, the next page after the pbuf's map. Split request up to the size of io which fits into pbuf KVA with alignment, and retry if a part of the bio is left unprocessed. Reported by: Fabian Keil <fk@fabiankeil.de> Tested by: Fabian Keil, pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
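The splitting described above can be sketched as plain arithmetic. This is a minimal model, not the driver code: `chunk_len()` and `count_passes()` are hypothetical names, and `kva_size` stands in for the MAXPHYS-sized mapping window of the physical buffer. The key point is that the offset of the data within its first page consumes part of the window, so at most `kva_size - page_off` bytes fit in one pass and the remainder must be retried.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes that fit in one pass through a
 * mapping window of kva_size bytes when the data starts page_off bytes
 * into its first page.
 */
static size_t
chunk_len(size_t remaining, size_t page_off, size_t page_size,
    size_t kva_size)
{
	size_t room;

	assert(page_off < page_size && kva_size > page_size);
	room = kva_size - page_off;	/* the offset eats into the window */
	return (remaining < room ? remaining : room);
}

/* Hypothetical helper: count the passes needed to process len bytes. */
static unsigned
count_passes(size_t len, size_t page_off, size_t page_size, size_t kva_size)
{
	unsigned passes = 0;

	while (len > 0) {
		size_t n = chunk_len(len, page_off, page_size, kva_size);

		len -= n;
		page_off = (page_off + n) % page_size;
		passes++;
	}
	return (passes);
}
```

With 4096-byte pages and a 131072-byte window, an aligned 131072-byte request needs one pass, while the same request starting 512 bytes into a page needs two: 130560 bytes first, then the remaining 512.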
|
#
a9934668 |
|
03-Dec-2015 |
Kenneth D. Merry <ken@FreeBSD.org> |
Add asynchronous command support to the pass(4) driver, and the new camdd(8) utility. CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and completed CCBs may be retrieved via the CAMIOGET ioctl. User processes can use poll(2) or kevent(2) to get notification when I/O has completed. While the existing CAMIOCOMMAND blocking ioctl interface only supports user virtual data pointers in a CCB (generally only one per CCB), the new CAMIOQUEUE ioctl supports user virtual and physical address pointers, as well as user virtual and physical scatter/gather lists. This allows user applications to have more flexibility in their data handling operations. Kernel memory for data transferred via the queued interface is allocated from the zone allocator in MAXPHYS sized chunks, and user data is copied in and out. This is likely faster than the vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in configurations with many processors (there are more TLB shootdowns caused by the mapping/unmapping operation) but may not be as fast as running with unmapped I/O. The new memory handling model for user requests also allows applications to send CCBs with request sizes that are larger than MAXPHYS. The pass(4) driver now limits queued requests to the I/O size listed by the SIM driver in the maxio field in the Path Inquiry (XPT_PATH_INQ) CCB. There are some things that would be good to add: 1. Come up with a way to do unmapped I/O on multiple buffers. Currently the unmapped I/O interface operates on a struct bio, which includes only one address and length. It would be nice to be able to send an unmapped scatter/gather list down to busdma. This would allow eliminating the copy we currently do for data. 2. Add an ioctl to list currently outstanding CCBs in the various queues. 3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do that. 4. Test physical address support. 
Virtual pointers and scatter gather lists have been tested, but I have not yet tested physical addresses or scatter/gather lists. 5. Investigate multiple queue support. At the moment there is one queue of commands per pass(4) device. If multiple processes open the device, they will submit I/O into the same queue and get events for the same completions. This is probably the right model for most applications, but it is something that could be changed later on. Also, add a new utility, camdd(8) that uses the asynchronous pass(4) driver interface. This utility is intended to be a basic data transfer/copy utility, a simple benchmark utility, and an example of how to use the asynchronous pass(4) interface. It can copy data to and from pass(4) devices using any target queue depth, starting offset and blocksize for the input and output devices. It currently only supports SCSI devices, but could be easily extended to support ATA devices. It can also copy data to and from regular files, block devices, tape devices, pipes, stdin, and stdout. It does not support queueing multiple commands to any of those targets, since it uses the standard read(2)/write(2)/writev(2)/readv(2) system calls. The I/O is done by two threads, one for the reader and one for the writer. The reader thread sends completed read requests to the writer thread in strictly sequential order, even if they complete out of order. That could be modified later on for random I/O patterns or slightly out of order I/O. camdd(8) uses kqueue(2)/kevent(2) to get I/O completion events from the pass(4) driver and also to send request notifications internally. For pass(4) devices, camdd(8) uses a single buffer (CAM_DATA_VADDR) per CAM CCB on the reading side, and a scatter/gather list (CAM_DATA_SG) on the writing side. In addition to testing both interfaces, this makes any potential reblocking of I/O easier. 
No data is copied between the reader and the writer, but rather the reader's buffers are split into multiple I/O requests or combined into a single I/O request depending on the input and output blocksize. For the file I/O path, camdd(8) also uses a single buffer (read(2), write(2), pread(2) or pwrite(2)) on reads, and a scatter/gather list (readv(2), writev(2), preadv(2), pwritev(2)) on writes. Things that would be nice to do for camdd(8) eventually: 1. Add support for I/O pattern generation. Patterns like all zeros, all ones, LBA-based patterns, random patterns, etc. Right Now you can always use /dev/zero, /dev/random, etc. 2. Add support for a "sink" mode, so we do only reads with no writes. Right now, you can use /dev/null. 3. Add support for automatic queue depth probing, so that we can figure out the right queue depth on the input and output side for maximum throughput. At the moment it defaults to 6. 4. Add support for SATA device passthrough I/O. 5. Add support for random LBAs and/or lengths on the input and output sides. 6. Track average per-I/O latency and busy time. The busy time and latency could also feed in to the automatic queue depth determination. sys/cam/scsi/scsi_pass.h: Define two new ioctls, CAMIOQUEUE and CAMIOGET, that queue and fetch asynchronous CAM CCBs respectively. Although these ioctls do not have a declared argument, they both take a union ccb pointer. If we declare a size here, the ioctl code in sys/kern/sys_generic.c will malloc and free a buffer for either the CCB or the CCB pointer (depending on how it is declared). Since we have to keep a copy of the CCB (which is fairly large) anyway, having the ioctl malloc and free a CCB for each call is wasteful. sys/cam/scsi/scsi_pass.c: Add asynchronous CCB support. Add two new ioctls, CAMIOQUEUE and CAMIOGET. CAMIOQUEUE adds a CCB to the incoming queue. 
The CCB is executed immediately (and moved to the active queue) if it is an immediate CCB, but otherwise it will be executed in passstart() when a CCB is available from the transport layer. When CCBs are completed (because they are immediate or passdone() if they are queued), they are put on the done queue. If we get the final close on the device before all pending I/O is complete, all active I/O is moved to the abandoned queue and we increment the peripheral reference count so that the peripheral driver instance doesn't go away before all pending I/O is done. The new passcreatezone() function is called on the first call to the CAMIOQUEUE ioctl on a given device to allocate the UMA zones for I/O requests and S/G list buffers. This may be good to move off to a taskqueue at some point. The new passmemsetup() function allocates memory and scatter/gather lists to hold the user's data, and copies in any data that needs to be written. For virtual pointers (CAM_DATA_VADDR), the kernel buffer is malloced from the new pass(4) driver malloc bucket. For virtual scatter/gather lists (CAM_DATA_SG), buffers are allocated from a new per-pass(9) UMA zone in MAXPHYS-sized chunks. Physical pointers are passed in unchanged. We have support for up to 16 scatter/gather segments (for the user and kernel S/G lists) in the default struct pass_io_req, so requests with longer S/G lists require an extra kernel malloc. The new passcopysglist() function copies a user scatter/gather list to a kernel scatter/gather list. The number of elements in each list may be different, but (obviously) the amount of data stored has to be identical. The new passmemdone() function copies data out for the CAM_DATA_VADDR and CAM_DATA_SG cases. The new passiocleanup() function restores data pointers in user CCBs and frees memory. Add new functions to support kqueue(2)/kevent(2): passreadfilt() tells kevent whether or not the done queue is empty. passkqfilter() adds a knote to our list. 
passreadfiltdetach() removes a knote from our list. Add a new function, passpoll(), for poll(2)/select(2) to use. Add devstat(9) support for the queued CCB path. sys/cam/ata/ata_da.c: Add support for the BIO_VLIST bio type. sys/cam/cam_ccb.h: Add a new enumeration for the xflags field in the CCB header. (This doesn't change the CCB header, just adds an enumeration to use.) sys/cam/cam_xpt.c: Add a new function, xpt_setup_ccb_flags(), that allows specifying CCB flags. sys/cam/cam_xpt.h: Add a prototype for xpt_setup_ccb_flags(). sys/cam/scsi/scsi_da.c: Add support for BIO_VLIST. sys/dev/md/md.c: Add BIO_VLIST support to md(4). sys/geom/geom_disk.c: Add BIO_VLIST support to the GEOM disk class. Re-factor the I/O size limiting code in g_disk_start() a bit. sys/kern/subr_bus_dma.c: Change _bus_dmamap_load_vlist() to take a starting offset and length. Add a new function, _bus_dmamap_load_pages(), that will load a list of physical pages starting at an offset. Update _bus_dmamap_load_bio() to allow loading BIO_VLIST bios. Allow unmapped I/O to start at an offset. sys/kern/subr_uio.c: Add two new functions, physcopyin_vlist() and physcopyout_vlist(). sys/pc98/include/bus.h: Guard kernel-only parts of the pc98 machine/bus.h header with #ifdef _KERNEL. This allows userland programs to include <machine/bus.h> to get the definition of bus_addr_t and bus_size_t. sys/sys/bio.h: Add a new bio flag, BIO_VLIST. sys/sys/uio.h: Add prototypes for physcopyin_vlist() and physcopyout_vlist(). share/man/man4/pass.4: Document the CAMIOQUEUE and CAMIOGET ioctls. usr.sbin/Makefile: Add camdd. usr.sbin/camdd/Makefile: Add a makefile for camdd(8). usr.sbin/camdd/camdd.8: Man page for camdd(8). usr.sbin/camdd/camdd.c: The new camdd(8) utility. Sponsored by: Spectra Logic MFC after: 1 week
|
#
c68ea8a6 |
|
13-Aug-2015 |
Marcel Moolenaar <marcel@FreeBSD.org> |
s/as/at/ in previous commit. Pointed out by: jmallett@
|
#
cc787e3d |
|
13-Aug-2015 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Change md(4) to use weak symbols as start, end and size for the embedded root disk. The embedded image is linked into the kernel in the .mfs section. Add rules and variables to kern.pre.mk and kern.post.mk that handle the linking of the image. First objcopy is used to generate an object file. Then, the object file is linked into the kernel. Submitted by: Steve Kiernan <stevek@juniper.net> Reviewed by: brooks@ Obtained from: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D2903
|
#
ec170744 |
|
13-Aug-2015 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Use g_conf_printf_escaped() to escape illegal symbols in file name. PR: 202289 MFC after: 1 week
|
#
5d9b4508 |
|
28-Jul-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
For md(4), posix shm(3) and tmpfs(5), free swap space used by paged in dirty page, which is written by the process. Reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
0d8243cc |
|
18-Mar-2014 |
Attilio Rao <attilio@FreeBSD.org> |
vm_page_grab() and vm_pager_get_pages() can drop the vm_object lock, then threads can sleep on the pip condition. Avoid deadlocking such threads by correctly awakening the sleeping ones after the pip is finished. The swapoff side of the bug can likely result in shutdown deadlocks. Sponsored by: EMC / Isilon Storage Division Reported by: pho, pluknet Tested by: pho
|
#
60b6e197 |
|
10-Dec-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Only assert the length of the passed bio in the mdstart_vnode() when the bio is unmapped, so we must map the bio pages into pbuf. This works around the geom classes which do not follow the MAXPHYS limit on the i/o size, since such classes do not know about unmapped bios either. Reported by: Paolo Pinto <paolo.pinto@netasq.com> Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
bc2308d4 |
|
04-Dec-2013 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Change comment to match code. Discussed with: thompsa Sponsored by: The FreeBSD Foundation
|
#
0efd9bfd |
|
04-Dec-2013 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Add "null" backend to mdconfig(8). This does exactly what the name suggests, and is somewhat useful for benchmarking. MFC after: 1 month No objections from: kib Sponsored by: The FreeBSD Foundation
|
#
40ea77a0 |
|
22-Oct-2013 |
Alexander Motin <mav@FreeBSD.org> |
Merge GEOM direct dispatch changes from the projects/camlock branch. When safety requirements are met, this avoids passing I/O requests to the GEOM g_up/g_down threads by executing them directly in the caller context. That avoids CPU bottlenecks in the g_up/g_down threads and saves several context switches per I/O. The currently defined safety requirements are: - caller should not hold any locks and should be reenterable; - callee should not depend on GEOM dual-threaded concurrency semantics; - on the way down, if request is unmapped while callee doesn't support it, the context should be sleepable; - kernel thread stack usage should be below 50%. To keep compatibility with GEOM classes not meeting the above requirements, new provider and consumer flags were added: - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request); - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done); - G_PF_DIRECT_SEND -- provider code meets caller requirements (done); - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request). A capable GEOM class can set them, allowing direct dispatch in cases where it is safe. If any of the requirements are not met, the request is queued to the g_up or g_down thread as before. The following GEOM classes were reviewed and updated to support direct dispatch: CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE, VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL, MAP, FLASHMAP, etc). To declare direct completion capability, the disk(9) KPI got a new flag equivalent to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. The da(4) and ada(4) disk drivers now set it thanks to earlier CAM locking work. This change more than doubles peak block storage performance on systems with many CPUs, together with earlier CAM locking changes reaching more than 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to 256 user-level threads). Sponsored by: iXsystems, Inc. MFC after: 2 months
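The listed safety requirements amount to a predicate that is evaluated per request. The following is a rough model of that decision for the downward path, not the GEOM implementation: `struct dispatch_ctx`, its field names, and `can_dispatch_directly()` are all hypothetical, and the two `*_direct_ok` fields merely stand in for the sender and receiver flags named in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the conditions checked before direct dispatch. */
struct dispatch_ctx {
	bool	sender_direct_ok;	/* sending side declared itself safe */
	bool	receiver_direct_ok;	/* receiving side declared itself safe */
	bool	caller_holds_locks;	/* caller currently holds a lock */
	bool	request_unmapped;	/* request carries unmapped pages */
	bool	callee_accepts_unmapped;/* callee handles unmapped requests */
	bool	can_sleep;		/* current context may sleep */
	size_t	stack_used;		/* bytes of kernel stack in use */
	size_t	stack_size;		/* total kernel stack size */
};

/* Return true if the request may run in the caller's context. */
static bool
can_dispatch_directly(const struct dispatch_ctx *c)
{
	assert(c != NULL && c->stack_size > 0);
	if (!c->sender_direct_ok || !c->receiver_direct_ok)
		return (false);
	if (c->caller_holds_locks)
		return (false);
	/* Mapping an unmapped request for the callee may need to sleep. */
	if (c->request_unmapped && !c->callee_accepts_unmapped &&
	    !c->can_sleep)
		return (false);
	/* Stack usage must stay below 50%. */
	if (c->stack_used * 2 >= c->stack_size)
		return (false);
	return (true);
}
```

Whenever the predicate is false, the request simply takes the old path through the queue, so a conservative answer costs performance but never correctness.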
|
#
1a42d14a |
|
30-Aug-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Give the page allocations initiated by the swap-backed md(4) a higher priority. If the write is requested by a system daemon, sleeping there would starve resources and cause deadlock. Reported and tested by: pho Sponsored by: The FreeBSD Foundation
|
#
5944de8e |
|
22-Aug-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove the deprecated VM_ALLOC_RETRY flag for the vm_page_grab(9). The flag was mandatory since r209792, where vm_page_grab(9) was changed to only support the alloc retry semantic. Suggested and reviewed by: alc Sponsored by: The FreeBSD Foundation
|
#
c7aebda8 |
|
09-Aug-2013 |
Attilio Rao <attilio@FreeBSD.org> |
The soft and hard busy mechanisms rely on the vm object lock to work. Unify the two concepts into a real, minimal, sxlock where the shared acquisition represents the soft busy and the exclusive acquisition represents the hard busy. The old VPO_WANTED mechanism becomes the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becomes an interlock for this functionality: it can be held in either read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag. The KPI is heavily changed, which is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl
|
#
537cc627 |
|
24-May-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix the data corruption on the swap-backed md. Assign the rv variable a success code if the pager was not asked for the page. Using an error code from the previous processed page caused zeroing of the valid page, when e.g. the previous page was not available in the pager. Reported by: lstewart Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
1ef76554 |
|
02-Apr-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not declare that preloaded md(4) supports unmapped bio requests, it does not. Reported by: <mh@kernel32.de> Sponsored by: The FreeBSD Foundation
|
#
59ec9023 |
|
19-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Support unmapped i/o for the md(4). The vnode-backed md(4) has to map the unmapped bio because VOP_READ() and VOP_WRITE() interfaces do not allow to pass unmapped requests to the filesystem. Vnode-backed md(4) uses pbufs instead of relying on the bio_transient_map, to avoid usual md deadlock. Sponsored by: The FreeBSD Foundation Tested by: pho, scottl
|
#
89f6b863 |
|
08-Mar-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Switch the vm_object mutex to be a rwlock. This will enable further optimizations in the future, where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages is accessed for reading purposes. The change is mostly mechanical but a few notes are reported: * The KPI changes as follows: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
#
341b240d |
|
21-Nov-2012 |
Jaakko Heinonen <jh@FreeBSD.org> |
Print correct unit number when attaching preloaded memory disks. Retire now unused mdunits variable.
|
#
734e78df |
|
21-Nov-2012 |
Jaakko Heinonen <jh@FreeBSD.org> |
Disallow attaching preloaded memory disks via ioctl. - The feature is dangerous because the kernel code didn't check validity of the memory address provided from user space. - It seems that mdconfig(8) never really supported attaching preloaded memory disks. - Preloaded memory disks are automatically attached during md(4) initialization. Thus there shouldn't be much use for the feature. PR: kern/169683 Discussed on: freebsd-hackers
|
#
e9f581ba |
|
07-Nov-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Zero the newly allocated md(4) swap-backed page to prevent random kernel memory leakage to userspace. For the typical use, when a filesystem is put on the md disk, the change only results in CPU and memory bandwidth spent to zero the page, since filesystems make sure that the user never sees unwritten content. But if the md disk is used as a raw device by userspace, the garbage is exposed. Reported by: Paul Schenkeveld <freebsd@psconsult.nl> MFC after: 2 weeks
|
#
22ff74b2 |
|
03-Nov-2012 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Add a MD_ROOT_FSTYPE kernel option. The option specifies the file system part for the MD_ROOT mount string. Hardcoding the file system type as "ufs" is too restrictive.
|
#
5050aa86 |
|
22-Oct-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho
|
#
1c771f92 |
|
05-Aug-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. The other big dependency of vm_page.h on vm_param.h is the PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks
|
#
2ddfc13d |
|
04-Aug-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove verbose unused commented out debugging printf. MFC after: 1 week Reviewed by: alc
|
#
8cb51643 |
|
02-Aug-2012 |
Jaakko Heinonen <jh@FreeBSD.org> |
Disallow sectorsize larger than MAXPHYS and mediasize smaller than sectorsize. PR: 169947 Submitted by: Filip Palian (original version) Reviewed by: kib
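The two rejected cases can be expressed as a small validation routine. This is a sketch under stated assumptions rather than the driver's code: the name `check_md_geometry()` is hypothetical, the limit is passed in as a parameter instead of being read from the kernel, and rejecting a zero sector size is an extra assumption of the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical check: return 0 if the requested geometry is acceptable,
 * EINVAL otherwise. maxphys is the largest I/O size the caller allows.
 */
static int
check_md_geometry(uint64_t mediasize, uint32_t sectorsize, uint32_t maxphys)
{
	assert(maxphys > 0);
	/* Assumption of this sketch: a zero sector size is also invalid. */
	if (sectorsize == 0 || sectorsize > maxphys)
		return (EINVAL);
	/* The device must hold at least one whole sector. */
	if (mediasize < sectorsize)
		return (EINVAL);
	return (0);
}
```

For example, a 1 MiB device with 512-byte sectors passes with a 128 KiB limit, while a 100-byte device with 512-byte sectors, or any device with 256 KiB sectors, is rejected.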
|
#
dc604f0c |
|
07-Jul-2012 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Make it possible to resize md(4) devices. Reviewed by: kib Sponsored by: FreeBSD Foundation
|
#
3eb9ab52 |
|
12-Dec-2011 |
Eitan Adler <eadler@FreeBSD.org> |
Document a large number of currently undocumented sysctls. While here fix some style(9) issues and reduce redundancy. PR: kern/155491 PR: kern/155490 PR: kern/155489 Submitted by: Galimov Albert <wtfcrap@mail.ru> Approved by: bde Reviewed by: jhb MFC after: 1 week
|
#
1f192809 |
|
31-Oct-2011 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Add information about MD_READONLY and MD_COMPRESS flags to the configuration dump. MFC after: 1 week
|
#
657bd8b1 |
|
10-Jul-2011 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Include sys/sbuf.h directly.
|
#
cfb00e5a |
|
13-May-2011 |
Matthew D Fleming <mdf@FreeBSD.org> |
Move the ZERO_REGION_SIZE to a machine-dependent file, as on many architectures (i386, for example) the virtual memory space may be constrained enough that 2MB is a large chunk. Use 64K for arches other than amd64 and ia64, with special handling for sparc64 due to differing hardware. Also commit the comment changes to kmem_init_zero_region() that I missed due to not saving the file. (Darn the unfamiliar development environment). Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you see fit. Requested by: alc MFC after: 1 week MFC with: r221853
|
#
89cb2a19 |
|
13-May-2011 |
Matthew D Fleming <mdf@FreeBSD.org> |
Use a globally visible region of zeros for both /dev/zero and the md device. There are likely other kernel uses of "blob of zeros" that can be converted. Reviewed by: alc MFC after: 1 week
|
#
0abd21bd |
|
29-Apr-2011 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Implement BIO_DELETE for vnode devices by simply overwriting the deleted sectors with all-zeroes. The zeroes come from a static buffer; null(4) uses a dynamic buffer for the same purpose (for /dev/zero). It might be a good idea to have a static, shared, read-only all-zeroes page somewhere in the kernel that md(4), null(4) and any other code that needs zeroes could use. Reviewed by: kib MFC after: 3 weeks
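Overwriting a deleted range with zeroes from a static buffer can be illustrated in userspace. The sketch below is an analogy only: `zero_range()` and `zero_file_range()` are hypothetical names, pwrite(2) stands in for the vnode write the driver issues, and the 4096-byte buffer size is an arbitrary choice for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Static storage is zero-initialized, so this buffer is all zeroes. */
static const char zero_buf[4096];

/* Overwrite len bytes at offset off with zeroes, one buffer at a time. */
static int
zero_range(int fd, off_t off, size_t len)
{
	assert(fd >= 0 && off >= 0);
	while (len > 0) {
		size_t n = len < sizeof(zero_buf) ? len : sizeof(zero_buf);
		ssize_t w = pwrite(fd, zero_buf, n, off);

		if (w <= 0)
			return (-1);	/* error, or no progress made */
		off += w;
		len -= (size_t)w;
	}
	return (0);
}

/* Convenience wrapper: zero a range of an existing file by path. */
static int
zero_file_range(const char *path, off_t off, size_t len)
{
	int fd, error;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return (-1);
	error = zero_range(fd, off, len);
	close(fd);
	return (error);
}
```

The buffer is read-only and shared by every call, which is why a single kernel-wide region of zeroes, as the message suggests, would serve several consumers equally well.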
|
#
8d5ac6c3 |
|
09-Feb-2011 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Use the preload_fetch_addr() and preload_fetch_size() convenience functions and only create the MD device when we have a non-zero pointer and size. Sponsored by: Juniper Networks
|
#
4a13a769 |
|
27-Jan-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Add support for BIO_DELETE on swap-backed md(4). In the case of BIO_DELETE covering the whole page, free the page. Otherwise, clear the region and mark it clean. Not marking the page dirty could reinstantiate cleared data, but it is allowed by BIO_DELETE specification and saves unneeded write to swap. Reviewed by: alc Tested by: pho MFC after: 2 weeks
|
#
96410b95 |
|
25-Jan-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Bio shall not be accessed after g_io_deliver(9). Reported and tested by: pho Reviewed by: ae, phk MFC after: 1 week
|
#
007777f1 |
|
19-Jan-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Add missed (). Noted by: alc MFC after: 3 days
|
#
18a22f96 |
|
19-Jan-2011 |
Alan Cox <alc@FreeBSD.org> |
There is no point in calling vm_object_set_writeable_dirty() on an object that is definitively known to be swap backed since its only effects are on vnode-backed objects. Reviewed by: kib
|
#
d91e813c |
|
28-Dec-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Add reporting of GEOM::candelete BIO_GETATTR for md(4) and geom_disk(4). Non-zero value of attribute means that device supports BIO_DELETE. Suggested and reviewed by: pjd Tested by: pho MFC after: 1 week
|
#
c44d423e |
|
29-Dec-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Add sysctl vm.md_malloc_wait, non-zero value of which switches malloc-backed md(4) to using M_WAITOK malloc calls. M_NOWAIT allocations may fail when enough memory could be freed, but not immediately. E.g. SU UFS becomes quite unhappy when a metadata write returns an error, which would happen for a failed malloc() call. Reported and tested by: pho MFC after: 1 week
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
3d5c947d |
|
17-Oct-2010 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Allow the MDIOCATTACH ioctl operation to originate from within the kernel. To protect against malicious software, we demand that the file name is at a particular location (i.e. appended to the mdio structure) for it to be treated as in-kernel.
|
#
b42f40b8 |
|
26-Jul-2010 |
Jaakko Heinonen <jh@FreeBSD.org> |
- Remove some extra white space. - Wrap g_md_dumpconf() prototype to 80 columns.
|
#
f4e7c5a8 |
|
22-Jul-2010 |
Jaakko Heinonen <jh@FreeBSD.org> |
Convert md(4) to use alloc_unr(9) and alloc_unr_specific(9) for unit number allocation. The old approach had some problems such as it allowed an overflow to occur in the unit number calculation. PR: kern/122288
|
#
d12fc952 |
|
06-Jul-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Calculate nshift only once. Also noted by: avg MFC after: 1 week
|
#
ecd5dd95 |
|
15-Jun-2010 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unnecessary page queues locking.
|
#
fc0c3802 |
|
03-May-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Lock the page around vm_page_activate() and vm_page_deactivate() calls where it was missed. The wrapped fragments now protect wire_count with page lock. Reviewed by: alc
|
#
ae3c92a1 |
|
27-Mar-2010 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
MFC r204408: Fix panic on invalid 'mdconfig -at preload' usage. PR: kern/80136
|
#
5ed1eb2b |
|
27-Feb-2010 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Fix panic on invalid 'mdconfig -at preload' usage. PR: kern/80136
|
#
e2b36efd |
|
29-Jan-2010 |
Antoine Brodin <antoine@FreeBSD.org> |
MFC r201145 to stable/8: (S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument. Fix some wrong usages. Note: this does not affect generated binaries as this argument is not used. PR: 137213 Submitted by: Eygene Ryabinkin (initial version)
|
#
13e403fd |
|
28-Dec-2009 |
Antoine Brodin <antoine@FreeBSD.org> |
(S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument. Fix some wrong usages. Note: this does not affect generated binaries as this argument is not used. PR: 137213 Submitted by: Eygene Ryabinkin (initial version) MFC after: 1 month
|
#
3364c323 |
|
23-Jun-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid. The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup. The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped. The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation is performed by the kernel, e.g. md(4). Several syscalls, among them fork(2), may now return ENOMEM when global or per-uid limits are enforced. In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
|
#
dbb95048 |
|
18-May-2009 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Add cpu_flush_dcache() for use after non-DMA based I/O so that a possible future I-cache coherency operation can succeed. On ARM for example the L1 cache can be (is) virtually mapped, which means that any I/O that uses temporary mappings will not see the I-cache made coherent. On ia64 a similar behaviour has been observed. By flushing the D-cache, execution of binaries backed by md(4) and/or NFS work reliably. For Book-E (powerpc), execution over NFS exhibits SIGILL once in a while as well, though cpu_flush_dcache() hasn't been implemented yet. Doing an explicit D-cache flush as part of the non-DMA based I/O read operation eliminates the need to do it as part of the I-cache coherency operation itself and as such avoids pessimizing the DMA-based I/O read operations for which D-cache are already flushed/invalidated. It also allows future optimizations whereby the bcopy() followed by the D-cache flush can be integrated in a single operation, which could be implemented using on-chips DMA engines, by-passing the D-cache altogether.
|
#
33fc3625 |
|
11-Mar-2009 |
John Baldwin <jhb@FreeBSD.org> |
Add a new internal mount flag (MNTK_EXTENDED_SHARED) to indicate that a filesystem supports additional operations using shared vnode locks. Currently this is used to enable shared locks for open() and close() of read-only file descriptors. - When an ISOPEN namei() request is performed with LOCKSHARED, use a shared vnode lock for the leaf vnode only if the mount point has the extended shared flag set. - Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but not O_CREAT. - Use a shared vnode lock around VOP_CLOSE() if the file was opened with O_RDONLY and the mountpoint has the extended shared flag set. - Adjust md(4) to upgrade the vnode lock on the vnode it gets back from vn_open() since it now may only have a shared vnode lock. - Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since FIFO's require exclusive vnode locks for their open() and close() routines. (My recent MPSAFE patches for UDF and cd9660 already included this change.) - Enable extended shared operations on UFS, cd9660, and UDF. Submitted by: ups Reviewed by: pjd (ZFS bits) MFC after: 1 month
|
#
b72cca38 |
|
21-Feb-2009 |
Alan Cox <alc@FreeBSD.org> |
Remove unnecessary page queues locking around vm_page_wakeup(). (This change is applicable to RELENG_7 but not RELENG_6.) MFC after: 1 week
|
#
a9ebb311 |
|
10-Jan-2009 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Add the possibility to specify "-o force" with "mdconfig -du". Reviewed by: scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
|
#
41c8b468 |
|
16-Dec-2008 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Fix forced mdconfig -du. E.g. the following would previously result in panic:
    mdconfig -af blah.img -o force
    mount /dev/md0 /mnt
    mdconfig -du 0
Reviewed by: scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
0359a12e |
|
28-Aug-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally useless. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
#
06d425f9 |
|
28-May-2008 |
Ed Schouten <ed@FreeBSD.org> |
Remove the distinction between device minor and unit numbers. Even though we got rid of device major numbers some time ago, device drivers still need to provide unique device minor numbers to make_dev(). These numbers are only used inside the kernel. They are not related to device major and minor numbers which are visible in devfs. These are actually based on the inode number of the device. It would eventually be nice to remove minor numbers entirely, but we don't want to be too aggressive here. Because the 8-15 bits of the device number field (si_drv0) are still reserved for the major number, there is no 1:1 mapping of the device minor and unit numbers. Because this is now unused, remove the restrictions on these numbers. The MAXMAJOR definition was actually used for two purposes. It was used to convert both the userspace and kernelspace device numbers to their major/minor pair, which is why it is now named UMINORMASK. minor2unit() and unit2minor() have now become useless. Both minor() and dev2unit() now serve the same purpose. We should eventually remove some of them, at least turning them into macros. If devfs would become completely minor number unaware, we could consider using si_drv0 directly, just like si_drv1 and si_drv2. Approved by: philip (mentor)
|
#
3cf74e53 |
|
28-Feb-2008 |
Philip Paeps <philip@FreeBSD.org> |
Zero sc->vnode if mdsetcred() fails. This fixes the panic which happens when mdcreate_vnode() calls vn_close() and mddestroy() calls it again further down the error handling path. Reviewed by: kris, kib MFC after: 3 days
|
#
22db15c0 |
|
13-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with a 'thread' argument which is always curthread. Remove the useless extra argument and pass curthread explicitly to lower layer functions, when necessary. The KPI is broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
#
cb05b60a |
|
09-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This modification makes the code cleaner and in particular removes an annoying dependency, helping the next lockmgr() cleanup. The KPI, obviously, changes as a result. Manpage and FreeBSD_version will be updated through further commits. As a side note, it is worth saying that next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
#
a03be42d |
|
07-Nov-2007 |
Maxim Sobolev <sobomax@FreeBSD.org> |
Put back devstat support that was lost during GEOM transition. Initially, I've tried to move md(4) to use geom_disk class, like real disks do, but this requires major rework of some of the existing features such as configuration dumping for example. Therefore just putting devstat support directly into md(4) seems to be optimal solution. Now you can see md(4) stats in `systat -vm' again. MFC after: 2 weeks
|
#
3745c395 |
|
20-Oct-2007 |
Julian Elischer <julian@FreeBSD.org> |
Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. It turns out that most of these calls actually end up being moved back to the thread version when it's added, but we need to make this cosmetic change first. I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.
|
#
982d11f8 |
|
04-Jun-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
#
9e223287 |
|
31-May-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)
|
#
3b7b5496 |
|
14-Dec-2006 |
Konstantin Belousov <kib@FreeBSD.org> |
Resolve two deadlocks that could be caused by busy md device backed by vnode. Allow for md thread and the thread that owns lock on vnode backing the md device to do the write even when runningbufspace is exhausted. Tested by: Peter Holm Reviewed by: tegge MFC after: 2 weeks
|
#
a7773239 |
|
01-Nov-2006 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Style nits.
|
#
5541f25e |
|
01-Nov-2006 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Fix md(4) panic which occurs when I/O request different than BIO_READ/BIO_WRITE is sent to vnode-backed provider (BIO_DELETE or BIO_FLUSH). Reported by: ceri Add support for BIO_FLUSH to vnode-backed md(4) devices based on VOP_FSYNC().
|
#
a08d2e7f |
|
28-Mar-2006 |
John Baldwin <jhb@FreeBSD.org> |
- Conditionally acquire Giant in mdstart_vnode(), mdcreate_vnode(), and mddestroy() only if the file is from a non-MPSAFE VFS. - No longer unconditionally hold Giant in the md kthread for vnode-backed kthreads. - Improve the handling of the thread exit race when destroying an md device.
|
#
c27a8954 |
|
26-Mar-2006 |
Wojciech A. Koszek <wkoszek@FreeBSD.org> |
Teach md(4) and mdconfig(8) how to understand XML. Right now there won't be a problem with listing large number of md(4) devices. Either 'list' or 'query' mode uses XML. Additionally, new functionality was introduced. It's possible to pass multiple devices to -u: # ./mdconfig -l -u md0,md1 Approved by: cognet (mentor)
|
#
de64f22a |
|
31-Jan-2006 |
Luigi Rizzo <luigi@FreeBSD.org> |
make sure that the start and end preloaded MFS markers are in contiguous strings, and that the compiler does not optimize them away because it thinks they are unused.
|
#
b322d85d |
|
27-Jan-2006 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Call NDFREE() only when vn_open() succeeded. MFC after: 3 days
|
#
6c3cd0e2 |
|
28-Dec-2005 |
Maxim Konovalov <maxim@FreeBSD.org> |
o Fix typos in the comments. Submitted by: Wojciech A. Koszek
|
#
5bb84bc8 |
|
31-Oct-2005 |
Robert Watson <rwatson@FreeBSD.org> |
Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
|
#
947fc8de |
|
06-Oct-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make sure that the worker thread knows the type early enough to grab Giant for vnode backing. Found by: pho & tegge
|
#
9b00ca19 |
|
19-Sep-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix configuration locking in MD. Remove md_mtx. Remove GIANT from the mdctl device driver and avoid DROP_GIANT, PICKUP_GIANT and geom events since we can call into GEOM directly now. Pick up Giant around vn_close(). Apply an exclusive sx around mdctls ioctl and preloading to protect lists etc.. Don't initialize our lock (md_mtx or md_sx) from a SYSINIT when there is a perfectly good pair of _fini/_init functions to do it from. Prune any final fractional sector from the mediasize to keep GEOM happy. Cleanups: Unify MDIOVERSION check in (x)mdctlioctl() Add pointer to start() routine to softc to eliminate a switch{} Inline guts of mddetach(). Always pass error pointer to mdnew(), simplify implementation.
|
#
9fbea3e3 |
|
10-Sep-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Do not destroy the queue mutex until the thread is done with it.
|
#
7ee3c044 |
|
31-Aug-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Add md_mtx lock to protect ID number and list of devices. - Always check mdnew() return value, as even in !autounit case kthread_create() can fail. Those two changes fix several panics provoked by simple stress test. Tested by: Kris The BugMagnet MFC after: 3 days
|
#
86776891 |
|
16-Aug-2005 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Ensure that file flags such as schg, sappnd (and others) are honored by md(4). Before this change, it was possible to by-pass these flags by creating memory disks which used a file as a backing store and writing to the device. This was discussed by the security team, and although this is problematic, it was decided that it was not critical as we never guarantee that root will be restricted. This change implements the following behavior changes: -If the user specifies the readonly flag, unset write operations before opening the file. If the FWRITE mask is unset, the device will be created with the MD_READONLY mask set. (readonly) -Add a check in g_md_access which checks to see if the MD_READONLY mask is set, if so return EROFS -Do not gracefully downgrade access modes without telling the user. Instead make the user specify their intentions for the device (assuming the file is read only). This seems like the more correct way to handle things. This is a RELENG_6 candidate. PR: kern/84635 Reviewed by: phk
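
A minimal sketch of the access check described above, in userspace C. It assumes the check rejects only requests that add write access; SKETCH_MD_READONLY and sketch_md_access() are hypothetical stand-ins, and the real g_md_access() takes different arguments.

```c
#include <errno.h>

#define SKETCH_MD_READONLY 0x0008u	/* hypothetical flag value */

/*
 * Return EROFS when a read-only device is asked for additional write
 * access, 0 otherwise. 'write_delta' is the change in the number of
 * writers requested (an assumption about how the request is expressed).
 */
static int
sketch_md_access(unsigned flags, int write_delta)
{
	if ((flags & SKETCH_MD_READONLY) != 0 && write_delta > 0)
		return (EROFS);
	return (0);
}
```

Read-only opens of a read-only device still succeed under this sketch; only a request for write access is refused, which matches the stated intent of not downgrading access modes silently.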
|
#
e340fc60 |
|
13-Feb-2005 |
Alan Cox <alc@FreeBSD.org> |
Request a CPU private mapping from sf_buf_alloc(). If the swap-backed memory disk is larger than the number of available sf_bufs, this improves performance on SMPs by eliminating interprocessor TLB shootdowns. For example, with 6656 sf_bufs, the default on my test machine, and a 256MB swap-backed memory disk, I see the command "dd if=/dev/md0 of=/dev/null bs=64k" achieve ~489MB/sec with the default, shared mappings, and ~587MB/sec with CPU private mappings.
|
#
d9aaa28f |
|
29-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use MAXMINOR
|
#
1db17c6d |
|
22-Jan-2005 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Don't destroy UMA zone on error in mdcreate_malloc(), because we need it in mddestroy() to properly free already allocated memory. This fixes a panic when we want to create too big memory backed device with preallocate memory (-o reserve). - Remove redundant { }. MFC after: 1 week
|
#
9d3a77c4 |
|
22-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a couple of mtx_asserts() to try to narrow down the window on a bug repeatedly reported.
|
#
098ca2bd |
|
05-Jan-2005 |
Warner Losh <imp@FreeBSD.org> |
Start each of the license/copyright comments with /*-, minor shuffle of lines
|
#
c935314f |
|
04-Jan-2005 |
Alan Cox <alc@FreeBSD.org> |
Add needed synchronization to the error handling code that was introduced in revision 1.141. Lock assertion failures reported by: Kris Kennaway
|
#
63710c4d |
|
30-Dec-2004 |
John Baldwin <jhb@FreeBSD.org> |
Stop explicitly touching td_base_pri outside of the scheduler and simply set a thread's priority via sched_prio() when that is the desired action. The schedulers will start managing td_base_pri internally shortly.
|
#
88b5b78d |
|
27-Dec-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Rewrite piece of code which I committed some time ago that allows to show file name for 'mdconfig -l -u <x>' command. This allows to preserve API/ABI compatibility with version 0 (that's why I changed version number back to 0) and will allow to merge this change to RELENG_5. MFC after: 5 days
|
#
8b6fc67a |
|
12-Nov-2004 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Fix the MDIOCDETACH ioctl() for md(4). Now that the md_file field in the mdio structure is an array and not a pointer, we cannot test for it to be NULL. It never is. Instead, test for md_file[0] to be '\0'.
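
The pitfall can be shown with a small userspace C sketch. The structure below is a hypothetical stand-in for the mdio structure of that era, reduced to the one relevant field; the buffer size is an assumption.

```c
#include <stdbool.h>

#define SKETCH_PATH_MAX 1024	/* assumed buffer size */

struct sketch_mdio {
	unsigned md_unit;
	char md_file[SKETCH_PATH_MAX];	/* embedded array, not a pointer */
};

/*
 * An embedded array always has a non-NULL address, so comparing
 * mdio->md_file against NULL can never detect "no file given".
 * Testing the first character does.
 */
static bool
sketch_has_file(const struct sketch_mdio *mdio)
{
	return (mdio->md_file[0] != '\0');
}
```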
|
#
e3ed29a7 |
|
06-Nov-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Be consistent and use 'if (error != 0)' instead of 'if (error)' everywhere.
|
#
61a6eb62 |
|
06-Nov-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
For file backed md(4) devices output their source file via 'mdconfig -l -u <unit>'. Bump version number, as this change breaks ABI/API.
|
#
3b66ad07 |
|
23-Oct-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't explicitly call g_waitidle(), it happens automagically now.
|
#
812851b6 |
|
11-Oct-2004 |
Brian Feldman <green@FreeBSD.org> |
Account for failure in vm_pager_allocate() or vm_pager_get_pages() in md(4). The former is generally not going to fail, but the latter can fail when the underlying swap device returns an error. There are still plenty of other places where vm_pager_get_pages() failing will lead directly to crashes, so it's a good idea to put your swap on RAID if you care enough to put any of your disks on RAID....
|
#
e4cdd0d4 |
|
18-Sep-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Actually this order (unlock, wakeup) in this case is race-safe and can save us 2 context switches. Explained by: njl
|
#
b830359b |
|
16-Sep-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Make md(4) 64-bit clean. After this change it should be possible to use very big md(4) devices.
- Clean up and simplify the code a bit.
- Use humanize_number(3) to print size of md(4) devices.
- Add 't' suffix which stands for terabyte.
- Make '-S' to really work with all types of devices.
- Other minor changes.
|
#
fcd57fbe |
|
16-Sep-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
There is no need to keep 'npage' value inside our softc structure, it is only used in one function. While doing so, change its type to vm_ooffset_t. We are still limited for swap-backed devices to 16TB on 32-bit architectures where PAGE_SIZE is 4096 bytes.
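
The 16TB figure is consistent with a 32-bit page count multiplied by a 4096-byte page. The commit message does not say which field imposes the limit, so the sketch below only reproduces the arithmetic; max_swap_md_bytes() is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Largest device size, in bytes, addressable with a page count of
 * 'index_bits' bits and pages of 'page_size' bytes. The caller must
 * keep the product below 2^64 (and index_bits below 64).
 */
static uint64_t
max_swap_md_bytes(unsigned index_bits, uint64_t page_size)
{
	return (((uint64_t)1 << index_bits) * page_size);
}
```

With index_bits = 32 and page_size = 4096 the result is 2^32 * 2^12 = 2^44 bytes, that is 16 TiB.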
|
#
a8a58d03 |
|
16-Sep-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Do not use bio_pblkno as it is going away anyway. - Prefer bio_length to bio_bcount.
|
#
4b07ede4 |
|
16-Sep-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
First wakeup, then unlock.
|
#
6ab0a0ae |
|
16-Sep-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Type 'int' is too small for 'i' and 'lastp' variables. Use proper type, which is vm_pindex_t (unsigned 64bit on i386).
|
#
2eafd8b1 |
|
14-Sep-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Deallocate VM object on failure.
|
#
7a097011 |
|
14-Sep-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
One more missing NDFREE(9).
|
#
52c6716f |
|
14-Sep-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
- Don't forget about NDFREE() in case of vn_open() failure. - Don't forget about vn_close() in case of failure.
|
#
f9963bbc |
|
14-Sep-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Fix UMA zone leak.
|
#
affa4706 |
|
07-Sep-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use bioq_takefirst()
|
#
972be79a |
|
30-Aug-2004 |
Colin Percival <cperciva@FreeBSD.org> |
Don't g_waitidle() when initializing a preloaded md. This fixes a deadlock which otherwise occurs during the boot process. Reported by: kensmith MFC after: 3 days (assuming that re@ approves)
|
#
2b004a22 |
|
22-Aug-2004 |
Colin Percival <cperciva@FreeBSD.org> |
When creating a new md, wait for geom's event queue to become empty before returning. Device nodes are created via the "taste" mechanism, so this is necessary in order to make sure that devfs entries are created before mdconfig(8) returns. This may be a MFC candidate for 5.3. Suggested by: phk
|
#
5721c9c7 |
|
08-Aug-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Tag all geom classes in the tree with a version number.
|
#
19945697 |
|
08-Aug-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use a ->fini() from the geom class to destroy the control device. Use default initialization of geom methods.
|
#
3e019dea |
|
15-Jul-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Do a pass over all modules in the kernel and make them return EOPNOTSUPP for unknown events. A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".
|
#
89c9c53d |
|
16-Jun-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.
|
#
6a408929 |
|
18-May-2004 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Fix panic which occurs when given sector size for memory-backed device is less than DEV_BSIZE (512) bytes. Reported by: Mike Bristow <mike@urgle.com> Approved by: phk
|
#
ed010cdf |
|
08-Apr-2004 |
Warner Losh <imp@FreeBSD.org> |
Ooops, removed this acknowledgement bogusly. Eagle Eyes: bde
|
#
f36cfd49 |
|
07-Apr-2004 |
Warner Losh <imp@FreeBSD.org> |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson
|
#
121230a4 |
|
03-Apr-2004 |
Alan Cox <alc@FreeBSD.org> |
In some cases, sf_buf_alloc() should sleep with pri PCATCH; in others, it should not. Add a new parameter so that the caller can specify which is the case. Reported by: dillon
|
#
5d4ca75e |
|
31-Mar-2004 |
Luigi Rizzo <luigi@FreeBSD.org> |
Fix a bug with preloaded image -- for some reason [that i don't completely understand], md_takeroot() runs before md_preloaded(), rendering both useless. As a fix, move the body (effectively one line!) of md_takeroot() into md_preloaded(), and get rid of the stuff that has become useless. Bug and fix reported 10 days ago on -current, no reply.
|
#
07be617f |
|
19-Mar-2004 |
Alan Cox <alc@FreeBSD.org> |
- Remove some unused #includes. - Apply some style fixes to mdstart_swap().
|
#
7cd53fdd |
|
18-Mar-2004 |
Alan Cox <alc@FreeBSD.org> |
Utilize sf_buf_alloc() and sf_buf_free() to implement the ephemeral mappings required by mdstart_swap(). On i386, if the ephemeral mapping is already in the sf_buf mapping cache, a swap-backed md performs similarly to a malloc-backed md. Even if the ephemeral mapping is not cached, this implementation is still faster. On 64-bit platforms, this change has the effect of using the direct virtual-to-physical mapping, avoiding ephemeral mapping overheads, such as TLB shootdowns on SMPs.
On a 2.4GHz, 400MHz FSB P4 Xeon configured with 64K sf_bufs and "mdmfs -S -o async -s 128m md /mnt":
before:
    dd if=/dev/md0 of=/dev/null bs=64k
    134217728 bytes transferred in 0.430923 secs (311465697 bytes/sec)
after with cold sf_buf cache:
    dd if=/dev/md0 of=/dev/null bs=64k
    134217728 bytes transferred in 0.367948 secs (364773576 bytes/sec)
after with warm sf_buf cache:
    dd if=/dev/md0 of=/dev/null bs=64k
    134217728 bytes transferred in 0.252826 secs (530870010 bytes/sec)
malloc-backed md:
    dd if=/dev/md0 of=/dev/null bs=64k
    134217728 bytes transferred in 0.253126 secs (530240978 bytes/sec)
|
#
33651381 |
|
13-Mar-2004 |
Alan Cox <alc@FreeBSD.org> |
Allow swap-backed devices to run without Giant.
|
#
7a6b2b64 |
|
10-Mar-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix a long-standing deadlock issue with vnode backed md(4) devices: On vnode backed md(4) devices over a certain, currently undetermined size relative to the buffer cache our "lemming-syncer" can provoke a buffer starvation which puts the md thread to sleep on wdrain. This generally tends to grind the entire system to a stop because the event that is supposed to wake up the thread will not happen until a fair bit of the piled up I/O requests in the system finish, and since a lot of those are on a md(4) vnode backed device which is currently waiting on wdrain until a fair amount of the piled up ... you get the picture. The cure is to issue all VOP_WRITES on the vnode backing the device with IO_SYNC. In addition to more closely emulating a real disk device with a non-lying write-cache, this makes the writes exempt from rate-limiting (there to avoid starving the buffer cache) and consequently prevents the deadlock. Unfortunately performance takes a hit. Add "async" option to give people who know what they are doing the old behaviour.
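
The choice between synchronous and asynchronous writes described above comes down to one flag test. The following userspace C sketch uses hypothetical constants (SKETCH_MD_ASYNC, SKETCH_IO_SYNC) in place of the kernel's definitions; the values are placeholders.

```c
#define SKETCH_MD_ASYNC 0x0020u	/* hypothetical: device created with "async" */
#define SKETCH_IO_SYNC  0x0080	/* hypothetical: synchronous write flag */

/*
 * Pick the I/O flag for a write to the backing vnode: synchronous by
 * default, asynchronous only when the device was configured with the
 * "async" option.
 */
static int
sketch_write_ioflag(unsigned md_flags)
{
	return ((md_flags & SKETCH_MD_ASYNC) != 0 ? 0 : SKETCH_IO_SYNC);
}
```

Making the safe behaviour the default means the faster, deadlock-prone path has to be requested explicitly.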
|
#
60744399 |
|
05-Mar-2004 |
John Baldwin <jhb@FreeBSD.org> |
kthread_exit() no longer requires Giant, so don't force callers to acquire Giant just to call kthread_exit(). Requested by: many
|
#
9ed40643 |
|
02-Mar-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make swapbacked md(4) devices respect the -x and -y emulation arguments.
|
#
e07113d6 |
|
29-Feb-2004 |
Colin Percival <cperciva@FreeBSD.org> |
Use DEV_BSIZE byte sectors instead of PAGE_SIZE byte sectors for swap-backed memory disks. This reduces filesystem allocation overhead and makes swap-backed memory disks compatible with broken code (dd, for example) which expects to see 512 byte sectors. The size of a swap-backed memory disk must still be a multiple of the page size. When performing page-aligned operations, this change has zero performance impact. Reviewed by: phk Approved by: rwatson (mentor)
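
With 512-byte sectors on page-sized backing store, a request has to be translated into a page index plus an offset within the page. The sketch below shows that mapping in userspace C; the constants, struct page_span and sectors_to_pages() are hypothetical and assume a 4096-byte page.

```c
#include <stdint.h>

#define SKETCH_DEV_BSIZE 512u	/* sector size exposed to consumers */
#define SKETCH_PAGE_SIZE 4096u	/* assumed size of a backing page */

struct page_span {
	uint64_t first_page;	/* index of the first page touched */
	uint64_t last_page;	/* index of the last page touched */
	uint32_t first_off;	/* byte offset inside the first page */
};

/* Map a run of sectors onto backing pages. 'nsectors' must be > 0. */
static struct page_span
sectors_to_pages(uint64_t sector, uint32_t nsectors)
{
	uint64_t off = sector * SKETCH_DEV_BSIZE;
	uint64_t len = (uint64_t)nsectors * SKETCH_DEV_BSIZE;
	struct page_span s;

	s.first_page = off / SKETCH_PAGE_SIZE;
	s.first_off = (uint32_t)(off % SKETCH_PAGE_SIZE);
	s.last_page = (off + len - 1) / SKETCH_PAGE_SIZE;
	return (s);
}
```

A page-aligned request yields first_off == 0 and touches whole pages only, which is the case the commit message describes as having no performance impact.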
|
#
dc08ffec |
|
21-Feb-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Device megapatch 4/6: Introduce d_version field in struct cdevsw, this must always be initialized to D_VERSION. Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
|
#
d5a929dc |
|
12-Jan-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Allow specification of a geometry for vnode backed devices as well as for malloc backed devices.
|
#
0a937206 |
|
13-Dec-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix a locking problem with MD_ROOT_SIZE. Retire md(4)'s static major number.
|
#
0eb14309 |
|
18-Nov-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use the class->init() to hitch up preload devices, rather than rely on the "old" SYSINIT. This makes sure things happen in the right order. XXX: md(4) needs to be fully geom-ified and in particular /dev/md.ctl should be abandoned for the GEOM OaM api. Approved by: re@
|
#
6e9a011a |
|
18-Oct-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't initialize unused bio_blkno field.
|
#
70cd7713 |
|
26-Sep-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
The present defaults for the open and close methods of device drivers which provide none do not make any sense, and are not used by any driver. It is pretty hard to come up with even a theoretical concept of a device driver which would always fail open and close with ENODEV. Change the defaults to be nullopen() and nullclose() which simply do nothing. Remove explicit initializations to these from the drivers which already used them.
|
#
8b149b51 |
|
07-Aug-2003 |
John Baldwin <jhb@FreeBSD.org> |
Consistently use the BSD u_int and u_short instead of the SYSV uint and ushort. In most of these files, there was a mixture of both styles and this change just makes them self-consistent. Requested by: bde (kern_ktrace.c)
|
#
8e28326a |
|
05-Aug-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Change the implementation of swap backing to use the VM system in normal ways, and drop the need for vm_pager_strategy().
|
#
7c89f162 |
|
27-Jul-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout.
|
#
8198a1a4 |
|
22-Jun-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove 256 unit limit, there is no evil minor number encoding to deal with any more. Spotted by: "Darren Freestone" <df@cops.org>
|
#
f075585f |
|
31-May-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove the G_CLASS_INITIALIZER, we do not need it anymore.
|
#
17a13919 |
|
31-May-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent deadlocks with vnode backed md(4) devices because md now uses a kthread to run the bio requests instead of doing it directly from the bio down path.
|
#
f820bc50 |
|
16-May-2003 |
Alan Cox <alc@FreeBSD.org> |
Use vm_object_deallocate(), not vm_pager_deallocate(), to destroy a vm object. (vm_pager_deallocate() does not, in fact, destroy a vm object.) Approved by: re (scottl) Reviewed by: phk
|
#
6b60a2cd |
|
02-May-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Call g_wither_geom(), instead of just setting the flag.
|
#
4e8bfe14 |
|
09-Apr-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a couple of undocumented test options to MD(4) to aid in regression testing of GEOM.
|
#
4eba52a2 |
|
03-Apr-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove all references to BIO_SETATTR. We will not be using it.
|
#
891619a6 |
|
01-Apr-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use bioq_flush() to drain a bio queue with a specific error code. Retain the mistake of not updating the devstat API for now. Spell bioq_disksort() consistently with the remaining bioq_*(). #include <geom/geom_disk.h> where this is more appropriate.
|
#
51a5b7f1 |
|
01-Apr-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't include <sys/disk.h>.
|
#
67ffbc63 |
|
29-Mar-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
remove a blank line.
|
#
83e13864 |
|
27-Mar-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Allocate the toplevel indir with M_WAITOK to avoid complicating things needlessly. Detected by: rwatsons EvilMalloc(9)
|
#
5d445dcb |
|
24-Mar-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Change g_class initialization to sparse format.
|
#
b4b138c2 |
|
18-Mar-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a nested include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.
|
#
60794e04 |
|
08-Mar-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Centralize the devstat handling for all GEOM disk device drivers in geom_disk.c. As a side effect this makes a lot of #include <sys/devicestat.h> lines not needed and some biofinish() calls can be reduced to biodone() again.
|
#
ebe789d6 |
|
03-Mar-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a "-S sectorsize" option to enable Kirk to find a bug :-)
|
#
7ac40f5f |
|
02-Mar-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Gigacommit to improve device-driver source compatibility between branches: Initialize struct cdevsw using C99 sparse initialization and remove all initializations to default values. This patch is automatically generated and has been tested by compiling LINT with all the fields in struct cdevsw in reverse order on alpha, sparc64 and i386. Approved by: re(scottl)
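The following is a minimal user-space sketch of what C99 sparse (designated) initialization buys, not the kernel's actual struct cdevsw. The struct `devsw_sketch`, its fields, and `sk_open()` are hypothetical names chosen for illustration. Because each initializer names its field, the initialization stays valid if the field order in the struct changes, and every field left out takes its default zero value.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device switch table; not struct cdevsw. */
struct devsw_sketch {
    const char *name;
    int (*open)(void);
    int (*close)(void);
    int flags;
};

/* Hypothetical open routine that always succeeds. */
static int
sk_open(void)
{
    return (0);
}

/*
 * Designated initializers: fields are named, so their order here is free
 * and unnamed fields (close, flags) are implicitly zero-initialized.
 */
static const struct devsw_sketch sk_dev = {
    .open = sk_open,
    .name = "md",
};
```

Reordering the members of `devsw_sketch` would not require touching `sk_dev`, which is the property the commit's reverse-order compile test relies on.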
|
#
a163d034 |
|
18-Feb-2003 |
Warner Losh <imp@FreeBSD.org> |
Back out M_* changes, per decision of the TRB. Approved by: trb
|
#
b3b3d1b7 |
|
10-Feb-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Mark our provider with G_PF_CANDELETE in the cases where this is actually the case.
|
#
5777c5b9 |
|
30-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
NO_GEOM cleanup: unifdef
|
#
16bcbe8c |
|
27-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Implement MDIOCLIST which returns the unit numbers of configured md(4) devices. We use the md_pad[] array and if there are more units than its size the last returned unit number will be -1, but the number of units returned is correct.
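The truncation behaviour described above can be sketched as follows. This is an illustrative user-space model only, assuming a fixed-size output array; `PAD_LEN` and `list_units()` are hypothetical names and the code is not taken from the driver.

```c
#define PAD_LEN 4   /* hypothetical; stands in for the size of md_pad[] */

/*
 * Copy up to PAD_LEN unit numbers into pad[].  If there are more units
 * than slots, the last slot is overwritten with -1 to flag truncation.
 * The return value is always the true number of configured units.
 */
static int
list_units(const int *units, int nunits, int pad[PAD_LEN])
{
    int i;

    for (i = 0; i < nunits && i < PAD_LEN; i++)
        pad[i] = units[i];
    if (nunits > PAD_LEN)
        pad[PAD_LEN - 1] = -1;
    return (nunits);
}
```

A caller can compare the returned count with the array size to detect that the list is incomplete, and the -1 marker makes the same condition visible in the data itself.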
|
#
44956c98 |
|
21-Jan-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
26d48b40 |
|
13-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
OK, OK, so I didn't check the NO_GEOM case for the final version... Stumbled on by: bde
|
#
a4f86158 |
|
13-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Enable the new h0h0magic code which on GEOM kernels make the md(4) driver a _real_ GEOM driver.
|
#
0f8500a5 |
|
13-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a mutex around the per unit bioqueue. Only grab giant in the per unit kthread for SWAP and VNODE backed devices. Initialize the bioq before the kthread gets a chance to study it. Don't lock Giant in mddone_swap, we shouldn't need it.
|
#
64bfc43b |
|
13-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove the printf which announces the creation of malloc disks: it is inconsistent when we do not do it for swap or vnode. We still printf for preloaded disks because of the weak debugging options people have in embedded/tiny environments where this is usually used.
|
#
6f4f00f1 |
|
12-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add code to make md(4) a GEOM device driver instead of relying on the disk mini-layer. This is currently not enabled.
|
#
a522a159 |
|
12-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Shift things around a bit in preparation for future evilness.
|
#
e176446d |
|
30-Nov-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Move the check for the MD_SHUTDOWN flag to before the tsleep() call in the per-device kthread. This ensures that synchronisation with mddestroy() succeeds even if the kthread was not waiting in tsleep() at the time of the wakeup(). Among other things, this fixes the problem of mdconfig getting stuck when an attempt is made to use a zero-length file as a vnode-type backing store. Approved by: re
|
#
8689acc4 |
|
21-Oct-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
We want /dev/md0 for ramdisk roots, not /dev/md0c. Sponsored by: DARPA & NAI Labs
|
#
975b628f |
|
20-Oct-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use ENOSPC error return, not ENOMEM. Use %jd rather than %lld.
|
#
81bb0b95 |
|
13-Oct-2002 |
Jake Burkholder <jake@FreeBSD.org> |
MODINFO_SIZE metadata has type size_t, not unsigned. This makes preloaded md root work on sparc64.
|
#
316ec49a |
|
02-Oct-2002 |
Scott Long <scottl@FreeBSD.org> |
Some kernel threads try to do significant work, and the default KSTACK_PAGES doesn't give them enough stack to do much before blowing away the pcb. This adds MI and MD code to allow the allocation of an alternate kstack whose size can be specified when calling kthread_create. Passing the value 0 prevents the alternate kstack from being created. Note that the ia64 MD code is missing for now, and PowerPC was only partially written due to the pmap.c being incomplete there. Though this patch does not modify anything to make use of the alternate kstack, acpi and usb are good candidates. Reviewed by: jake, peter, jhb
|
#
1cff889a |
|
28-Sep-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Put the casts on the right hand side of =.
|
#
a0c67264 |
|
22-Sep-2002 |
Peter Grehan <grehan@FreeBSD.org> |
Initialize fwsectors/fwheads to allow the DIOCGFWSECTORS and DIOCGFWHEADS ioctls to return meaningful values to disklabel/newfs Approved by: phk
|
#
7812d86f |
|
20-Sep-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
(This commit touches about 15 disk device drivers in a very consistent and predictable way, and I apologize if I have gotten it wrong anywhere, getting prior review on a patch like this is not feasible, considering the number of people involved and hardware availability etc.) If struct disklabel is the messenger: kill the messenger. Inside struct disk we had a struct disklabel which disk drivers used to communicate certain metrics to the disklayer above (GEOM or the disk mini-layer). This commit changes this communication to use four explicit fields instead. Amongst the benefits is that the fields do not get overwritten by wrong or bogus on-disk disklabels. Once that is clear, <sys/disk.h> which is included in the drivers no longer needs to pull <sys/disklabel.h> and <sys/diskslice.h> in; the few places that need them have gotten explicit #includes for them. The disklabel inside struct disk is now only for internal use in the disk mini-layer, so instead of embedding it, we malloc it as we need it. This concludes (modulo any mistakes) the series of disklabel related commits. I believe it all amounts to a NOP for all the rest of you :-) Sponsored by: DARPA & NAI Labs.
|
#
4a6a94d8 |
|
22-Aug-2002 |
Archie Cobbs <archie@FreeBSD.org> |
Replace (ab)uses of "NULL" where "0" is really meant.
|
#
6569d6f3 |
|
23-Jun-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Yet another warning fix for 64 bits platforms. Reviewed by: phk
|
#
58f3c42e |
|
15-Jun-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
mdcreate_vnode() isn't correctly clearing things out of the linked list if the file is of 0 size or mdsetcred() fails. Submitted by: Martin Faxer <gmh003532@brfmasthugget.se>
|
#
d8a186eb |
|
10-Jun-2002 |
Maxim Sobolev <sobomax@FreeBSD.org> |
- Whitespace only: use return statement consistently (return (foo), not return(foo)), kill extra blank lines between function names; - fix format string in printf(): devtoname() returns string, not pointer.
|
#
5c97ca54 |
|
03-Jun-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Use a per-device worker thread to avoid blocking in mdstrategy() until the I/O completes. This fixes some easily reproducible deadlocks that occur when using md(4) with GEOM. Reviewed by: phk
|
#
fcf867e9 |
|
26-May-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Mis-edit in last commit.
|
#
fde2a2e4 |
|
26-May-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Be a bit smarter about rewriting data so we don't lose too much performance. Sponsored by: DARPA & NAI Labs.
|
#
f43b2bac |
|
26-May-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use an umazone per unit for allocating the sectors for malloc backing. Clean up things properly when we unconfigure malloc backed units. Sponsored by: DARPA & NAI Labs.
|
#
c6517568 |
|
25-May-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Give the "malloc" backing of md(4) an adaptive multilevel index tree to remove the need for a contiguous array with pointers to all the sectors. Try to make failure to malloc(9) memory a non-hang situation. Eventually this will allow us to test the 64bit cleanness of the disk I/O patch, but more work is outstanding here and elsewhere. Sponsored by: DARPA & NAI Labs.
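As a rough illustration of the idea, here is a simplified user-space sketch of a multilevel index over sector numbers. It is not the driver's implementation: `struct idx_node`, `struct idx_tree`, the fan-out `NSHIFT`, and all function names are assumptions made for this example. The number of levels is chosen from the device size, and interior nodes are allocated only when a sector beneath them is first written, so no contiguous array covering every sector is needed.

```c
#include <stdint.h>
#include <stdlib.h>

#define NSHIFT 4                    /* hypothetical fan-out: 16 slots per node */
#define NSLOTS (1u << NSHIFT)
#define NMASK  (NSLOTS - 1)

struct idx_node {
    void *slot[NSLOTS];             /* child nodes, or sector data at the leaf level */
};

struct idx_tree {
    struct idx_node *root;
    unsigned levels;                /* number of index levels, at least 1 */
};

/* Pick just enough levels to address nsectors sectors. */
static unsigned
levels_for(uint64_t nsectors)
{
    unsigned levels = 1;
    uint64_t span = NSLOTS;

    while (span < nsectors && (levels + 1) * NSHIFT < 64) {
        span <<= NSHIFT;
        levels++;
    }
    return (levels);
}

/* Allocate an empty root; returns 0 on success, -1 on allocation failure. */
static int
tree_init(struct idx_tree *t, uint64_t nsectors)
{
    t->levels = levels_for(nsectors);
    t->root = calloc(1, sizeof(*t->root));
    return (t->root != NULL ? 0 : -1);
}

/* Return the pointer stored for a sector, or NULL if it was never written. */
static void *
tree_get(const struct idx_tree *t, uint64_t secno)
{
    const struct idx_node *np = t->root;
    unsigned lvl;

    for (lvl = t->levels; lvl > 1 && np != NULL; lvl--)
        np = np->slot[(secno >> ((lvl - 1) * NSHIFT)) & NMASK];
    return (np != NULL ? np->slot[secno & NMASK] : NULL);
}

/* Store a pointer for a sector, allocating interior nodes on demand. */
static int
tree_set(struct idx_tree *t, uint64_t secno, void *data)
{
    struct idx_node *np = t->root;
    unsigned lvl, idx;

    for (lvl = t->levels; lvl > 1; lvl--) {
        idx = (secno >> ((lvl - 1) * NSHIFT)) & NMASK;
        if (np->slot[idx] == NULL) {
            np->slot[idx] = calloc(1, sizeof(struct idx_node));
            if (np->slot[idx] == NULL)
                return (-1);        /* report failure instead of hanging */
        }
        np = np->slot[idx];
    }
    np->slot[secno & NMASK] = data;
    return (0);
}
```

The sketch omits freeing and any bookkeeping of used slots; it only shows how lookups walk one slot per level and why an unwritten region costs no memory.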
|
#
9589c256 |
|
03-May-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix a memory-leak when configuring a vnode backed md(4) device fails. Submitted by: Martin Faxér <gmh003532@brfmasthugget.se> MFC after: 4 weeks
|
#
421e6a65 |
|
20-Mar-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove unused include.
|
#
1ab0b5f9 |
|
18-Mar-2002 |
Bruce Evans <bde@FreeBSD.org> |
The previous commit missed fixing 2 old printf format errors and introduced a format printf error.
|
#
54ed0c32 |
|
18-Mar-2002 |
Andrew Gallatin <gallatin@FreeBSD.org> |
Fix printf warning caused by recent changes in bio_pblkno's type.
|
#
0d2af521 |
|
15-Mar-2002 |
Kirk McKusick <mckusick@FreeBSD.org> |
Introduce the new 64-bit size disk block, daddr64_t. Change the bio and buffer structures to have daddr64_t bio_pblkno, b_blkno, and b_lblkno fields which allows access to disks larger than a Terabyte in size. This change also requires that the VOP_BMAP vnode operation accept and return daddr64_t blocks. This delta should not affect system operation in any way. It merely sets up the necessary interfaces to allow the development of disk drivers that work with these larger disk block addresses. It also allows for the development of UFS2 which will use 64-bit block addresses.
|
#
a854ed98 |
|
27-Feb-2002 |
John Baldwin <jhb@FreeBSD.org> |
Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.
|
#
e087ce2d |
|
10-Feb-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Staticize the malloc definitions. Obtained from: ~bde/sys.dif.gz
|
#
3ca627fe |
|
21-Jan-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Gah! last commit botched indentation, fix indentation and some other white-space nits while at it.
|
#
b4a4f93c |
|
21-Jan-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Restructure slightly, eliminating some repetitive source lines and making GEOM patches simpler and more readable at the same time.
|
#
53d745bc |
|
19-Dec-2001 |
Dima Dorfman <dd@FreeBSD.org> |
Actually make use of the md_version field of 'struct mdio'. In order not to needlessly break compatibility, decrement MDIOVERSION to 0. Approved by: phk
|
#
7e76bb56 |
|
05-Nov-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Implement IO_NOWDRAIN and B_NOWDRAIN - prevents the buffer cache from blocking in wdrain during a write. This flag needs to be used in devices whose strategy routines turn-around and issue another high level I/O, such as when MD turns around and issues a VOP_WRITE to vnode backing store, in order to avoid deadlocking the dirty buffer draining code. Remove a vprintf() warning from MD when the backing vnode is found to be in-use. The syncer or buf_daemon could be flushing the backing vnode at the time of an MD operation so the warning is not correct. MFC after: 1 week
|
#
bd78cece |
|
11-Oct-2001 |
John Baldwin <jhb@FreeBSD.org> |
Change the kernel's ucred API as follows: - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.
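The three primitives described above can be modelled with a small user-space sketch. This is an assumption-laden illustration, not the kernel code: `struct cred_sketch` and the `cred_*()` functions are hypothetical, the reference count is a plain integer with no locking or atomics, and only one credential field is shown.

```c
/* Hypothetical, much-reduced stand-in for struct ucred. */
struct cred_sketch {
    unsigned ref;   /* reference count */
    unsigned uid;   /* the only credential field modelled here */
};

/* Like crhold(): bump the refcount and hand back the same pointer. */
static struct cred_sketch *
cred_hold(struct cred_sketch *cr)
{
    cr->ref++;
    return (cr);
}

/* Like crcopy(): copy the credential data only; no return value. */
static void
cred_copy(struct cred_sketch *dst, const struct cred_sketch *src)
{
    dst->uid = src->uid;    /* dst keeps its own refcount */
}

/* Like crshared(): true if someone else also holds a reference. */
static int
cred_shared(const struct cred_sketch *cr)
{
    return (cr->ref > 1);
}
```

Returning the pointer from the hold operation lets a caller write an assignment and a reference bump as one expression, for example `sc_cred = cred_hold(cr);` in this sketch's terms.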
|
#
9935282d |
|
09-Oct-2001 |
John Baldwin <jhb@FreeBSD.org> |
Use crhold() instead of crdup(). The md(4) driver doesn't modify the ucred that it uses, so it merely needs to bump its refcount to make it immutable rather than obtain its own copy.
|
#
b40ce416 |
|
12-Sep-2001 |
Julian Elischer <julian@FreeBSD.org> |
KSE Milestone 2. Note: ALL MODULES MUST BE RECOMPILED. Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
|
#
166cd0b4 |
|
27-Aug-2001 |
Maxim Sobolev <sobomax@FreeBSD.org> |
OOPS, remove local change that somehow slipped into a commit (I swear that I already deleted it some time ago). This should fix problem people have with undefined reference to `MD_PRELOAD_COMPRESSED'. Submitted by: Manfred Antar <null@pozo.com>
|
#
9d4b5945 |
|
27-Aug-2001 |
Maxim Sobolev <sobomax@FreeBSD.org> |
- On module unload try to detach all configured disks and let unload proceed if all disks were detached successfully; - use consistent style for return statements and fix several other style inconsistencies. Reviewed by: ru Approved by: phk
|
#
e0cebb40 |
|
15-Aug-2001 |
Dima Dorfman <dd@FreeBSD.org> |
There is no MD_OBJET disk type, it's actually MD_SWAP. I guess the former was either a previous or proposed name that kind of snuck in.
|
#
26a0ee75 |
|
07-Aug-2001 |
Dima Dorfman <dd@FreeBSD.org> |
Introduce a force option, MD_FORCE, that instructs the driver to bypass some extra anti-foot-shooting measures. Currently, its only effect is to allow detaching a device while it's still open (e.g., mounted). This is useful for testing how the system reacts to a disk suddenly going away, which can happen with some removable media. At this point, the force option is only checked on detach, so it would've been possible to allow the option to be passed with the MDIOCDETACH operation. This was not done to allow the possibility of having the force flag influence other tests in the future, which may not necessarily deal with detaching the device. Reviewed by: sobomax Approved by: phk
|
#
fe603109 |
|
02-Aug-2001 |
Maxim Sobolev <sobomax@FreeBSD.org> |
- Deny detaching requests while the device is still open, otherwise it is possible to hang or panic kernel by detaching disk from which fs is mounted; - replace "md" with MD_NAME in yet another place. Reviewed by: phk Approved by: phk
|
#
3d3c27fe |
|
26-Jul-2001 |
Thomas Moestl <tmm@FreeBSD.org> |
Make sure the total number of sectors is not 0 for a vnode-type md to avoid a division by zero which would occur on open() in this case. Reviewed by: phk
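A guard of this kind might look like the sketch below. It is a user-space illustration under stated assumptions: `vnode_sector_count()` is a hypothetical helper, and returning EINVAL for an empty device is a choice made for the example rather than a statement about what the driver returns.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Compute the number of whole sectors in a backing file and reject the
 * result if it is zero, so later code can safely divide by it.
 * Returns 0 on success or EINVAL (an assumed error code) on failure.
 */
static int
vnode_sector_count(uint64_t filesize, uint32_t secsize, uint64_t *nsect)
{
    if (secsize == 0)
        return (EINVAL);
    *nsect = filesize / secsize;
    if (*nsect == 0)
        return (EINVAL);    /* file is smaller than one sector */
    return (0);
}
```

Rejecting the configuration at creation time keeps the zero value from ever reaching the open() path where the division happens.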
|
#
10b0e058 |
|
18-Jul-2001 |
Dima Dorfman <dd@FreeBSD.org> |
Use MD_NAME and MDCTL_NAME constants where appropriate.
|
#
0cddd8f0 |
|
04-Jul-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
277be4c2 |
|
25-Jun-2001 |
John Baldwin <jhb@FreeBSD.org> |
We don't need the vm lock to perform a few simple calculations on the md device's softc.
|
#
8adeb35a |
|
29-May-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove MFS compat bits.
|
#
5a025167 |
|
26-May-2001 |
Dima Dorfman <dd@FreeBSD.org> |
Acquire vm_mtx before calling vm_pager_deallocate. Reviewed by: phk
|
#
9dceb26b |
|
21-May-2001 |
John Baldwin <jhb@FreeBSD.org> |
Sort includes.
|
#
23955314 |
|
18-May-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
|
#
d4e6d409 |
|
08-May-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Polish error handling code using biofinish()
|
#
a468031c |
|
06-May-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Actually biofinish(struct bio *, struct devstat *, int error) is more general than the bioerror(). Most of this patch is generated by scripts.
|
#
1f4ee1aa |
|
06-May-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix a panic if MD devices were left half-created. XXX: the real bug is that devstat isn't part of the disk minilayer. PR: 27158 Submitted by: Anders Nordby <anders@fix.no>
|
#
fb919e4d |
|
01-May-2001 |
Mark Murray <markm@FreeBSD.org> |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
|
#
53233f94 |
|
19-Mar-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix a reference to the "vn" driver in a warning message.
|
#
0ac60323 |
|
09-Mar-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use a more BIOS friendly geometry. Submitted by: joe
|
#
174b5e9a |
|
25-Feb-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make "md" and "mdctl" macroized parameters. Implement "-l" option to mdconfig which can list one or all md devices. Submitted by: Dima Dorfman <dima@unixfreak.org>
|
#
57e9624e |
|
24-Feb-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make md/mdconfig do kld. Submitted by: dcs
|
#
c9384920 |
|
28-Jan-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove devstat entries in mddelete() Spotted: tegge
|
#
96b6a55f |
|
21-Jan-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
General cleanup.
|
#
dc57d7c6 |
|
19-Jan-2001 |
Peter Wemm <peter@FreeBSD.org> |
Fix a maybe-not-so-harmless warning.
|
#
637f671a |
|
02-Jan-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Either cvs(1) or I forgot this file in my last commit. Please see commit log for rev 1.4 of src/sbin/mdconfig/mdconfig.c
|
#
8f8def9e |
|
31-Dec-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
This is the first snapshot of the new all-singing-and-dancing md(4). Using the mdconfig(8) program you can now configure memory disks on malloc(9), swap or a file/vnode. preloaded md disks also work as usual.
|
#
e0913a2c |
|
15-Dec-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Enforce disk unit numbers upper limit in cloning.
|
#
7cc0979f |
|
08-Dec-2000 |
David Malone <dwmalone@FreeBSD.org> |
Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>
|
#
db901281 |
|
02-Sep-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Avoid the modules madness I inadvertently introduced by making the cloning infrastructure standard in kern_conf. Modules are now the same with or without devfs support. If you need to detect if devfs is present, in modules or elsewhere, check the integer variable "devfs_present". This happily removes an ugly hack from kern/vfs_conf.c. This forces a rename of the eventhandler and the standard clone helper function. Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include like <sys/queue.h> Remove all #includes of opt_devfs.h they no longer matter.
|
#
21c3015a |
|
28-Aug-2000 |
Doug Rabson <dfr@FreeBSD.org> |
* Completely rewrite the alpha busspace to hide the implementation from the drivers. * Remove legacy inx/outx support from chipset and replace with macros which call busspace. * Rework pci config accesses to route through the pcib device instead of calling a MD function directly. With these changes it is possible to cleanly support machines which have more than one independently numbered PCI busses. As a bonus, the new busspace implementation should be measurably faster than the old one.
|
#
3f54a085 |
|
20-Aug-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove all traces of Julians DEVFS (incl from kern/subr_diskslice.c) Remove old DEVFS support fields from dev_t. Make uid, gid & mode members of dev_t and set them in make_dev(). Use correct uid, gid & mode in make_dev in disk minilayer. Add support for registering alias names for a dev_t using the new function make_dev_alias(). These will show up as symlinks in DEVFS. Use makedev() rather than make_dev() for MFSs magic devices to prevent DEVFS from noticing this abuse. Add a field for DEVFS inode number in dev_t. Add new DEVFS in fs/devfs. Add devfs cloning to: disk minilayer (ie: ad(4), sd(4), cd(4) etc etc) md(4), tun(4), bpf(4), fd(4) If DEVFS add -d flag to /sbin/inits args to make it mount devfs. Add commented out DEVFS to GENERIC
|
#
f2744793 |
|
17-Jul-2000 |
Sheldon Hearn <sheldonh@FreeBSD.org> |
Rename MDNSECT to MD_NSECT and declare it as something that isn't default in NOTES. Requested by: bde Approved by: phk
|
#
0cfaeeee |
|
04-Jul-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix the "almost clone" semantics.
|
#
9626b608 |
|
05-May-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter
|
#
8177437d |
|
14-Apr-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Complete the bio/buf divorce for all code below devfs::strategy Exceptions: Vinum untouched. This means that it cannot be compiled. Greg Lehey is on the case. CCD not converted yet, casts to struct buf (still safe) atapi-cd casts to struct buf to examine B_PHYS
|
#
21144e3b |
|
20-Mar-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch was machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.
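The two points made above, a command word with exactly one bit set and the hazard of a flag defined as zero, can be shown in a short sketch. The constant values and function names here are hypothetical and chosen only for illustration; they are not the kernel's definitions.

```c
/* Hypothetical command values; each occupies its own bit. */
#define CMD_READ    0x1u
#define CMD_WRITE   0x2u
#define CMD_DELETE  0x4u

/* A command word is valid only if exactly one bit is set. */
static int
iocmd_is_valid(unsigned cmd)
{
    return (cmd != 0 && (cmd & (cmd - 1)) == 0);
}

/*
 * The old-style hazard: when a flag is defined as zero, a bitwise test
 * against it can never be true, whatever the flags word contains.
 */
#define OLD_WRITE_FLAG 0x0u

static int
old_style_is_write(unsigned flags)
{
    return ((flags & OLD_WRITE_FLAG) != 0);     /* always 0 */
}
```

Giving every command its own nonzero bit means a forgotten or doubled assignment fails the validity check instead of silently being treated as one particular operation.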
|
#
66c16191 |
|
01-Dec-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Initialize type correctly.
|
#
71e4fff8 |
|
26-Nov-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Retire MFS_ROOT and MFS_ROOT_SIZE options from the MFS implementation. Add MD_ROOT and MD_ROOT_SIZE options to the md driver. Make the md driver handle MFS_ROOT and MFS_ROOT_SIZE options for compatibility. Add md driver to GENERIC, PCCARD and LINT. This is a cleanup which removes the need for some of the worse hacks in MFS: We really want to have a rootvnode but MFS on a preloaded image doesn't really have one. md is a true device, so it is less trouble. This has been tested with make release, and if people remember to add the "md" pseudo-device to their kernels, PicoBSD should be just fine as well. If people have no other use for MFS, it can be removed from the kernel.
|
#
95f1a897 |
|
20-Nov-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Teach the md driver to use preloaded files of type "md_image".
|
#
c23f2166 |
|
11-Oct-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
remove unused #include
|
#
d6a0e38a |
|
25-Sep-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove five now unused fields from struct cdevsw. They should never have been there in the first place. A GENERIC kernel shrinks almost 1k. Add a slightly different safetybelt under nostop for tty drivers. Add some missing FreeBSD tags
|
#
27068b01 |
|
22-Sep-1999 |
Brian Feldman <green@FreeBSD.org> |
Fix includes (remove unnecessary ones, reorder necessary ones.) Also, correct an %x to be %lx. Reviewed by: phk
|
#
33edfabe |
|
20-Sep-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
fix a buglet which jordan made me provoke :-)
|
#
00a6a3c6 |
|
21-Sep-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add an experimental Memory-Disk driver. This driver will allocate memory with malloc(9) using a few tricks to save space on the way.
|