History log of /freebsd-11-stable/sys/kern/tty.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 361357 22-May-2020 hselasky

MFC r361075:
Assign process group of the TTY under the "proctree_lock".

This fixes a race where concurrent calls to doenterpgrp() and
leavepgrp() while TIOCSCTTY is executing may result in tp->t_pgrp
changing value so that tty_rel_pgrp() misses clearing it to NULL. For
more details refer to the use of pgdelete() in the kernel.

No functional change intended.

Panic backtrace:
__mtx_lock_sleep() # page fault due to using destroyed mutex
tty_signal_pgrp()
tty_ioctl()
ptsdev_ioctl()
kern_ioctl()
sys_ioctl()
amd64_syscall()

Sponsored by: Mellanox Technologies


# 357660 07-Feb-2020 kevans

MFC r355248: tty: implement TIOCNOTTY

Generally, it's preferred that an application fork/setsid if it doesn't want
to keep its controlling TTY, but it could be that a debugger is trying to
steal it instead -- so it would hook in, drop the controlling TTY, then do
some magic to set things up again. In this case, TIOCNOTTY is quite handy
and still respected by at least OpenBSD, NetBSD, and Linux as far as I can
tell.

I've dropped the note about obsoletion, as I intend to support TIOCNOTTY as
long as it doesn't impose a major burden.


# 356020 22-Dec-2019 kevans

MFC r355206, r355212, r355257, r355271: tty nits

r355206: tty_pts: don't rely on tty header pollution for sys/mutex.h
r355212: tty_rel_gone: add locking assertion
r355257: usb: remove some extraneous tty.h includes
r355271: Remove more needless <sys/tty.h> includes


# 349806 07-Jul-2019 markj

MFC r349733:
Defer funsetown() calls for a TTY to tty_rel_free().


# 314538 02-Mar-2017 ian

MFC r311954, r311996, r312077, r312080:

Rework tty_drain() to poll the hardware for completion, and restore
drain timeout handling to historical freebsd behavior.

The primary reason for these changes is the need to have tty_drain() call
ttydevsw_busy() at some reasonable sub-second rate, to poll hardware that
doesn't signal an interrupt when the transmit shift register becomes empty
(which includes virtually all USB serial hardware). Such hardware hangs
in a ttyout wait, because it never gets an opportunity to trigger a wakeup
from the sleep in tty_drain() by calling ttydisc_getc() again, after
handing the last of the buffered data to the hardware.

Restructure the tty_drain loop so that device-busy is checked one more time
after tty_timedwait() returns an error only if the error is EWOULDBLOCK;
other errors cause an immediate return. This fixes the case of the tty
disappearing while in tty_drain().

Check tty_gone() after allocating IO buffers. The tty lock has to be
dropped then reacquired due to using M_WAITOK, which opens a window in
which the tty device can disappear. Check for this and return ENXIO
back up the call chain so that callers can cope.

Correct the comments about how much buffer is allocated.


# 302408 07-Jul-2016 gjb

Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle.
Prune svn:mergeinfo from the new branch, as nothing has been merged
here.

Additional commits post-branch will follow.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


/freebsd-11-stable/MAINTAINERS
/freebsd-11-stable/cddl
/freebsd-11-stable/cddl/contrib/opensolaris
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs
/freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs
/freebsd-11-stable/contrib/amd
/freebsd-11-stable/contrib/apr
/freebsd-11-stable/contrib/apr-util
/freebsd-11-stable/contrib/atf
/freebsd-11-stable/contrib/binutils
/freebsd-11-stable/contrib/bmake
/freebsd-11-stable/contrib/byacc
/freebsd-11-stable/contrib/bzip2
/freebsd-11-stable/contrib/com_err
/freebsd-11-stable/contrib/compiler-rt
/freebsd-11-stable/contrib/dialog
/freebsd-11-stable/contrib/dma
/freebsd-11-stable/contrib/dtc
/freebsd-11-stable/contrib/ee
/freebsd-11-stable/contrib/elftoolchain
/freebsd-11-stable/contrib/elftoolchain/ar
/freebsd-11-stable/contrib/elftoolchain/brandelf
/freebsd-11-stable/contrib/elftoolchain/elfdump
/freebsd-11-stable/contrib/expat
/freebsd-11-stable/contrib/file
/freebsd-11-stable/contrib/gcc
/freebsd-11-stable/contrib/gcclibs/libgomp
/freebsd-11-stable/contrib/gdb
/freebsd-11-stable/contrib/gdtoa
/freebsd-11-stable/contrib/groff
/freebsd-11-stable/contrib/ipfilter
/freebsd-11-stable/contrib/ldns
/freebsd-11-stable/contrib/ldns-host
/freebsd-11-stable/contrib/less
/freebsd-11-stable/contrib/libarchive
/freebsd-11-stable/contrib/libarchive/cpio
/freebsd-11-stable/contrib/libarchive/libarchive
/freebsd-11-stable/contrib/libarchive/libarchive_fe
/freebsd-11-stable/contrib/libarchive/tar
/freebsd-11-stable/contrib/libc++
/freebsd-11-stable/contrib/libc-vis
/freebsd-11-stable/contrib/libcxxrt
/freebsd-11-stable/contrib/libexecinfo
/freebsd-11-stable/contrib/libpcap
/freebsd-11-stable/contrib/libstdc++
/freebsd-11-stable/contrib/libucl
/freebsd-11-stable/contrib/libxo
/freebsd-11-stable/contrib/llvm
/freebsd-11-stable/contrib/llvm/projects/libunwind
/freebsd-11-stable/contrib/llvm/tools/clang
/freebsd-11-stable/contrib/llvm/tools/lldb
/freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump
/freebsd-11-stable/contrib/llvm/tools/llvm-lto
/freebsd-11-stable/contrib/mdocml
/freebsd-11-stable/contrib/mtree
/freebsd-11-stable/contrib/ncurses
/freebsd-11-stable/contrib/netcat
/freebsd-11-stable/contrib/ntp
/freebsd-11-stable/contrib/nvi
/freebsd-11-stable/contrib/one-true-awk
/freebsd-11-stable/contrib/openbsm
/freebsd-11-stable/contrib/openpam
/freebsd-11-stable/contrib/openresolv
/freebsd-11-stable/contrib/pf
/freebsd-11-stable/contrib/sendmail
/freebsd-11-stable/contrib/serf
/freebsd-11-stable/contrib/sqlite3
/freebsd-11-stable/contrib/subversion
/freebsd-11-stable/contrib/tcpdump
/freebsd-11-stable/contrib/tcsh
/freebsd-11-stable/contrib/tnftp
/freebsd-11-stable/contrib/top
/freebsd-11-stable/contrib/top/install-sh
/freebsd-11-stable/contrib/tzcode/stdtime
/freebsd-11-stable/contrib/tzcode/zic
/freebsd-11-stable/contrib/tzdata
/freebsd-11-stable/contrib/unbound
/freebsd-11-stable/contrib/vis
/freebsd-11-stable/contrib/wpa
/freebsd-11-stable/contrib/xz
/freebsd-11-stable/crypto/heimdal
/freebsd-11-stable/crypto/openssh
/freebsd-11-stable/crypto/openssl
/freebsd-11-stable/gnu/lib
/freebsd-11-stable/gnu/usr.bin/binutils
/freebsd-11-stable/gnu/usr.bin/cc/cc_tools
/freebsd-11-stable/gnu/usr.bin/gdb
/freebsd-11-stable/lib/libc/locale/ascii.c
/freebsd-11-stable/sys/cddl/contrib/opensolaris
/freebsd-11-stable/sys/contrib/dev/acpica
/freebsd-11-stable/sys/contrib/ipfilter
/freebsd-11-stable/sys/contrib/libfdt
/freebsd-11-stable/sys/contrib/octeon-sdk
/freebsd-11-stable/sys/contrib/x86emu
/freebsd-11-stable/sys/contrib/xz-embedded
/freebsd-11-stable/usr.sbin/bhyve/atkbdc.h
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h
/freebsd-11-stable/usr.sbin/bhyve/console.c
/freebsd-11-stable/usr.sbin/bhyve/console.h
/freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h
/freebsd-11-stable/usr.sbin/bhyve/rfb.c
/freebsd-11-stable/usr.sbin/bhyve/rfb.h
/freebsd-11-stable/usr.sbin/bhyve/sockstream.c
/freebsd-11-stable/usr.sbin/bhyve/sockstream.h
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.c
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.h
/freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c
/freebsd-11-stable/usr.sbin/bhyve/vga.c
/freebsd-11-stable/usr.sbin/bhyve/vga.h
# 298819 29-Apr-2016 pfg

sys/kern: spelling fixes in comments.

No functional change.


# 294836 26-Jan-2016 kib

Don't clear the software flow control flag before draining for last
close or assert the bug that it is clear when leaving.

Remove an unrelated rotted comment that was attached to the buggy
clearing.

Since draining is not done in more cases, flushing is needed in more
cases, so start fixing flushing:
- do a full flush in ttydisc_close(). State what POSIX requires more
clearly. This was missing ttydevsw_pktnotify() calls to tell the
devsw layer to flush. Hardware tty drivers don't actually flush
since they don't understand this API.
- fix 2 missing wakeups in tty_flush(). Most of the wakeups here are
unnecessary for last close. But ttydisc_close() did one of the
missing ones.

This flow control bug ameliorated the design bug of requiring
potentially unbounded waits in draining. Software flow control is the
easiest way to get an unbounded wait, and a long wait is sometimes
actually useful. Users can type the xoff character on the receiver
and (if ixon is set on the sender) expect the output to be held until
the user is ready for more.

Hardware flow control can also give the unbounded wait, and this bug
didn't affect hardware flow control. Unbounded waits from hardware
flow control take a more unusual configuration. E.g., a terminal
program that controls the modem status lines, or unplugging the cable
in a configuration where this doesn't break the connection.

The design bug is still ameliorated by a newer bug in draining for
last close -- the 1 second timeout. E.g., if the user types the
xoff character and the sender reaches last close, then output is
not resumed and the wait times out after just 1 second. This is
broken, but preferable to an unbounded wait. Before this change,
the output was resumed immediately and usually completed.

Submitted by: bde
MFC after: 2 weeks


# 294778 26-Jan-2016 kib

Restore flushing of output for revoke(2) again. Document revoke()'s
intended behaviour in its man page. Simplify tty_drain() to match.
Don't call ttydevsw methods in tty_flush() if the device is gone
since we now sometimes call it then.

The flushing was supposed to be implemented by passing the FNONBLOCK
flag to VOP_CLOSE() for revoke(). The tty driver is one of the few
that can block in close and was one of the fewer that knew about this.

This almost worked in FreeBSD-1 and similarly in Net/2. These
versions only almost worked because there was and is considerable
confusion between IO_NDELAY and FNONBLOCK (aka O_NONBLOCK). IO_NDELAY
is only valid for VOP_READ() and VOP_WRITE(). For other VOPs it has
the same value as O_SHLOCK. But since vfs_subr.c and tty.c
consistently used the wrong flag and the O_SHLOCK flag is rarely set,
this mostly worked. It also gave the feature than applications could
get the non-blocking close by abusing O_SHLOCK.

This was first broken then fixed in 1995. I changed only the tty
driver to use FNONBLOCK, as a hack to get non-blocking via the normal
flag FNONBLOCK for last closes. I didn't know about revoke()'s use
of IO_NDELAY or change it to be consistent, so revoke() was broken.
Then I changed revoke() to match.

This was next broken in 1997 then fixed in 1998. Importing Lite2 made
the flags inconsistent again by undoing the fix only in vfs_subr.c.

This was next broken in 2008 by replacing everything in tty.c and not
checking any flags in last close. Other bugs in draining limited the
resulting unbounded waits to drain in some cases.

It is now possible to fix this better using the new FREVOKE flag.
Just restore flushing for revoke() for now. Don't restore or undo any
hacks for ordinary last closes yet. But remove dead code in the
1-second relative timeout (r272789). This did extra work to extend
the buggy draining for revoke() for as long as possible. The 1-second
timeout made this not very long by usually flushing after 1 second.

Submitted by: bde
MFC after: 2 weeks


# 294753 25-Jan-2016 marius

- Make the code consistent with itself style-wise and bring it closer
to style(9).
- Mark unused arguments as such.
- Make the ttystates table const.


# 294735 25-Jan-2016 kib

Don't allow opening the callout device when the callin device is already
open (in disguise as the console device). The only allowed combination
was supposed to be the callin device with the console.

Fix the assertion in ttydev_close() that was meant to detect this (it
only detected all 3 devices being open). Assert this in ttydev_open()
too.

Submitted by: bde
MFC after: 2 weeks


# 294732 25-Jan-2016 kib

Fix the %b flags string for ddb. All bits above the 5th
(TF_OPENED_CONS) were broken in r188147 by adding TF_OPENED_CONS
without updating the string. It was especially confusing to display
OPENED_CONS as GONE and BYPASS as ZOMBIE. 2 flags at the end were
not updated in r188487.

Don't print an extra 0x prefix for %p in a ddb command. In the rest
of the kernel there are more than 6000 lines with %p and only about
40 with this bug.

Print a non-extra 0x prefix for %b in a ddb command. In the rest
of the kernel, there are approx. 180 lines with %b and 2/3 of them
have this bug.

Submitted by: bde
MFC after: 2 weeks


# 294598 22-Jan-2016 kib

In tty_dealloc(), clear the queues. See the comment for a scenario
which explains why ttydev_leave() cleanup might not happen.

Submitted by: bde
MFC after: 3 weeks


# 294362 19-Jan-2016 marius

Fix tty_drain() and, thus, TIOCDRAIN of the current tty(4) incarnation
to actually wait until the TX FIFOs of UARTs have be drained before
returning. This is done by bringing the equivalent of the TS_BUSY flag
found in the previous implementation back in an ABI-preserving way.
Reported and tested by: Patrick Powell

Most likely, drivers for USB-serial-adapters likewise incorporating
TX FIFOs as well as other terminal devices that buffer output in some
form should also provide implementations of tsw_busy.

MFC after: 3 days


# 293349 07-Jan-2016 kib

Convert tty common code to use make_dev_s().

Tty.c was untypical in that it handled the si_drv1 issue consistently
and correctly, by always checking for si_drv1 being non-NULL and
sleeping if NULL. The removed code also illustrated unneeded
complications in drivers which are eliminated by the use of new KPI.

Reviewed by: hps, jhb
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D4746


# 278930 17-Feb-2015 mjg

filedesc: simplify fget_unlocked & friends

Introduce fget_fcntl which performs appropriate checks when needed.
This removes a branch from fget_unlocked.

Introduce fget_mmap dealing with cap_rights_to_vmprot conversion.
This removes a branch from _fget.

Modify fget_unlocked to pass sequence counter to interested callers so
that they can perform their own checks and make sure the result was
otained from stable & current state.

Reviewed by: silence on -hackers


# 272789 09-Oct-2014 marcel

Fix draining in ttydev_leave():
1. ERESTART is not only returned when the revoke count changed. It
is also returned when a signal is received. While a change in
the revoke count should be ignored, a signal should not.
2. Waiting until the output queue is entirely drained can cause a
hang when the underlying device is stuck or broken.

Have tty_drain() take care of this by telling it when we're leaving.
When leaving, tty_drain() will use a timed wait to address point 2
above and it will check the revoke count to handle point 1 above.
The timeout is set to 1 second, which is arbitrary and long enough
to expect a change in the output queue.

Discussed with: jilles@
Reported by: Yamagi Burmeister <lists@yamagi.org>


# 272786 09-Oct-2014 marcel

Apply r269126 to tty_timedwait():
Don't return ERESTART when the device is gone.


# 272270 28-Sep-2014 neel

tty_rel_free() can be called more than once for the same tty so make sure
that the tty is dequeued from 'tty_list' only the first time.

The panic below was seen when a revoke(2) was issued on an nmdm device.
In this case there was also a thread that was blocked on a read(2) on the
device. The revoke(2) woke up the blocked thread which would typically
return an error to userspace. In this case the reader also held the last
reference on the file descriptor so fdrop() ended up calling tty_rel_free()
via ttydev_close().

tty_rel_free() then tried to dequeue 'tp' again which led to the panic.

panic: Bad link elm 0xfffff80042602400 prev->next != elm
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00f9c90460
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe00f9c90510
vpanic() at vpanic+0x189/frame 0xfffffe00f9c90590
panic() at panic+0x43/frame 0xfffffe00f9c905f0
tty_rel_free() at tty_rel_free+0x29b/frame 0xfffffe00f9c90640
ttydev_close() at ttydev_close+0x1f9/frame 0xfffffe00f9c90690
devfs_close() at devfs_close+0x298/frame 0xfffffe00f9c90720
VOP_CLOSE_APV() at VOP_CLOSE_APV+0x13c/frame 0xfffffe00f9c90770
vn_close() at vn_close+0x194/frame 0xfffffe00f9c90810
vn_closefile() at vn_closefile+0x48/frame 0xfffffe00f9c90890
devfs_close_f() at devfs_close_f+0x2c/frame 0xfffffe00f9c908c0
_fdrop() at _fdrop+0x29/frame 0xfffffe00f9c908e0
sys_read() at sys_read+0x63/frame 0xfffffe00f9c90980
amd64_syscall() at amd64_syscall+0x2b3/frame 0xfffffe00f9c90ab0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe00f9c90ab0
--- syscall (3, FreeBSD ELF64, sys_read), rip = 0x800b78d8a, rsp = 0x7fffffbfdaf8, rbp = 0x7fffffbfdb30 ---

CR: https://reviews.freebsd.org/D851
Reviewed by: glebius, ed
Reported by: Leon Dang
Sponsored by: Nahanni Systems
MFC after: 1 week


# 269126 26-Jul-2014 marcel

Don't return ERESTART when the device is gone. In ttydev_leave() ERESTART
is the indication that draining got interrupted due to a revoke(2) and
that tty_drain() is to be called again for draining to complete. If the
device is flagged as gone, then waiting/draining is not possible. Only
return ERESTART when waiting is still possible.

Obtained from: Juniper Networks, Inc.


# 263233 16-Mar-2014 rwatson

Update kernel inclusions of capability.h to use capsicum.h instead; some
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.

MFC after: 3 weeks


# 259663 20-Dec-2013 glebius

Move list of ttys handling from the allocating procedures, to the
device creation stage. A device creation can fail, and in that case
an entry already on the list will be freed.

Sponsored by: Nginx, Inc.


# 259549 18-Dec-2013 glebius

- Rename tty_makedev() into tty_makedevf() and make it capable
to fail and return error.
- Use make_dev_p() in tty_makedevf() instead of make_dev_cred().
- Always pass MAKEDEV_CHECKNAME flag.
- Optionally pass MAKEDEV_REF flag.
- Provide macro for compatibility with old API.

This fixes races with simultaneous creation and desctruction of
ttys, and makes it possible to call tty_makedevf() from device
cloners.

A race in tty_watermarks() still exist, since the latter drops
lock for M_WAITOK allocation. This will be addressed in separate
commit.

Reviewed by: kib
Sponsored by: Nginx, Inc.


# 259441 15-Dec-2013 marcel

Properly drain the TTY when both revoke(2) and close(2) end up closing
the TTY. In such a case, ttydev_close() is called multiple times and
each time, t_revokecnt is incremented and cv_broadcast() is called for
both the t_outwait and t_inwait condition variables.
Let's say revoke(2) comes in first and gets to call tty_drain() from
ttydev_leave(). Let's say that the revoke comes from init(8) as the
result of running "shutdown -r now". Since shutdown prints various
messages to the console before announing that the machine will reboot
immediately, let's also say that the output queue is not empty and
that tty_drain() has something to do. Let's assume this all happens
on a 9600 baud serial console, so it takes a time to drain.
The shutdown command will exit(2) and as such will end up closing
stdout. Let's say this close will come in second, bump t_revokecnt
and call tty_wakeup(). This has tty_wait() return prematurely and
the next thing that will happen is that the thread doing revoke(2)
will flush the TTY. Since the drain wasn't complete, the flush will
effectively drop whatever is left in t_outq.

This change takes into account that tty_drain() will return ERESTART
due to the fact that t_revokecnt was bumped and in that case simply
call tty_drain() again. The thread in question is already performing
the close so it can safely finish draining the TTY before destroying
the TTY structure.

Now all messages from shutdown will be printed on the serial console.

Obtained from: Juniper Networks, Inc.


# 255219 04-Sep-2013 pjd

Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

struct cap_rights {
uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.

The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.

#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:

#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)

#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);

bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:

cap_rights_t rights;

cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:

#define cap_rights_set(rights, ...) \
__cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:

cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.

This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.

Sponsored by: The FreeBSD Foundation


# 247602 01-Mar-2013 pjd

Merge Capsicum overhaul:

- Capability is no longer separate descriptor type. Now every descriptor
has set of its own capability rights.

- The cap_new(2) system call is left, but it is no longer documented and
should not be used in new code.

- The new syscall cap_rights_limit(2) should be used instead of
cap_new(2), which limits capability rights of the given descriptor
without creating a new one.

- The cap_getrights(2) syscall is renamed to cap_rights_get(2).

- If CAP_IOCTL capability right is present we can further reduce allowed
ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed
ioctls can be retrived with cap_ioctls_get(2) syscall.

- If CAP_FCNTL capability right is present we can further reduce fcntls
that can be used with the new cap_fcntls_limit(2) syscall and retrive
them with cap_fcntls_get(2).

- To support ioctl and fcntl white-listing the filedesc structure was
heavly modified.

- The audit subsystem, kdump and procstat tools were updated to
recognize new syscalls.

- Capability rights were revised and eventhough I tried hard to provide
backward API and ABI compatibility there are some incompatible changes
that are described in detail below:

CAP_CREATE old behaviour:
- Allow for openat(2)+O_CREAT.
- Allow for linkat(2).
- Allow for symlinkat(2).
CAP_CREATE new behaviour:
- Allow for openat(2)+O_CREAT.

Added CAP_LINKAT:
- Allow for linkat(2). ABI: Reuses CAP_RMDIR bit.
- Allow to be target for renameat(2).

Added CAP_SYMLINKAT:
- Allow for symlinkat(2).

Removed CAP_DELETE. Old behaviour:
- Allow for unlinkat(2) when removing non-directory object.
- Allow to be source for renameat(2).

Removed CAP_RMDIR. Old behaviour:
- Allow for unlinkat(2) when removing directory.

Added CAP_RENAMEAT:
- Required for source directory for the renameat(2) syscall.

Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR):
- Allow for unlinkat(2) on any object.
- Required if target of renameat(2) exists and will be removed by this
call.

Removed CAP_MAPEXEC.

CAP_MMAP old behaviour:
- Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and
PROT_WRITE.
CAP_MMAP new behaviour:
- Allow for mmap(2)+PROT_NONE.

Added CAP_MMAP_R:
- Allow for mmap(PROT_READ).
Added CAP_MMAP_W:
- Allow for mmap(PROT_WRITE).
Added CAP_MMAP_X:
- Allow for mmap(PROT_EXEC).
Added CAP_MMAP_RW:
- Allow for mmap(PROT_READ | PROT_WRITE).
Added CAP_MMAP_RX:
- Allow for mmap(PROT_READ | PROT_EXEC).
Added CAP_MMAP_WX:
- Allow for mmap(PROT_WRITE | PROT_EXEC).
Added CAP_MMAP_RWX:
- Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).

Renamed CAP_MKDIR to CAP_MKDIRAT.
Renamed CAP_MKFIFO to CAP_MKFIFOAT.
Renamed CAP_MKNODE to CAP_MKNODEAT.

CAP_READ old behaviour:
- Allow pread(2).
- Disallow read(2), readv(2) (if there is no CAP_SEEK).
CAP_READ new behaviour:
- Allow read(2), readv(2).
- Disallow pread(2) (CAP_SEEK was also required).

CAP_WRITE old behaviour:
- Allow pwrite(2).
- Disallow write(2), writev(2) (if there is no CAP_SEEK).
CAP_WRITE new behaviour:
- Allow write(2), writev(2).
- Disallow pwrite(2) (CAP_SEEK was also required).

Added convinient defines:

#define CAP_PREAD (CAP_SEEK | CAP_READ)
#define CAP_PWRITE (CAP_SEEK | CAP_WRITE)
#define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ)
#define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE)
#define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
#define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W)
#define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X)
#define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X)
#define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
#define CAP_RECV CAP_READ
#define CAP_SEND CAP_WRITE

#define CAP_SOCK_CLIENT \
(CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
#define CAP_SOCK_SERVER \
(CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
CAP_SETSOCKOPT | CAP_SHUTDOWN)

Added defines for backward API compatibility:

#define CAP_MAPEXEC CAP_MMAP_X
#define CAP_DELETE CAP_UNLINKAT
#define CAP_MKDIR CAP_MKDIRAT
#define CAP_RMDIR CAP_UNLINKAT
#define CAP_MKFIFO CAP_MKFIFOAT
#define CAP_MKNOD CAP_MKNODAT
#define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER)

Sponsored by: The FreeBSD Foundation
Reviewed by: Christoph Mallon <christoph.mallon@gmx.de>
Many aspects discussed with: rwatson, benl, jonathan
ABI compatibility discussed with: kib


# 242692 07-Nov-2012 kevlo

Fix typo; s/ouput/output


# 242529 03-Nov-2012 ed

Add tty_set_winsize().

This removes some of the signalling magic from the Syscons driver and
puts it in the TTY layer, where it belongs.


# 242078 25-Oct-2012 ed

Correct SIGTTIN handling.

In the old TTY layer, SIGTTIN was correctly handled like this:

while (data should be read) {
send SIGTTIN if not foreground process group
read data
}

In the new TTY layer, however, this behaviour was changed, based on a
false interpretation of the standard:

send SIGTTIN if not foreground process group
while (data should be read) {
read data
}

Correct this by pushing tty_wait_background() into the ttydisc_read_*()
functions.

Reported by: koitsu
PR: kern/173010
MFC after: 2 weeks


# 237219 18-Jun-2012 pho

In tty_makedev() the following construction:

dev = make_dev_cred();
dev->si_drv1 = tp;

leaves a small window where the newly created device may be opened
and si_drv1 is NULL.

As this is a vary rare situation, using a lock to close the window
seems overkill. Instead just wait for the assignment of si_drv1.

Suggested by: kib
MFC after: 1 week


# 236730 07-Jun-2012 pjd

Eliminate redundant variable.

Sponsored by: FreeBSD Foundation
MFC after: 1 week


# 236727 07-Jun-2012 pjd

Plug file reference leak in capability failure case.

Sponsored by: FreeBSD Foundation
MFC after: 3 days


# 232197 26-Feb-2012 phk

Also call the low-level driver if ->c_iflag & (IXON|IXOFF|IXANY) changes.

Uftdi(4) examines (c_iflag & (IXON|IXOFF)) to control hw XON-XOFF support.
This is obviously no good, if changes to those bits are not communicated
down the stack.


# 231095 06-Feb-2012 ed

Fix whitespace inconsistencies in TTY code.


# 225617 16-Sep-2011 kmacy

In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by: rwatson
Approved by: re (bz)


# 225506 12-Sep-2011 ed

Fix error return codes for ioctls on init/lock state devices.

In revision 223722 we introduced support for driver ioctls on init/lock
state devices. Unfortunately the call to ttydevsw_cioctl() clobbers the
value of the error variable, meaning that in many cases ioctl() will now
return ENOTTY, even though the ioctl() was processed properly.

Reported by: Boris Samorodov <bsam ipt ru>
Patch by: jilles@
Approved by: re@ (kib@)


# 225177 25-Aug-2011 attilio

Fix a deficiency in the selinfo interface:
If a selinfo object is recorded (via selrecord()) and then it is
quickly destroyed, with the waiters missing the opportunity to awake,
at the next iteration they will find the selinfo object destroyed,
causing a PF#.

That happens because the selinfo interface has no way to drain the
waiters before to destroy the registered selinfo object. Also this
race is quite rare to get in practice, because it would require a
selrecord(), a poll request by another thread and a quick destruction
of the selrecord()'ed selinfo object.

Fix this by adding the seldrain() routine which should be called
before to destroy the selinfo objects (in order to avoid such case),
and fix the present cases where it might have already been called.
Sometimes, the context is safe enough to prevent this type of race,
like it happens in device drivers which installs selinfo objects on
poll callbacks. There, the destruction of the selinfo object happens
at driver detach time, when all the filedescriptors should be already
closed, thus there cannot be a race.
For this case, mfi(4) device driver can be set as an example, as it
implements a full correct logic for preventing this from happening.

Sponsored by: Sandvine Incorporated
Reported by: rstone
Tested by: pluknet
Reviewed by: jhb, kib
Approved by: re (bz)
MFC after: 3 weeks


# 224778 11-Aug-2011 rwatson

Second-to-last commit implementing Capsicum capabilities in the FreeBSD
kernel for FreeBSD 9.0:

Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *. With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.

Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.

In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.

Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.

Approved by: re (bz)
Submitted by: jonathan
Sponsored by: Google Inc


# 223722 02-Jul-2011 ed

Reintroduce the cioctl() hook in the TTY layer for digi(4).

The cioctl() hook can be used by drivers to add ioctls to the *.init and
*.lock devices. This commit breaks the ttydevsw ABI, since this
structure didn't provide any padding. To prevent ABI breakage in the
future, add a tsw_spare.

Submitted by: Peter Jeremy <peter jeremy alcatel lucent com>
Obtained from: kern/152254 (slightly modified)


# 223575 26-Jun-2011 ed

Fix whitespace inconsistencies in the TTY layer and its drivers owned by me.


# 216952 04-Jan-2011 kib

Finish r210923, 210926. Mark some devices as eternal.

MFC after: 2 weeks


# 212867 19-Sep-2010 ed

Just make callout devices and /dev/console force CLOCAL on open().

Instead of adding custom checks to wait for DCD on open(), just modify
the termios structure to set CLOCAL. This means SIGHUP is no longer
generated when losing DCD as well.

Reviewed by: kib@
MFC after: 1 week


# 212860 19-Sep-2010 ed

Ignore DCD handling on /dev/console entirely.

This makes /dev/console more fail-safe and prevents a potential console
lock-up during boot.

Discussed on: stable@
Tested by: koitsu@
MFC after: 1 week


# 210923 06-Aug-2010 kib

Add new make_dev_p(9) flag MAKEDEV_ETERNAL to inform devfs that created
cdev will never be destroyed. Propagate the flag to devfs vnodes as
VV_ETERNVALDEV. Use the flags to avoid acquiring devmtx and taking a
thread reference on such nodes.

In collaboration with: pho
MFC after: 1 month


# 209718 06-Jul-2010 ed

Fix a race condition, where a TTY could be destroyed twice.

There are special cases where tty_rel_free() can be called twice in a
row, namely when closing and revoking the TTY at the same moment. Only
call destroy_dev_sched_cb() once.

Reported by: Jeremie Le Hen
MFC after: 1 week


# 201532 04-Jan-2010 ed

Make TIOCSTI work again.

It looks like I didn't implement this when I imported MPSAFE TTY.
Applications like mail(1) still use this. I think it's conceptually bad.

Tested by: Pete French <petefrench ticketswitch com>
MFC after: 2 weeks


# 201223 29-Dec-2009 rnoland

Update d_mmap() to accept vm_ooffset_t and vm_memattr_t.

This replaces d_mmap() with the d_mmap2() implementation and also
changes the type of offset to vm_ooffset_t.

Purge d_mmap2().

All driver modules will need to be rebuilt since D_VERSION is also
bumped.

Reviewed by: jhb@
MFC after: Not in this lifetime...


# 199998 01-Dec-2009 ed

Don't allocate an input buffer for a TTY when the receiver is turned off.

When the termios CREAD flag is not set, it makes little sense to
allocate an input buffer. Just set the size to 0 in this case to reduce
memory footprint.

Disallow CREAD to be disabled for pseudo-devices to prevent
foot-shooting.


# 199355 17-Nov-2009 kib

Among signal generation syscalls, only sigqueue(2) is allowed by POSIX
to fail due to lack of resources to queue siginfo. Add KSI_SIGQ flag
that allows sigqueue_add() to fail while trying to allocate memory for
new siginfo. When the flag is not set, behaviour is the same as for
KSI_TRAP: if memory cannot be allocated, set bit in sq_kill. KSI_TRAP is
kept to preserve KBI.

Add SI_KERNEL si_code, to be used in siginfo.si_code when signal is
generated by kernel. Deliver siginfo when signal is generated by kill(2)
family of syscalls (SI_USER with properly filled si_uid and si_pid), or
by kernel (SI_KERNEL, mostly job control or SIGIO). Since KSI_SIGQ flag
is not set for the ksi, low memory condition cause old behaviour.

Keep psignal(9) KBI intact, but modify it to generate SI_KERNEL
si_code. Pgsignal(9) and gsignal(9) now take ksi explicitely. Add
pksignal(9) that behaves like psignal but takes ksi, and ddb kill
command implemented as pksignal(..., ksi = NULL) to not do allocation
while in debugger.

While there, remove some register specifiers and use ANSI C prototypes.

Reviewed by: davidxu
MFC after: 1 month


# 198223 19-Oct-2009 ed

Properly set the low watermarks when reducing the baud rate.

Now that buffers are deallocated lazily, we should not use
tty*q_getsize() to obtain the buffer size to calculate the low
watermarks. Doing this may cause the watermark to be placed outside the
typical buffer size.

This caused some regressions after my previous commit to the TTY code,
which allows pseudo-devices to resize the buffers as well.

Reported by: yongari, dougb
MFC after: 1 week


# 198214 18-Oct-2009 ed

Allow the buffer size to be configured for pseudo-like TTY devices.

Devices that don't implement param() (which means they don't support
hardware parameters such as flow control, baud rate) hardcode the baud
rate to TTYDEF_SPEED. This means the buffer size cannot be configured,
which is a little inconvenient when using canonical mode with big lines
of input, etc.

Make it adjustable, but do clamp it between B50 and B115200 to prevent
awkward buffer sizes. Remove the baud rate assignment from
/etc/gettytab. Trust the kernel to fill in a proper value.

Reported by: Mikolaj Golub <to my trociny gmail com>
MFC after: 1 month


# 198213 18-Oct-2009 ed

Make lock devices work properly.

It turned out I did add the code to use the init state devices to set
the termios structure when opening the device, but it seems I totally
forgot to add the bits required to force the actual locking of flags
through the lock state devices.

Reported by: ru
MFC after: 1 week (to be discussed)


# 197134 12-Sep-2009 rwatson

Use C99 initialization for struct filterops.

Obtained from: Mac OS X
Sponsored by: Apple Inc.
MFC after: 3 weeks


# 195444 08-Jul-2009 ed

Fix regressions in return events of poll() on TTYs.

As pointed out, POLLHUP should be generated, even if it hasn't been
specified on input. It is also not allowed to return both POLLOUT and
POLLHUP at the same time.

Reported by: jilles
Approved by: re (kib)


# 195136 28-Jun-2009 ed

Add FIONWRITE support to TTYs.

TTYs already supported TIOCOUTQ, but FIONWRITE seems to be a more
generic name for this.

Approved by: re (kib)


# 194771 23-Jun-2009 ed

Improve my last commit: use a separate condvar to serialize.

The advantage of using a separate condvar is that we can just use
cv_signal(9) instead of cv_broadcast(9). It makes no sense to wake up
multiple threads. It also makes the TTY code easier to understand.
t_dcdwait sounds totally unrelated.


# 194769 23-Jun-2009 ed

Use dcdwait to block threads to serialize writes.

I suspect the usage of bgwait causes a lot of spurious wakeups when
threads are blocked in the background, because they will be woken up
each time a write() call is performed.

Also wakeup dcdwait when the TTY is abandoned.


# 194532 20-Jun-2009 ed

Improve nested jail awareness of devfs by handling credentials.

Now that we start to use credentials on character devices more often
(because of MPSAFE TTY), move the prison-checks that are in place in the
TTY code into devfs.

Instead of strictly comparing the prisons, use the more common
prison_check() function to compare credentials. This means that
pseudo-terminals are only visible in devfs by processes within the same
jail and parent jails.

Even though regular users in parent jails can now interact with
pseudo-terminals from child jails, this seems to be the right approach.
These processes are also capable of interacting with the jailed
processes anyway, through signals for example.

Reviewed by: kib, rwatson (older version)


# 194264 15-Jun-2009 ed

Perform some more cleanups to in-kernel session handling.

The code that was in place in exit1() was mainly based on code from the
old TTY layer. The main reason behind this, was because at one moment I
ran a system that had two TTY layers in place at the same time. It is
now sufficient to do the following:

- Remove references from the session structure to the TTY vnode and the
session leader.

- If we have a controlling TTY and the session used by the TTY is equal
to our session, send the SIGHUP.

- If we have a vnode to the controlling TTY which has not been revoked,
revoke it.

While there, change sys/kern/tty.c to use s_ttyp in the comparison
instead of s_ttyvp. It should not make any difference, because s_ttyvp
can only become null when the session leader already left, but it's
nicer to compare against the proper value.


# 194256 15-Jun-2009 ed

Make tcsetsid(3) work on revoked TTYs.

Right now the only way to make tcsetsid(3)/TIOCSCTTY work, is by
ensuring the session leader is dead. This means that an application that
catches SIGHUPs and performs a sleep prevents us from assigning a new
session leader.

Change the code to make it work on revoked TTYs as well. This allows us
to change init(8) to make the shutdown script run in a more clean
environment.


# 194079 12-Jun-2009 ed

Revert my previous change, because it reintroduces an old regression.

Because our rc scripts also open the /etc/ttyv* nodes, it revokes the
console, preventing startup messages from being displayed.

I really have to think about this. Maybe we should just give the console
its own TTY and let it build on top of other TTYs. I'm still not sure
what to do with input handling there.


# 194074 12-Jun-2009 ed

Prevent yet another staircase effect bug in the console device.

Even though I thought I fixed the staircase issue (and I was no longer
able to reproduce it), I got some reports of the issue still being
there. It turns out the staircase effect still occurred when
/dev/console was kept open while killing the getty on the same TTY
(ttyv0).

For some reason I can't figure out how the old TTY code dealt with that,
so I assume the issue has always been there. I only exposed it more by
merging consolectl with ttyv0, which means that the issue was present,
even on systems without a serial console.

I'm now marking the console device as being closed when closing the
regular TTY device node. This means that when the getty shuts down,
init(8) will open /dev/console, which means the termios attributes will
always be reset in this case.


# 193951 10-Jun-2009 kib

Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Use
vnode interlock to protect the knote fields [1]. The locking assumes
that shared vnode lock is held, thus we get exclusive access to knote
either by exclusive vnode lock protection, or by shared vnode lock +
vnode interlock.

Do not use kl_locked() method to assert either lock ownership or the
fact that curthread does not own the lock. For shared locks, ownership
is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared
lock not owned by curthread, causing false positives in kqueue subsystem
assertions about knlist lock.

Remove kl_locked method from knlist lock vector, and add two separate
assertion methods kl_assert_locked and kl_assert_unlocked, that are
supposed to use proper asserts. Change knlist_init accordingly.

Add convenience function knlist_init_mtx to reduce number of arguments
for typical knlist initialization.

Submitted by: jhb [1]
Noted by: jhb [2]
Reviewed by: jhb
Tested by: rnoland


# 193714 08-Jun-2009 kib

Do not dereference vp->v_rdev without holding any of dev_mtx or vnode
lock. Use code similar to devfs_fp_check(), but inlined to feet other
checks performed by ttyhook_register().

Reviewed by: ed


# 193018 29-May-2009 ed

Last minute TTY API change: remove mutex argument from tty_alloc().

I don't want people to override the mutex when allocating a TTY. It has
to be there, to keep drivers like syscons happy. So I'm creating a
tty_alloc_mutex() which can be used in those cases. tty_alloc_mutex()
should eventually be removed.

The advantage of this approach, is that we can just remove a function,
without breaking the regular API in the future.


# 192827 26-May-2009 ed

Get rid of M_TEMP.


# 192682 24-May-2009 ed

Block when initially opening a TTY multiple times.

In the original MPSAFE TTY code, I changed the behaviour by returning
EBUSY. I thought this made more sense, because it's basically a race to
see who gets the TTY first.

It turns out this is not a good change, because it also causes EBUSY to
be returned when another process is closing the TTY. This can happen
during startup, when /etc/rc (or one of its children) is still busy
draining its data and /sbin/init is attempting to open the TTY to spawn
a getty.

Reported by: bz
Tested by: bz


# 192080 14-May-2009 jeff

- Implement a lockless file descriptor lookup algorithm in
fget_unlocked().
- Save old file descriptor tables created on expansion until
the entire descriptor table is freed so that pointers may be
followed without regard for expanders.
- Mark the file zone as NOFREE so we may attempt to reference
potentially freed files.
- Convert several fget_locked() users to fget_unlocked(). This
requires us to manage reference counts explicitly but reduces
locking overhead in the common case.


# 191782 04-May-2009 ed

Remove unneeded check for SESS_LEADER().

We perform the same check ~10 lines above.


# 190847 08-Apr-2009 ed

Fix tty_wait_background() to comply with standards.

It turns out my handling of SIGTTOU and SIGTTIN didn't entirely comply
to the standards. It is true that in the SIGTTOU case we should not
return EIO when the signal is ignored/blocked, but in the SIGTTIN case
we must.

See also: POSIX issue 7 section 11.1.4


# 189222 01-Mar-2009 ed

Improve my previous changes to the TTY code: also remove memcpy().

It's better to just use internal language constructs, because it is
likely the compiler has a better opinion on whether to perform inlining,
which is very likely to happen to struct winsize.

Submitted by: Christoph Mallon <christoph mallon gmx de>


# 189167 28-Feb-2009 ed

Replace bcopy() calls inside the TTY layer with memcpy()/strlcpy().

In all these cases the buffers never overlap. Program names are also
likely to be shorter, so use a regular strlcpy() to copy p_comm.


# 188487 11-Feb-2009 ed

Serialize write() calls on TTYs.

Just like the old TTY layer, the current MPSAFE TTY layer does not make
any attempt to serialize calls of write(). Data is copied into the
kernel in 256 (TTY_STACKBUF) byte chunks. If a write() call occurs at
the same time, the data may interleave. This is especially likely when
the TTY starts blocking, because the output queue reaches the high
watermark.

I've implemented this by adding a new flag, TTY_BUSY_OUT, which is used
to mark a TTY as having a thread stuck in write(). Because I don't want
non-blocking processes to be possibly blocked by a sleeping thread, I'm
still allowing it to bypass the protection. According to this message,
the Linux kernel returns EAGAIN in such cases, but I think that's a
little too restrictive:

http://kerneltrap.org/index.php?q=mailarchive/linux-kernel/2007/5/2/85418/thread

PR: kern/118287


# 188147 05-Feb-2009 ed

Don't leave the console TTY constantly open.

When we leave the console TTY constantly open, we never reset the
termios attributes. This causes output processing, echoing, etc. not to
be reset to the proper values when going into single user mode after the
system has booted. It also causes nl-to-crnl-conversion not to take
place during shutdown, which causes a `staircase effect'.

This patch adds a new TTY flag, TF_OPENED_CONS, which is set when the
TTY is opened through /dev/console. Because the flags are only used by
the kernel and the pstat(8) utility, I've decided to renumber the TTY
flags. This shouldn't be an issue, because the TTY layer is not yet part
of a stable release.

Reported by: Mark Atkinson <atkin901 yahoo com>
Tested by: sepotvin


# 188096 03-Feb-2009 ed

Slightly improve the design of the TTY buffer.

The TTY buffers used the standard <sys/queue.h> lists. Unfortunately
they have a big shortcoming. If you want to have a double linked list,
but no tail pointer, it's still not possible to obtain the previous
element in the list. Inside the buffers we don't need them. This is why
I switched to custom linked list macros. The macros will also keep track
of the amount of items in the list. Because it doesn't use a sentinel,
we can just initialize the queues with zero.

In its simplest form (the output queue), we will only keep two
references to blocks in the queue, namely the head of the list and the
last block in use. All free blocks are stored behind the last block in
use.

I noticed there was a very subtle bug in the previous code: in a very
uncommon corner case, it would uma_zfree() a block in the queue before
calling memcpy() to extract the data from the block.


# 187722 26-Jan-2009 ed

Use the proper flag to let kern.ttys be executed without Giant.

Pointed out by: jhb


# 187671 24-Jan-2009 ed

Mark kern.ttys as MPSAFE.

sysctl now allows Giantless calls, so make kern.ttys use this. If it
needs Giant, it locks the proper TTY anyway.


# 186707 02-Jan-2009 ed

Fix a corner case in my previous commit.

Even though there are not many setups that have absolutely no console
device, make sure a close() on a TTY doesn't dereference a null pointer.


# 186706 02-Jan-2009 ed

Don't let /dev/console be revoked if the TTY below is being closed.

During startup some of the syscons TTY's are used to set attributes like
the screensaver and mouse options. These actions cause /dev/console to
be rendered unusable.

Fix the issue by leaving the TTY opened when it is used as the console
device.

Reported by: imp


# 186354 20-Dec-2008 ed

Let wchan names more closely match pre-MPSAFE TTY behaviour.

Right now the wchan strings "ttyinp" and "ttybgw" only differ one
character from the strings we used prior to MPSAFE TTY. Just rename them
back to their pre-MPSAFE TTY counterparts.

Also rename "ttylck" to "ttymtx", which should make it more clear that a
process is blocked on the TTY mutex, not some other form of locking.


# 186322 19-Dec-2008 ivoras

Further beautify the lock strings to be more pleasing to the eye and
self documenting within 6 characters.

Reviewed by: ed (older version)
Approved by: gnn (older version)


# 186285 18-Dec-2008 ivoras

Remove spaces in wait object names to make top (1) output prettier and
unbreak scripts that examine ps (1) output.

Reviewed by: ed
Approved by: gnn (mentor)


# 186056 13-Dec-2008 mav

Change ttyhook_register() second argument from thread to process pointer.
Thread was not really needed there, while previous ng_tty implementation
that used thread pointer had locking issues (using sx while holding mutex).


# 184771 08-Nov-2008 ed

Reduce the default baud rate of PTY's to 9600.

On RELENG_6 (and probably RELENG_7) we see our syscons windows and
pseudo-terminals have the following buffer sizes:

| LINE RAW CAN OUT IHIWT ILOWT OHWT LWT COL STATE SESS PGID DISC
| ttyv0 0 0 0 7680 6720 2052 256 7 OCcl 1146 1146 term
| ttyp0 0 0 0 7680 6720 1296 256 0 OCc 82033 82033 term

These buffer sizes make no sense, because we often have much more output
than input, but I guess having higher input buffer sizes improves
guarantees of the system.

On MPSAFE TTY I just sent both the input and output buffer sizes to 7
KB, which is pretty big on a standard FreeBSD install with 8 syscons
windows and some PTY's. Reduce the baud rate to 9600 baud, which means
we now have the following buffer sizes:

| LINE INQ CAN LIN LOW OUTQ USE LOW COL SESS PGID STATE
| ttyv0 1920 0 0 192 1984 0 199 7 2401 2401 Oil
| pts/0 1920 0 0 192 1984 0 199 5631 1305 2526 Oi

This is a lot smaller, but for pseudo-devices this should be good
enough. You need to do a lot of punching to fill up a 7.5 KB input
buffer. If it turns out things don't work out this way, we'll just
switch to 19200 baud.


# 184522 01-Nov-2008 ed

Clamp the values of t_column to 5 digits in `pstat -t' and `show all ttys'.

We often run into these very high column numbers when we run curses
applications, because they don't print any newlines. This messes up the
table output of `pstat -t'. If these numbers get really high, they
aren't of any use to the reader anyway. Convert them to `99999' when
they run out of bounds.


# 184521 01-Nov-2008 ed

Reimplement the /dev/console device node.

One of the pieces of code that I had left alone during the development
of the MPSAFE TTY layer, was tty_cons.c. This file actually has two
different functions:

- It contains low-level console input/output routines (cnputc(), etc).

- It creates /dev/console and wraps all its cdevsw calls to the
appropriate TTY.

This commit reimplements the second set of functions by moving it
directly into the TTY layer. /dev/console is now a character device node
that's basically a regular TTY, but does a lookup of `si_drv1' each time
you open it. d_write has also been changed to call log_console().
d_close() is not present, because we must make sure we don't revoke the
TTY after writing a log message to it.

Even though I'm not convinced this is in line with the future directions
of our console code, it is a good move for now. It removes recursive
locking from the top half of the TTY layer. The previous implementation
called into the TTY layer with Giant held.

I'm renaming tty_cons.c to kern_cons.c now. The code hardly contains any
TTY related bits, so we'd better give it a less misleading name.

Tested by: Andrzej Tobola <ato iem pw edu pl>,
Carlos A.M. dos Santos <unixmania gmail com>,
Eygene Ryabinkin <rea-fbsd codelabs ru>


# 184128 21-Oct-2008 thompsa

Fix spelling mistake in the last rev.


# 184127 21-Oct-2008 thompsa

If we have getc_inject hooked then the outq buffer is inaccessible to the
driver so skip the drain rather than waiting indefinitely.

Reviewed by: ed


# 183922 15-Oct-2008 ed

Import some improvements to the TTY code from the MPSAFE TTY branch.

- Change the ddb(4) commands to be more useful (by thompsa@):
- `show ttys' is now called `show all ttys'. This command will now
also display the address where the TTY data structure resides.
- Add `show tty <addr>', which dumps the TTY in a readable form.

- Place an upper bound on the TTY buffer sizes. Some drivers do not want
to care about baud rates. Protect these drivers by preventing the TTY
buffers from getting enormous. Right now we'll just clamp it to 64K,
which is pretty high, taking into account that these buffers are only
used by the built-in discipline.

- Only call ttydev_leave() when needed. Back in April/May the TTY
reference counting mechanism was a little different, which required us
to call ttydev_leave() each time we finished a cdev operation.
Nowadays we only need to call ttydev_leave() when we really mark it as
being closed.

- Improve return codes of read() and write() on TTY device nodes.

- Make sure we really wake up all blocked threads when the driver calls
tty_rel_gone(). There were some possible code paths where we didn't
properly wake up any readers/writers.

- Add extra assertions to prevent sleeping on a TTY that has been
abandoned by the driver.

- Use ttydev_cdevsw as a more reliable method to figure out whether a
device node is a real TTY device node.

Obtained from: //depot/projects/mpsafetty/...
Reviewed by: thompsa


# 183386 26-Sep-2008 ed

Don't forget to initialize `int error' in ttydev_open().

I've had some reports in the past that opening an already opened TTY
through, for example, /dev/tty can fail with random error codes. Looking
at ttydev_open(), I can see there is a way `error' is returned without
initialising it. Even though I haven't had any confirmation this fixes
the bug, I'll fix it anyway.

Reported by: Andrzej Tobola <ato iem pw edu pl>


# 183324 24-Sep-2008 ed

Fix a crash when calling tty_rel_free() while draining during closure.

Yesterday I got two reports of potential crashes, related to TTY
deallocation during device closure. When a thread is in TF_OPENCLOSE,
draining its output upon closure, we should not allow calls to
tty_rel_free() to happen at the same time. This could cause the TTY to
be torn down twice.

PR: kern/127561
Reported by: KOIE Hidetaka <koie suri co jp>
Discussed with: thompsa


# 183276 22-Sep-2008 ed

Introduce a hooks layer for the MPSAFE TTY layer.

One of the features that prevented us from fixing some of the TTY
consumers to work once again, was an interface that allowed consumers to
do the following:

- `Sniff' incoming data, which is used by the snp(4) driver.

- Take direct control of the input and output paths of a TTY, which is
used by ng_tty(4), ppp(4), sl(4), etc.

There's no practical advantage in committing a hooks layer without
having any consumers. In P4 there is a preliminary port of snp(4) and
thompsa@ is busy porting ng_tty(4) to this interface. I already want to
have it in the tree, because this may stimulate others to work on the
remaining modules.

Discussed with: thompsa
Obtained from: //depot/projects/mpsafetty/...


# 183076 16-Sep-2008 ed

Fix minor TTY API inconsistency.

Unlike tty_rel_gone() and tty_rel_sess(), the tty_rel_pgrp() routine
does not unlock the TTY. I once had the idea to make the code call
tty_rel_pgrp() and tty_rel_sess(), picking up the TTY lock once. This
turned out a little harder than I expected, so this is how it works now.

It's a lot easier if we just let tty_rel_pgrp() unlock the TTY, because
the other routines do this anyway.


# 182815 06-Sep-2008 ed

Make TIOCCONS use priv_check() instead of checking /dev/console permissions.

As discussed with Robert on IRC, checking the permissions on
/dev/console to see if we can call TIOCCONS could be unreliable. When we
run a chroot() without a devfs instance mounted inside, it won't
actually check the permissions on the device node inside the devfs
instance.

Using the already existing PRIV_TTY_CONSOLE for this seems like a better
idea.

Approved by: rwatson


# 182764 04-Sep-2008 ed

Implement pts(4) packet mode.

As reported by several users on the mailing lists, applications like
screen(1) fail to properly handle ^S and ^Q characters. This was because
MPSAFE TTY didn't implement packet mode (TIOCPKT) yet. Add basic packet
mode support to make these applications work again.

Obtained from: //depot/projects/mpsafetty/...


# 182471 30-Aug-2008 ed

Fix some edge cases in the TTY queues:

- In the current design, when a TTY decreases its baud rate, it tries to
shrink the queues. This may not always be possible, because it will
not free any blocks that are still filled with data.

Change the TTY queues to store a `quota' value as well, which means it
will not free any blocks when changing the baud rate, but when placing
blocks back into the queue. When the amount of blocks exceeds the
quota, they get freed.

It also fixes some edge cases, where TIOCSETA during read()/
write()-calls could actually make the queue a tiny bit bigger than in
normal cases.

- Don't leak blocks of memory when calling TIOCSETA when the device
driver abandons the TTY while allocating memory.

- Create ttyoutq_init() and ttyinq_init() to initialize the queues,
instead of initializing them by hand. The new TTY snoop driver also
creates an outq, so it's good to have a proper interface to do this.

Obtained from: //depot/projects/mpsafetty/...


# 182296 27-Aug-2008 ed

Properly unlock the init/lock-state devices when invoking TIOCSETA.

For some reason a return-statement crept into this code, where it
shouldn't belong. This means we didn't properly unlock the TTY before
returning to userspace.

Submitted by: Tor Egge <tor egge cvsup no freebsd org>


# 182023 22-Aug-2008 ed

Fix two small bugs in tcsetattr().

- According to POSIX, tcsetattr() must not fail when any of the bits in
the structure are unsupported, but it must leave the unsupported flags
alone.

- The CIGNORE flag (set by TCSASOFT, extension) was not cleared from
c_cflag, which means using it would cause it to be applied during its
entire lifespan. Eventually make sure we clear the flag.

I don't really like CIGNORE, but I think we must keep it alive right
now. With our new TTY layer, we don't actually need this mechanism,
because if you leave c_cflag, c_ispeed and c_ospeed alone, we won't make
a call into the device driver anyway.

Reported by: naddy
Tested by: naddy


# 181993 22-Aug-2008 ed

Prevent VSTART flooding when turning on software flow control.

It turned out we transmitted VSTART after each successful read on a TTY
when software flow control was turned on. This was because of a very
evil bug where we tested the TF_HIWAT_IN flag the other way around.

Reported by: Christian Weisgerber <naddy mips inka de>


# 181905 20-Aug-2008 ed

Integrate the new MPSAFE TTY layer to the FreeBSD operating system.

The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

The old TTY layer has a driver model that is not abstract enough to
make it friendly to use. A good example is the output path, where the
device drivers directly access the output buffers. This means that an
in-kernel PPP implementation must always convert network buffers into
TTY buffers.

If a PPP implementation would be built on top of the new TTY layer
(still needs a hooks layer, though), it would allow the PPP
implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

With the old TTY layer, it isn't entirely safe to destroy TTY's from
the system. This implementation has a two-step destructing design,
where the driver first abandons the TTY. After all threads have left
the TTY, the TTY layer calls a routine in the driver, which can be
used to free resources (unit numbers, etc).

The pts(4) driver also implements this feature, which means
posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

One of the major improvements is the per-TTY mutex, which is expected
to improve scalability when compared to the old Giant locking.
Another change is the unbuffered copying to userspace, which is both
used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from: //depot/projects/mpsafetty/...
Approved by: philip (ex-mentor)
Discussed: on the lists, at BSDCan, at the DevSummit
Sponsored by: Snow B.V., the Netherlands
dcons(4) fixed by: kan


# 180801 25-Jul-2008 ed

Move ttyinfo() into its own C file.

The ttyinfo() routine generates the fancy output when pressing ^T. Right
now it is stored in tty.c. In the MPSAFE TTY code it is already stored
in tty_info.c. To make integration of the MPSAFE TTY code a little
easier, take the same approach.

This makes the TTY code a little bit more readable, because having the
proc_*/thread_* routines in tty.c is very distractful.

Approved by: philip (mentor)


# 179251 23-May-2008 kib

Rev. 1.274 put the ttyrel() call before the destroy_dev() in the
ttyfree(), freeing the tty. Since destroy_dev() may call d_purge()
cdevsw method, that is the ttypurge() for the tty, the code ends up
accessing freed tty structure.

Put the ttyrel() after destroy_dev() in the ttyfree. To prevent the
panic the rev. 1.274 provided fix for, check the TS_GONE in sysctl
handler and refuse to provide information on such tty.

Reported, debugging help and tested by: pho
DIscussed with and reviewed by: jhb
MFC after: 1 week


# 179250 23-May-2008 kib

The dev_refthread() in the tty_gettp() may fail, because Giant is taken
in the giant_trick routines after the dev_refthread increments the
si_threadcount. Remove assert, do not perform dev_relthread() for failed
dev_refthread(), and handle failure in the tty_gettp() callers (cdevsw
tty methods).

Before kern_conf.c 1.210 and 1.211, the kernel usually paniced in the
giant_trick routines dereferencing NULL cdevsw, not taking this fault.

Reported by: Vince Hoffman <jhary unsane co uk>
Debugging help and tested by: pho
Reviewed by: jhb
MFC after: 1 week


# 179249 23-May-2008 kib

Use the t_state for the TS_GONE test.

Submitted by: jhb
MFC after: 3 days


# 179246 23-May-2008 ed

Move TTY unrelated bits out of <sys/tty.h>.

For some reason, the <sys/tty.h> header file also contains routines of the
clists and console that are used inside the TTY layer. Because the clists
are not only used by the TTY layer (example: various input drivers), we'd
better move the entire clist programming interface into <sys/clist.h>. Also
remove a declaration of nonexistent variable.

The <sys/tty.h> header also contains various definitions for the console
code (tty_cons.c). Also move these to <sys/cons.h>, because they are
not implemented inside the TTY layer.

While there, create separate malloc pools for the clist and console code.

Approved by: philip (mentor)


# 178219 15-Apr-2008 davidxu

Implement POSIX function tcgetsid() which returns session id.

PR: stand/107561


# 177368 19-Mar-2008 jeff

- Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from
requiring the per-process spinlock to only requiring the process lock.
- Reflect these changes in the proc.h documentation and consumers throughout
the kernel. This is a substantial reduction in locking cost for these
fields and was made possible by recent changes to threading support.


# 175152 08-Jan-2008 jhb

Close a race in the kern.ttys sysctl handler that resulted in panics in
dev2udev() when a tty was being detached concurrently with the sysctl
handler:
- Hold the 'tty_list_mutex' lock while we read all the fields out of the
struct tty for copying out later. Previously the pty(4) and pts(4)
destroy routines could set t_dev to NULL, drop their reference on the
tty and destroy the cdev while the sysctl handler was attempting to
invoke dev2udev() on the cdev being destroyed. This happened when the
sysctl handler read the value of t_dev prior to it being set to NULL
either due to it being stale or due to timing races. By holding the
list lock we guarantee that the destroy routines will block in ttyrel()
in that case and not destroy the cdev until after we've copied all of our
data. We may see a NULL cdev pointer or we may see the previous value,
but the previous value will no longer point to a destroyed cdev if we
see it.
- Fix the ttyfree() routine used by tty device drivers in their detach
methods to use ttyrel() on the tty so we don't leak them. Also, fix it
to use the same order of operations as pty/pts destruction (set t_dev
NULL, ttyrel(), destroy_dev()) so it cooperates with the sysctl handler.

MFC after: 3 days
Tested by: avatar


# 171517 20-Jul-2007 kib

ttyfree() frees the cdev(). But if there are pending kevents,
filt_ttyrdetach() etc would later attempt to dereference cdev->si_tty,
causing a 0xdeadc0de dereference. Change kn_hook value from cdev to
struct tty to avoid dereferencing freed cdev.

In ttygone(), wake up select(), sigio and kevent() users in addition
to the queue sleepers.

Return EV_EOF from kevent filters if TS_GONE is set.

Submitted by: peter
Tested by: Peter Holm
Approved by: re (kensmith)
MFC after: 2 weeks


# 171126 30-Jun-2007 jeff

- Use rufetchcalc() rather than calcru() in ttyinfo so that we get
correct system and user time stats.

Approved by: re
Reported by: kris
Discussed with: Attilio


# 170581 11-Jun-2007 cognet

Re-acquire the PROC_SLOCK before calling calcru(), and release it after,
since calcru() expects it to be locked.

Reviewed by: attilio


# 170482 09-Jun-2007 mjacob

The new compiler can't quite follow the logic of has_stime and
complains about using uninitialized tags in stime.


# 170301 04-Jun-2007 jeff

Commit 9/14 of sched_lock decomposition.
- Attempt to return the ttyinfo() selection algorithm to something sane
as it has been broken and disabled for some time. Adapt this algorithm
in such a way that it does not conflict with per-cpu scheduler locking.

Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)


# 165367 20-Dec-2006 mbr

Back out rev. 1.266. The real cause for the recent panics has been fixed
in rev. 1.267 and there is no need to keep this test.


# 165358 19-Dec-2006 mbr

Giant might have been temporarily dropped while waiting for proctree_lock, allowing for an
intervening tty_close() that cleared tp->t_session.

Submitted by: tegge
MFC: 1 day


# 165349 19-Dec-2006 mbr

Add the tp->t_refcnt validity check back. There are still some race
conditions where tp->t_refcnt can go to zero.


# 164936 06-Dec-2006 julian

Threading cleanup.. part 2 of several.

Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs. Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.

Tested by David Xu, and Dan Eischen using libthr and libpthread.


# 164054 06-Nov-2006 tegge

Don't drop reference to tty in tty_close() if TS_ISOPEN is already cleared.

Reviewed by: bde


# 164033 06-Nov-2006 rwatson

Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges. These may
require some future tweaking.

Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>


# 163709 26-Oct-2006 jb

Make KSE a kernel option, turned on by default in all GENERIC
kernel configs except sun4v (which doesn't process signals properly
with KSE).

Reviewed by: davidxu@


# 162833 30-Sep-2006 mbr

Any call of tty_close() with a tty refcount of <= 1 is wrong and we will
free the tty in this case. This is a workaround until the underlaying
devfs/tty problems are fixed.

MFC after: 1 day


# 162575 23-Sep-2006 mbr

Check for tp->t_refcnt == 0 before doing anything in tty_open().

PR: 103520
MFC after: 1 week


# 162520 21-Sep-2006 mbr

Back out rev. 1.258. The real race cause has been fixed
in rev. 1.241 of kern_proc.c.

Requested by: jhb


# 162203 10-Sep-2006 mbr

Fix locking race in ttymodem(). The locking of the proctree happens too late
and opens a small race window before tp->t_session->s_leader is accessed. In case
tp->t_session has just been set to NULL elsewhere, we get a panic().

This fix is a bandaid until someone else fixes the whole locking in the tty subsystem.
Definitly more work needs to be done.

MFC after: 1 week
Reviewed by: mlaier
PR: kern/103101


# 154170 10-Jan-2006 phk

Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43)
to COMPAT_43TTY.

Add COMPAT_43TTY to NOTES and */conf/GENERIC

Compile tty_compat.c only under the new option.

Spit out
#warning "Old BSD tty API used, please upgrade."
if ioctl_compat.h gets #included from userland.


# 154018 04-Jan-2006 phk

Deorbit ttymalloc() in preference for ttyalloc()


# 154013 04-Jan-2006 phk

Use MTX_SYSINIT to set up the tty list mutex.


# 151740 27-Oct-2005 jhb

Revert most of revision 1.235 and fix the problem a different way. We
can't acquire an sx lock in ttyinfo() because ttyinfo() can be called
from interrupt handlers (such as atkbd_intr()). Instead, go back to
locking the process group while we pick a thread to display information for
and hold that lock until after we drop sched_lock to make sure the
process doesn't exit out from under us. sched_lock ensures that the
specific thread from that process doesn't go away. To protect against
the process exiting after we drop the proc lock but before we dereference
it to lookup the pid and p_comm in the call to ttyprintf(), we now copy
the pid and p_comm to local variables while holding the proc lock.

This problem was found by the recently added TD_NO_SLEEPING assertions for
interrupt handlers.

Tested by: emaste
MFC after: 1 week


# 151389 16-Oct-2005 phk

Use new functions to call into drivers methods.


# 151388 16-Oct-2005 phk

Make ttyconsolemode() call ttsetwater() so that drivers don't have to.


# 151383 16-Oct-2005 phk

Eliminate two unused arguments to ttycreate().


# 147730 01-Jul-2005 ssouhlal

Fix the recent panics/LORs/hangs created by my kqueue commit by:

- Introducing the possibility of using locks different than mutexes
for the knlist locking. In order to do this, we add three arguments to
knlist_init() to specify the functions to use to lock, unlock and
check if the lock is owned. If these arguments are NULL, we assume
mtx_lock, mtx_unlock and mtx_owned, respectively.

- Using the vnode lock for the knlist locking, when doing kqueue operations
on a vnode. This way, we don't have to lock the vnode while holding a
mutex, in filt_vfsread.

Reviewed by: jmg
Approved by: re (scottl), scottl (mentor override)
Pointyhat to: ssouhlal
Will be happy: everyone


# 145014 13-Apr-2005 avatar

According to the comment in struct tty, t_modem is optional; hence we should
guard against NULL t_modem entry. Otherwise, driver doesn't have t_modem
callback implemented(such like sys/dev/usb/ucycom.c) would panic when
someone opens the driver's associated tty device.

Reviewed by: phk, sam (mentor)


# 144154 26-Mar-2005 phk

Make (some) serial ports implement the PPS-API again. This change
appearantly fell out during the tty code cleanup.


# 143438 11-Mar-2005 peter

Replace my previous change for 32 bit systems with hz > 169 with Bruce's
simpler one.


# 143437 11-Mar-2005 peter

Make the tty vmin/vtime timeouts work for hz > 169 on 32 bit machines.


# 143238 07-Mar-2005 phk

Add placeholder mutex argument to new_unrhdr().


# 139449 30-Dec-2004 jhb

Call tty_close() at the very end of ttyclose() since otherwise NULL
deferences can occur since tty_close() may end up freeing the tty structure
if it drops the last reference to it.

Glanced at by: phk


# 139086 20-Dec-2004 phk

fix a misleading sleep identifier.


# 137779 16-Nov-2004 dds

Improvements and fixes in the 1.241 commit:

- Have TS_ZOMBIE ttys return POLLHUP instead of POLLERR
- Remove unneeded POLLWRNORM (old bug)
- TS_ZOMBIE ttys will set POLLIN and POLLRDNORM
- Do not call selrecord in TS_ZOMBIE ttys

PR: kern/73821
Reviewed by: bde
MFC after: 4 weeks


# 137577 11-Nov-2004 dds

Return POLLERR rather than POLLIN/POLLOUT on TS_ZOMBIE ttys.

PR: kern/73821
MFC after: 4 weeks


# 137167 03-Nov-2004 phk

Restore TTYDEF_LFLAG to set echo bits.


# 136680 18-Oct-2004 phk

Add new function ttyinitmode() which sets our systemwide default
modes on a tty structure.

Both the ".init" and the current settings are initialized allowing
the function to be used both at attach and open time.

The function takes an argument to decide if echoing should be enabled.
Echoing should not be enabled for regular physical serial ports
unless they are consoles, in which case they should be configured
by ttyconsolemode() instead.

Use the new function throughout.


# 136553 15-Oct-2004 phk

Make pty's always come up in echo mode.


# 136514 14-Oct-2004 phk

Add missing chunk of code to enforce the lock-bits of termios.

This solves the problem where serial consoles suddenly required
DCD to be asserted.

Reported by: Randy Bush <randy@psg.com>


# 136456 12-Oct-2004 phk

Don't call driver close unless we have one.


# 136152 05-Oct-2004 jhb

Rework how we store process times in the kernel such that we always store
the raw values including for child process statistics and only compute the
system and user timevals on demand.

- Fix the various kern_wait() syscall wrappers to only pass in a rusage
pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
times it needs rather than calling getrusage() twice with associated
stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
for user, system, and interrupt time as well as a bintime of the total
runtime. A new p_rux field in struct proc replaces the same inline fields
from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux
field in struct proc contains the "raw" child time usage statistics.
ruadd() has been changed to handle adding the associated rusage_ext
structures as well as the values in rusage. Effectively, the values in
rusage_ext replace the ru_utime and ru_stime values in struct rusage. These
two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
calculates appropriate timevals for user and system time as well as updating
the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a
copy of the process' p_rux structure to compute the timevals after updating
the runtime appropriately if any of the threads in that process are
currently executing. It also now only locks sched_lock internally while
doing the rux_runtime fixup. calcru() now only requires the caller to
hold the proc lock and calcru1() only requires the proc lock internally.
calcru() also no longer allows callers to ask for an interrupt timeval
since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
calling calcru1() on p_crux. Note that this means that any code that wants
child times must now call this function rather than reading from p_cru
directly. This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
proctree lock is used to protect the process group rather than the process
group lock. By holding this lock until the end of the function we now
ensure that the process/thread that we pick to dump info about will no
longer vanish while we are trying to output its info to the console.

Submitted by: bde (mostly)
MFC after: 1 month


# 135964 30-Sep-2004 phk

Assign a global unit number for the tty slave devices (init/lock) using
the new subr_unit.c code.

For now assert Giant in ttycreate() and ttyfree(). It is not obvious that
it will ever pay off to lock these with anything else.


# 135905 28-Sep-2004 phk

Add functions to create and free the "tty-ness" of a serial port in a
generic way. This code will allow a similar amount of code to be
removed from most if not all serial port drivers.

Add generic cdevsw for tty devices.

Add generic slave cdevsw for init/lock devices.

Add ttypurge function which wakes up all know generic sleep
points in the tty code, and calls into the hw-driver if it
provides a method.

Add ttycreate function which creates tty device and optionally
cua device. In both cases .init/.lock devices are created
as well.

Change ttygone() slightly to also call the hw driver provided
purge routine.

Add ttyfree() which will purge and destroy the cdevs.

Add ttyconsole mode for setting console friendly termios
on a port.


# 135725 24-Sep-2004 phk

Hold threadcount while throbbing cdevsw in our underlying driver.

This is a bit heavyhanded, and will be simplified once the tty code
learns to properly deal with disappearing hw and drivers.


# 135432 18-Sep-2004 phk

Initialize new ttys a bit more.

Check TS_GONE flag for gone-ness.


# 135378 17-Sep-2004 phk

Add ttyopen and ttyclose functions which will do the right stuff for
most if not all of our tty drivers in the future.

Centralizing this stuff enables us to remove about 100 lines of
almost but not quite perfectly copy&paste code from each tty driver.


# 135363 17-Sep-2004 phk

Add ttyalloc() which in due time will be the successor to ttymalloc(),
but without the "struct tty *" argument.


# 133741 15-Aug-2004 jmg

Add locking to the kqueue subsystem. This also makes the kqueue subsystem
a more complete subsystem, and removes the knowlege of how things are
implemented from the drivers. Include locking around filter ops, so a
module like aio will know when not to be unloaded if there are outstanding
knotes using it's filter ops.

Currently, it uses the MTX_DUPOK even though it is not always safe to
aquire duplicate locks. Witness currently doesn't support the ability
to discover if a dup lock is ok (in some cases).

Reviewed by: green, rwatson (both earlier versions)


# 132226 15-Jul-2004 phk

Preparation commit for the tty cleanups that will follow in the near
future:

rename ttyopen() -> tty_open() and ttyclose() -> tty_close().

We need the ttyopen() and ttyclose() for the new generic cdevsw
functions for tty devices in order to have consistent naming.


# 131981 11-Jul-2004 phk

Introduce ttygone() which indicates that the hardware is detached.

Move dtrwait logic to the generic TTY level.


# 131134 26-Jun-2004 phk

Pick the hotchar out of the tty structure instead of caching private
copies.

No current line disciplines have a dynamically changing hotchar, and
expecting to receive anything sensible during a change in ldisc is
insane so no locking of the hotchar field is necessary.


# 131130 26-Jun-2004 phk

Fix line discipline switching issues: If opening a new ldisc fails,
we have to revert to TTYDISC which we know will successfully open
rather than try the previous ldisc which might also fail to open.

Do not let ldisc implementations muck about with ->t_line, and remove
code which checks for reopens, it should never happen.

Move ldisc->l_hotchar to tty->t_hotchar and have ldisc implementation
initialize it in their open routines. Reset to zero when we enter
TTYDISC. ("no" should really be -1 since zero could be a valid
hotchar for certain old european mainframe protocols.)


# 131092 25-Jun-2004 phk

Add two new methods to struct tty: One for manipulating BREAK condition
and one for fiddling modem-control signals.

Add generic code to deal with the relevant ioctls if these methods are
present.


# 131045 24-Jun-2004 phk

#include <sys/serial.h>


# 131042 24-Jun-2004 phk

Use CTASSERT to enforce the relationship between the new serial port
modem definitions and the old definitions from ioctls.


# 130892 21-Jun-2004 phk

Put the pre FreeBSD-2.x tty compat code under BURN_BRIDGES.


# 130585 16-Jun-2004 phk

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


# 130344 11-Jun-2004 phk

Deorbit COMPAT_SUNOS.

We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither
a sparc32 port nor a SunOS4.x compatibility desire these days.


# 130261 09-Jun-2004 phk

Reference count struct tty.

Add two new functions: ttyref() and ttyrel(). ttymalloc() creates a struct
tty with a reference count of one. when ttyrel sees the count go to zero,
struct tty is freed.

Hold references for open ttys and for ttys which are controlling terminal
for sessions.

Until drivers start using ttyrel(), this commit will make no difference.


# 130203 07-Jun-2004 phk

Make linesw[] an array of pointers to linedesc instead of an array of
linedisc.


# 130096 04-Jun-2004 phk

Centralize the line discipline optimization determination in a function
called ttyldoptim().

Use this function from all the relevant drivers.

I belive no drivers finger linesw[] directly anymore, paving the way for
locking and refcounting.


# 130095 04-Jun-2004 phk

Manual edits to change linesw[]-frobbing to ttyld_*() calls.


# 130077 04-Jun-2004 phk

Machine generated patch which changes linedisc calls from accessing
linesw[] directly to using the ttyld...() functions

The ttyld...() functions ar inline so there is no performance hit.


# 130056 04-Jun-2004 phk

Get rid of ttyregister(). All drivers now use ttymalloc() for struct
tty, so now we stand a chance of implementing refcounting and getting
rid of the damn things again.


# 129943 01-Jun-2004 phk

Introduce a ttyioctl() cdevsw default function.


# 128019 07-Apr-2004 imp

Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson


# 126078 21-Feb-2004 phk

Device megapatch 3/6:

Add missing D_TTY flags to various drivers.

Complete asserts that dev_t's passed to ttyread(), ttywrite(),
ttypoll() and ttykqwrite() have (d_flags & D_TTY) and a struct tty
pointer.

Make ttyread(), ttywrite(), ttypoll() and ttykqwrite() the default
cdevsw methods for D_TTY drivers and remove the explicit initializations
in various drivers cdevsw structures.


# 125794 13-Feb-2004 green

T -CURRENT DO NOT CRASH UPON ^T K PLZ THX.

Also, use sched_pctcpu() instead of assuming td->td_kse is non-NULL.


# 125416 04-Feb-2004 rwatson

A variety of further cleanups to ttyinfo():

- Rename temporary variable names ("tmp", "tmp2") to more informative
names ("load", "pctcpu", "rss", ...)

- Unclutter indentation and return paths: rather than lots of nested
ifs, simply return earlier if it's not going to work out. Simplify
general structure and avoid "deep" code.

- Comment on the thread/process selection and locking.

- Correct handling of "running"/"runnable" states, avoid "unknown"
that people were seeing for running processes. This was due to
a misunderstanding of the more complex state machine / inhibitors
behavior of KSE.

- Do perform ttyinfo() printing on KSE (P_SA) processes, it seems
generally to work.

While I initially attempted to formulate this as two commits (one
layout, the other content), I concluded that the layout changes were
really structural changes.

Many elements submitted by: bde


# 124268 08-Jan-2004 rwatson

Improve the expressiveness of ttyinfo (^T) when dealing with threads
in slightly less usual states:

If the thread is on a run queue, display "running" if the thread is
actually running, otherwise, "runnable".

If the thread is sleeping, and it's on a sleep queue, display the
name of the queue, otherwise "unknown" -- previously, in this situation
we would display "iowait".

If the thread is waiting on a lock, display *lockname.

If the thread is suspended, display "suspended" -- previously, in
this situation we would display "iowait".

If the thread is waiting for an interrupt, display "intrwait" --
previously, in this situation we would display "iowait".

If the thread is in a state not handled by the above, display
"unknown" -- previously, we would print "iowait".

Among other things, this avoids displaying "iowait" when the foreground
process turns out to be suspended waiting for a debugger to properly
attach.


# 122352 09-Nov-2003 tanimura

- Implement selwakeuppri() which allows raising the priority of a
thread being waken up. The thread waken up can run at a priority as
high as after tsleep().

- Replace selwakeup()s with selwakeuppri()s and pass appropriate
priorities.

- Add cv_broadcastpri() which raises the priority of the broadcast
threads. Used by selwakeuppri() if collision occurs.

Not objected in: -arch, -current


# 116663 22-Jun-2003 iedowse

Use a new message buffer `consmsgbuf' to forward messages to a
TIOCCONS console (e.g. xconsole) via a timeout routine instead of
calling into the tty code directly from printf(). This fixes a
number of cases where calling printf() at the wrong time (such as
with locks held) would cause a panic if xconsole is running.

The TIOCCONS message buffer is 8k in size by default, but this can
be changed with the kern.consmsgbuf_size sysctl. By default, messages
are checked for 5 times per second. The timer runs and the buffer
memory remains allocated only at times when a TIOCCONS console is
active.

Discussed on: freebsd-arch


# 116361 14-Jun-2003 davidxu

Rename P_THREADED to P_SA. P_SA means a process is using scheduler
activations.


# 116182 10-Jun-2003 obrien

Use __FBSDID().


# 114985 13-May-2003 ps

p_sigignore moved into struct sigacts. move one which was missed.

Approved by: re (scottl)


# 114983 13-May-2003 jhb

- Merge struct procsig with struct sigacts.
- Move struct sigacts out of the u-area and malloc() it using the
M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
and thread_stopped() are now MP safe.

Reviewed by: arch@
Approved by: re (rwatson)


# 113637 17-Apr-2003 jhb

- Use a local struct proc variable to improve readability.
- Use a local variable to close a minor race when determining if the wmesg
printed out needs a prefix such as when a thread is blocked on a lock.


# 112888 31-Mar-2003 jeff

- Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with
a follow on commit to kern_sig.c
- signotify() now operates on a thread since unmasked pending signals are
stored in the thread.
- PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.


# 111899 05-Mar-2003 das

Make TTYHOG tunable.

Reviewed by: mike (mentor)


# 111585 27-Feb-2003 julian

Change the process flags P_KSES to be P_THREADED.
This is just a cosmetic change but I've been meaning to do it for about a year.


# 111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 111002 16-Feb-2003 phk

Remove #include <sys/dkstat.h>


# 110996 16-Feb-2003 phk

Move the tty related statistics counters to live with the tty code.


# 110630 10-Feb-2003 tjr

Lock the proc around accessing p_siglist in ttycheckoutq() in the
unused wait != 0 case.


# 109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 104393 03-Oct-2002 truckman

In an SMP environment post-Giant it is no longer safe to blindly
dereference the struct sigio pointer without any locking. Change
fgetown() to take a reference to the pointer instead of a copy of the
pointer and call SIGIO_LOCK() before copying the pointer and
dereferencing it.

Reviewed by: rwatson


# 104387 02-Oct-2002 jhb

Rename the mutex thread and process states to use a more generic 'LOCK'
name instead. (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of
TD_ON_MUTEX()) Eventually a turnstile abstraction will be added that
will be shared with mutexes and other types of locks. SLOCK/TDI_LOCK will
be used internally by the turnstile code and will not be specific to
mutexes. Making the change now ensures that turnstiles can be dropped
in at a later date without affecting the ABI of userland applications.


# 104306 01-Oct-2002 jmallett

Back our kernel support for reliable signal queues.

Requested by: rwatson, phk, and many others


# 104245 30-Sep-2002 jmallett

(Forced commit, to clarify previous commit of ksiginfo/signal queue code.)

I've added a structure, kernel-private, to represent a pending or in-delivery
signal, called `ksiginfo'. It is roughly analogous to the basic information
that is exported by the POSIX interface 'siginfo_t', but more basic. I've
added functions to allocate these structures, and further to wrap all signal
operations using them.

Once the operations are wrapped, I've added a TailQ (see queue(3)) of these
structures to 'struct proc', and all pending signals are in that TailQ. When
a signal is being delivered, it is dequeued from the list. Once I finish
the spreading of ksiginfo throughout the tree, the dequeued structure will be
delivered to the process in question, whereas currently and normally, the
signal number is what is used.


# 104233 30-Sep-2002 jmallett

First half of implementation of ksiginfo, signal queues, and such. This
gets signals operating based on a TailQ, and is good enough to run X11,
GNOME, and do job control. There are some intricate parts which could be
more refined to match the sigset_t versions, but those require further
evaluation of directions in which our signal system can expand and contract
to fit our needs.

After this has been in the tree for a while, I will make in kernel API
changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo
more robustly, such that we can actually pass information with our
(queued) signals to the userland. That will also result in using a
struct ksiginfo pointer, rather than a signal number, in a lot of
kern_sig.c, to refer to an individual pending signal queue member, but
right now there is no defined behaviour for such.

CODAFS is unfinished in this regard because the logic is unclear in
some places.

Sponsored by: New Gold Technology
Reviewed by: bde, tjr, jake [an older version, logic similar]


# 103216 11-Sep-2002 julian

Completely redo thread states.

Reviewed by: davidxu@freebsd.org


# 100337 18-Jul-2002 julian

Clear up confusion in ugly code. ^T gave wrong results for RSS.
I misinterpretted this code when changing it to handle threads.
(there are still issues here)
Submitted by: Ian Dowse <iedowse@maths.tcd.ie>


# 99072 29-Jun-2002 julian

Part 1 of KSE-III

The ability to schedule multiple threads per process
(one one cpu) by making ALL system calls optionally asynchronous.
to come: ia64 and power-pc patches, patches for gdb, test program (in tools)

Reviewed by: Almost everyone who counts
(at various times, peter, jhb, matt, alfred, mini, bernd,
and a cast of thousands)

NOTE: this is still Beta code, and contains lots of debugging stuff.
expect slight instability in signals..


# 98493 20-Jun-2002 iedowse

Display the mutex name in the ^T status line if the selected thread
is blocked on a mutex. Prepend a '*' to distinguish this case as
is done in top(1).


# 97673 31-May-2002 des

Nit: kern.ttys is of type S,xtty, not S,tty.


# 97401 28-May-2002 des

Add some checks to prevent NULL dereferences.

Submitted by: jhay


# 97379 28-May-2002 des

Add NAI copyright.


# 97366 28-May-2002 des

Introduce struct xtty, used when exporting tty information to userland.
Make kern.ttys export a struct xtty rather than struct tty. Since struct
tty is no longer exposed to userland, remove the dev_t / udev_t hack.

Sponsored by: DARPA, NAI Labs


# 97284 25-May-2002 des

ANSIfy (significant portions were already partly ANSIfied)


# 97283 25-May-2002 des

Remove register.


# 97282 25-May-2002 des

Automated whitespace cleanup.


# 96122 06-May-2002 alfred

Make funsetown() take a 'struct sigio **' so that the locking can
be done internally.

Ensure that no one can fsetown() to a dying process/pgrp. We need
to check the process for P_WEXIT to see if it's exiting. Process
groups are already safe because there is no such thing as a pgrp
zombie, therefore the proctree lock completely protects the pgrp
from having sigio structures associated with it after it runs
funsetownlst.

Add sigio lock to witness list under proctree and allproc, but over
proc and pgrp.

Seigo Tanimura helped with this.


# 95883 01-May-2002 alfred

Redo the sigio locking.

Turn the sigio sx into a mutex.

Sigio lock is really only needed to protect interrupts from dereferencing
the sigio pointer in an object when the sigio itself is being destroyed.

In order to do this in the most unintrusive manner change pgsigio's
sigio * argument into a **, that way we can lock internally to the
function.


# 94860 16-Apr-2002 jhb

- Lock proctree_lock instead of pgrpsess_lock.
- Use temporary variables to hold a pointer to a pgrp while we dink with it
while not holding either the associated proc lock or proctree_lock. It
is in theory possible that p->p_pgrp could change out from under us.


# 93719 03-Apr-2002 ru

Dike out a highly insecure UCONSOLE option.
TIOCCONS must be able to VOP_ACCESS() /dev/console to succeed.

Obtained from: OpenBSD


# 93679 02-Apr-2002 tanimura

Fix leakage of p_pgrp lock.


# 93593 01-Apr-2002 jhb

Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on: smp@


# 92723 19-Mar-2002 alfred

Remove __P.


# 92252 13-Mar-2002 alfred

Fixes to make select/poll mpsafe.

Problem:
selwakeup required calling pfind which would cause lock order
reversals with the allproc_lock and the per-process filedesc lock.
Solution:
Instead of recording the pid of the select()'ing process into the
selinfo structure, actually record a pointer to the thread. To
avoid dereferencing a bad address all the selinfo structures that
are in use by a thread are kept in a list hung off the thread
(protected by sellock). When a selwakeup occurs the selinfo is
removed from that threads list, it is also removed on the way out
of select or poll where the thread will traverse its list removing
all the selinfos from its own list.

Problem:
Previously the PROC_LOCK was used to provide the mutual exclusion
needed to ensure proper locking, this couldn't work because there
was a single condvar used for select and poll and condvars can
only be used with a single mutex.
Solution:
Introduce a global mutex 'sellock' which is used to provide mutual
exclusion when recording events to wait on as well as performing
notification when an event occurs.

Interesting note:
schedlock is required to manipulate the per-thread TDF_SELECT
flag, however if given its own field it would not need schedlock,
also because TDF_SELECT is only manipulated under sellock one
doesn't actually use schedlock for syncronization, only to protect
against corruption.

Proc locks are no longer used in select/poll.

Portions contributed by: davidc


# 92069 11-Mar-2002 tanimura

Stop abusing the pgrpsess_lock.


# 91565 02-Mar-2002 tanimura

Fix lock leakage and late unlock.

Submitted by: bde


# 91199 24-Feb-2002 phk

Fix a typo (?) in previous commit told ttyprintf() to print the integer
part of the user-time as a 64bit quantity. This resulted in weird
output from SIGINFO.


# 91140 23-Feb-2002 tanimura

Lock struct pgrp, session and sigio.

New locks are:

- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.

Please refer to sys/proc.h for the coverage of these locks.

Changes on the pgrp/session interface:

- pgfind() needs the pgrpsess_lock held.

- The caller of enterpgrp() is responsible to allocate a new pgrp and
session.

- Call enterthispgrp() in order to enter an existing pgrp.

- pgsignal() requires a pgrp lock held.

Reviewed by: jhb, alfred
Tested on: cvsup.jp.FreeBSD.org
(which is a quad-CPU machine running -current)


# 90375 07-Feb-2002 peter

Fix a couple of style bugs introduced (or touched by) previous commit.


# 90361 07-Feb-2002 julian

Pre-KSE/M3 commit.
this is a low-functionality change that changes the kernel to access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embeded there
but remove all teh code that assumes that in preparation for the next commit
which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,


# 86474 16-Nov-2001 peter

utime/stime.tv_sec are elapsed times, not relative to 1970. We can
safely print them as longs. Even if ^T overflows after a process
has accumulated 68 years of user or system time, it is no big deal.


# 86472 16-Nov-2001 peter

You cannot cast a time_t to quad_t and printf it with %lld. quad_t is
64 bits, not long long.


# 85844 01-Nov-2001 rwatson

o Move suser() calls in kern/ to using suser_xxx() with an explicit
credential selection, rather than reference via a thread or process
pointer. This is part of a gradual migration to suser() accepting
a struct ucred instead of a struct proc, simplifying the reference
and locking semantics of suser().

Obtained from: TrustedBSD Project


# 85654 28-Oct-2001 dillon

Make ttyprintf() of tv_sec value type agnostic.


# 83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 83300 10-Sep-2001 peter

Fix a warning on alpha (real problem) and make pstat -t work as a bonus.
'struct tty' was out of sync in user and kernel due to dev_t/udev_t
mixups. This takes advantage of the fact that dev_t changes type in
userland, so it isn't too pretty.


# 81130 04-Aug-2001 tmm

Export the tk_nin and tk_nout variables (number of tty input/output
characters) as sysctls (kern.tty_nin and kern.tty_nout).


# 77017 22-May-2001 dd

Unifdef DEV_SNP; snp(4) no longer requires these ugly hacks.

Silence by: -hackers, -audit


# 76166 01-May-2001 markm

Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by: bde (with reservations)


# 73929 07-Mar-2001 jhb

Grab the process lock while calling psignal and before calling psignal.


# 73421 04-Mar-2001 assar

implement OCRNL, ONOCR, and ONLRET

Obtained from: NetBSD


# 72576 17-Feb-2001 jlemon

Fix tab breakage from last commit.

Spotted by: bde


# 72521 15-Feb-2001 jlemon

Extend kqueue down to the device layer.

Backwards compatible approach suggested by: peter


# 72200 09-Feb-2001 bmilekic

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


# 71786 29-Jan-2001 peter

Turn '#if NSNP > 0' into an option.


# 71568 24-Jan-2001 jhb

- Catch up to proc flag changes.
- Assert sched_lock is held in proc_compare.


# 71307 20-Jan-2001 jhb

- All of proc_compare needs sched_lock, so hold it for the for loop that
calls it rather than obtaining and releasing it a lot in proc_compare.
- Collect all of the data gathering and stick it just after the
proc_compare loop. This way, we only have to grab sched_lock once now
when handling SIGINFO. All the printf's are done after the values are
calculated.

Submitted mostly by: bde


# 71285 20-Jan-2001 jhb

Be more careful with sched_lock in the SIGINFO handler. Specifically, do
not hold sched_lock while calling ttyprintf(). If we are on a serial
console, then ttyprintf() will end up getting the sio lock, resulting in
a lock order violation.

Noticed by: des


# 69781 08-Dec-2000 dwmalone

Convert more malloc+bzero to malloc+M_ZERO.

Submitted by: josh@zipperup.org
Submitted by: Robert Drehmel <robd@gmx.net>


# 69506 01-Dec-2000 jhb

Protect p_stat with sched_lock.


# 69322 28-Nov-2000 jkh

Kernel support for erase2 character.

Submitted by: Rui Pedro Mendes Salgueiro <rps@mat.uc.pt>


# 65557 06-Sep-2000 jasone

Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
preempted (i386 only).

Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh


# 62573 04-Jul-2000 phk

Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.

Pointed out by: bde


# 62454 03-Jul-2000 phk

Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:

Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our
sources:

-sysctl_vm_zone SYSCTL_HANDLER_ARGS
+sysctl_vm_zone (SYSCTL_HANDLER_ARGS)


# 60938 26-May-2000 jake

Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by: msmith and others


# 60833 23-May-2000 jake

Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by: phk
Reviewed by: phk
Approved by: mdodd


# 59822 01-May-2000 ache

Move t_timeout initializing to ttyregister

Pointed-by: bde


# 59815 01-May-2000 ache

Set t_timeout to its default sysctl value only once in ttyopen
Initialize t_timeout to -1 for this reason

Pointed-by: bde


# 59786 30-Apr-2000 ache

Add sysctl variable to set initial drainwait timeout on ttyopen, default to
5 minutes


# 59707 27-Apr-2000 ache

Add default 5min timeout for output drain to stop hanging on exit or in other
places when connection dropped


# 59052 05-Apr-2000 archie

Fix a bug where SIGIO was not being delivered to a process requesting
async I/O when a tty device became writable.

PR: kern/8324
Submitted by: Don Lewis <Don.Lewis@tsc.tdk.com>


# 56755 28-Jan-2000 archie

Back out previous commit; it was premature.


# 56715 28-Jan-2000 archie

When an attempt to install a line discipline fails, check for
known KLD's that might support it, and load the KLD if found.
Currently the list includes SLIPDISC, PPPDISC, and NETGRAPHDISC.


# 53212 16-Nov-1999 phk

This is a partial commit of the patch from PR 14914:

Alot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY
structures for list operations. This patch makes all list operations
in sys/kern use the queue(3) macros, rather than directly accessing the
*Q_{HEAD,ENTRY} structures.

This batch of changes compile to the same object files.

Reviewed by: phk
Submitted by: Jake Burkholder <jake@checker.org>
PR: 14914


# 51791 29-Sep-1999 marcel

sigset_t change (part 2 of 5)
-----------------------------

The core of the signalling code has been rewritten to operate
on the new sigset_t. No methodological changes have been made.
Most references to a sigset_t object are through macros (see
signalvar.h) to create a level of abstraction and to provide
a basis for further improvements.

The NSIG constant has not been changed to reflect the maximum
number of signals possible. The reason is that it breaks
programs (especially shells) which assume that all signals
have a non-null name in sys_signame. See src/bin/sh/trap.c
for an example. Instead _SIG_MAXSIG has been introduced to
hold the maximum signal possible with the new sigset_t.

struct sigprop has been moved from signalvar.h to kern_sig.c
because a) it is only used there, and b) access must be done
though function sigprop(). The latter because the table doesn't
holds properties for all signals, but only for the first NSIG
signals.

signal.h has been reorganized to make reading easier and to
add the new and/or modified structures. The "old" structures
are moved to signalvar.h to prevent namespace polution.

Especially the coda filesystem suffers from the change, because
it contained lines like (p->p_sigmask == SIGIO), which is easy
to do for integral types, but not for compound types.

NOTE: kdump (and port linux_kdump) must be recompiled.

Thanks to Garrett Wollman and Daniel Eischen for pressing the
importance of changing sigreturn as well.


# 51756 28-Sep-1999 phk

Introduce ttyread() and ttywrite() which do the canonical thing.

Use them in many tty drivers.

Reviewed by: julian, bde


# 51658 25-Sep-1999 phk

Remove five now unused fields from struct cdevsw. They should never
have been there in the first place. A GENERIC kernel shrinks almost 1k.

Add a slightly different safetybelt under nostop for tty drivers.

Add some missing FreeBSD tags


# 51654 25-Sep-1999 phk

This patch clears the way for removing a number of tty related
fields in struct cdevsw:

d_stop moved to struct tty.
d_reset already unused.
d_devtotty linkage now provided by dev_t->si_tty.

These fields will be removed from struct cdevsw together with
d_params and d_maxio Real Soon Now.

The changes in this patch consist of:

initialize dev->si_tty in *_open()
initialize tty->t_stop
remove devtotty functions
rename ttpoll to ttypoll
a few adjustments to these changes in the generic code
a bump of __FreeBSD_version
add a couple of FreeBSD tags


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 49542 08-Aug-1999 phk

Enable ttymalloc().


# 49540 08-Aug-1999 phk

Add new sysctl "kern.ttys" which return all the struct tty's which have
been registered with ttyregister().

register ptys with ttyregister().


# 47407 22-May-1999 dt

Don't call calcru() on a swapped-out process. calcru() access p_stats, which
is in U-area.


# 46676 08-May-1999 phk

I got tired of seeing all the cdevsw[major(foo)] all over the place.

Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed to the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.


# 46568 06-May-1999 peter

Add sufficient braces to keep egcs happy about potentially ambiguous
if/else nesting.


# 46112 27-Apr-1999 phk

Suser() simplification:

1:
s/suser/suser_xxx/

2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.


# 44157 19-Feb-1999 luoqi

Introduce machine-dependent macro pgtok() to convert page count to number
of kilobytes. Its definition for each architecture could be optimized to
avoid potential numerical overflows.


# 44146 19-Feb-1999 luoqi

Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This
is the preparation step for moving pmap storage out of vmspace proper.

Reviewed by: Alan Cox <alc@cs.rice.edu>
Matthew Dillion <dillon@apollo.backplane.com>


# 43425 30-Jan-1999 phk

Use suser() to check for super user rather than examining cr_uid directly.
Use TTYDEF_SPEED rather than 9600 a couple of places.

Reviewed by: bde, with a few grumbles.


# 43351 28-Jan-1999 dillon

Fix warnings related to -Wall -Wcast-qual


# 42408 08-Jan-1999 eivind

Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as
discussed on -hackers.

Introduce 'KASSERT(assertion, ("panic message", args))' for simple
check + panic.

Reviewed by: msmith


# 41598 08-Dec-1998 bde

Backed out the FIOASYNC fix in rev.1.108. fcntl(fd, F_SETFL, flags)
depends on the bug. It does an FIOASYNC ioctl to sync the setting
of the O_ASYNC "file" flag with drivers even if the setting hasn't
changed.

PR: 9003


# 41578 07-Dec-1998 ache

Move stime declaration to main block, otherwise can left uninitialized
in rare cases.
Found by: Eivind Eklund <eivind@yes.no>


# 41286 22-Nov-1998 bde

Fixed some missing cases in the check for ioctls that involve modification.
Many (mostly machine-dependent ones) are still missing. NIST-PCTS found
this bug for all the ioctls used to implement the POSIX tc* functions
(TIOCCBRK, TIOCDRAIN, TIOCSPGRP, TIOCSBRK, TIOCSTART and TIOCSTOP), and
I found FIOASYNC, TIOCCONS, TIOCEXCL, TIOCHPCL, TIOCNXCL, TIOCSCTTY and
TIOCSDRAINWAIT by inspection. TIOCSPGRP was ifdefed out for some reason.

Handle tcsetattr()'s historical speed conversions correctly and more
centrally:
- don't store speeds of 0 in the final termios struct. Drivers can now
depend on tp->t_ispeed and tp->t_ospeed giving the actual speed.
Applications can now depend on tcgetattr() being POSIX.1 conformant.
- convert from a proposed input speed of 0 to the proposed output speed
(except if that is 0, convert to the current output speed). Drivers
can now depend on the proposed input speed being nonzero.
- don't reject negative speeds. Negative speeds can't happen now that
speed_t is unsigned, and rejecting invalid speeds is a bug - tcsetattr()
is supposed to succeed if it can "perform any of the requested actions",
so it shouldn't fail in practice.


# 41086 11-Nov-1998 truckman

Installed the second patch attached to kern/7899 with some changes suggested
by bde, a few other tweaks to get the patch to apply cleanly again and
some improvements to the comments.

This change closes some fairly minor security holes associated with
F_SETOWN, fixes a few bugs, and removes some limitations that F_SETOWN
had on tty devices. For more details, see the description on the PR.

Because this patch increases the size of the proc and pgrp structures,
it is necessary to re-install the includes and recompile libkvm,
the vinum lkm, fstat, gcore, gdb, ipfilter, ps, top, and w.

PR: kern/7899
Reviewed by: bde, elvind


# 38434 19-Aug-1998 bde

A limit of 200000 for the output buffer high watermark was excessive,
since (hardware) ttys have too low a bandwidth to benefit significantly
from large buffers. Use twice the old limit for the new-default case
and 8 times the old limit for the driver-specifies-watermark case.
Nothing uses these cases yet.

Removed related debugging code.


# 37559 11-Jul-1998 bde

Fixed printf format errors.


# 36735 07-Jun-1998 dfr

This commit fixes various 64bit portability problems required for
FreeBSD/alpha. The most significant item is to change the command
argument to ioctl functions from int to u_long. This change brings us
inline with various other BSD versions. Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.


# 36127 17-May-1998 bde

Fixed stale references to hzto() in comments.


# 34961 30-Mar-1998 phk

Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't a atomic variable, so splfoo() protection were needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now uses the new variable time_second instead.

gettime() changed to getmicrotime(0.

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeard time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passwd &time as args.

Change profiling in ncr.c to use ticks instead of time. Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by: bde


# 34185 07-Mar-1998 bde

Set the input and output buffer sizes and the input buffer watermarks
dynamically depending on the line speed(s). This should give the old
sizes and watermarks until drivers are changed.

Display the input watermarks in pstat and sicontrol.


# 31778 16-Dec-1997 eivind

Make COMPAT_43 and COMPAT_SUNOS new-style options.


# 31577 06-Dec-1997 bde

Use ENOIOCTL instead of -1 (= ERESTART) for tty ioctls that are
not handled at a particular level. This fixes mainly restarting
of interrupted TIOCDRAINs and TIOCSETA{W,F}s.


# 31016 07-Nov-1997 phk

Remove a bunch of variables which were unused both in GENERIC and LINT.

Found by: -Wunused


# 30354 12-Oct-1997 phk

Last major round (Unless Bruce thinks of somthing :-) of malloc changes.

Distribute all but the most fundamental malloc types. This time I also
remembered the trick to making things static: Put "static" in front of
them.

A couple of finer points by: bde


# 29354 14-Sep-1997 peter

Extend to use poll backend. If memory serves correctly, most of this was
adapted from NetBSD.. However, there are some differences in the tty
system that are big enough to cause their code to not fit comfortably.

Obtained from: NetBSD (I think)


# 29041 02-Sep-1997 bde

Removed unused #includes.


# 24207 24-Mar-1997 bde

Don't include <sys/ioctl.h> in the kernel. Stage 5: include
<sys/ioctl_compat.h> and sometimes <sys/filio.h> instead of
<sys/ioctl.h> in tty-related files. <sys/ttycom.h> is still
usually imported bogusly via <sys/termios.h>.


# 24131 23-Mar-1997 bde

Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined.
Fixed everything that depended on getting fcntl.h stuff from the wrong
place. Most things don't depend on file.h stuff at all.


# 24101 22-Mar-1997 bde

Fixed some invalid (non-atomic) accesses to `time', mostly ones of the
form `tv = time'. Use a new function gettime(). The current version
just forces atomicicity without fixing precision or efficiency bugs.
Simplified some related valid accesses by using the central function.


# 22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 22521 10-Feb-1997 dyson

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 20025 29-Nov-1996 bde

Help broken d_stop() routines by flushing the output queue before
calling them (as well as after).

Found by: NIST PCTS


# 20024 29-Nov-1996 bde

Fixed bugs handling (background) orphaned process groups. tty
writes and tty ioctls by processes in such groups must return
-1/EIO, but they were allowed. tty reads were handled correctly.

Found by: NIST PCTS


# 20021 29-Nov-1996 bde

Fixed some bugs in BREAK handling. If BRKINT is set, then always flush
the queues and generate a SIGINT. Previously, this wasn't done if ISIG
was clear or the VINTR character was disabled, and it was done by
converting the BREAK to a VINTR character and sometimes bogusly echoing
this character.

Found by: NIST-PCTS


# 20020 29-Nov-1996 bde

Fixed handling of non-POSIX control characters. They must not do
anything special unless IEXTEN is set.

Found by: NIST-PCTS


# 17974 31-Aug-1996 bde

Fixed the easy cases of const poisoning in the kernel. Cosmetic.


# 17876 28-Aug-1996 bde

Fixed a wrong comment. Did tsleep() ever return the networking errno
ETIMEDOUT?


# 16322 12-Jun-1996 gpalmer

Clean up -Wunused warnings.

Reviewed by: bde


# 15543 02-May-1996 phk

removed:
CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei()
ptei() kvtopte() ptetov() ispt() ptetoav() &c &c
new:
NPDEPG

Major macro cleanup.


# 14529 11-Mar-1996 hsu

From Lite2: proc LIST changes.
Reviewed by: david & bde


# 14328 02-Mar-1996 peter

Add more options into the conf/options and i386/conf/options.i386 files
and the #include hooks so that 'make depend' is more useful. This
covers most of the options I regularly use (but not all) and some other
easy ones.


# 12900 16-Dec-1995 bde

Oops, the last commit missed one change from 200 to OBUFSIZ + 100.


# 12856 15-Dec-1995 bde

Changed the maximum output queue count from (TTMAXHIWAT + 200) to
(TTMAXHIWAT + OBUFSIZ + 100) in case someone changes OBUFSIZ. 200
was to allow 100 above high water for ordinary writes and another
100 for kernel printfs.

Increased the reserved output queue count from 512 to the maximum
output queue count. This prevents exhaustion of clists and increases
the output throughput for 8 cy lines by almost a factor of 2 (on
a system where there aren't many other open ttys so clists become
exhausted after about 4 active lines (or earlier if TTMAXHIWAT is
increased :-]).

ttwrite() behaves very badly when clists are exhausted:
(1) it sleeps on lbolt instead of on TSA_OLOWAT(tp).
This could be fixed adequately by sleeping on TSA_OLOWAT(tp).
The nonzero reserved count guaratees that space will become
available independent of other ttys, and a reserved count
of 512 is barely enough for efficiency.
(2) it drops output if space runs out in the middle of special
output processing. This is too hard to fix without hardening
the reserved count. The watermark processing guarantees that
space doesn't run out only if the advertised space is guaranteed.

Increasing the reserved output queue count defeats the point of
dynamic allocation of clists. Previously, about 2K of memory per
tty was reserved (the raw queue was already reserved). Now, about
3.5K is reserved. Reserving everything would take a whole 0.5K
more.


# 12855 14-Dec-1995 bde

Disabled the sleep in ttyflush(). It can't work in general because
ttyflush() might be called from an interrupt handler. This fixes
panics in IXOFF mode at the cost of more failures to send the START
character to exit from IXOFF mode.


# 12841 14-Dec-1995 bde

Restored unused function ttrstrt(). It would be used if the low level
drivers supported inter-character delays.


# 12819 14-Dec-1995 phk

A Major staticize sweep. Generates a couple of warnings that I'll deal
with later.
A number of unused vars removed.
A number of unused procs removed or #ifdefed.


# 12813 13-Dec-1995 julian

devsw tables are now arrays of POINTERS to struct [cb]devsw
seems to work hre just fine though I can't check every file
that changed due to limmited h/w, however I've checked enught to be petty
happy withe hte code..

WARNING... struct lkm[mumble] has changed
so it might be an idea to recompile any lkm related programs


# 12662 07-Dec-1995 dg

Untangled the vm.h include file spaghetti.


# 11965 31-Oct-1995 bde

Fixed initialization of TS_CONNECTED bit in t_state. It wasn't
set in open() when CLOCAL is set unless carrier is present.

Fixed initialization of line discipline. It lived across opens.
Lines that started with the wrong discipline probably didn't work
at all, because TS_ISOPEN is only set by TTYDISC.


# 10658 10-Sep-1995 bde

Fix wakeups for TIOCDRAINWAIT. The conditional wakeups introduced in rev
1.59 defeated the point of doing the wakeups (having reduced timeouts
take effect immediately).


# 9851 02-Aug-1995 ache

Check for valid speed values in pty drive
Check for negative speed values in tty drive
Back out valid speed values checking from tty drive
Suggested by: bde


# 9848 01-Aug-1995 ache

Optimize a bit valid speed search using fact that speed table sorted
Submitted by:
Obtained from:


# 9847 01-Aug-1995 ache

Check for valid speeds in TIOCSET* and return EINVAL for incorrect
values instead of setting garbadge.


# 9833 31-Jul-1995 bde

Obtained from: partly from ancient patches of mine via 1.1.5

Change all short variables in `struct tty' to int. Shorts were only
right on ancient systems with ints optimized for vaxness over
efficiency.


# 9832 31-Jul-1995 bde

Obtained from: partly from ancient patches of mine via 1.1.5

Handle MDMBUF a little better. Prepare to handle 4 different kinds of
output flow control.


# 9829 31-Jul-1995 bde

Obtained from: an ancient patch of mine via 1.1.5

Clear PENDIN when input is flushed so that the handling of future input
doesn't get pessimized.


# 9824 31-Jul-1995 bde

Obtained from: partly from ancient patches of mine via 1.1.5

Introduce TS_CONNECTED and TS_ZOMBIE states. TS_CONNECTED is set
while a connection is established. It is set while (TS_CARR_ON or
CLOCAL is set) and TS_ZOMBIE is clear. TS_ZOMBIE is set for on to
off transitions of TS_CARR_ON that occur when CLOCAL is clear and
is cleared for off to on transitions of CLOCAL. I/o can only occur
while TS_CONNECTED is set. TS_ZOMBIE prevents further i/o.

Split the input-event sleep address TSA_CARR_ON(tp) into TSA_CARR_ON(tp)
and TSA_HUP_OR_INPUT(tp). The former address is now used only for
off to on carrier transitions and equivalent CLOCAL transitions.
The latter is used for all input events, all carrier transitions
and certain CLOCAL transitions. There are some harmless extra
wakeups for rare connection- related events. Previously there were
too many extra wakeups for non-rare input events.

Drivers now call l_modem() instead of setting TS_CARR_ON directly
to handle even the initial off to on transition of carrier. They
should always have done this. l_modem() now handles TS_CONNECTED
and TS_ZOMBIE as well as TS_CARR_ON.

gnu/isdn/iitty.c:
Set TS_CONNECTED for first open ourself to go with bogusly setting
CLOCAL.

i386/isa/syscons.c, i386/isa/pcvt/pcvt_drv.c:
We fake carrier, so don't also fake CLOCAL.

kern/tty.c:
Testing TS_CONNECTED instead of TS_CARR_ON fixes TIOCCONS forgetting to
test CLOCAL. TS_ISOPEN was tested instead, but that broke when we disabled
the clearing of TS_ISOPEN for certain transitions of CLOCAL.

Testing TS_CONNECTED fixes ttyselect() returning false success for output
to devices in state !TS_CARR_ON && !CLOCAL.

Optimize the other selwakeup() call (this is not related to the other
changes).

kern/tty_pty.c:
ptcopen() can be declared in traditional C now that dev_t isn't short.


# 9823 31-Jul-1995 bde

Assorted cosmetic changes:

Make more functions static.

tty.c:
Use tcflag_t (u_long) and cc_t instead of u_char and int/long.

Don't record values that are only evaluated once.

Compare ints using imin(), not min(). min() is for comparing u_ints.
Old versions of tty.c used the type-safe but multiple-evaluation-unsafe
macro MIN(). The args are apparently never negative; otherwise this
change would be non-cosmetic.

Don't repeat the loop test in ttywait().

tty.h:
Improve English in and formatting of comments.


# 9822 31-Jul-1995 bde

Improve input flow control.

Use input buffer watermarks of TTYHOG-512 (high) and (high)*7/8
(low) instead of TTYHOG/2 (high) and TTYHOG/5 (low) to agree with
some drivers. 512 is magic and some things depended on TTYHOG/2
>= TTYHOG-512 to work; now they depend on the 512 magic not changing
and TTYHOG-512 being significantly larger than 0. This should be
handled in ttsetwater().

Separate the decision about whether to do input flow control from
doing it. ttyblock() now just starts input flow control (hardware
and/or software) and there is a new function ttyunblock() to stop
it. The decisions are the same except for the watermark changes
and allowing for input expansion for PARMRK.

When flushing input, try harder at first to send a start character
if required, but give up if the first attempt fails.

cy.c, rc.c, sio.c:
Simplify: let ttyinput() handle input flow control if it is not
being bypassed. Use ttyblock() to start flow control otherwise.

rc.c:
Use same input flow control test as elsewhere: test in a more
efficient order and start flow control at >= highwater instead of
at > highwater.


# 9796 30-Jul-1995 bde

Don't swap the queue headers to implement concatenation of the
queues for TIOCSETA[W]. Swapping an even number of times broke
the queue resource limits. This would have broken CRTSCTS flow
control if the clist slush list was used up.

Don'concatenate the queues for TIOCSETA[W] if one of the queues
has a resource limit of 0. Concatenation would cause a panic if
one of the queues is nonempty and the other is limited to length
0. This may have caused panics in PPPDISC.

Wake up readers after all transitions of ICANON. When ICANON is
turned off it is quite likely that characters will become available
to be read.

Reduce indentation near these changes.


# 9790 30-Jul-1995 bde

Split TS_ASLEEP (sleep on output [below low water])into TS_SO_OLOWAT (sleep
on output below low water) and TS_SO_OCOMPLETE (sleep on output complete).
Most of the support for this has already been committed. Drivers should
call ttwwakeup() to handle wakeups whenever output is below low water
(and some output event causes this condition to be checked) or TS_BUSY is
cleared.

tty.c:
Fix the livelock in ttywait() properly by sleeping on output complete, not
on output below low water.

Use ttwwakeup() instead of separate select and output wakeups for all
wakeups of writers.

Add wakeups of writers for output flushes and carrier/clocal transitions.

Don't go to sleep in ttycheckoutq() if ttstart() reduces the queue to below
low water.

Use the timeout built into tsleep() in ttycheckoutq().

Optimize the select wakeup in ttwwakeup(). It seems reasonable to know
too much about the internals of tp->t_wsel now that the knowledge is
localised in tty.c.


# 9763 29-Jul-1995 bde

Obtained from: partly from ancient patches by ache and me via 1.1.5

Remove nullmodem().

It may be useful to have a null modem routine, but nullmodem()
wasn't one. nullmodem() was identical to ttymodem() except it
didn't implement MDMBUF (carrier) flow control, didn't do any
wakeups for off to on carrier transitions, and didn't flush the
i/o queues for on to off carrier transitions (flushing has the side
effect of waking up readers and writers) although it did generate
SIGHUPs. The wakeups must normally be done even if nullmodem() is
null in case something is sleeping waiting for a carrier transition.
In any case, the wakeups should be harmless. They may cause bogus
results for select(), but select() is already bogus for nonstandard
line disciplines.


# 9639 22-Jul-1995 bde

Obtained from: partly from ancient patches of mine via 1.1.5

Give names to the magic tty i/o sleep addresses and use them. This makes
it easier to remember what the addresses are for and to keep them unique.


# 9626 21-Jul-1995 bde

Move the inline code for waking up writers to a new function
ttwwakeup(). The conditions for doing the wakeup will soon become
more complicated and I don't want them duplicated in all drivers.

It's probably not worth making ttwwakeup() a macro or an inline
function. The cost of the function call is relatively small when
there is a process to wake up. There is usually a process to wake
up for large writes and the system call overhead dwarfs the function
call overhead for small writes.


# 9625 21-Jul-1995 bde

Obtained from: partly from ancient patches of mine via 1.1.5

Move static termioschars() from a couple of drivers to tty.c. Now there
is only one copy of ttydefchars[].


# 9624 21-Jul-1995 bde

Obtained from: partly from ancient patches by ache and me via 1.1.5

Nuke `symbolic sleep message strings'. Use unique literal messages so that
`ps l' shows unambiguously where processes are sleeping.


# 9622 21-Jul-1995 bde

Obtained from: partly from anancient patch of mine via 1.1.5

Fix races for FIONREAD, TIOCSTI and TIOCSTAT.


# 9619 21-Jul-1995 bde

Obtained from: partly from an ancient patch of mine via 1.1.5

Temporarily nuke TS_WOPEN. It was only used for the obscure MDMBUF
flow control option in the kernel and for informational purposes
in `pstat -t'. The latter worked properly only for ptys. In
general there may be multiple processes sleeping in open() and
multiple processes that successfully opened the tty by opening it
in O_NONBLOCK mode or during a window when CLOCAL was set. tty.c
doesn't have enough information to maintain the flag but always
cleared it in ttyopen().

TS_WOPEN should be restored someday just so that `pstat -t' can
display it (MDMBUF is already fixed). Fixing it requires counting
of processes sleeping in open() in too many serial drivers.


# 9617 21-Jul-1995 bde

Obtained from: an ancient patch of mine via 1.1.5

Don't put partial PARMRK escape sequences in the input queue. Use
MAX_INPUT = TTYHOG instead of TTYHOG directly for the maximum input
queue size. Don't use the bogus MAX_INPUT advertised in
<sys/syslimits.h>.


# 9616 21-Jul-1995 bde

Add to TODO list and move it to near the top of the file.


# 9615 21-Jul-1995 bde

Obtained from: ancient usenet posting as applied to 1.1.5
First of many changes required to restore lost stability to the tty
driver.

ECHONL is supposed to enable echoing of NL when ECHO is off, but it
enabled echoing of everything except NL.


# 9293 24-Jun-1995 ache

ttywait: convert EWOULDBLOCK to EIO, when t_timeout expired


# 9289 23-Jun-1995 ache

Replace EWOULDBLOCK to EIO in ttwrite, when t_timeout expired


# 9202 11-Jun-1995 rgrimes

Merge RELENG_2_0_5 into HEAD


# 8876 30-May-1995 rgrimes

Remove trailing whitespace.


# 8337 07-May-1995 ache

Make two "ttyout" ttysleep wmesg unique
Add t_timeout to ttysleep call into ttywrite


# 8318 07-May-1995 bde

Test the correct nonblocking flag in ttylclose(). IO_NDELAY is only valid
in read() and write(). FNONBLOCK is valid in ioctl() and close().

The bug caused hung ptys when a process talked to itself using nonblocking
i/o and exited while the slave pty had output to flush. ttywait() was
called and hung. Signals didn't work because the process was exiting.
`comcontrol /dev/ttyp0 drainwait 1' worked to terminate the wait. This
shows that comcontrol is not limited to hardware control. It has no i386
or driver dependencies and doesn't belong in src/sbin/i386.

Bruce


# 7851 15-Apr-1995 bde

Speed up ttnread() in the !(ICANON | ISIG) case by copying to user space
through a temporary buffer instead of one character at a time. The old
method takes about 6 usec/char on a 486DX2/66. This is larger than than
the combined interrupt and PIO overhead for a 16550!

This change was first implemented in 1.1.5. It was rewritten for 2.1.
The clist access functions allow a simpler implementation at some cost
in correctness and speed. There needs to be an ungetc() function to
recover from EFAULT, and it wastes time to copy through a temporary
buffer.

Don't snoop on single characters that weren't read due to EFAULT.
Rewrite a snoop comment in my approximation to English.

Undo bogus exportation of ttnread().


# 7470 29-Mar-1995 ache

Oops, fix typing error in prev. commit


# 7469 29-Mar-1995 ache

Handle TTY_BI now instead of TTY_FE && c == 0


# 7466 29-Mar-1995 ache

Move parmark 0377 double code after control chars processing


# 7443 28-Mar-1995 ache

ttyinput() fixes:

1) Preserve old buffer contents when input buffer overflows.

Old code clear buffer and rewrite it again, if !MAXBEL
(for MAXBEL it does right thing :-).
F.e. if you type too long string, last chars passed,
not first ones as expected.
Moreover, it flush output queue too in this case without any needs.

2) Don't do IXOFF, if IGNCR and c==\r, ignore completely.

3) If PARMRK is active and !ISTRIP and char == 0377
put yet one 0377 to distinguish it from parity mark sequence.
POSIX standard (thanx Bruce).

Reviewed by:
Submitted by:
Obtained from:
CVS:


# 7439 28-Mar-1995 ache

Bug fixed:
parity/framing/break not completely ignored when IGN* is set
but cause output restarted.
CVS:


# 7090 16-Mar-1995 bde

Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
ones.


# 6790 28-Feb-1995 ache

Workaround IXOFF bug when output queue is full && RTS control is on


# 6782 27-Feb-1995 pst

Incorporate bde's code-review comments.

(a) bring back ttselect, now that we have xxxdevtotty() it isn't dangerous.
(b) remove all of the wrappers that have been replaced by ttselect
(c) fix formatting in syscons.c and definition in syscons.h
(d) add cxdevtotty

NOT DONE:
(e) make pcvt work... it was already broken...when someone fixes pcvt to
link properly, just rename get_pccons to xxxdevtotty and we're done


# 6774 27-Feb-1995 ugen

same


# 6712 25-Feb-1995 pst

(a) remove the pointer to each driver's tty structure array from cdevsw
(b) add a function callback vector to tty drivers that will return a pointer
to a valid tty structure based upon a dev_t
(c) make syscons structures the same size whether or not APM is enabled so
utilities don't crash if NAPM changes (and make the damn kernel compile!)
(d) rewrite /dev/snp ioctl interface so that it is device driver and i386
independant


# 6678 24-Feb-1995 ache

Add releasing of input flow control into
ttyflush(FREAD)


# 6657 23-Feb-1995 ache

Add two IXOFF checks to not confuse with CRTS_IFLOW.
Now TS_TBLOCK used as general input flow flag
for both IXOFF and CRTS_IFLOW cases.


# 6642 22-Feb-1995 ache

Revive hadrware input flow control
Submitted by: iverson@lionheart.com


# 6469 15-Feb-1995 ache

Restore deleted in second time my & bde fixes.
UGEN STOP IT!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


# 6455 15-Feb-1995 ugen

More changes to support user calls.
It's 22:00 here,utility still to come(hopefully tomorrow
morning..)


# 6445 15-Feb-1995 bde

Remove duplicated code from ttymalloc(). Disable ttyfree().

Restore fixes to flushing that were lost in the previous commit.

Clean up snoop changes.

Add my TODO list from 1.1.5. The improvements in 1.1.5 should be "obtained"
first.


# 6390 14-Feb-1995 ugen

Here it came-the all-brand-new snoop device..
Users-beware..
It is tested and working for me but probably have some bugs i
didn't noticed so test it and reply...
It can:
look at what's sent to the user from tty device
snoop on pty's,vty's and serial tty's
It (still) can't:
write to tty
see what user types in local echo mode
It is probably bad styled and
very dependant on tty_pty.c,sio.c and syscons.c
I would be really happy if another ppl would make their
changes because i am not sure this is the best snoop
we can have..but it is good..:)))))


# 6336 13-Feb-1995 ache

Purge queues in ttylclose(), if ttywflush() failed


# 6334 13-Feb-1995 ache

Replace previous fix with less agressive, just return EIO
if ttywait can't drain queue.


# 6331 12-Feb-1995 ache

1. If some output still present at the end of ttywait, kill it.
2. Even if ttywait() fails, call ttyflush(FREAD) in ttywflush.
This two fixes guarantee that queues are empty after calling ttywflush()
in any case


# 6267 09-Feb-1995 jkh

>32 PTY support
Submitted by: Heikki Suonsivu <hsu@cs.hut.fi>


# 6254 08-Feb-1995 bde

Disable bogus attempt to switch from the TS_ISOPEN state to the
TS_WOPEN state when CLOCAL is toggled from on to off while there
is no carrier. There is no way back, and with sio there is no way
forward either (TS_ISOPEN will never be set again for the current
open). This bug was observed in 1.1 and was fixed in 1.1.5.


# 6027 30-Jan-1995 bde

Increase the reserved clist space for the raw queue from 512 to TTYHOG.
This might help avoid tty buffer overflows on loaded systems.


# 5415 06-Jan-1995 bde

Fix error handling for new TIOCSDRAINWAIT ioctl.


# 5396 04-Jan-1995 ache

Fight against hanging modems: add timeout to ttywait.
Reviewed by: Bruce


# 4937 03-Dec-1994 ache

Call d_stop in ttyflush not only for WRITE but for READ too
Obtained from: 1.1.5.1


# 4825 26-Nov-1994 bde

Fix cblock starvation bugs by reserving enough cblocks for minimal
operation of each clist. Limit the growth of each clist. Clists
can only grow larger than the reserved minimum if there are free
cblocks in a shared pool. The size of this pool is now fixed
(this could be improved). The reserved and maximum sizes are more
carefully allocated for slip and ppp, depending on the mtu. A maximum
MTU of 16384 is now enforced for ppp.


# 4824 26-Nov-1994 bde

Don't block for output in non-blocking mode if clists run out.

Remove an unnecessary test (if the output queue is above high water
then it is nonempty).


# 4061 01-Nov-1994 bde

Return immediately from ttwrite() if the ttysleep()s that wait for
a clist return with an error. There are some clist starvation/deadlock
bugs elsewhere and killing clist hogs didn't help because the breaks
only exited from the inner loops.


# 3808 23-Oct-1994 dg

Round down instead of up in 'kerninfo'/ctrl-T stats code. Incorrect output
can result otherwise.

Submitted by: John Dyson


# 3619 15-Oct-1994 ache

ttywait: check conditions again right after oproc
Obtained from: (I know, but can't say :-)


# 3441 08-Oct-1994 phk

Cosmetics: added ()'s and fixed prinf-formats to make gcc silent.


# 3396 06-Oct-1994 dg

Use tsleep() rather than sleep so that 'ps' is more informative about
the wait.


# 3323 02-Oct-1994 ache

Add VMIN/VTIME support
Obtained from: scratch :-)


# 3308 02-Oct-1994 phk

All of this is cosmetic. prototypes, #includes, printfs and so on. Makes
GCC a lot more silent.


# 2106 18-Aug-1994 dg

Added support for TIOCSTAT ioctl. This allows shells that use raw/cbreak
tty modes to process a control-T and do the right thing.


# 1817 02-Aug-1994 dg

Added $Id$


# 1809 01-Aug-1994 dg

Allow for output processing routine to handle entire character buffer
and thus not require a sleep/wakeup.
Reviewed by:
Submitted by:


# 1792 06-Jul-1994 dg

Added code to allocate and deallocate a number of cblocks on open/close of
a tty.
Note that this might conflict with the collateral use of TS_WOPEN, but
for the moment I can find no problems associated with this. (TS_WOPEN
will likely go away in the future anyway). This should be looked at
again in the future (the potential problem is that the cblock pool
may either run out or accumulate too many cblocks).


# 1549 25-May-1994 rgrimes

The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.

Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman


# 1542 24-May-1994 rgrimes

This commit was generated by cvs2svn to compensate for changes in r1541,
which included commits to RCS files with non-trunk default branches.


# 1541 24-May-1994 rgrimes

BSD 4.4 Lite Kernel Sources