History log of /freebsd-9.3-release/sys/compat/linux/
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
301049 31-May-2016 glebius

Fix kernel stack disclosure in Linux compatibility layer. [SA-16:20]
Fix kernel stack disclosure in 4.3BSD compatibility layer. [SA-16:21]

Security: SA-16:20
Security: SA-16:21
Approved by: so

293896 14-Jan-2016 glebius

o Fix invalid TCP checksums with pf(4). [EN-16:02.pf]
o Fix YP/NIS client library critical bug. [EN-16:03.yplib]
o Fix SCTP ICMPv6 error message vulnerability. [SA-16:01.sctp]
o Fix ntp panic threshold bypass vulnerability. [SA-16:02.ntp]
o Fix Linux compatibility layer incorrect futex handling. [SA-16:03.linux]
o Fix Linux compatibility layer setgroups(2) system call. [SA-16:04.linux]
o Fix TCP MD5 signature denial of service. [SA-16:05.tcp]
o Fix insecure default bsnmpd.conf permissions. [SA-16:06.bsnmpd]

Errata: FreeBSD-EN-16:02.pf
Errata: FreeBSD-EN-16:03.yplib
Security: FreeBSD-SA-16:01.sctp, CVE-2016-1879
Security: FreeBSD-SA-16:02.ntp, CVE-2015-5300
Security: FreeBSD-SA-16:03.linux, CVE-2016-1880
Security: FreeBSD-SA-16:04.linux, CVE-2016-1881
Security: FreeBSD-SA-16:05.tcp, CVE-2016-1882
Security: FreeBSD-SA-16:06.bsnmpd, CVE-2015-5677
Approved by: so


/freebsd-9.3-release/UPDATING
/freebsd-9.3-release/contrib/ntp
/freebsd-9.3-release/contrib/ntp/ChangeLog
/freebsd-9.3-release/contrib/ntp/CommitLog
/freebsd-9.3-release/contrib/ntp/NEWS
/freebsd-9.3-release/contrib/ntp/configure
/freebsd-9.3-release/contrib/ntp/html/miscopt.html
/freebsd-9.3-release/contrib/ntp/include/Makefile.am
/freebsd-9.3-release/contrib/ntp/include/Makefile.in
/freebsd-9.3-release/contrib/ntp/include/ntp_refclock.h
/freebsd-9.3-release/contrib/ntp/include/ntp_stdlib.h
/freebsd-9.3-release/contrib/ntp/include/ntp_worker.h
/freebsd-9.3-release/contrib/ntp/include/ntpd.h
/freebsd-9.3-release/contrib/ntp/include/safecast.h
/freebsd-9.3-release/contrib/ntp/lib/isc/backtrace.c
/freebsd-9.3-release/contrib/ntp/lib/isc/buffer.c
/freebsd-9.3-release/contrib/ntp/lib/isc/inet_aton.c
/freebsd-9.3-release/contrib/ntp/lib/isc/inet_pton.c
/freebsd-9.3-release/contrib/ntp/lib/isc/log.c
/freebsd-9.3-release/contrib/ntp/lib/isc/netaddr.c
/freebsd-9.3-release/contrib/ntp/lib/isc/sockaddr.c
/freebsd-9.3-release/contrib/ntp/lib/isc/task.c
/freebsd-9.3-release/contrib/ntp/lib/isc/win32/interfaceiter.c
/freebsd-9.3-release/contrib/ntp/lib/isc/win32/net.c
/freebsd-9.3-release/contrib/ntp/libntp/a_md5encrypt.c
/freebsd-9.3-release/contrib/ntp/libntp/atolfp.c
/freebsd-9.3-release/contrib/ntp/libntp/authkeys.c
/freebsd-9.3-release/contrib/ntp/libntp/authreadkeys.c
/freebsd-9.3-release/contrib/ntp/libntp/authusekey.c
/freebsd-9.3-release/contrib/ntp/libntp/dolfptoa.c
/freebsd-9.3-release/contrib/ntp/libntp/hextolfp.c
/freebsd-9.3-release/contrib/ntp/libntp/mstolfp.c
/freebsd-9.3-release/contrib/ntp/libntp/msyslog.c
/freebsd-9.3-release/contrib/ntp/libntp/ntp_crypto_rnd.c
/freebsd-9.3-release/contrib/ntp/libntp/ntp_lineedit.c
/freebsd-9.3-release/contrib/ntp/libntp/ntp_rfc2553.c
/freebsd-9.3-release/contrib/ntp/libntp/ntp_worker.c
/freebsd-9.3-release/contrib/ntp/libntp/snprintf.c
/freebsd-9.3-release/contrib/ntp/libntp/socktohost.c
/freebsd-9.3-release/contrib/ntp/libntp/systime.c
/freebsd-9.3-release/contrib/ntp/libntp/work_thread.c
/freebsd-9.3-release/contrib/ntp/libparse/clk_computime.c
/freebsd-9.3-release/contrib/ntp/libparse/clk_dcf7000.c
/freebsd-9.3-release/contrib/ntp/libparse/clk_hopf6021.c
/freebsd-9.3-release/contrib/ntp/libparse/clk_meinberg.c
/freebsd-9.3-release/contrib/ntp/libparse/clk_rawdcf.c
/freebsd-9.3-release/contrib/ntp/libparse/clk_rcc8000.c
/freebsd-9.3-release/contrib/ntp/libparse/clk_schmid.c
/freebsd-9.3-release/contrib/ntp/libparse/clk_trimtaip.c
/freebsd-9.3-release/contrib/ntp/libparse/clk_varitext.c
/freebsd-9.3-release/contrib/ntp/libparse/clk_wharton.c
/freebsd-9.3-release/contrib/ntp/libparse/parse.c
/freebsd-9.3-release/contrib/ntp/ntpd/invoke-ntp.conf.texi
/freebsd-9.3-release/contrib/ntp/ntpd/invoke-ntp.keys.texi
/freebsd-9.3-release/contrib/ntp/ntpd/invoke-ntpd.texi
/freebsd-9.3-release/contrib/ntp/ntpd/ntp.conf.5man
/freebsd-9.3-release/contrib/ntp/ntpd/ntp.conf.5mdoc
/freebsd-9.3-release/contrib/ntp/ntpd/ntp.conf.html
/freebsd-9.3-release/contrib/ntp/ntpd/ntp.conf.man.in
/freebsd-9.3-release/contrib/ntp/ntpd/ntp.conf.mdoc.in
/freebsd-9.3-release/contrib/ntp/ntpd/ntp.keys.5man
/freebsd-9.3-release/contrib/ntp/ntpd/ntp.keys.5mdoc
/freebsd-9.3-release/contrib/ntp/ntpd/ntp.keys.html
/freebsd-9.3-release/contrib/ntp/ntpd/ntp.keys.man.in
/freebsd-9.3-release/contrib/ntp/ntpd/ntp.keys.mdoc.in
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_control.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_crypto.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_io.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_loopfilter.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_parser.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_proto.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_refclock.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_request.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_restrict.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_signd.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_timer.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntp_util.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntpd-opts.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntpd-opts.h
/freebsd-9.3-release/contrib/ntp/ntpd/ntpd.1ntpdman
/freebsd-9.3-release/contrib/ntp/ntpd/ntpd.1ntpdmdoc
/freebsd-9.3-release/contrib/ntp/ntpd/ntpd.c
/freebsd-9.3-release/contrib/ntp/ntpd/ntpd.html
/freebsd-9.3-release/contrib/ntp/ntpd/ntpd.man.in
/freebsd-9.3-release/contrib/ntp/ntpd/ntpd.mdoc.in
/freebsd-9.3-release/contrib/ntp/ntpd/refclock_local.c
/freebsd-9.3-release/contrib/ntp/ntpd/refclock_parse.c
/freebsd-9.3-release/contrib/ntp/ntpd/refclock_shm.c
/freebsd-9.3-release/contrib/ntp/ntpd/refclock_true.c
/freebsd-9.3-release/contrib/ntp/ntpd/refclock_tsyncpci.c
/freebsd-9.3-release/contrib/ntp/ntpdate/ntpdate.c
/freebsd-9.3-release/contrib/ntp/ntpdc/invoke-ntpdc.texi
/freebsd-9.3-release/contrib/ntp/ntpdc/ntpdc-opts.c
/freebsd-9.3-release/contrib/ntp/ntpdc/ntpdc-opts.h
/freebsd-9.3-release/contrib/ntp/ntpdc/ntpdc.1ntpdcman
/freebsd-9.3-release/contrib/ntp/ntpdc/ntpdc.1ntpdcmdoc
/freebsd-9.3-release/contrib/ntp/ntpdc/ntpdc.c
/freebsd-9.3-release/contrib/ntp/ntpdc/ntpdc.h
/freebsd-9.3-release/contrib/ntp/ntpdc/ntpdc.html
/freebsd-9.3-release/contrib/ntp/ntpdc/ntpdc.man.in
/freebsd-9.3-release/contrib/ntp/ntpdc/ntpdc.mdoc.in
/freebsd-9.3-release/contrib/ntp/ntpdc/ntpdc_ops.c
/freebsd-9.3-release/contrib/ntp/ntpq/invoke-ntpq.texi
/freebsd-9.3-release/contrib/ntp/ntpq/libntpq.c
/freebsd-9.3-release/contrib/ntp/ntpq/libntpq.h
/freebsd-9.3-release/contrib/ntp/ntpq/libntpq_subs.c
/freebsd-9.3-release/contrib/ntp/ntpq/ntpq-opts.c
/freebsd-9.3-release/contrib/ntp/ntpq/ntpq-opts.h
/freebsd-9.3-release/contrib/ntp/ntpq/ntpq-subs.c
/freebsd-9.3-release/contrib/ntp/ntpq/ntpq.1ntpqman
/freebsd-9.3-release/contrib/ntp/ntpq/ntpq.1ntpqmdoc
/freebsd-9.3-release/contrib/ntp/ntpq/ntpq.c
/freebsd-9.3-release/contrib/ntp/ntpq/ntpq.h
/freebsd-9.3-release/contrib/ntp/ntpq/ntpq.html
/freebsd-9.3-release/contrib/ntp/ntpq/ntpq.man.in
/freebsd-9.3-release/contrib/ntp/ntpq/ntpq.mdoc.in
/freebsd-9.3-release/contrib/ntp/ntpsnmpd/invoke-ntpsnmpd.texi
/freebsd-9.3-release/contrib/ntp/ntpsnmpd/ntpsnmpd-opts.c
/freebsd-9.3-release/contrib/ntp/ntpsnmpd/ntpsnmpd-opts.h
/freebsd-9.3-release/contrib/ntp/ntpsnmpd/ntpsnmpd.1ntpsnmpdman
/freebsd-9.3-release/contrib/ntp/ntpsnmpd/ntpsnmpd.1ntpsnmpdmdoc
/freebsd-9.3-release/contrib/ntp/ntpsnmpd/ntpsnmpd.html
/freebsd-9.3-release/contrib/ntp/ntpsnmpd/ntpsnmpd.man.in
/freebsd-9.3-release/contrib/ntp/ntpsnmpd/ntpsnmpd.mdoc.in
/freebsd-9.3-release/contrib/ntp/packageinfo.sh
/freebsd-9.3-release/contrib/ntp/scripts/calc_tickadj/Makefile.am
/freebsd-9.3-release/contrib/ntp/scripts/calc_tickadj/Makefile.in
/freebsd-9.3-release/contrib/ntp/scripts/calc_tickadj/calc_tickadj.1calc_tickadjman
/freebsd-9.3-release/contrib/ntp/scripts/calc_tickadj/calc_tickadj.1calc_tickadjmdoc
/freebsd-9.3-release/contrib/ntp/scripts/calc_tickadj/calc_tickadj.html
/freebsd-9.3-release/contrib/ntp/scripts/calc_tickadj/calc_tickadj.man.in
/freebsd-9.3-release/contrib/ntp/scripts/calc_tickadj/calc_tickadj.mdoc.in
/freebsd-9.3-release/contrib/ntp/scripts/calc_tickadj/invoke-calc_tickadj.texi
/freebsd-9.3-release/contrib/ntp/scripts/invoke-plot_summary.texi
/freebsd-9.3-release/contrib/ntp/scripts/invoke-summary.texi
/freebsd-9.3-release/contrib/ntp/scripts/ntp-wait/invoke-ntp-wait.texi
/freebsd-9.3-release/contrib/ntp/scripts/ntp-wait/ntp-wait-opts
/freebsd-9.3-release/contrib/ntp/scripts/ntp-wait/ntp-wait.1ntp-waitman
/freebsd-9.3-release/contrib/ntp/scripts/ntp-wait/ntp-wait.1ntp-waitmdoc
/freebsd-9.3-release/contrib/ntp/scripts/ntp-wait/ntp-wait.html
/freebsd-9.3-release/contrib/ntp/scripts/ntp-wait/ntp-wait.man.in
/freebsd-9.3-release/contrib/ntp/scripts/ntp-wait/ntp-wait.mdoc.in
/freebsd-9.3-release/contrib/ntp/scripts/ntpsweep/invoke-ntpsweep.texi
/freebsd-9.3-release/contrib/ntp/scripts/ntpsweep/ntpsweep-opts
/freebsd-9.3-release/contrib/ntp/scripts/ntpsweep/ntpsweep.1ntpsweepman
/freebsd-9.3-release/contrib/ntp/scripts/ntpsweep/ntpsweep.1ntpsweepmdoc
/freebsd-9.3-release/contrib/ntp/scripts/ntpsweep/ntpsweep.html
/freebsd-9.3-release/contrib/ntp/scripts/ntpsweep/ntpsweep.man.in
/freebsd-9.3-release/contrib/ntp/scripts/ntpsweep/ntpsweep.mdoc.in
/freebsd-9.3-release/contrib/ntp/scripts/ntptrace/invoke-ntptrace.texi
/freebsd-9.3-release/contrib/ntp/scripts/ntptrace/ntptrace-opts
/freebsd-9.3-release/contrib/ntp/scripts/ntptrace/ntptrace.1ntptraceman
/freebsd-9.3-release/contrib/ntp/scripts/ntptrace/ntptrace.1ntptracemdoc
/freebsd-9.3-release/contrib/ntp/scripts/ntptrace/ntptrace.html
/freebsd-9.3-release/contrib/ntp/scripts/ntptrace/ntptrace.man.in
/freebsd-9.3-release/contrib/ntp/scripts/ntptrace/ntptrace.mdoc.in
/freebsd-9.3-release/contrib/ntp/scripts/plot_summary-opts
/freebsd-9.3-release/contrib/ntp/scripts/plot_summary.1plot_summaryman
/freebsd-9.3-release/contrib/ntp/scripts/plot_summary.1plot_summarymdoc
/freebsd-9.3-release/contrib/ntp/scripts/plot_summary.html
/freebsd-9.3-release/contrib/ntp/scripts/plot_summary.man.in
/freebsd-9.3-release/contrib/ntp/scripts/plot_summary.mdoc.in
/freebsd-9.3-release/contrib/ntp/scripts/summary-opts
/freebsd-9.3-release/contrib/ntp/scripts/summary.1summaryman
/freebsd-9.3-release/contrib/ntp/scripts/summary.1summarymdoc
/freebsd-9.3-release/contrib/ntp/scripts/summary.html
/freebsd-9.3-release/contrib/ntp/scripts/summary.man.in
/freebsd-9.3-release/contrib/ntp/scripts/summary.mdoc.in
/freebsd-9.3-release/contrib/ntp/scripts/update-leap/invoke-update-leap.texi
/freebsd-9.3-release/contrib/ntp/scripts/update-leap/update-leap-opts
/freebsd-9.3-release/contrib/ntp/scripts/update-leap/update-leap.1update-leapman
/freebsd-9.3-release/contrib/ntp/scripts/update-leap/update-leap.1update-leapmdoc
/freebsd-9.3-release/contrib/ntp/scripts/update-leap/update-leap.html
/freebsd-9.3-release/contrib/ntp/scripts/update-leap/update-leap.man.in
/freebsd-9.3-release/contrib/ntp/scripts/update-leap/update-leap.mdoc.in
/freebsd-9.3-release/contrib/ntp/sntp/configure
/freebsd-9.3-release/contrib/ntp/sntp/include/version.def
/freebsd-9.3-release/contrib/ntp/sntp/include/version.texi
/freebsd-9.3-release/contrib/ntp/sntp/invoke-sntp.texi
/freebsd-9.3-release/contrib/ntp/sntp/m4/ntp_libevent.m4
/freebsd-9.3-release/contrib/ntp/sntp/m4/ntp_problemtests.m4
/freebsd-9.3-release/contrib/ntp/sntp/m4/version.m4
/freebsd-9.3-release/contrib/ntp/sntp/networking.c
/freebsd-9.3-release/contrib/ntp/sntp/sntp-opts.c
/freebsd-9.3-release/contrib/ntp/sntp/sntp-opts.h
/freebsd-9.3-release/contrib/ntp/sntp/sntp.1sntpman
/freebsd-9.3-release/contrib/ntp/sntp/sntp.1sntpmdoc
/freebsd-9.3-release/contrib/ntp/sntp/sntp.html
/freebsd-9.3-release/contrib/ntp/sntp/sntp.man.in
/freebsd-9.3-release/contrib/ntp/sntp/sntp.mdoc.in
/freebsd-9.3-release/contrib/ntp/sntp/tests/keyFile.c
/freebsd-9.3-release/contrib/ntp/sntp/tests/kodDatabase.c
/freebsd-9.3-release/contrib/ntp/sntp/tests/kodFile.c
/freebsd-9.3-release/contrib/ntp/sntp/tests/run-kodDatabase.c
/freebsd-9.3-release/contrib/ntp/sntp/tests/run-t-log.c
/freebsd-9.3-release/contrib/ntp/sntp/tests/t-log.c
/freebsd-9.3-release/contrib/ntp/sntp/tests/utilities.c
/freebsd-9.3-release/contrib/ntp/sntp/unity/unity_internals.h
/freebsd-9.3-release/contrib/ntp/sntp/version.c
/freebsd-9.3-release/contrib/ntp/tests/bug-2803/bug-2803.c
/freebsd-9.3-release/contrib/ntp/tests/bug-2803/run-bug-2803.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/a_md5encrypt.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/authkeys.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/buftvtots.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/calendar.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/caljulian.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/clocktime.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/decodenetnum.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/humandate.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/lfpfunc.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/lfptostr.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/modetoa.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/msyslog.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/netof.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/numtoa.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/numtohost.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/octtoint.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/prettydate.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/recvbuff.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/refidsmear.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/refnumtoa.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-a_md5encrypt.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-calendar.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-decodenetnum.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-humandate.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-lfpfunc.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-lfptostr.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-modetoa.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-msyslog.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-netof.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-numtoa.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-numtohost.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-prettydate.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-refnumtoa.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-sfptostr.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-socktoa.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-statestr.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-strtolfp.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-timespecops.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-timevalops.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/run-uglydate.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/sfptostr.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/socktoa.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/statestr.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/strtolfp.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/timespecops.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/timevalops.c
/freebsd-9.3-release/contrib/ntp/tests/libntp/uglydate.c
/freebsd-9.3-release/contrib/ntp/tests/ntpd/leapsec.c
/freebsd-9.3-release/contrib/ntp/tests/ntpd/ntp_prio_q.c
/freebsd-9.3-release/contrib/ntp/tests/ntpd/ntp_restrict.c
/freebsd-9.3-release/contrib/ntp/tests/ntpd/rc_cmdlength.c
/freebsd-9.3-release/contrib/ntp/tests/ntpd/run-leapsec.c
/freebsd-9.3-release/contrib/ntp/tests/ntpd/run-ntp_restrict.c
/freebsd-9.3-release/contrib/ntp/tests/ntpd/run-rc_cmdlength.c
/freebsd-9.3-release/contrib/ntp/tests/ntpd/run-t-ntp_signd.c
/freebsd-9.3-release/contrib/ntp/tests/ntpd/t-ntp_scanner.c
/freebsd-9.3-release/contrib/ntp/tests/ntpd/t-ntp_signd.c
/freebsd-9.3-release/contrib/ntp/tests/sandbox/run-uglydate.c
/freebsd-9.3-release/contrib/ntp/tests/sandbox/smeartest.c
/freebsd-9.3-release/contrib/ntp/tests/sandbox/uglydate.c
/freebsd-9.3-release/contrib/ntp/tests/sec-2853/sec-2853.c
/freebsd-9.3-release/contrib/ntp/util/invoke-ntp-keygen.texi
/freebsd-9.3-release/contrib/ntp/util/ntp-keygen-opts.c
/freebsd-9.3-release/contrib/ntp/util/ntp-keygen-opts.h
/freebsd-9.3-release/contrib/ntp/util/ntp-keygen.1ntp-keygenman
/freebsd-9.3-release/contrib/ntp/util/ntp-keygen.1ntp-keygenmdoc
/freebsd-9.3-release/contrib/ntp/util/ntp-keygen.c
/freebsd-9.3-release/contrib/ntp/util/ntp-keygen.html
/freebsd-9.3-release/contrib/ntp/util/ntp-keygen.man.in
/freebsd-9.3-release/contrib/ntp/util/ntp-keygen.mdoc.in
/freebsd-9.3-release/etc/Makefile
/freebsd-9.3-release/include/Makefile
/freebsd-9.3-release/lib/libc
/freebsd-9.3-release/lib/libc/yp/yplib.c
/freebsd-9.3-release/sys
/freebsd-9.3-release/sys/amd64/linux32/linux32_proto.h
/freebsd-9.3-release/sys/amd64/linux32/linux32_systrace_args.c
/freebsd-9.3-release/sys/amd64/linux32/syscalls.master
linux_futex.c
linux_misc.c
/freebsd-9.3-release/sys/conf/newvers.sh
/freebsd-9.3-release/sys/contrib/pf/net/pf.c
/freebsd-9.3-release/sys/contrib/pf/net/pf_ioctl.c
/freebsd-9.3-release/sys/contrib/pf/net/pf_norm.c
/freebsd-9.3-release/sys/contrib/pf/net/pfvar.h
/freebsd-9.3-release/sys/i386/linux/syscalls.master
/freebsd-9.3-release/sys/kern/kern_prot.c
/freebsd-9.3-release/sys/netinet/tcp_output.c
/freebsd-9.3-release/sys/netinet6/ip6_output.c
/freebsd-9.3-release/sys/netinet6/ip6_var.h
/freebsd-9.3-release/sys/netinet6/sctp6_usrreq.c
/freebsd-9.3-release/sys/sys/ucred.h
/freebsd-9.3-release/usr.sbin/ntp
/freebsd-9.3-release/usr.sbin/ntp/config.h
/freebsd-9.3-release/usr.sbin/ntp/doc/ntp-keygen.8
/freebsd-9.3-release/usr.sbin/ntp/doc/ntp.conf.5
/freebsd-9.3-release/usr.sbin/ntp/doc/ntp.keys.5
/freebsd-9.3-release/usr.sbin/ntp/doc/ntpd.8
/freebsd-9.3-release/usr.sbin/ntp/doc/ntpdc.8
/freebsd-9.3-release/usr.sbin/ntp/doc/ntpq.8
/freebsd-9.3-release/usr.sbin/ntp/doc/sntp.8
/freebsd-9.3-release/usr.sbin/ntp/scripts/mkver
267654 20-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


267006 03-Jun-2014 dchagin

MFC r266924:

Glibc was switched to the FUTEX_WAIT_BITSET op and CLOCK_REALTIME
flag has been added instead of FUTEX_WAIT to replace the FUTEX_WAIT
logic which needs to do gettimeofday() calls before the futex syscall
to convert the absolute timeout to a relative timeout.
Before this the CLOCK_MONOTONIC used by the FUTEX_WAIT_BITSET op.

When the FUTEX_CLOCK_REALTIME is specified the timeout is an absolute
time, not a relative time. Rework futex_wait to handle this.
On the side fix the futex leak in error case and remove useless
parentheses.

Properly calculate the timeout for the CLOCK_MONOTONIC case.

Tested by: Hans Petter Selasky

Approved by: re (glebius)


266980 02-Jun-2014 dchagin

MFC r266782:

In r218101 I have not changed properly the futex syscall definition.
Some Linux futex ops atomically verifies that the futex address uaddr
(uval) contains the value val. Comparing signed uval and unsigned val
may lead to an unexpected result, mostly to a deadlock.

So copyin uaddr to an unsigned int to compare the parameters correctly.

While here change ktr records to print parameters in more readable format.

Approved by: re (glebius)


262057 17-Feb-2014 avg

MFC r258622,258675: dtrace sdt: remove the ugly sname parameter of
SDT_PROBE_DEFINE


262056 17-Feb-2014 avg

MFC r255971: Fix some typos that were causing probe argument types to
show up as unknown

MFC slacker: markj


255743 20-Sep-2013 markj

MFC r254467:
Remove a couple of unused macros.


248532 19-Mar-2013 jkim

MFC: r234352

Implement pipe2 syscall for Linuxulator.


248085 09-Mar-2013 marius

MFC: r227309 (partial)

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


247558 01-Mar-2013 jhb

MFC 245849:
Don't assume that all Linux TCP-level socket options are identical to
FreeBSD TCP-level socket options (only the first two are). Instead,
using a mapping function and fail unsupported options as we do for other
socket option levels.


246292 03-Feb-2013 dchagin

MFC r245908:

Arithmetic on pointers takes into account the size of the type.
Properly cast the pointer to avoid incorrect pointer scaling.


246291 03-Feb-2013 dchagin

MFC r240387 (by kevlo@):
Remove redundant check.


246290 03-Feb-2013 dchagin

MFC r235063 (by netchild@):
- >500 static DTrace probes for the linuxulator
- DTrace scripts to check for errors, performance, ...
they serve mostly as examples of what you can do with the static probes
with moderate load the scripts may be overwhelmed, excessive lock-tracing
may influence program behavior (see the last design decission)

Design decissions:
- use "linuxulator" as the provider for the native bitsize; add the
bitsize for the non-native emulation (e.g. "linuxuator32" on amd64)
- Add probes only for locks which are acquired in one function and released
in another function. Locks which are aquired and released in the same
function should be easy to pair in the code, inter-function
locking is more easy to verify in DTrace.
- Probes for locks should be fired after locking and before releasing to
prevent races (to provide data/function stability in DTrace, see the
man-page of "dtrace -v ..." and the corresponding DTrace docs).

Manual merge futex part of r227293 (by ed@):
Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.

MFC r235078 (by uqs@):
Fix make depend.


244660 24-Dec-2012 kib

MFC r242476:
The r241025 fixed the case when a binary, executed from nullfs mount,
was still possible to open for write from the lower filesystem. There
is a symmetric situation where the binary could already has file
descriptors opened for write, but it can be executed from the nullfs
overlay.

Handle the issue by passing one v_writecount reference to the lower
vnode if nullfs vnode has non-zero v_writecount.


244658 24-Dec-2012 kib

MFC r241025:
Fix the mis-handling of the VV_TEXT on the nullfs vnodes.
Add a set of VOPs for the VV_TEXT query, set and clear operations,
which are correctly bypassed to lower vnode.


243417 22-Nov-2012 simon

Fix multiple Denial of Service vulnerabilities with named(8).

Fix insufficient message length validation for EAP-TLS messages.

Fix Linux compatibility layer input validation error.

Security: FreeBSD-SA-12:06.bind
Security: FreeBSD-SA-12:07.hostapd
Security: FreeBSD-SA-12:08.linux
Security: CVE-2012-4244, CVE-2012-5166, CVE-2012-4445, CVE-2012-4576
Approved by: re
Approved by: security-officer


239843 29-Aug-2012 kib

MFC r238029:
Extend the KPI to lock and unlock f_offset member of struct file. It
now fully encapsulates all accesses to f_offset, and extends f_offset
locking to other consumers that need it, in particular, to lseek() and
variants of getdirentries().


232405 02-Mar-2012 ed

MFC r231378:

Remove direct access to si_name.

Code should just use the devtoname() function to obtain the name of a
character device. Also add const keywords to pieces of code that need it
to build properly.


232387 02-Mar-2012 kib

MFC r231885:
Fix misuse of the kernel map in miscellaneous image activators.
Vnode-backed mappings cannot be put into the kernel map, since it is a
system map.


231145 07-Feb-2012 jhb

MFC 228957:
Implement linux_fadvise64() and linux_fadvise64_64() using
kern_posix_fadvise().


229923 10-Jan-2012 dim

MFC r229402:

In sys/compat/linux/linux_ioctl.c, work around a warning when a pointer
is compared to an integer, by casting the pointer to l_uintptr_t. No
functional difference on both i386 and amd64.

Reviewed by: ed, jhb


226640 22-Oct-2011 brueffer

MFC: r226247, r226253

Properly free linux_gidset in case of an error.

Approved by: re (kib)


226231 10-Oct-2011 jkim

MFC: r226068, r226069, r226071, r226072, r226073, r226074, r226078, r226079

- Unroll inlined strnlen(9) and make it easier to read.
- Inline do_sa_get() function and remove an unused return value.
- Retern more appropriate errno when Linux path name is too long.
- Restore the original socket address length if it was not really AF_INET6.
- Make sure to ignore the leading NULL byte from Linux abstract namespace.
- Use uint32_t instead of u_int32_t. Fix style(9) nits.
- Remove a now-defunct variable.
- Use the caculated length instead of maximum length.

Approved by: re (kib)


226023 04-Oct-2011 cperciva

Fix a bug in UNIX socket handling in the linux emulator which was
exposed by the security fix in FreeBSD-SA-11:05.unix.

Approved by: so (cperciva)
Approved by: re (kib)
Security: Related to FreeBSD-SA-11:05.unix, but not actually
a security fix.


225736 23-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


225617 16-Sep-2011 kmacy

In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by: rwatson
Approved by: re (bz)


224987 18-Aug-2011 jonathan

Add experimental support for process descriptors

A "process descriptor" file descriptor is used to manage processes
without using the PID namespace. This is required for Capsicum's
Capability Mode, where the PID namespace is unavailable.

New system calls pdfork(2) and pdkill(2) offer the functional equivalents
of fork(2) and kill(2). pdgetpid(2) allows querying the PID of the remote
process for debugging purposes. The currently-unimplemented pdwait(2) will,
in the future, allow querying rusage/exit status. In the interim, poll(2)
may be used to check (and wait for) process termination.

When a process is referenced by a process descriptor, it does not issue
SIGCHLD to the parent, making it suitable for use in libraries---a common
scenario when using library compartmentalisation from within large
applications (such as web browsers). Some observers may note a similarity
to Mach task ports; process descriptors provide a subset of this behaviour,
but in a UNIX style.

This feature is enabled by "options PROCDESC", but as with several other
Capsicum kernel features, is not enabled by default in GENERIC 9.0.

Reviewed by: jhb, kib
Approved by: re (kib), mentor (rwatson)
Sponsored by: Google Inc


224778 11-Aug-2011 rwatson

Second-to-last commit implementing Capsicum capabilities in the FreeBSD
kernel for FreeBSD 9.0:

Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *. With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.

Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.

In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.

Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.

Approved by: re (bz)
Submitted by: jonathan
Sponsored by: Google Inc


224123 17-Jul-2011 bz

Remove the 'either' from the comment as it'll be less obvious that we
removed semmap in a bit of time from now. Re-wrap.

Suggested by: jhb


224016 14-Jul-2011 bz

Remove semaphore map entry count "semmap" field and its tuning
option that is highly recommended to be adjusted in too much
documentation while doing nothing in FreeBSD since r2729 (rev 1.1).

ipcs(1) needs to be recompiled as it is accessing _KERNEL private
variables.

Reviewed by: jhb (before comment change on linux code)
Sponsored by: Sandvine Incorporated


221434 04-May-2011 netchild

Commit the missing linux_videdev2_compat.h (lost somewhere between
commit tree patch generation -> successful compile tree build test -> commmit).

Pointy hat to: netchild


221428 04-May-2011 netchild

Add FEATURE macros for v4l and v4l2 to the linuxulator.

Suggested by: ae


221426 04-May-2011 netchild

This is v4l2 support for the linuxulator. This allows to access FreeBSD
native devices which support the v4l2 API from processes running within
the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd
or multimedia/webcamd supplied drivers.

Submitted by: nox
MFC after: 1 month


221425 04-May-2011 netchild

Fix typo in comment, improve comment.


221424 04-May-2011 netchild

Add explanation about the use-permission and FreeBSDify it.


221423 04-May-2011 netchild

Copy the v4l2 header unchanged from the vendor branch.


220373 05-Apr-2011 trasz

Add accounting for most of the memory-related resources.

Sponsored by: The FreeBSD Foundation
Reviewed by: kib (earlier version)


220186 31-Mar-2011 avg

Revert r220032:linux compat: add SO_PASSCRED option with basic handling

I have not properly thought through the commit. After r220031 (linux
compat: improve and fix sendmsg/recvmsg compatibility) the basic
handling for SO_PASSCRED is not sufficient as it breaks recvmsg
functionality for SCM_CREDS messages because now we would need to handle
sockcred data in addition to cmsgcred. And that is not implemented yet.

Pointyhat to: avg


220032 26-Mar-2011 avg

linux compat: add SO_PASSCRED option with basic handling

This seems to have been a part of a bigger patch by dchagin that either
haven't been committed or committed partially.

Submitted by: dchagin, nox
MFC after: 2 weeks


220031 26-Mar-2011 avg

linux compat: improve and fix sendmsg/recvmsg compatibility

- implement baseic stubs for capget, capset, prctl PR_GET_KEEPCAPS
and prctl PR_SET_KEEPCAPS.
- add SCM_CREDS support to sendmsg and recvmsg
- modify sendmsg to ignore control messages if not using UNIX
domain sockets

This should allow linux pulse audio daemon and client work on FreeBSD
and interoperate with native counter-parts modulo the differences in
pulseaudio versions.

PR: kern/149168
Submitted by: John Wehle <john@feith.com>
Reviewed by: netchild
MFC after: 2 weeks


219668 15-Mar-2011 netchild

Staticize functions which are not used somewhere else, move the
corresponding prototypes from the header to the code file.


219558 12-Mar-2011 dchagin

Style(9) fixes. No functional changes.

MFC after: 2 Week


219460 10-Mar-2011 jhb

Remove now-obsolete comment.

Submitted by: netchild
MFC after: 1 week


219421 09-Mar-2011 dchagin

Indeed, remove bogus since r219405 check of the Linux ABI.

Pointed out: jhb

MFC after: 2 Week


219405 08-Mar-2011 dchagin

Extend struct sysvec with new method sv_schedtail, which is used for an
explicit process at fork trampoline path instead of eventhadler(schedtail)
invocation for each child process.

Remove eventhandler(schedtail) code and change linux ABI to use newly added
sysvec method.

While here replace explicit comparing of module sysentvec structure with the
newly created process sysentvec to detect the linux ABI.

Discussed with: kib

MFC after: 2 Week


219242 03-Mar-2011 dchagin

Print out shared flag for debug purpose.

MFC after: 1 Week


219240 03-Mar-2011 dchagin

Switch PROCESS_SHARE to AUTO_SHARE (as umtx do). Even for SHARED,
if page mapped MAP_ANON linux uses private algorithm too.

Disscussed with: jhb

MFC after: 3 Days


218970 23-Feb-2011 jhb

Use umtx_key objects to uniquely identify futexes. Private futexes in
different processes that happen to use the same user address in the
separate processes will now be treated as distinct futexes rather than the
same futex. We can now honor shared futexes properly by mapping them to a
PROCESS_SHARED umtx_key. Private futexes use THREAD_SHARED umtx_key
objects.

In conjunction with: dchagin
Reviewed by: kib
MFC after: 1 week


218879 20-Feb-2011 dchagin

Do not clobber %rdx.
Before calling vfork() syscall the linux user-space stores the current PID
in the %rdx and restore it when the parent process will leave the kernel.


218720 15-Feb-2011 dchagin

For realtime signals fill the sigval value.


218719 15-Feb-2011 dchagin

Make a linux_rt_sigtimedwait() system call is actually working.

1) Translate the native signal number in the appropriate Linux signal.
2) Remove bogus code, which can lead to a panic as it calls
kern_sigtimedwait with same ksiginfo.
3) Return the corresponding signal number.


218718 15-Feb-2011 dchagin

Style(9) fix. Wrap long lines in linux_rt_sigtimedwait().


218717 15-Feb-2011 dchagin

Put the macro declaration in the relevant include file for future use.


218686 14-Feb-2011 dchagin

Style(9) fix. Do not initialize variables in the declarations.


218668 13-Feb-2011 dchagin

Sort include files in the alphabetical order.


218655 13-Feb-2011 dchagin

Remove comment about 'ftlk' LOR.


218654 13-Feb-2011 dchagin

Stop printing the LOR, as this is expected behavior.


218646 13-Feb-2011 dchagin

The bitset field of freshly created futex should be initialized explicity.
Otherwise, REQUEUE operations fails.


218621 12-Feb-2011 dchagin

Rename used_requeue and use it as bitwise field to store more flags.
Reimplement used_requeue logic with LINUX_XDEPR_REQUEUEOP flag.


218618 12-Feb-2011 dchagin

Slightly rewrite linux_fork:

1) Remove bogus error checking.
2) A new process exit from kernel through fork_trampoline(),
so remove bogus check.


218617 12-Feb-2011 dchagin

Remove bogus include <machine/frame.h>


218616 12-Feb-2011 dchagin

Move linux_clone(), linux_fork(), linux_vfork() to a MI path.


218497 09-Feb-2011 netchild

Linux' shm_open() fails because it wants to find some funky shmfs
to construct the full pathname. It starts to search at the default
mountpoint which is /dev/shm. If this fails it runs through fstab
and searches for shmfs and tmpfs. Whatever it finds will be
statfs()'ed to be checked for Linux' fs magic for shmfs (0x01021994).

Ideally our tmpfs should deliver this fs magic to Linux processes, but
as our tmpfs is considered to be an experimental feature we can not
assume that there is always a tmpfs available.

To make shared memory work in the Linuxulator, force the fs type of
/dev/shm (which can be a symlink) to match what Linux expects. The user
is responsible (info has to be added to the linux base ports and the docs)
to setup a suitable link for /dev/shm.

Noticed by: Andre Albsmeier <Andre.Albsmeier@siemens.com>
Submitted by: Andre Albsmeier <Andre.Albsmeier@siemens.com>
MFC after: 1 month


218118 31-Jan-2011 dchagin

Yet another unimplemented futex operation, print out about.

Submitted by: arundel
MFC after: 1 month.


218117 31-Jan-2011 dchagin

Implement a futex BITSET op.

Submitted by: arundel
MFC after: 1 month.


218031 28-Jan-2011 dchagin

Style(9) fixes.

MFC after: 1 Month.


218030 28-Jan-2011 dchagin

Implement a variation of the linux_common_wait() which should
be used by linuxolator itself.

Move linux_wait4() to MD path as it requires native struct
rusage translation to struct l_rusage on linux32/amd64.

MFC after: 1 Month.


218005 28-Jan-2011 dchagin

Style(9) fix.

MFC after: 1 month.


217743 23-Jan-2011 dchagin

Style(9) fix.

Approved by: kib(mentor)
MFC after: 1 month


217578 19-Jan-2011 kib

In linuxolator getdents_common(), it seems there is no reason to loop
if no records where returned by VOP_READDIR(). Readdir implementations
allowed to return 0 records when first record is larger then supplied
buffer. In this case trying to execute VOP_READDIR() again causes the
syscall looping forewer.

The goto was there from the day 1, which goes back to 1995 year.

Reported and tested by: Beat G?tzi <beat chruetertee ch>
MFC after: 2 weeks


216813 30-Dec-2010 scf

Fix the LINUX_SOUND_MIXER_INFO ioctl to return success after the
information is set to FreeBSD. It had been falling through to the end
of linux_ioctl_sound() and returning ENOIOCTL. Noticed when running the
Linux ALSA amixer tool.

Add a LINUX_SOUND_MIXER_READ_CAPS ioctl which is used by the Skype
v2.1.0.81 binary.

Reviewed by: gavin
MFC after: 2 weeks


215706 22-Nov-2010 dim

Fix linux kernel module breakage introduced in r215675, by including
<sys/sysent.h>.

Noticed by: many
Pointy hat to: netchild


215675 22-Nov-2010 netchild

Do not take the process lock. The assignment to u_short inside the
properly aligned structure is atomic on all supported architectures, and
the thread that should see side-effect of assignment is the same thread
that does assignment.

Use a more appropriate conditional to detect the linux ABI.

Suggested by: kib
X-MFC: together with r215664


215666 22-Nov-2010 netchild

Remove trailing dot from the unimplemented futex messages to make
them consistent with the syscall and ipc messages.

Submitted by: arundel
MFC after: 3 days


215664 22-Nov-2010 netchild

By using the 32-bit Linux version of Sun's Java Development Kit 1.6
on FreeBSD (amd64), invocations of "javac" (or "java") eventually
end with the output of "Killed" and exit code 137.

This is caused by:
1. After calling exec() in multithreaded linux program threads are not
destroyed and continue running. They get killed after program being
executed finishes.

2. linux_exit_group doesn't return correct exit code when called not
from group leader. Which happens regularly using sun jvm.

The submitters fix this in a similar way to how NetBSD handles this.

I took the PRs away from dchagin, who seems to be out of touch of
this since a while (no response from him).

The patches committed here are from [2], with some little modifications
from me to the style.

PR: 141439 [1], 144194 [2]
Submitted by: Stefan Schmidt <stefan.schmidt@stadtbuch.de>, gk
Reviewed by: rdivacky (in april 2010)
MFC after: 5 days


215339 15-Nov-2010 netchild

Some style(9) fixes.

Submitted by: arundel
MFC after: 1 week


215338 15-Nov-2010 netchild

- print out the PID and program name of the program trying to use an
unsupported futex operation
- for those futex operations which are known to be not supported,
print out which futex operation it is
- shortcut the error return of the unsupported FUTEX_CLOCK_REALTIME in
some cases:
FUTEX_CLOCK_REALTIME can be used to tell linux to use
CLOCK_REALTIME instead of CLOCK_MONOTONIC. FUTEX_CLOCK_REALTIME
however must only be set, if either FUTEX_WAIT_BITSET or
FUTEX_WAIT_REQUEUE_PI are set too. If that's not the case
we can die with ENOSYS right at the beginning.

Submitted by: arundel
Reviewed by: rdivacky (earlier iteration of the patch)
MFC after: 1 week


213846 14-Oct-2010 kib

Remove stale comment.

Submitted by: arundel
MFC after: 3 days


213490 06-Oct-2010 jkim

Simplify timeout check in futex_wait() using itimerfix() and return error
if the given timeout is invalid. Consistently use int type for timeout and
correct a format string in futex_sleep().


213471 06-Oct-2010 netchild

Fix a comparision of an uninitialised pointer.

Submitted by: arundel
Found by: clang analysis (automatic service by uqs@)
Reviewed by: rdivacky


212425 10-Sep-2010 mdf

Replace sbuf_overflowed() with sbuf_error(), which returns any error
code associated with overflow or with the drain function. While this
function is not expected to be used often, it produces more information
in the form of an errno that sbuf_overflowed() did.


209592 29-Jun-2010 jhb

Tweak the in-kernel API for sending signals to threads:
- Rename tdsignal() to tdsendsignal() and make it private to kern_sig.c.
- Add tdsignal() and tdksignal() routines that mirror psignal() and
pksignal() except that they accept a thread as an argument instead of
a process. They send a signal to a specific thread rather than to an
individual process.

Reviewed by: kib


208486 24-May-2010 wkoszek

Bring USB fixes for linux(4).

Intention of this commit is to let us take a full advantage
of libusb(8) ported to Linux. This decreases a possibility of getting
any collisions within ioctl() "command" space, especially with
relation to LINUX_SNDCTL_SEQ... stuff.

Basically, we provide commands, that will be mapped in the kernel
to correct ones and forward those to the USB layer. Port enabling
functionality brought with this patch is here:

http://www.freebsd.org/cgi/query-pr.cgi?pr=146895

Bump __FreeBSD_version to catch, since which version installing a
port makes sense.

This patch should bring no regressions. So far, only i386 is tested.

Tested by: thompsa@
Reviewed by: thompsa@
OKed by: netchild@


207569 03-May-2010 netchild

- #ifdef out the cliplist part, skype seems like using an uninitialized
variable and can cause problems, without the cliplist handling it works
without problems
- improve the cliplist error handling
- fix VIDIOCGTUNER and VIDIOCSMICROCODE (still no hardware available to test)

Submitted by: J.R. Oldroyd <jr@opal.com>
X-MFC after: soon (together with all the v4l stuff)


205792 28-Mar-2010 ed

Rename st_*timespec fields to st_*tim for POSIX 2008 compliance.

A nice thing about POSIX 2008 is that it finally standardizes a way to
obtain file access/modification/change times in sub-second precision,
namely using struct timespec, which we already have for a very long
time. Unfortunately POSIX uses different names.

This commit adds compatibility macros, so existing code should still
build properly. Also change all source code in the kernel to work
without any of the compatibility macros. This makes it all a less
ambiguous.

I am also renaming st_birthtime to st_birthtim, even though it was a
local extension anyway. It seems Cygwin also has a st_birthtim.


205678 26-Mar-2010 netchild

Fix some problems which may lead to a panic:
- right order of src and dst in memcpy
- NULL out the clips after freeing to prevent an accident

Noticed by: hselasky


205423 21-Mar-2010 ed

Actually make O_DIRECTORY work.

According to POSIX open() must return ENOTDIR when the path name does
not refer to a path name. Change vn_open() to respect this flag. This
also simplifies the Linuxolator a bit.


204523 01-Mar-2010 joel

The NetBSD Foundation has granted permission to remove clause 3 and 4 from
their software.

Obtained from: NetBSD


204068 18-Feb-2010 pjd

No need to include security/mac/mac_framework.h here.


203728 09-Feb-2010 delphij

- Return EAFNOSUPPORT instead of EINVAL for unsupported address family,
this matches the Linux behavior.
- Check if we have sufficient space allocated for socket structure, which
fixes a buffer overflow when wrong length is being passed into the
emulation layer. [1]

PR: kern/138860
Submitted by: Mateusz Guzik <mjguzik gmail com>
Reported by: Alexander Best [1]
MFC after: 2 weeks


202598 18-Jan-2010 wkoszek

Let us to use our libusb(3) in Linuxolator.

With this change, Linux binaries can work with our libusb(3) when
it's compiled against our header files on GNU/Linux system -- this
solves the problem with differences between /dev layouts.

With ported libusb(3), I am able to use my USB JTAG cable with Linux
binaries that support it.

Reviewed by: thompsa


202376 15-Jan-2010 netchild

Whitespace change to be able to provide the correct commit log for r202364:
---snip---
Add video clipping support but with the caveats below.

Background info:

Video clipping allows the user to provide either a series of clip rectangles
or a clip bitmap to the driver and have the driver mask the video according
to the clipping specs provided.

Adding support for clipping to the FreeBSD Linux emulator is problematic
because it seems that this feature is not supported by many drivers and
therefore it is ignored by many applications. Unfortunately, when not
using it, rather than passing in a null clipping list, some apps leave the
clipping fields uninitialized, casuing random values to be passed in. In
the case where the driver does not use the clipping info, this is not a
problem (although it is bad form). But the Linux emulator does not know
which drivers will use this and which won't, so the Linux emulator must
try to handle this clip list, and deal gracefully with cases where the
values seem to be uninitialized.

Video clipping info is passed in using the VIDIOCSWIN ioctl in two fields
in the video_window structure: the integer clipcount and the pointer clips.

How the linuxulator handles this from this commit on:

* if (clipcount == VIDEO_CLIP_BITMAP)
The clips variable is a void * pointer to a 128*625 byte
(1024*625 bit) memory area containing a bitmap of the clipping area.
The pointer in the video_window structure is copied, but no
video_clip structures are copied.
* if (clipcount > 0 && clipcount <= 16384)
The clips variable is pointer to a list of video_clip structures. Up
to clipcount structures are copied and passed to the driver.
The upper limit of 16384 was imposed here so that user code that does
not properly initialize clipcount falls through below and no attempt
is made to copy an uninitialized list. This value was found by
examining Linux drivers that support the clip list.
* else
The clipcount is either negative (but not VIDEO_CLIP_BITMAP), zero or
positive (> 16384).
All these cases are treated as invalid data. Both the clipcount field
and clips pointer are forced to zero/NULL and passed to the driver.

It should be noted that, at the time of developing this V4L emulator code,
the pwc(4) V4L driver does not support clipping.

Submitted by: J.R. Oldroyd <fbsd@opal.com>
MFC after: 1 month
---snip---


202364 15-Jan-2010 netchild

This is v4l support for the linuxulator. This allows to access FreeBSD
native devices which support the v4l API from processes running within
the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd driver.

Not tested is firmware upload, framebuffer stuff and video tuner stuff
due to lack of hardware.
The clipping part (VIDIOCSWIN) needs a little bit of further work (partly
in progress, but can not be tested due to lack of a suitable device).

The submitter tested this sucessfully with Skype and flash apps on amd64 and
i386 with the multimedia/pwcbsd driver.

Submitted by: J.R. Oldroyd <fbsd@opal.com>


202341 15-Jan-2010 brooks

Since all other comparisons involving ngroups_max use
"ngroups_max + 1", use ">= ngroups_max+1" instead of the equivalent
"> ngroups_max" to reduce confusion.


202143 12-Jan-2010 brooks

Replace the static NGROUPS=NGROUPS_MAX+1=1024 with a dynamic
kern.ngroups+1. kern.ngroups can range from NGROUPS_MAX=1023 to
INT_MAX-1. Given that the Windows group limit is 1024, this range
should be sufficient for most applications.

MFC after: 1 month


202113 11-Jan-2010 mckusick

Background:

When renaming a directory it passes through several intermediate
states. First its new name will be created causing it to have two
names (from possibly different parents). Next, if it has different
parents, its value of ".." will be changed from pointing to the old
parent to pointing to the new parent. Concurrently, its old name
will be removed bringing it back into a consistent state. When fsck
encounters an extra name for a directory, it offers to remove the
"extraneous hard link"; when it finds that the names have been
changed but the update to ".." has not happened, it offers to rewrite
".." to point at the correct parent. Both of these changes were
considered unexpected so would cause fsck in preen mode or fsck in
background mode to fail with the need to run fsck manually to fix
these problems. Fsck running in preen mode or background mode now
corrects these expected inconsistencies that arise during directory
rename. The functionality added with this update is used by fsck
running in background mode to make these fixes.

Solution:

This update adds three new fsck sysctl commands to support background
fsck in correcting expected inconsistencies that arise from incomplete
directory rename operations. They are:

setcwd(dirinode) - set the current directory to dirinode in the
filesystem associated with the snapshot.
setdotdot(oldvalue, newvalue) - Verify that the inode number for ".."
in the current directory is oldvalue then change it to newvalue.
unlink(nameptr, oldvalue) - Verify that the inode number associated
with nameptr in the current directory is oldvalue then unlink it.

As with all other fsck sysctls, these new ones may only be used by
processes with appropriate priviledge.

Reported by: jeff
Security issues: rwatson


201758 07-Jan-2010 mbr

Remove extraneous semicolons, no functional changes.

Submitted by: Marc Balmer <marc@msys.ch>
MFC after: 1 week


200667 18-Dec-2009 kib

Signal 0 is used to check the permission for current process to signal
target one. Since r184058, linux_do_tkill() calls tdsignal() instead of
kill(), without checking for validity of supplied signal number. Prevent
panic when supplied signal is 0 by finishing work after checks.

Found and tested by: scf
MFC after: 3 days


200110 04-Dec-2009 netchild

This is v4l support for the linuxulator. This allows to access FreeBSD
native devices which support the v4l API from processes running within
the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd driver.

Not tested is firmware upload, framebuffer stuff and video tuner stuff
due to lack of hardware.
The clipping part (VIDIOCSWIN) needs a little bit of further work (partly
in progress, but can not be tested due to lack of a suitable device).

The submitter tested this sucessfully with Skype and flash apps on amd64 and
i386 with the multimedia/pwcbsd driver.

Submitted by: J.R. Oldroyd <fbsd@opal.com>


200109 04-Dec-2009 netchild

Import the unchanged v4l videodev.h from the vendor branch.


198945 05-Nov-2009 netchild

Fix typo in kernel message. The fix is based upon the patch in the PR.

PR: kern/140279
Submitted by: Alexander Best <alexbestms@math.uni-muenster.de>
MFC after: 1 week


198467 25-Oct-2009 bz

Unconditionally call the setsockopt for IPV6_V6ONLY for v6 linux sockets
no matter whether we are compiled as module or if our default of the
net.inet6.ip6.v6only sysctl already matches what we would set.

This avoids unnecessary complications with modules, VIMAGES, INET6 and
the sysctl value, especially considering that most users will use
linux compat as a module.

Discussed with: kib, rwatson (weeks ago)
Reviewed by: rwatson
MFC after: 6 weeks


197176 13-Sep-2009 zec

Lock the ifnet list while iterating over it.

Submitted by: julian
MFC after: 3 days


197049 09-Sep-2009 kib

kern_select(9) copies fd_set in and out of userspace in quantities of
longs. Since 32bit processes longs are 4 bytes, 64bit kernel may copy in
or out 4 bytes more then the process expected.

Calculate the amount of bytes to copy taking into account size of fd_set
for the current process ABI.

Diagnosed and tested by: Peter Jeremy <peterjeremy acm org>
Reviewed by: jhb
MFC after: 1 week


196635 28-Aug-2009 zec

Fix a few panics in linuxulator + VIMAGE due to curvnet not being set.

This change affects only options VIMAGE builds.

Reviewed by: julian
MFC after: 3 days


196481 23-Aug-2009 rwatson

Rework global locks for interface list and index management, correcting
several critical bugs, including race conditions and lock order issues:

Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an
sxlock. Either can be held to stablize the lists and indexes, but both
are required to write. This allows the list to be held stable in both
network interrupt contexts and sleepable user threads across sleeping
memory allocations or device driver interactions. As before, writes to
the interface list must occur from sleepable contexts.

Reviewed by: bz, julian
MFC after: 3 days


196019 01-Aug-2009 rwatson

Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)


195870 25-Jul-2009 jamie

Some jail parameters (in particular, "ip4" and "ip6" for IP address
restrictions) were found to be inadequately described by a boolean.
Define a new parameter type with three values (disable, new, inherit)
to handle these and future cases.

Approved by: re (kib), bz (mentor)
Discussed with: rwatson


195699 14-Jul-2009 rwatson

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)


195104 27-Jun-2009 rwatson

Replace AUDIT_ARG() with variable argument macros with a set more more
specific macros for each audit argument type. This makes it easier to
follow call-graphs, especially for automated analysis tools (such as
fxr).

In MFC, we should leave the existing AUDIT_ARG() macros as they may be
used by third-party kernel modules.

Suggested by: brooks
Approved by: re (kib)
Obtained from: TrustedBSD Project
MFC after: 1 week


194910 24-Jun-2009 jhb

Change the ABI of some of the structures used by the SYSV IPC API:
- The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned
short.
- The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned
short.
- The mode member of struct ipc_perm is now mode_t instead of unsigned short
(this is merely a style bug).
- The rather dubious padding fields for ABI compat with SV/I386 have been
removed from struct msqid_ds and struct semid_ds.
- The shm_segsz member of struct shmid_ds is now a size_t instead of an
int. This removes the need for the shm_bsegsz member in struct
shmid_kernel and should allow for complete support of SYSV SHM regions
>= 2GB.
- The shm_nattch member of struct shmid_ds is now an int instead of a
short.
- The shm_internal member of struct shmid_ds is now gone. The internal
VM object pointer for SHM regions has been moved into struct
shmid_kernel.
- The existing __semctl(), msgctl(), and shmctl() system call entries are
now marked COMPAT7 and new versions of those system calls which support
the new ABI are now present.
- The new system calls are assigned to the FBSD-1.1 version in libc. The
FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls.
- A simplistic framework for tagging system calls with compatibility
symbol versions has been added to libc. Version tags are added to
system calls by adding an appropriate __sym_compat() entry to
src/lib/libc/incldue/compat.h. [1]

PR: kern/16195 kern/113218 bin/129855
Reviewed by: arch@, rwatson
Discussed with: kan, kib [1]


194739 23-Jun-2009 bz

After cleaning up rt_tables from vnet.h and cleaning up opt_route.h
a lot of files no longer need route.h either. Garbage collect them.
While here remove now unneeded vnet.h #includes as well.


194498 19-Jun-2009 brooks

Rework the credential code to support larger values of NGROUPS and
NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024
and 1023 respectively. (Previously they were equal, but under a close
reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it
is the number of supplemental groups, not total number of groups.)

The bulk of the change consists of converting the struct ucred member
cr_groups from a static array to a pointer. Do the equivalent in
kinfo_proc.

Introduce new interfaces crcopysafe() and crsetgroups() for duplicating
a process credential before modifying it and for setting group lists
respectively. Both interfaces take care for the details of allocating
groups array. crsetgroups() takes care of truncating the group list
to the current maximum (NGROUPS) if necessary. In the future,
crsetgroups() may be responsible for insuring invariants such as sorting
the supplemental groups to allow groupmember() to be implemented as a
binary search.

Because we can not change struct xucred without breaking application
ABIs, we leave it alone and introduce a new XU_NGROUPS value which is
always 16 and is to be used or NGRPS as appropriate for things such as
NFS which need to use no more than 16 groups. When feasible, truncate
the group list rather than generating an error.

Minor changes:
- Reduce the number of hand rolled versions of groupmember().
- Do not assign to both cr_gid and cr_groups[0].
- Modify ipfw to cache ucreds instead of part of their contents since
they are immutable once referenced by more than one entity.

Submitted by: Isilon Systems (initial implementation)
X-MFC after: never
PR: bin/113398 kern/133867


194368 17-Jun-2009 bz

Add explicit includes for jail.h to the files that need them and
remove the "hidden" one from vimage.h.


194252 15-Jun-2009 jamie

Get vnets from creds instead of threads where they're available, and from
passed threads instead of curthread.

Reviewed by: zec, julian
Approved by: bz (mentor)


194203 14-Jun-2009 dchagin

Unlock process lock when return error from getrobustlist call.

Tested by: Alexander Best <alexbestms at math uni-muenster de>
Approved by: kib (mentor)
MFC after: 3 days


194090 13-Jun-2009 jamie

Add counterparts to getcredhostname:
getcreddomainname, getcredhostuuid, getcredhostid

Suggested by: rmacklem
Approved by: bz


193744 08-Jun-2009 bz

After r193232 rt_tables in vnet.h are no longer indirectly dependent on
the ROUTETABLES kernel option thus there is no need to include opt_route.h
anymore in all consumers of vnet.h and no longer depend on it for module
builds.

Remove the hidden include in flowtable.h as well and leave the two
explicit #includes in ip_input.c and ip_output.c.


193511 05-Jun-2009 rwatson

Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with: pjd


193265 01-Jun-2009 dchagin

Add forgotten in previous commit flags argument.

Approved by: kib (mentor)
MFC after: 1 month


193264 01-Jun-2009 dchagin

Implement accept4 syscall.

Approved by: kib (mentor)
MFC after: 1 month


193263 01-Jun-2009 dchagin

Implement a variation of the accept_common() which takes
a flags argument.

Do not preserve td_retval before kern_fcntl(F_SETFL) as it does not
changed.

Approved by: kib (mentor)
MFC after: 1 month


193262 01-Jun-2009 dchagin

Split linux_accept() syscall onto linux_accept_common() which should
be used by linuxulator and linux_accept() itself.

Approved by: kib (mentor)
MFC after: 1 month


193168 31-May-2009 dchagin

Implement a variation of the socketpair() syscall which takes a flags
in addition to the type argument.

Approved by: kib (mentor)
MFC after: 1 month


193165 31-May-2009 dchagin

Move new socket flags handling into a separate function as Linux
introduced more syscalls which uses these flags.

Approved by: kib (mentor)
MFC after: 1 month


193164 31-May-2009 dchagin

Remove empty lines.

Approved by: kib (mentor)
MFC after: 1 month


193066 29-May-2009 jamie

Place hostnames and similar information fully under the prison system.
The system hostname is now stored in prison0, and the global variable
"hostname" has been removed, as has the hostname_mtx mutex. Jails may
have their own host information, or they may inherit it from the
parent/system. The proper way to read the hostname is via
getcredhostname(), which will copy either the hostname associated with
the passed cred, or the system hostname if you pass NULL. The system
hostname can still be accessed directly (and without locking) at
prison0.pr_host, but that should be avoided where possible.

The "similar information" referred to is domainname, hostid, and
hostuuid, which have also become prison parameters and had their
associated global variables removed.

Approved by: bz (mentor)


192899 27-May-2009 avg

linux_ioctl_cdrom: reduce stack usage

... by moving two ~2KB structures from stack to heap allocation.
I experienced stack overflow in linux emulation on i386 (8K stack)
when LINUX_DVD_READ_STRUCT ioctl was performed on atapicam cd
device and there was an error that resulted in additional quite
heavy stack use in cam layer.

Reviewed by: dchagin
Approved by: jhb (mentor)


192895 27-May-2009 jamie

Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails. Child jails may be restricted more than their parents,
but never less. Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system. Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings. The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by: bz (mentor)


192373 19-May-2009 dchagin

Validate user-supplied arguments values.
Args argument is a pointer to the structure located in user space in
which the socketcall arguments are packed. The structure must be
copied to the kernel instead of direct dereferencing.

Approved by: kib (mentor)
MFC after: 1 week


192284 18-May-2009 dchagin

Implement MSG_CMSG_CLOEXEC flag for linux_recvmsg().

Approved by: kib (mentor)
MFC after: 1 month


192206 16-May-2009 dchagin

Somewhere between 2.6.23 and 2.6.27, Linux added SOCK_CLOEXEC and
SOCK_NONBLOCK flags, that allow to save fcntl() calls.

Implement a variation of the socket() syscall which takes a flags
in addition to the type argument.

Approved by: kib (mentor)
MFC after: 1 month


192205 16-May-2009 dchagin

Return EINVAL in case when the incorrect or unsupported
type argument is specified.

Do not map type argument value as its Linux values are
identical to FreeBSD values.

Approved by: kib (mentor)


192204 16-May-2009 dchagin

Use the protocol family constants for the domain argument validation.
Return immediately when the socket() failed.

Approved by: kib (mentor)
MFC after: 1 month


192203 16-May-2009 dchagin

Emulate SO_PEERCRED socket option.
Temporarily use 0 for pid member as the FreeBSD does not cache remote
UNIX domain socket peer pid.

PR: kern/102956
Reviewed by: rwatson
Approved by: kib (mentor)
MFC after: 1 month


191989 11-May-2009 dchagin

Translate l_timeval arg to native struct timeval in
linux_setsockopt()/linux_getsockopt() for SO_RCVTIMEO,
SO_SNDTIMEO opts as l_timeval has MD members.

Remove bogus __packed attribute from l_timeval struct on __amd64__.

PR: kern/134276
Submitted by: Thomas Mueller <tmueller sysgo com>
Approved by: kib (mentor)
MFC after: 2 weeks


191988 11-May-2009 dchagin

Add forgotten linux to bsd flags argument mapping into the linux_recv().

PR: kern/134276
Submitted by: Thomas Mueller <tmueller sysgo com>
Approved by: kib (mentor)
MFC after: 2 weeks


191973 10-May-2009 dchagin

Do not export AT_CLKTCK when emulating Linux kernel prior
to 2.4.0, as it has appeared in the 2.4.0-rc7 first time.
Being exported, AT_CLKTCK is returned by sysconf(_SC_CLK_TCK),
glibc falls back to the hard-coded CLK_TCK value when aux entry
is not present.

Glibc versions prior to 2.2.1 always use hard-coded CLK_TCK value.

For older applications/libc's which depends on hard-coded CLK_TCK
value user should set compat.linux.osrelease less than 2.4.0.

Approved by: kib (mentor)


191972 10-May-2009 dchagin

Introduce linux_kernver() interface which is intended for an exact
designation of the emulated kernel version.

linux_kernver() returns integer value formatted as 'VVVMMMIII' where
VVV - version, MMM - major revision, III - minor revision.

Approved by: kib (mentor)


191966 10-May-2009 dchagin

Rework r189362, r191883.
The frequency of the statistics clock is given by stathz.
Use stathz if it is available, otherwise use hz.

Pointed out by: bde

Approved by: kib (mentor)


191898 07-May-2009 jamie

Give vfs_getopt the type it's expecting.
Write 100 times: "32 bits is so twentieth century."

Noticed by: dchagin


191896 07-May-2009 jamie

Move the per-prison Linux MIB from a private one-off pointer to the new
OSD-based jail extensions. This allows the Linux MIB to accessed via
jail_set and jail_get, and serves as a demonstration of adding jail support
to a module.

Reviewed by: dchagin, kib
Approved by: bz (mentor)


191887 07-May-2009 dchagin

Add KTR(9) tracing for futex emulation.

Approved by: kib (mentor)
MFC after: 1 month


191883 07-May-2009 dchagin

Linux exports HZ value to user space via AT_CLKTCK auxiliary vector entry,
which is available for Glibc as sysconf(_SC_CLK_TCK). If AT_CLKTCK entry is
not exported, Glibc uses 100.

linux_times() shall use the value that is exported to user space.

Pointyhat to: dchagin

PR: kern/134251
Approved by: kib (mentor)
MFC after: 2 weeks


191880 07-May-2009 dchagin

Change linux struct tms definition to match actual linux one.

Approved by: kib (mentor)
MFC after: 2 weeks


191877 07-May-2009 dchagin

Add preliminary KTR(9) support to the linux emulation layer.

Approved by: kib (mentor)
MFC after: 1 month


191876 07-May-2009 dchagin

To avoid excessive code duplication move MI definitions to the MI
header file. As it is defined in Linux.

Approved by: kib (mentor)
MFC after: 1 month


191875 07-May-2009 dchagin

Return EAFNOSUPPORT instead of EINVAL in case when the incorrect or
unsupported domain argument is specified.

Approved by: kib (mentor)


191871 07-May-2009 dchagin

Rework r191742.
Use the protocol family constants for the domain argument validation.

Return EAFNOSUPPORT in case when the incorrect domain argument
is specified.

Return EPROTONOSUPPORT instead of passing values that are not 0
to the BSD layer.

Suggested by: rwatson

Approved by: kib (mentor)
MFC after: 1 month


191792 04-May-2009 jamie

Mark Linux MIB sysctls MPSAFE.

Reviewed by: dchagin, kib
Approved by: bz (mentor)


191742 02-May-2009 dchagin

Linux socketpair() call expects explicit specified protocol for
AF_LOCAL domain unlike FreeBSD which expects 0 in this case.

Approved by: kib (mentor)
MFC after: 1 month


191741 02-May-2009 dchagin

Move extern variable definitions to the header file.

Approved by: kib (mentor)
MFC after: 1 month


191719 01-May-2009 dchagin

Reimplement futexes.
Old implemention used Giant to protect the kernel data structures,
but at the same time called malloc(M_WAITOK), that could cause the
calling thread to sleep and lost Giant protection. User-visible
result was the missed wakeup.

New implementation uses one sx lock per futex. The sx protects
the futex structures and allows to sleep while copyin or copyout
are performed.

Unlike linux, we return EINVAL when FUTEX_CMP_REQUEUE operation
is requested and either caller specified futexes are equial or
second futex already exists. This is acceptable since the situation
can only occur from the application error, and glibc falls back to
old FUTEX_WAKE operation when FUTEX_CMP_REQUEUE returns an error.

Approved by: kib (mentor)
MFC after: 1 month


191548 26-Apr-2009 zec

In preparation for turning on options VIMAGE in next commits,
rearrange / replace / adjust several INIT_VNET_* initializer
macros, all of which currently resolve to whitespace.

Reviewed by: bz (an older version of the patch)
Approved by: julian (mentor)


191269 19-Apr-2009 dchagin

Remove support for FUTEX_REQUEUE operation.
Glibc does not use this operation since 2.3.3 version (Jun 2004),
as it is racy and replaced by FUTEX_CMP_REQUEUE operation.
Glibc versions prior to 2.3.3 fall back to FUTEX_WAKE when
FUTEX_REQUEUE returned EINVAL.

Any application directly using FUTEX_REQUEUE without return
value checking are definitely broken.

Limit quantity of messages per process about unsupported
operation.

Approved by: kib (mentor)
MFC after: 1 month


190445 26-Mar-2009 ambrisko

Add stuff to support upcoming BMC/IPMI flashing of newer Dell machine
via the Linux tool.
- Add Linux shim to ipmi(4)
- Create a partitions file to linprocfs to make Linux fdisk see
disks. This file is dynamic so we can see disks come and go.
- Convert msdosfs to vfat in mtab since Linux uses that for
msdosfs.
- In the Linux mount path convert vfat passed in to msdosfs
so Linux mount works on FreeBSD. Note that tasting works
so that if da0 is a msdos file system
/compat/linux/bin/mount /dev/da0 /mnt
works.
- fix a 64it bug for l_off_t.
Grabing sh, mount, fdisk, df from Linux, creating a symlink of mtab to
/compat/linux/etc/mtab and then some careful unpacking of the Linux bmc
update tool and hacking makes it work on newer Dell boxes. Note, probably
if you can't figure out how to do this, then you probably shouldn't be
doing it :-)


189867 16-Mar-2009 dchagin

Sort include files in the alphabetical order.

Approved by: kib (mentor)
MFC after: 2 weeks


189862 15-Mar-2009 dchagin

Ignore FUTEX_FD op, as it is done by linux.

Approved by: kib (mentor)
MFC after: 2 weeks


189861 15-Mar-2009 dchagin

Include linux_futex.h before linux_emul.h

Approved by: kib (mentor)
MFC after: 6 days


189423 05-Mar-2009 jhb

A better fix for handling different FPU initial control words for different
ABIs:
- Store the FPU initial control word in the pcb for each thread.
- When first using the FPU, load the initial control word after restoring
the clean state if it is not the standard control word.
- Provide a correct control word for Linux/i386 binaries under
FreeBSD/amd64.
- Adjust the control word returned for fpugetregs()/npxgetregs() when a
thread hasn't used the FPU yet to reflect the real initial control
word for the current ABI.
- The Linux/i386 ABI for FreeBSD/i386 now properly sets the right control
word instead of trashing whatever the current state of the FPU is.

Reviewed by: bde


189362 04-Mar-2009 dchagin

Add AT_PLATFORM, AT_HWCAP and AT_CLKTCK auxiliary vector entries which
are used by glibc. This silents the message "2.4+ kernel w/o ELF notes?"
from some programs at start, among them are top and pkill.

Do the assignment of the vector entries in elf_linux_fixup()
as it is done in glibc.

Fix some minor style issues.

Submitted by: Marcin Cieslak <saper at SYSTEM PL>
Approved by: kib (mentor)
MFC after: 1 week


189106 27-Feb-2009 bz

For all files including net/vnet.h directly include opt_route.h and
net/route.h.

Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.

We need to make sure that both opt_route.h and net/route.h are included
before net/vnet.h because of the way MRT figures out the number of FIBs
from the kernel option. If we do not, we end up with the default number
of 1 when including net/vnet.h and array sizes are wrong.

This does not change the list of files which depend on opt_route.h
but we can identify them now more easily.


188849 20-Feb-2009 ed

Don't make Linux stat() open character devices to resolve its name.

The existing code calls kern_open() to resolve the vnode of a pathname
right after a stat(). This is not correct, because it causes random
character devices to be opened in /dev. This means ls'ing a tape
streamer will cause it to rewind, for example. Changes I have made:

- Add kern_statat_vnhook() to allow binary emulators to `post-process'
struct stat, using the proper vnode.

- Remove unneeded printf's from stat() and statfs().

- Make the Linuxolator use kern_statat_vnhook(), replacing
translate_path_major_minor_at().

- Let translate_fd_major_minor() use vp->v_rdev instead of
vp->v_un.vu_cdev.

Result:

crw-rw-rw- 1 root root 0, 14 Feb 20 13:54 /dev/ptmx
crw--w---- 1 root adm 136, 0 Feb 20 14:03 /dev/pts/0
crw--w---- 1 root adm 136, 1 Feb 20 14:02 /dev/pts/1
crw--w---- 1 ed tty 136, 2 Feb 20 14:03 /dev/pts/2

Before this commit, ptmx also had a major number of 136, because it
silently allocated and deallocated a pseudo-terminal. Device nodes that
cannot be opened now have proper major/minor-numbers.

Reviewed by: kib, netchild, rdivacky (thanks!)


188588 13-Feb-2009 jhb

Use shared vnode locks when invoking VOP_READDIR().

MFC after: 1 month


188572 13-Feb-2009 netchild

Fix an edge-case of the linux readdir: We need the size of a linux dirent
structure, not the size of a pointer to it.

PR: 131099
Submitted by: Andreas Kies <andikies@gmail.com>
MFC after: 2 weeks


187830 28-Jan-2009 ed

Last step of splitting up minor and unit numbers: remove minor().

Inside the kernel, the minor() function was responsible for obtaining
the device minor number of a character device. Because we made device
numbers dynamically allocated and independent of the unit number passed
to make_dev() a long time ago, it was actually a misnomer. If you really
want to obtain the device number, you should use dev2udev().

We already converted all the drivers to use dev2unit() to obtain the
device unit number, which is still used by a lot of drivers. I've
noticed not a single driver passes NULL to dev2unit(). Even if they
would, its behaviour would make little sense. This is why I've removed
the NULL check.

Ths commit removes minor(), minor2unit() and unit2minor() from the
kernel. Because there was a naming collision with uminor(), we can
rename umajor() and uminor() back to major() and minor(). This means
that the makedev(3) manual page also applies to kernel space code now.

I suspect umajor() and uminor() isn't used that often in external code,
but to make it easier for other parties to port their code, I've
increased __FreeBSD_version to 800062.


186564 29-Dec-2008 ed

Push down Giant inside sysctl. Also add some more assertions to the code.

In the existing code we didn't really enforce that callers hold Giant
before calling userland_sysctl(), even though there is no guarantee it
is safe. Fix this by just placing Giant locks around the call to the oid
handler. This also means we only pick up Giant for a very short period
of time. Maybe we should add MPSAFE flags to sysctl or phase it out all
together.

I've also added SYSCTL_LOCK_ASSERT(). We have to make sure sysctl_root()
and name2oid() are called with the sysctl lock held.

Reviewed by: Jille Timmermans <jille quis cx>


185571 02-Dec-2008 bz

Rather than using hidden includes (with cicular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by: brooks, gnn, des, zec, imp
Sponsored by: The FreeBSD Foundation


185442 29-Nov-2008 kib

Make linux_sendmsg() and linux_recvmsg() work on linux32/amd64.
Change types used in the linux' struct msghdr and struct cmsghdr
definitions to the properly-sized architecture-specific types.
Move ancillary data handler from linux_sendit() to linux_sendmsg().

Submitted by: dchagin


185337 26-Nov-2008 rdivacky

Document that all the other commands are either
identical to the FreeBSD ones or rejected by
kern_msgctl().

Found with: Coverity Prevent(tm)
CID: 3456
Approved by: kib (mentor)


185002 16-Nov-2008 kib

In the robust futexes list head, futex_offset shall be signed,
and glibc actually supplies negative offsets. Change l_ulong to l_long.

Submitted by: dchagin


184789 09-Nov-2008 ed

Mark uname(), getdomainname() and setdomainname() with COMPAT_FREEBSD4.

Looking at our source code history, it seems the uname(),
getdomainname() and setdomainname() system calls got deprecated
somewhere after FreeBSD 1.1, but they have never been phased out
properly. Because we don't have a COMPAT_FREEBSD1, just use
COMPAT_FREEBSD4.

Also fix the Linuxolator to build without the setdomainname() routine by
just making it call userland_sysctl on kern.domainname. Also replace the
setdomainname()'s implementation to use this approach, because we're
duplicating code with sysctl_domainname().

I wasn't able to keep these three routines working in our
COMPAT_FREEBSD32, because that would require yet another keyword for
syscalls.master (COMPAT4+NOPROTO). Because this routine is probably
unused already, this won't be a problem in practice. If it turns out to
be a problem, we'll just restore this functionality.

Reviewed by: rdivacky, kib


184501 31-Oct-2008 kib

The code in linux_proc_exit() contains a race when multiple linux based
processes exits at the same time. The linux_emuldata structure is freed
but p->p_emuldata is left as a dangling pointer to the just freed memory.

The check for W_EXIT in the loop scanning the child processes isn't safe
since the state of the child process can change right afterwards. Lock
the process and check the W_EXIT before delivering signal.

Submitted by: tegge
Reviewed by: davidxu
MFC after: 1 week


184413 28-Oct-2008 trasz

Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.

Approved by: rwatson (mentor)


184205 23-Oct-2008 des

Retire the MALLOC and FREE macros. They are an abomination unto style(9).

MFC after: 3 months


184058 19-Oct-2008 kib

Correctly fill siginfo for the signals delivered by linux tkill/tgkill.
It is required for async cancellation to work.

Fix PROC_LOCK leak in linux_tgkill when signal delivery attempt is made
to not linux process.

Do not call em_find(p, ...) with p unlocked.

Move common code for linux_tkill() and linux_tgkill() into
linux_do_tkill().

Change linux siginfo_t definition to match actual linux one. Extend
uid fields to 4 bytes from 2. The extension does not change structure
layout and is binary compatible with previous definition, because i386
is little endian, and each uid field has 2 byte padding after it.

Reported by: Nicolas Joly <njoly pasteur fr>
Submitted by: dchangin
MFC after: 1 month


183871 14-Oct-2008 kib

Make robust futexes work on linux32/amd64. Use PTRIN to read
user-mode pointers. Change types used in the structures definitions to
properly-sized architecture-specific types.

Submitted by: dchagin
MFC after: 1 week


183612 04-Oct-2008 kib

Current linux_fooaffinity() emulation fails, as the FreeBSD affinity
syscalls expect the bitmap size in the range from 32 to 128. Old glibc
always assumed size 1024, while newer glibc searches for approriate
size, starting from 1024 and going up.

For now, use FreeBSD size of cpuset_t for bitmap size parameter and
return EINVAL if length of user space bitmap less than our size of
cpuset_t.

Submitted by: dchagin
MFC after: 1 week
[This requires MFC of the actual linux affinity syscalls]


183550 02-Oct-2008 zec

Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparision between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by: julian, bz, brooks, zec
Reviewed by: julian, bz, brooks, kris, rwatson, ...
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


183275 22-Sep-2008 trasz

Fix usage of mac_vnode_check_open() in linuxulator - last argument
should be VREAD, not FREAD.

Approved by: rwatson (mentor)


182935 11-Sep-2008 rdivacky

The ERESTART to EINTR conversion is already done in
kern_select so there is no need to repeat it in
linux_select().

Submitted by: Dmitry Chagin <dchagin@>
MFC after: 1 week
Approved by: kib (mentor)


182892 09-Sep-2008 rdivacky

Getdents requires padding with 2 bytes instead of 1 byte
as with getdents64. The last byte is used for storing
the d_type, add this to plain getdents case where it was
missing before. Also change the code to use strlcpy instead
of plain strcpy. This changes fix the getdents crash we
had reports about (hl2 server etc.)

PR: kern/117010
MFC after: 1 week
Submitted by: Dmitry Chagin (dchagin@)
Tested by: MITA Yoshio <mita ee.t.u-tokyo.ac jp>
Approved by: kib (mentor)


182890 09-Sep-2008 kib

Remove superfluous copyin() of args, structures are already in kernel space.

Submitted by: dchagin
MFC after: 1 week


182371 28-Aug-2008 attilio

Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>


182141 25-Aug-2008 julian

All opt_x.h includes go at the top of other includes.


181905 20-Aug-2008 ed

Integrate the new MPSAFE TTY layer to the FreeBSD operating system.

The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

The old TTY layer has a driver model that is not abstract enough to
make it friendly to use. A good example is the output path, where the
device drivers directly access the output buffers. This means that an
in-kernel PPP implementation must always convert network buffers into
TTY buffers.

If a PPP implementation would be built on top of the new TTY layer
(still needs a hooks layer, though), it would allow the PPP
implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

With the old TTY layer, it isn't entirely safe to destroy TTY's from
the system. This implementation has a two-step destructing design,
where the driver first abandons the TTY. After all threads have left
the TTY, the TTY layer calls a routine in the driver, which can be
used to free resources (unit numbers, etc).

The pts(4) driver also implements this feature, which means
posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

One of the major improvements is the per-TTY mutex, which is expected
to improve scalability when compared to the old Giant locking.
Another change is the unbuffered copying to userspace, which is both
used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from: //depot/projects/mpsafetty/...
Approved by: philip (ex-mentor)
Discussed: on the lists, at BSDCan, at the DevSummit
Sponsored by: Snow B.V., the Netherlands
dcons(4) fixed by: kan


181803 17-Aug-2008 bz

Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch


180768 23-Jul-2008 ed

Add TIOCPKT and TIOCSPTLCK to the Linuxolator.

We're very lucky, because the flags used by our TIOCPKT implementation
are the same as flags used by Linux. We can safely enable TIOCPKT,
assuming EXTPROC is not used.

TIOCSPTLCK is used by unlockpt(). Because we don't need unlockpt() in
our implementation, make this ioctl a no-op.

Approved by: philip (mentor, implicit), rdivacky
Obtained from: P4 (//depot/projects/mpsafetty/...)


180766 23-Jul-2008 rdivacky

Fix linux_alarm, the linux behaviour is to limit the
secs to INT_MAX when the passed in parameter is bigger
than INT_MAX.

Submitted by: Dmitry Chagin <chagin.dmitry gmail com>
Approved by: kib (mentor)


180291 05-Jul-2008 rwatson

Introduce a new lock, hostname_mtx, and use it to synchronize access
to global hostname and domainname variables. Where necessary, copy
to or from a stack-local buffer before performing copyin() or
copyout(). A few uses, such as in cd9660 and daemon_saver, remain
under-synchronized and will require further updates.

Correct a bug in which a failed copyin() of domainname would leave
domainname potentially corrupted.

MFC after: 3 weeks


179651 08-Jun-2008 rdivacky

d_ino member of linux_dirent structure should be unsigned long.

Submitted by: Chagin Dmitry <chagin.dmitry@gmail.com>
Approved by: kib (mentor)


179523 03-Jun-2008 rdivacky

Switch to emulating Linux 2.6 on default.

Approved by: kib (mentor)


179486 02-Jun-2008 ed

Push down the major/minor conversion for pts/%u to improve consistency.

In the mpsafetty branch, Linux sshd seems to work properly inside a
jail. Some small modifications had to be made to the Linux compatibility
layer.

The Linux PTY routines always expect the device major number to be 136
or higher. Our code always set the major/minor number pair to 136:0.
This makes routines like ttyname() and ptsname() fail, because we'll end
up having ambiguous device numbers.

The conversion was not performed on all *stat() routines, which meant in
some cases the numbers didn't get transformed. By pushing the conversion
into linux_driver_get_major_minor(), the transformation will take place
on all calls.

Approved by: philip (mentor), rdivacky


178976 13-May-2008 rdivacky

Implement robust futexes. Most of the code is modelled after
what Linux does. This is because robust futexes are mostly
userspace thing which we cannot alter. Two syscalls maintain
pointer to userspace list and when process exits a routine
walks this list waking up processes sleeping on futexes
from that list.

Reviewed by: kib (mentor)
MFC after: 1 month


178439 23-Apr-2008 rdivacky

Implement linux_truncate64() syscall.

Tested by: Aline de Freitas <aline@riseup.net>
Approved by: kib (mentor)


178036 09-Apr-2008 rdivacky

Remove using magic value of -1 to distinguish between linux_open()
and linux_openat(). Instead just pass AT_FDCWD into linux_common_open()
for the linux_open() case. This prevents passing -1 as a dirfd to
openat() from succeeding which is wrong.

Suggested by: rwatson, kib
Approved by: kib (mentor)


177997 08-Apr-2008 kib

Implement the linux syscalls
openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat,
renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat.

Submitted by: rdivacky
Sponsored by: Google Summer of Code 2007
Tested by: pho


177785 31-Mar-2008 kib

Add the support for the AT_FDCWD and fd-relative name lookups to the
namei(9).

Based on the submission by rdivacky,
sponsored by Google Summer of Code 2007
Reviewed by: rwatson, rdivacky
Tested by: pho


177633 26-Mar-2008 dfr

Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
client handle safely with replies being de-multiplexed at the socket
upcall (typically driven directly by the NIC interrupt) and handed
off to whichever thread matches the reply. For UDP sockets, many RPC
clients can share the same socket. This allows the use of a single
privileged UDP port number to talk to an arbitrary number of remote
hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
server would be relatively straightforward and would follow
approximately the Solaris KPI. A single thread should be sufficient
for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
callbacks. I've tested the NLM server reasonably extensively - it
passes both my own tests and the NFS Connectathon locking tests
running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
support for the local NFS client's locking needs, it does have to
field async replies and granted callbacks from remote NLMs that the
local client has contacted. We relay these replies to the userland
rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
it will detect deadlocks caused by a lock request that covers more
than one blocking request. As required by the NLM protocol, all
deadlock detection happens synchronously - a user is guaranteed that
if a lock request isn't rejected immediately, the lock will
eventually be granted. The old system allowed for a 'deferred
deadlock' condition where a blocked lock request could wake up and
find that some other deadlock-causing lock owner had beaten them to
the lock.

* Since both local and remote locks are managed by the same kernel
locking code, local and remote processes can safely use file locks
for mutual exclusion. Local processes have no fairness advantage
compared to remote processes when contending to lock a region that
has just been unlocked - the local lock manager enforces a strict
first-come first-served model for both local and remote lockers.

Sponsored by: Isilon Systems
PR: 95247 107555 115524 116679
MFC after: 2 weeks


177604 25-Mar-2008 ru

Fix build.

Reported by: ache, tinderbox


177460 20-Mar-2008 rdivacky

o Add stub support for some new futex operations,
so the annoying message is not printed.

o Don't warn about FUTEX_FD not being implemented
and return ENOSYS instead of 0 (eg. success).

o Clear FUTEX_PRIVATE_FLAG as we actually implement
only private futexes so there is no reason to
return ENOSYS when app asks for a private futex.
We don't reject shared futexes because they worked
just fine with our implementation so far.

Approved by: kib (mentor)
Tested by: bsam
MFC after: 1 week


177257 16-Mar-2008 rdivacky

Implement sched_setaffinity and get_setaffinity using
real cpu affinity setting primitives.

Reviewed by: jeff
Approved by: kib (mentor)


176740 02-Mar-2008 kib

Return ENOSYS instead of 0 for the unknown futex operations.

Submitted by: rdivacky
Reported and tested by: Gary Stanley <gary velocity-servers net>


176460 22-Feb-2008 kib

Sanitize arguments to linux_mremap().
Check that only MREMAP_FIXED and MREMAP_MAYMOVE flags are specified.
Check for the page alignment of the addr argument.

Submitted by: rdivacky
MFC after: 1 week


175294 13-Jan-2008 attilio

VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>


175202 10-Jan-2008 attilio

vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by: Diego Sardina <siarodx at gmail dot com>,
Andrea Di Pasquale <whyx dot it at gmail dot com>


175107 05-Jan-2008 kib

After applying LCONVPATH() to the path, do use the converted path
instead of original user-mode string in the linux_stat() and
linux_lstat() syscalls.

Tested by: Peter Holm
MFC after: 3 days


174975 29-Dec-2007 kib

Plug the leaks in the present (hopefully, soon to be replaced)
implementation of the linux_openat() for the quick MFC.

Reported and tested by: Peter Holm
MFC after: 3 days


174974 29-Dec-2007 kib

Apply the LCONVPATH() to the (old) linux_stat() and linux_lstat() syscalls.
Without it, code has two problems:
- behaviour of the old and new [l]stat are different with regard of
the /compat/linux
- directly accessing the userspace data from the kernel asks for
the panics.

Reported and tested by: Peter Holm
Reviewed by: rdivacky
MFC after: 3 days


173422 07-Nov-2007 kib

Implement LINUX_SIOCGIFCOUNT and LINUX_SIOCGIFINDEX/LINUX_SIOGIFINDEX.

LINUX_SIOCGIFCOUNT just returns 0 since it is not implemented in the
Linux 2.6.16.

LINUX_SIOCGIFINDEX/LINUX_SIOGIFINDEX are mapped to the FreeBSD native
SIOCGIFINDEX.

Tested by: Peter Kostouros <kpeter@melbpc.org.au>
Reviewed by: brooks, rpaulo (on net@)
Submitted by: rdivacky
MFC after: 1 week


172930 24-Oct-2007 rwatson

Merge first in a series of TrustedBSD MAC Framework KPI changes
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:

mac_<object>_<method/action>
mac_<object>_check_<method/action>

The previous naming scheme was inconsistent and mostly
reversed from the new scheme. Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier. Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods. Also simplify, slightly,
some entry point names.

All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.

Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer


172220 18-Sep-2007 dwmalone

The kernel version of Linux statfs64 is actually supposed to take
3 arguments, but we had forgotten the second argument. Also make the
Linux statfs64 struct depend on the architecture because it has an
extra 4 bytes padding on amd64 compared to i386.

The three argument fix is from David Taylor, the struct statfs64
stuff is my fault. With this patch I can install i386 Linux matlab
on an amd64 machine.

Submitted by: David Taylor <davidt_at_yadt.co.uk>
Approved by: re (kensmith)


171998 28-Aug-2007 kib

Implement fake linux sched_getaffinity() syscall to enable java to work
with Linux 2.6 emulation. This shall be reimplemented once FreeBSD gets
native scheduler affinity syscalls.

Submitted by: rdivacky
Reviewed by: jkim
Sponsored by: Google Summer of Code 2007
Approved by: re (kensmith)


171744 06-Aug-2007 rwatson

Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which
previously conditionally acquired Giant based on debug.mpsafenet. As that
has now been removed, they are no longer required. Removing them
significantly simplifies error-handling in the socket layer, eliminated
quite a bit of unwinding of locking in error cases.

While here clean up the now unneeded opt_net.h, which previously was used
for the NET_WITH_GIANT kernel option. Clean up some related gotos for
consistency.

Reviewed by: bz, csjp
Tested by: kris
Approved by: re (kensmith)


171216 04-Jul-2007 peter

Don't add the 'pad' argument to the mmap/truncate/etc syscalls.

Submitted by: kensmith
Approved by: re (kensmith)


170587 12-Jun-2007 rwatson

Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths. Do, however, move those prototypes to priv.h.

Reviewed by: csjp
Obtained from: TrustedBSD Project


170486 10-Jun-2007 mjacob

Ensure that newpath is always initialized, even for the error case.


170472 09-Jun-2007 attilio

rufetch and calcru sometimes should be called atomically together.
This patch fixes places where they should be called atomically changing
their locking requirements (both assume per-proc spinlock held) and
introducing rufetchcalc which wrappers both calls to be performed in
atomic way.

Reviewed by: jeff
Approved by: jeff (mentor)


170170 31-May-2007 attilio

Revert VMCNT_* operations introduction.
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.

Requested by: alc
Approved by: jeff (mentor)


170152 31-May-2007 kib

Revert UF_OPENING workaround for CURRENT.
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.

Proposed and reviewed by: jhb
Reviewed by: daichi (unionfs)
Approved by: re (kensmith)


169895 23-May-2007 kib

Move futex support code from <arch>/support.s into linux compat directory.
Implement all futex atomic operations in assembler to not depend on the
fuword() that does not allow to distinguish between -1 and failure return.
Correctly return 0 from atomic operations on success.

In collaboration with: rdivacky
Tested by: Scot Hetzel <swhetzel gmail com>, Milos Vyletel <mvyletel mzm cz>
Sponsored by: Google SoC 2007


169667 18-May-2007 jeff

- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating
vmcnts. This can be used to abstract away pcpu details but also changes
to use atomics for all counters now. This means sched lock is no longer
responsible for protecting counts in the switch routines.

Contributed by: Attilio Rao <attilio@FreeBSD.org>


168711 14-Apr-2007 rwatson

Some Linux applications (ping) pass a non-NULL msg_control argument to
sendmsg() while using a 0-length msg_controllen. This isn't allowed in
the FreeBSD system call ABI, so detect this case and set msg_control to
NULL. This allows Linux ping to work.

Submitted by: rdivacky


168602 10-Apr-2007 scottl

Whitespace fixes


168477 07-Apr-2007 scottl

Add the CAM 'SG' peripheral device. This device implements a subset of the
Linux SCSI SG passthrough device API. The intention is to allow for both
running of Linux apps that want to talk to /dev/sg* nodes, and to facilitate
porting of apps from Linux to FreeBSD. As such, both native and linuxolator
entry points and definitions are provided.

Caveats:
- This does not support the procfs and sysfs nodes that the Linux SG
driver provides. Some Linux apps may rely on these for operation,
others may only use them for informational purposes.
- More ioctls need to be implemented.
- Linux uses a naming scheme of "sg[a-z]" for devices, while FreeBSD uses a
scheme of "sg[0-9]". Devfs aliasis (symlinks) are automatically created
to link the two together. However, tools like camcontrol only see the
native names.
- Some operations were originally designed to return byte counts or other
data directly as the syscall return value. The linuxolator doesn't appear
to support this well, so this driver just punts for these cases.

Now that the driver is in place, others are welcome to add missing
functionality. Thanks to Roman Divacky for pushing this work along.


168355 04-Apr-2007 rwatson

Replace custom file descriptor array sleep lock constructed using a mutex
and flags with an sxlock. This leads to a significant and measurable
performance improvement as a result of access to shared locking for
frequent lookup operations, reduced general overhead, and reduced overhead
in the event of contention. All of these are imported for threaded
applications where simultaneous access to a shared file descriptor array
occurs frequently. Kris has reported 2x-4x transaction rate improvements
on 8-core MySQL benchmarks; smaller improvements can be expected for many
workloads as a result of reduced overhead.

- Generally eliminate the distinction between "fast" and regular
acquisisition of the filedesc lock; the plan is that they will now all
be fast. Change all locking instances to either shared or exclusive
locks.

- Correct a bug (pointed out by kib) in fdfree() where previously msleep()
was called without the mutex held; sx_sleep() is now always called with
the sxlock held exclusively.

- Universally hold the struct file lock over changes to struct file,
rather than the filedesc lock or no lock. Always update the f_ops
field last. A further memory barrier is required here in the future
(discussed with jhb).

- Improve locking and reference management in linux_at(), which fails to
properly acquire vnode references before using vnode pointers. Annotate
improper use of vn_fullpath(), which will be replaced at a future date.

In fcntl(), we conservatively acquire an exclusive lock, even though in
some cases a shared lock may be sufficient, which should be revisited.
The dropping of the filedesc lock in fdgrowtable() is no longer required
as the sxlock can be held over the sleep operation; we should consider
removing that (pointed out by attilio).

Tested by: kris
Discussed with: jhb, kris, attilio, jeff


168275 02-Apr-2007 jkim

MFP4: Turn emul_lock into a mutex.

Submitted by: rdivacky


168037 30-Mar-2007 jkim

MFP4: Linux futex support for amd64.

Initial patch was submitted by kib and additional work was done
by Divacky Roman.

Tested by: emulation


168014 29-Mar-2007 julian

Implement the openat() linux syscall
Submitted by: Roman Divacky (rdivacky@)
MFC after: 2 weeks


167257 06-Mar-2007 rwatson

In translate_path_major_minor(), do not calculate otherwise unused 'fp'
variable, avoiding an extra locking of the file descriptor array.


167157 02-Mar-2007 jkim

MFP4: 115220, 115222

- Fix style(9) and reduce diff between amd64 and i386.
- Prefix Linuxulator macros with LINUX_ to prevent future collision.


166970 25-Feb-2007 netchild

MFp4 (110541):
Sync with rev 1.7 in NetBSD.

Obtained from: NetBSD


166969 25-Feb-2007 netchild

MFp4 (110523, parts which apply cleanly):
semi-automatic style(9)

The futex stuff already differs a lot (only a small part does not differ)
from NetBSD, so we are already way off and can't apply changes from NetBSD
automatically. As we need to merge everything by hand already, we can even
make the files comply to our world order.


166944 24-Feb-2007 netchild

Partial MFp4 of 114977:
Whitespace commit: Fix grammar, spelling and punctuation.

Submitted by: "Scot Hetzel" <swhetzel@gmail.com>


166931 23-Feb-2007 netchild

MFp4 (114193 (i386 part), 114194, 114195, 114200):
- Dont "return" in linux_clone() after we forked the new process in a case
of problems.
- Move the copyout of p2->p_pid outside the emul_lock coverage in
linux_clone().
- Cache the em->pdeath_signal in a local variable and move the copyout
out of the emul_lock coverage.
- Move the free() out of the emul_shared_lock coverage in a preparation
to switch emul_lock to non-sleepable lock (mutex).

Submitted by: rdivacky


166930 23-Feb-2007 netchild

MFp4 (part of 114132):
- Fix a LOR caused by holding emul_lock and proctree_lock at once.

Submitted by: rdivacky


166420 02-Feb-2007 kib

Remove extern int hz; use proper include file instead.


166398 01-Feb-2007 kib

Introduce some more SO_ option equivalents from Linux to FreeBSD.

The msg variable in linux_recvmsg() was not initialized.
Copy it from userspace.

Submitted by: rdivacky


166397 01-Feb-2007 kib

No need to lock emul_lock in exit_group() because em->shared
cannot change (because its referenced by curthread). This fixes
a LOR caused by acquiring emul_shared_lock while holding emul_lock.

Fix typo in comment.

Submitted by: rdivacky


166396 01-Feb-2007 kib

No need to synchronize linux_schedtail with linux_proc_init.
p->p_emuldata is properly initialized in the time when the child can run.

Do not set p->p_emuldata to NULL when the process is exiting.
It does not make any sense and only costs 2 mutex operations.

Do not lock emul_data to unlock it on the very next line.
Comment on possible race while there.

Reparent all procs that are part of a threading group but not its leaders
to init and SIGCHLD init to finish the zombies off. This fixes zombies
left after opera's exit. [1]

There is no need to lock p_em in the linux_proc_init CLONE_THREAD
case because the process cannot change the address of the p_em->shared
because its currently running this code path.
Move assigning of em->shared outside emul_shared_lock.

Noticed by: Scott Robbins <scottro@nyc.rr.com> [1]
Submitted by: rdivacky


166150 20-Jan-2007 netchild

MFp4 (113077, 113083, 113103, 113124, 113097):

Dont expose em->shared to the outside world before its properly
initialized. Might not affect anything but its at least a better
coding style.

Dont expose em via p->p_emuldata until its properly initialized.
This also enables us to get rid of some locking and simplify the
code because we are workin on a local copy.

In linux_fork and linux_vfork create the process in stopped state
to be sure that the new process runs with fully initialized emuldata
structure [1]. Also fix the vfork (both in linux_clone and linux_vfork)
race that could result in never woken up process [2].

Reported by: Scot Hetzel [1]
Suggested by: jhb [2]
Reviewed by: jhb (at least some important parts)
Submitted by: rdivacky
Tested by: Scot Hetzel (on amd64)

Change 2 comments (in the new code) to comply to style(9).

Suggested by: jhb


166085 18-Jan-2007 kib

Add support for LINUX_O_DIRECT, LINUX_O_DIRECT and LINUX_O_NOFOLLOW flags
to open() [1].
Improve locking for accessing session control structures [2].
Try to document (most likely harmless) races in the code [3].

Based on submission by: Intron (intron at intron ac) [1]
Reviewed by: jhb [2]
Discussed with: netchild, rwatson, jhb [3]


166008 14-Jan-2007 netchild

MFp4 (112379):
Implement SETALL/GETALL IPC primitives. This fixes some LTP testcases and
LabView is able to proceed a little bit further.

Submitted by: rdivacky


166006 14-Jan-2007 netchild

MFp4 (112705):
Inherit setting of the default emulation version to the jails.

Pointed out by: jhb
Submitted by: rdivacky


165871 07-Jan-2007 netchild

MFp4 (112646):
Now (ok it's been a while...) that FreeBSD has RLIMIT_AS too, we can use
it in the linuxolator instead of ignoring it.

This fixes a LTP test.

Submitted by: rdivacky


165870 07-Jan-2007 netchild

MFp4 (112535):
No need to lock prison in a case of linux_use26 because the int
setting is atomic and process cannot leave jail.

Submitted by: kib
Reviewed by: jhb
Requested by: rdivacky


165869 07-Jan-2007 netchild

MFp4 (112534):
Dont lock em in a case of just using em->shared->group_pid because
the group_pid never changes.

Submitted by: rdivacky
Reviewed by: kib
Glanced at by: jhb


165868 07-Jan-2007 netchild

MFp4 (112499):
Protect em->shared with the lock in case of CLONE_THREAD.

Submitted by: rdivacky


165867 07-Jan-2007 netchild

MFp4 (112498):
Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion.

Submitted by: rdivacky


165718 01-Jan-2007 delphij

Fix amd64 build.

Submitted by: Divacky Roman <xdivac02 stud fit vutbr cz>


165689 31-Dec-2006 netchild

MFp4 (111746, 108671, 108945, 112352):
- add linux utimes syscall [1]
- add linux rt_sigtimedwait syscall [2]

Submitted by: "Scot Hetzel" <swhetzel@gmail.com> [1]
Submitted by: Bruce Becker <hostmaster@whois.gts.net> [2]
PR: 93199 [2]


165688 31-Dec-2006 netchild

MFp4:
- semi-automatic style fixes


165687 31-Dec-2006 netchild

MFp4 (111746+):
Redo the checking for 2.6 emulation. We now cache the value of
use26 and replace calls to linux_get_osrelease() + parsing with
a call to linux_use26(). Typical path is lockless now.

Pointed out by: kib

This allows to ship RELENG_7_0 with a default osrelease of 2.4.2 and the
possibility to enable 2.6.x emulation without the possible performance
impact of the previous version of the check.

Submitted by: rdivacky


165686 31-Dec-2006 netchild

MFp4:
- semi-automatic style fixes
- spelling fixes in comments
- add some comments


165439 21-Dec-2006 netchild

MFP4 (110956):
Add definition for LINUX_MSG_INFO.

This fixes the tinderbox errors.

Submitted by: rdivacky


165408 20-Dec-2006 jkim

MFP4: 109655

- Move linux_nanosleep() from src/sys/amd64/linux32/linux32_machdep.c to
src/sys/compat/linux/linux_time.c.
- Validate timespec ranges before use as Linux kernel does.
- Fix l_timespec structure.
- Clean up style(9) nits.


165407 20-Dec-2006 jkim

MFP4: 110179

Add rudimentary IPC_INFO/MSG_INFO command support for linux_msgctl()
to pacify Linux ipcs(1). While I am here, add more bound checks
for linux_msgsnd() and linux_msgrcv().


165404 20-Dec-2006 jkim

MFP4: (part of) 110058

Use new kern_msgsnd()/kern_msgrcv() to fix linux32 emulation on amd64.


164893 04-Dec-2006 jkim

MFP4: 109653

Linux mknod(2) can open any files, not just char/block or fifo files.
This fixes Linux Test Project test cases mknod01, mknod07 and mknod09.


164890 04-Dec-2006 jkim

MFP4: 109652

Fixes for 'blocking in fifoor state' problem of LTP tests.
linux_*stat*() functions were opening files with O_RDONLY to get
major/minor pair for char/block special files. Unfortunately,
when these functions are used against fifo, it is blocked forever
because there is no writer. Instead, we only open char/block special
files for major/minor conversion. We have to get rid of kern_open()
entirely from translate_path_major_minor() but today is not the day.
While I am here, add checks for errors before calling
translate_path_major_minor().


164826 02-Dec-2006 netchild

MFP4 (108673, 110519, 110874):
- Currently LINUX_MAX_COMM_LEN is smaller than MAXCOMLEN, but in case
this will change we have a buffer overflow. Apply some defensive
programming to DTRT when this should happen.
- Use copyinstr() instead of copyin where appropriate.
* Fallback to copyin() in case of ENAMETOOLONG. [1]
* Use the right source and destination (it was wrong before).
- Use strlcpy instead of strcpy.
- Properly lock the read case (PR_GET_NAME) like the write case.

Reviewed by: rwatson (except [1])
Suggested by: rwatson [1]


164383 18-Nov-2006 kib

Add missed ")". Fix the build.

Pointy hat to: kib


164380 18-Nov-2006 kib

Sync struct sysinfo with real one from linux.

Submitted by: rdivacky


164379 18-Nov-2006 kib

Use standard debugging facilities in linux_getcwd().

Submitted by: rdivacky


164378 18-Nov-2006 kib

Add debuging printfs to syscalls that do not contain it yet. In
sethostname do not print the hostname because it would require to copyin
the string. Sethostname is not very frequently used.

Submitted by: rdivacky


164377 18-Nov-2006 kib

Remove unecessary locking of process in linux_getpid.

Suggested by: jhb
Submitted by: rdivacky


164297 15-Nov-2006 kib

Group pid and parent are shared in a case of CLONE_THREAD not CLONE_VM.
This fix lets clone02 LTP test pass with 2.6 emulation. In reality 99%
of the cases are that CLONE_VM and CLONE_THREAD are both set so it
seemed to work.

Submitted by: rdivacky


164296 15-Nov-2006 kib

In rev 1.188 of linux_misc.c the added check for valid options ommited
__WCLONE. This fixes it thus fixing skype/teamspeak to not keep zombies
after exit.

Submitted by: rdivacky
Reported by: Bakul Shah (bakul at bitblocks com)


164184 11-Nov-2006 trhodes

Merge posix4/* into normal kernel hierarchy.

Reviewed by: glanced at by jhb
Approved by: silence on -arch@ and -standards@


164033 06-Nov-2006 rwatson

Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges. These may
require some future tweaking.

Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>


163760 29-Oct-2006 netchild

Backout the linux aio stuff. Several problems where identified and the
dynamic nature (if no native aio code is available, the linux part
returns ENOSYS because of missing requisites) should be solved differently
than it is.

All this will be done in P4.

Not included in this commit is a backout of the changes to the native aio
code (removing static in some places). Those changes (and some more) will
also be needed when the reworked linux aio stuff will reenter the tree.

Requested by: rwatson
Discussed with: rwatson


163740 28-Oct-2006 netchild

Fix style(9).

Noticed by: rwatson


163734 28-Oct-2006 netchild

MFP4:
Implement prctl().

Submitted by: rdivacky
Tested with: LTP


163606 22-Oct-2006 rwatson

Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from: TrustedBSD Project
Sponsored by: SPARTA


163381 15-Oct-2006 netchild

Fix compile (use the right variable name).


163379 15-Oct-2006 netchild

MFP4 (with some minor changes):

Implement the linux_io_* syscalls (AIO). They are only enabled if the native
AIO code is available (either compiled in to the kernel or as a module) at
the time the functions are used. If the AIO stuff is not available there
will be a ENOSYS.

From the submitter:
---snip---
DESIGN NOTES:

1. Linux permits a process to own multiple AIO queues (distinguished by
"context"), but FreeBSD creates only one single AIO queue per process.
My code maintains a request queue (STAILQ of queue(3)) per "context",
and throws all AIO requests of all contexts owned by a process into
the single FreeBSD per-process AIO queue.

When the process calls io_destroy(2), io_getevents(2), io_submit(2) and
io_cancel(2), my code can pick out requests owned by the specified context
from the single FreeBSD per-process AIO queue according to the per-context
request queues maintained by my code.

2. The request queue maintained by my code stores contrast information between
Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks
(struct aiocb). FreeBSD IO control block actually exists in userland memory
space, required by FreeBSD native aio_XXXXXX(2).

3. It is quite troubling that the function io_getevents() of libaio-0.3.105
needs to use Linux-specific "struct aio_ring", which is a partial mirror
of context in user space. I would rather take the address of context in
kernel as the context ID, but the io_getevents() of libaio forces me to
take the address of the "ring" in user space as the context ID.

To my surprise, one comment line in the file "io_getevents.c" of
libaio-0.3.105 reads:

Ben will hate me for this

REFERENCE:

1. Linux kernel source code: http://www.kernel.org/pub/linux/kernel/v2.6/
(include/linux/aio_abi.h, fs/aio.c)

2. Linux manual pages: http://www.kernel.org/pub/linux/docs/manpages/
(io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2))

3. Linux Scalability Effort: http://lse.sourceforge.net/io/aio.html
The design notes: http://lse.sourceforge.net/io/aionotes.txt

4. The package libaio, both source and binary:
http://rpmfind.net/linux/rpm2html/search.php?query=libaio
Simple transparent interface to Linux AIO system calls.

5. Libaio-oracle: http://oss.oracle.com/projects/libaio-oracle/
POSIX AIO implementation based on Linux AIO system calls (depending on
libaio).
---snip---

Submitted by: Li, Xiao <intron@intron.ac>


163369 15-Oct-2006 netchild

MFP4 (107868 - 107870):
Use a macro to test for a valid signal instead of doing it my hand everywhere.

Submitted by: rdivacky


163217 10-Oct-2006 jhb

Don't pass unused bufsz to kern_shmctl().


163216 10-Oct-2006 jhb

Only try to copyin a msqid for the IPC_SET command to msgctl(). Other
commands (such as IPC_RMID) were bogusly failing with EFAULT.

Tested by: jkim


163215 10-Oct-2006 jhb

Remove unnecessary casts before PTRIN().


163132 08-Oct-2006 netchild

- change if (cond) panic() to KASSERT.
- Dont forget to free em in a case of error.

Suggested by: ssouhlal
Submitted by: rdivacky
Tested with: LTP


163131 08-Oct-2006 netchild

- Replace homegrown check for FIFO with S_ISFIFO. [1]
- Check the status of the options before messing with it.

Inspired by: NetBSD [1]
Submitted by: rdivacky
Tested with: LTP


162585 23-Sep-2006 netchild

MFp4:
- Linux returns ENOPROTOOPT in a case of not supported opt to setsockopt.
- Return EISDIR in pread() when arg is a directory.
- Return EINVAL instead of EFAULT when namelen is not correct in accept().
- Return EINVAL instead of EACCESS if invalid access mode is entered in
access().
- Return EINVAL instead of EADDRNOTAVAIL in a case of bad salen param
to bind().

Submitted by: rdivacky
Tested with: LTP (vfork01 fails now, but it seems to be a race and
not caused by those changes)
MFC after: 1 week


162358 16-Sep-2006 netchild

- don't reboot() when feed with wrong parameters (and enough permissions) [1]
- add support to power off the system [2]
- check the linux magic values [3]

Submitted by: Marcin Cieslak <saper@SYSTEM.PL> [1,2]
Modelled after: linux man page of the reboot() syscall [3]
Found by: LTP testcase "reboot02" [1]
Tested with: LTP testcase "reboot02" [1,3]
MFC after: 1 week


162201 10-Sep-2006 netchild

The Linux unlink syscall uses a different errno value when trying to unlink
a directory.

PR: 102897 [1]
Noticed by: Knut Anders Hatlen <kahatlen@gmail.com>, testrun with LTP [1]
Submitted by: Marcin Cieslak <saper@SYSTEM.PL>
Tested by: netchild (LTP test run)


162184 09-Sep-2006 netchild

- Extend the coverage of PROC_LOCK to cover wakeup(&p->p_emuldata);
- Lock the emuldata in a case when we just created it.

Sponsored by: Google SoC 2006
Submitted by: rdivacky
Suggested by: jhb


162182 09-Sep-2006 netchild

Change futex lock from mutex to sx. Make futex_get atomic (protected by the
futex lock).

Sponsored by: Google SoC 2006
Submitted by: rdivacky
Suggested by: jhb


162179 09-Sep-2006 netchild

- don't wake every sleeper just the first one [1]
- remove debuging printf [2]

Submitted by: intron <mag@intron.ac> [1], rdivacky [2]


161697 28-Aug-2006 ssouhlal

FREE -> free

Submitted by: rdivacky


161665 27-Aug-2006 netchild

Add the linux statfs64 call. This allows Tivoli backup to proceed a little
but further on -current (still not successful, but a step into the right
direction).

Sponsored by: Google SoC 2006
Submitted by: rdivacky
Tested by: Paul Mather <paul@gromit.dlib.vt.edu>


161637 26-Aug-2006 netchild

Correct the number of retries in a futex_wake() call.

Sponsored by: Google SoC 2006
Submitted by: rdivacky


161610 25-Aug-2006 rwatson

Don't call suser_cred() directly from linux_sethostname(), as it just
wraps userland_sysctl(), which performs necessary privilege checks as
part of its normal operation.

MFC after: 1 week


161474 20-Aug-2006 netchild

Sync the MI parts for amd64 with i386 and remove the corresponding special
handling for amd64 in the common code. The MD parts for amd64 are still
outstanding, but at least this fixes some panics on amd64.

Sponsored by: Google SoC 2006
Submitted by: rdivacky
Tested by: bsam


161461 19-Aug-2006 netchild

Get rid of some nested includes.

Sponsored by: Google SoC 2006
Submitted by: rdivacky
Noticed by: jhb


161460 19-Aug-2006 ssouhlal

MALLOC -> malloc and FREE -> free

Submitted by: rdivacky
Pointed out by: jhb


161459 19-Aug-2006 ssouhlal

ifdef DEBUG a printf

Submitted by: rdivacky


161420 17-Aug-2006 netchild

- disable some more code when osrelease=2.4.2
- protect td->td_proc->p_pid with the proc lock in linux_getpid
in the amd64 (= non i386) case [1]

Sponsored by: Google SoC 2006
Submitted by: rdivacky
Noticed by: netchild [1]


161419 17-Aug-2006 netchild

Move some stuff into headers where they belong.

Sponsored by: Google SoC 2006
Submitted by: rdivacky
Noticed by: jhb, ssouhlal


161398 17-Aug-2006 netchild

Fix the DEBUG build:
- linux_emul.c [1]
- linux_futex.c [2]

Sponsored by: Google SoC 2006 [1]
Submitted by: rdivacky [1]
netchild [2]


161365 16-Aug-2006 netchild

Style fixes to comments.

Sponsored by: Google SoC 2006
Submitted by: rdivacky
Noticed by: jhb, ssouhlal


161317 15-Aug-2006 netchild

Disable some parts of the code on amd64 for now to prevent a panic. A better
fix will come later.

Sponsored by: Google SoC 2006
Submitted by: rdivacky


161310 15-Aug-2006 netchild

Add the linux 2.6.x stuff (not used by default!):
- TLS - complete
- pid/tid mangling - complete
- thread area - complete
- futexes - complete with issues
- clone() extension - complete with some possible minor issues
- mq*/timer*/clock* stuff - complete but untested and the mq* stuff is
disabled when not build as part of the kernel with native FreeBSD mq*
support (module support for this will come later)

Tested with:
- linux-firefox - works, tested
- linux-opera - works, tested
- linux-realplay - doesnt work, issue with futexes
- linux-skype - doesnt work, issue with futexes
- linux-rt2-demo - works, tested
- linux-acroread - doesnt work, unknown reason (coredump) and sometimes
issue with futexes
- various unix utilities in linux-base-gentoo3 and linux-base-fc4:
everything tried worked

On amd64 not everything is supported like on i386, the catchup is planned for
later when the remaining bugs in the new functions are fixed.

To test this new stuff, you have to run
sysctl compat.linux.osrelease=2.6.16
to switch back use
sysctl compat.linux.osrelease=2.4.2

Don't switch while running a linux program, strange things may or may not
happen.

Sponsored by: Google SoC 2006
Submitted by: rdivacky
Some suggestions/help by: jhb, kib, manu@NetBSD.org, netchild


161304 15-Aug-2006 netchild

Add some new files needed for linux 2.6.x compatibility.

Please don't style(9) the NetBSD code, we want to stay in sync. Not imported
on a vendor branch since we need local changes.

Sponsored by: Google SoC 2006
Submitted by: rdivacky
With help from: manu@NetBSD.org
Obtained from: NetBSD (linux_{futex,time}.*)


160555 21-Jul-2006 jhb

- Pass the MPSAFE flag to namei() in linux_uselib() and handle conditional
Giant VFS locking in that function.
- Remove bogus code to handle the case where namei() returns success but a
NULL vnode pointer.
- Note that this code duplicates exec_check_permissions() and annotate
where it differs.
- Hold the vnode lock longer to protect the write to set VV_TEXT in
v_vflag.
- Mark linux_uselib() MPSAFE.

Reviewed by: rwatson


160506 19-Jul-2006 jhb

Don't free the sockaddr in kern_bind() and kern_connect() as not all
callers pass a sockaddr allocated via malloc() from M_SONAME anymore.
Instead, free it in the callers when necessary.


160276 11-Jul-2006 jhb

- Add conditional VFS Giant locking to getdents_common() (linux ABIs),
ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(),
and svr4_sys_getdents64() similar to that in getdirentries().
- Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(),
linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and
svr4_sys_getdents64() MPSAFE.


160190 08-Jul-2006 jhb

Add a kern_close() so that the ABIs can close a file descriptor w/o having
to populate a close_args struct and change some of the places that do.


160187 08-Jul-2006 jhb

Rework kern_semctl a bit to always assume the UIO_SYSSPACE case. This
mostly consists of pushing a few copyin's and copyout's up into
__semctl() as all the other callers were already doing the UIO_SYSSPACE
case. This also changes kern_semctl() to set the return value in a passed
in pointer to a register_t rather than td->td_retval[0] directly so that
callers can only set td->td_retval[0] if all the various copyout's succeed.

As a result of these changes, kern_semctl() no longer does copyin/copyout
(except for GETALL/SETALL) so simplify the locking to acquire the semakptr
mutex before the MAC check and hold it all the way until the end of the
big switch statement. The GETALL/SETALL cases have to temporarily drop it
while they do copyin/malloc and copyout. Also, simplify the SETALL case to
remove handling for a non-existent race condition.


160143 06-Jul-2006 jhb

- Protect the list of linux ioctl handlers with an sx lock.
- Hold Giant while calling linux ioctl handlers for now as they aren't all
known to be MPSAFE yet.
- Mark linux_ioctl() MPSAFE.


159992 27-Jun-2006 jhb

Axe the stackgap macros as the Linux ABIs no longer use the stackgap.


159991 27-Jun-2006 jhb

- Add a kern_semctl() helper function for __semctl(). It accepts a pointer
to a copied-in copy of the 'union semun' and a uioseg to indicate which
memory space the 'buf' pointer of the union points to. This is then used
in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap.
- Mark linux_ipc() and svr4_sys_semsys() MPSAFE.


159896 23-Jun-2006 netchild

The linux times syscall can be called with a NULL pointer, so keep cool
and don't panic.

This fix is different from the patch submitted as it not only prevents
a NULL-pointer dereference, but also skips some work in this case.

Noticed by: Dmitry Ganenko <dima@apk-inform.com>
Reviewed by: rdivacky (the original version as in emulation@)
MFC after: 1 week
Security: This is a RELENG_x_y candidate (local DoS).
Go ahead by: secteam (cperciva)


158658 16-May-2006 ambrisko

Fix file leaking in translate_path_major_minor.


158415 10-May-2006 netchild

Now that we don't have a linuxolator on alpha anymore:
- unifdef __alpha__
- revert rev. 1.66 of linux_socket.c


158406 10-May-2006 netchild

Implement rt_sigpending in the linuxolator.

PR: 92671
Submitted by: Markus Niemist"o <markus.niemisto@gmx.net>


158312 05-May-2006 ambrisko

Fix the the duplicate cut-n-paste in linux_fstat64 pointed out by
Alexander Leidinger. I forget to fix it in this version.


158311 05-May-2006 ambrisko

Enhance the Linux emulation layer to make MegaRAID SAS managements tool happy.
Add back in a scheme to emulate old type major/minor numbers via hooks into
stat, linprocfs to return major/minors that Linux app's expect. Currently
only /dev/null is always registered. Drivers can register via the Linux
type shim similar to the ioctl shim but by using
linux_device_register_handler/linux_device_unregister_handler functions.
The structure is:

struct linux_device_handler {
char *bsd_driver_name;
char *linux_driver_name;
char *bsd_device_name;
char *linux_device_name;
int linux_major;
int linux_minor;
int linux_char_device;
};

Linprocfs uses this to display the major number of the driver. The
soon to be available linsysfs will use it to fill in the driver name.
Linux_stat uses it to translate the major/minor into Linux type values.

Note major numbers are dynamically assigned via passing in a -1 for
the major number so we don't need to keep track of them.

This is somewhat needed due to us switching to our devfs. MegaCli
will not run until I add in the linsysfs and mfi Linux compat changes.

Sponsored by: IronPort Systems


157369 01-Apr-2006 rwatson

Annotate uses of fgetsock() with indications that they should rely
on their existing file descriptor references to sockets, rather than
use fgetsock() to retrieve a direct socket reference.

MFC after: 3 months


157189 27-Mar-2006 avatar

Unbreaking build by removing a now unused variable.


157183 27-Mar-2006 jhb

Use td_ucred rather than p_ucred to avoid panics and general unhappiness.

Pointy hat to: netchild


156976 21-Mar-2006 netchild

Fix the LINT build on alpha:
- rename some file local structure definitions, the names clash with
autogenerated names
- on !alpha add some compatibility defines for those renamed structures
- make some functions globally visible on alpha


156921 20-Mar-2006 netchild

Fix tinderbox on alpha.

Tested by: cross-compile


156874 19-Mar-2006 ru

Unbreak COMPAT_LINUX32 option support on amd64.

Broken by: netchild


156850 18-Mar-2006 netchild

Fixup some problems in my previous commit (COMPAT_43).

Pointyhat to: netchild


156842 18-Mar-2006 netchild

Get rid of the need of COMPAT_43 in the linuxolator.

Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz>
Obtained from: DragonFly (some parts)


155382 06-Feb-2006 jeff

- Remove ifdef disabled code that doesn't have a chance of working anymore.


155033 30-Jan-2006 jeff

- vn_lock with LK_RETRY can not return an error. The code that handled this
case was not necessary.

Sponsored by: Isilon Systems, Inc.


154872 26-Jan-2006 cognet

Fix a typo : deivce => device

Spotted by: rwatson


154834 26-Jan-2006 cognet

Linux compat bits needed to make linux programs use the new ptys :
linux_ioctl.[ch] : Implement LINUX_TIOCGPTN, which returns the pty number
linux_stats.c :
- Return the magic number for devfs.
- In various stats()-related functions, check that we're stating a
file in /dev/pts, and if so, change the st_rdev field to match what linux
expects to be there for a slave pty device. The glibc checks for this, and
their openpty() fails if it is no correct.


153775 28-Dec-2005 trhodes

Cast tv_sec to intmax_t and print with %jd in some ifdef'ed code.


153744 27-Dec-2005 glebius

Add \n to log() message.

Submitted by: Stanislaw Halik <weirdo tehran.lain.pl>


153448 15-Dec-2005 jhb

Remove linux_mib_destroy() (which I actually added in between 5.0 and 5.1)
which existed to cleanup the linux_osname mutex. Now that MTX_SYSINIT()
has grown a SYSUNINIT to destroy mutexes on unload, the extra destroy here
was redundant and resulted in panics in debug kernels.

MFC after: 1 week
Reported by: Goran Gajic ggajic at afrodita dot rcub dot bg dot ac dot yu


153378 13-Dec-2005 delphij

In Linux, kernel parameters passed to ioctl are by value, while in FreeBSD
they are passed by reference. Handle the difference within the
linux_ioctl_termio on the LINUX_TCFLSH path.

Submitted by: Jaroslav Drzik <jaro_AT_coop-voz_dot_sk>


153236 08-Dec-2005 glebius

Suppress logging about unimplemented syscalls to one time per process. This
prevents hard flood of the system console.

Reviewed by: bde


153072 04-Dec-2005 ru

Fix -Wundef.


151316 14-Oct-2005 davidxu

1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
sendsig use discrete parameters, now they uses member fields of
ksiginfo_t structure. For sendsig, this change allows us to pass
POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
be fixed.

6. In this sigqueue implementation, pending signal set is kept as before,
an extra siginfo list holds additional siginfo_t data for signals.
kernel code uses psignal() still behavior as before, it won't be failed
even under memory pressure, only exception is when deleting a signal,
we should call sigqueue_delete to remove signal from sigqueue but
not SIGDELSET. Current there is no kernel code will deliver a signal
with additional data, so kernel should be as stable as before,
a ksiginfo can carry more information, for example, allow signal to
be delivered but throw away siginfo data if memory is not enough.
SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
not be caught or masked.
The sigqueue() syscall allows user code to queue a signal to target
process, if resource is unavailable, EAGAIN will be returned as
specification said.
Just before thread exits, signal queue memory will be freed by
sigqueue_flush.
Current, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64


150663 28-Sep-2005 rwatson

Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57,
osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60,
svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81,
svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55,
svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10,
ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58,
unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133:

Now that Giant is acquired in uprintf() and tprintf(), the caller no
longer leads to acquire Giant unless it also holds another mutex that
would generate a lock order reversal when calling into these functions.
Specifically not backed out is the acquisition of Giant in nfs_socket.c
and rpcclnt.c, where local mutexes are held and would otherwise violate
the lock order with Giant.

This aligns this code more with the eventual locking of ttys.

Suggested by: bde


150335 19-Sep-2005 rwatson

Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(),
as they both interact with the tty code (!MPSAFE) and may sleep if the
tty buffer is full (per comment).

Modify all consumers of uprintf() and tprintf() to hold Giant around
calls into these functions. In most cases, this means adding an
acquisition of Giant immediately around the function. In some cases
(nfs_timer()), it means acquiring Giant higher up in the callout.

With these changes, UFS no longer panics on SMP when either blocks are
exhausted or inodes are exhausted under load due to races in the tty
code when running without Giant.

NB: Some reduction in calls to uprintf() in the svr4 code is probably
desirable.

NB: In the case of nfs_timer(), calling uprintf() while holding a mutex,
or even in a callout at all, is a bad idea, and will generate warnings
and potential upset. This needs to be fixed, but was a problem before
this change.

NB: uprintf()/tprintf() sleeping is generally a bad ideas, as is having
non-MPSAFE tty code.

MFC after: 1 week


149551 28-Aug-2005 delphij

Fix kernel build.

Reported by: tinderbox


149524 27-Aug-2005 rodrigc

Rewrite linux_ifconf() to be more like ifconf() in net/if.c
so that we do not call uiomove() while IFNET_RLOCK() is held.
This eliminates the witness warning:

Calling uiomove() with the following non-sleepable locks held:
exclusive sleep mutex ifnet r = 0 (0xc096dd60) locked @
/usr/src/sys/modules/linux/../../compat/linux/linux_ioctl.c:2170

MFC after: 2 days


148887 09-Aug-2005 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days


148540 29-Jul-2005 jhb

Move MODULE_DEPEND() statements for SYSVIPC dependencies to linux_ipc.c
so that they aren't duplicated 3 times and are also in the same file as
the code that depends on the SYSVIPC modules.


147854 09-Jul-2005 jhb

Add Giant around linux_getcwd_common() in linux_getcwd().

Approved by: re (scottl)


147853 09-Jul-2005 jhb

Add missing locking to linux_connect() so that it can be marked MP safe:
- Conditionally grab Giant around the EISCONN hack at the end based on
debug.mpsafenet.
- Protect access to so_emuldata via SOCK_LOCK.

Reviewed by: rwatson
Approved by: re (scottl)


147816 07-Jul-2005 jhb

Fix the computation of uptime for linux_sysinfo(). Before it was returning
the uptime in seconds mod 60 which wasn't very useful.

Approved by: re (scottl)


147559 23-Jun-2005 pjd

Actually only protect mount-point if security.jail.enforce_statfs is set to 2.
If we don't return statistics about requested file systems, system tools
may not work correctly or at all.

Approved by: re (scottl)


147185 09-Jun-2005 pjd

Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs
and extend its functionality:

value policy
0 show all mount-points without any restrictions
1 show only mount-points below jail's chroot and show only part of the
mount-point's path (if jail's chroot directory is /jails/foo and
mount-point is /jails/foo/usr/home only /usr/home will be shown)
2 show only mount-point where jail's chroot directory is placed.

Default value is 2.

Discussed with: rwatson


147141 08-Jun-2005 sobomax

Properly convert FreeBSD priority values into Linux values in the
getpriority(2) syscall.

PR: kern/81951
Submitted by: Andriy Gapon <avg@icyb.net.ua>


146695 27-May-2005 pjd

Remove (now) unused argument 'td' from bsd_to_linux_statfs().


146505 22-May-2005 pjd

The code is under '#ifdef not_that_way', but anyway:

- Add missing prison_check_mount() check.


146502 22-May-2005 pjd

If we need to hide fsid, kern_statfs()/kern_fstatfs() will do it for us,
so do not duplicate the code in cvtstatfs().
Note, that we now need to clear fsid in freebsd4_getfsstat().

This moves all security related checks from functions like cvtstatfs()
and will allow to add more security related stuff (like statfs(2), etc.
protection for jails) a bit easier.


145584 27-Apr-2005 jeff

- Pass the ISOPEN flag to namei so filesystems will know we're about to
open them or otherwise access the data.


145006 13-Apr-2005 jeff

- Change all filesystems and vfs_cache to relock the dvp once the child is
locked in the ISDOTDOT case. Se vfs_lookup.c r1.79 for details.

Sponsored by: Isilon Systems, Inc.


144988 13-Apr-2005 mdodd

Implement SOUND_MIXER_INFO ioctl in compat layer.


144987 13-Apr-2005 mdodd

Add support for O_NOFOLLOW and O_DIRECT to Linux fcntl() F_GETFL/F_SETFL.


144501 01-Apr-2005 jhb

- Change the vm_mmap() function to accept an objtype_t parameter specifying
the type of object represented by the handle argument.
- Allow vm_mmap() to map device memory via cdev objects in addition to
vnodes and anonymous memory. Note that mmaping a cdev directly does not
currently perform any MAC checks like mapping a vnode does.
- Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the
cdev the ioctl is acting on rather than trying to find a suitable vnode
to map from.

Reviewed by: alc, arch@


144290 29-Mar-2005 jeff

- Initial cn_lkflags to LK_EXCLUSIVE.

Sponsored by: Isilon Systems, Inc.


144075 24-Mar-2005 brooks

Use the CTASSERT() macro instead of rolling my own, non-portable one
using #error.

Suggested by: jhb


144070 24-Mar-2005 brooks

Compile errors are way more useful then panics later.

Replace a KASSERT of LINUX_IFNAMSIZ == IFNAMSIZ with a preprocessor
check and #error message. This will prevent nasty suprises if users
change IFNAMSIZ without updating the linux code appropriatly.


144012 23-Mar-2005 das

Reject packets larger than IP_MAXPACKET in linux_sendto() for sockets
with the IP_HDRINCL option set. Without this change, a Linux process
with access to a raw socket could cause a kernel panic. Raw sockets
must be created by root, and are generally not consigned to untrusted
applications; hence, the security implications of this bug are
minimal. I believe this only affects 6-CURRENT on or after 2005-01-30.

Found by: Coverity Prevent analysis tool
Security: Local DOS


143635 15-Mar-2005 phk

Neuter the duplicated disk-device magic code for now. Somebody with
serious linux-clue is necessary to fix this properly.


143295 08-Mar-2005 sobomax

Add kernel-only flag MSG_NOSIGNAL to be used in emulation layers to surpress
SIGPIPE signal for the duration of the sento-family syscalls. Use it to
replace previously added hack in Linux layer based on temporarily setting
SO_NOSIGPIPE flag.

Suggested by: alfred


143233 07-Mar-2005 sobomax

Handle MSG_NOSIGNAL flag in linux_send() by setting SO_NOSIGPIPE on socket
for the duration of the send() call. Such approach may be less than ideal
in threading environment, when several threads share the same socket and it
might happen that several of them are calling linux_send() at the same time
with and without SO_NOSIGPIPE set.

However, such race condition is very unlikely in practice, therefore this
change provides practical improvement compared to the previous behaviour.

PR: kern/76426
Submitted by: Steven Hartland <killing@multiplay.co.uk>
MFC after: 3 days


143197 07-Mar-2005 sobomax

Handle unimplemented syscall by instantly returning ENOSYS instead of sending
signal first and only then returning ENOSYS to match what real linux does.

PR: kern/74302
Submitted by: Travis Poppe <tlp@LiquidX.org>


142939 01-Mar-2005 jhb

Remove linux_emul_find() and the CHECKALT*() macros as they are no longer
used.


142220 22-Feb-2005 phk

Neuter linux_ustat() until somebody finds time to try to fix it.

The fundamental problem is that we get only the lower 8 bits of the
minor device number so there is no guarantee that we can actually
find the disk device in question at all.

This was probably a bigger issue pre-GEOM where the upper bits
signaled which slice were in use.

The secondary problem is how we get from (partial) dev_t to vnode.

The correct implementation will involve traversing the mount list
looking for a perfect match or a possible match (for truncated
minor).


141829 13-Feb-2005 njl

Unbreak the kernel build. Pointy hat to: sobomax.


141815 13-Feb-2005 sobomax

Backout previous change (disabling of security checks for signals delivered
in emulation layers), since it appears to be too broad.

Requested by: rwatson


141812 13-Feb-2005 sobomax

Split out kill(2) syscall service routine into user-level and kernel part, the
former is callable from user space and the latter from the kernel one. Make
kernel version take additional argument which tells if the respective call
should check for additional restrictions for sending signals to suid/sugid
applications or not.

Make all emulation layers using non-checked version, since signal numbers in
emulation layers can have different meaning that in native mode and such
protection can cause misbehaviour.

As a result remove LIBTHR from the signals allowed to be delivered to a
suid/sugid application.

Requested (sorta) by: rwatson
MFC after: 2 weeks


141691 11-Feb-2005 sobomax

Semctl with IPC_STAT command should return zero in case of success.

PR: 73778
Submitted by: Andriy Gapon <avg@icyb.net.ua>
MFC after: 2 weeks


141473 07-Feb-2005 jhb

- Use kern_{l,f,}stat() and kern_{f,}statfs() functions rather than
duplicating the contents of the same functions inline.
- Consolidate common code to convert a BSD statfs struct to a Linux struct
into a static worker function.


141472 07-Feb-2005 jhb

Make linux_emul_convpath() a simple wrapper for kern_alternate_path().


141471 07-Feb-2005 jhb

- Tweak kern_msgctl() to return a copy of the requested message queue id
structure in the struct pointed to by the 3rd argument for IPC_STAT and
get rid of the 4th argument. The old way returned a pointer into the
kernel array that the calling function would then access afterwards
without holding the appropriate locks and doing non-lock-safe things like
copyout() with the data anyways. This change removes that unsafeness and
resulting race conditions as well as simplifying the interface.
- Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(),
fstatfs(), and fhstatfs(). Use these wrappers to cut out a lot of
code duplication for freebsd4 and netbsd compatability system calls.
- Add a new lookup function kern_alternate_path() that looks up a filename
under an alternate prefix and determines which filename should be used.
This is basically a more general version of linux_emul_convpath() that
can be shared by all the ABIs thus allowing for further reduction of
code duplication.


141467 07-Feb-2005 jhb

Use kern_setitimer() to implement linux_alarm() instead of fondling the
real interval timer directly.


141031 30-Jan-2005 sobomax

Boot away another stackgap (one of the lest ones in linuxlator/i386) by
providing special version of CDIOCREADSUBCHANNEL ioctl(), which assumes that
result has to be placed into kernel space not user space. In the long run
more generic solution has to be designed WRT emulating various ioctl()s
that operate on userspace buffers, but right now there is only one such
ioctl() is emulated, so that it makes little sense.

MFC after: 2 weeks


141029 30-Jan-2005 sobomax

Extend kern_sendit() to take another enum uio_seg argument, which specifies
where the buffer to send lies and use it to eliminate yet another stackgap
in linuxlator.

MFC after: 2 weeks


140839 26-Jan-2005 sobomax

Split out kernel side of msgctl(2) into two parts: the first that pops data
from the userland and pushes results back and the second which does
actual processing. Use the latter to eliminate stackgap in the linux wrapper
of that syscall.

MFC after: 2 weeks


140832 25-Jan-2005 sobomax

Split out kernel side of {get,set}itimer(2) into two parts: the first that
pops data from the userland and pushes results back and the second which does
actual processing. Use the latter to eliminate stackgap in the linux wrappers
of those syscalls.

MFC after: 2 weeks


140214 14-Jan-2005 obrien

Match the LINUX32's style with existing style
Submitted by: Jung-uk Kim <jkim@niksun.com>

Use positive, not negative logic.


140213 14-Jan-2005 obrien

Fix Linux compat 'uname -m' on AMD64.

Submitted by: Jung-uk Kim <jkim@niksun.com>
(patch reworked by me)


139743 05-Jan-2005 imp

Start each of the license/copyright comments with /*-


138353 03-Dec-2004 phk

Do not blindly pass linux filesystem specific mount data across.


138107 26-Nov-2004 phk

Ignore MNT_NODEV option, it is implicit in choice of filesystem.


136356 10-Oct-2004 dwmalone

Rename thread args to be called "td" rather than "p" to be
consistent with other bits of this file. There should be no
functional change.

Submitted by: Andrea Campi (many moons ago)
MFC after: 2 month


136152 05-Oct-2004 jhb

Rework how we store process times in the kernel such that we always store
the raw values including for child process statistics and only compute the
system and user timevals on demand.

- Fix the various kern_wait() syscall wrappers to only pass in a rusage
pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
times it needs rather than calling getrusage() twice with associated
stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
for user, system, and interrupt time as well as a bintime of the total
runtime. A new p_rux field in struct proc replaces the same inline fields
from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux
field in struct proc contains the "raw" child time usage statistics.
ruadd() has been changed to handle adding the associated rusage_ext
structures as well as the values in rusage. Effectively, the values in
rusage_ext replace the ru_utime and ru_stime values in struct rusage. These
two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
calculates appropriate timevals for user and system time as well as updating
the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a
copy of the process' p_rux structure to compute the timevals after updating
the runtime appropriately if any of the threads in that process are
currently executing. It also now only locks sched_lock internally while
doing the rux_runtime fixup. calcru() now only requires the caller to
hold the proc lock and calcru1() only requires the proc lock internally.
calcru() also no longer allows callers to ask for an interrupt timeval
since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
calling calcru1() on p_crux. Note that this means that any code that wants
child times must now call this function rather than reading from p_cru
directly. This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
proctree lock is used to protect the process group rather than the process
group lock. By holding this lock until the end of the function we now
ensure that the process/thread that we pick to dump info about will no
longer vanish while we are trying to output its info to the console.

Submitted by: bde (mostly)
MFC after: 1 month


135715 24-Sep-2004 phk

Hold thread reference while frobbing cdevsw.


134266 24-Aug-2004 jhb

Fix the ABI wrappers to use kern_fcntl() rather than calling fcntl()
directly. This removes a few more users of the stackgap and also marks
the syscalls using these wrappers MP safe where appropriate.

Tested on: i386 with linux acroread5
Compiled on: i386, alpha LINT


134209 23-Aug-2004 des

Don't try to translate the control message unless we're certain it's
valid; otherwise a caller could trick us into changing any 32-bit word
in kernel memory to LINUX_SOL_SOCKET (0x00000001) if its previous value
is SOL_SOCKET (0x0000ffff).

MFC after: 3 days


133850 16-Aug-2004 obrien

Fix the 'DEBUG' argument code to unbreak the amd64 LINT build.


133845 16-Aug-2004 obrien

Fix the 'DEBUG' argument code to unbreak the amd64 LINT build.


133840 16-Aug-2004 obrien

Fix the 'DEBUG' argument code to unbreak the LINT build.


133816 16-Aug-2004 tjr

Changes to MI Linux emulation code necessary to run 32-bit Linux binaries
on AMD64, and the general case where the emulated platform has different
size pointers than we use natively:
- declare certain structure members as l_uintptr_t and use the new PTRIN
and PTROUT macros to convert to and from native pointers.
- declare some structures __packed on amd64 when the layout would differ
from that used on i386.
- include <machine/../linux32/linux.h> instead of <machine/../linux/linux.h>
if compiling with COMPAT_LINUX32. This will need to be revisited before
32-bit and 64-bit Linux emulation support can coexist in the same kernel.
- other small scattered changes.

This should be a no-op on i386 and Alpha.


133749 15-Aug-2004 tjr

Replace linux_getitimer() and linux_setitimer() with implementations
based on those in freebsd32_misc.c, removing the assumption that Linux
uses the same layout for struct itimerval as we use natively.


133747 15-Aug-2004 tjr

Avoid assuming that l_timeval is the same as the native struct timeval
in linux_select().


133745 15-Aug-2004 tjr

Use sv_psstrings from the current process's sysentvec structure instead
of PS_STRINGS. This is a no-op at present, but it will be needed when
running 32-bit Linux binaries on amd64 to ensure PS_STRINGS is in
addressable memory.


133716 14-Aug-2004 phk

Add XXX comment about findcdev() misuse.


132708 27-Jul-2004 phk

Use kernel_vmount() instead of vfs_nmount().


132653 26-Jul-2004 cperciva

Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is
somewhat clearer, but more importantly allows for a consistent naming
scheme for suser_cred flags.

The old name is still defined, but will be removed in a few days (unless I
hear any complaints...)

Discussed with: rwatson, scottl
Requested by: jhb


132347 18-Jul-2004 dwmalone

I missed two pieces of the commit to this file. Robert has already
added one, this adds the other.


132331 18-Jul-2004 rwatson

Remove 'sg' argument to linux_sendto_hdrincl, which is what I think was
intended. This fixes the build, but might require revision.


132313 17-Jul-2004 dwmalone

Add a kern_setsockopt and kern_getsockopt which can read the option
values from either user land or from the kernel. Use them for
[gs]etsockopt and to clean up some calls to [gs]etsockopt in the
Linux emulation code that uses the stackgap.


131897 10-Jul-2004 phk

Clean up and wash struct iovec and struct uio handling.

Add copyiniov() which copies a struct iovec array in from userland into
a malloc'ed struct iovec. Caller frees.

Change uiofromiov() to malloc the uio (caller frees) and name it
copyinuio() which is more appropriate.

Add cloneuio() which returns a malloc'ed copy. Caller frees.

Use them throughout.


131796 08-Jul-2004 phk

Use a couple of regular kernel entry points, rather than COMPAT_43
entry points.


131461 02-Jul-2004 netchild

Implement SNDCTL_DSP_SETDUPLEX. This may fix sound apps which want to
use full duplex mode.

Approved by: matk


130959 23-Jun-2004 bde

Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of
depending on namespace pollution in <sys/vnode.h> for the definition
of GIANT_REQUIRED.

Sorted includes.


130902 22-Jun-2004 rwatson

Mark linux_emul_convpath() as GIANT_REQUIRED.


130691 18-Jun-2004 bms

Add stub for Linux SOUND_MIXER_READ_RECMASK, required by some Linux sound
applications.

PR: misc/27471
Submitted by: Gavin Atkinson (with cleanups)


130689 18-Jun-2004 bms

Add a stub for the Linux SOUND_MIXER_INFO ioctl (even though we don't
actually implement it), as some applications, such as RealProducer,
expect to be able to use it.

PR: kern/65971
Submitted by: Matt Wright


130688 18-Jun-2004 bms

Linux applications expect to be able to call SIOCGIFCONF with an
NULL ifc.ifc_buf pointer, to determine the expected buffer size.

The submitted fix only takes account of interfaces with an AF_INET
address configured. This could no doubt be improved.

PR: kern/45753
Submitted by: Jacques Garrigue (with cleanups)


130687 18-Jun-2004 bms

Fix the VT_SETMODE/CDROMIOCTOCENTRY problem correctly.

Reviewed by: tjr


130682 18-Jun-2004 bms

Fix two attempts to use an unchecked NULL pointer provided from the
userland, for the CDIOREADTOCENTRY and VT_SETMODE cases respectively.

Noticed by: tjr


130640 17-Jun-2004 phk

Second half of the dev_t cleanup.

The big lines are:
NODEV -> NULL
NOUDEV -> NODEV
udev_t -> dev_t
udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.


130585 16-Jun-2004 phk

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


130453 14-Jun-2004 phk

Add support for more linux ioctls.

I've had this sitting in my tree for a long time and I can't seem to
find who sent it to me in the first place, apologies to whoever is
missing out on a Contributed by: line here.

I belive it works as it should.


130344 11-Jun-2004 phk

Deorbit COMPAT_SUNOS.

We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither
a sparc32 port nor a SunOS4.x compatibility desire these days.


127140 17-Mar-2004 jhb

- Replace wait1() with a kern_wait() function that accepts the pid,
options, status pointer and rusage pointer as arguments. It is up to
the caller to copyout the status and rusage to userland if needed. This
lets us axe the 'compat' argument and hide all that functionality in
owait(), by the way. This also cleans up some locking in kern_wait()
since it no longer has to drop locks around copyout() since all the
copyout()'s are deferred.
- Convert owait(), wait4(), and the various ABI compat wait() syscalls to
use kern_wait() rather than wait1() or wait4(). This removes a bit
more stackgap usage.

Tested on: i386
Compiled on: i386, alpha, amd64


127059 16-Mar-2004 tjr

Use vfs_nmount() to mount linprocfs filesystems in linux_mount();
linprocfs doesn't support the old mount interface.


127057 16-Mar-2004 tjr

Correct size argument passed to copyinstr() in linux_mount(): mntfromname
and mntonname are both MNAMELEN characters long, not MFSNAMELEN.


126851 11-Mar-2004 phk

Remove unused second arg to vfinddev().
Don't call addaliasu() on VBLK nodes.


126081 21-Feb-2004 phk

Device megapatch 5/6:

Remove the unused second argument from udev2dev().

Convert all remaining users of makedev() to use udev2dev(). The
semantic difference is that udev2dev() will only locate a pre-existing
dev_t, it will not line makedev() create a new one.

Apart from the tiny well controlled windown in D_PSEUDO drivers,
there should no longer be any "anonymous" dev_t's in the system
now, only dev_t's created with make_dev() and make_dev_alias()


125997 19-Feb-2004 bms

Add BSD compatibility tty ioctls LINUX_TIOCSBRK and LINUX_TIOCCBRK. This
addition appears to allow VMware 3 Workstation to operate with nmdm(4)
as a virtual COM device.

Tested by: Guido van Rooij


125454 04-Feb-2004 jhb

Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count. The plimit
structure is treated similarly to struct ucred in that is is always copy
on write, so having a reference to a structure is sufficient to read from
it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading
limits from a process to keep the limit structure from changing out from
under you while reading from it.
- Various global limits that are ints are not protected by a lock since
int writes are atomic on all the archs we support and thus a lock
wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted
behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return
either an rlimit, or the current or max individual limit of the specified
resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of
other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit()
(it didn't used the stackgap when it should have) but uses lim_rlimit()
and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls,
but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits. It
also no longer uses the stackgap for accessing sysctl's for the
ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result,
ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.

Submitted by: mtm (mostly, I only did a few cleanups and catchups)
Tested on: i386
Compiled on: alpha, amd64


124537 14-Jan-2004 truckman

VOP_GETATTR() wants the vnode passed to it to be locked. Instead
of adding the code to lock and unlock the vnodes and taking care
to avoid deadlock, simplify linux_emul_convpath() by comparing the
vnode pointers directly instead of comparing their va_fsid and
va_fileid attributes. This allows the removal of the calls to
VOP_GETATTR().


124082 02-Jan-2004 alc

Lock the traversal of the vm object list. Use TAILQ_FOREACH consistently.


123828 25-Dec-2003 bde

Quick fix for LINT breakage caused by interface changes in accept(2), etc.
The log message for rev.1.160 of kern/uipc_syscalls.c and associated
changes only claimed to add restrict qualifiers (which have no effect in
the kernel so they probably shouldn't be added), but the following
interface changes were also made:
- caddr_t to `void *' and `struct sockaddr_t *'
- `int *' to `socklen_t *'.
These interface changes are not quite null, and this fix is quick (like
the changes in uipc_syscalls 1.160) because it uses bogus casts instead
of complete bounds-checked conversions.

Things should be fixed better when the conversions can be done without
using the stack gap. linux_check_hdrincl() already uses the stack gap
and is fixed completely though the type mismatches in it were not fatal
(there were only fatal type mismatches from unopaquing pointers to
[o]sockaddr't's -- the difference between accept()'s args and oaccept()'s
args is now non-opaque, but this is not reflected in their args structs).


122892 19-Nov-2003 kan

Do not call VOP_GETATTR in getdents function. It does not serve any
purpose and the resulting vattr structure was ignored. In addition,
the VOP_GETATTR call was made with no vnode lock held, resulting in
vnode locking violation panic with debug kernels.

Reported by: truckman

Approved by: re@ (rwatson)


122861 17-Nov-2003 rwatson

Add a MAC check for VOP_LOOKUP() in the Linux getwcd() implementation.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories


122802 16-Nov-2003 sobomax

Pull latest changes from OpenBSD:

- improve sysinfo(2) syscall;
- add dummy fadvise64(2) syscall;
- add dummy *xattr(2) family of syscalls;
- add protos for the syscalls 222-225, 238-249 and 253-267;
- add exit_group(2) syscall, which is currently just wired to exit(2).

Obtained from: OpenBSD
MFC after: 2 weeks


122358 09-Nov-2003 dwmalone

Use kern_sendit rather than sendit for the Linux send* syscalls.
This means we can avoid using the stack gap for most send* syscalls
now (it is still used in the IP_HDRINCL case).


122153 05-Nov-2003 anholt

Prevent leaking of fsid to non-root users in linux_statfs and linux_fstatfs.
Matches native syscalls now.

PR: kern/58793
Submitted by: David P. Reese Jr. <daver@gomerbud.com>
MFC after: 1 week


122088 05-Nov-2003 fjoe

Back out the following revisions:

1.36 +73 -60 src/sys/compat/linux/linux_ipc.c
1.83 +102 -48 src/sys/kern/sysv_shm.c
1.8 +4 -0 src/sys/sys/syscallsubr.h

That change was intended to support vmware3, but
wantrem parameter is useless because vmware3 uses SYSV shared memory
to talk with X server and X server is native application.
The patch worked because check for wantrem was not valid
(wantrem and SHMSEG_REMOVED was never checked for SHMSEG_ALLOCATED segments).

Add kern.ipc.shm_allow_removed (integer, rw) sysctl (default 0) which when set
to 1 allows to return removed segments in
shm_find_segment_by_shmid() and shm_find_segment_by_shmidx().

MFC after: 1 week


121816 31-Oct-2003 brooks

Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.

Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)


121302 21-Oct-2003 tjr

Reject negative ngrp arguments in linux_setgroups() and linux_setgroups16();
stops users being able to cause setgroups to clobber the kernel stack by
copying in data past the end of the linux_gidset array.


121286 20-Oct-2003 sam

fix build: linux_to_bsd_msf_lba is no longer used because of previous commit


121272 20-Oct-2003 sos

We dont support CDROMREADAUDIO anymore.


121008 11-Oct-2003 iwasaki

Fix some problems in linux_sendmsg() and linux_recvmsg().
- Allocate storage for uap->msg always because it is copyin()'ed in
native sendmsg().
- Convert sockopt level from Linux to FreeBSD after native recvmsg() calling.
- Some cleanups.

Tested with: Oracle 9i shared server connection mode.

MFC after: 1 week


119839 07-Sep-2003 bde

Restored a non-egregious cast so that this file compiles on i386's
with 64-bit longs again. This was fixed in rev.1.42 but the fix
rotted non-fatally in rev.1.105 and fatally in rev.1.137.

Many more non-egregrious casts are strictly required for conversions
from semi-opaque types to pointers, but we avoid most of them by using
types that are almost certain to be compatible with uintptr_t for
representing pointers (e.g., vm_offset_t). Here we don't really want
the u_longs, but we have them because a.out.h and its support code
doesn't use typedefs (it uses unsigned in V7 and unsigned long in
FreeBSD) and is too obsolete to fix now.


118149 29-Jul-2003 des

Try to make 'uname -a' look more like it does on Linux:

- cut the version string at the newline, suppressing information about
who built the kernel and in what directory. Most of this information
was already lost to truncation.

- on i386, return the precise CPU class (if known) rather than just
"i386". Linux software which uses this information to select
which binary to run often does not know what to make of "i386".


118047 26-Jul-2003 phk

Add a "int fd" argument to VOP_OPEN() which in the future will
contain the filedescriptor number on opens from userland.

The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*

For now pass -1 all over the place.


117723 18-Jul-2003 phk

Add a new function swap_pager_status() which reports the total size of the
paging space and how much of it is in use (in pages).

Use this interface from the Linuxolator instead of groping around in the
internals of the swap_pager.


116999 28-Jun-2003 marcel

Don't map LINUX_POSIX_VDISABLE to _POSIX_VDISABLE and vice versa for
the VMIN and VTIME members of the c_cc array. These members are not
special control characters. By not excluding these members we
changed the noncanonical mode input processing when both members
were 0 on entry (=LINUX_POSIX_VDISABLE) as we would remap them to 255
(=_POSIX_VDISABLE). See termios(4) case A for how that screws up
your terminal I/O.

PR: 23173
Originator: Bjarne Blichfeldt <bbl@dk.damgaard.com>
Patch by: Boris Nikolaus <bn@dali.tellique.de> (original submission)
Philipp Mergenthaler <philipp.mergenthaler@stud.uni-karlsruhe.de>
Reminders by: Joseph Holland King <gte743n@cad.gatech.edu>
MFC after: 5 days


116678 22-Jun-2003 phk

Add a f_vnode field to struct file.

Several of the subtypes have an associated vnode which is used for
stuff like the f*() functions.

By giving the vnode a speparate field, a number of checks for the specific
subtype can be replaced simply with a check for f_vnode != NULL, and
we can later free f_data up to subtype specific use.

At this point in time, f_data still points to the vnode, so any code I
might have overlooked will still work.


116173 10-Jun-2003 obrien

Use __FBSDID().


114724 05-May-2003 mbr

Change the semantics of sysv shm emulation to take a additional
argument to the functions shm{at,ctl}1 and shm_find_segment_by_shmid{x}.
The BSD semantics didn't allow the usage of shared segment after
being marked for removal through IPC_RMID.

The patch involves the following functions:
- shmat
- shmctl
- shm_find_segment_by_shmid
- shm_find_segment_by_shmidx
- linux_shmat
- linux_shmctl

Submitted by: Orlando Bassotto <orlando.bassotto@ieo-research.it>
Reviewed by: marcel


114230 29-Apr-2003 mbr

Initialize tbuf in newstat_copyout() too.

Reviewed by: phk


114216 29-Apr-2003 kan

Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on: standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>


114214 29-Apr-2003 mbr

Do the same thing for stat64_copyout() as we already
do for newstat_copyout().

Lie about disk drives which are character devices
in FreeBSD but block devices under Linux.

PR: 37227
Submitted by: Vladimir B. Grebenschikov <vova@sw.ru>
Reviewed by: phk
MFC after: 2 weeks


114174 28-Apr-2003 jhb

Argh! We want to return the old signal set when the error return is zero
(i.e. success), not non-zero (failure).

Submitted by: tegge
Pointy hat to: jhb


114023 25-Apr-2003 jhb

Use a switch to convert the Linux sigprocmask flags to the equivalent
FreeBSD flags instead of just adding one to the Linux flags. This should
be identical to the previous version except that I have at least one report
of this patch fixing problems people were having with Linux apps after my
last commit to this file. It is safer to use the switch then to make
assumptions about the flag values anyways, esp. since we currently use
MD defines for the values of the flags and this is MI code.

Tested by: Michael Class <michael_class@gmx.net>


113991 24-Apr-2003 anholt

Add an ioctl handler for the DRM. This removes the need for the DRM_LINUX
option, which has been a source of frustration for many users.


113917 23-Apr-2003 jhb

Fix a lock order reversal. Unlock the proc before calling fget().

Reported by: kris


113859 22-Apr-2003 jhb

- Replace inline implementations of sigprocmask() with calls to
kern_sigprocmask() in the various binary compatibility emulators.
- Replace calls to sigsuspend(), sigaltstack(), sigaction(), and
sigprocmask() that used the stackgap with calls to the corresponding
kern_sig*() functions instead without using the stackgap.


113615 17-Apr-2003 jhb

Don't hold the proc lock while performing sigset conversions on local
variables.


113613 17-Apr-2003 jhb

Use local struct proc variables to reduce repeated td->td_proc dereferences
and improve readability.


113581 16-Apr-2003 phk

Don't include <sys/disklabel.h>


113579 16-Apr-2003 jhb

Explicitly cast a l_ulong to an unsigned long to make all arch's happy
with the printf format.


113577 16-Apr-2003 jhb

Fix printf format in a debug printf.


112938 01-Apr-2003 phk

Add #include <sys/conf.h> so we don't rely on <sys/disk.h> doing it.


112888 31-Mar-2003 jeff

- Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with
a follow on commit to kern_sig.c
- signotify() now operates on a thread since unmasked pending signals are
stored in the thread.
- PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.


112740 28-Mar-2003 phk

Fix an XXX: and implement LINUX_BLKGETSIZE correctly.


112682 26-Mar-2003 jhb

Add a cleanup function to destroy the osname_lock and call it on module
unload.

Submitted by: gallatin
Reported by: Martin Karlsson <mk-freebsd@bredband.net>


112451 20-Mar-2003 jhb

Use td->td_ucred instead of td->td_proc->p_ucred.


112430 20-Mar-2003 phk

Backout the getcwd changes, a more comprehensive effort will be needed.


112342 17-Mar-2003 phk

(This commit certainly increases the need for a wash&clean of vfs_cache.c,
but I decided that it was important for this patch to not bit-rot, and
since it is mainly moving code around, the total amount of entropy is
epsilon /phk)

This is a patch to move the common parts of linux_getcwd() back into
kern/vfs_cache.c so that the standard FreeBSD libc getcwd() can use it's
extended functionality. The linux syscall linux_getcwd() in
compat/linux/linux_getcwd.c has been rewritten to use it too. It should
be possible to simplify libc's getcwd() after this. No doubt this code
needs some cleaning up, since I've left in the sysctl variables I used
for debugging.

PR: 48169
Submitted by: James Whitwell <abacau@yahoo.com.au>


112206 13-Mar-2003 jhb

- Change the linux_[gs]et_os{name, release, s_version}() functions to
take a thread instead of a proc for their first argument.
- Add a mutex to protect the system-wide Linux osname, osrelease, and
oss_version variables.
- Change linux_get_prison() to take a thread instead of a proc for its
first argument and to use td_ucred rather than p_ucred. This is ok
because a thread's prison does not change even though it's ucred might.
- Also, change linux_get_prison() to return a struct prison * instead of
a struct linux_prison * since it returns with the struct prison locked
and this makes it easier to safely unlock the prison when we are done
messing with it.


111798 03-Mar-2003 des

Clean up whitespace and remove register keyword.


111797 03-Mar-2003 des

More caddr_t removal, in conjunction with copy{in,out}(9) this time.
Also clean up some egregious casts and incorrect use of sizeof.


111742 02-Mar-2003 des

Clean up whitespace, s/register //, refrain from strong urge to ANSIfy.


111741 02-Mar-2003 des

uiomove-related caddr_t -> void * (just the low-hanging fruit)


111173 20-Feb-2003 ume

Add M_WAITOK


111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


111034 17-Feb-2003 tjr

Use the proc lock to protect p_realtimer instead of Giant, and obtain
sched_lock around accesses to p_stats->p_timer[] to avoid a potential
race with hardclock. getitimer(), setitimer() and the realitexpire()
callout are now Giant-free.


110980 16-Feb-2003 tjr

Add MPSAFE comment to linux_sigpending().


110848 14-Feb-2003 tjr

Obtain proc lock around modification of p_siglist in linux_wait4().


110538 08-Feb-2003 dwmalone

1) Linux_sendto was trashing the BSD sockaddr it put in the stackgap,
so be more careful about calling stackgap_init.

Tested by: Fred Souza <fred@storming.org>

2) Linux_sendmsg was forgetting to fill out the bsd_args struct.

Reviewed by: ume

3) The args to linux_connect have differently named types on alpha and
i386, so add a cast to stop gcc complaining.

Spotted by: peter


110376 05-Feb-2003 ume

Avoid undefined symbol error with an IPv4 only kernel.

Reported by: "Sergey A. Osokin" <osa@freebsd.org.ru>


110295 03-Feb-2003 ume

Add IPv6 support for Linuxlator.

Reviewed by: dwmalone
MFC after: 10 days


109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


109153 13-Jan-2003 dillon

Bow to the whining masses and change a union back into void *. Retain
removal of unnecessary casts and throw in some minor cleanups to see if
anyone complains, just for the hell of it.


109123 12-Jan-2003 dillon

Change struct file f_data to un_data, a union of the correct struct
pointer types, and remove a huge number of casts from code using it.

Change struct xfile xf_data to xun_data (ABI is still compatible).

If we need to add a #define for f_data and xf_data we can, but I don't
think it will be necessary. There are no operational changes in this
commit.


108541 02-Jan-2003 alfred

Add function linux_msg() for regulating output from the linux emulation
code, make the emulator use it.

Rename unsupported_msg() to unimplemented_syscall(). Rename some arguments
for clarity

Fixup grammar.

Requested by: bde


108523 01-Jan-2003 alfred

When complaining about obsolete/unimplemented syscalls output the process
name to make things more clear for the user.

PR: 46661
MFC After: 3 days


108172 22-Dec-2002 hsu

SMP locking for ifnet list.


107680 08-Dec-2002 iedowse

Fix emulation of the fcntl64() syscall. In Linux, this is exactly
the same as fcntl() except that it supports the new 64-bit file
locking commands (LINUX_F_GETLK64 etc) that use the `flock64'
structure. We had been interpreting all flock structures passed to
fcntl64() as `struct flock64' instead of only the ones from F_*64
commands.

The glibc in linux_base-7 uses fcntl64() by default, but the bug
was often non-fatal since the misinterpretation typically only
causes junk to appear in the `l_len' field and most junk values are
accepted as valid range lengths. The result is occasional EINVAL
errors from F_SETLK and a few bytes after the supplied `struct
flock' getting clobbered during F_GETLK.

PR: kern/37656
Reviewed by: marcel
Approved by: re
MFC after: 1 week


105477 19-Oct-2002 marcel

Implement the CDROMREADAUDIO ioctl.


105359 17-Oct-2002 robert

- Use strlcpy() rather than strncpy() to copy NUL terminated
strings.
- Pass the correct buffer size to getcredhostname().


104893 11-Oct-2002 sobomax

- Add support for IPC_64 extensions into shmctl(2), semctl(2) and msgctl(2);
- add wrappers for mmap2(2) and ftruncate64(2) system calls;
- don't spam console with printf's when VFAT_READDIR_BOTH ioctl(2) is invoked;
- add support for SOUND_MIXER_READ_STEREODEVS ioctl(2);
- make msgctl(IPC_STAT) and IPC_SET actually working by converting from
BSD msqid_ds to Linux and vice versa;
- properly return EINVAL if semget(2) is called with nsems being negative.

Reviewed by: marcel
Approved by: marcel
Tested with: LSB runtime test


104306 01-Oct-2002 jmallett

Back our kernel support for reliable signal queues.

Requested by: rwatson, phk, and many others


104233 30-Sep-2002 jmallett

First half of implementation of ksiginfo, signal queues, and such. This
gets signals operating based on a TailQ, and is good enough to run X11,
GNOME, and do job control. There are some intricate parts which could be
more refined to match the sigset_t versions, but those require further
evaluation of directions in which our signal system can expand and contract
to fit our needs.

After this has been in the tree for a while, I will make in kernel API
changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo
more robustly, such that we can actually pass information with our
(queued) signals to the userland. That will also result in using a
struct ksiginfo pointer, rather than a signal number, in a lot of
kern_sig.c, to refer to an individual pending signal queue member, but
right now there is no defined behaviour for such.

CODAFS is unfinished in this regard because the logic is unclear in
some places.

Sponsored by: New Gold Technology
Reviewed by: bde, tjr, jake [an older version, logic similar]


103941 25-Sep-2002 jeff

- Hold the vn lock over vm_mmap().


103886 24-Sep-2002 mini

Back out last commit. Linux uses the old 4.3BSD sockaddr format.


103839 23-Sep-2002 mini

Don't use compatability syscall wrappers in emulation code.
This is needed for the COMPAT_FREEBSD3 option split.

Reviewed by: alfred, jake


103712 20-Sep-2002 mdodd

Remove NVIDIA ioctl bits. They will be provided in a kernel module.


103705 20-Sep-2002 phk

Put an XXX comment here to point somebody in the right direction.


103664 20-Sep-2002 imp

Current uses struct thread *td rather than struct proc *p.


103652 19-Sep-2002 mdodd

Pass flags to msync() accounting for differences in the definition of
MS_SYNC on FreeBSD and Linux.

Submitted by: Christian Zander <zander@minion.de>


103651 19-Sep-2002 mdodd

This patch extends the FreeBSD Linux compatibility layer to support
NVIDIA API calls; more specifically, it adds an ioctl() handler for
the range of possible NVIDIA ioctl numbers.

Submitted by: Christian Zander <zander@minion.de>


102963 05-Sep-2002 bde

Do not cast from a pointer to an integer of a possibly different size.
This fixes a warning on i386's with 64-bit longs.


102954 05-Sep-2002 bde

Include <sys/malloc.h> instead of depending on namespace pollution 2
layers deep in <sys/proc.h> or <sys/vnode.h>.

Removed unused includes. Sorted includes.


102947 05-Sep-2002 marcel

Implement LINUX_TIOCSCTTY.

PR: kern/42404


102872 02-Sep-2002 iedowse

Use the new kern_*() functions to avoid using the stack gap in
linux_fcntl*() and linux_getcwd().


102814 01-Sep-2002 iedowse

Use the new kern_* functions to avoid the need to store arguments
in the stack gap. This converts most VFS and signal related system
calls, as well as select().

Discussed on: -arch
Approved by: marcel


102803 01-Sep-2002 iedowse

Add a new function linux_emul_convpath(), which is a version of
linux_emul_find() that does not use stack gap storage but instead
always returns the resulting path in a malloc'd kernel buffer.
Implement linux_emul_find() in terms of this function. Also add
LCONVPATH* macros that wrap linux_emul_convpath in the same way
that the CHECKALT* macros wrap linux_emul_find().


102052 18-Aug-2002 sobomax

Increase size of ifnet.if_flags from 16 bits (short) to 32 bits (int). To avoid
breaking application ABI use unused ifreq.ifru_flags[1] for upper 16 bits in
SIOCSIFFLAGS and SIOCGIFFLAGS ioctl's.

Reviewed by: -hackers, -net


102003 17-Aug-2002 rwatson

In continuation of early fileop credential changes, modify fo_ioctl() to
accept an 'active_cred' argument reflecting the credential of the thread
initiating the ioctl operation.

- Change fo_ioctl() to accept active_cred; change consumers of the
fo_ioctl() interface to generally pass active_cred from td->td_ucred.
- In fifofs, initialize filetmp.f_cred to ap->a_cred so that the
invocations of soo_ioctl() are provided access to the calling f_cred.
Pass ap->a_td->td_ucred as the active_cred, but note that this is
required because we don't yet distinguish file_cred and active_cred
in invoking VOP's.
- Update kqueue_ioctl() for its new argument.
- Update pipe_ioctl() for its new argument, pass active_cred rather
than td_ucred to MAC for authorization.
- Update soo_ioctl() for its new argument.
- Update vn_ioctl() for its new argument, use active_cred rather than
td->td_ucred to authorize VOP_IOCTL() and the associated VOP_GETATTR().

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


101983 16-Aug-2002 rwatson

Make similar changes to fo_stat() and fo_poll() as made earlier to
fo_read() and fo_write(): explicitly use the cred argument to fo_poll()
as "active_cred" using the passed file descriptor's f_cred reference
to provide access to the file credential. Add an active_cred
argument to fo_stat() so that implementers have access to the active
credential as well as the file credential. Generally modify callers
of fo_stat() to pass in td->td_ucred rather than fp->f_cred, which
was redundantly provided via the fp argument. This set of modifications
also permits threads to perform these operations on behalf of another
thread without modifying their credential.

Trickle this change down into fo_stat/poll() implementations:

- badfo_poll(), badfo_stat(): modify/add arguments.
- kqueue_poll(), kqueue_stat(): modify arguments.
- pipe_poll(), pipe_stat(): modify/add arguments, pass active_cred to
MAC checks rather than td->td_ucred.
- soo_poll(), soo_stat(): modify/add arguments, pass fp->f_cred rather
than cred to pru_sopoll() to maintain current semantics.
- sopoll(): moidfy arguments.
- vn_poll(), vn_statfile(): modify/add arguments, pass new arguments
to vn_stat(). Pass active_cred to MAC and fp->f_cred to VOP_POLL()
to maintian current semantics.
- vn_close(): rename cred to file_cred to reflect reality while I'm here.
- vn_stat(): Add active_cred and file_cred arguments to vn_stat()
and consumers so that this distinction is maintained at the VFS
as well as 'struct file' layer. Pass active_cred instead of
td->td_ucred to MAC and to VOP_GETATTR() to maintain current semantics.

- fifofs: modify the creation of a "filetemp" so that the file
credential is properly initialized and can be used in the socket
code if desired. Pass ap->a_td->td_ucred as the active
credential to soo_poll(). If we teach the vnop interface about
the distinction between file and active credentials, we would use
the active credential here.

Note that current inconsistent passing of active_cred vs. file_cred to
VOP's is maintained. It's not clear why GETATTR would be authorized
using active_cred while POLL would be authorized using file_cred at
the file system level.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


101707 12-Aug-2002 rwatson

Another fix that wasn't pulled in from the MAC branch: the
struct mount is not cached as *mp at this point, so use
vp->v_mount directly, following the check that it's non-NULL.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


101706 12-Aug-2002 rwatson

Fix missing parens in MAC readdir() check. This fix was in the MAC
branch, but apparently didn't get moved over when it was made.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


101308 04-Aug-2002 jeff

- Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
with VOP calls is needed.
- v_iflag is protected by interlock and is used for dealing with vnode
management issues. These flags include X/O LOCK, FREE, DOOMED, etc.
- All accesses to v_iflag and v_vflag have either been locked or marked with
mp_fixme's.
- Many ASSERT_VOP_LOCKED calls have been added where the locking was not
clear.
- Many functions in vfs_subr.c were restructured to provide for stronger
locking.

Idea stolen from: BSD/OS


101189 01-Aug-2002 rwatson

Introduce support for Mandatory Access Control and extensible
kernel access control.

Invoke appropriate MAC entry points for a number of VFS-related
operations in the Linux ABI module. In particular, handle uselib
in a manner similar to open() (more work is probably needed here),
as well as handle statfs(), and linux readdir()-like calls.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


99687 09-Jul-2002 robert

Move the switch statement labels for the explicit 64-bit
command arguments into the correct function, linux_fcntl64(),
and thus out of the scope of a compilation for the alpha
platform.

Requested by: obrien


99670 09-Jul-2002 robert

Enable emulation of the F_GETLK64, F_SETLK64, and F_SETLKW64
lock commands arguments to linux_fcntl64().


98878 26-Jun-2002 arr

- Remove the Giant acquisition from linux_socket_ioctl() as it was really
there to protect fdrop() (which in turn can call vrele()), however,
fdrop_locked() grabs Giant for us, so we do not have to.

Reviewed by: jhb
Inspired by: alc


98209 14-Jun-2002 rwatson

Add a comment about how we should use vn_open() here instead of directly
invoking VOP_OPEN(). This would reduce code redundancy with the rest
of the kernel, and also is required for MAC to work properly.


97748 02-Jun-2002 schweikh

Fix typo in the BSD copyright: s/withough/without/

Spotted and suggested by: des
MFC after: 3 weeks


96840 18-May-2002 marcel

In msgrcv(), set msgtyp correctly. Hardwiring 0 as the message type
yields incorrect behaviour. The hardwiring was present in the very
first commit that implemented msgrcv() (revision 1.4) and hasn't been
changed since. The native implementation was complete at that time,
so there doesn't seem to be a reason for the hardwiring from a
technical point of view.

Submitted by: Reinier Bezuidenhout <rbezuide@yahoo.com>


96398 11-May-2002 dd

sysctl -w -> sysctl


95837 01-May-2002 peter

Zap some stale unused headers, including one machine/psl.h (which is
a stub on alpha). Compile tested on alpha and x86.


95130 20-Apr-2002 rwatson

Add an XXX: linux_uselib() should be using vn_open() rather than invoking
VOP_OPEN() and doing lots of manual checking. This would further
centralize use of the name functions, and once the MAC code is integrated,
meaning few extraneous MAC checks scattered all over the place. I don't
have time to fix this now, but want to make sure it doesn't get
forgotten. Anyone interested in fixing this should feel free.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


94621 13-Apr-2002 jhb

Rework logic of syscalls that modify process credentials as described in
rev 1.152 of sys/kern/kern_prot.c.


94454 11-Apr-2002 jhb

Use td_ucred in a few spots.


93793 04-Apr-2002 bde

Moved signal handling and rescheduling from userret() to ast() so that
they aren't in the usual path of execution for syscalls and traps.
The main complication for this is that we have to set flags to control
ast() everywhere that changes the signal mask.

Avoid locking in userret() in most of the remaining cases.

Submitted by: luoqi (first part only, long ago, reorganized by me)
Reminded by: dillon


93593 01-Apr-2002 jhb

Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on: smp@


93073 24-Mar-2002 bde

Fixed some style bugs in the removal of __P(()). Tabs before "__P(("
were not removed.


92787 20-Mar-2002 jeff

Remove references to vm_zone.h and switch over to the new uma API.


92761 20-Mar-2002 alfred

Remove __P.


91406 27-Feb-2002 jhb

Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.


91392 27-Feb-2002 robert

Use the updated getcredhostname() function.


91385 27-Feb-2002 robert

Use the getcredhostname function to fill the hostname into
the linux_newuname_args structure. This should fix the case
of jailed linux processes not using the jail's hostname.

PR: 35336
Reviewed by: phk


91140 23-Feb-2002 tanimura

Lock struct pgrp, session and sigio.

New locks are:

- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.

Please refer to sys/proc.h for the coverage of these locks.

Changes on the pgrp/session interface:

- pgfind() needs the pgrpsess_lock held.

- The caller of enterpgrp() is responsible to allocate a new pgrp and
session.

- Call enterthispgrp() in order to enter an existing pgrp.

- pgsignal() requires a pgrp lock held.

Reviewed by: jhb, alfred
Tested on: cvsup.jp.FreeBSD.org
(which is a quad-CPU machine running -current)


90984 20-Feb-2002 alfred

fix file descriptor leak.

Submitted by: Mark Santcroos <marks@ripe.net>


90690 15-Feb-2002 bde

Garbage collect options AVM_A1_PCI, AVM_A1_PCMCIA, DEBUG_LINUX, DEV_APM,
GUS_DMA, GUS_DMA2, GUS_IRQ, OLTR_NO_BULLSEYE_MAC, OLTR_NO_HAWKEYE_MAC,
OLTR_NO_TMS_MAC and PCIC_RESUME_RESET.


89944 29-Jan-2002 marcel

Have SIOCGIFCONF return all (if any) AF_INET addresses for the
interfaces we encounter. In Linux, all addresses are returned for
which gifconf handlers are installed. This boils down to AF_DECnet
and AF_INET. We care mostly about AF_INET for now. Adding additional
families is simple enough.

Returning the addresses is important for RPC clients to function
properly. Andrew found in some reference code that the logic that
handles the retransmission looks for an interface that's up and has
an AF_INET address. This obviously failed as we didn't return any
addresses at all.

Note also that with this change we don't return interfaces that don't
have AF_INET addresses, whereas before we returned any interface
present in the system. This is in line with what Linux does (modulo
interfaces with only AF_DECnet addresses of course :-)

Reported by: "Andrew Atrens" <atrens@nortelnetworks.com>
MFC after: 1 week


89717 23-Jan-2002 gallatin

Linux/alpha uses the same BSDish return mechanism we do for
getpid, getuid, getgid and pipe, since they bootstrapped from
OSF/1 and never cleaned up. Switch to the native syscalls
on alpha so that the above functions work

MFC after: 7 days


89379 15-Jan-2002 marcel

Reinstate linux_ifname. Although the Linuxulator doesn't use it
itself, it's used outside the Linuxulator. Reimplement the
function so that its behaviour matches the current renaming
scheme. It's probably better to formalize these interdependencies.


89319 14-Jan-2002 alfred

Replace ffind_* with fget calls.

Make fget MPsafe.

Make fgetvp and fgetsock use the fget subsystem to reduce code bloat.

Push giant down in fpathconf().


89311 13-Jan-2002 alfred

Remove unused variable.


89306 13-Jan-2002 alfred

SMP Lock struct file, filedesc and the global file list.

Seigo Tanimura (tanimura) posted the initial delta.

I've polished it quite a bit reducing the need for locking and
adapting it for KSE.

Locks:

1 mutex in each filedesc
protects all the fields.
protects "struct file" initialization, while a struct file
is being changed from &badfileops -> &pipeops or something
the filedesc should be locked.

1 mutex in each struct file
protects the refcount fields.
doesn't protect anything else.
the flags used for garbage collection have been moved to
f_gcflag which was the FILLER short, this doesn't need
locking because the garbage collection is a single threaded
container.
could likely be made to use a pool mutex.

1 sx lock for the global filelist.

struct file * fhold(struct file *fp);
/* increments reference count on a file */

struct file * fhold_locked(struct file *fp);
/* like fhold but expects file to locked */

struct file * ffind_hold(struct thread *, int fd);
/* finds the struct file in thread, adds one reference and
returns it unlocked */

struct file * ffind_lock(struct thread *, int fd);
/* ffind_hold, but returns file locked */

I still have to smp-safe the fget cruft, I'll get to that asap.


89182 10-Jan-2002 marcel

Further fixes related to the interface renaming. Now that we
properly translate the interface name passed to us, make sure
we also translate correctly before we return the list of
interfaces with the SIOCGIFCONF ioctl. It is common to use
the interface names returned by that ioctl in further ioctls,
such as SIOCGIFFLAGS.

Remove linux_ifname as it is no longer used. Also remove
ifname_bsd_to_linux as it cannot be used anymore now that
linux_ifname is removed (was deadcode anyway).

Reported and tested by: Andrew Atrens <atrens@nortelnetworks.com>


87599 10-Dec-2001 obrien

Update to C99, s/__FUNCTION__/__func__/,
also don't use ANSI string concatenation.


87335 04-Dec-2001 marcel

When translating the interface name when "eth?" is given, do not
use the internal index number as the unit number to compare with.
The first ethernet interface in Linux is called "eth0", whereas
our internal index starts wth 1 and is not unique to ethernet
interfaces (lo0 has index 1 for example). Instead, use a function-
local index number that starts with 0 and is incremented only
for ethernet interfaces. This way the unit number will match the
n-th ethernet interface in the system, which is exactly what it
means in Linux.

Tested by: Glenn Johnson <gjohnson@srrc.ars.usda.gov>
MFC after: 3 days


87275 03-Dec-2001 rwatson

o Introduce pr_mtx into struct prison, providing protection for the
mutable contents of struct prison (hostname, securelevel, refcount,
pr_linux, ...)
o Generally introduce mtx_lock()/mtx_unlock() calls throughout kern/
so as to enforce these protections, in particular, in kern_mib.c
protection sysctl access to the hostname and securelevel, as well as
kern_prot.c access to the securelevel for access control purposes.
o Rewrite linux emulator abstractions for accessing per-jail linux
mib entries (osname, osrelease, osversion) so that they don't return
a pointer to the text in the struct linux_prison, rather, a copy
to an array passed into the calls. Likewise, update linprocfs to
use these primitives.
o Update in_pcb.c to always use prison_getip() rather than directly
accessing struct prison.

Reviewed by: jhb


86852 24-Nov-2001 des

Revert incorrect KSEfication: realitexpire expects a struct proc *, not a
struct thread *.


86607 19-Nov-2001 iedowse

Deal with a few issues that cropped up following the recent changes
to the code for translating socket and private ioctls:

- Only perform socket ioctl translation if the file descriptor is a
socket.
- Treat socket ioctls on non-sockets specially, and for now assume
that these are directed at a tap/vmnet device, so translate the
ioctl numbers as appropriate (the way if_tap abuses some socket
ioctls to pass non-ifreq data is utterly bogus, but this is how
VMware on FreeBSD has always "worked"; I will deal with this
later).
- Add (untested) support for translating SIOCSIFADDR.
- In all cases where we fail to translate an ioctl, return ENOIOCTL
so that other handlers have a chance to do the translation.

This should fix the "/dev/vmnet1: Invalid argument" errors that
users of VMware were experiencing, though I have only verified this
on RELENG_4.

Submitted by: des (mostly)
MFC after: 3 days


86555 18-Nov-2001 marcel

Implement DVD-ROM ioctls.

PR: 26955
Submitted by: Boris Nikolaus (email unknown)


86540 18-Nov-2001 marcel

Implement missing SOUND_MIXER_WRITE_RECSRC ioctl.

PR: 22971
Tested by: dougb


86504 17-Nov-2001 dillon

Fix missing holdsock()->fgetsock()

Submitted by: Hisashi Hiramoto <hiramoto@phys.chs.nihon-u.ac.jp>


86484 17-Nov-2001 peter

Forward declare struct ifnet - this fixes a warning in tdfx_pci.c


86483 17-Nov-2001 peter

Fix printf warnings (int/long)
#if 0 around unused ifname_bsd_to_linux() function


86482 17-Nov-2001 peter

Fix warning in debug printf. This is a long on alpha, and int on i386,
but printed with %ld always.


86183 08-Nov-2001 rwatson

o Replace reference to 'struct proc' with 'struct thread' in 'struct
sysctl_req', which describes in-progress sysctl requests. This permits
sysctl handlers to have access to the current thread, permitting work
on implementing td->td_ucred, migration of suser() to using struct
thread to derive the appropriate ucred, and allowing struct thread to be
passed down to other code, such as network code where td is not currently
available (and curproc is used).

o Note: netncp and netsmb are not updated to reflect this change, as they
are not currently KSE-adapted.

Reviewed by: julian
Obtained from: TrustedBSD Project


85623 28-Oct-2001 mr

Introduce [IPC|SHM]_[INFO|STAT] to shmctl to make
`/compat/linux/usr/bin/ipcs -m` happy.


85599 27-Oct-2001 des

Eliminate the prefix parameter to linux_emul_find(), which was always
linux_emul_path anyway. Linux_emul_find() has interesting bugs in its
prefix handling (which luckily are not currently exploitable); this
commit is preliminary to an attempt at cleaning it up.

Approved by: marcel


85569 26-Oct-2001 fenner

Force the length of the sockaddr to be correct for AF_INET and AF_INET6
in bind() and connect(). Linux doesn't care if the length of the
sockaddr matches its address family; FreeBSD does. This fixes the
known issues with the resolver in linux_base-7.


85203 20-Oct-2001 des

Tweak the way we determine if an interface needs to have its name translated.
Add some missing break statements in the socket ioctl switch.
Check the return value from copyin() / copyout().
Fix some disorderings and misindentations.
Support a couple more socket ioctls.
Add missing break statements.


85139 19-Oct-2001 marcel

Fix Alpha related brokenness. We used to have a MD linux_ioctl.h
that appeared to be very different from the MI version. These
differences were mostly bogus and caused by copying octal
definitions and write them as hexadecimal values without doing
any base conversion (ie 010 was copied to 0x10). After filtering
out these differences, any remaining (real) incompatibilities
have been merged into the MI header file to make them more visible.

While here, fix the termios <-> termio conversion WRT to the c_cc
field for Alpha. The termios values do not match the termio values
and thus prevents us from copying.

By eliminating the Alpha MD copy of linux_ioctl.h we also fixed
the recent build breakage caused by putting new bits in the MI
header and not in the MD header.


85127 19-Oct-2001 des

Add support for the "device private" ioctls soon to be used by the an driver.
Also slightly change the name translation policy - only rename interfaces
that have the IFF_BROADCAST flag set. This is not perfect, but is closer to
how Linux names network interfaces.


85125 19-Oct-2001 des

Whitespace fix.


85022 16-Oct-2001 marcel

Implement linux_chown and linux_lchown. The fchown syscall maps
directly to the native syscall, because no filename handling
needs to be done.

Tested by: Martin Blapp <mb@imp.ch>


85012 15-Oct-2001 des

Try to make Linux socket ioctls work. Up until now they've only *pretended*
to work, but haven't really due to subtle differences in structs etc.

This is still not perfect (some ioctls are still known not to work, while
others haven't been tested at all), but it's enough to get Debian's ifconfig
to produce relatively sane output.

More work will be needed to get all ioctls (or at least a reasonable subset)
working, and to support the Cisco Aironet config tool mentioned in the PR.

PR: 26546
Submitted by: Doug Ambrisko <ambrisko@ambrisko.com>


84916 14-Oct-2001 marcel

When casting from uid16/gid16 to uid/gid respectively, make sure
that "no change" (ie 0xFFFF) is properly cast to (int)-1 for those
syscalls that set uids and/or gids.

Verified by: LTP


84783 10-Oct-2001 ps

Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader
tunable.

Reviewed by: peter
MFC after: 2 weeks


84075 28-Sep-2001 marcel

Remove linux_getpgid(). We map the syscall natively now.

PR: kern/21402


84067 28-Sep-2001 marcel

Swap the src and dst arguments of the bcopy added in the
previous commit. It ain't memcpy... *cough*


83955 26-Sep-2001 marcel

The arg parameter is passed by value in Linux, but not in FreeBSD.
We still have to account for a copyin. Make sure the copyin will
succeed by passing the FreeBSD syscall a pointer to userspace,
albeit one that's automagically mapped into kernel space.

Reported by: mr, Mitsuru IWASAKI <iwasaki@jp.FreeBSD.org>
Tested by: Mitsuru IWASAKI <iwasaki@jp.FreeBSD.org>


83667 19-Sep-2001 sobomax

Fix abuse of vtagtype. In addition, after this the linux programs will be
able correctly distinguish ext2fs from the ufs filesystem (previously ext2fs
was indistinguishable from the ufs).

Reviewed by: phk, marcel


83503 15-Sep-2001 mr

Add a wrapper for linux_getsid -> getsid Syscall.


83501 15-Sep-2001 mr

Implement LINUX_[SEM|IPC]_[STAT|INFO]
to make /compat/linux/usr/bin/ipcs -s happy.

PR: kern/29698 (part)
Reviewed by: audit


83436 14-Sep-2001 marcel

Fix off by one error introduced by the use of the ifnet_byindex()
macro. The commit log clearly states that the index given to the
macro is one higher than previously used to index the array. This
wasn't represented in the code and resulted in kernel page faults.

Reported by: Andrew Atrens <atrens@nortelnetworks.com>


83382 12-Sep-2001 jhb

Whitespace fix.


83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


83221 08-Sep-2001 marcel

Round of cleanups and enhancements. These include (in random order):

o Introduce private types for use in linux syscalls for two reasons:
1. establish type independence for ease in porting and,
2. provide a visual queue as to which syscalls have proper
prototypes to further cleanup the i386/alpha split.
Linuxulator types are prefixed by 'l_'. void and char have not
been "virtualized".

o Provide dummy functions for all syscalls and remove dummy functions
or implementations of truely obsolete syscalls.

o Sanitize the shm*, sem* and msg* syscalls.

o Make a first attempt to implement the linux_sysctl syscall. At this
time it only returns one MIB (KERN_VERSION), but most importantly,
it tells us when we need to add additional sysctls :-)

o Bump the kenel version up to 2.4.2 (this is not the same as the
KERN_VERSION MIB, BTW).

o Implement new syscalls, of which most are specific to i386. Our
syscall table is now up to date with Linux 2.4.2. Some highlights:
- Implement the 32-bit uid_t and gid_t bases syscalls.
- Implement a couple of 64-bit file size/offset bases syscalls.

o Fix or improve numerous syscalls and prototypes.

o Reduce style(9) violations while I'm here. Especially indentation
inconsistencies within the same file are addressed. Re-indenting
did not obfuscate actual changes to the extend that it could not
be combined.

NOTE: I spend some time testing these changes and found that if there
were regressions, they were not caused by these changes AFAICT.
It was observed that installing a RH 7.1 runtime environment
did make matters worse. Hangs and/or reboots have been observed
with and without these changes, so when it failed to make life
better in cases it doesn't look like it made it worse.


83130 06-Sep-2001 jlemon

Wrap array accesses in macros, which also happen to be lvalues:

ifnet_addrs[i - 1] -> ifaddr_byindex(i)
ifindex2ifnet[i] -> ifnet_byindex(i)

This is intended to ease the conversion to SMPng.


82745 01-Sep-2001 marcel

Speculatively add this file. It's part of the Linuxulator update
to make it emulate Linux kernel version 2.4.2, which is required
in order to upgrade the linux_base port to RH 7.1.

Note that this file is only needed for 32-bit architectures. To
us this means i386 (for now?)


82518 29-Aug-2001 gallatin

Fix linux_getcwd() so that if the cwd isn't cached (__getcwd() fails),
the cwd is looked up inside the kernel. The native getcwd() in libc
handles this in userland if __getcwd() fails.

Obtained from: NetBSD via OpenBSD
Tested by: Chris Casey <chriss@phys.ksu.edu>, Markus Holmberg <markush@acc.umu.se>
Reviewed by: Darrell Anderson <anderson@cs.duke.edu>
PR: kern/24315


80180 23-Jul-2001 pirzyk

Added the linux_sysinfo function to implement sysinfo(2).

PR: kern/27759
Reviewed by: marcel
Approved by: marcel
MFC after: 1 week


78264 15-Jun-2001 peter

Bah, back out part of previous commit. I got too carried away.
linux_debug_map[] is referred to from elsewhere.


78258 15-Jun-2001 peter

Fix warning:
239: warning: no previous prototype for `linux_debug'


78257 15-Jun-2001 peter

Fix warning:
413: warning: long unsigned int format, vm_offset_t arg (arg 2)


78161 13-Jun-2001 peter

With this commit, I hereby pronounce gensetdefs past its use-by date.

Replace the a.out emulation of 'struct linker_set' with something
a little more flexible. <sys/linker_set.h> now provides macros for
accessing elements and completely hides the implementation.

The linker_set.h macros have been on the back burner in various
forms since 1998 and has ideas and code from Mike Smith (SET_FOREACH()),
John Polstra (ELF clue) and myself (cleaned up API and the conversion
of the rest of the kernel to use it).

The macros declare a strongly typed set. They return elements with the
type that you declare the set with, rather than a generic void *.

For ELF, we use the magic ld symbols (__start_<setname> and
__stop_<setname>). Thanks to Richard Henderson <rth@redhat.com> for the
trick about how to force ld to provide them for kld's.

For a.out, we use the old linker_set struct.

NOTE: the item lists are no longer null terminated. This is why
the code impact is high in certain areas.

The runtime linker has a new method to find the linker set
boundaries depending on which backend format is in use.

linker sets are still module/kld unfriendly and should never be used
for anything that may be modular one day.

Reviewed by: eivind


77675 04-Jun-2001 paul

S_IFCHR is not a bit mask, it's just a value in a field. The correct
way to clear that field is to use S_IFMT.

Pointed out by BDE.


77575 01-Jun-2001 ru

Remove vestiges of MFS.


77435 29-May-2001 phk

Remove MFS


77223 26-May-2001 ru

- sys/n[tw]fs moved to sys/fs/n[tw]fs
- /usr/include/n[tw]fs moved to /usr/include/fs/n[tw]fs


77183 25-May-2001 rwatson

o Merge contents of struct pcred into struct ucred. Specifically, add the
real uid, saved uid, real gid, and saved gid to ucred, as well as the
pcred->pc_uidinfo, which was associated with the real uid, only rename
it to cr_ruidinfo so as not to conflict with cr_uidinfo, which
corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing
original macro that pointed.
p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred,
p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo,
cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize
cr_ruidinfo there.
o Restruction many credential modification chunks to always crdup while
we figure out locking and optimizations; generally speaking, this
means moving to a structure like this:
newcred = crdup(oldcred);
...
p->p_ucred = newcred;
crfree(oldcred);
It's not race-free, but better than nothing. There are also races
in sys_process.c, all inter-process authorization, fork, exec, and
exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid;
remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and
use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to
pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and
allocation.
o Clean up ktrcanset() to take into account changes, and move to using
suser_xxx() instead of performing a direct uid==0 comparision.
o Improve commenting in various kern_prot.c credential modification
calls to better document current behavior. In a couple of places,
current behavior is a little questionable and we need to check
POSIX.1 to make sure it's "right". More commenting work still
remains to be done.
o Update credential management calls, such as crfree(), to take into
account new ruidinfo reference.
o Modify or add the following uid and gid helper routines:
change_euid()
change_egid()
change_ruid()
change_rgid()
change_svuid()
change_svgid()
In each case, the call now acts on a credential not a process, and as
such no longer requires more complicated process locking/etc. They
now assume the caller will do any necessary allocation of an
exclusive credential reference. Each is commented to document its
reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes
and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's
questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not
processes and pcreds. Note that this authorization, as well as
CANSIGIO(), needs to be updated to use the p_cansignal() and
p_cansched() centralized authorization routines, as they currently
do not take into account some desirable restrictions that are handled
by the centralized routines, as well as being inconsistent with other
similar authorization instances.
o Update libkvm to take these changes into account.

Obtained from: TrustedBSD Project
Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit


76166 01-May-2001 markm

Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by: bde (with reservations)


75988 25-Apr-2001 paul

A bogus check for a char device also matched symbolic links.
Replace it with a correct check using S_ISCHR()

Symbolic links will now work again in linux compatibility.


75916 24-Apr-2001 rwatson

o Change a suser() call to a suser_xxx(..., PRISON_ROOT) call in the
linuxulator so as to allow privileged processes within a jail() to
invoke the Linux initgroups() system call. This allows the Linux
"su" to work properly (better) when running a complete Linux
environment under jail(). This problem was reported by Attila
Nagy <bra@fsn.hu>.

Reviewed by: marcel


75893 24-Apr-2001 jhb

Change the pfind() and zpfind() functions to lock the process that they
find before releasing the allproc lock and returning.

Reviewed by: -smp, dfr, jake


75053 01-Apr-2001 alc

Add linux_sched_get_priority_max() and linux_sched_get_priority_min(): The
policy parameter requires translation.


74701 23-Mar-2001 gallatin

fix linux_times() to take into account linux's value of CLK_TCK on the alpha.
Previously, results were off by a factor of 10

Tested by: Yoriaki FUJIMORI <fujimori@grafin.fujimori.cache.waseda.ac.jp>


73353 02-Mar-2001 jlemon

Only pick up so_error the first time through with EISCONN, as advertised.
The sense of the test was reversed, so we were returning EISCONN, then 0.

Pointed out and tested by: Martin Blapp <mb@imp.ch>


73288 01-Mar-2001 jlemon

Correctly emulate linux_connect. For nonblocking sockets, the behavior
is to return EINPROGRESS, EALREADY, (so_error ONCE), EISCONN. Certain
linux applications rely on the so_error (normally 0) being returned in
order to operate properly.

Tested by: Thomas Moestl <tmoestl@gmx.net>


73286 01-Mar-2001 adrian

Reviewed by: jlemon

An initial tidyup of the mount() syscall and VFS mount code.

This code replaces the earlier work done by jlemon in an attempt to
make linux_mount() work.

* the guts of the mount work has been moved into vfs_mount().

* move `type', `path' and `flags' from being userland variables into being
kernel variables in vfs_mount(). `data' remains a pointer into
userspace.

* Attempt to verify the `type' and `path' strings passed to vfs_mount()
aren't too long.

* rework mount() and linux_mount() to take the userland parameters
(besides data, as mentioned) and pass kernel variables to vfs_mount().
(linux_mount() already did this, I've just tidied it up a little more.)

* remove the copyin*() stuff for `path'. `data' still requires copyin*()
since its a pointer into userland.

* set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each
filesystem. This variable is generally initialised with `path', and
each filesystem can override it if they want to.

* NOTE: f_mntonname is intiailised with "/" in the case of a root mount.


72786 21-Feb-2001 rwatson

o Move per-process jail pointer (p->pr_prison) to inside of the subject
credential structure, ucred (cr->cr_prison).
o Allow jail inheritence to be a function of credential inheritence.
o Abstract prison structure reference counting behind pr_hold() and
pr_free(), invoked by the similarly named credential reference
management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead
of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed,
rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the
flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect
mutex use.

Notes:

o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs
credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is
required to protect the reference count plus some fields in the
structure.

Reviewed by: freebsd-arch
Obtained from: TrustedBSD Project


72543 16-Feb-2001 jlemon

Allow debugging output to be controlled on a per-syscall granularity.
Also clean up debugging output in a slightly more uniform fashion.

The default behavior remains the same (all debugging output is turned on)


72538 16-Feb-2001 jlemon

Add mount syscall to linux emulation. Also improve emulation of reboot.


72200 09-Feb-2001 bmilekic

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


72091 06-Feb-2001 asmodai

Fix typo: seperate -> separate.

Seperate does not exist in the english language.


71699 27-Jan-2001 jhb

Back out proc locking to protect p_ucred for obtaining additional
references along with the actual obtaining of additional references.


71445 23-Jan-2001 jhb

Protect calcru() with sched_lock.


71048 14-Jan-2001 joe

Instead of hard coding the major numbers for IDE and SCSI disks
look in the device's cdevsw for the D_DISK flag.


70458 29-Dec-2000 paul

Map FreeBSD character device hard disks to Linux block device hard disks.

This fixes the problem with VMWARE not being able to use raw disks.


70178 19-Dec-2000 assar

translate the flags in recvfrom and recvmsg from linux to bsd ones

Approved by: marcel


70061 15-Dec-2000 jhb

Lock access to proc members.

Glanced over by: marcel


69595 05-Dec-2000 marcel

Remove call to bzero after MALLOC and instead add M_ZERO
to MALLOC.


69539 03-Dec-2000 marcel

Don't auto-generate the syscalls.


69286 27-Nov-2000 jake

Use callout_reset instead of timeout(9). Most callouts are statically
allocated, 2 have been added to struct proc for setitimer and sleep.

Reviewed by: jhb, jlemon


68803 16-Nov-2000 gallatin

Use the linux_connect() on alpha rather than passing directly through
to our native connect(). This is required to deal with the differences
in the way linux handles connects on non-blocking sockets.

This gets the private beta of the Compaq Linux/alpha JDK working
on FreeBSD/alpha

Approved by: marcel


68662 13-Nov-2000 marcel

Fix F_SETOWN on pipes. Linux returns EINVAL while we send a SIGIO
signal. There's at least 1 program that is known to break.
Submitted patch has been edited to match current code.

MFC: yes
Submitted by: bde


68583 10-Nov-2000 marcel

Revert auto-generation. The Alpha port is broken.
Syncing with it is wrong.


68519 09-Nov-2000 marcel

Sync with Alpha:
Do not use sysent.c, proto.h and syscall.h in source tree;
use auto-generated versions.


68347 05-Nov-2000 marcel

Fix getdents syscall.

The offset field in struct dirent was set to the offset of
the next dirent in rev 1.36. The offset was calculated from
the current offset and the record length. This offset does
not necessarily match the real offset when we are using
cookies. Therefore, also use the cookies to set the offset
field in struct dirent if we're using cookies to iterate
through the dirents.


68251 02-Nov-2000 gallatin

zap a stray include that snuck in with rev 1.56

Submitted by: Clive Lin <clive@CirX.ORG>


68247 02-Nov-2000 gallatin

fix a comment that was inadvertantly changed by a cvs merge
pointed out by: obrien


68225 02-Nov-2000 marcel

Fix linux_ustat syscall. We only have cdevs now, so looking
for a block device isn't that useful anymore.

Reported by: Wesley Morgan <morganw@chemicals.tacorp.com>
Submitted by: gallatin
Acknowledged by: phk


68214 01-Nov-2000 gallatin

Support for the linux ipc syscalls on the alpha, where each one has
its own syscall rather than going through a demux function like
linux_ipc() on i386


68210 01-Nov-2000 gallatin

fix linux_termio and linux_termios structs on alpha. alpha differences
are in the termios struct (probably because linux wants to be compatible
with the osf/1 termios struct), not the termio struct.


68201 01-Nov-2000 obrien

The MI/MD split wasn't perfect and the MI files need hacks for the
AlphaLinux compat bits. This will be better cleaned up soon.

Agreed to what ever was necessary by: marcel


67234 17-Oct-2000 gallatin

A start at an implemention of linux_rt_sendsig & linux_rt_sigreturn
and associated user-level signal trampoline glue.

Without this patch, an SA_SIGINFO style handler can be installed by a linux
app, but if the handler accesses its sip argument, it will get a garbage
pointer and likely segfault.

We currently supply a valid pointer, but its contents are mainly
garbage. Filling this in properly is future work.

This is the second of 3 commits that will get IBM's JDK 1.3 working with
FreeBSD ...


66834 08-Oct-2000 phk

Initiate deorbit burn sequence for <machine/console.h>.

Replace all in-tree uses with necessary subset of <sys/{fb,kb,cons}io.h>.
This is also the appropriate fix for exo-tree sources.

Put warnings in <machine/console.h> to discourage use.
November 15th 2000 the warnings will be converted to errors.
January 15th 2001 the <machine/console.h> files will be removed.


65108 26-Aug-2000 marcel

Whitespace change: (near) KNF


65106 26-Aug-2000 marcel

Fix bug in previous commit. We need to trim the limits to fit
the datatype (= long). Use ULONG_MAX and LONG_MAX to avoid
creating MD code.


65099 26-Aug-2000 marcel

Re-implement linux_{g|s}etrlimit in terms of {g|s}etrlimit
instead of the o{g|s}etrlimit so that the dependency on
COMPAT_43 is removed.


65067 25-Aug-2000 marcel

Fix typo in license.


64913 22-Aug-2000 marcel

Update include directives.


64911 22-Aug-2000 marcel

Update include directives.

Make linux_to_bsd_sigset and linux_do_sigaction non-static.

Move linux_sigaction. linux_sigsuspend, linux_rt_sigsuspend,
linux_pause and linux_sigaltstack to MD code.


64909 22-Aug-2000 marcel

Update include directives.

Move linux_select to MD code (i386 compat. syscall).

Move linux_fork, linux_vfork, linux_clone, linux_mmap,
linux_pipe, linux_ioperm, linux_iopl and linux_modify_ldt
to MD code.


64907 22-Aug-2000 marcel

Update include directives.


64906 22-Aug-2000 marcel

Update include directives.

Make the sem*, msg* and shm* function non-static as they are
called from MD code.

Move linux_ipc to MD code.


64905 22-Aug-2000 marcel

Update include directives and remove linux_execve.


64904 22-Aug-2000 marcel

Provide prototypes for functions used by MD code.


63903 27-Jul-2000 marcel

Remove the only use of SCARG and perform dead code elimination.


63778 23-Jul-2000 marcel

Add bounds checking to stackgap_alloc. Previously it was possible
to construct a path that was long enough (ie longer than
SPARE_USRSPACE bytes) and trash the stack.

Note that SPARE_USRSPACE is much smaller than MAXPATHLEN so that
the Linuxulator will now return ENAMETOOLONG even if the path
is smaller than MAXPATHLEN.

PR: 12749


63605 20-Jul-2000 marcel

Revert implementation of setfsuid and setfsgid due to security
issues.

Requested by: rwatson
Backed by: kris


63285 17-Jul-2000 marcel

Implement pread and pwrite.

PR: 17991
Submitted by: Geoffrey Speicher <geoff@caribbean.sea-incorporated.com>


63280 16-Jul-2000 marcel

Implement setfsuid and setfsgid. Implementation derived from patch
in PR.

PR: 16993
Submitted by: Bjoern Groenvall <bg@sics.se>


63233 15-Jul-2000 marcel

Simplify the F_GETOWN and F_SETOWN fcntl commands. The workaround
is not needed since the FreeBSD native implementation switched
from TIOC{G|S}PGRP to FIO{G|S}ETOWN (kern_descrip.c rev 1.55).

PR: 16946
Submitted by: Victor Salaman <salaman@teknos.com>


62573 04-Jul-2000 phk

Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.

Pointed out by: bde


62454 03-Jul-2000 phk

Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:

Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our
sources:

-sysctl_vm_zone SYSCTL_HANDLER_ARGS
+sysctl_vm_zone (SYSCTL_HANDLER_ARGS)


61702 15-Jun-2000 cracauer

Linux allows to mmap annonymous with a file descriptor passed, FreeBSD
doesn't. In the Linux emulation layer, ignore the fd passed when
MAP_ANON is specified.

Known application to be fixed: Xanalys/Harlequin Lispworks

Also improve debug output for mmap, now showing what the emulation
layer mapped to what (-DDEBUG).

Reviewed by: marcel


60938 26-May-2000 jake

Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by: msmith and others


60833 23-May-2000 jake

Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by: phk
Reviewed by: phk
Approved by: mdodd


59794 30-Apr-2000 phk

Remove unneeded #include <vm/vm_zone.h>

Generated by: src/tools/tools/kerninclude


57998 13-Mar-2000 nsayer

Fix some style bugs. The long line is in a chunk of code that's
being rewritten, though.

Submitted by: bde


57867 09-Mar-2000 marcel

Fix bug in linux_wait4 and linux_waitpid where garbage in the status
argument could panic the kernel.

Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
Prompted by: jkh, gallatin
Approved by: prompters


57858 09-Mar-2000 nsayer

Implement Linux BLKGETSIZE ioctl, and open the door to implementing
other BLK.* ioctls should the desire arize.

Approved by: jkh (via dufault)


57564 28-Feb-2000 marcel

Fix accept(2) behavior in that accepted sockets don't inherit the
parents flags.

Note on the PR:
The PR contains another patch that's not being committed without
further background information. The PR stays open for now.

PR: 16946 (Victor A. Salaman <salaman@teknos.com>)
Prompted by: msmith
Indirect/implicit approval: jkh (shoot me if I'm wrong :-)


56940 01-Feb-2000 nsayer

Avoid passing an uninitialized structure member to the real
READSUBCHANNEL ioctl. This makes vmware work with SCSI CDROM
drives.

Approved by: jkh


55771 10-Jan-2000 marcel

Return Linux kernel version 2.2.12 by default. This is in line
with linux_base-6.1.


55629 08-Jan-2000 marcel

Convert the filesystem type returned in struct statfs by syscalls
linux_statfs and linux_fstatfs. Linux binaries testing this expect
the filesystem's magic number and not our vnode's tag.

PR: 15425
Tested by: Vladimir N. Silyaev <vsilyaev@mindspring.com>


54655 15-Dec-1999 eivind

Introduce NDFREE (and remove VOP_ABORTOP)


54399 10-Dec-1999 marcel

Remove unused includes.

Found by: phk-scan


54152 05-Dec-1999 archie

Fix LINT breakage.


54122 04-Dec-1999 marcel

Implement pluggable ioctl handlers.

Other modules can register and unregister ioctl handlers to extend the
ioctls known by the Linuxulator. A recent application is the vmware
port. The Linuxulator itself uses the new interface to register its
handlers as well. Handlers for the following types of ioctls have been
defined:
cdrom
console (=keyboard and VT handling)
socket
sound
termio

All ioctl related defines and declarations have been moved to a new
file (linux_ioctl.h), except for the pluggable ioctl handler interface
definition.

While there, cleanup linux.h some more.

linux.h and linux_ioctl.[ch] have been made to conform to style(9) as
much as possible.

Inspired and reviewed by: Vladimir N. Silyaev


53954 30-Nov-1999 marcel

Implement linux_sigaltstack.


53902 29-Nov-1999 alfred

add linuxulator wrapper for SNDCTL_DSP_GETODELAY


53758 27-Nov-1999 marcel

Implement linux_ustat.

Reviewed by: bde


53713 26-Nov-1999 marcel

Implement fdatasync in terms of fsync. The regeneration of proto.h,
syscall.h and sysent.h was probably forgotten after the last change
syscalls.master.


53009 08-Nov-1999 phk

simplify check for device.


52986 08-Nov-1999 peter

Use fo_stat() rather than Yet Another duplication of kern_descrip.c's stat
code.


52635 29-Oct-1999 phk

useracc() the prequel:

Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.


52421 21-Oct-1999 marcel

Fix the duplicate filenames that are the result of using getdents.

glibc2 defines struct dirent differently than the Linux kernel does.
The getdents function therefore needs to read a heuristically defined
number of kernel dirents to satisfy the request. In case where too
many kernel dirents have been read, the function lseeks on the
directory so that a next call will start with the right dirent. The
offset used in lseeking is the offset-field in the last dirent passed
to the application. This can only mean that the offset-field holds
the offset of the next dirent and not the offset of the dirent itself.


51969 06-Oct-1999 jhay

Swap IOC_OUT and IOC_IN for the SETDIR macro. The linux ioctl read and
write bits are swapped.

Reviewed by: luoqi, marcel


51793 29-Sep-1999 marcel

sigset_t change (part 4 of 5)
-----------------------------

The compatibility code and/or emulators have been updated:

iBCS2 now mostly uses the older syscalls. SVR4 now properly
handles all signals. This has been achieved by using the
new sigset_t throughout the emulator. The Linuxulator has
been severely updated. Internally the new Linux sigset_t is
made the default. These are then mapped to and from the
new FreeBSD sigset_t.

Also, rt_sigsuspend has been implemented in the Linuxulator.
Implementing this syscall basicly caused all this sigset_t
changing in the first place and the syscall has been used
throughout the change as a means for testing. It basicly is
too much work to undo the implementation so that it can
later be added again.

A special note on the use of sv_sigtbl and sv_sigsize in
struct sysentvec:
Every signal larger than sv_sigsize is not translated and is
passed on to the signal handler unmodified. Signals in the
range 1 upto and including sv_sigsize are translated.
The rationale is that only the system defined signals need to
be translated.

The emulators also have been updated so that the translation
tables are only indexed for valid (system defined) signals.
This change also fixes the translation bug already in the
SVR4 emulator.


51654 25-Sep-1999 phk

This patch clears the way for removing a number of tty related
fields in struct cdevsw:

d_stop moved to struct tty.
d_reset already unused.
d_devtotty linkage now provided by dev_t->si_tty.

These fields will be removed from struct cdevsw together with
d_params and d_maxio Real Soon Now.

The changes in this patch consist of:

initialize dev->si_tty in *_open()
initialize tty->t_stop
remove devtotty functions
rename ttpoll to ttypoll
a few adjustments to these changes in the generic code
a bump of __FreeBSD_version
add a couple of FreeBSD tags


51602 23-Sep-1999 marcel

Linux doesn't complain if you remove a msg queue that doesn't exist
(given the proper permissions).


51569 22-Sep-1999 luoqi

Implement linux_ioperm() syscall. Fix linux_iopl() to use the level argument.
SVGAlib should now work.

Reviewed by: marcel


51418 19-Sep-1999 green

This is what was "fdfix2.patch," a fix for fd sharing. It's pretty
far-reaching in fd-land, so you'll want to consult the code for
changes. The biggest change is that now, you don't use
fp->f_ops->fo_foo(fp, bar)
but instead
fo_foo(fp, bar),
which increments and decrements the fp refcount upon entry and exit.
Two new calls, fhold() and fdrop(), are provided. Each does what it
seems like it should, and if fdrop() brings the refcount to zero, the
fd is freed as well.

Thanks to peter ("to hell with it, it looks ok to me.") for his review.
Thanks to msmith for keeping me from putting locks everywhere :)

Reviewed by: peter


51348 17-Sep-1999 marcel

Fix getcwd. It must return the length of the path including the terminating 0.
While I'm here, fix style and debug printf.

Fix derived from patch by: Darryl Okahata <darrylo@sr.hp.com>


50903 04-Sep-1999 peter

<machine/soundcard.h> -> <sys/soundcard.h>, since it's an exported API
that's arch neutral and OSS API and Linux API compatable.


50833 03-Sep-1999 marcel

I missed the namechange of field desc in struct i386_ldt_args into descs while
reviewing luoqi's changes...

Pointed out by: luoqi


50818 02-Sep-1999 marcel

Implementation of the modify_ldt syscall. Use the sysarch() interface to do
the actual work. When USER_LDT is not defined for a kernel, sysarch returns
EOPNOTSUPP. Display a message in that case and return ENOSYS to userland.

Reviewed by: luoqi


50558 29-Aug-1999 marcel

Fix a braino: Linux minor device numbers are 8 bits wide and not 10.


50546 29-Aug-1999 marcel

Fix a missing '-1' in the size argument of copyout in getgroups. Spotted while
reviewing the MFC in -stable.


50500 28-Aug-1999 marcel

Implement the OSS_GETVERSION ioctl. The version returned can be changed through
the sysctl variable `compat.linux.oss_version'.

PR: 12917
Originator: Dean Lombardo <dlombardo@excite.com>


50480 28-Aug-1999 peter

$Id$ -> $FreeBSD$


50477 28-Aug-1999 peter

$Id$ -> $FreeBSD$


50465 27-Aug-1999 marcel

Add sysctl variables for the Linuxulator. These reside under `compat.linux' as
discussed on current.

The following variables are defined (for now):

osname (defaults to "Linux")
Allow users to change the name of the OS as returned by uname(2),
specially added for all those Linux Netscape users and statistics
maniacs :-) We now have what we all wanted!

osrelease (defaults to "2.2.5")
Allow users to change the version of the OS as returned by uname(2).
Since -current supports glibc2.1 now, change the default to 2.2.5
(was 2.0.36).

oss_version (defaults to 198144 [0x030600])
This one will be used by the OSS_GETVERSION ioctl (PR 12917) which I
can commit now that we have the MIB. The default version number is the
lowest version possible with the current 'encoding'.

A note about imprisoned processes (see jail(2)):
These variables are copy-on-write (as suggested by phk). This means that
imprisoned processes will use the system wide value unless it is written/set
by the process. From that moment on, a copy local to the prison will be
used.

A note about the implementation:
I choose to add a single pointer to struct prison, because I didn't like the
idea of changing struct prison every time I come up with a new variable. As
a side effect, the extra storage is only needed when a variable is set from
within the prison. This also minimizes kernel bloat when the Linuxulator is
not used; both compiled in or as a module.

Reviewed by: bde (first version only) and phk


50356 25-Aug-1999 marcel

Fix linux_newlstat in that it doesn't return the attributes of its containing
directory. Also, update arguments of NDINIT for both newstat and newlstat.

While I'm at it, fix style bugs in all {s|ls|fs}tat syscalls.

Reported by: bde


50350 25-Aug-1999 marcel

Fix {g|s}etgroups semantics. We use cr_groups[0] to hold egid. This means that
egid will be twice in the set and that setting cr_groups[0] will change egid.
This is simply solved by ignoring cr_groups[0]. That is; linux_getgroups does
not return cr_groups[0] and linux_setgroups does not touch it.

Noticed by: bde
Brought to my attention by: sheldonh


50345 25-Aug-1999 marcel

Change all UNIMPL syscalls to STD and add them to linux_dummy. Now we always
know if and when an unimplemented or obsoleted syscall is being used. Make the
message more end-user friendly.

And as long as we're here, rename some unimplemeted syscalls (linux_phys ->
linux_umount2, linux_vm86 -> linux_vm86old, linux_new_vm86 -> linux_vm86).

Change prototype for linux_newuname from `struct linux_newuname_t *' into
`struct linux_new_utsname *'. This change is reflected in linux.h and
linux_misc.c.


49960 17-Aug-1999 marcel

Fix a bug in debug-printfs of struct linux_termios fields, where I forgot to
change the format specifier after changing the definition of the structure.

Submitted by: billf
Commented on by: bde


49959 17-Aug-1999 marcel

Fix bug in the debug-printf of the vfork syscall, where the format specifier
didn't match the argument (p->p_pid).

While I'm at it, also fix the dupo in the format string and fix the annoying
inconsistency in all the debug-printfs wrt p_pid arguments. Change all of them
to use the %ld format specifier and cast the p_pid arguments to long.

Submitted by: billf


49890 16-Aug-1999 marcel

Implement linux_vfork() syscall by calling vfork(). Analogous to the
linux_fork() implementation.


49849 15-Aug-1999 marcel

Provide wrappers for sched_{s|g}etscheduler. We need to convert the policy
argument.

PR: 12006
Originator: Jean-Claude MICHOT <jcmichot@teaser.fr>


49845 15-Aug-1999 marcel

Fix bug in the fcntl syscall where 'arg' was not set properly.

PR: 12147
Submitted by: Allan Saddi <asaddi@philosophysw.com>


49842 15-Aug-1999 marcel

Include opt_compat.h so that COMPAT_43 is defined. This gives us the proper
prototypes of o{s|g}etrlimit (from sys/sysproto.h). Update linux_{s|g}etrlimit
so that the arguments to o{s|g}etrlimit are corresponding the prototypes.

Pointed out by: bde


49788 14-Aug-1999 marcel

Implementation of the linux_getcwd syscall.


49786 14-Aug-1999 marcel

Implementation of linux_rt_sigaction and linux_rt_sigprocmask syscalls. Both
functions use the new sigset_t and sigaction_t which allows support for more
than 32 signals. Only the lower 32 signals are supported for now.

linux_rt_sigaction, linux_sigaction and linux_signal use linux_do_sigaction
to do the actual work. That way unnecessary redundancy is avoided. The same
has been done for linux_rt_sigprocmask and linux_sigprocmask. They call
linux_do_sigprocmask to do the actual work.


49774 14-Aug-1999 marcel

Fix LINUX_TIOC{S|G}SERIAL implementation. Both do not copy data in or out
of kernel space. Remove the ioctl supporting functions, and move the actual
code to the switch-statement. Now everybody can clearly see that the
implementation is really poor.

Also fix a typo in LINUX_TIOCGETD. The underlying function was given command
TIOCSETD instead op TIOCGETD...


49768 14-Aug-1999 marcel

Fix the LINUX_TCSET{A|AW|AF} and LINUX_TCSET{S|SW|SF} ioctls. These all suffer
from the same bug in that the argument is not first copied from user space
before it is used. This is part 2 (of 2) of the termios fixes.


49766 14-Aug-1999 marcel

Fix a couple of termio/termios conversion bugs/typos/dupos/brainos and other
changes. This is part 1 of the complete termios ioctl fixes.

o change type of c_{i|o|c|l}flag in struct termios from unsigned long to
unsigned int. The type now matches the Linux definitions.
o replaced constants by the corresponding defines in sptab[] for clarity.
Since there's no define for 135 baud, its mapping has been dropped.

function bsd_to_linux_termios:
o Fix typo IXON -> IXANY.
o Remove bogus assignment to c_cc[LINUX_VSWTC].

function linux_to_bsd_termios:
o Fix dupo LINUX_IXON -> LINUX_IXANY.
o Add LINUX_CREAD mapping.
o Fix typo IEXTEN -> LINUX_IEXTEN.

function linux_to_bsd_termio:
o Small optimization: Don't preset the complete c_cc array when we next
assign to the first LINUX_NCC entries.


49688 13-Aug-1999 marcel

Implementation of the CDROMSUBCHNL ioctl.


49676 13-Aug-1999 marcel

In doing lock type conversion (struct flock), make sure that carbage in results
in deterministic behaviour. In this case known garbage out.
The fix is different than suggested in the PR.

PR: 12749
Originator: Boris Nikolaus <boris@cs.tu-berlin.de>


49662 12-Aug-1999 marcel

Use a wrapper for the link syscall that does name translations.

PR: 12749
Submitted by: Boris Nikolaus <boris@cs.tu-berlin.de>


49626 11-Aug-1999 marcel

Do not map {s|g}etrlimit onto FreeBSD syscalls. The arguments don't match.
The linux syscalls translate the arguments first before invoking the
FreeBSD native syscalls.

PR: kern/9591
Originator: John Plevyak <jplevyak@inktomi.com>


49523 08-Aug-1999 marcel

Fix page fault in linux_uselib syscall.

PR: 12910
Submitted by: Peter Holm <peter@holm.cc>


49478 07-Aug-1999 green

We don't end up checking for a return value of EFAULT from the copyinstr()
in the pathname translation procedure. This proves fatal, and can be
easily fixed. This or a similar change needs to be committed to svr4_util.h
and ibcs2_util.h. I will update ibcs2_util.h, if noone else thinks of a
better way to do this, in the same manner. I will leave svr4 to the
respective maintainer.

This closes the problem of the only crash I've been able to produce as
a user recently, except for (currently not-in-the-source tree) fd
table sharing fixes. Thanks goes to pho for his stress-testers.


48885 18-Jul-1999 phk

Use the vn_todev() function, rather than VOP_GETATTR


48851 17-Jul-1999 marcel

Implementation of TCXONC.

Reviewed by: bde


48685 08-Jul-1999 marcel

Implement VT_RELDISP ioctl

Submitted by: Kazutaka Yokota <yokota@FreeBSD.org>


48628 06-Jul-1999 marcel

Trivial implementation of TIOCM{S|G}ET and TIOCMBI{S|C} ioctls. No need
to convert the arguments.


48620 06-Jul-1999 cracauer

Rename struct members sa_siginfo. POSIX reserves identifiers starting
with sa_ when <signal.h> is included. They would conflict with the
upcoming SA_SIGINFO implementation.

Reviewed by: BDE


48595 05-Jul-1999 marcel

Let newuname return "Linux" as the OS name and not "FreeBSD". Also, return a
more sensible (for Linux applications) release number. Hardcoding a release
number has its drawbacks, but it will do for now.


47028 11-May-1999 phk

Divorce "dev_t" from the "major|minor" bitmap, which is now called
udev_t in the kernel but still called dev_t in userland.

Provide functions to manipulate both types:
major() umajor()
minor() uminor()
makedev() umakedev()
dev2udev() udev2dev()

For now they're functions, they will become in-line functions
after one of the next two steps in this process.

Return major/minor/makedev to macro-hood for userland.

Register a name in cdevsw[] for the "filedescriptor" driver.

In the kernel the udev_t appears in places where we have the
major/minor number combination, (ie: a potential device: we
may not have the driver nor the device), like in inodes, vattr,
cdevsw registration and so on, whereas the dev_t appears where
we carry around a reference to a actual device.

In the future the cdevsw and the aliased-from vnode will be hung
directly from the dev_t, along with up to two softc pointers for
the device driver and a few houskeeping bits. This will essentially
replace the current "alias" check code (same buck, bigger bang).

A little stunt has been provided to try to catch places where the
wrong type is being used (dev_t vs udev_t), if you see something
not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if
it makes a difference. If it does, please try to track it down
(many hands make light work) or at least try to reproduce it
as simply as possible, and describe how to do that.

Without DEVT_FASCIST I belive this patch is a no-op.

Stylistic/posixoid comments about the userland view of the <sys/*.h>
files welcome now, from userland they now contain the end result.

Next planned step: make all dev_t's refer to the same devsw[] which
means convert BLK's to CHR's at the perimeter of the vnodes and
other places where they enter the game (bootdev, mknod, sysctl).


46803 09-May-1999 peter

Fix a couple of warnings and some bitrot in comments.


46778 09-May-1999 phk

Yet another place which knew too much. Still not sure how much
good this does in the end.


46676 08-May-1999 phk

I got tired of seeing all the cdevsw[major(foo)] all over the place.

Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed to the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.


46571 06-May-1999 peter

Fix up a few easy 'assignment used as truth value' and 'suggest parens
around && within ||' type warnings. I'm pretty sure I have not masked
any problems here, I've committed real problem fixes seperately.


46163 29-Apr-1999 luoqi

- Handle mixer read ioctls correctly. They have the same group, number and
argument size as their write counterparts and were handled as write ioctls.
- Emulate some cdrom ioctls.


46129 28-Apr-1999 luoqi

Enable vmspace sharing on SMP. Major changes are,
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
is added to point to per-cpu pages, per-cpu global variables are now
accessed through this new selector (%fs). The selectors in gdt table are
rearranged for cache line optimization.
- fask_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.

Reviewed by: Alan Cox <alc@cs.rice.edu>
John Dyson <dyson@iquest.net>
Julian Elischer <julian@whistel.com>
Bruce Evans <bde@zeta.org.au>
David Greenman <dg@root.com>


46116 27-Apr-1999 phk

Change suser_xxx() to suser() where it applies.


46112 27-Apr-1999 phk

Suser() simplification:

1:
s/suser/suser_xxx/

2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.


45821 19-Apr-1999 peter

unifdef -DVM_STACK - it's been on for a while for x86 and was checked
and appeared to be working for the Alpha some time ago.


44384 02-Mar-1999 julian

Fix thread/process tracking and differentiation for Linux threads emulation.

Submitted by: Richard Seaman, Jr." <dick@tar.com>

Also clean some compiler warnings in surrounding code.


43208 26-Jan-1999 julian

Enable Linux threads support by default.
This takes the conditionals out of the code that has been tested by
various people for a while.
ps and friends (libkvm) will need a recompile as some proc structure
changes are made.

Submitted by: "Richard Seaman, Jr." <dick@tar.com>


42509 11-Jan-1999 msmith

Fix linux sendmsg() emulation

Submitted by: Brian Feldman <green@unixhelp.org>


42499 10-Jan-1999 eivind

Use truncate() instead of otruncate() - step on the way to stopping
the linulator from depending on COMPAT_43.


42360 06-Jan-1999 julian

Add (but don't activate) code for a special VM option to make
downward growing stacks more general.
Add (but don't activate) code to use the new stack facility
when running threads, (specifically the linux threads support).
This allows people to use both linux compiled linuxthreads, and also the
native FreeBSD linux-threads port.

The code is conditional on VM_STACK. Not using this will
produce the old heavily tested system.

Submitted by: Richard Seaman <dick@tar.com>


42186 30-Dec-1998 sos

Commit patch in

PR: 9232
Submitted by: marcel@scc.nl <Marcel Moolenaar>


42185 30-Dec-1998 sos

Commit #2 of

PR: 9235
Submitted by: marcel@scc.nl <Marcel Moolenaar>


42054 24-Dec-1998 julian

According to the author..

"I've been having a problem running the patches [committed to current]
installed with the COMPAT_LINUX_THREADS option along
with the VM_STACK patches I did. I'm not sure what
the problem is, since it seemed to work before.

In any event, the attached patch fixes the problem for
me. While I've had no report of problems from anyone
else, possibly it would be wise to commit the patch
until the problem is found.

Also, there was some left-over junk in the linux_misc.c
file from some earlier work I did. The attached patch
cleans that up too."

Submitted by: "Richard Seaman, Jr." <dick@tar.com>


41986 21-Dec-1998 sos

Kill(pid, 0) normally returns 0 on both FreeBSD and Redhat after having
performed all sorts of sanity checks. The FreeBSD linux emulator returns
EINVAL in such a case.
Allowing signal 0 to be passed to kill will result in compatible behaviour.

PR: 9082
Submitted by: Marcel Moolenaar <marcel@scc.nl>


41931 19-Dec-1998 julian

Reviewed by: Luoqi Chen, Jordan Hubbard
Submitted by: "Richard Seaman, Jr." <lists@tar.com>
Obtained from: linux :-)

Code to allow Linux Threads to run under FreeBSD.

By default not enabled
This code is dependent on the conditional
COMPAT_LINUX_THREADS (suggested by Garret)
This is not yet a 'real' option but will be within some number of hours.


41871 16-Dec-1998 bde

Removed the cast to a pointer in the definition of PS_STRINGS and
adjusted related casts to match (only in the kernel in this commit).
The pointer was only wanted in one place in kern_exec.c. Applications
should use the kern.ps_strings sysctl instead of PS_STRINGS, so they
shouldn't notice this change.


41650 10-Dec-1998 jkh

linux_pipe does not preserve the edx register. Linux and
programs using glibc expect edx to be preserved accross syscalls.
As a result, linux programs running in emulation mode can
have whatever value may be represented by edx clobbered.

PR: 9038
Submitted-By: Richard Seaman, Jr. <dick@tar.com>


41514 04-Dec-1998 archie

Examine all occurrences of sprintf(), strcat(), and str[n]cpy()
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for others cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.

These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.

Reviewed by: Bruce Evans <bde@zeta.org.au>
Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by: Mike Spengler <mks@networkcs.com>


41105 12-Nov-1998 jkh

MF22: Bring in some linux sound ioctl support which I committed to 2.2
for PR 7792 but did not bring forward.

Submitted by: Avatar Liang <avatar@www.mmlab.cse.yzu.edu.tw>
PR: 8656


40203 11-Oct-1998 jdp

Fix a couple of out-of-bounds array references in mapping between
Linux and FreeBSD signal numbers. Also, check signal numbers passed
in from application programs for validity. Without these checks,
it is trivial to panic the system from a Linux program.


39978 05-Oct-1998 jfieber

Make async I/O on a socket work.

Although the current Sybase license does not permit running under
emulation, FreeBSD 3.0 is now "Sybase Ready" should the license change.


39977 05-Oct-1998 sos

In linux_newuname bzero the right type of struct (linux_newuname_t).


39799 30-Sep-1998 jfieber

Add several missing ioctl handlers. One needed by Sybase, the others
found while looking for the one.


39620 24-Sep-1998 jkh

MF22: revert time bogon.


39598 23-Sep-1998 jkh

return time in proper format for linux.


38679 31-Aug-1998 jkh

Argh! *Now* the correct 3.0 fix is committed.


38677 31-Aug-1998 jkh

Whoops! Stamp out a 2.2-ism that snuck between branches here.


38672 31-Aug-1998 jkh

Initial support for using linux X servers under emulation - to use an
XFree86 server, users need to create the following links in their
/compat/linux/dev directory (assuming kernel configured with 4 VTs).

lrwxrwxrwx 1 root wheel 7 Aug 30 22:59 tty0 -> console
lrwxrwxrwx 1 root wheel 5 Aug 30 22:45 tty1 -> ttyv0
lrwxrwxrwx 1 root wheel 5 Aug 30 22:45 tty2 -> ttyv1
lrwxrwxrwx 1 root wheel 5 Aug 30 22:45 tty3 -> ttyv2
lrwxrwxrwx 1 root wheel 5 Aug 30 22:45 tty4 -> ttyv3

VT switching is still not yet supported. Attempting to switch VT
currently will cause Xserver bus error.

Submitted by: Chain Lee <chain@110.net>


38354 16-Aug-1998 bde

Use [u]intptr_t instead of [u_]long for casts between pointers and
integers. Don't forget to cast to (void *) as well.


38344 15-Aug-1998 bde

Oops, the previous fix confused Linux's sigset_t with a pointer type.
It can be integral or a struct in POSIX, so it is difficult to print,
but it is actually declared as unsigned long. Assume that it is
unsigned integral.


38127 05-Aug-1998 bde

Converted the second last instance of hzto() to tvtohz().

Fixed nearby bugs (in linux_alarm()):
- the itimer for the alarm was relative to the epoch instead of relative
to the boot time. This was harmless because the itimer's interval is 0.
- the seconds arg was not checked for validity before converting it to a
possibly different value.
- printf format errors.

Improvements:
Don't use splclock(). splsoftclock() suffices. Don't complicate things
by micro-optimizing interrupt latency.

Minor improvements:
Various micro-optimizations to exploit the specialness of the alarm itimer
and the value 0.


37950 29-Jul-1998 bde

Fixed print format errors.


37548 10-Jul-1998 jkh

Quick and dirty support for Linux's mremap. Not used by anything
but quake2 AFAIK.

Submitted by: Luoqi Chen <luoqi@watermarkgroup.com>


37287 30-Jun-1998 jmg

remove option LINUX as it did nothing, add DEBUG_LINUX to debug the
linux emulation...

(actually moved LINUX to opt_dontuse.h)


36735 07-Jun-1998 dfr

This commit fixes various 64bit portability problems required for
FreeBSD/alpha. The most significant item is to change the command
argument to ioctl functions from int to u_long. This change brings us
inline with various other BSD versions. Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.


36587 02-Jun-1998 jkh

".. x11amp appears to be calling shmctl(id, IPC_RMID, 0) and the emulation
layer does not like the null shmid_ds buffer pointer. The emulation layer
returned an error without ever calling FreeBSD's shmctl, so the segments
were not being deleted when the reference count went to zero."

Submitted by: Kevin Street <street@iname.com>


36119 17-May-1998 phk

s/nanoruntime/nanouptime/g
s/microruntime/microuptime/g

Reviewed by: bde


35058 06-Apr-1998 phk

Make a kernel version of the timer* functions called timerval* to be
more consistent.

OK'ed by: bde


35034 04-Apr-1998 phk

Use microruntime() rather than doing it by hand.


34961 30-Mar-1998 phk

Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't a atomic variable, so splfoo() protection were needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now uses the new variable time_second instead.

gettime() changed to getmicrotime(0.

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeard time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passwd &time as args.

Change profiling in ncr.c to use ticks instead of time. Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by: bde


34941 29-Mar-1998 peter

The linux chown syscall is more like lchown, a new chown syscall that
follows links was added.


34924 28-Mar-1998 bde

Moved some #includes from <sys/param.h> nearer to where they are actually
used.


33821 25-Feb-1998 bde

Removed redundant test against MAXDSIZ (the rlimit test is stronger).


33148 07-Feb-1998 msmith

In the words of the submitter:

----
I've worked to enhance the connect() patches.

I've just tested this with the Linux JDK appletviewer on an applet
that does a lot of connects, and it works as well as during my
previous tests.

The connect() patch is now a merge between my older patch and the
OpenBSD stuff. It ensures that any async error is returned by
connect() instead of getsockopt(SOL_SOCKET, SO_ERROR) as reasonnable
systems do.

There are also minor patches to implement IPPROTO_TCP for
get/setsocktopt(). These are also tested (with Linux Apache).
----

I would appreciate any feedback regarding these changes, as they'd
be very useful in 2.2.6.

Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)


32266 05-Jan-1998 jmb

sigh....forgot to update the DEBUG printf
to show both the path and the length args
to linux emulation truncate()

Submitted by: jmb


32265 05-Jan-1998 jmb

length argument to truncate() in linux emulation
was not being set copied to the bsd arguments..
frequently, resulting in files of over 100MB of NULs

PR: 386/5044
Reviewed by: jmb
Submitted by: (Richard Winkel) rich@math.missouri.edu


31784 16-Dec-1997 eivind

Make hidden COMPAT_43 dependencies explict. Options in headers is a
pain in the backside.


31778 16-Dec-1997 eivind

Make COMPAT_43 and COMPAT_SUNOS new-style options.


31730 15-Dec-1997 msmith

As described by the submitter:

These patches enables us to play quake2 .

Support linux keyboard ioctl for setting RAW, MEDIUMRAW and XLATE.

Support linux virtual terminal operations:
OPENQRY, GETMODE, SETMODE, GETSTATE, ACTIVATE, and WAITACTIVE.

Submitted by: Amancio Hasty <hasty@rah.star-gate.com>


31711 14-Dec-1997 msmith

As described by the submitter:

- emulate Linux IP_HDRINCL behaviour in sendto(): byte order fixed
Note that we do an extra getsockopt() on every sendto()
to check if the option is set because we don't keep state
in the emulator code. Is there a better way to implement
this?
- correct a bug (value of "name" not passed) with
getsockopt()

Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)


31561 05-Dec-1997 bde

Don't include <sys/lock.h> in headers when only `struct simplelock' is
required. Fixed everything that depended on the pollution.


31198 17-Nov-1997 ahasty

Added support for linux sound ioctls:
LINUX_SNDCTL_DSP_GETOPTR
LINUX_SNDCTL_DSP_GETIPTR
LINUX_SNDCTL_DSP_SETTRIGGER
LINUX_SNDCTL_DSP_GETCAPS

With this rev level the linux realaudio player 5 and xquake should work.


30994 06-Nov-1997 phk

Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warning, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
recompiled.


30855 30-Oct-1997 kato

Securelevel and formatting fixes, and trapframe simplification.

Reviewed by: sos
Submitted by: bde


30837 29-Oct-1997 kato

Implement linux_iopl and linux_nice.


30804 28-Oct-1997 kato

Implement linux_semop, linux_semget and linux_semctl.

PR: 4355


29679 21-Sep-1997 gibbs

Update for changes in the callout interface.


28039 10-Aug-1997 sos

Ops the arguments to copyin was in the wrong order..
This has survived since the first version, sigh.


27557 20-Jul-1997 bde

Removed unused #includes.


26378 02-Jun-1997 dfr

Make this thing actually compile.


26366 02-Jun-1997 msmith

Oops, remove some bogus debugging code that crept in with the last commit.


26364 02-Jun-1997 msmith

Add support for the SIOCGIFHWADDR ioctl, commonly used by
license managers to obtain the host's ethernet address as
a key.

Note that this implementation takes the first hardware address for
the first ethernet interface found, and disregards the interface name
that may be passed in, as linux ethernet devices are all "ethX".


25219 28-Apr-1997 msmith

Always include PROT_READ for Linux mmap operations.
Submitted by: Hannu Savolainen <hannu@voxware.pp.fi> via jkh


24672 06-Apr-1997 dfr

Remove dependancy on UFS' DIRBLKSIZ definition.

2.2 candidate.

Submitted by: bde


24654 05-Apr-1997 dfr

Fix linux_getdents so that it can cope with filesystems which translate
the directory format (ext2fs, cd9660). For these filesystems, it must use
cookies to find the correct offset to use for subsequent reads. Without it,
linux /bin/ls tends to loop re-reading the same block over and over again.

2.2 candidate.


24478 01-Apr-1997 bde

Removed potentially harmful garbage <vm/lock.h> and fixed bogus
use of it. It was actually harmless because the use was null due
to fortuitous include orders and identical (wrong) idempotency
macros.


24205 24-Mar-1997 bde

Don't include <sys/ioctl.h> in the kernel. Stage 3: include
<sys/filio.h> instead of <sys/ioctl.h> in non-network non-tty files.


24203 24-Mar-1997 bde

Don't include <sys/ioctl.h> in the kernel. Stage 1: don't include
it when it is not used. In most cases, the reasons for including it
went away when the special ioctl headers became self-sufficient.


24131 23-Mar-1997 bde

Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined.
Fixed everything that depended on getting fcntl.h stuff from the wrong
place. Most things don't depend on file.h stuff at all.


22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


22543 10-Feb-1997 mpp

Make this compile again after the Lite2 merge.

VOP_UNLOCK was being called with the wrong mumber of arguments.

Also silenced a -Wall warning.


22521 10-Feb-1997 dyson

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


20691 19-Dec-1996 bde

Fixed lseek() on named pipes. It always succeeded but should always fail.
Broke locking on named pipes in the same way as locking on non-vnodes
(wrong errno). This will be fixed later.

The fix involves negative logic. Named pipes are now distinguished from
other types of files with vnodes, and there is additional code to handle
vnodes and named pipes in the same way only where that makes sense (not
for lseek, locking or TIOCSCTTY).


20101 03-Dec-1996 fenner

Add IP_OPTIONS and the multicast-related setsockopts to the
list of IP setsockopts the Linux emulator recognizes.

Explicitly disallow IP_HDRINCL since Linux's handling of
raw output is different than BSD's.

Closes PR#kern/2111.

Submitted by: y-nakaga@ccs.mt.nec.co.jp (Yoshihisa NAKAGAWA)


19414 05-Nov-1996 smpatel

Add audio mixer ioctls.
Only writing to the mixer is implemented.


18027 03-Sep-1996 bde

Changed type of ni_dirp in `struct namei' from caddr_t to `const char *'
so that the compiler can see that it is OK to use const strings in
NDINIT(). Some emulators want to use paths of the form "/compat/foo".
Removed the casts that hid the non-problem. Didn't fix the missing
consts in syscalls.master that hid the non-problem.


17450 05-Aug-1996 nate

Fix memory leak bug in the path parsing code which never released it's
buffer in certain error conditions. Sync up the code to that in NetBSD
where applicable.

Reviewed by: Gary Jennejohn <garyj@munich.netsurf.de>
Submitted by: Michael Smith <msmith@atrad.adelaide.edu.au>
Obtained from: NetBSD sources


16632 23-Jun-1996 bde

Removed unused #include. Linux doesn't support SCO consoles.


16322 12-Jun-1996 gpalmer

Clean up -Wunused warnings.

Reviewed by: bde


15538 02-May-1996 phk

First pass at cleaning up macros relating to pages, clusters and all that.


15117 07-Apr-1996 bde

Removed never-used #includes of <machine/cpu.h>. Many were apparently
copied from bad examples.


14703 19-Mar-1996 bde

Fixed unsigned longs that should have been vm_offset_t.

vm_offset_t is currently unsigned long but should probably be plain
unsigned for i386's to match the choice of minimal types to represent
for fixed-width types in Lite2. Anyway, it shouldn't be assumed
to be unsigned long.

I only fixed the type mismatches that were detected when I changed
vm_offset_t to unsigned. Only pointer type mismatches were detected.


14584 12-Mar-1996 peter

Remove references to MAP_FILE.. That is now "default" and is only
a "#define MAP_FILE 0" that is still there for net-2 source compatability.


14471 10-Mar-1996 peter

Fix the vm_map_remove and vm_map_protect calls.. Somewhere along the
line, these had got (start, length) arguments instead of (start, end)
args. This could be the cause of Robert Sanders lockups with ZMAGIC
binaries.


14466 10-Mar-1996 peter

Implement rudumentry support for the linux TIOC[SG]ETSERIAL ioctl's.
To complete this, some extra state has to be kept somewhere so that the
B38400 flag in Linux can be correctly translated to/from either 38400,
57600 or 115200.

Submitted by: Robert Sanders <rsanders@mindspring.com>


14465 10-Mar-1996 peter

Fix the getdents() emulation, the Linux ELF libraries use this, and
this code was not quite right (linux has a readdir and getdents syscall,
with the same args. readdir only returns one entry and uses a mutant
dirent structure. This code was also returning the mutant form for
getdents as well. My fault for missing this before.)


14463 10-Mar-1996 peter

Fix a (mostly harmless) bogon when allocating space above the stack
in the stack gap..


14456 10-Mar-1996 sos

First attempt at FreeBSD & Linux ELF support.

Compile and link a new kernel, that will give native ELF support, and
provide the hooks for other ELF interpreters as well.

To make native ELF binaries use John Polstras elf-kit-1.0.1..
For the time being also use his ld-elf.so.1 and put it in
/usr/libexec.

The Linux emulator has been enhanced to also run ELF binaries, it
is however in its very first incarnation.
Just get some Linux ELF libs (Slackware-3.0) and put them in the
prober place (/compat/linux/...).
I've ben able to run all the Slackware-3.0 binaries I've tried
so far.
(No it won't run quake yet :)


14381 04-Mar-1996 peter

update linux_times() and linux_utime() emulation,
fix sigsuspend() (actually back out my recent change there)
and regen the syscall tables..


14371 04-Mar-1996 peter

Add support for LINUX_TCSETAW and LINUX_TCSETAF, which Linux-pine uses.

Submitted by: Robert Sanders <rsanders@mindspring.com>


14361 03-Mar-1996 peter

Add support for the old-style Linux termio (not termios) TCGETA etc.

Also, LINUX_POSIX_VDISABLE is \0, FreeBSD's is 0xff. Convert between them.

This enables some more programs to run, including the Livingston Portmaster
utilities (PMtools).

Submitted by: Robert Sanders <rsanders@mindspring.com>


14342 02-Mar-1996 peter

Minor touch-up... make two functions static, and add missing $Id$


14331 02-Mar-1996 peter

Mega-commit for Linux emulator update.. This has been stress tested under
netscape-2.0 for Linux running all the Java stuff. The scrollbars are now
working, at least on my machine. (whew! :-)

I'm uncomfortable with the size of this commit, but it's too
inter-dependant to easily seperate out.

The main changes:

COMPAT_LINUX is *GONE*. Most of the code has been moved out of the i386
machine dependent section into the linux emulator itself. The int 0x80
syscall code was almost identical to the lcall 7,0 code and a minor tweak
allows them to both be used with the same C code. All kernels can now
just modload the lkm and it'll DTRT without having to rebuild the kernel
first. Like IBCS2, you can statically compile it in with "options LINUX".

A pile of new syscalls implemented, including getdents(), llseek(),
readv(), writev(), msync(), personality(). The Linux-ELF libraries want
to use some of these.

linux_select() now obeys Linux semantics, ie: returns the time remaining
of the timeout value rather than leaving it the original value.

Quite a few bugs removed, including incorrect arguments being used in
syscalls.. eg: mixups between passing the sigset as an int, vs passing
it as a pointer and doing a copyin(), missing return values, unhandled
cases, SIOC* ioctls, etc.

The build for the code has changed. i386/conf/files now knows how
to build linux_genassym and generate linux_assym.h on the fly.

Supporting changes elsewhere in the kernel:

The user-mode signal trampoline has moved from the U area to immediately
below the top of the stack (below PS_STRINGS). This allows the different
binary emulations to have their own signal trampoline code (which gets rid
of the hardwired syscall 103 (sigreturn on BSD, syslog on Linux)) and so
that the emulator can provide the exact "struct sigcontext *" argument to
the program's signal handlers.

The sigstack's "ss_flags" now uses SS_DISABLE and SS_ONSTACK flags, which
have the same values as the re-used SA_DISABLE and SA_ONSTACK which are
intended for sigaction only. This enables the support of a SA_RESETHAND
flag to sigaction to implement the gross SYSV and Linux SA_ONESHOT signal
semantics where the signal handler is reset when it's triggered.

makesyscalls.sh no longer appends the struct sysentvec on the end of the
generated init_sysent.c code. It's a lot saner to have it in a seperate
file rather than trying to update the structure inside the awk script. :-)

At exec time, the dozen bytes or so of signal trampoline code are copied
to the top of the user's stack, rather than obtaining the trampoline code
the old way by getting a clone of the parent's user area. This allows
Linux and native binaries to freely exec each other without getting
trampolines mixed up.


14114 16-Feb-1996 peter

This is an extract of changes from what I am currently running...
- Optimise the linux a.out loading and uselib system calls so they
take advantage of some of John's recent interface improvements.
Basically, this means they make far less map changes than before.
- Attempt to plug some potentially nasty kernel_map memory leaks..
- Improve support for QMAGIC libs (I only use QMAGIC (ie: a.out libraries from
the slackware 3.0 dist) but this depends on other changes to enhance
the /compat/linux support)
- uselib goes out through a single exit as part of the resource tracking
that I did when closing the resource leaks on errors. This could be
cleaner than what I did, but making a 30-deep nested if/else was not my
idea of fun, neither did I want to repeat the same code 30 times over for
each failure possibility. I guess this function needs to be split into
smaller functions to solve this.

I've been running the Linux Netscape-2.0 (with Java) to test this, and apart
from the long-standing problem with the missing scrollbars, it appears to
still work as before with ZMAGIC libs (and the leaks).. However, I've
been using it with mods for the signal trampoline code for native linux stack
frames on signals and exterminated the blasted sigreturn printf() problem,
so I can't be certain that there is not a dependency on something else.


13739 30-Jan-1996 peter

Call pipe_stat() when presented with a DTYPE_PIPE file in the linux
fstat() syscall, rather than panic("linux newfstat").

(Note: I've extracted this from a larger set of diffs, I'm confident I've
not missed any dependencies but can't modload it to test it on my system)


13503 19-Jan-1996 dyson

Fixed vm_map_find for new vm updates.


13420 14-Jan-1996 sos

Add linux_mknod so that it will do mkfifo if needed...


13334 08-Jan-1996 peter

reran makesyscalls

Always call the SYSV ipc functions, stubs will take their place if
necessary.


13264 05-Jan-1996 wollman

The Linux emulator depends on SYSV IPC but doesn't actually reference
the options.


13113 30-Dec-1995 sos

Oops, forgot a little difference between my src-tree and ours...


13111 29-Dec-1995 sos

My first shot at get sound to work on the emulator.
Inspired by the work Amancio Hasty has done, but implemented
somewhat differently.


12867 15-Dec-1995 peter

Update linux_ipc.c to use the now generated prototypes for the shm* calls
it makes while emulating the linux equivalents.


12860 15-Dec-1995 peter

Initial attempt at getting Linux QMAGIC shared lib support. I have
successfully run linux netscape 2.0b3 with a QMAGIC ld.so and libc/libm
that I found on some linux machine that I _think_ is running slackware 3.0.

There are still problems.. ld.so claims the libraries are the wrong
format, but it still runs anyway.. :-/ The QMAGIC ld.so also screams
about needing ld.so.cache, and running a linux ldconfig is quite
educational. You soon learn to run "chroot /compat/linux /bin/ldconfig"
where ldconfig is living in /compat/linux/bin. :-]

(Lets just say that it puts loads of symlinks in /usr/lib otherwise :-)


12858 15-Dec-1995 peter

Clean up some warnings by using the generated structures in <sys/sysproto.h>
for passing to the bsd system calls, rather than inveninting our own
equivalent structures.


12842 14-Dec-1995 bde

Restored a vm #include.


12689 09-Dec-1995 peter

Attempt to make the Linux LKM compile again after the recent VM include
de-nesting changes...
(I figured this might be usefulif it actually built, since I've told
everybody to rebuild it or die.. :-)


12652 06-Dec-1995 bde

Include <vm/vm.h> explicitly to avoid breaking when vnode_if.h doesn't
include vm stuff.


12458 22-Nov-1995 bde

Completed function declarations and added prototypes.

Removed some unnecessary #includes.

Fixed warnings about nested externs.


12130 06-Nov-1995 dg

All:
Changed vnodep -> vp for consistency with the rest of the kernel, and
changed iparams -> imgp for brevity.

kern_exec.c:
Explicitly initialized some additional parts of the image_params struct
to avoid bzeroing it. Rewrote the set-id code to reduce the number of
logical tests. The rewrite exposed a mostly benign bug in the algorithm:
traced set-id images would get ktracing disabled even if the set-id didn't
happen for other reasons.


11418 10-Oct-1995 swallace

Fix the getdirentries of ibcs2 to handle uneven DIRBLKSIZ offsets.
Slight modification from previous fix.

Also, fix problem where an entry would be skipped next call if not enough room
in buffer current call.


11163 04-Oct-1995 julian

Submitted by: Juergen Lock <nox@jelal.hb.north.de>
Obtained from: other people on the net ?

1. stepping over syscalls (gdb ni) sends you to DDB, and returned
to the wrong address afterwards, with or without DDB. patch in
i386/i386/trap.c below.

2. the linux emulator (modload'ed) still causes panics with DIAGNOSTIC,
re-applied a patch posted to one of the lists...


10358 28-Aug-1995 julian

Reviewed by: julian with quick glances by bruce and others
Submitted by: terry (terry lambert)
This is a composite of 3 patch sets submitted by terry.
they are:
New low-level init code that supports loadbal modules better
some cleanups in the namei code to help terry in 16-bit character support
some changes to the mount-root code to make it a little more
modular..

NOTE: mounting root off cdrom or NFS MIGHT be broken as I haven't been able
to test those cases..

certainly mounting root of disk still works just fine..
mfs should work but is untested. (tomorrows task)

The low level init stuff includes a total rewrite of init_main.c
to make it possible for new modules to have an init phase by simply
adding an entry to a TEXT_SET (or is it DATA_SET) list. thus a new module can
be added to the kernel without editing any other files other than the
'files' file.


10355 28-Aug-1995 swallace

Modified linux_readdir() function to properly handle Linux readdir()
calls with a byte size of 1. This special case was not
correctly emulated. Now programs such as a simple 'ls' to a commercial
Macintosh emulator called 'executor' will work correctly.


9313 25-Jun-1995 sos

First incarnation of our Linux emulator or rather compatibility code.
This first shot only incorporaties so much functionality that DOOM
can run (the X version), signal handling is VERY weak, so is many
other things. But it meets my milestone number one (you guessed it
- running DOOM).

Uses /compat/linux as prefix for loading shared libs, so it won't
conflict with our own libs.

Kernel must be compiled with "options COMPAT_LINUX" for this to work.