History log of /openbsd-current/usr.bin/mandoc/mdoc_validate.c
# 1.306 08-Jun-2022 schwarze

When looking for the next block to tag, we aren't interested in children
of the current block but really want the next block instead. This fixes
a segfault reported by Evan Silberman <evan at jklol dot net> on bugs@.


Revision tags: OPENBSD_7_1_BASE
# 1.305 04-Oct-2021 schwarze

store the operating system name obtained from uname(3) in the adequate
struct together with similar state data rather than in a function-scope
static variable, such that it can be free(3)d in roff_man_free();
no functional change


Revision tags: OPENBSD_7_0_BASE
# 1.304 18-Jul-2021 schwarze

Support auto-tagging for ".It Va".

This combination is somewhat rare because few libraries expose so many
global variables that they need a list to enumerate them, but when the
idiom does occur, tagging the variable names is generally useful.
For example, this helps awk(1), dc(1), make(1), rc.subr(8), ...

Missing feature reported and patch reviewed, tested, and OK'ed by kn@.


Revision tags: OPENBSD_6_9_BASE
# 1.303 30-Oct-2020 schwarze

Promote section headers that can be used unmodified as fragment
identifiers from TAG_WEAK to TAG_STRONG,
such that for example ...#DESCRIPTION always works.
Suggested by Aman Verma on the discuss@ list.


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.302 26-Apr-2020 schwarze

While we do not recommend the idiom ".Fl Fl long" for long options
because it is an abuse of semantic macros for device-specific
presentational effects, this idiom is so widespread that it makes
sense to convert it to the recommended ".Fl \-long" during the
validation phase. For example, this improves HTML formatting
in pages where authors have used the dubious .Fl Fl.

Feature suggested by Steffen Nurpmeso <steffen at sdaoden dot eu>
on freebsd-hackers.


# 1.301 24-Apr-2020 schwarze

provide a STYLE message when mandoc knows the file name and the extension
disagrees with the section number given in the .Dt or .TH macro;
feature suggested and patch tested by jmc@
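
A minimal sketch of such a check, for illustration only. The function name
section_matches_extension() and the rule of comparing the extension to the
section string verbatim are assumptions, not taken from the mandoc sources.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return 0 if the file name extension
 * disagrees with the section string, 1 otherwise.
 * A file name without an extension cannot disagree.
 */
int
section_matches_extension(const char *fname, const char *sec)
{
    const char *base, *ext;

    assert(fname != NULL && sec != NULL);

    /* Only look at the last path component. */
    base = strrchr(fname, '/');
    base = base == NULL ? fname : base + 1;

    ext = strrchr(base, '.');
    if (ext == NULL || ext[1] == '\0')
        return 1;   /* no extension: nothing to compare */

    return strcmp(ext + 1, sec) == 0;
}
```

A caller would emit the STYLE message when the function returns 0,
for example for a file named "ls.1" whose .Dt line gives section 8.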


# 1.300 18-Apr-2020 schwarze

When a .Tg is attached to a paragraph, attach the permalink
to the first word, or the first few words if they are short.


# 1.299 08-Apr-2020 schwarze

Use a separate node->tag attribute rather than abusing the node->string
attribute for the purpose. No functional change intended.
The purpose is to make it possible to later attach tags to text nodes.


# 1.298 06-Apr-2020 schwarze

Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.
In HTML output, improve the logic for writing inside permalinks:
skip them when there is no child content or when there is a risk
that the children might contain flow content.


# 1.297 02-Apr-2020 schwarze

Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc,man}_{term,html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.
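
The idea of skipping transparent nodes can be sketched as follows. The
struct and function names here are hypothetical and much simpler than the
real struct roff_node; the "transparent" member stands in for whatever
test the actual code uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified syntax tree node. */
struct node {
    struct node *prev;
    struct node *next;
    int          transparent;   /* e.g. .Sm, .Tg, .ft, .PD */
};

/* Return the next sibling that is not transparent, or NULL. */
struct node *
node_next_opaque(const struct node *n)
{
    struct node *p;

    assert(n != NULL);
    for (p = n->next; p != NULL && p->transparent; p = p->next)
        continue;
    return p;
}

/* Return the previous sibling that is not transparent, or NULL. */
struct node *
node_prev_opaque(const struct node *n)
{
    struct node *p;

    assert(n != NULL);
    for (p = n->prev; p != NULL && p->transparent; p = p->prev)
        continue;
    return p;
}
```

Validators and formatters that ask "what comes before or after this
macro" would then call these helpers instead of following the raw
prev and next pointers.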


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
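
To illustrate the class of problem with made-up names (the enum, table,
and function below are assumptions, not the mandoc identifiers): when
token values do not start at zero, it is tempting to precompute a biased
base pointer such as "names - TOK_FIRST", but merely forming that pointer
is undefined. Subtracting the offset from the index instead stays inside
the array.

```c
#include <assert.h>

/* Hypothetical token range that does not start at zero. */
enum tok { TOK_FIRST = 100, TOK_A = TOK_FIRST, TOK_B, TOK_C, TOK_MAX };

static const char *const names[TOK_MAX - TOK_FIRST] = { "A", "B", "C" };

/*
 * Undefined, even if never dereferenced:
 *     const char *const *biased = names - TOK_FIRST;
 * Well defined: bias the index, not the pointer.
 */
const char *
tok_name(enum tok t)
{
    assert(t >= TOK_FIRST && t < TOK_MAX);
    return names[t - TOK_FIRST];
}
```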


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.
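
A scan for "--" that cannot read past the terminating NUL, even when the
string is empty or consists of a single hyphen, might look like the sketch
below. The function name and the rule "a run of exactly two hyphens" are
assumptions for illustration; the real validator may apply different
conditions.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical check: return 1 if s contains a run of exactly
 * two hyphens, 0 otherwise.  The pointer only ever advances by
 * the length of what was actually inspected.
 */
int
has_emdash_candidate(const char *s)
{
    const char *cp;
    size_t run;

    assert(s != NULL);
    for (cp = s; *cp != '\0'; cp += run) {
        if (*cp != '-') {
            run = 1;
            continue;
        }
        run = strspn(cp, "-");  /* stops at NUL at the latest */
        if (run == 2)
            return 1;
    }
    return 0;
}
```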


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
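
A sketch of what a linear time, zero space heuristic of this kind can
look like. This is not claimed to be the function used in mandoc; the
name similar(), the limit of three differences, and the one-character
lookahead are assumptions chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical fuzzy comparison: walk both strings once, allowing
 * up to maxdist single-character replacements, insertions, or
 * deletions, decided by looking one character ahead.
 * Returns 1 if the strings are considered similar, 0 otherwise.
 */
int
similar(const char *s1, const char *s2)
{
    const int maxdist = 3;
    int dist = 0;

    assert(s1 != NULL && s2 != NULL);
    while (*s1 != '\0' && *s2 != '\0') {
        if (*s1 != *s2) {
            if (++dist > maxdist)
                return 0;
            if (s1[1] == s2[1])
                ;           /* replacement */
            else if (*s1 == s2[1])
                s2++;       /* extra character in s2 */
            else if (s1[1] == *s2)
                s1++;       /* extra character in s1 */
            else
                return 0;
        }
        s1++;
        s2++;
    }
    /* Whatever remains unmatched counts as differences. */
    dist += (int)(strlen(s1) + strlen(s2));
    return dist <= maxdist;
}
```

Unlike a full Levenshtein computation, this needs no table and never
backtracks, at the price of missing some pairs that an exact edit
distance would accept.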


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).
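
Passing a null pointer for a %s conversion is undefined behaviour in C,
which is why these cases matter. One generic way to guard against it is
sketched below; the helper names are hypothetical and the actual fixes in
the validator may simply test the pointer at each call site.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: never hand NULL to a %s conversion. */
const char *
str_or_fallback(const char *s, const char *fallback)
{
    assert(fallback != NULL);
    return s != NULL ? s : fallback;
}

/* Example use in a diagnostic; arg may legitimately be NULL. */
void
report_arg(FILE *fp, const char *arg)
{
    fprintf(fp, "argument: %s\n", str_or_fallback(arg, "(none)"));
}
```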


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
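
The three kinds of replacement can be sketched with a simplified,
hypothetical node type (not the real struct roff_node): one pointer check
for "no children", two for "exactly one child", and a small loop where an
actual count is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node with a first-child and a next-sibling link. */
struct node {
    struct node *child;
    struct node *next;
};

/* One pointer check instead of nchild == 0. */
int
has_no_children(const struct node *n)
{
    assert(n != NULL);
    return n->child == NULL;
}

/* Two pointer checks instead of nchild == 1. */
int
has_one_child(const struct node *n)
{
    assert(n != NULL);
    return n->child != NULL && n->child->next == NULL;
}

/* A tiny loop for the rare case where the count itself matters. */
int
count_children(const struct node *n)
{
    const struct node *c;
    int cnt = 0;

    assert(n != NULL);
    for (c = n->child; c != NULL; c = c->next)
        cnt++;
    return cnt;
}
```

Because the answers are derived from the links themselves, there is no
separate counter that could get out of sync with the tree.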


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
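
The rewinding rule described above can be sketched roughly as follows.
This is only an illustration, not the mandoc code: struct node, the
NODE_* flags, and rewind_broken() are hypothetical stand-ins for
struct mdoc_node, MDOC_BROKEN/MDOC_ENDED, and the real rewinding logic,
and "rewinding" is reduced to setting a NODE_CLOSED flag.

```c
#include <stddef.h>

/* Hypothetical flags, modelled on MDOC_BROKEN and MDOC_ENDED. */
#define NODE_BROKEN (1 << 0)	/* block was broken by another block */
#define NODE_ENDED  (1 << 1)	/* block has ended but is still open */
#define NODE_CLOSED (1 << 2)	/* block has been rewound (sketch only) */

/* Hypothetical, minimal stand-in for a parse tree node. */
struct node {
	struct node	*parent;
	int		 flags;
};

/*
 * Walk from a broken block up through its ancestors, rewinding every
 * ancestor that is marked as ended, and stop after processing the
 * first ancestor that is not itself broken.
 * Returns the number of blocks rewound.
 */
static int
rewind_broken(struct node *broken)
{
	struct node	*n;
	int		 nclosed = 0;

	for (n = broken->parent; n != NULL; n = n->parent) {
		if ((n->flags & NODE_ENDED) && !(n->flags & NODE_CLOSED)) {
			n->flags |= NODE_CLOSED;
			nclosed++;
		}
		if (!(n->flags & NODE_BROKEN))
			break;
	}
	return nclosed;
}
```

The point of the sketch is that the walk needs only the parent pointers
and two flag bits, so no separate pending pointer has to be kept
consistent.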


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailboxd dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.
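
As a rough sketch of why asprintf(3) is less error prone than manual
strlen(3)+malloc(3)+strlcpy(3)+strlcat(3) chains: the buffer size is
derived from the format itself, so it cannot drift out of sync with the
text being written. Because asprintf(3) is a BSD/GNU extension, the
example below uses a hypothetical portable helper, format_alloc(),
built on vsnprintf(3); it is not the code used in mandoc.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical asprintf(3)-like helper: measure the formatted length
 * first, then allocate exactly that much and format again.
 * Returns a malloc(3)ed string, or NULL on failure.
 */
static char *
format_alloc(const char *fmt, ...)
{
	va_list	 ap, ap2;
	char	*buf;
	int	 len;

	va_start(ap, fmt);
	va_copy(ap2, ap);
	len = vsnprintf(NULL, 0, fmt, ap);	/* measuring pass */
	va_end(ap);
	if (len < 0) {
		va_end(ap2);
		return NULL;
	}
	buf = malloc((size_t)len + 1);
	if (buf != NULL)
		(void)vsnprintf(buf, (size_t)len + 1, fmt, ap2);
	va_end(ap2);
	return buf;
}
```

A caller would write something like format_alloc("%s-%d", name, num)
and free(3) the result, with no fixed-size buffer and no possibility of
silent truncation.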


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
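
The value parsing described above might look roughly like this.
apply_nr() is a hypothetical helper, not the function in roff.c; it
assumes that a leading "+" or "-" means increment or decrement relative
to the old register value, and it ignores integer overflow for brevity.

```c
/*
 * Hypothetical sketch of .nr value handling: an optional sign requests
 * an increment or decrement, and the number ends right before the
 * first non-digit character.  Overflow is not checked in this sketch.
 */
static int
apply_nr(int old, const char *arg)
{
	int	 sign = 0;
	int	 val = 0;

	if (*arg == '+') {
		sign = 1;
		arg++;
	} else if (*arg == '-') {
		sign = -1;
		arg++;
	}
	while (*arg >= '0' && *arg <= '9')
		val = val * 10 + (*arg++ - '0');

	/* Without a sign, the register is simply set to the value. */
	return sign == 0 ? val : old + sign * val;
}
```

Under these assumptions, with an old value of 7, the argument "12n"
sets the register to 12, "+5" raises it to 12, and "-3x" lowers it
to 4.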


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
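
A minimal sketch of the .Bx normalization described above, assuming the
rendered form is the version, then "BSD", then the hyphenated variant
(for example "4.3BSD-Reno"). format_bx() is a hypothetical helper, not
the validation code itself, which works on parse tree nodes.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical sketch: format up to two .Bx arguments into buf.
 * The first character of the second argument is converted to
 * uppercase and the argument is appended with a hyphen.
 * Returns the length snprintf(3) reports.
 */
static int
format_bx(char *buf, size_t bufsz, const char *version, const char *variant)
{
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, bufsz, "%sBSD", version);
	return snprintf(buf, bufsz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```

The cast to unsigned char before toupper(3) avoids undefined behaviour
for bytes outside the ASCII range.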


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.
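
The layout change described above can be sketched as follows. This is only an illustration: the struct and member names (bl_data, node_old, node_new, and so on) are hypothetical and are not copied from libmdoc.

```c
#include <stdlib.h>

/* Hypothetical per-macro data, invented for this sketch. */
struct bl_data { int type; int compact; };
struct bd_data { int type; int compact; };
struct an_data { int auth; };

/* Before: a union of pointers, one separately allocated struct per macro. */
struct node_old {
	union {
		struct bl_data	*Bl;
		struct bd_data	*Bd;
		struct an_data	*An;
	} data;
};

/* After: a single pointer to a union of structs. */
union node_data {
	struct bl_data	 Bl;
	struct bd_data	 Bd;
	struct an_data	 An;
};

struct node_new {
	union node_data	*norm;	/* one allocation, sized for the largest member */
};

/* Allocate a node together with zeroed macro data; NULL on failure. */
struct node_new *
node_new_alloc(void)
{
	struct node_new	*n;

	if ((n = calloc(1, sizeof(*n))) == NULL)
		return NULL;
	if ((n->norm = calloc(1, sizeof(*n->norm))) == NULL) {
		free(n);
		return NULL;
	}
	return n;
}

/* Free a node and its macro data; accepts NULL. */
void
node_new_free(struct node_new *n)
{
	if (n == NULL)
		return;
	free(n->norm);
	free(n);
}
```

With the second layout, every node's data has the same size regardless of the macro, which is where the small memory overhead mentioned in the commit message comes from.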


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@'s improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no longer default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
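
The three exclusion rules above can be sketched as a small predicate. This is an illustration under assumptions, not the mandoc implementation: the function name hyphen_is_breakable() and its interface are hypothetical, and the restriction to free-form text is left to the caller.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from mandoc: decide whether the
 * hyphen at word[pos] may serve as a line break point according to
 * the three rules listed in the commit message.
 * Returns 1 if breaking is allowed, 0 otherwise.
 */
int
hyphen_is_breakable(const char *word, size_t pos)
{
	size_t	 len;

	len = strlen(word);
	if (pos >= len || word[pos] != '-')
		return 0;

	/* Rule 1: not at the beginning or end of a word. */
	if (pos == 0 || pos + 1 == len)
		return 0;

	/* Rule 2: not immediately preceded or followed by a hyphen. */
	if (word[pos - 1] == '-' || word[pos + 1] == '-')
		return 0;

	/* Rule 3: not escaped by a preceding backslash. */
	if (word[pos - 1] == '\\')
		return 0;

	return 1;
}
```

A formatter could call such a predicate for each hyphen in a word that does not fit on the output line and break at the last position that both fits and returns 1.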

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.
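
As a rough sketch of how the three levels relate to the exit status: the names below (msg_level, msg_state, msg_report, msg_exit_status) are hypothetical, and the concrete exit value 1 is an assumption; the commit message only says "non-zero".

```c
/* Hypothetical message levels, ordered by severity. */
enum msg_level {
	LEVEL_OK = 0,
	LEVEL_WARNING,	/* input should be cleaned up, output mostly OK */
	LEVEL_ERROR,	/* output may be garbled or lose information */
	LEVEL_FATAL	/* no output can be produced from this file */
};

/* Tracks the worst level seen so far. */
struct msg_state {
	enum msg_level	 worst;
};

/* Record one message; only the most severe level is remembered. */
void
msg_report(struct msg_state *st, enum msg_level lvl)
{
	if (lvl > st->worst)
		st->worst = lvl;
}

/*
 * WARNINGs alone do not cause an error exit; ERROR and FATAL do.
 * The value 1 is an assumption made for this sketch.
 */
int
msg_exit_status(const struct msg_state *st)
{
	return st->worst >= LEVEL_ERROR ? 1 : 0;
}
```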

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@'s end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
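
The end-of-sentence rule described above can be sketched in a few lines. This is a simplification under assumptions: the function name line_ends_sentence() is hypothetical, and only the final character of the line is inspected.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not the mandoc code: return 1 if the input
 * line ends in one of the characters '.', '!', or '?', which the
 * commit message treats as an end of sentence (EOS), else 0.
 */
int
line_ends_sentence(const char *line)
{
	size_t	 len;

	len = strlen(line);
	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

Because the check looks only at the end of the input line, a sentence that ends in the middle of a line is not detected, which is why the "new sentence, new line" rule matters.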


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@



# 1.300 18-Apr-2020 schwarze

When a .Tg is attached to a paragraph, attach the permalink
to the first word, or the first few words if they are short.


# 1.299 08-Apr-2020 schwarze

Use a separate node->tag attribute rather than abusing the node->string
attribute for the purpose. No functional change intended.
The purpose is to make it possible to later attach tags to text nodes.


# 1.298 06-Apr-2020 schwarze

Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.
In HTML output, improve the logic for writing inside permalinks:
skip them when there is no child content or when there is a risk
that the children might contain flow content.


# 1.297 02-Apr-2020 schwarze

Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.
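
The idea of skipping transparent nodes can be sketched as follows. The node structure, the SK_TRANSPARENT flag, and next_visible() are hypothetical names invented for this illustration; they are not the mandoc data structures.

```c
#include <stddef.h>

/* Hypothetical flag marking a semantically transparent node. */
#define SK_TRANSPARENT	0x01

/* Hypothetical, heavily reduced syntax tree node. */
struct sk_node {
	int		 flags;
	struct sk_node	*next;	/* next sibling, or NULL */
};

/*
 * Return the next sibling that is not semantically transparent,
 * or NULL if there is none.  Nodes such as .Sm or .Tg would carry
 * SK_TRANSPARENT in this sketch and hence be skipped.
 */
struct sk_node *
next_visible(struct sk_node *n)
{
	if (n == NULL)
		return NULL;
	for (n = n->next; n != NULL; n = n->next)
		if ((n->flags & SK_TRANSPARENT) == 0)
			return n;
	return NULL;
}
```

A validator that needs to know what follows a paragraph macro could then ask for next_visible() rather than the raw next pointer, so that an intervening .Sm or .Tg does not change the decision.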

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
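
A minimal sketch of the problem class, not the mandoc code itself: the
token numbering, the table names[], and the function tok_name() are
hypothetical. It assumes a table whose first valid index is not zero
and shows the well-defined way to index it.

```c
#include <stddef.h>

/* Hypothetical token numbering: valid tokens start at TOK_FIRST, not 0. */
enum { TOK_FIRST = 100, TOK_A = 100, TOK_B, TOK_C, TOK_MAX };

static const char *const names[TOK_MAX - TOK_FIRST] = { "A", "B", "C" };

/*
 * Undefined by the C standard, even if never dereferenced:
 *	static const char *const *biased = names - TOK_FIRST;
 * Well-defined alternative: keep the base pointer and bias the index.
 */
static const char *
tok_name(int tok)
{
	if (tok < TOK_FIRST || tok >= TOK_MAX)
		return NULL;
	return names[tok - TOK_FIRST];
}
```

The bounds check is an addition of the sketch; the point is only that
the subtraction happens on the integer index, never on the pointer.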


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
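
A rough sketch of what such a heuristic can look like; it is not
claimed to be the function used in mandoc. The name roughly_equal()
and the tolerance of two differences are assumptions. It makes one
pass over both strings and allocates nothing.

```c
#include <string.h>

/*
 * Return 1 if a and b differ by at most maxdist single-character
 * replacements, insertions, or deletions that this greedy scan
 * can resynchronize after; return 0 otherwise.
 */
static int
roughly_equal(const char *a, const char *b)
{
	const int maxdist = 2;	/* assumed tolerance */
	int dist = 0;

	while (*a != '\0' && *b != '\0') {
		if (*a != *b) {
			if (++dist > maxdist)
				return 0;
			if (a[1] == b[1])
				;	/* replacement: skip one in both */
			else if (a[0] == b[1] && a[1] == b[2])
				b++;	/* extra character in b */
			else if (b[0] == a[1] && b[1] == a[2])
				a++;	/* extra character in a */
			else
				return 0;
		}
		a++;
		b++;
	}
	/* Whatever is left over at either end counts as a difference. */
	dist += (int)(strlen(a) + strlen(b));
	return dist <= maxdist;
}
```

Unlike a full Levenshtein metric, this looks only one or two characters
ahead, so it can miss some near matches, but it never needs a table.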


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
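
A sketch of the three replacement patterns, using a hypothetical,
reduced struct node with only the two links involved; the helper
names are made up for illustration and do not exist in mandoc.

```c
#include <stddef.h>

/* Hypothetical, reduced node type: only the links used here. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* "nchild == 0": one pointer check. */
static int
has_no_child(const struct node *n)
{
	return n->child == NULL;
}

/* "nchild == 1": two pointer checks. */
static int
has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* An actual count, needed only rarely: a tiny loop. */
static int
count_children(const struct node *n)
{
	const struct node *c;
	int cnt = 0;

	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

With no stored counter, there is nothing left that could disagree
with the child list itself.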


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
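
An illustration of the convention with a made-up function classify();
it is not taken from the mandoc sources.

```c
static int
classify(int c)
{
	int score = 0;

	switch (c) {
	case 'a':
	case 'e':	/* adjacent labels: no comment needed */
		return 1;
	case 'y':
		score = 1;	/* code runs before falling through */
		/* FALLTHROUGH */
	case 'w':
		return score + 2;
	default:
		return 0;
	}
}
```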


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.
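
The ancestor walk can be pictured roughly as follows. This is a
simplified sketch and not the parser code: struct blk, the F_* flags,
and rewind_ancestors() are hypothetical stand-ins, "rewinding" is
reduced to setting a flag, and starting at the parent of the broken
block is an assumption.

```c
#include <stddef.h>

#define F_BROKEN 0x1	/* hypothetical stand-in for MDOC_BROKEN */
#define F_ENDED  0x2	/* hypothetical stand-in for MDOC_ENDED */
#define F_CLOSED 0x4	/* hypothetical: marks a block as rewound */

struct blk {
	struct blk	*parent;
	int		 flags;
};

/*
 * Walk from a broken block up through its ancestors, closing every
 * ancestor that has already been ended, and stop after the first
 * ancestor that is not itself broken.
 * Returns the number of ancestors closed.
 */
static int
rewind_ancestors(struct blk *broken)
{
	struct blk *n;
	int closed = 0;

	for (n = broken->parent; n != NULL; n = n->parent) {
		if (n->flags & F_ENDED) {
			n->flags |= F_CLOSED;
			closed++;
		}
		if ((n->flags & F_BROKEN) == 0)
			break;
	}
	return closed;
}
```

The loop needs only the parent links and two flag bits, which is why
no separate pending pointer has to be kept consistent.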

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
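
The overflow concern behind the switch to reallocarray(3) can be sketched
as follows. The helper name checked_reallocarray() is hypothetical and not
part of mandoc; it only mimics the documented behaviour of refusing a
request when nmemb * size would overflow.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): refuse the request
 * when nmemb * size would overflow size_t, instead of silently
 * allocating a buffer that is too small.
 */
static void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

On failure the original pointer is left untouched, so the caller can still
free(3) it.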


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
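
Why strlen+malloc+snprintf is error prone: the size computation and the
format string are maintained separately and can drift apart. A minimal
sketch of the alternative, assuming a portable helper built on
vsnprintf(3); the name format_alloc() is hypothetical and merely stands in
for asprintf(3), it is not the code used in post_lb().

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical portable stand-in for asprintf(3): measure the
 * formatted length first, then allocate exactly that much, so the
 * buffer size can never disagree with the format string.
 * Returns a malloc(3)ed string, or NULL on failure.
 */
static char *
format_alloc(const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int len;

	/* First pass: only compute the required length. */
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return NULL;

	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return NULL;

	/* Second pass: format into the correctly sized buffer. */
	va_start(ap, fmt);
	(void)vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return buf;
}
```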


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
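
The caching idea can be sketched roughly like this. The function name
default_os(), the buffer size, and the "UNKNOWN" fallback are assumptions
made for illustration; only the use of uname(3) comes from the commit
message.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical sketch: call uname(3) only once and hand out the
 * cached "sysname release" string on every later call.
 */
static const char *
default_os(void)
{
	static char	 buf[512];
	static int	 cached;
	struct utsname	 uts;

	if (cached)
		return buf;
	if (uname(&uts) == -1)
		(void)snprintf(buf, sizeof(buf), "UNKNOWN");
	else
		(void)snprintf(buf, sizeof(buf), "%s %s",
		    uts.sysname, uts.release);
	cached = 1;
	return buf;
}
```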


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.
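
A minimal sketch of the parsing rule described above, assuming a
hypothetical helper nr_apply() that is not taken from roff.c. It ignores
integer overflow for brevity.

```c
#include <ctype.h>

/*
 * Hypothetical helper: compute the new value of a number register
 * from its old value and the "value" argument of .nr.
 * A leading '+' or '-' requests an increment or decrement;
 * parsing stops right before the first non-digit character.
 */
static int
nr_apply(int oldval, const char *arg)
{
	int sign = 0, val = 0;

	if (*arg == '+' || *arg == '-')
		sign = (*arg++ == '+') ? 1 : -1;
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');
	return sign == 0 ? val : oldval + sign * val;
}
```

With this sketch, an argument of "12n" sets the register to 12, while "+3"
adds 3 to the previous value.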

Reported by bentley@ on discuss at mdocml.


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
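
The resulting .Bx formatting can be sketched as below. The helper name
format_bx() is hypothetical, and the literal "BSD" between the two
arguments is an assumption based on the documented output of .Bx (for
example "4.3BSD-Reno"), not something stated in this commit message.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical sketch of .Bx output: version, then "BSD", then,
 * if a second argument exists, a hyphen followed by that argument
 * with its first character converted to uppercase.
 */
static void
format_bx(char *buf, size_t bufsz, const char *version,
    const char *variant)
{
	if (variant == NULL || *variant == '\0') {
		(void)snprintf(buf, bufsz, "%sBSD", version);
		return;
	}
	(void)snprintf(buf, bufsz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```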


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
This allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset no longer defaults to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.
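
The three hyphen exclusions listed for revision 1.58 can be sketched as a
single predicate. This is only an illustration of the rule as worded in the
commit message, not the code that was committed; the function name
hyphen_is_breakable() and its signature are hypothetical, and the additional
restriction to free-form text is assumed to be checked by the caller.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of mandoc): return 1 if a line break
 * is permitted after the hyphen at word[i], 0 otherwise.
 */
int
hyphen_is_breakable(const char *word, size_t i)
{
	size_t len = strlen(word);

	/* Only an actual hyphen inside the word is a candidate. */
	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

A caller filling an output line would presumably scan the word from the
right and break at the last fitting position for which the predicate holds.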


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.
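
As a rough sketch of how the three levels described above could drive the
final outcome of a run: the names below (enum msg_level, struct run_state,
report(), may_produce_output(), exit_code()) are hypothetical and are not
the identifiers used in main.c, and the exit value 1 is an assumption, since
the message only says "non-zero".

```c
/* Hypothetical message levels, ordered by severity. */
enum msg_level {
	LEVEL_NONE = 0,
	LEVEL_WARNING,	/* clean up the input; output is mostly OK */
	LEVEL_ERROR,	/* output may be garbled; exit non-zero at the end */
	LEVEL_FATAL	/* no output can be produced from this file */
};

/* Hypothetical per-run state: remembers the worst level seen so far. */
struct run_state {
	enum msg_level worst;
};

/* Record a message; only the most severe level is kept. */
void
report(struct run_state *st, enum msg_level lvl)
{
	if (lvl > st->worst)
		st->worst = lvl;
}

/* FATAL is the only level that suppresses output entirely. */
int
may_produce_output(const struct run_state *st)
{
	return st->worst < LEVEL_FATAL;
}

/* WARNING alone does not cause an error exit; ERROR and FATAL do. */
int
exit_code(const struct run_state *st)
{
	return st->worst >= LEVEL_ERROR ? 1 : 0;
}
```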

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
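
The end-of-sentence rule stated for revision 1.43 amounts to a check on the
last character of an input line. The sketch below assumes exactly that rule
and nothing more; line_ends_sentence() is a hypothetical name, and the real
parser may treat trailing delimiters, macros, or whitespace differently.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of mandoc): return 1 if the line
 * ends in '.', '!' or '?', which revision 1.43 treats as an end
 * of sentence (EOS), and 0 otherwise.
 */
int
line_ends_sentence(const char *line)
{
	size_t len = strlen(line);

	if (len == 0)
		return 0;
	/* len > 0, so line[len - 1] is never the terminating NUL. */
	return strchr(".!?", line[len - 1]) != NULL;
}
```

Under this rule a sentence that ends in the middle of a source line is not
detected, which is why the "new sentence, new line" convention matters.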


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.304 18-Jul-2021 schwarze

Support auto-tagging for ".It Va".

This combination is somewhat rare because few libraries expose so many
global variables that they need a list to enumerate them, but when the
idiom does occur, tagging the variable names is generally useful.
For example, this helps awk(1), dc(1), make(1), rc.subr(8), ...

Missing feature reported and patch reviewed, tested, and OK'ed by kn@.


Revision tags: OPENBSD_6_9_BASE
# 1.303 30-Oct-2020 schwarze

Promote section headers that can can be used unmodified as fragment
identifiers from TAG_WEAK to TAG_STRONG,
such that for example ...#DESCRIPTION always works.
Suggested by Aman Verma on the discuss@ list.


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.302 26-Apr-2020 schwarze

While we do not recommend the idiom ".Fl Fl long" for long options
because it is an abuse of semantic macros for device-specific
presentational effects, this idiom is so widespread that it makes
sense to convert it to the recommended ".Fl \-long" during the
validation phase. For example, this improves HTML formatting
in pages where authors have used the dubious .Fl Fl.

Feature suggested by Steffen Nurpmeso <steffen at sdaoden dot eu>
on freebsd-hackers.


# 1.301 24-Apr-2020 schwarze

provide a STYLE message when mandoc knows the file name and the extension
disagrees with the section number given in the .Dt or .TH macro;
feature suggested and patch tested by jmc@


# 1.300 18-Apr-2020 schwarze

When a .Tg is attached to a paragraph, attach the permalink
to the first word, or the first few words if they are short.


# 1.299 08-Apr-2020 schwarze

Use a separate node->tag attribute rather than abusing the node->string
attribute for the purpose. No functional change intended.
The purpose is to make it possible to later attach tags to text nodes.


# 1.298 06-Apr-2020 schwarze

Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.
In HTML output, improve the logic for writing inside permalinks:
skip them when there is no child content or when there is a risk
that the children might contain flow content.


# 1.297 02-Apr-2020 schwarze

Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Suprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of thee main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
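
A heuristic of that kind might look like the following sketch. It is
not the code committed in this revision: the name demo_similar() and
the distance limit of 3 are assumptions made for illustration. It
walks both strings once, tolerating single-character substitutions,
insertions, and deletions, and needs no storage beyond two pointers
and a counter.

```c
#include <string.h>

/*
 * Hypothetical sketch, not the committed code: return 1 if the
 * strings differ by at most maxdist simple edits, 0 otherwise.
 */
int
demo_similar(const char *a, const char *b)
{
	const int	 maxdist = 3;	/* assumed limit */
	int		 dist = 0;

	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++dist > maxdist)
			return 0;
		if (a[1] == b[1]) {		/* substituted character */
			a++;
			b++;
		} else if (a[0] == b[1])	/* extra character in b */
			b++;
		else if (a[1] == b[0])		/* extra character in a */
			a++;
		else
			return 0;
	}
	/* Whatever is left over on either side counts as edits. */
	dist += (int)(strlen(a) + strlen(b));
	return dist <= maxdist;
}
```

Unlike a Levenshtein computation, this never backtracks and allocates
no table, at the price of missing some pairs that a full edit distance
would accept.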


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)
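
Passing a null pointer for a "%s" conversion is undefined behavior in
C, which is why such calls need a guard. The sketch below shows one
way to pick a printable fallback; struct demo_text_node and
demo_node_label() are hypothetical names, not the ones used in the
tree.

```c
#include <stddef.h>

/* Hypothetical node with an optional text string. */
struct demo_text_node {
	const char	*string;	/* NULL for macro nodes */
	const char	*macroname;	/* assumed to be always set */
};

/* Return something that is always safe to pass to "%s". */
const char *
demo_node_label(const struct demo_text_node *n)
{
	if (n == NULL)
		return "(none)";
	return n->string != NULL ? n->string : n->macroname;
}
```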


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
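
The three kinds of replacement can be sketched as follows, assuming a
stripped-down node type that only has a first-child and a next-sibling
pointer. The struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, stripped-down node; only the links matter here. */
struct demo_tree_node {
	struct demo_tree_node	*child;	/* first child, or NULL */
	struct demo_tree_node	*next;	/* next sibling, or NULL */
};

/* Formerly "nchild == 0": one pointer check. */
int
demo_has_no_child(const struct demo_tree_node *n)
{
	return n->child == NULL;
}

/* Formerly "nchild == 1": two pointer checks. */
int
demo_has_one_child(const struct demo_tree_node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* The rare case that needs the actual number: a tiny loop. */
int
demo_count_children(const struct demo_tree_node *n)
{
	const struct demo_tree_node	*c;
	int				 count = 0;

	for (c = n->child; c != NULL; c = c->next)
		count++;
	return count;
}
```

Because the answers are derived from the links themselves, there is no
separate counter that could get out of sync with the tree.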


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
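
The distinction can be illustrated with a small, made-up switch
statement; the enum and the function are hypothetical.

```c
/* Hypothetical token values, for illustration only. */
enum demo_tok { DEMO_A, DEMO_B, DEMO_C, DEMO_D };

int
demo_weight(enum demo_tok tok)
{
	int	 w = 0;

	switch (tok) {
	case DEMO_A:		/* adjacent labels: no comment needed */
	case DEMO_B:
		w += 1;
		break;
	case DEMO_C:
		w += 10;	/* code runs before falling through */
		/* FALLTHROUGH */
	case DEMO_D:
		w += 100;
		break;
	}
	return w;
}
```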


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.
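
A much simplified sketch of that walk, under the assumption that each
block only carries a parent pointer and a flag word, might read as
follows. The DEMO_* flags, struct demo_blk, and demo_rewind() are
hypothetical stand-ins and leave out everything the real parser does
when it closes a block.

```c
#include <stddef.h>

#define DEMO_BROKEN	(1 << 0)  /* still open when an ancestor ended */
#define DEMO_ENDED	(1 << 1)  /* end macro seen while a child was open */
#define DEMO_CLOSED	(1 << 2)  /* already rewound */

struct demo_blk {
	struct demo_blk	*parent;
	int		 flags;
};

/*
 * Close block n.  If n was broken, climb towards the root, closing
 * every ancestor that has already ended, and stop after the first
 * ancestor that is not itself broken.  Returns the number of blocks
 * closed.
 */
int
demo_rewind(struct demo_blk *n)
{
	int	 nclosed = 1;

	n->flags |= DEMO_CLOSED;
	while ((n->flags & DEMO_BROKEN) && n->parent != NULL) {
		n = n->parent;
		if ((n->flags & DEMO_ENDED) && !(n->flags & DEMO_CLOSED)) {
			n->flags |= DEMO_CLOSED;
			nclosed++;
		}
	}
	return nclosed;
}
```

For input such as ".Ao Bo Ac Bc", the inner .Bo block would be the
broken one and the outer .Ao block the ended one, so closing .Bo also
closes .Ao and then stops.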

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, making it possible to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
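
The rule about omitting meaningless positions could be sketched as
follows. The function name and the test "line greater than zero" are
assumptions; the output format mirrors the two messages quoted above.

```c
#include <stdio.h>

/*
 * Hypothetical formatter: include the position only when it is
 * meaningful, here assumed to mean a line number greater than zero.
 */
int
demo_format_msg(char *buf, size_t sz, const char *file,
    int line, int col, const char *level, const char *msg)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, msg);
}
```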


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
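
The overflow that reallocarray(3) protects against can be illustrated
with a stand-in written in portable C. The function below is a sketch
of the check only, named demo_reallocarray() to make clear that it is
not the libc implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Illustrative stand-in for reallocarray(3): refuse the request
 * if nmemb * size would overflow instead of silently wrapping.
 */
void *
demo_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

With a plain realloc(ptr, nmemb * size), a wrapped product would
yield a buffer far smaller than the caller expects.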


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
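
The parsing rule described in rev. 1.116 can be sketched as follows, assuming a simple integer register. The function name nr_apply() is hypothetical and does not appear in roff.c; the sketch only shows the two points from the message: the value stops at the first non-digit, and a leading sign turns the request into an increment or decrement of the old value.

```c
#include <ctype.h>

/*
 * Hypothetical illustration: return the new register value for
 * ".nr name val", given the old value.  No overflow handling.
 */
int
nr_apply(int old, const char *val)
{
	int sign = 0, n = 0;

	/* Optional sign: '+' increments, '-' decrements. */
	if (*val == '+' || *val == '-')
		sign = (*val++ == '-') ? -1 : 1;

	/* The value ends right before the first non-digit. */
	while (isdigit((unsigned char)*val))
		n = n * 10 + (*val++ - '0');

	return sign == 0 ? n : old + sign * n;
}
```

For example, with an old value of 5, the arguments "12", "+3", "-2" and "7n" would give 12, 8, 3 and 7 respectively under this sketch.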


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff doesn't print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
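
The three exclusion rules above can be sketched as a small predicate. The name hyphen_is_breakable() is hypothetical and is not taken from the mandoc sources; the sketch also leaves out the "free-form text only" condition, which depends on parser state not modelled here.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration: may the line be broken after the
 * hyphen at index i of word w?  Returns 1 if so, 0 otherwise.
 */
int
hyphen_is_breakable(const char *w, size_t i)
{
	size_t len = strlen(w);

	if (i >= len || w[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not next to another hyphen. */
	if (w[i - 1] == '-' || w[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (w[i - 1] == '\\')
		return 0;
	return 1;
}
```

Under this sketch, the hyphen in "break-at-hyphen" at index 5 qualifies, while the hyphens in "-flag", "a--b" and "a\-b" do not.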

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings by Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages
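
A minimal sketch of the string table technique, assuming a small made-up set of error codes: the enum members, messages, and function name below are hypothetical and are not taken from libmdoc.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; the real enum merr has other members. */
enum demo_err {
	DEMO_ENOARG = 0,
	DEMO_EBADLIST,
	DEMO_ENONAME,
	DEMO_EMAX
};

/* One message per code, in the same order as the enum. */
static const char *const demo_errmsg[DEMO_EMAX] = {
	"macro requires an argument",
	"conflicting list types",
	"missing NAME section"
};

/* Table lookup replaces a switch with one case per error code. */
const char *
demo_strerr(enum demo_err code)
{
	assert((int)code >= 0 && (int)code < DEMO_EMAX);
	return demo_errmsg[code];
}
```

Adding a new message then means adding one enum member and one table entry, with no control flow to touch.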


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.303 30-Oct-2020 schwarze

Promote section headers that can be used unmodified as fragment
identifiers from TAG_WEAK to TAG_STRONG,
such that for example ...#DESCRIPTION always works.
Suggested by Aman Verma on the discuss@ list.


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.302 26-Apr-2020 schwarze

While we do not recommend the idiom ".Fl Fl long" for long options
because it is an abuse of semantic macros for device-specific
presentational effects, this idiom is so widespread that it makes
sense to convert it to the recommended ".Fl \-long" during the
validation phase. For example, this improves HTML formatting
in pages where authors have used the dubious .Fl Fl.

Feature suggested by Steffen Nurpmeso <steffen at sdaoden dot eu>
on freebsd-hackers.


# 1.301 24-Apr-2020 schwarze

provide a STYLE message when mandoc knows the file name and the extension
disagrees with the section number given in the .Dt or .TH macro;
feature suggested and patch tested by jmc@


# 1.300 18-Apr-2020 schwarze

When a .Tg is attached to a paragraph, attach the permalink
to the first word, or the first few words if they are short.


# 1.299 08-Apr-2020 schwarze

Use a separate node->tag attribute rather than abusing the node->string
attribute for the purpose. No functional change intended.
The purpose is to make it possible to later attach tags to text nodes.


# 1.298 06-Apr-2020 schwarze

Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.
In HTML output, improve the logic for writing inside permalinks:
skip them when there is no child content or when there is a risk
that the children might contain flow content.


# 1.297 02-Apr-2020 schwarze

Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.
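
For illustration, a declaration using the attribute directly might look like the sketch below. The function names are hypothetical, and the example assumes a compiler that understands the GCC-style __attribute__ syntax (GCC or Clang).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical function, declared with the attribute spelled out. */
void demo_abort(const char *what) __attribute__((__noreturn__));

void
demo_abort(const char *what)
{
	assert(what != NULL);
	fprintf(stderr, "internal error in %s\n", what);
	abort();
}

int
demo_check(int ok)
{
	if (ok)
		return 1;
	demo_abort("demo_check");
	/* No return statement needed: demo_abort() never returns. */
}
```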

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
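
The pattern in question can be sketched like this, assuming a token range that does not start at zero. All names and values are hypothetical and do not match the real macro tables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token range that does not start at zero. */
enum demo_tok {
	DEMO_FIRST = 100,
	DEMO_Dd = 100,
	DEMO_Dt,
	DEMO_Os,
	DEMO_MAX
};

static const char *const demo_names[DEMO_MAX - DEMO_FIRST] = {
	"Dd", "Dt", "Os"
};

/*
 * Undefined even if the pointer is never dereferenced:
 *	const char *const *base = demo_names - DEMO_FIRST;
 *	name = base[tok];
 * Instead, subtract the offset from the index, such that every
 * pointer that gets formed stays inside the array.
 */
const char *
demo_name(enum demo_tok tok)
{
	assert(tok >= DEMO_FIRST && tok < DEMO_MAX);
	return demo_names[tok - DEMO_FIRST];
}
```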


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.
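
A sketch of a scan for "--" that cannot read past the terminating NUL, shown only to illustrate the bounds issue. The function is hypothetical and far simpler than the real check, which has to decide whether a given "--" is actually meant as an em-dash.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if the string contains "--".
 * Reading s[1] is safe because s[0] is known not to be NUL,
 * and the pointer advances by exactly one byte per iteration.
 */
int
demo_has_double_hyphen(const char *s)
{
	assert(s != NULL);
	for (; *s != '\0'; s++)
		if (s[0] == '-' && s[1] == '-')
			return 1;
	return 0;
}
```

The single-hyphen string "-" is the interesting case: the loop looks at the hyphen, sees NUL in s[1], advances once, and stops.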


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).
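
Passing NULL for a %s conversion is undefined behaviour in C. One common guard is sketched below; the helper names and the placeholder text are made up for this example and are not what mandoc prints.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: never hand NULL to a %s conversion. */
const char *
demo_safe_str(const char *s)
{
	return s == NULL ? "(null)" : s;
}

/* Example caller: a macro node has no string, so arg may be NULL. */
int
demo_report(const char *arg)
{
	return fprintf(stderr, "skipping argument: %s\n",
	    demo_safe_str(arg));
}
```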


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
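
The kind of replacement described above can be sketched as follows.
The struct is a hypothetical, cut-down node with only the two pointers
needed here, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down node; the real struct has many more members. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* "nchild == 0" becomes one pointer check. */
int
has_no_child(const struct node *n)
{
	assert(n != NULL);
	return n->child == NULL;
}

/* "nchild == 1" becomes two pointer checks. */
int
has_one_child(const struct node *n)
{
	assert(n != NULL);
	return n->child != NULL && n->child->next == NULL;
}

/* Only an exact count needs a tiny loop. */
size_t
count_children(const struct node *n)
{
	const struct node	*c;
	size_t			 cnt = 0;

	assert(n != NULL);
	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Because the answers are derived from the child list itself, there is no
separate counter that has to be kept in sync with it.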


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.
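
One reading of the rewinding rule above, as a rough sketch only: the
struct, the flag names and rewind_from() are hypothetical, only
block-level ancestors are modelled, and the ENDBODY case is left out.
This is not the parser code itself.

```c
#include <assert.h>
#include <stddef.h>

#define F_BROKEN	(1 << 0)	/* block was broken by another block's end */
#define F_ENDED		(1 << 1)	/* breaking block whose end was already seen */
#define F_CLOSED	(1 << 2)	/* set by this sketch once a block is rewound */

/* Hypothetical block node; only block-level ancestors are modelled. */
struct blk {
	struct blk	*parent;
	int		 flags;
};

/*
 * Rewind block n, then walk up through its ancestors without
 * any pending pointer.  Returns the number of blocks rewound.
 */
int
rewind_from(struct blk *n)
{
	int	 closed = 0;

	assert(n != NULL);
	for (;;) {
		n->flags |= F_CLOSED;
		closed++;
		/* Stop after processing the first block that is not broken. */
		if ((n->flags & F_BROKEN) == 0)
			break;
		/* Continue upwards only into an ancestor that already ended. */
		n = n->parent;
		if (n == NULL || (n->flags & F_ENDED) == 0)
			break;
	}
	return closed;
}
```

For input like ".Bo .Bl ... .Bc", where the list is still open when the
enclosing block ends, rewinding the list later also rewinds the already
ended enclosing block and then stops.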

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual
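
A minimal sketch of that formatting rule, assuming that a line number
of 0 stands for "no meaningful position"; format_msg() and its
parameters are hypothetical and not the function used by mandoc.

```c
#include <stdio.h>

/*
 * Format a diagnostic into buf and return the length that
 * snprintf(3) reports.  The "line:column" part is only included
 * when line is positive (an assumption of this sketch).
 */
int
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, msg);
}
```

With line set to 0, the result for /dev/null has the same shape as the
shorter message shown above.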

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
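
For readers unfamiliar with reallocarray(3), the overflow check it adds
over a plain realloc(ptr, nmemb * size) can be sketched like this.
checked_reallocarray() is an illustrative stand-in written for this
note, not the libc implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Illustrative stand-in for reallocarray(3): resize ptr to hold
 * nmemb elements of the given size, failing with ENOMEM instead
 * of letting the multiplication nmemb * size wrap around.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

With plain realloc, a wrapped product would yield a buffer that is much
smaller than the caller believes; here the call fails instead.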


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return values to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr request ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.
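
A minimal sketch of these parsing rules; apply_nr_value() is a
hypothetical name, and overflow handling is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Apply a ".nr"-style value string to the old register value.
 * A leading '+' or '-' requests increment or decrement; digits
 * are read up to, but not including, the first non-digit.
 * Overflow handling is omitted in this sketch.
 */
int
apply_nr_value(int old, const char *val)
{
	int	 sign = 0;	/* 0 means: plain assignment */
	int	 num = 0;

	assert(val != NULL);
	if (*val == '+' || *val == '-')
		sign = *val++ == '+' ? 1 : -1;
	while (isdigit((unsigned char)*val))
		num = num * 10 + (*val++ - '0');
	return sign == 0 ? num : old + sign * num;
}
```

For example, with an old value of 5, "3" sets the register to 3, "+3"
yields 8, and "12n" yields 12 because parsing stops at the "n".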

Reported by bentley@ on discuss at mdocml.


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.
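
A rough sketch of a register store that uses a plain linked list
instead of a hash table. reg_set() and reg_get() are hypothetical names
chosen for illustration; the real roff_setreg() and roff_getreg() have
different signatures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list entry for one number register. */
struct reg {
	struct reg	*next;
	char		*name;
	int		 val;
};

/* Set a register, creating it on first use; returns -1 if out of memory. */
int
reg_set(struct reg **head, const char *name, int val)
{
	struct reg	*r;
	size_t		 len;

	assert(head != NULL && name != NULL);
	for (r = *head; r != NULL; r = r->next) {
		if (strcmp(r->name, name) == 0) {
			r->val = val;
			return 0;
		}
	}
	if ((r = malloc(sizeof(*r))) == NULL)
		return -1;
	len = strlen(name) + 1;
	if ((r->name = malloc(len)) == NULL) {
		free(r);
		return -1;
	}
	memcpy(r->name, name, len);
	r->val = val;
	r->next = *head;
	*head = r;
	return 0;
}

/* Look up a register; return fallback if it was never set. */
int
reg_get(const struct reg *head, const char *name, int fallback)
{
	assert(name != NULL);
	for (; head != NULL; head = head->next)
		if (strcmp(head->name, name) == 0)
			return head->val;
	return fallback;
}
```

Lookup is linear in the number of registers, which is acceptable as
long as a document only sets a handful of them.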


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
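
A minimal sketch of the .Bx formatting described above. The function name `format_bx` is hypothetical, and the literal "BSD" between the two arguments is taken from the documented behaviour of .Bx in mdoc(7), not from this commit message.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/*
 * Format the arguments of .Bx into buf: the version (may be NULL),
 * the string "BSD", and, if a variant is given, a hyphen followed by
 * the variant with its first character converted to uppercase.
 * Returns the snprintf(3) result.
 */
int
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	assert(buf != NULL && sz > 0);
	if (version == NULL)
		version = "";
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, sz, "%sBSD", version);
	return snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```

In this sketch, a third or later argument is simply never passed in, which corresponds to the "not more than two arguments" rule.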


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.
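
The difference between the two layouts can be illustrated as follows. All struct, field, and function names here are hypothetical and chosen for illustration only; they are not the definitions from mdoc.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical macro-specific data; the fields are illustrative. */
struct bd_data { int type; int compact; };
struct bl_data { int type; int ncols; };

/* Old layout: the node embeds a union of pointers to structs. */
struct node_old {
	union {
		struct bd_data *bd;
		struct bl_data *bl;
	} data;
};

/* New layout: the node holds one pointer to a union of structs. */
union node_data {
	struct bd_data bd;
	struct bl_data bl;
};

struct node_new {
	union node_data *norm;	/* NULL when the macro has no data */
};

/*
 * Allocate zeroed macro data for a node.  One allocation of the
 * union size fits any macro, so allocation and freeing no longer
 * depend on the macro type; the price is that every allocation is
 * as large as the largest member.
 */
union node_data *
node_data_alloc(struct node_new *n)
{
	assert(n != NULL);
	n->norm = calloc(1, sizeof(*n->norm));
	return n->norm;
}
```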


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (no groffs prints them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invlid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
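
The three exclusion rules above can be written as a small predicate. This is a sketch only: the function name `hyphen_is_breakable` is hypothetical, and the restriction to free-form text is assumed to be checked by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the hyphen at word[i] is an acceptable break point
 * according to the rules listed above, 0 otherwise.
 */
int
hyphen_is_breakable(const char *word, size_t i)
{
	size_t len;

	assert(word != NULL);
	len = strlen(word);
	if (i >= len || word[i] != '-')
		return 0;
	if (i == 0 || i + 1 == len)	/* at beginning or end of word */
		return 0;
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;		/* next to another hyphen */
	if (word[i - 1] == '\\')
		return 0;		/* escaped: "\-" */
	return 1;
}
```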

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
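
The detection rule itself is simple enough to sketch in a few lines. The function name `line_ends_sentence` is hypothetical; inserting the EOS token into the AST and rendering it as a double space are not shown.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if the input line ends in one of '.', '!' or '?',
 * which is treated as an end of sentence, 0 otherwise.
 */
int
line_ends_sentence(const char *line)
{
	size_t len;

	assert(line != NULL);
	len = strlen(line);
	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

Because only the last character of the line is inspected, a sentence that ends in the middle of an input line is not detected, which is why the "new sentence, new line" rule matters.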


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.302 26-Apr-2020 schwarze

While we do not recommend the idiom ".Fl Fl long" for long options
because it is an abuse of semantic macros for device-specific
presentational effects, this idiom is so widespread that it makes
sense to convert it to the recommended ".Fl \-long" during the
validation phase. For example, this improves HTML formatting
in pages where authors have used the dubious .Fl Fl.

Feature suggested by Steffen Nurpmeso <steffen at sdaoden dot eu>
on freebsd-hackers.


# 1.301 24-Apr-2020 schwarze

provide a STYLE message when mandoc knows the file name and the extension
disagrees with the section number given in the .Dt or .TH macro;
feature suggested and patch tested by jmc@


# 1.300 18-Apr-2020 schwarze

When a .Tg is attached to a paragraph, attach the permalink
to the first word, or the first few words if they are short.


# 1.299 08-Apr-2020 schwarze

Use a separate node->tag attribute rather than abusing the node->string
attribute for the purpose. No functional change intended.
The purpose is to make it possible to later attach tags to text nodes.


# 1.298 06-Apr-2020 schwarze

Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.
In HTML output, improve the logic for writing inside permalinks:
skip them when there is no child content or when there is a risk
that the children might contain flow content.


# 1.297 02-Apr-2020 schwarze

Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.
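
A minimal sketch of the "semantically transparent" idea described above,
assuming a simplified node type. The names struct node,
node_is_transparent() and node_prev_opaque() are hypothetical and are
not taken from the mandoc sources; the token list is only a sample.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node type; not mandoc's struct roff_node. */
enum tok { TOK_TEXT, TOK_PP, TOK_SH, TOK_SM, TOK_TG, TOK_FT };

struct node {
	struct node	*prev;	/* previous sibling, or NULL */
	enum tok	 tok;
};

/* Return non-zero for nodes that carry no high-level meaning. */
static int
node_is_transparent(const struct node *n)
{
	switch (n->tok) {
	case TOK_SM:
	case TOK_TG:
	case TOK_FT:
		return 1;
	default:
		return 0;
	}
}

/* Find the closest previous sibling that is not transparent. */
static struct node *
node_prev_opaque(const struct node *n)
{
	struct node	*p;

	assert(n != NULL);
	for (p = n->prev; p != NULL; p = p->prev)
		if (!node_is_transparent(p))
			break;
	return p;
}
```

A validator that asks "what precedes this .Pp?" would then call
node_prev_opaque() rather than reading the prev pointer directly, so
that an intervening .Sm or .Tg does not hide a preceding .Sh.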


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
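
To illustrate the class of problem, here is a sketch with a made-up
token table; the enum values and the names tok_names and tok_name() are
assumptions for this example only. Instead of forming a base pointer
that lies before the array, the offset is subtracted from the index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token numbering that does not start at zero. */
enum { TOK_FIRST = 100, TOK_DD = 100, TOK_DT, TOK_OS, TOK_MAX };

static const char *const tok_names[TOK_MAX - TOK_FIRST] = {
	"Dd", "Dt", "Os"
};

/*
 * Undefined even if never dereferenced:
 *	const char *const *base = tok_names - TOK_FIRST;
 * Subtract the offset from the index instead, so that every
 * intermediate pointer stays inside the array.
 */
static const char *
tok_name(int tok)
{
	const char	*name;

	if (tok < TOK_FIRST || tok >= TOK_MAX)
		return NULL;
	name = tok_names[tok - TOK_FIRST];
	assert(name != NULL);	/* the sample table has no gaps */
	return name;
}
```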


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
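
One possible heuristic of this kind, written as a sketch and not
necessarily the one committed here: walk both strings once, tolerate a
small number of replacements, transpositions, and single-character
insertions or deletions, and use no memory beyond a counter. The
function name similar() and the limit of 3 are assumptions.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if s1 and s2 differ by at most a few simple typos. */
static int
similar(const char *s1, const char *s2)
{
	const int	maxdist = 3;	/* assumed tolerance */
	int		dist = 0;

	assert(s1 != NULL && s2 != NULL);
	while (*s1 != '\0' && *s2 != '\0') {
		if (*s1 == *s2) {
			s1++;
			s2++;
			continue;
		}
		if (++dist > maxdist)
			return 0;
		if (s1[1] == s2[0] && s1[0] == s2[1]) {
			s1 += 2;	/* transposition */
			s2 += 2;
		} else if (s1[1] == s2[1]) {
			s1++;		/* replacement */
			s2++;
		} else if (s1[0] == s2[1]) {
			s2++;		/* extra character in s2 */
		} else if (s1[1] == s2[0]) {
			s1++;		/* extra character in s1 */
		} else
			return 0;
	}
	/* Whatever remains of either string counts as typos, too. */
	dist += (int)(strlen(s1) + strlen(s2));
	return dist <= maxdist;
}
```

Unlike a Levenshtein computation, this never backtracks and allocates
no table, at the price of missing some pairs that a full edit distance
would consider close.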


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
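
The replacement of a child counter by pointer checks might look as
follows. This is a sketch with a reduced, hypothetical struct node; the
helper names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical node type without a child counter. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* Former "nchild == 0": one pointer check. */
static int
has_no_child(const struct node *n)
{
	assert(n != NULL);
	return n->child == NULL;
}

/* Former "nchild == 1": two pointer checks. */
static int
has_one_child(const struct node *n)
{
	assert(n != NULL);
	return n->child != NULL && n->child->next == NULL;
}

/* The rare case that needs the actual number: a tiny loop. */
static int
count_children(const struct node *n)
{
	const struct node	*c;
	int			 cnt = 0;

	assert(n != NULL);
	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Because the count is derived from the links on demand, it can no longer
disagree with them, which is the implicit invariant mentioned above.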


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
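
The ancestor walk described in the second paragraph can be sketched
roughly like this. The struct blk type, the BLK_* flags, and
rewind_ancestors() are simplified stand-ins invented for the example;
the real parser does considerably more work when it closes a block, and
the choice to start at the parent rather than at the broken block
itself is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define BLK_BROKEN	(1 << 0)  /* an enclosing block ended too early */
#define BLK_ENDED	(1 << 1)  /* end seen while a child was open */

struct blk {
	struct blk	*parent;
	int		 flags;
	int		 closed;	/* set once the block is rewound */
};

/*
 * Walk from a broken block towards the root, closing every
 * ancestor that is marked as ended, and stop after handling the
 * first ancestor that is not itself broken.
 * Returns the number of ancestors closed.
 */
static int
rewind_ancestors(struct blk *broken)
{
	struct blk	*n;
	int		 nclosed = 0;

	assert(broken != NULL);
	for (n = broken->parent; n != NULL; n = n->parent) {
		if (n->flags & BLK_ENDED) {
			n->closed = 1;
			nclosed++;
		}
		if ((n->flags & BLK_BROKEN) == 0)
			break;
	}
	return nclosed;
}
```

No per-node "pending" pointer is needed: the two flags plus the
existing parent links carry all the information the walk requires.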


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
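
The position rule from the first paragraph above can be sketched as
follows. This is only an illustration, not the mandoc code: the helper
name format_msg() and its parameters are hypothetical, and it assumes
that a line number of zero marks a position as meaningless.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format a diagnostic, including the position
 * only when the line number is meaningful (greater than zero).
 */
char *
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	if (line > 0)
		(void)snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	else
		(void)snprintf(buf, sz, "mandoc: %s: %s: %s",
		    file, level, msg);
	return buf;
}
```

Called with line 0 for /dev/null, this yields the shorter form quoted
above, without the confusing ":0:1".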


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetic).
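
For readers unfamiliar with why reallocarray(3) is safer than a bare
malloc(nmemb * size), here is a minimal sketch of the overflow check
involved. The function checked_reallocarray() is a hypothetical
stand-in written in standard C, not the libc implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): resize ptr to hold
 * nmemb elements of the given size, but fail with NULL instead of
 * letting nmemb * size wrap around.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	/* nmemb * size overflows exactly when nmemb > SIZE_MAX / size. */
	if (size != 0 && nmemb > SIZE_MAX / size)
		return NULL;
	return realloc(ptr, nmemb * size);
}
```

Without the check, an oversized element count would silently wrap and
produce a buffer far smaller than the caller expects.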


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
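
As an illustration of the idea, the sketch below shows what an
asprintf(3)-style call does for the caller: the length computation,
the allocation, and the formatting happen in one place, so they cannot
get out of sync. The function format_alloc() is a hypothetical,
portable stand-in built on vsnprintf(3); it is not the code of
post_lb().

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical portable stand-in for asprintf(3): format into a
 * freshly allocated buffer of exactly the right size.
 * Returns NULL on failure; the caller frees the result.
 */
char *
format_alloc(const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int len;

	/* First pass: let vsnprintf() compute the required length. */
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return NULL;

	if ((buf = malloc((size_t)len + 1)) == NULL)
		return NULL;

	/* Second pass: the buffer is known to be large enough. */
	va_start(ap, fmt);
	(void)vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return buf;
}
```

With the hand-written strlen+malloc+snprintf sequence, every change to
the format string requires a matching change to the size arithmetic;
here there is no size arithmetic left to get wrong.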


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
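
The caching technique can be sketched like this. The function name
default_os(), the buffer size, and the "UNKNOWN" fallback are
assumptions made for the example and are not taken from the mandoc
sources.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: return "sysname release" for the running
 * system, calling uname(3) only on the first invocation.
 */
const char *
default_os(void)
{
	static char cached[512];
	static int valid;
	struct utsname utsname;

	if (!valid) {
		/* Assumed fallback when the system call fails. */
		if (uname(&utsname) == -1)
			return "UNKNOWN";
		(void)snprintf(cached, sizeof(cached), "%s %s",
		    utsname.sysname, utsname.release);
		valid = 1;
	}
	return cached;
}
```

Every call after the first returns the same buffer without entering
the kernel, which matters when thousands of manuals are processed in
one run.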


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr request ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
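
A minimal sketch of the parsing rule described above, assuming plain
int registers and ignoring arithmetic overflow. The helper nr_apply()
is hypothetical and is not the function used in roff.c.

```c
#include <ctype.h>

/*
 * Hypothetical helper: compute the new value of a number register
 * from its old value and the "value" argument of .nr.
 */
int
nr_apply(int oldval, const char *arg)
{
	int sign = 0, val = 0;

	/* An optional leading sign requests increment or decrement. */
	if (*arg == '+' || *arg == '-')
		sign = (*arg++ == '+') ? 1 : -1;

	/* The value ends right before the first non-digit character. */
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');

	return sign == 0 ? val : oldval + sign * val;
}
```

For a register currently holding 5, the argument "12abc" sets it to
12, "+3" raises it to 8, and "-2n" lowers it to 3.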


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
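
The resulting .Bx formatting can be sketched as follows. The helper
format_bx() is hypothetical, and the handling of a missing version or
variant is an assumption of this example rather than a description of
the validator.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper: render ".Bx version variant" into buf.
 * Either argument may be NULL when it is absent (assumed behaviour).
 */
char *
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	if (version == NULL)
		version = "";
	if (variant == NULL || *variant == '\0') {
		(void)snprintf(buf, sz, "%sBSD", version);
		return buf;
	}
	/* Uppercase the first character and join with a hyphen. */
	(void)snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
	return buf;
}
```

With this rule, the arguments "4.3" and "reno" come out as
"4.3BSD-Reno".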


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@'s improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.
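
The three hyphen rules listed in revision 1.58 can be illustrated with a
minimal sketch. The function name hyphen_breakable() and its signature are
assumptions made for this illustration only; this is not the code that
kristaps@ committed, and it ignores the "free-form text only" condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not mandoc's real code: decide whether the
 * hyphen at word[i] is an acceptable place to break the output line.
 */
static bool
hyphen_breakable(const char *word, size_t i)
{
	size_t len;

	assert(word != NULL);
	len = strlen(word);

	if (i >= len || word[i] != '-')
		return false;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return false;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return false;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return false;
	return true;
}
```

A caller would scan a word from the right and break at the last index for
which the helper returns true and the prefix still fits on the line.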


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.
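
As a rough sketch of the three message levels described in revision 1.55,
the following shows how the worst level seen could drive both the decision
to produce output and the exit status. All identifiers here are
hypothetical; mandoc's own enums, function names, and exit codes may well
differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; mandoc's own enums and exit codes may differ. */
enum msg_level {
	LEVEL_OK = 0,
	LEVEL_WARNING,
	LEVEL_ERROR,
	LEVEL_FATAL
};

/* Track the worst level seen so far while parsing one file. */
static enum msg_level
level_worst(enum msg_level seen, enum msg_level now)
{
	return now > seen ? now : seen;
}

/* FATAL: no output can be produced from this input file. */
static bool
can_produce_output(enum msg_level worst)
{
	assert(worst <= LEVEL_FATAL);
	return worst < LEVEL_FATAL;
}

/* WARNING alone does not cause an error exit; ERROR and FATAL do. */
static bool
exits_nonzero(enum msg_level worst)
{
	assert(worst <= LEVEL_FATAL);
	return worst >= LEVEL_ERROR;
}
```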


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
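
The end-of-sentence rule from revision 1.43 can be sketched as follows.
The function name line_ends_sentence() is an assumption for illustration;
the sketch only tests the last character of the line and does not model
trailing whitespace, macro lines, or the insertion of a token into the AST.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: report whether an input line ends in one of
 * the characters ".!?", which revision 1.43 treats as an end of
 * sentence (EOS).
 */
static bool
line_ends_sentence(const char *line)
{
	size_t len;

	assert(line != NULL);
	len = strlen(line);
	if (len == 0)
		return false;
	/* len > 0, so the last character is never the terminating NUL. */
	return strchr(".!?", line[len - 1]) != NULL;
}
```

This also shows why "new sentence, new line" matters: a sentence that ends
in the middle of an input line is invisible to a check of this kind.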


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.301 24-Apr-2020 schwarze

provide a STYLE message when mandoc knows the file name and the extension
disagrees with the section number given in the .Dt or .TH macro;
feature suggested and patch tested by jmc@


# 1.300 18-Apr-2020 schwarze

When a .Tg is attached to a paragraph, attach the permalink
to the first word, or the first few words if they are short.


# 1.299 08-Apr-2020 schwarze

Use a separate node->tag attribute rather than abusing the node->string
attribute for the purpose. No functional change intended.
The purpose is to make it possible to later attach tags to text nodes.


# 1.298 06-Apr-2020 schwarze

Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.
In HTML output, improve the logic for writing inside permalinks:
skip them when there is no child content or when there is a risk
that the children might contain flow content.


# 1.297 02-Apr-2020 schwarze

Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
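
The up-front translation described in revision 1.278 amounts to a small
lookup table. The sketch below works on macro names as strings and uses a
hypothetical function canonical_macro(); the real validators presumably
operate on token enums inside syntax tree nodes rather than on strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: map an obsolete macro alias to its standard
 * form, or return the name unchanged if it is not an alias.
 */
static const char *
canonical_macro(const char *name)
{
	static const struct {
		const char *alias;
		const char *standard;
	} map[] = {
		{ "Lp", "Pp" },	/* mdoc(7) */
		{ "Ot", "Ft" },	/* mdoc(7) */
		{ "LP", "PP" },	/* man(7) */
		{ "P",  "PP" },	/* man(7) */
	};
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++)
		if (strcmp(name, map[i].alias) == 0)
			return map[i].standard;
	return name;
}
```

Doing this once before validation means every later check only needs to
compare against the standard name.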


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
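
The class of problem fixed in revision 1.276 can be sketched like this.
The token numbering and table below are invented for illustration and are
not taken from mandoc; the point is only that the offset is subtracted
from the integer index, never from the array pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token numbering: the table only covers FIRST..LAST. */
enum {
	TOK_FIRST = 100,
	TOK_A = TOK_FIRST,
	TOK_B,
	TOK_C,
	TOK_LAST = TOK_C
};

static const char *const tok_names[TOK_LAST - TOK_FIRST + 1] = {
	"A", "B", "C"
};

/*
 * Undefined behaviour even if the pointer is never dereferenced,
 * because it points before the start of the array:
 *
 *	const char *const *base = tok_names - TOK_FIRST;
 *	name = base[tok];
 *
 * Subtracting on the integer index keeps all pointer arithmetic
 * inside the array.
 */
static const char *
tok_name(int tok)
{
	assert(tok >= TOK_FIRST && tok <= TOK_LAST);
	return tok_names[tok - TOK_FIRST];
}
```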


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
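
The log does not spell out the heuristic. The following is only a
sketch of what a single-pass comparison without any allocation could
look like; the function roughly_equal() and the limit MAXDIST are
assumptions made for this example, not the code that was committed.

```c
#include <stddef.h>

#define MAXDIST 2	/* assumed tolerance, chosen for the example */

/*
 * Return 1 if s1 and s2 differ by at most MAXDIST single-character
 * edits that a one-character lookahead can detect, else 0.
 * One pass over both strings, no allocation.
 */
int
roughly_equal(const char *s1, const char *s2)
{
	int dist = 0;

	while (*s1 != '\0' && *s2 != '\0') {
		if (*s1 == *s2) {
			s1++;
			s2++;
			continue;
		}
		if (++dist > MAXDIST)
			return 0;
		if (s1[1] == *s2)		/* extra character in s1 */
			s1++;
		else if (*s1 == s2[1])		/* extra character in s2 */
			s2++;
		else {				/* substitution */
			s1++;
			s2++;
		}
	}
	/* Characters left over in the longer string count as edits. */
	for (; *s1 != '\0'; s1++)
		dist++;
	for (; *s2 != '\0'; s2++)
		dist++;
	return dist <= MAXDIST;
}
```

Unlike a Levenshtein metric, this greedy comparison can misjudge
some pairs, which is the trade-off the entry above accepts in
exchange for simplicity and speed.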


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@
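
Passing a null pointer to a %s conversion is undefined behaviour in
C, even though some C libraries happen to print "(null)". A minimal
sketch of one defensive pattern follows; the helpers str_or() and
format_diag() are hypothetical and are not the fix that was actually
committed.

```c
#include <stdio.h>

/* Hypothetical helper: substitute a placeholder for a missing string. */
const char *
str_or(const char *s, const char *fallback)
{
	return s != NULL ? s : fallback;
}

/* Format a diagnostic without ever handing NULL to a %s conversion. */
int
format_diag(char *buf, size_t sz, const char *macro, const char *arg)
{
	return snprintf(buf, sz, "%s: %s",
	    str_or(macro, "(text)"), str_or(arg, "(none)"));
}
```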


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
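
How a stored child counter can be replaced by pointer checks may be
sketched as follows. The struct is a cut-down, hypothetical stand-in
for struct roff_node that keeps only the two links needed here, and
the function names are invented for the example.

```c
#include <stddef.h>

/* Hypothetical node with only the links that matter here. */
struct node {
	struct node *child;	/* first child, or NULL */
	struct node *next;	/* next sibling, or NULL */
};

/* "nchild == 0" becomes a single pointer check. */
int
has_no_children(const struct node *n)
{
	return n->child == NULL;
}

/* "nchild == 1" becomes two pointer checks. */
int
has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* An actual count needs a tiny loop over the sibling list. */
size_t
count_children(const struct node *n)
{
	const struct node *c;
	size_t cnt = 0;

	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Because nothing is stored, there is no counter that could get out
of sync with the actual list of children.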


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
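
The distinction can be illustrated with a small, made-up switch
statement; the function indent_for() and its token numbers are
hypothetical and exist only to show both situations.

```c
/* Hypothetical example, not taken from mandoc. */
int
indent_for(int tok)
{
	int width = 0;

	switch (tok) {
	case 1:
	case 2:			/* adjacent labels: no comment needed */
		width = 4;
		break;
	case 3:
		width = 2;	/* code runs before falling through */
		/* FALLTHROUGH */
	case 4:
		width += 6;
		break;
	default:
		break;
	}
	return width;
}
```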


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.
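
The walk described in this paragraph can be sketched roughly as
follows. The struct, the flag names, and the F_CLOSED marker are
hypothetical simplifications made for this example; the real parser
does considerably more work when it rewinds a block.

```c
#include <stddef.h>

#define F_BROKEN 0x1	/* block was broken by another block */
#define F_ENDED  0x2	/* breaking block that already saw its end macro */
#define F_CLOSED 0x4	/* set by this sketch when a block is rewound */

/* Hypothetical block with only a parent link and flags. */
struct block {
	struct block *parent;
	int flags;
};

/*
 * Rewind block b.  If it is broken, walk up its ancestors, rewind
 * every one that is ENDED, and stop after handling the first
 * ancestor that is not BROKEN.
 */
void
rewind_from(struct block *b)
{
	struct block *n;

	b->flags |= F_CLOSED;
	if ((b->flags & F_BROKEN) == 0)
		return;
	for (n = b->parent; n != NULL; n = n->parent) {
		if (n->flags & F_ENDED)
			n->flags |= F_CLOSED;
		if ((n->flags & F_BROKEN) == 0)
			break;
	}
}
```

Since only the parent links and two flags are consulted, no
separate pointer has to be kept consistent while blocks are opened
and closed.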

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, making it possible to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
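
A minimal sketch of the message layout shown above, under the assumption
that a line number of 0 marks a position that is not meaningful.
format_msg() and its parameters are hypothetical and do not reflect the
actual mandoc message functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical formatter, not mandoc's actual code.
 * Assumption: line == 0 means "no meaningful position".
 * Returns the length of the formatted message, as snprintf(3) does.
 */
int
format_msg(char *buf, size_t bufsz, const char *file,
    int line, int col, const char *level, const char *msg)
{
	assert(buf != NULL && file != NULL && level != NULL && msg != NULL);
	if (line > 0)
		return snprintf(buf, bufsz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	/* No meaningful position: leave out line and column. */
	return snprintf(buf, bufsz, "mandoc: %s: %s: %s", file, level, msg);
}
```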


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
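
To illustrate why reallocarray(3) is safe from overflows, here is a
portable sketch of the kind of check it performs before multiplying.
checked_reallocarray() is a hypothetical stand-in written for this
example; it is neither the libc implementation nor code from mandoc.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in mimicking the overflow check of
 * reallocarray(3); not the libc implementation.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;	/* nmemb * size would wrap around */
	}
	return realloc(ptr, nmemb * size);
}
```

A plain realloc(ptr, nmemb * size) would silently allocate a buffer that
is too small when the product wraps around; the check turns that case
into an ordinary allocation failure.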


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
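
A rough sketch of the resulting .Bx text, assuming the formatter joins the
version, the string "BSD", a hyphen, and the capitalized second argument.
bx_format() is a hypothetical helper; in mandoc the work is split between
the validator and the formatters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper, not mandoc's actual code.
 * Builds the text for ".Bx version [variant]" and returns its length.
 */
int
bx_format(char *buf, size_t bufsz, const char *version, const char *variant)
{
	assert(buf != NULL && version != NULL);
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, bufsz, "%sBSD", version);
	/* Uppercase the first character and append with a hyphen. */
	return snprintf(buf, bufsz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```

Under these assumptions, ".Bx 4.3 reno" yields "4.3BSD-Reno".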


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
This allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
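
The three exclusions listed above can be sketched as a single predicate.
hyphen_is_breakable() is a hypothetical helper, not the mandoc code, and
it does not model the restriction to free-form text.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not mandoc's actual code.
 * Reports whether the line may be broken at the hyphen word[i].
 */
bool
hyphen_is_breakable(const char *word, size_t i)
{
	size_t len;

	assert(word != NULL);
	len = strlen(word);
	if (i >= len || word[i] != '-')
		return false;	/* not a hyphen at all */
	if (i == 0 || i == len - 1)
		return false;	/* at the beginning or end of the word */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return false;	/* next to another hyphen */
	if (word[i - 1] == '\\')
		return false;	/* escaped by a preceding backslash */
	return true;
}
```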

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
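
As a rough illustration of the rule described above (not the code that
was committed), a line can be classified as ending a sentence by looking
at its last character. The function name line_ends_sentence() is
hypothetical, and the sketch assumes the caller has already stripped the
trailing newline; it ignores closing delimiters such as quotes or
parentheses.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from mandoc:
 * return 1 if the line ends in one of ".!?", 0 otherwise.
 * Trailing blanks are not skipped.
 */
static int
line_ends_sentence(const char *line)
{
	size_t	 len;

	len = strlen(line);
	if (len == 0)
		return 0;
	/* len > 0, so the last character is never the terminator. */
	return strchr(".!?", line[len - 1]) != NULL;
}
```

A parser following this idea would call such a check once per text line
and record the result on the node it creates, leaving the choice of
spacing to the formatter.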


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/
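
A minimal sketch of what a malloc frontend that errors out on failure
might look like, using only ANSI C facilities as the portability notes
above suggest. The name xmalloc() is an assumption made for this
example and is not claimed to be the name used in the tree; callers are
assumed to request a non-zero size.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: never returns NULL.
 * On allocation failure, report the error and exit.
 * Assumes size > 0, because malloc(0) may legitimately return NULL.
 */
static void *
xmalloc(size_t size)
{
	void	*ptr;

	ptr = malloc(size);
	if (ptr == NULL) {
		perror("malloc");
		exit(EXIT_FAILURE);
	}
	return ptr;
}
```

With such a wrapper, call sites no longer need their own NULL checks,
which is where the simplification comes from.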


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages
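
The string-table technique mentioned above can be sketched as follows.
The enum members, the messages, and the names demo_err, demo_errmsg and
demo_strerr() are all invented for illustration; the real enum merr has
different members.

```c
#include <stddef.h>

/* Hypothetical error codes, used directly as table indices. */
enum demo_err {
	DEMO_ERR_NOARGS = 0,
	DEMO_ERR_BADSEC,
	DEMO_ERR_MAX
};

/* One message per code, in the same order as the enum. */
static const char *const demo_errmsg[DEMO_ERR_MAX] = {
	"macro requires arguments",	/* DEMO_ERR_NOARGS */
	"invalid manual section"	/* DEMO_ERR_BADSEC */
};

/* Look up the message; fall back for out-of-range codes. */
static const char *
demo_strerr(enum demo_err code)
{
	if ((int)code < 0 || (int)code >= DEMO_ERR_MAX)
		return "unknown error";
	return demo_errmsg[code];
}
```

Compared with a switch statement, adding a message then means adding one
enum member and one table entry, and the lookup itself never changes.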


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.300 18-Apr-2020 schwarze

When a .Tg is attached to a paragraph, attach the permalink
to the first word, or the first few words if they are short.


# 1.299 08-Apr-2020 schwarze

Use a separate node->tag attribute rather than abusing the node->string
attribute for the purpose. No functional change intended.
The purpose is to make it possible to later attach tags to text nodes.


# 1.298 06-Apr-2020 schwarze

Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.
In HTML output, improve the logic for writing inside permalinks:
skip them when there is no child content or when there is a risk
that the children might contain flow content.


# 1.297 02-Apr-2020 schwarze

Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.
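
To illustrate the idea of skipping semantically transparent nodes, here
is a small sketch over a simplified sibling list. The struct demo_node,
its "transparent" member and demo_next_visible() are assumptions made
for this example; mandoc's struct roff_node is richer and may represent
the property differently.

```c
#include <stddef.h>

/* Hypothetical, simplified syntax tree node. */
struct demo_node {
	struct demo_node	*prev;
	struct demo_node	*next;
	int			 transparent;	/* e.g. .Sm, .Tg, .ft */
	const char		*name;
};

/* Return the closest following sibling that is not transparent. */
static struct demo_node *
demo_next_visible(struct demo_node *n)
{
	for (n = n->next; n != NULL; n = n->next)
		if (!n->transparent)
			return n;
	return NULL;
}
```

A validator asking "what follows this paragraph macro?" would use such a
helper instead of reading the next pointer directly, so that an
intervening .Tg or .Sm does not hide the following section header.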


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
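
The class of problem described above can be sketched like this. The
token numbering and the names TOK_FIRST, TOK_LAST, tok_names and
tok_name() are hypothetical; the point is only that the offset is
subtracted from the index rather than from the array pointer.

```c
#include <stddef.h>

/* Hypothetical token numbering: the table covers TOK_FIRST..TOK_LAST. */
enum { TOK_FIRST = 100, TOK_LAST = 102 };

static const char *const tok_names[TOK_LAST - TOK_FIRST + 1] = {
	"Dd", "Dt", "Os"
};

/*
 * Undefined, even if the pointer is never dereferenced out of range:
 *	const char *const *base = tok_names - TOK_FIRST;
 *	return base[tok];
 * Well-defined: keep the pointer inside the array and adjust the index.
 */
static const char *
tok_name(int tok)
{
	if (tok < TOK_FIRST || tok > TOK_LAST)
		return NULL;
	return tok_names[tok - TOK_FIRST];
}
```

Both variants usually compile to the same machine code, but only the
second one stays within what the C standard defines for pointer
arithmetic.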


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
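
For readers wondering what a linear time, zero space heuristic of this
kind can look like, here is one possible sketch. It is written for this
note and is not claimed to be the committed code; the name
roughly_equal() and the threshold MAXDIST are assumptions. It walks both
strings once and tolerates a few local differences.

```c
#include <string.h>

/* Hypothetical threshold: how many differences still count as a typo. */
#define MAXDIST 3

/*
 * Return 1 if s1 and s2 differ by at most MAXDIST local edits
 * (replacement, transposition, one extra character), 0 otherwise.
 * Each iteration advances at least one pointer, so the running time
 * is linear, and no memory is allocated.
 */
static int
roughly_equal(const char *s1, const char *s2)
{
	int	 dist = 0;

	while (*s1 != '\0' && *s2 != '\0') {
		if (*s1 == *s2) {
			s1++;
			s2++;
			continue;
		}
		if (++dist > MAXDIST)
			return 0;
		if (s1[1] == s2[0] && s1[0] == s2[1]) {
			s1 += 2;	/* transposition */
			s2 += 2;
		} else if (s1[1] == s2[1]) {
			s1++;		/* replacement */
			s2++;
		} else if (s1[0] == s2[1]) {
			s2++;		/* extra character in s2 */
		} else if (s1[1] == s2[0]) {
			s1++;		/* extra character in s1 */
		} else
			return 0;
	}
	/* Whatever remains in either string counts as a difference. */
	dist += (int)(strlen(s1) + strlen(s2));
	return dist <= MAXDIST;
}
```

Unlike a Levenshtein computation, this only looks one character ahead,
so it can miss some near matches, which is an acceptable trade-off when
the goal is merely to suggest a likely intended section name.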


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@
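
A minimal sketch of the usual remedy for this class of bug, not the
actual patch: route every possibly-NULL string through a helper before
it reaches a %s conversion. The names str_or() and describe_arg() and
the "(none)" fallback are hypothetical and do not appear in mandoc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: never hand a NULL pointer to a %s conversion. */
const char *
str_or(const char *s, const char *fallback)
{
	return s != NULL ? s : fallback;
}

/*
 * Format a short description of a macro argument.  The argument may
 * be NULL, for example when the child node is a macro, not text.
 */
int
describe_arg(char *buf, size_t sz, const char *macro, const char *arg)
{
	assert(buf != NULL && macro != NULL);
	return snprintf(buf, sz, "%s %s", macro, str_or(arg, "(none)"));
}
```

Passing NULL to %s is undefined behaviour in C, so the substitution has
to happen before the call rather than relying on the C library.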


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
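
The replacement pattern can be sketched roughly as follows. The struct
is a simplified stand-in with only the two pointers that matter here,
not the real struct roff_node, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a syntax tree node. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* Formerly "nchild == 0": one pointer check. */
int
has_no_child(const struct node *n)
{
	return n->child == NULL;
}

/* Formerly "nchild == 1": two pointer checks. */
int
has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* The rare case that needs the number itself: a tiny loop. */
int
count_children(const struct node *n)
{
	const struct node	*c;
	int			 i;

	assert(n != NULL);
	i = 0;
	for (c = n->child; c != NULL; c = c->next)
		i++;
	return i;
}
```

Without a stored counter, there is nothing that can get out of sync
with the child list when nodes are linked or unlinked.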


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
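
To illustrate the rule with a made-up example (the enum and the
function classify() are hypothetical, not taken from this file):
adjacent case labels carry no comment, while a case that executes code
before dropping into the next one keeps it.

```c
#include <assert.h>

enum tok { TOK_PP, TOK_LP, TOK_SH, TOK_SS, TOK_OTHER };

int
classify(enum tok t)
{
	int	 score;

	assert(t <= TOK_OTHER);
	score = 0;
	switch (t) {
	case TOK_PP:		/* adjacent labels: no comment needed */
	case TOK_LP:
		return 1;
	case TOK_SH:
		score += 10;	/* code runs before falling through */
		/* FALLTHROUGH */
	case TOK_SS:
		score += 2;
		return score;
	default:
		return 0;
	}
}
```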


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
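
One loose reading of the rewinding rule described above, as a sketch
only. The struct, the flag names F_BROKEN, F_ENDED and F_CLOSED, and
the function rewind_pending() are all hypothetical simplifications;
the real parser tracks much more state, and "rewinding" is reduced
here to setting a flag.

```c
#include <assert.h>
#include <stddef.h>

#define F_BROKEN (1 << 0)	/* stand-in for MDOC_BROKEN */
#define F_ENDED  (1 << 1)	/* stand-in for MDOC_ENDED */
#define F_CLOSED (1 << 2)	/* sketch only: block has been rewound */

struct blk {
	struct blk	*parent;
	int		 flags;
};

/*
 * Walk from a broken block towards the root.  Close every block that
 * is marked as ended, and stop after handling the first block that is
 * not marked as broken.  Returns the number of blocks closed.
 */
int
rewind_pending(struct blk *n)
{
	int	 closed;

	assert(n != NULL);
	closed = 0;
	for (; n != NULL; n = n->parent) {
		if (n->flags & F_ENDED) {
			n->flags |= F_CLOSED;
			closed++;
		}
		if ((n->flags & F_BROKEN) == 0)
			break;
	}
	return closed;
}
```

The point of the sketch is that the parent chain plus two flag bits are
enough; no separate pointer has to be kept consistent.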


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
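
A minimal sketch of the message format change, assuming that a line
number of zero means "no meaningful position". The function
format_msg() and its parameters are hypothetical; only the two output
strings are taken from the example above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a diagnostic.  The ":line:col" part is included only when
 * the message refers to an actual position in the input file.
 */
int
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	assert(buf != NULL && file != NULL);
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, msg);
}
```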


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
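
Why reallocarray(3) is safer than realloc(p, n * size) can be sketched
with a portable stand-in that follows the same contract. The names
sketch_reallocarray() and grow() are hypothetical; on systems that
provide reallocarray(3), the real function would be used instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Stand-in with the contract of reallocarray(3): fail with ENOMEM
 * instead of letting nmemb * size wrap around.
 */
void *
sketch_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}

/*
 * Grow an int array to hold at least "need" elements.
 * Returns 1 on success; on failure, returns 0 and leaves
 * the old array and capacity untouched.
 */
int
grow(int **arr, size_t *cap, size_t need)
{
	int	*p;

	assert(arr != NULL && cap != NULL);
	if (need <= *cap)
		return 1;
	p = sketch_reallocarray(*arr, need, sizeof(**arr));
	if (p == NULL)
		return 0;
	*arr = p;
	*cap = need;
	return 1;
}
```

With a plain realloc(), an oversized element count could wrap to a
small byte count and succeed, leaving a buffer far smaller than the
caller believes.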


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
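
The difference can be sketched as follows. Since asprintf(3) is not
part of ISO C, the example uses a stand-in built on C99 vsnprintf(3);
sketch_asprintf() and describe_lib() are hypothetical names, and the
format string is made up rather than copied from post_lb().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for asprintf(3): measure, allocate, then format. */
int
sketch_asprintf(char **dest, const char *fmt, ...)
{
	va_list	 ap;
	int	 len;

	*dest = NULL;
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return -1;
	if ((*dest = malloc((size_t)len + 1)) == NULL)
		return -1;
	va_start(ap, fmt);
	len = vsnprintf(*dest, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return len;
}

/* Build a description string without manual length arithmetic. */
char *
describe_lib(const char *name)
{
	char	*s;

	assert(name != NULL);
	if (sketch_asprintf(&s, "library \"%s\"", name) == -1)
		return NULL;
	return s;	/* the caller is expected to free(3) this */
}
```

The caller never computes a buffer size by hand, which removes the
chance that the size and the format string drift apart.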


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
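
A minimal sketch of such a cache, assuming a POSIX system. The function
name default_os(), the buffer size and the "UNKNOWN" fallback are
assumptions for illustration, not the code that was committed.

```c
#include <sys/utsname.h>

#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return "sysname release" for the running system.
 * uname(3) is called on the first invocation only.
 */
const char *
default_os(void)
{
	static char	 cache[256];	/* empty until the first call */
	struct utsname	 u;

	if (cache[0] != '\0')
		return cache;
	if (uname(&u) == -1)
		(void)snprintf(cache, sizeof(cache), "UNKNOWN");
	else
		(void)snprintf(cache, sizeof(cache), "%s %s",
		    u.sysname, u.release);
	assert(cache[0] != '\0');
	return cache;
}
```

A function-scope static buffer is the simplest form of this idea; it
avoids repeated system calls at the cost of keeping the state hidden
inside the function.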


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.
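
What a register store without a hash table might look like, as a sketch
only. The plain linked list is an assumption, and struct reg, reg_set(),
reg_get() and reg_free() are hypothetical names, not the signatures of
roff_setreg() and roff_getreg().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct reg {
	struct reg	*next;
	char		*name;
	int		 val;
};

/* Set a register, creating it on first use.  Returns 0 on failure. */
int
reg_set(struct reg **head, const char *name, int val)
{
	struct reg	*r;
	size_t		 len;

	assert(head != NULL && name != NULL);
	for (r = *head; r != NULL; r = r->next) {
		if (strcmp(r->name, name) == 0) {
			r->val = val;
			return 1;
		}
	}
	if ((r = malloc(sizeof(*r))) == NULL)
		return 0;
	len = strlen(name) + 1;
	if ((r->name = malloc(len)) == NULL) {
		free(r);
		return 0;
	}
	memcpy(r->name, name, len);
	r->val = val;
	r->next = *head;
	*head = r;
	return 1;
}

/* Look up a register; return the fallback if it was never set. */
int
reg_get(const struct reg *head, const char *name, int fallback)
{
	for (; head != NULL; head = head->next)
		if (strcmp(head->name, name) == 0)
			return head->val;
	return fallback;
}

/* Release the whole list. */
void
reg_free(struct reg *head)
{
	struct reg	*next;

	for (; head != NULL; head = next) {
		next = head->next;
		free(head->name);
		free(head);
	}
}
```

Lookup is linear, which is acceptable as long as manuals define only a
handful of registers.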


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
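
A minimal sketch of the .Bx formatting rule described above, not the
actual mandoc code. The helper name bx_format() and its signature are
hypothetical; it only illustrates "uppercase the first character of the
second argument and append it with a hyphen".

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: render the (at most two) arguments of .Bx
 * into buf.  Either argument may be NULL when it is missing.
 */
static void
bx_format(char *buf, size_t bufsz, const char *version, const char *variant)
{
	assert(buf != NULL && bufsz > 0);

	if (version == NULL) {
		/* No arguments at all: just the system name. */
		snprintf(buf, bufsz, "BSD");
		return;
	}
	if (variant == NULL || *variant == '\0') {
		/* Only a version number. */
		snprintf(buf, bufsz, "%sBSD", version);
		return;
	}
	/* Capitalize the first character of the variant, join with '-'. */
	snprintf(buf, bufsz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```

With this sketch, the arguments "4.3" and "tahoe" would be rendered as
one word with the variant capitalized and separated by a hyphen.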


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.
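
The layout change can be sketched roughly as follows. All type and
function names here are made up for illustration and do not match the
real libmdoc declarations; the point is only the difference between a
union of pointers and a pointer to a union.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical macro-specific data, standing in for the real structs. */
enum list_type { LIST_NONE, LIST_BULLET, LIST_TAG };
struct bl_data { enum list_type type; int compact; };
struct bd_data { int literal; };

/* Before: each node embeds a union of pointers to separate structs. */
union old_data {
	struct bl_data *bl;
	struct bd_data *bd;
};

/* After: each node holds one pointer to a union of structs. */
union node_data {
	struct bl_data bl;
	struct bd_data bd;
};

struct node {
	union node_data *data;	/* one allocation, one free */
};

static struct node *
node_alloc(void)
{
	struct node *n;

	n = calloc(1, sizeof(*n));
	assert(n != NULL);
	/* Always sized for the largest member: the small overhead. */
	n->data = calloc(1, sizeof(*n->data));
	assert(n->data != NULL);
	return n;
}

static void
node_free(struct node *n)
{
	if (n == NULL)
		return;
	free(n->data);
	free(n);
}
```

In this sketch the allocation and cleanup paths no longer need to know
which macro a node belongs to, which is one plausible reading of
"simpler and more robust".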


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
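
The three exclusions above can be sketched as a small predicate. This
is an illustration under assumptions, not the code from the patch: the
function name hyphen_breakable() is hypothetical, and the check for
"free-form text" is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate: return 1 if the character at word[i] is a
 * hyphen at which the output line may be broken, 0 otherwise.
 */
static int
hyphen_breakable(const char *word, size_t i)
{
	size_t len;

	assert(word != NULL);
	len = strlen(word);

	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

A formatter could call such a predicate for every hyphen in a word and
remember the last position that still fits on the output line.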

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.
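
As a rough illustration of the three message levels described above,
the following sketch maps the worst level seen in a file to the two
visible consequences. The enum and function names are hypothetical and
are not taken from the mandoc sources.

```c
#include <assert.h>

/* Hypothetical levels, ordered by severity. */
enum msg_level { LEVEL_OK, LEVEL_WARNING, LEVEL_ERROR, LEVEL_FATAL };

/* FATAL means the parser cannot produce any output for this file. */
static int
produces_output(enum msg_level worst)
{
	assert(worst <= LEVEL_FATAL);
	return worst != LEVEL_FATAL;
}

/* ERROR and FATAL cause a non-zero exit; WARNING does not. */
static int
exits_nonzero(enum msg_level worst)
{
	assert(worst <= LEVEL_FATAL);
	return worst >= LEVEL_ERROR;
}
```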


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@
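
A minimal sketch of the "first type wins" policy, assuming the list
type arguments have already been parsed into an array. The enum values
and the function pick_list_type() are hypothetical names used only for
this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list types; LIST_NONE marks a non-type argument. */
enum list_type { LIST_NONE, LIST_BULLET, LIST_ENUM, LIST_TAG };

/*
 * Return the first list type found in args and store in *nwarn how
 * many later, conflicting types were discarded.
 */
static enum list_type
pick_list_type(const enum list_type *args, size_t nargs, int *nwarn)
{
	enum list_type chosen = LIST_NONE;
	size_t i;

	assert(args != NULL || nargs == 0);
	assert(nwarn != NULL);

	*nwarn = 0;
	for (i = 0; i < nargs; i++) {
		if (args[i] == LIST_NONE)
			continue;
		if (chosen == LIST_NONE)
			chosen = args[i];	/* use the first */
		else if (args[i] != chosen)
			(*nwarn)++;		/* discard and warn */
	}
	return chosen;
}
```

The caller would emit one warning per discarded type instead of
aborting the parse.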


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
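
The end-of-sentence test described above might look roughly like this.
It is a sketch, not the parser code from this commit: the function name
is hypothetical, and it only inspects the final character of the line
without handling trailing white space or closing delimiters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical check: return 1 if an input line ends in one of the
 * characters '.', '!' or '?', 0 otherwise.
 */
static int
line_ends_sentence(const char *line)
{
	size_t len;

	assert(line != NULL);
	len = strlen(line);
	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

Because the check only sees the end of each input line, a sentence
that ends in the middle of a line goes undetected, which is why the
"new sentence, new line" rule matters.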


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- iuse argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.299 08-Apr-2020 schwarze

Use a separate node->tag attribute rather than abusing the node->string
attribute for the purpose. No functional change intended.
The purpose is to make it possible to later attach tags to text nodes.


# 1.298 06-Apr-2020 schwarze

Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.
In HTML output, improve the logic for writing inside permalinks:
skip them when there is no child content or when there is a risk
that the children might contain flow content.


# 1.297 02-Apr-2020 schwarze

Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.
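
A minimal sketch of the "semantically transparent" idea, assuming a
hypothetical node struct and a hypothetical NODE_TRANSPARENT flag; the
log does not say how the real code marks such nodes, so none of these
names are taken from mandoc. Validators looking for the previous
high-level macro would step over transparent siblings like this:

```c
#include <assert.h>
#include <stddef.h>

#define NODE_TRANSPARENT 0x01	/* hypothetical flag, for illustration */

struct node {			/* hypothetical, much reduced node */
	struct node	*prev;	/* previous sibling, or NULL */
	int		 flags;
};

/*
 * Return the closest previous sibling that is not transparent,
 * or NULL if there is none.
 */
static struct node *
prev_opaque(struct node *n)
{
	assert(n != NULL);
	do {
		n = n->prev;
	} while (n != NULL && (n->flags & NODE_TRANSPARENT));
	return n;
}
```

With such a helper, a check like "is this .Pp directly after another
.Pp" keeps working when an .Sm or .Tg sits between the two.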


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.
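
For illustration only, a sketch of using __attribute__ directly with a
fallback that ignores it where unsupported. The fallback test on
__GNUC__ and the function names are assumptions made for this example;
the actual ./configure logic is not shown in this log.

```c
#include <stdlib.h>

/* Assumed fallback: make __attribute__ vanish on other compilers. */
#if !defined(__GNUC__)
#define __attribute__(x)
#endif

/* Hypothetical function, declared as never returning. */
static void	fatal_abort(void) __attribute__((__noreturn__));

static void
fatal_abort(void)
{
	abort();
}

/* The compiler knows that control cannot continue after fatal_abort(). */
static int
checked_div(int a, int b)
{
	if (b == 0)
		fatal_abort();
	return a / b;
}
```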


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
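
A small sketch of the pattern, using a made-up token range and table
rather than mandoc's real ones. Instead of storing a pointer biased to
before the start of the array, the offset is subtracted from the index
at lookup time:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token numbering that does not start at zero. */
enum tok { TOK_FIRST = 10, TOK_A = 10, TOK_B, TOK_C, TOK_MAX };

static const char *const names[TOK_MAX - TOK_FIRST] = { "A", "B", "C" };

/*
 * Undefined behaviour, even if the pointer is never dereferenced:
 *	static const char *const *biased = names - TOK_FIRST;
 * Well-defined alternative: adjust the index, not the base pointer.
 */
static const char *
tok_name(enum tok t)
{
	assert(t >= TOK_FIRST && t < TOK_MAX);
	return names[t - TOK_FIRST];
}
```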


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
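
To illustrate what a linear time, zero space heuristic of this kind can
look like, here is a sketch. It is not the committed code: the function
name, the distance limit of 3, and the exact edit rules are assumptions
made for this example.

```c
#include <string.h>

/*
 * Return 1 if s1 and s2 differ by at most maxdist simple edits
 * (replacement, transposition, insertion, deletion), else 0.
 * Both strings are scanned once and nothing is allocated.
 */
static int
similar(const char *s1, const char *s2)
{
	const int	 maxdist = 3;	/* assumed limit */
	int		 dist = 0;

	while (s1[0] != '\0' && s2[0] != '\0') {
		if (s1[0] != s2[0]) {
			if (++dist > maxdist)
				return 0;
			if (s1[1] == s2[1]) {
				/* replacement: skip one byte in both below */
			} else if (s1[0] == s2[1] && s1[1] == s2[0]) {
				s1++;	/* transposition: skip two in both */
				s2++;
			} else if (s1[0] == s2[1]) {
				s2++;	/* extra byte in s2 */
			} else if (s1[1] == s2[0]) {
				s1++;	/* extra byte in s1 */
			} else
				return 0;
		}
		s1++;
		s2++;
	}
	/* Whatever is left over in either string counts as edits. */
	dist += (int)(strlen(s1) + strlen(s2));
	return dist <= maxdist;
}
```

Unlike a full Levenshtein computation, this gives up at the first
mismatch it cannot explain locally, which is the price for needing no
table at all.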


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@
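
Passing NULL for a %s conversion is undefined behaviour. One common way
to guard against it is sketched here with a hypothetical helper; the log
does not say how the individual call sites were actually fixed.

```c
#include <stddef.h>

/* Hypothetical helper: never hand a NULL pointer to "%s". */
static const char *
str_or(const char *s, const char *fallback)
{
	return s != NULL ? s : fallback;
}
```

A call site would then read something like
printf("%s", str_or(name, "(none)")), where "name" may be NULL when the
argument is a macro rather than text.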


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
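
As an illustration of the three kinds of replacement mentioned above,
using a reduced hypothetical node struct with only the two pointers that
matter here (the helper names are invented for this sketch):

```c
#include <assert.h>
#include <stddef.h>

struct node {			/* hypothetical, much reduced node */
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* Formerly "nchild == 0": one pointer check. */
static int
has_no_child(const struct node *n)
{
	return n->child == NULL;
}

/* Formerly "nchild == 1": two pointer checks. */
static int
has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* Where the exact number is needed: a tiny loop. */
static int
count_children(const struct node *n)
{
	const struct node	*c;
	int			 num = 0;

	assert(n != NULL);
	for (c = n->child; c != NULL; c = c->next)
		num++;
	return num;
}
```

Since the count is derived from the pointers on demand, there is no
separate counter that could get out of sync with the tree.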


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
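
The distinction is sketched below with a made-up function, not with code
from the tree: adjacent case labels need no comment, while a case that
runs code before falling through keeps one.

```c
/* Hypothetical example function, for illustration only. */
static int
weight(char c)
{
	int	 w = 0;

	switch (c) {
	case ' ':
	case '\t':	/* adjacent labels: no comment needed */
		return 0;
	case 'X':
		w += 10;	/* code runs before falling through */
		/* FALLTHROUGH */
	case 'x':
		w += 1;
		break;
	default:
		break;
	}
	return w;
}
```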


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
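
A minimal sketch of the toggling behaviour, assuming the spacing mode is
kept as a plain int; sketch_sm() is a hypothetical helper, not the real
handler. Treating an invalid argument like a missing one follows the
later entry for rev. 1.146 in this log.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the new spacing mode (1 = on, 0 = off) after ".Sm [arg]".
 * Without an argument, or with an invalid one, the mode is toggled.
 */
static int
sketch_sm(int spacing_on, const char *arg)
{
	if (arg == NULL)
		return !spacing_on;
	if (strcmp(arg, "on") == 0)
		return 1;
	if (strcmp(arg, "off") == 0)
		return 0;
	return !spacing_on;
}
```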


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
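
To illustrate why reallocarray(3) is safer than realloc(p, n * size),
here is a portable sketch of the overflow check it performs. The name
sketch_reallocarray() is hypothetical; on systems that provide
reallocarray(3), the real function should be used instead.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Resize ptr to hold nmemb objects of the given size, failing with
 * ENOMEM instead of silently wrapping around when nmemb * size
 * does not fit into a size_t.
 */
static void *
sketch_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```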


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
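
The point of asprintf(3) is that the formatter computes the buffer size
itself, so the length calculation and the format string cannot drift
apart. Because asprintf(3) is not available everywhere, the following is
a portable sketch of the same idea; sketch_asprintf() is a hypothetical
name and returns the string directly rather than through a pointer
argument.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a freshly allocated buffer of exactly the right size.
 * Returns NULL on formatting or allocation failure; the caller
 * must free(3) the result.
 */
static char *
sketch_asprintf(const char *fmt, ...)
{
	va_list	 ap, ap2;
	char	*buf;
	int	 len;

	va_start(ap, fmt);
	va_copy(ap2, ap);
	len = vsnprintf(NULL, 0, fmt, ap);	/* measure only */
	va_end(ap);
	if (len < 0) {
		va_end(ap2);
		return NULL;
	}
	buf = malloc((size_t)len + 1);
	if (buf != NULL)
		(void)vsnprintf(buf, (size_t)len + 1, fmt, ap2);
	va_end(ap2);
	return buf;
}
```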


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
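
A minimal sketch of such caching, assuming a function-scope static
buffer; sketch_default_os() and the "UNKNOWN" fallback are hypothetical
and not taken from the actual code.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Return "sysname release" for the running system.
 * uname(3) is called on the first invocation only;
 * later calls return the cached string.
 */
static const char *
sketch_default_os(void)
{
	static char	 cached[1024];
	static int	 have_cache;
	struct utsname	 u;

	if (have_cache)
		return cached;
	if (uname(&u) == -1)
		(void)snprintf(cached, sizeof(cached), "UNKNOWN");
	else
		(void)snprintf(cached, sizeof(cached), "%s %s",
		    u.sysname, u.release);
	have_cache = 1;
	return cached;
}
```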


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
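
The parsing rule can be sketched as follows. This is an illustration
only: sketch_nr_apply() is a hypothetical helper, not the function in
roff.c, and it ignores integer overflow for brevity.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Compute the new register value for ".nr name arg".
 * An optional leading sign requests increment or decrement of the
 * old value; the number ends right before the first non-digit.
 */
static int
sketch_nr_apply(int oldval, const char *arg)
{
	char	 sign = '\0';
	int	 val = 0;

	assert(arg != NULL);
	if (*arg == '+' || *arg == '-')
		sign = *arg++;
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');

	if (sign == '+')
		return oldval + val;
	if (sign == '-')
		return oldval - val;
	return val;
}
```

With an old value of 5, the argument "12n" sets the register to 12,
"+3" yields 8, and "-2x" yields 3.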


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
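
A sketch of the resulting .Bx formatting, assuming the output is built
into a caller-supplied buffer; sketch_bx() is a hypothetical helper and
does not mirror the real node-based implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format ".Bx version [variant]": the version is followed by "BSD",
 * and the variant, if any, is appended with a hyphen and with its
 * first character converted to uppercase.
 * Returns what snprintf(3) returns.
 */
static int
sketch_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	assert(buf != NULL && version != NULL);
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, sz, "%sBSD", version);
	return snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)variant[0]), variant + 1);
}
```

Under these assumptions, the arguments "4.3" and "reno" produce the
string "4.3BSD-Reno".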


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
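
The three exclusion rules above can be sketched as a small predicate.
This is an illustration only, not the code from this commit: the
function name hyphen_is_breakable() is made up here, and the sketch
does not model the additional restriction to free-form text.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate, not mandoc's implementation:
 * may the output line be broken after the hyphen at word[i]?
 */
int
hyphen_is_breakable(const char *word, size_t i)
{
	size_t	 len;

	len = strlen(word);
	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

With this sketch, the hyphen at index 5 of "break-at-hyphen" is
breakable, while the hyphens in "-flag", "a--b" and "a\-b" are not.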

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.
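
A minimal sketch of the "first type wins" policy, under assumptions:
the enum, its members, and list_type_set() are hypothetical names for
this illustration, and the sketch assumes that repeating the same
type does not count as a conflict, which the commit message does not
state.

```c
#include <stddef.h>

/* Hypothetical subset of list types, for illustration only. */
enum list_type {
	LIST_NONE = 0,
	LIST_BULLET,
	LIST_ITEM,
	LIST_TAG
};

/*
 * Record a list type argument.
 * The first type seen is kept; later ones are discarded.
 * Returns 1 if the caller should warn about a conflict, 0 otherwise.
 */
int
list_type_set(enum list_type *cur, enum list_type arg)
{
	if (*cur == LIST_NONE) {
		*cur = arg;
		return 0;
	}
	/* Assumption: only a different type counts as a conflict. */
	return *cur != arg;
}
```

The caller keeps parsing in either case; the return value only
decides whether a warning is issued.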

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
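
The detection rule can be sketched as follows. This is a simplified
illustration, not the parser code from this commit: the function name
line_ends_sentence() is hypothetical, and the sketch only looks at
the last character of the line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not mandoc's actual code:
 * report whether an input line ends in one of '.', '!', or '?'.
 */
int
line_ends_sentence(const char *line)
{
	size_t	 len;

	len = strlen(line);
	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

A formatter could then emit two spaces instead of one before the next
word whenever this returns non-zero for the preceding line.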


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.298 06-Apr-2020 schwarze

Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.
In HTML output, improve the logic for writing inside permalinks:
skip them when there is no child content or when there is a risk
that the children might contain flow content.


# 1.297 02-Apr-2020 schwarze

Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.
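
The idea of skipping transparent nodes can be sketched like this.
The struct layout, the flag name NODE_TRANSPARENT, and the function
prev_significant() are assumptions made for this illustration and are
not taken from the mandoc sources.

```c
#include <stddef.h>

/* Hypothetical flag marking a semantically transparent node. */
#define NODE_TRANSPARENT	0x01

/* Hypothetical, heavily reduced syntax tree node. */
struct node {
	struct node	*prev;	/* previous sibling or NULL */
	int		 flags;
	const char	*name;	/* macro name, for illustration */
};

/*
 * Return the closest previous sibling that is not transparent,
 * or NULL if there is none.
 */
struct node *
prev_significant(struct node *n)
{
	for (n = n->prev; n != NULL; n = n->prev)
		if ((n->flags & NODE_TRANSPARENT) == 0)
			return n;
	return NULL;
}
```

A validator asking "what precedes this .Bl?" would then see a .Pp
even if an .Sm line sits between the two.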

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
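
A minimal sketch of the idea, assuming a hypothetical token enum
(these names are illustrative only, not the mandoc token names):
map each obsolete alias to its standard form once, before any other
validation code looks at the token.

```c
/* Hypothetical tokens, for illustration only. */
enum tok {
	TOK_Lp, TOK_Ot, TOK_LP, TOK_P,	/* obsolete aliases */
	TOK_Pp, TOK_Ft, TOK_PP,		/* standard forms */
	TOK_Sh				/* unrelated macro */
};

/*
 * Translate obsolete aliases up front.
 * Every other token is returned unchanged.
 */
enum tok
normalize_tok(enum tok t)
{
	switch (t) {
	case TOK_Lp:
		return TOK_Pp;
	case TOK_Ot:
		return TOK_Ft;
	case TOK_LP:
	case TOK_P:
		return TOK_PP;
	default:
		return t;
	}
}
```

With the translation done in one place, later checks only need to
compare against the standard forms.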


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
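
A minimal sketch of the problem and one way around it, assuming a
hypothetical token range and name table (not the actual mandoc
tables): rather than storing a pointer shifted to before the array,
subtract the offset from the index at lookup time.

```c
#include <stddef.h>

/* Hypothetical token range that does not start at zero. */
#define TOK_FIRST	10
#define TOK_DD		10
#define TOK_DT		11
#define TOK_OS		12
#define TOK_MAX		13

static const char *const names_storage[TOK_MAX - TOK_FIRST] = {
	"Dd", "Dt", "Os"
};

/*
 * Undefined behaviour, even if the pointer is never dereferenced:
 *	const char *const *names = names_storage - TOK_FIRST;
 * The C standard only allows pointers into the array or one past
 * its end.  Adjusting the index instead stays inside the array.
 */
const char *
tok_name(int tok)
{
	if (tok < TOK_FIRST || tok >= TOK_MAX)
		return NULL;
	return names_storage[tok - TOK_FIRST];
}
```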


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
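
The commit message does not spell out the heuristic. As an
illustration only, here is one possible linear time, allocation-free
check, which is an assumption and not necessarily what mandoc does:
accept two strings as similar if they differ by at most one inserted,
deleted, or substituted character. The function name similar() is
hypothetical.

```c
#include <string.h>

/*
 * Return 1 if a and b are equal or differ by at most one
 * single-character insertion, deletion, or substitution,
 * and 0 otherwise.  One pass, no allocation.
 */
int
similar(const char *a, const char *b)
{
	size_t	 la, lb;
	int	 edits;

	la = strlen(a);
	lb = strlen(b);
	if (la > lb + 1 || lb > la + 1)
		return 0;

	edits = 0;
	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++edits > 1)
			return 0;
		if (la > lb)
			a++;		/* extra character in a */
		else if (lb > la)
			b++;		/* extra character in b */
		else {
			a++;		/* substitution */
			b++;
		}
	}
	/* A single leftover character counts as one edit. */
	if (*a != '\0' || *b != '\0')
		edits++;
	return edits <= 1;
}
```

Unlike a full Levenshtein metric, this sketch treats a transposition
of two adjacent characters as two substitutions and rejects it.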


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).
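
Passing NULL for a "%s" conversion is undefined behaviour in C.
A minimal sketch of the usual guard, assuming a hypothetical node
struct and message text (neither is taken from mandoc): substitute
a placeholder string when the node carries no text.

```c
#include <stdio.h>

/* Hypothetical node: string is NULL for macro nodes. */
struct node {
	const char	*string;
};

/*
 * Format a diagnostic into buf without ever handing NULL to "%s".
 * Returns the snprintf(3) result.
 */
int
format_arg(char *buf, size_t sz, const struct node *n)
{
	return snprintf(buf, sz, "argument: %s",
	    n->string == NULL ? "(macro)" : n->string);
}
```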


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
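
A minimal sketch of the three replacement patterns, assuming a
simplified node struct with only child and next pointers (the real
struct roff_node has many more members, and the helper names here
are hypothetical):

```c
#include <stddef.h>

/* Simplified node, for illustration only. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* Formerly "nchild == 0": one pointer check. */
int
has_no_child(const struct node *n)
{
	return n->child == NULL;
}

/* Formerly "nchild == 1": two pointer checks. */
int
has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* Where the exact count is needed: a tiny loop. */
size_t
count_children(const struct node *n)
{
	const struct node	*c;
	size_t			 cnt;

	cnt = 0;
	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Since the count is derived from the links on demand, there is no
separate counter that could get out of sync with the tree.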


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
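
A small illustration of the convention, using a made-up function
classify() that is not part of mandoc: adjacent case labels need no
comment, while a case that runs code before falling through keeps it.

```c
int
classify(int c)
{
	int	 score;

	score = 0;
	switch (c) {
	case 'a':
	case 'b':
		/* Adjacent labels above: no comment needed. */
		score = 1;
		break;
	case 'c':
		score = 10;
		/* FALLTHROUGH */
	case 'd':
		score += 2;
		break;
	default:
		break;
	}
	return score;
}
```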


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.
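
One possible reading of the rewinding rule, as a sketch only: the
struct, the flag values, the F_CLOSED marker, and the function name
are all hypothetical, and the real parser does considerably more
work when it rewinds a block.

```c
#include <stddef.h>

#define F_BROKEN	(1 << 0)  /* broken by another block's end */
#define F_ENDED		(1 << 1)  /* end macro already seen */
#define F_CLOSED	(1 << 2)  /* hypothetical: already rewound */

struct blk {
	struct blk	*parent;
	int		 flags;
};

/*
 * Called when the broken block b is finally closed.
 * Walk up through its ancestors, rewinding each one that has
 * already ended, and stop after handling the first ancestor
 * that is not itself broken.
 * Returns the number of ancestors rewound.
 */
int
rewind_ancestors(struct blk *b)
{
	struct blk	*p;
	int		 n;

	n = 0;
	for (p = b->parent; p != NULL; p = p->parent) {
		if (p->flags & F_ENDED) {
			p->flags |= F_CLOSED;
			n++;
		}
		if ((p->flags & F_BROKEN) == 0)
			break;
	}
	return n;
}
```

Because blocks are rewound in the reverse order they were opened,
the parent chain alone is enough to find the blocks to rewind, and
no separate pending pointer is required.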

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
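
A rough sketch of the toggling logic, assuming the spacing mode is
kept as a simple flag. The function sm_next_mode() is made up for
this illustration and does not exist in mandoc; treating an invalid
argument like a missing one follows the later entry for rev. 1.146.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: compute the spacing mode after ".Sm [arg]".
 * cur is the current mode (1 = on, 0 = off);
 * arg is the macro argument, or NULL when there is none.
 */
int
sm_next_mode(int cur, const char *arg)
{
	if (arg != NULL && strcmp(arg, "on") == 0)
		return 1;
	if (arg != NULL && strcmp(arg, "off") == 0)
		return 0;
	/* No argument, or an invalid one: toggle. */
	return !cur;
}
```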


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual
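
The two message shapes above could be produced along these lines.
This is only a sketch: format_msg() is a hypothetical function, and
using "line == 0" to mean "no meaningful position" is an assumption
made for the example, not necessarily how mandoc decides.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical formatter: include ":line:col" only when
 * the position is meaningful (assumed here: line > 0).
 */
int
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, msg);
}
```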

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
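
For readers unfamiliar with reallocarray(3), the point of the change
is the overflow check on nmemb * size. The sketch below is a portable
stand-in that mimics that check on top of realloc(3); the name
checked_reallocarray() is invented here, and this is not the libc
implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Stand-in illustrating the overflow check that reallocarray(3)
 * performs: fail with ENOMEM instead of letting nmemb * size
 * wrap around and allocate too small a buffer.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

A plain realloc(p, n * sizeof(*p)) has no such guard, so a huge n
can silently wrap to a small allocation.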


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
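
Why the strlen+malloc+snprintf idiom is error prone: the length
computation and the format string have to be kept in sync by hand.
An asprintf-style interface measures and allocates in one place.
Since asprintf(3) is not part of ISO C, the sketch below uses a
hypothetical stand-in, fmt_alloc(), built on vsnprintf(3); it is not
the code of post_lb().

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical asprintf-like helper: format into a freshly
 * allocated buffer of exactly the right size.
 * Returns NULL on failure; the caller must free(3) the result.
 */
char *
fmt_alloc(const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int len;

	/* First pass: measure the formatted length. */
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return NULL;

	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return NULL;

	/* Second pass: format into the buffer. */
	va_start(ap, fmt);
	vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return buf;
}
```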


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
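
The caching idea, sketched with a function-scope static buffer. The
function name cached_os_name(), the "sysname release" format and the
"UNKNOWN" fallback are assumptions for this example, not taken from
the mandoc source.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: call uname(3) at most once per process
 * and hand out the cached string afterwards.
 */
const char *
cached_os_name(void)
{
	static char buf[256];
	static int done;
	struct utsname uts;

	if (!done) {
		if (uname(&uts) == -1)
			snprintf(buf, sizeof(buf), "UNKNOWN");
		else
			snprintf(buf, sizeof(buf), "%s %s",
			    uts.sysname, uts.release);
		done = 1;
	}
	return buf;
}
```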


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.
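
A small sketch of that parsing rule: an optional sign selects
increment or decrement, and the number ends right before the first
non-digit character. nr_apply() is a hypothetical helper written for
this note, not the function in roff.c, and it ignores integer
overflow.

```c
#include <ctype.h>

/*
 * Hypothetical helper: return the new register value
 * given the old one and the "value" argument of .nr.
 */
int
nr_apply(int old, const char *val)
{
	int sign = 0;	/* 0 = assign, +1 = increment, -1 = decrement */
	int n = 0;

	if (*val == '+' || *val == '-')
		sign = (*val++ == '+') ? 1 : -1;

	/* The value ends right before the first non-digit. */
	while (isdigit((unsigned char)*val))
		n = n * 10 + (*val++ - '0');

	return sign ? old + sign * n : n;
}
```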

Reported by bentley@ on discuss at mdocml.


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
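
The resulting .Bx text can be sketched like this. bx_format() is a
hypothetical function for illustration and does not reproduce how
the validator rewrites the syntax tree; it only shows the intended
output string for up to two arguments.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the text for ".Bx ver [variant]".
 * The first character of the variant is converted to uppercase
 * and the variant is appended with a hyphen.
 */
int
bx_format(char *buf, size_t sz, const char *ver, const char *variant)
{
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, sz, "%sBSD", ver);
	return snprintf(buf, sz, "%sBSD-%c%s", ver,
	    toupper((unsigned char)variant[0]), variant + 1);
}
```

With the arguments "4.3" and "reno" this yields "4.3BSD-Reno".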


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
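
The three exclusions listed above can be written as a predicate on
the raw input word. This is a sketch under assumptions: the function
hyphen_is_breakable() is hypothetical, it does not model the
"free-form text only" condition, and the real code may mark
breakable hyphens differently.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate: may the line be broken
 * after the hyphen at index i of word?
 */
int
hyphen_is_breakable(const char *word, size_t i)
{
	size_t len = strlen(word);

	if (i >= len || word[i] != '-')
		return 0;
	if (i == 0 || i + 1 == len)
		return 0;	/* at the beginning or end of the word */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;	/* next to another hyphen */
	if (word[i - 1] == '\\')
		return 0;	/* escaped by a preceding backslash */
	return 1;
}
```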

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@'s end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number cannot be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@
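
The "first one wins" policy for conflicting list types could look
roughly like the sketch below. The enum values and the function name
list_type_set are hypothetical; the real parser works on argument
nodes rather than on a bare enum.

```c
/* Hypothetical list types, for illustration only. */
enum list_type {
	LIST_NONE = 0,
	LIST_BULLET,
	LIST_TAG,
	LIST_COLUMN
};

/*
 * Record the list type if none has been set yet.
 * Return 1 if the requested type was accepted and 0 if it was
 * discarded, in which case the caller is expected to warn
 * instead of bailing out.
 */
static int
list_type_set(enum list_type *cur, enum list_type requested)
{
	if (*cur != LIST_NONE)
		return 0;
	*cur = requested;
	return 1;
}
```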


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
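
A minimal sketch of the end-of-line test described above, assuming
that only the very last character of the input line is inspected.
The function name line_ends_sentence is hypothetical, and the real
parser has to handle more cases (for example macro lines).

```c
#include <string.h>

/*
 * Return 1 if the line ends in one of ".!?", 0 otherwise.
 * An empty line never ends a sentence.
 */
static int
line_ends_sentence(const char *line)
{
	size_t	 len;

	len = strlen(line);
	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

With a test like this, a sentence that ends in the middle of an input
line is not detected, which is why "new sentence, new line" matters.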


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/
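
An allocation frontend of the kind mentioned under "simplicity" might
look like the sketch below. The name xmalloc_or_die is hypothetical;
the sketch uses perror(3) and exit(3) in line with the portability
notes above and assumes callers never request zero bytes.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate memory or terminate the program.
 * Callers do not need to check the return value.
 * Assumes size > 0, since malloc(0) may legitimately return NULL.
 */
static void *
xmalloc_or_die(size_t size)
{
	void	*ptr;

	ptr = malloc(size);
	if (ptr == NULL) {
		perror(NULL);
		exit(EXIT_FAILURE);
	}
	return ptr;
}
```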


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.297 02-Apr-2020 schwarze

Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.
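
The idea of skipping transparent nodes can be illustrated with a
simplified sibling list. The struct tnode and the function
next_opaque are hypothetical stand-ins; the real syntax tree nodes
carry far more state.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified syntax tree node. */
struct tnode {
	struct tnode	*next;		/* next sibling, or NULL */
	int		 transparent;	/* e.g. .Sm, .Tg, .ft, .PD */
	const char	*name;
};

/*
 * Return the next sibling that is not semantically transparent,
 * or NULL if there is none.
 */
static struct tnode *
next_opaque(struct tnode *n)
{
	for (n = n->next; n != NULL && n->transparent; n = n->next)
		continue;
	return n;
}
```

A validator asking "what follows this .Pp?" would then see the .Sh
even when a .Sm or .Tg line sits in between.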


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
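
Translating the aliases up front amounts to a small mapping applied
once per node before any other validation. The enum and the function
name canonical_macro below are hypothetical; only the mapping itself
(Lp to Pp, Ot to Ft, LP and P to PP) comes from the message above.

```c
/* Hypothetical token names, for illustration only. */
enum macro {
	MAC_Lp, MAC_Ot, MAC_LP, MAC_P,	/* obsolete aliases */
	MAC_Pp, MAC_Ft, MAC_PP,		/* standard forms */
	MAC_Sh				/* unrelated macro */
};

/* Map an obsolete alias to its standard form. */
static enum macro
canonical_macro(enum macro tok)
{
	switch (tok) {
	case MAC_Lp:
		return MAC_Pp;
	case MAC_Ot:
		return MAC_Ft;
	case MAC_LP:
	case MAC_P:
		return MAC_PP;
	default:
		return tok;
	}
}
```

Later code then only needs to test for MAC_Pp, MAC_Ft, and MAC_PP.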


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
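
The pattern in question, and one way to avoid it, can be sketched as
follows. The token enum and the table are hypothetical; the point is
only that the offset is subtracted from the index, not from the
array pointer.

```c
#include <stddef.h>

/* Hypothetical tokens whose numbering does not start at zero. */
enum tok {
	TOK_FIRST = 10,
	TOK_DD = TOK_FIRST,
	TOK_DT,
	TOK_OS,
	TOK_MAX
};

static const char *const tok_names[TOK_MAX - TOK_FIRST] = {
	"Dd", "Dt", "Os"
};

/*
 * Undefined behaviour, even if base itself is never dereferenced:
 *
 *	const char *const *base = tok_names - TOK_FIRST;
 *	return base[tok];
 *
 * Subtracting the offset from the index keeps all pointer
 * arithmetic inside the array.
 */
static const char *
tok_name(enum tok tok)
{
	if (tok < TOK_FIRST || tok >= TOK_MAX)
		return NULL;
	return tok_names[tok - TOK_FIRST];
}
```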


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.
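
A scan for "--" that cannot run past the terminating NUL might be
written as in the sketch below. This is not the mandoc code; the
function name has_double_hyphen is hypothetical, and the sketch
assumes that only runs of exactly two hyphens are of interest.

```c
#include <string.h>

/*
 * Return 1 if the string contains a run of exactly two hyphens.
 * The pointer advances by the length of each run, which is at
 * least one, so it stops exactly at the terminating NUL.
 */
static int
has_double_hyphen(const char *s)
{
	const char	*cp;
	size_t		 run;

	for (cp = s; *cp != '\0'; cp += run) {
		run = (*cp == '-') ? strspn(cp, "-") : 1;
		if (*cp == '-' && run == 2)
			return 1;
	}
	return 0;
}
```

Measuring the whole run with strspn(3) before advancing avoids the
kind of double increment described above for a lone hyphen.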


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
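
To give an idea of what a linear time, zero space heuristic can look
like, here is a sketch that accepts two strings if they differ by at
most one substituted, inserted, or deleted character. It is not the
heuristic used in mandoc, the name roughly_equal is hypothetical, and
it does not catch transposed letters.

```c
#include <string.h>

/*
 * Return 1 if a and b are equal or differ by at most one
 * substituted, inserted, or deleted character; 0 otherwise.
 * Single pass over both strings, no allocation.
 */
static int
roughly_equal(const char *a, const char *b)
{
	size_t	 la, lb;

	/* Skip the common prefix. */
	while (*a != '\0' && *a == *b) {
		a++;
		b++;
	}
	if (*a == '\0' && *b == '\0')
		return 1;

	/* The remainders must agree after skipping one character. */
	la = strlen(a);
	lb = strlen(b);
	if (la == lb)
		return strcmp(a + 1, b + 1) == 0;
	if (la + 1 == lb)
		return strcmp(a, b + 1) == 0;
	if (la == lb + 1)
		return strcmp(a + 1, b) == 0;
	return 0;
}
```

For example, "DESCRIPTON" and "SYNOPSYS" would be matched against
"DESCRIPTION" and "SYNOPSIS", while "NAME" and "FILES" would not
match each other.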


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
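
A minimal sketch of the replacement described above, assuming a pared-down node that has only "child" and "next" links. The struct and the three function names are hypothetical and are not taken from the mandoc sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, pared-down node: only the links this sketch needs. */
struct node {
	struct node *child;	/* first child, or NULL */
	struct node *next;	/* next sibling, or NULL */
};

/* Formerly "nchild > 0": one pointer check. */
int
has_children(const struct node *n)
{
	assert(n != NULL);
	return n->child != NULL;
}

/* Formerly "nchild == 1": two pointer checks. */
int
has_one_child(const struct node *n)
{
	assert(n != NULL);
	return n->child != NULL && n->child->next == NULL;
}

/* The rare case that needs the actual number: a tiny loop. */
int
count_children(const struct node *n)
{
	const struct node *c;
	int cnt = 0;

	assert(n != NULL);
	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Because the answers are derived from the links themselves, there is no separate counter that could get out of sync with the tree.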


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.
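
The rewinding rule in this paragraph can be sketched as follows. This is one reading of the description, not the mandoc code: the struct, the BLK_* flag names standing in for MDOC_BROKEN and MDOC_ENDED, the "rewound" marker, and the function name are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define BLK_BROKEN	(1 << 0)	/* stand-in for MDOC_BROKEN */
#define BLK_ENDED	(1 << 1)	/* stand-in for MDOC_ENDED */

/* Hypothetical block node with just enough state for the walk. */
struct blk {
	struct blk *parent;
	int flags;
	int rewound;	/* set once the block has been closed */
};

/*
 * Assumed to be called after the broken block n itself has been
 * closed: walk up the ancestors, close each one already marked as
 * ended, and stop after handling the first ancestor that is not
 * itself broken.  Returns the number of ancestors closed.
 */
int
rewind_ancestors(struct blk *n)
{
	struct blk *p;
	int closed = 0;

	assert(n != NULL);
	for (p = n->parent; p != NULL; p = p->parent) {
		if (p->flags & BLK_ENDED) {
			p->rewound = 1;
			closed++;
		}
		if ((p->flags & BLK_BROKEN) == 0)
			break;
	}
	return closed;
}
```

The walk needs only the parent links and two flag bits, so no per-node pointer has to be kept consistent while blocks are opened and closed.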

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual
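
A minimal sketch of the formatting rule shown in the two sample lines, assuming that a line number of 0 means "no meaningful position". The function name and its signature are hypothetical and do not reflect the real mandoc message interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a diagnostic into buf.  The position is printed only when
 * line is positive; line 0 is treated as "position not meaningful"
 * (an assumption of this sketch).
 */
int
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *text)
{
	assert(buf != NULL && file != NULL);
	assert(level != NULL && text != NULL);
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, text);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, text);
}
```

With file "/dev/null", line 0, level "FATAL" and text "not a manual", the buffer holds the shorter form quoted above.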

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
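
To illustrate why reallocarray(3) is safer than realloc(ptr, nmemb * size), here is a portable sketch of the overflow check it is documented to perform. checked_reallocarray() and grow_ints() are hypothetical names for this illustration, not functions from mandoc or libc.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): behaves like
 * realloc(ptr, nmemb * size), but fails with ENOMEM instead of
 * wrapping around when the product does not fit into size_t.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}

/* Grow an int array to n elements; the old block survives failure. */
int
grow_ints(int **arr, size_t n)
{
	int *p;

	assert(arr != NULL);
	p = checked_reallocarray(*arr, n, sizeof(**arr));
	if (p == NULL)
		return -1;
	*arr = p;
	return 0;
}
```

Without the check, a huge element count could make the multiplication wrap, so the allocation would succeed with a buffer much smaller than the caller expects.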


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
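
The following sketch shows what a single allocate-and-format call does internally, which is why it is less error prone than separate strlen, malloc and snprintf steps. It is a portable stand-in built on C99 vsnprintf(3), not asprintf(3) itself and not the post_lb() code; the name fmt_alloc() is hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Measure first, allocate exactly, then format.  Both passes use the
 * same format string and arguments, so the buffer size cannot drift
 * away from what is written.  Returns a malloc(3)ed string, or NULL.
 */
char *
fmt_alloc(const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int len;

	assert(fmt != NULL);
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return NULL;

	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return NULL;

	va_start(ap, fmt);
	(void)vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return buf;
}
```

The caller owns the result and releases it with free(3), just as with a string returned through asprintf(3).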


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
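
A minimal sketch of such caching, assuming a function-scope static buffer. The names default_os() and uname_calls are hypothetical, the "sysname release" format is an assumption, and the counter exists only to make the effect visible.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

int uname_calls;	/* illustration only: counts real uname(3) calls */

/*
 * Return "sysname release", asking the system only on first use and
 * serving every later call from the static buffer.
 */
const char *
default_os(void)
{
	static char cache[512];
	static int cached;
	struct utsname u;

	if (cached == 0) {
		uname_calls++;
		if (uname(&u) == -1)
			(void)snprintf(cache, sizeof(cache), "UNKNOWN");
		else
			(void)snprintf(cache, sizeof(cache), "%s %s",
			    u.sysname, u.release);
		cached = 1;
	}
	assert(cache[0] != '\0');
	return cache;
}
```

However many manuals are processed in one run, the system is queried once and the same string is handed out each time.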


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
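
A rough sketch of the string handling described above, not the code
from the commit. The helper name bx_format() is hypothetical, and
placing the first argument directly before "BSD" is an assumption
based on how mdoc(7) renders .Bx, not something stated in this message.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "<version>BSD" or "<version>BSD-<Variant>"
 * into buf.  The first character of the variant is converted to
 * uppercase and the variant is appended with a hyphen.
 * Returns what snprintf(3) returns.
 */
int
bx_format(char *buf, size_t sz, const char *version, const char *variant)
{
	if (version == NULL)
		version = "";
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, sz, "%sBSD", version);
	return snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```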


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
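
The three exclusion rules above can be sketched as a small predicate.
This is only an illustration under stated assumptions: the function
name hyph_breakable() is hypothetical, it looks at one
whitespace-delimited word at a time, and it does not model the
"free-form text only" condition, which depends on parser context.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if the hyphen at word[pos] is an
 * acceptable place to break the output line, 0 otherwise.
 * "word" is a single word of length len.
 */
int
hyph_breakable(const char *word, size_t len, size_t pos)
{
	if (pos >= len || word[pos] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (pos == 0 || pos + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[pos - 1] == '-' || word[pos + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[pos - 1] == '\\')
		return 0;
	return 1;
}
```

A formatter would then scan a word that does not fit for the last
position that passes this test and still fits on the line.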

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.
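
As a hedged illustration of the three levels, the following sketch
tracks the most severe message seen for a file and derives whether
output can be produced and whether the exit status is non-zero.
All identifiers are hypothetical and the concrete exit value 1 is an
assumption; the actual enums and tables live in the mandoc sources.

```c
/* Hypothetical names; mandoc's own enums and functions differ. */
enum msg_level {
	MSG_NONE = 0,	/* nothing reported so far */
	MSG_WARNING,	/* input should be cleaned up, output mostly OK */
	MSG_ERROR,	/* output is produced, but may be garbled */
	MSG_FATAL	/* no output can be produced from this file */
};

/* Remember the most severe level seen while parsing one file. */
void
msg_record(enum msg_level *worst, enum msg_level lvl)
{
	if (lvl > *worst)
		*worst = lvl;
}

/* Output is attempted unless a FATAL message was seen. */
int
msg_can_output(enum msg_level worst)
{
	return worst < MSG_FATAL;
}

/* Zero for clean or WARNING-only runs, non-zero for ERROR and FATAL. */
int
msg_exit_status(enum msg_level worst)
{
	return worst >= MSG_ERROR ? 1 : 0;
}
```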

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@
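
A minimal sketch of the "first type wins" rule, assuming a small
hypothetical enum and a helper named list_type_set(); neither is
taken from mdoc_validate.c, and the warning text is made up.

```c
#include <stdio.h>

/* Hypothetical subset of list types; LIST_NONE means "not yet set". */
enum list_type { LIST_NONE, LIST_BULLET, LIST_ENUM, LIST_TAG };

/*
 * Record a list type argument.  The first type seen is kept; a later,
 * conflicting type is discarded with a warning instead of bailing out.
 * Returns 1 if a warning was issued, 0 otherwise.
 */
int
list_type_set(enum list_type *cur, enum list_type next)
{
	if (*cur == LIST_NONE) {
		*cur = next;
		return 0;
	}
	if (*cur == next)
		return 0;
	fprintf(stderr, "warning: conflicting list type discarded\n");
	return 1;
}
```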


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
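
The detection rule itself is simple enough to sketch. The function
name line_ends_sentence() is hypothetical, and the sketch only covers
the "line ends in one of .!?" test described here, not the insertion
of the token into the AST.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if the input line ends in one of
 * '.', '!' or '?', ignoring a trailing newline, and 0 otherwise.
 */
int
line_ends_sentence(const char *line)
{
	size_t	 len;

	len = strlen(line);
	while (len > 0 && line[len - 1] == '\n')
		len--;
	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

Because only the end of the input line is inspected, a sentence that
ends in the middle of a line is not detected, which is why the
"new sentence, new line" rule matters.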


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.296 01-Apr-2020 schwarze

Just like we are already doing it in HTML output, automatically tag
section and subsection headers in terminal output, too. Even though
admittedly, commands like "/SEE" and "/ Subsec" work, too, there
is no downside, and besides, with the recent improvements in the
tagging framework, implementation cost is negligible.


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Suprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.
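
The notion of semantically transparent nodes in this entry can be
pictured roughly as follows. This is only a sketch under assumptions:
struct demo_node, the flag DEMO_TRANSPARENT, and demo_prev_opaque() are
hypothetical names, not the actual roff_node interface.

```c
#include <stddef.h>

#define DEMO_TRANSPARENT 0x01	/* hypothetical flag for nodes to skip */

struct demo_node {
	struct demo_node *prev;
	struct demo_node *next;
	int flags;
};

/*
 * Return the closest previous sibling that is not transparent,
 * or NULL if every previous sibling is transparent.
 */
static struct demo_node *
demo_prev_opaque(struct demo_node *n)
{
	for (n = n->prev; n != NULL; n = n->prev)
		if ((n->flags & DEMO_TRANSPARENT) == 0)
			return n;
	return NULL;
}
```

A validator that asks "what came before this macro?" would call such a
helper instead of reading the prev pointer directly.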


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
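
A minimal sketch of the kind of change involved, assuming a table whose
first token has a non-zero number. The names demo_names,
DEMO_TOK_FIRST, and demo_tok_name() are made up for illustration.

```c
#include <stddef.h>

enum { DEMO_TOK_FIRST = 10, DEMO_TOK_COUNT = 3 };

static const char *const demo_names[DEMO_TOK_COUNT] = { "Dd", "Dt", "Os" };

/*
 * Undefined behaviour even if the result is never dereferenced:
 *	const char *const *base = demo_names - DEMO_TOK_FIRST;
 * Doing the subtraction on the integer index keeps every pointer
 * inside the array.
 */
static const char *
demo_tok_name(int tok)
{
	if (tok < DEMO_TOK_FIRST || tok >= DEMO_TOK_FIRST + DEMO_TOK_COUNT)
		return NULL;
	return demo_names[tok - DEMO_TOK_FIRST];
}
```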


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
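
For readers unfamiliar with this kind of heuristic, the following is an
illustrative sketch of a single-pass comparison that tolerates a few
substitutions, transpositions, insertions, and deletions. It is not the
code from this commit; the function name roughly_equal() and the
maxdist parameter are assumptions.

```c
#include <string.h>

/*
 * Return 1 if a and b differ by at most maxdist simple edits.
 * One left-to-right pass, no allocation; unlike a full Levenshtein
 * computation it may reject some pairs that are in fact close.
 */
static int
roughly_equal(const char *a, const char *b, int maxdist)
{
	int dist = 0;

	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++dist > maxdist)
			return 0;
		if (a[1] == b[0] && a[0] == b[1]) {
			a += 2;		/* two characters swapped */
			b += 2;
		} else if (a[1] == b[0]) {
			a++;		/* extra character in a */
		} else if (a[0] == b[1]) {
			b++;		/* extra character in b */
		} else {
			a++;		/* one character replaced */
			b++;
		}
	}
	/* Whatever is left over on either side counts as edits. */
	dist += (int)(strlen(a) + strlen(b));
	return dist <= maxdist;
}
```

With a small maxdist, a misspelling such as "DESCRITPION" would be
accepted as close to "DESCRIPTION", while an unrelated name such as
"SYNOPSIS" would not.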


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@
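
Passing a null pointer for a %s conversion is undefined behaviour in C.
One common way to guard against it is sketched below; the helper names
demo_str_or() and demo_report() are hypothetical and do not appear in
mandoc.

```c
#include <stdio.h>

/* Substitute a fallback so that %s never sees a null pointer. */
static const char *
demo_str_or(const char *s, const char *fallback)
{
	return s != NULL ? s : fallback;
}

/* Format a diagnostic into buf; arg may be NULL. */
static int
demo_report(char *buf, size_t sz, const char *arg)
{
	return snprintf(buf, sz, "bad argument: %s",
	    demo_str_or(arg, "(none)"));
}
```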


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
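
The three kinds of replacement mentioned above might look like this.
It is a simplified sketch: struct demo_node only models the child and
next links, and the helper names are invented for illustration.

```c
#include <stddef.h>

struct demo_node {
	struct demo_node *child;	/* first child, or NULL */
	struct demo_node *next;		/* next sibling, or NULL */
};

/* Formerly "nchild == 0": one pointer check. */
static int
demo_has_no_child(const struct demo_node *n)
{
	return n->child == NULL;
}

/* Formerly "nchild == 1": two pointer checks. */
static int
demo_has_one_child(const struct demo_node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* The rare case that needs the actual number: a tiny loop. */
static int
demo_count_children(const struct demo_node *n)
{
	const struct demo_node *c;
	int count = 0;

	for (c = n->child; c != NULL; c = c->next)
		count++;
	return count;
}
```

Because nothing is cached, there is no counter that could get out of
sync with the linked list.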


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
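
The rewinding rule described in this entry can be sketched as a walk up
the parent chain. This is a heavily simplified model, not the parser
code: struct demo_block, the DEMO_* flags, and the rewound marker are
hypothetical, and starting the walk at the parent of the broken block
is an assumption.

```c
#include <stddef.h>

#define DEMO_BROKEN 0x01	/* block was broken by another block */
#define DEMO_ENDED  0x02	/* block saw its end macro and awaits rewind */

struct demo_block {
	struct demo_block *parent;
	int flags;
	int rewound;		/* set once the block has been closed */
};

/*
 * Walk from a broken block towards the root, rewinding every
 * ancestor marked ENDED, and stop after handling the first
 * ancestor that is not BROKEN.  Returns the number rewound.
 */
static int
demo_rewind_from(struct demo_block *broken)
{
	struct demo_block *n;
	int count = 0;

	for (n = broken->parent; n != NULL; n = n->parent) {
		if (n->flags & DEMO_ENDED) {
			n->rewound = 1;
			count++;
		}
		if ((n->flags & DEMO_BROKEN) == 0)
			break;
	}
	return count;
}
```

No per-node "pending" pointer is needed in this model, because the
flags on the ancestors carry all the information the walk uses.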


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.
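
A minimal sketch of the defensive check, assuming a simplified node
layout; the struct and function names are hypothetical and do not
match the real mdoc node structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified list item node. */
struct it_node {
	struct it_node	*head;	/* item head, may be NULL */
	struct it_node	*body;	/* NULL if the head was never closed */
	struct it_node	*child;	/* first child of this node */
};

/*
 * Report whether an item has no body content.
 * The body pointer is checked before it is dereferenced, because
 * an unclosed partial explicit macro in the head leaves it NULL.
 */
static int
it_body_is_empty(const struct it_node *item)
{
	assert(item != NULL);
	if (item->body == NULL)
		return 1;
	return item->body->child == NULL;
}
```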


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.
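
Concatenating the two column lists might look roughly like this.
It is a sketch under assumptions: struct col_list and
col_list_append() are hypothetical names, and the real code stores
its arguments differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for -column width declarations. */
struct col_list {
	size_t		  ncols;
	const char	**cols;
};

/*
 * Append nsrc column widths to dst instead of rejecting the
 * second group.  Returns 0 on success, -1 on allocation failure.
 */
static int
col_list_append(struct col_list *dst, const char *const *src, size_t nsrc)
{
	const char	**grown;

	assert(dst != NULL);
	if (nsrc == 0)
		return 0;
	/* Refuse sizes that would overflow the multiplication. */
	if (nsrc > SIZE_MAX / sizeof(*grown) - dst->ncols)
		return -1;
	grown = realloc(dst->cols, (dst->ncols + nsrc) * sizeof(*grown));
	if (grown == NULL)
		return -1;
	memcpy(grown + dst->ncols, src, nsrc * sizeof(*grown));
	dst->cols = grown;
	dst->ncols += nsrc;
	return 0;
}
```

Calling it once for the widths right after -column and once more for
the widths at the end of the argument list yields one combined list.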


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
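
The toggling rule can be sketched as a small state function.
The name sm_next() is hypothetical; treating any argument other than
"on" and "off" as a toggle follows the later rev. 1.146 entry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Compute the spacing mode after an .Sm macro.
 * arg is the first macro argument, or NULL if there is none.
 */
static bool
sm_next(bool spacing, const char *arg)
{
	if (arg != NULL && strcmp(arg, "on") == 0)
		return true;
	if (arg != NULL && strcmp(arg, "off") == 0)
		return false;
	/* No argument, or an invalid one: toggle the mode. */
	return !spacing;
}
```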


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
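
A sketch of the message formatting rule from the first paragraph.
The function fmt_msg() is hypothetical, and it assumes that a line
number of 0 means "no meaningful position".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a diagnostic into buf.  The line and column are only
 * included when the line number is meaningful (greater than 0).
 * Returns the snprintf(3) result.
 */
static int
fmt_msg(char *buf, size_t size, const char *file, int line, int col,
    const char *level, const char *text)
{
	assert(buf != NULL && file != NULL);
	if (line > 0)
		return snprintf(buf, size, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, text);
	return snprintf(buf, size, "mandoc: %s: %s: %s",
	    file, level, text);
}
```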


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
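
For readers unfamiliar with reallocarray(3): the point is that the
product nmemb * size is checked before allocating.  The helper below
is a portable sketch of that idea with a hypothetical name; it is not
the OpenBSD implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Resize ptr to hold nmemb elements of the given size, failing
 * with ENOMEM instead of silently wrapping around when the
 * multiplication would overflow.  The old block is left untouched
 * on failure.
 */
static void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	assert(size != 0);
	if (nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

With a plain realloc(p, n * sizeof(*p)), a huge n could wrap to a
small allocation that later code overruns.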


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.
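
The advantage of asprintf(3) over a fixed buffer is that the result
is sized to fit.  The sketch below approximates that behaviour with
standard snprintf(3) only, so it also builds where asprintf(3) is
unavailable; join_name_sec() is a hypothetical example function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "name(sec)" in freshly allocated memory of exactly the
 * required size, so that nothing can be silently truncated.
 * Returns NULL on failure; the caller must free(3) the result.
 */
static char *
join_name_sec(const char *name, const char *sec)
{
	char	*buf;
	int	 len;

	assert(name != NULL && sec != NULL);
	/* First pass: measure only. */
	len = snprintf(NULL, 0, "%s(%s)", name, sec);
	if (len < 0)
		return NULL;
	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return NULL;
	/* Second pass: format into the exactly sized buffer. */
	(void)snprintf(buf, (size_t)len + 1, "%s(%s)", name, sec);
	return buf;
}
```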


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
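
Caching of this kind can be sketched with a function-scope static
buffer.  The function name, the buffer size, the "sysname release"
format and the "UNKNOWN" fallback are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Return a default operating system string, calling uname(3)
 * only on the first invocation and reusing the result afterwards.
 */
static const char *
default_os(void)
{
	static char	 cached[256];
	static int	 have_cached;
	struct utsname	 uts;

	if (have_cached == 0) {
		if (uname(&uts) == -1)
			(void)snprintf(cached, sizeof(cached), "UNKNOWN");
		else
			(void)snprintf(cached, sizeof(cached), "%s %s",
			    uts.sysname, uts.release);
		have_cached = 1;
	}
	assert(have_cached);
	return cached;
}
```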


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
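
The .Bx argument handling can be sketched as below.  format_bx() is
a hypothetical helper, and placing "BSD" directly after the version
is an assumption modelled on strings such as "4.3BSD-Reno".

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the output of .Bx from at most two arguments:
 * the version and an optional variant.  The first character
 * of the variant is converted to uppercase and the variant
 * is appended with a hyphen.
 */
static int
format_bx(char *buf, size_t size, const char *version, const char *variant)
{
	assert(buf != NULL);
	if (version == NULL)
		version = "";
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, size, "%sBSD", version);
	return snprintf(buf, size, "%sBSD-%c%s", version,
	    toupper((unsigned char)variant[0]), variant + 1);
}
```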


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.
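
The difference between the two layouts can be illustrated like this.
All type and member names are hypothetical and much simpler than the
real libmdoc structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical macro-specific data. */
struct bl_norm { int compact; };
struct bd_norm { int offset; };
struct an_norm { int split; };

/* Before: a union of pointers, one per macro family. */
union old_data {
	struct bl_norm	*bl;
	struct bd_norm	*bd;
	struct an_norm	*an;
};

/* After: one pointer to a union of structs. */
union new_data {
	struct bl_norm	 bl;
	struct bd_norm	 bd;
	struct an_norm	 an;
};

struct new_node {
	union new_data	*norm;
};

/* One allocation of sizeof(union new_data) serves every macro. */
static struct new_node *
new_node_alloc(void)
{
	struct new_node	*n;

	if ((n = calloc(1, sizeof(*n))) == NULL)
		return NULL;
	if ((n->norm = calloc(1, sizeof(*n->norm))) == NULL) {
		free(n);
		return NULL;
	}
	return n;
}

static void
new_node_free(struct new_node *n)
{
	assert(n != NULL);
	free(n->norm);
	free(n);
}
```

The memory overhead mentioned above comes from every node's data
being as large as the largest union member.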


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
This allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.
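
The three exclusion rules for hyphen breaks in revision 1.58 can be
sketched as a small predicate. This is only an illustration under stated
assumptions: the function name hyphen_is_breakable() is hypothetical and
is not taken from the mandoc sources, and the sketch ignores the
"free-form text only" condition, which depends on parser context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not mandoc's actual code: decide whether the
 * hyphen at word[i] is an acceptable line-break point according to
 * the three rules listed in the commit message.
 */
bool
hyphen_is_breakable(const char *word, size_t i)
{
	size_t len;

	assert(word != NULL);
	len = strlen(word);
	if (i >= len || word[i] != '-')
		return false;
	/* Rule 1: not at the beginning or end of the word. */
	if (i == 0 || i + 1 == len)
		return false;
	/* Rule 2: not immediately preceded or followed by a hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return false;
	/* Rule 3: not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return false;
	return true;
}
```

A caller would scan a word that overflows the line from right to left
and break at the last fitting index for which the predicate holds.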


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.
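
As a rough model of the three message levels described in revision 1.55,
the sketch below tracks the worst level seen and derives the two
consequences the message mentions: whether output can be produced and
whether the exit status is non-zero. All identifiers and the concrete
exit value 1 are assumptions made for illustration; mandoc's real enums
and exit codes may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; ordered from least to most severe. */
enum msg_level {
	LEVEL_WARNING,
	LEVEL_ERROR,
	LEVEL_FATAL
};

struct msg_state {
	enum msg_level	worst;	/* only meaningful if seen != 0 */
	int		seen;	/* non-zero once anything was reported */
};

/* Record one message, remembering the most severe level so far. */
void
msg_report(struct msg_state *st, enum msg_level level)
{
	assert(st != NULL);
	if (!st->seen || level > st->worst)
		st->worst = level;
	st->seen = 1;
}

/* FATAL means no output can be produced from this input file. */
int
msg_allows_output(const struct msg_state *st)
{
	assert(st != NULL);
	return !(st->seen && st->worst == LEVEL_FATAL);
}

/* WARNINGs alone do not cause an error exit; ERROR and FATAL do. */
int
msg_exit_status(const struct msg_state *st)
{
	assert(st != NULL);
	return (st->seen && st->worst >= LEVEL_ERROR) ? 1 : 0;
}
```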


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@
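
The "use the first, discard the second, and warn" policy of revision 1.46
might look roughly like the following sketch. The function name
first_list_type() and the way warnings are counted are hypothetical;
only the list type names are taken from mdoc(7). When no type is given,
the sketch returns NULL and leaves the choice of a default to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* List types of the mdoc(7) .Bl macro. */
static const char *const list_types[] = {
	"-bullet", "-column", "-dash", "-diag", "-enum", "-hang",
	"-hyphen", "-inset", "-item", "-ohang", "-tag"
};

/*
 * Hypothetical helper: return the first list type found in argv,
 * or NULL if there is none.  Every later list type is discarded
 * and counted in *ignored, so that the caller can warn about it.
 */
const char *
first_list_type(const char *const *argv, size_t argc, size_t *ignored)
{
	const char *first = NULL;
	size_t i, j;

	assert(argv != NULL && ignored != NULL);
	*ignored = 0;
	for (i = 0; i < argc; i++) {
		for (j = 0; j < sizeof(list_types) / sizeof(list_types[0]); j++) {
			if (strcmp(argv[i], list_types[j]) != 0)
				continue;
			if (first == NULL)
				first = argv[i];	/* keep the first */
			else
				(*ignored)++;		/* discard, but count */
			break;
		}
	}
	return first;
}
```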


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
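
The end-of-sentence rule of revision 1.43 amounts to a check on the last
character of an input line. The sketch below is a simplification with a
hypothetical function name; it does not distinguish text lines from
macro lines and does not strip trailing white space, both of which a
real parser would have to consider.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: report whether an input line (without its
 * trailing newline) ends in one of the characters '.', '!', or '?'.
 */
bool
line_ends_sentence(const char *line)
{
	size_t len;

	assert(line != NULL);
	len = strlen(line);
	/* len > 0 guarantees that the character passed on is not NUL. */
	return len > 0 && strchr(".!?", line[len - 1]) != NULL;
}
```

This also shows why "new sentence, new line" matters: a full stop in the
middle of an input line is never seen by such a check.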


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.295 13-Mar-2020 schwarze

Split tagging into a validation part including prioritization
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.

Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.

Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.

Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.

The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.
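
The idea of semantically transparent nodes from revision 1.293 can be
illustrated with a sibling list in which some nodes are skipped during
neighbour lookup. The struct layout and the function names below are
assumptions for the sake of the example and are not copied from the
mandoc sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down syntax tree node. */
struct node {
	const char	*name;		/* macro name, e.g. "Tg" */
	int		 transparent;	/* non-zero for .Sm, .Tg, .ft, ... */
	struct node	*prev;
	struct node	*next;
};

/* Return the following sibling, skipping transparent nodes. */
struct node *
next_significant(struct node *n)
{
	assert(n != NULL);
	for (n = n->next; n != NULL && n->transparent; n = n->next)
		continue;
	return n;
}

/* Return the preceding sibling, skipping transparent nodes. */
struct node *
prev_significant(struct node *n)
{
	assert(n != NULL);
	for (n = n->prev; n != NULL && n->transparent; n = n->prev)
		continue;
	return n;
}
```

With helpers of this kind, a validator that asks "what follows this
.Pp?" gets the same answer whether or not a .Tg or .Sm sits in between.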


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().
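
The contract described in revision 1.289, "always returns an allocated
string", can be sketched as follows. This is not the real
mandoc_normdate(): the names are hypothetical, no date parsing is done,
and the input is merely copied. The only point is that the caller never
sees NULL and can always call free(3) on the result.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocation wrapper that exits instead of returning NULL. */
char *
dup_or_die(const char *s)
{
	size_t sz;
	char *p;

	assert(s != NULL);
	sz = strlen(s) + 1;
	if ((p = malloc(sz)) == NULL) {
		perror("malloc");
		exit(EXIT_FAILURE);
	}
	memcpy(p, s, sz);
	return p;
}

/*
 * Hypothetical stand-in for a date normalizer: a missing argument
 * yields an allocated empty string, anything else an allocated copy.
 */
char *
normalize_date(const char *in)
{
	return dup_or_die(in == NULL ? "" : in);
}
```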


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
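
Translating obsolete aliases up front, as described in revision 1.278,
boils down to one lookup before any other validation runs. The mapping
below follows the commit message (Lp to Pp, Ot to Ft, LP and P to PP);
the function name and the use of strings instead of macro tokens are
assumptions made to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct alias {
	const char	*obsolete;
	const char	*standard;
};

static const struct alias aliases[] = {
	{ "Lp", "Pp" },		/* mdoc(7) */
	{ "Ot", "Ft" },		/* mdoc(7) */
	{ "LP", "PP" },		/* man(7) */
	{ "P",  "PP" }		/* man(7) */
};

/*
 * Hypothetical helper: return the standard form of an obsolete
 * macro alias, or the name itself if it is not an alias.
 */
const char *
canonical_macro(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++)
		if (strcmp(name, aliases[i].obsolete) == 0)
			return aliases[i].standard;
	return name;
}
```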


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
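
The problem fixed in revision 1.276 typically arises when a name table
is indexed by tokens that do not start at zero. The sketch below uses
made-up token values and names; it shows the safe variant, which
subtracts the offset from the index, next to a comment describing the
undefined variant, which subtracts it from the array pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token range that does not start at zero. */
enum tok {
	TOK_FIRST = 100,
	TOK_ALPHA = TOK_FIRST,
	TOK_BETA,
	TOK_GAMMA,
	TOK_MAX
};

static const char *const tok_names[TOK_MAX - TOK_FIRST] = {
	"alpha", "beta", "gamma"
};

/*
 * Undefined behaviour, even if the result is never dereferenced
 * out of range, because the pointer value itself lies before the
 * beginning of the array:
 *
 *	const char *const *biased = tok_names - TOK_FIRST;
 *	... biased[t] ...
 */

/* Well-defined: adjust the index and leave the pointer alone. */
const char *
tok_name(enum tok t)
{
	assert(t >= TOK_FIRST && t < TOK_MAX);
	return tok_names[t - TOK_FIRST];
}
```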


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
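
For illustration only, one possible linear time, zero space heuristic
of this kind is sketched below. It is not claimed to be the committed
code: the function name similar(), the edit types it recognizes, and
the distance limit of 3 are all assumptions.

```c
#include <string.h>

/*
 * Walk both strings once.  At each mismatch, guess the cheapest local
 * edit (transposition, replacement, or one extra character on either
 * side) by looking one character ahead.  No table is allocated.
 * Returns 1 if the strings look like typos of each other, else 0.
 */
static int
similar(const char *a, const char *b)
{
	const int maxdist = 3;	/* assumed limit */
	int dist = 0;

	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++dist > maxdist)
			return 0;
		if (a[0] == b[1] && a[1] == b[0]) {
			a += 2;		/* transposition */
			b += 2;
		} else if (a[1] == b[1]) {
			a++;		/* replacement */
			b++;
		} else if (a[0] == b[1])
			b++;		/* extra character in b */
		else if (a[1] == b[0])
			a++;		/* extra character in a */
		else
			return 0;
	}
	/* Whatever remains in either string counts as further edits. */
	dist += (int)(strlen(a) + strlen(b));
	return dist <= maxdist;
}
```

Unlike a full Levenshtein metric, such a scan can miss some close
matches because it never backtracks, which is the trade-off the log
entry accepts.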


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)
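
A defensive pattern for this class of bug might look like the sketch
below. The helper names and the "(none)" fallback text are
hypothetical; the point is only that a possibly NULL string is
replaced before it reaches a "%s" conversion.

```c
#include <stdio.h>

/* Return s, or the fallback if s is NULL. */
static const char *
str_or(const char *s, const char *fallback)
{
	return s != NULL ? s : fallback;
}

/* Format "macro arg" into buf without ever passing NULL to %s. */
static int
format_macro_arg(char *buf, size_t sz, const char *macro, const char *arg)
{
	return snprintf(buf, sz, "%s %s", macro, str_or(arg, "(none)"));
}
```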


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
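
The replacements mentioned above could look roughly like this sketch.
The struct is a hypothetical reduction with only the two link fields
needed here, and the helper names are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical minimal node: first child and next sibling only. */
struct node {
	struct node *child;
	struct node *next;
};

/* "nchild > 0" becomes one pointer check. */
static int
has_children(const struct node *n)
{
	return n->child != NULL;
}

/* "nchild == 1" becomes two pointer checks. */
static int
has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* An exact count, where really needed, becomes a tiny loop. */
static size_t
count_children(const struct node *n)
{
	const struct node *c;
	size_t cnt = 0;

	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Deriving the count from the links means there is no separate counter
that could get out of sync with the tree.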


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
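
The rewinding rule from this entry can be pictured with the following
sketch. It is a simplified reading of the description, not the parser
code: the struct, the NODE_* flag names, the NODE_CLOSED marker, and
rewind_from() are hypothetical stand-ins.

```c
#include <stddef.h>

#define NODE_BROKEN	(1 << 0)	/* stand-in for MDOC_BROKEN */
#define NODE_ENDED	(1 << 1)	/* stand-in for MDOC_ENDED */
#define NODE_CLOSED	(1 << 2)	/* hypothetical "rewound" marker */

struct node {
	struct node *parent;
	int flags;
};

/*
 * Starting at a broken block, walk up the parent chain.  Every node
 * that was already ended gets rewound; the walk stops after handling
 * the first node that is not itself broken.  Returns how many nodes
 * were rewound.
 */
static int
rewind_from(struct node *n)
{
	int closed = 0;

	for (; n != NULL; n = n->parent) {
		if (n->flags & NODE_ENDED) {
			n->flags |= NODE_CLOSED;
			closed++;
		}
		if ((n->flags & NODE_BROKEN) == 0)
			break;
	}
	return closed;
}
```

Because only the parent links and two flags are consulted, no separate
pending pointer has to be kept consistent.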


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
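
The idea of printing positions only when they are meaningful might be
sketched as follows. The function format_msg() is hypothetical, and
treating "line 0" as "no meaningful position" is an assumption based on
the /dev/null example above, not on the actual mandoc code.

```c
#include <stdio.h>

/*
 * Hypothetical formatter: include "line:col" only when the message
 * refers to a real input line (assumed here to mean line > 0).
 * Returns what snprintf(3) returns.
 */
int
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, msg);
}
```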


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
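
For readers unfamiliar with the overflow issue: malloc(n * size) can
silently allocate a short buffer when the multiplication wraps around.
The sketch below shows the kind of check reallocarray(3) performs. It is
a portable stand-in with the hypothetical name checked_reallocarray(),
not the libc implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3), for illustration only.
 * Refuses requests where nmemb * size would wrap around instead of
 * silently allocating a buffer that is too small.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;	/* the original allocation is left untouched */
	}
	return realloc(ptr, nmemb * size);
}
```

On failure the old pointer stays valid, so callers should assign the
result to a temporary before overwriting their only reference.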


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
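
A sketch of why asprintf(3) is less error prone: the length computation,
the allocation, and the formatting happen in one call, so they cannot
get out of sync. The helper describe_library() and its output format are
hypothetical and do not reproduce what post_lb() actually builds.

```c
#define _GNU_SOURCE	/* glibc may hide asprintf(3) without this */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: wrap a library name in a descriptive phrase.
 * Returns a malloc(3)ed string that the caller must free(3),
 * or NULL if memory is exhausted.
 */
char *
describe_library(const char *name)
{
	char	*buf;

	assert(name != NULL);
	if (asprintf(&buf, "library \"%s\"", name) == -1)
		return NULL;	/* buf is undefined on failure */
	return buf;
}
```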


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
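
The caching idea can be sketched like this. The function cached_uname()
is hypothetical. It uses function-scope static storage for brevity,
which is an assumption about the approach and not a copy of the
committed code.

```c
#include <stddef.h>
#include <sys/utsname.h>

/*
 * Hypothetical sketch: call uname(3) at most once per process and
 * hand out the stored result afterwards.
 * Returns NULL if uname(3) failed.
 */
const struct utsname *
cached_uname(void)
{
	static struct utsname	 uts;
	static int		 state;	/* 0 = not tried, 1 = ok, -1 = failed */

	if (state == 0)
		state = uname(&uts) == -1 ? -1 : 1;
	return state == 1 ? &uts : NULL;
}
```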


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format

of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
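
The parsing rule for the .nr "value" argument described above might look
roughly like this. The function nr_apply() is a hypothetical
illustration, not mandoc's parser, and it ignores integer overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical sketch: apply the "value" argument of ".nr name value"
 * to the old register value.  A leading '+' or '-' requests an
 * increment or decrement; parsing stops right before the first
 * non-digit character.  Overflow is not handled here.
 */
int
nr_apply(int oldval, const char *arg)
{
	int	 sign = 0, val = 0;

	assert(arg != NULL);
	if (*arg == '+' || *arg == '-') {
		sign = (*arg == '+') ? 1 : -1;
		arg++;
	}
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');
	return sign == 0 ? val : oldval + sign * val;
}
```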


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
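
The three .Bx rules above can be illustrated with a small formatter.
The function format_bx() is hypothetical, and placing the version
directly before "BSD" is an assumption about the output format that is
not stated in this commit message.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical sketch of the .Bx normalization: at most two
 * arguments, the second one capitalized and appended after a hyphen.
 * Returns what snprintf(3) returns.
 */
int
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, sz, "%sBSD", version);
	return snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```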


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .D1)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.
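
The three exclusions listed in this commit (hyphens at word boundaries,
adjacent hyphens, escaped hyphens) can be expressed as one predicate.
The function hyphen_breakable() is a hypothetical sketch and does not
cover the "free-form text only" condition, which depends on parser
state.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical predicate: may the output line be broken after the
 * hyphen at word[i]?  "len" is the length of the word.
 */
int
hyphen_breakable(const char *word, size_t len, size_t i)
{
	assert(word != NULL);

	if (i >= len || word[i] != '-')
		return 0;
	if (i == 0 || i + 1 == len)
		return 0;	/* at the beginning or end of the word */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;	/* preceded or followed by another hyphen */
	if (word[i - 1] == '\\')
		return 0;	/* escaped by a preceding backslash */
	return 1;
}
```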


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
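
The end-of-sentence rule described above can be sketched in C roughly as
follows. This is only an illustration under stated assumptions: the
function name line_ends_sentence() is hypothetical and is not taken from
the mandoc sources, and the real parser may handle more cases than a
plain check of the last character.

```c
#include <string.h>

/*
 * Hypothetical helper, not the mandoc implementation: report whether
 * an input line ends in one of the characters '.', '!' or '?'.
 */
static int
line_ends_sentence(const char *line)
{
	size_t	 len;

	len = strlen(line);
	if (len == 0)
		return 0;
	/* len > 0, so the last character is never the terminating NUL. */
	return strchr(".!?", line[len - 1]) != NULL;
}
```

A formatter could then emit two spaces instead of one before the next
word whenever such a check succeeds for the preceding line.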


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages
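
As a rough sketch of the string-table idea, the following C fragment
selects a message by indexing an array with an enum value instead of
going through a switch statement. All names (enum demo_err, demo_errmsg,
demo_strerr) and all message texts are hypothetical; the real enum merr
and its tables look different.

```c
/* Hypothetical error codes; the real enum merr has different members. */
enum demo_err {
	DEMO_ENOTEXT = 0,
	DEMO_EBADARG,
	DEMO_ETOOLONG,
	DEMO_EMAX
};

/* One message per code, indexed by the enum value. */
static const char *const demo_errmsg[DEMO_EMAX] = {
	"text not allowed here",	/* DEMO_ENOTEXT */
	"invalid argument",		/* DEMO_EBADARG */
	"line too long"			/* DEMO_ETOOLONG */
};

/* Look up the message; out-of-range codes get a generic string. */
static const char *
demo_strerr(enum demo_err code)
{
	if ((int)code < 0 || (int)code >= DEMO_EMAX)
		return "unknown error";
	return demo_errmsg[code];
}
```

Adding a new error then means adding one enum member and one table
entry, with no per-message code path to maintain.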


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.294 27-Feb-2020 schwarze

Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output
and this improves HTML output by putting the id= attribute
and <a> element into the respective <h1> or <h2> element rather
than writing an additional <mark> element.

To that end, introduce node flags NODE_ID (to make the node a link
target, for example by writing an HTML id= attribute or by calling
tag_put()) and NODE_HREF (to make the node a link source, used only
in HTML output, used only to write an <a class="permalink"> element).

In particular:
* In the validator, generalize the concept of the "next node"
such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain
if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags
for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation
even when they aren't text nodes. Use this for explicitly tagged
section headers. Surprisingly, this is sufficient to make HTML
output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.
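
A minimal sketch of how skipping transparent nodes might look, assuming
a simplified node type. The names struct tnode, DEMO_TRANSPARENT and
prev_opaque() are hypothetical and do not appear in mandoc; only the
idea of skipping flagged nodes comes from the description above.

```c
#include <stddef.h>

#define DEMO_TRANSPARENT (1 << 0)	/* hypothetical flag bit */

/* Hypothetical, heavily simplified syntax tree node. */
struct tnode {
	struct tnode	*prev;		/* previous sibling, or NULL */
	int		 flags;
};

/* Return the closest previous sibling that is not transparent. */
static struct tnode *
prev_opaque(const struct tnode *n)
{
	struct tnode	*p;

	for (p = n->prev; p != NULL; p = p->prev)
		if ((p->flags & DEMO_TRANSPARENT) == 0)
			return p;
	return NULL;
}
```

A validator that asks "what precedes this macro?" would call such a
helper rather than looking at the prev pointer directly.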

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
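
The class of problem can be illustrated with a small C sketch. The
token names and the table below are hypothetical and only stand in for
a table that covers a range of enum values not starting at zero.

```c
/* Hypothetical token range: the table only covers tokens >= TOK_FIRST. */
enum tok { TOK_FIRST = 100, TOK_A = 100, TOK_B, TOK_C, TOK_MAX };

static const char *const tok_names[TOK_MAX - TOK_FIRST] = {
	"A", "B", "C"
};

/*
 * Undefined even if the shifted pointer is never dereferenced:
 *	static const char *const *names = tok_names - TOK_FIRST;
 *	... names[tok] ...
 * Well-defined alternative: subtract the offset from the index.
 */
static const char *
tok_name(enum tok tok)
{
	if ((int)tok < TOK_FIRST || (int)tok >= TOK_MAX)
		return "(unknown)";
	return tok_names[tok - TOK_FIRST];
}
```

The lookup costs one subtraction per access, and no pointer outside the
array (other than one past the end) is ever formed.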


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
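
For illustration only, a greedy single-pass comparison in this spirit
could look like the sketch below. It is an assumption about the general
approach, not the heuristic actually committed; the function name
roughly_equal() and the maxdist parameter are hypothetical.

```c
#include <string.h>

/*
 * Hypothetical sketch: return 1 if s1 and s2 differ by at most
 * maxdist single-character substitutions, insertions or deletions
 * as seen by a greedy left-to-right scan, 0 otherwise.
 * No buffers are allocated and each string is walked once.
 */
static int
roughly_equal(const char *s1, const char *s2, int maxdist)
{
	int	 dist = 0;

	while (*s1 != '\0' && *s2 != '\0') {
		if (*s1 != *s2) {
			if (++dist > maxdist)
				return 0;
			if (s1[1] == s2[1]) {
				/* substitution: both advance below */
			} else if (s1[0] == s2[1]) {
				s2++;	/* extra character in s2 */
			} else if (s1[1] == s2[0]) {
				s1++;	/* extra character in s1 */
			} else
				return 0;
		}
		s1++;
		s2++;
	}
	/* Whatever is left over in either string counts as distance. */
	dist += (int)(strlen(s1) + strlen(s2));
	return dist <= maxdist;
}
```

Unlike a full Levenshtein computation, a greedy scan like this can miss
some near matches, which is the price of needing no table.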


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).
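
Passing a null pointer for a %s conversion is undefined behaviour in
standard C. One generic way to guard against it is sketched below; the
helpers str_or() and format_arg() are hypothetical and are not the fix
that was committed.

```c
#include <stdio.h>

/* Hypothetical helper: substitute a placeholder for a null string. */
static const char *
str_or(const char *s, const char *fallback)
{
	return s != NULL ? s : fallback;
}

/* Format "macro argument" into buf without ever passing NULL to %s. */
static int
format_arg(char *buf, size_t bufsz, const char *macro, const char *arg)
{
	return snprintf(buf, bufsz, "%s %s", macro, str_or(arg, "(none)"));
}
```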


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
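
The replacement of a stored child count by pointer checks can be
sketched as follows. The struct below is a simplified, hypothetical
stand-in (the real struct roff_node has many more members), and the
function names are made up for illustration:

```c
#include <stddef.h>

/* Hypothetical, simplified node; not the real struct roff_node. */
struct node {
	struct node *child;	/* first child, or NULL */
	struct node *next;	/* next sibling, or NULL */
};

/* "No children" needs one pointer check. */
int
has_no_child(const struct node *n)
{
	return n->child == NULL;
}

/* "Exactly one child" needs two pointer checks. */
int
has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* An actual count needs a tiny loop over the sibling list. */
int
count_children(const struct node *n)
{
	const struct node *c;
	int count = 0;

	for (c = n->child; c != NULL; c = c->next)
		count++;
	return count;
}
```

Because nothing is cached, there is no counter that could get out of
sync with the sibling list, which is the implicit invariant the commit
message refers to.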


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
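
To illustrate the comment policy described above, here is a small
sketch with a hypothetical token classifier (the enum and function are
invented for this example and do not appear in mandoc):

```c
/* Hypothetical token classifier, only to illustrate the comment policy. */
enum tok { TOK_A, TOK_B, TOK_C, TOK_D };

int
classify(enum tok t)
{
	int score = 0;

	switch (t) {
	case TOK_A:	/* adjacent labels: no comment needed */
	case TOK_B:
		score = 1;
		break;
	case TOK_C:
		score = 10;	/* code runs before falling through */
		/* FALLTHROUGH */
	case TOK_D:
		score += 2;
		break;
	}
	return score;
}
```

Only the TOK_C case keeps the comment, because it executes a statement
before control reaches the next label.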


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
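
The rewinding rule described in this entry can be sketched roughly as
below. This is one reading of the commit message, not the parser's
actual code: the struct, the BLK_* flags (standing in for MDOC_BROKEN
and MDOC_ENDED), the BLK_CLOSED marker and the choice to start at the
parent of the broken block are all assumptions made for illustration.

```c
#include <stddef.h>

#define BLK_BROKEN	(1 << 0)	/* stands in for MDOC_BROKEN */
#define BLK_ENDED	(1 << 1)	/* stands in for MDOC_ENDED */
#define BLK_CLOSED	(1 << 2)	/* hypothetical: set once rewound */

/* Hypothetical, simplified block node. */
struct blk {
	struct blk *parent;
	int flags;
};

/*
 * Starting at a broken block, walk up the ancestors, close every one
 * that is marked ENDED and stop after the first that is not BROKEN.
 * Returns the number of ancestors closed.
 */
int
rewind_pending(struct blk *broken)
{
	struct blk *n;
	int closed = 0;

	for (n = broken->parent; n != NULL; n = n->parent) {
		if (n->flags & BLK_ENDED) {
			n->flags |= BLK_CLOSED;
			closed++;
		}
		if ((n->flags & BLK_BROKEN) == 0)
			break;
	}
	return closed;
}
```

The point of the design is that two flag bits per node suffice: no
separate pointer has to be kept consistent while blocks are opened and
closed out of order.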


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailboxd dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
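
A minimal sketch of the resulting .Sm behaviour, combining this entry
with the invalid-argument handling from rev. 1.146. The function name
sm_next() and its calling convention are hypothetical:

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the new spacing mode (1 = on, 0 = off) given the current
 * mode and the argument of .Sm; arg == NULL means "no argument".
 */
int
sm_next(int current, const char *arg)
{
	if (arg == NULL)
		return !current;	/* toggle, as groff does */
	if (strcmp(arg, "on") == 0)
		return 1;
	if (strcmp(arg, "off") == 0)
		return 0;
	return !current;	/* invalid argument: toggle (rev. 1.146) */
}
```

Moving an invalid argument out of the macro, as rev. 1.146 also does,
is a syntax tree operation and is not shown here.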


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
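
The first point of this entry, printing a position only when it is
meaningful, might look like the following sketch. The function
format_msg() and the convention that line 0 means "no position" are
assumptions for illustration, not mandoc's actual interface:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical formatter: line == 0 means "no meaningful position". */
int
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, msg);
}
```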


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
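
The overflow concern behind the switch to reallocarray(3) is that
nmemb * size can wrap around and silently allocate too little. Since
reallocarray(3) is not available on every platform, the sketch below
uses a hypothetical stand-in, checked_reallocarray(), to show the kind
of check involved; it is not mandoc code:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): refuse the request when
 * nmemb * size would overflow size_t instead of allocating too little.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

As with realloc(3), the original pointer remains valid when the call
fails, so the caller still owns and must free it.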


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
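
The hazard with strlen+malloc+snprintf is that the length computation
and the format string are maintained separately and can drift apart.
asprintf(3) measures and allocates in one step. Because asprintf(3) is
a BSD/GNU extension, the sketch below shows the same idea with a
hypothetical portable helper, fmt_alloc(), built on vsnprintf(3); it is
not the code of post_lb():

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical portable stand-in for asprintf(3): measure with
 * vsnprintf, allocate exactly, then format.  Returns a malloc'd
 * string the caller must free, or NULL on failure.
 */
char *
fmt_alloc(const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return NULL;

	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return NULL;

	va_start(ap, fmt);
	(void)vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return buf;
}
```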


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
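
A rough sketch of the caching idea, assuming a function-scope static
buffer (rev. 1.305 later moved this state into a struct so that it can
be freed). The function name default_os(), the buffer size and the
"sysname release" format are assumptions for illustration:

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: build "sysname release" once and reuse it.
 * Falls back to "UNKNOWN" if uname(3) fails.
 */
const char *
default_os(void)
{
	static char cache[520];
	struct utsname uts;

	if (cache[0] != '\0')
		return cache;	/* already computed for an earlier manual */

	if (uname(&uts) == -1)
		snprintf(cache, sizeof(cache), "UNKNOWN");
	else
		snprintf(cache, sizeof(cache), "%s %s",
		    uts.sysname, uts.release);
	return cache;
}
```

Every call after the first returns the cached string without another
uname(3) call, which is where the saving comes from when many manuals
are processed in one run.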


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
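
The parsing rule for the .nr "value" argument described above can be
sketched like this. nr_apply() is a hypothetical helper, not the code
in roff.c, and it ignores integer overflow for brevity:

```c
#include <ctype.h>

/*
 * Hypothetical helper: compute the new register value from the old
 * one and the "value" argument of .nr.  Digits are consumed up to
 * the first non-digit; a leading '+' or '-' requests an increment
 * or decrement of the old value instead of an assignment.
 */
int
nr_apply(int old, const char *arg)
{
	int sign = 0, val = 0;

	if (*arg == '+' || *arg == '-')
		sign = (*arg++ == '+') ? 1 : -1;

	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');

	return sign == 0 ? val : old + sign * val;
}
```

For example, with an old value of 5, the argument "12abc" assigns 12,
while "+3" yields 8 and "-2" yields 3.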


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
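
A rough sketch of the resulting .Bx text, assuming the macro name expands to "BSD" between the two arguments. The helper format_bx() is hypothetical; the real validator rewrites syntax tree nodes rather than printing into a buffer.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the text for ".Bx version variant":
 * the first character of the variant is converted to uppercase
 * and the variant is appended with a hyphen.
 */
static void
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	if (version == NULL)
		version = "";
	if (variant == NULL || *variant == '\0') {
		snprintf(buf, sz, "%sBSD", version);
		return;
	}
	snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```

With this sketch, the arguments "4.3" and "tahoe" yield the string "4.3BSD-Tahoe".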


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.
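
The difference between the two layouts can be sketched like this. All struct and member names below are hypothetical simplifications, not the real libmdoc declarations.

```c
#include <stdlib.h>

/* Two stand-ins for macro-specific data. */
struct bd_data { int type; int comp; };
struct bl_data { int type; int comp; size_t ncols; };

/* Before: a union of pointers, each macro allocating its own struct. */
struct node_old {
	union {
		struct bd_data *Bd;
		struct bl_data *Bl;
	} data;
};

/* After: one pointer to a union of structs. */
union node_data {
	struct bd_data Bd;
	struct bl_data Bl;
};

struct node_new {
	union node_data *norm;
};

/* A single allocation covers every macro, sized for the largest member. */
static struct node_new *
node_new_alloc(void)
{
	struct node_new *n;

	if ((n = calloc(1, sizeof(*n))) == NULL)
		return NULL;
	if ((n->norm = calloc(1, sizeof(*n->norm))) == NULL) {
		free(n);
		return NULL;
	}
	return n;
}

static void
node_new_free(struct node_new *n)
{
	if (n == NULL)
		return;
	free(n->norm);
	free(n);
}
```

Allocation and release no longer depend on the macro type, which is where the simplification comes from; the memory overhead is that small members are padded to the size of the largest one.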


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (no groff prints them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
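
The three exclusions listed above can be expressed as a small predicate. This is a sketch under the assumption that the word is available as a plain string; the name hyphen_breakable() is hypothetical, and the restriction to free-form text is not modelled here.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if word[i] is a hyphen at which a line break is permitted. */
static int
hyphen_breakable(const char *word, size_t i)
{
	size_t len = strlen(word);

	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

A formatter would scan a word that overflows the line and break at the last position that fits and for which the predicate returns 1.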

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.
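
How the three levels relate to output and exit status can be sketched as follows. The enum, struct, and function names are hypothetical, and the concrete exit code 1 is an assumption; only the zero versus non-zero distinction comes from the description above.

```c
/* Hypothetical names for the three message levels, ordered by severity. */
enum msg_level { LVL_OK = 0, LVL_WARNING, LVL_ERROR, LVL_FATAL };

struct msg_state {
	enum msg_level worst;	/* most severe level seen so far */
};

/* Record a message, remembering only the most severe level. */
static void
msg_report(struct msg_state *st, enum msg_level lvl)
{
	if (lvl > st->worst)
		st->worst = lvl;
}

/* FATAL means no output can be produced from this input file. */
static int
can_produce_output(const struct msg_state *st)
{
	return st->worst < LVL_FATAL;
}

/* WARNINGs alone do not cause an error exit; ERROR and FATAL do. */
static int
exit_status(const struct msg_state *st)
{
	return st->worst >= LVL_ERROR ? 1 : 0;
}
```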

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@
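
The "use the first, discard the second, and warn" policy can be sketched as below. The function pick_list_type() is hypothetical, and the sketch is simplified: it scans plain words and does not skip the values of -width or -offset.

```c
#include <stddef.h>
#include <string.h>

/* The .Bl list types of mdoc(7). */
static const char *const list_types[] = {
	"-bullet", "-column", "-dash", "-diag", "-enum", "-hang",
	"-hyphen", "-inset", "-item", "-ohang", "-tag"
};

static int
is_list_type(const char *arg)
{
	size_t i;

	for (i = 0; i < sizeof(list_types) / sizeof(list_types[0]); i++)
		if (strcmp(arg, list_types[i]) == 0)
			return 1;
	return 0;
}

/*
 * Return the first list type found in argv, or NULL if there is none.
 * Every later list type is discarded and counted in *nwarn,
 * so the caller can issue one warning per discarded type.
 */
static const char *
pick_list_type(const char **argv, size_t argc, size_t *nwarn)
{
	const char *first = NULL;
	size_t i;

	*nwarn = 0;
	for (i = 0; i < argc; i++) {
		if (!is_list_type(argv[i]))
			continue;
		if (first == NULL)
			first = argv[i];
		else
			(*nwarn)++;
	}
	return first;
}
```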


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
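
The detection rule amounts to checking the last character of the line. A minimal sketch, with the hypothetical name ends_sentence(); it deliberately ignores refinements such as trailing closing delimiters.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the input line ends in one of ".!?", else 0. */
static int
ends_sentence(const char *line)
{
	size_t len = strlen(line);

	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

A terminal formatter could then emit two spaces instead of one before the next word whenever the previous line satisfied this check.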


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.293 27-Feb-2020 schwarze

Introduce the concept of nodes that are semantically transparent:
they are skipped when looking for previous or following high-level
macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
and .Tg, and man(7) .DT and .PD. Use this concept for a variety
of improved decisions in various validators and formatters.
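
The idea of skipping transparent nodes can be sketched with a simplified sibling list. The struct and the function names next_opaque() and prev_opaque() are hypothetical and much smaller than the real node structure.

```c
#include <stddef.h>

/* Hypothetical, simplified syntax tree node. */
struct node {
	struct node *prev;
	struct node *next;
	int transparent;	/* set for nodes like .Sm, .Tg, .ft, .PD */
	const char *name;
};

/* Return the following sibling, skipping transparent nodes. */
static struct node *
next_opaque(struct node *n)
{
	for (n = n->next; n != NULL && n->transparent; n = n->next)
		continue;
	return n;
}

/* Return the preceding sibling, skipping transparent nodes. */
static struct node *
prev_opaque(struct node *n)
{
	for (n = n->prev; n != NULL && n->transparent; n = n->prev)
		continue;
	return n;
}
```

A validator that asks "is this .Pp directly followed by a .Bl?" would then get the same answer whether or not a .Sm or .Tg line sits between the two.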

While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.

I found this class of issues while considering .Tg patches from kn@.


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.
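
For illustration, a sketch of the direct __attribute__ spelling in
GCC/Clang syntax. The functions die() and checked_div() are
hypothetical examples, not mandoc code:

```c
#include <stdio.h>
#include <stdlib.h>

/* GCC/Clang syntax; other compilers may need it defined away. */
static void	 die(const char *) __attribute__((__noreturn__));

static void
die(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	abort();
}

/* Hypothetical caller: no return statement is needed after die(). */
static int
checked_div(int a, int b)
{
	if (b != 0)
		return a / b;
	die("division by zero");
}
```

On a platform without the attribute, a build system can define
__attribute__(x) to nothing, which is the kind of fallback the commit
message says ./configure already provides.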


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
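
A sketch of such an up-front translation, using the alias pairs named
above. The enum and its token names are hypothetical stand-ins for
the real token table:

```c
/* Hypothetical token names; the real macro tables are much larger. */
enum tok { T_Pp, T_Lp, T_Ft, T_Ot, T_PP, T_LP, T_P, T_OTHER };

/* Map each obsolete alias to its standard form, leave the rest alone. */
static enum tok
canonical(enum tok t)
{
	switch (t) {
	case T_Lp:
		return T_Pp;
	case T_Ot:
		return T_Ft;
	case T_LP:
	case T_P:
		return T_PP;
	default:
		return t;
	}
}
```

Code running after this step only has to test for T_Pp, T_Ft, and
T_PP.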


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
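
To illustrate the class of problem with a hypothetical table (the
names below are not from mandoc): when a table covers only a
sub-range of token values, the offset belongs in the index, not in
the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token range: the table covers TOK_FIRST..TOK_LAST. */
enum tok { TOK_FIRST = 10, TOK_A = 10, TOK_B, TOK_C, TOK_LAST = TOK_C };

static const char *const tok_names[] = { "A", "B", "C" };

/*
 * Undefined even if the result is never dereferenced:
 *	names = tok_names - TOK_FIRST;
 * Well-defined alternative: subtract from the index instead.
 */
static const char *
tok_name(enum tok t)
{
	assert(t >= TOK_FIRST && t <= TOK_LAST);
	return tok_names[t - TOK_FIRST];
}
```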


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
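
The following is only a sketch of what a linear time, zero space
heuristic of this kind can look like. It is an assumption for
illustration, not the function committed here, and the name
roughly_equal() is made up. It scans both strings once from left to
right and greedily classifies each mismatch by looking one character
ahead:

```c
#include <string.h>

/*
 * Return 1 if a greedy left-to-right scan finds at most maxdist
 * single-character substitutions, insertions, or deletions
 * between a and b, and 0 otherwise.  Uses no extra memory.
 */
static int
roughly_equal(const char *a, const char *b, int maxdist)
{
	int	 dist = 0;

	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++dist > maxdist)
			return 0;
		if (a[1] == b[1]) {		/* substitution */
			a++;
			b++;
		} else if (a[0] == b[1])	/* extra character in b */
			b++;
		else if (a[1] == b[0])		/* extra character in a */
			a++;
		else
			return 0;
	}
	/* Whatever is left over on either side counts as edits. */
	dist += (int)(strlen(a) + strlen(b));
	return dist <= maxdist;
}
```

Unlike a full Levenshtein computation, such a scan can miss some
close matches because it never backtracks, which is the price for
needing no table.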


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@
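
Passing a null pointer for %s is undefined behaviour in C. A sketch
of one way to guard against it, with hypothetical helper names that
are not taken from mandoc:

```c
#include <stdio.h>

/* Hypothetical helper: substitute a fallback for a missing string. */
static const char *
str_or(const char *s, const char *fallback)
{
	return s != NULL ? s : fallback;
}

/* Format a message; name may be NULL, for example for an empty macro. */
static int
format_name(char *buf, size_t sz, const char *name)
{
	return snprintf(buf, sz, "macro %s", str_or(name, "(none)"));
}
```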


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
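
A sketch of the three replacement patterns, on a simplified
hypothetical node type that keeps only the two links needed here:

```c
#include <stddef.h>

/* Simplified stand-in for a syntax tree node. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* Formerly "nchild > 0": one pointer check. */
static int
has_children(const struct node *n)
{
	return n->child != NULL;
}

/* Formerly "nchild == 1": two pointer checks. */
static int
has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* Only where the actual number matters: a tiny loop. */
static int
count_children(const struct node *n)
{
	const struct node	*c;
	int			 cnt = 0;

	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

With no stored counter, there is nothing that can get out of sync
with the linked list.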


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, which allows getting rid of
enum check_lvl and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
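
A minimal sketch of the toggle described above. The helper name
sm_next() and the 1/0 encoding of the spacing mode are assumptions
made for illustration, not the code in mdoc_validate.c; treating any
unrecognized argument like a missing one is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the new spacing mode (1 = on, 0 = off)
 * given the current mode and the argument passed to .Sm.
 * arg is NULL when .Sm is called without an argument.
 */
int
sm_next(int current, const char *arg)
{
	assert(current == 0 || current == 1);

	if (arg != NULL && strcmp(arg, "on") == 0)
		return 1;
	if (arg != NULL && strcmp(arg, "off") == 0)
		return 0;

	/* No (or, by assumption, unrecognized) argument: toggle. */
	return !current;
}
```

With spacing on, sm_next(1, NULL) yields 0, and calling it again
with the result yields 1.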


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
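
For illustration, the overflow check that makes reallocarray(3)
safer than realloc(p, n * size) can be sketched in portable C as
follows. The name checked_reallocarray() is hypothetical; this is a
stand-in written for this note, not the libc implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): refuse the request
 * when nmemb * size would wrap around instead of silently
 * allocating a buffer that is too small.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (nmemb != 0 && size > SIZE_MAX / nmemb) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

A caller growing an array of n elements passes n and the element
size separately and only has to check for NULL.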


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
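
A sketch of why the formatting-allocator pattern is less error
prone: the length is computed by the formatter itself, so it cannot
disagree with the format string. Since asprintf(3) is not part of
ISO C, this uses two snprintf(3) calls as a portable stand-in; the
function name lib_label() and the output text are assumptions, not
the actual post_lb() code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical example: build a freshly allocated label string.
 * The first snprintf() only measures, the second one writes.
 * Returns NULL on failure; the caller frees the result.
 */
char *
lib_label(const char *name)
{
	char	*buf;
	int	 len;

	assert(name != NULL);

	len = snprintf(NULL, 0, "library \"%s\"", name);
	if (len < 0)
		return NULL;
	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return NULL;
	(void)snprintf(buf, (size_t)len + 1, "library \"%s\"", name);
	return buf;
}
```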


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
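
The caching idea can be sketched as follows, assuming a
hypothetical accessor default_os() with a function-scope static
buffer. The counter uname_calls exists only to make the effect
visible, and the "UNKNOWN" fallback is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

static int	 uname_calls;	/* instrumentation for this sketch only */

/*
 * Hypothetical accessor: call uname(3) at most once per process
 * and hand out the cached "sysname release" string afterwards.
 */
const char *
default_os(void)
{
	static char	 buf[1024];
	static int	 cached;
	struct utsname	 uts;
	int		 n;

	if (cached)
		return buf;

	uname_calls++;
	if (uname(&uts) == -1)
		n = snprintf(buf, sizeof(buf), "UNKNOWN");
	else
		n = snprintf(buf, sizeof(buf), "%s %s",
		    uts.sysname, uts.release);
	assert(n >= 0);
	cached = 1;
	return buf;
}
```

Every call after the first returns the same pointer without
entering the kernel again.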


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
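The three exclusions above can be sketched as a small predicate. This is only an illustration under stated assumptions: hyphen_breakable() is a hypothetical name, not a function from the mandoc sources, and the sketch ignores the case of a backslash that is itself escaped.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of mandoc.
 * Return 1 if the hyphen at word[i] is an acceptable break point
 * under the three rules listed above, 0 otherwise.
 */
int
hyphen_breakable(const char *word, size_t len, size_t i)
{
	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

For example, the hyphen in "well-known" qualifies, while the ones in "-flag", "a--b", and "a\-b" do not.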

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@'s end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
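The detection rule described here (a line ending in one of ".!?") can be sketched as follows. line_ends_sentence() is a hypothetical name chosen for illustration; the real parser code is not shown in this log and may treat trailing delimiters or whitespace differently.

```c
#include <string.h>

/*
 * Hypothetical helper, not part of mandoc.
 * Return 1 if the input line ends in '.', '!' or '?', 0 otherwise.
 */
int
line_ends_sentence(const char *line)
{
	size_t len = strlen(line);

	if (len == 0)
		return 0;
	/* line[len - 1] is never '\0' here, so strchr() only
	 * matches one of the three sentence-ending characters. */
	return strchr(".!?", line[len - 1]) != NULL;
}
```

A frontend could then emit two spaces instead of one before the next word whenever this predicate was true for the preceding line.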


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.292 19-Jan-2020 schwarze

Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.

Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.

Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.


# 1.291 19-Jan-2020 schwarze

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.


Revision tags: OPENBSD_6_6_BASE
# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
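A minimal sketch of such an up-front translation, assuming the pairing Lp -> Pp, Ot -> Ft, LP -> PP, and P -> PP. The validators work on token enums rather than strings, so canonical_macro() and the table below are hypothetical and only illustrate the idea.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table; the pairing is an assumption. */
struct macro_alias {
	const char *old_name;
	const char *std_name;
};

static const struct macro_alias aliases[] = {
	{ "Lp", "Pp" },
	{ "Ot", "Ft" },
	{ "LP", "PP" },
	{ "P",  "PP" },
};

/*
 * Return the standard form of an obsolete alias,
 * or the name itself if it is not an alias.
 */
const char *
canonical_macro(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++)
		if (strcmp(name, aliases[i].old_name) == 0)
			return aliases[i].std_name;
	return name;
}
```

Once every node has passed through such a step, later checks only need to compare against the standard names.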


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
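The pattern at issue can be illustrated with a lookup table whose first valid index is not zero. The names below (enum tok, tok_name) are hypothetical and not taken from mandoc; the point is only that the offset is applied to the index rather than to the array pointer.

```c
#include <stddef.h>

/* Hypothetical token range that does not start at zero. */
enum tok { TOK_FIRST = 10, TOK_A = 10, TOK_B, TOK_C, TOK_MAX };

static const char *const names_storage[TOK_MAX - TOK_FIRST] = {
	"A", "B", "C"
};

/*
 * Undefined behaviour even if the result is never dereferenced,
 * because the pointer lies before the start of the array:
 *
 *	static const char *const *names = names_storage - TOK_FIRST;
 *
 * Subtracting the offset from the index instead stays in bounds.
 */
const char *
tok_name(enum tok t)
{
	if (t < TOK_FIRST || t >= TOK_MAX)
		return NULL;
	return names_storage[t - TOK_FIRST];
}
```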


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
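To give an idea of what a linear time, constant space comparison can look like, here is a rough sketch. It is not claimed to be the heuristic committed here: roughly_equal() and the threshold of two differences are assumptions made for illustration.

```c
#include <string.h>

/*
 * Hypothetical sketch, not the mandoc implementation.
 * Walk both strings once; tolerate up to maxdist single-character
 * substitutions, insertions, or deletions.
 */
int
roughly_equal(const char *s1, const char *s2)
{
	const int maxdist = 2;	/* assumed threshold */
	int dist = 0;

	while (*s1 != '\0' && *s2 != '\0') {
		if (*s1 != *s2) {
			if (++dist > maxdist)
				return 0;
			if (s1[1] == s2[1])
				;		/* one character replaced */
			else if (s1[0] == s2[1])
				s2++;		/* extra character in s2 */
			else if (s1[1] == s2[0])
				s1++;		/* extra character in s1 */
			else
				return 0;
		}
		s1++;
		s2++;
	}
	/* Whatever is left over on either side counts as differences. */
	dist += (int)(strlen(s1) + strlen(s2));
	return dist <= maxdist;
}
```

With this sketch, "DESCRIPTON" and "SYNOPSYS" are recognized as near misses of "DESCRIPTION" and "SYNOPSIS", while "NAME" is not considered similar to "SYNOPSIS". Unlike a Levenshtein matrix, the loop needs no auxiliary storage and visits each character at most once.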


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.
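
As a rough illustration of how such flags can be used, here is a minimal sketch of a formatter loop that skips nodes marked as non-printing. The struct layout, the flag values, and the function print_nodes() are assumptions made for this example; the real definitions live in the mandoc sources and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values; the real definitions may differ. */
#define NODE_NOSRC (1 << 0)  /* node was generated, not present in the source */
#define NODE_NOPRT (1 << 1)  /* node shall not produce output */

/* Reduced stand-in for an AST node. */
struct node {
	const char  *text;   /* text carried by the node */
	int          flags;  /* NODE_* flags */
	struct node *next;   /* next sibling */
};

/*
 * Concatenate the text of all printable siblings into buf,
 * separated by single spaces, and return the number of nodes printed.
 */
static int
print_nodes(const struct node *n, char *buf, size_t bufsz)
{
	int printed = 0;

	assert(buf != NULL && bufsz > 0);
	buf[0] = '\0';
	for (; n != NULL; n = n->next) {
		if (n->flags & NODE_NOPRT)
			continue;  /* e.g. a prologue macro such as .Dd */
		if (printed > 0)
			strncat(buf, " ", bufsz - strlen(buf) - 1);
		strncat(buf, n->text, bufsz - strlen(buf) - 1);
		printed++;
	}
	return printed;
}
```

In this sketch a generated node (NODE_NOSRC) is printed like any other; only NODE_NOPRT suppresses output.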


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd
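
The rules above can be sketched as a single pass over the children of the NAME section. This is only an illustration under simplifying assumptions: the node kinds (K_NM and so on) and the function check_name() are hypothetical, and the real validator works on syntax tree nodes rather than on a flat array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the children of the NAME section. */
enum kind { K_NM, K_NM_EMPTY, K_COMMA, K_ND, K_TEXT };

/* Return the number of rule violations found in the sequence. */
static int
check_name(const enum kind *k, size_t n)
{
	int problems = 0, seen_nd = 0, need_comma = 0;
	size_t i;

	assert(k != NULL || n == 0);
	for (i = 0; i < n; i++) {
		switch (k[i]) {
		case K_NM:
			if (seen_nd || need_comma)
				problems++;  /* Nm after Nd, or missing comma */
			need_comma = 1;
			break;
		case K_NM_EMPTY:
			problems++;      /* empty Nm anywhere below NAME */
			break;
		case K_COMMA:
			need_comma = 0;
			break;
		case K_ND:
			seen_nd = 1;
			break;
		case K_TEXT:
			problems++;      /* any text other than the comma */
			break;
		}
	}
	return problems;
}
```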


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@
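
Passing a NULL pointer for a %s conversion is undefined behaviour in C, which is why this entry and the afl(1) fixes in revs. 1.220 and 1.221 matter. A minimal sketch of one way to guard against it follows; the helpers str_or() and format_msg() and the "(none)" placeholder are invented for this example and are not taken from the mandoc sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: substitute a visible placeholder for NULL. */
static const char *
str_or(const char *s, const char *fallback)
{
	assert(fallback != NULL);
	return s != NULL ? s : fallback;
}

/* Format a diagnostic about a macro whose argument may be missing. */
static int
format_msg(char *buf, size_t bufsz, const char *macro, const char *arg)
{
	return snprintf(buf, bufsz, "%s %s",
	    str_or(macro, "(none)"), str_or(arg, "(none)"));
}
```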


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
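
The three replacement patterns mentioned above might look roughly like this. The struct is a reduced stand-in with only the links needed here, and the helper names are made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for struct roff_node: only the links used here. */
struct node {
	struct node *child;  /* first child, or NULL */
	struct node *next;   /* next sibling, or NULL */
};

/* Formerly "nchild == 0": one pointer check. */
static int
has_no_child(const struct node *n)
{
	assert(n != NULL);
	return n->child == NULL;
}

/* Formerly "nchild == 1": two pointer checks. */
static int
has_one_child(const struct node *n)
{
	assert(n != NULL);
	return n->child != NULL && n->child->next == NULL;
}

/* The rare case that needs the actual number: a tiny loop. */
static int
count_children(const struct node *n)
{
	const struct node *c;
	int cnt = 0;

	assert(n != NULL);
	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Deriving the count from the links means there is no separate counter that has to be kept in sync with the tree.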


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
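
A small sketch of the resulting convention, with invented token names and return values: adjacent case labels carry no comment, while a case that runs code before continuing into the next one keeps its marker.

```c
#include <assert.h>

/* Hypothetical token codes, for illustration only. */
enum tok { TOK_PP, TOK_LP, TOK_SH, TOK_SS, TOK_OTHER };

/* Return an arbitrary class code for a token. */
static int
classify(enum tok t)
{
	int cls = 0;

	assert(t <= TOK_OTHER);
	switch (t) {
	case TOK_PP:
	case TOK_LP:           /* adjacent labels: no comment needed */
		return 1;
	case TOK_SH:
		cls += 10;     /* code runs before falling through */
		/* FALLTHROUGH */
	case TOK_SS:
		cls += 2;
		break;
	default:
		break;
	}
	return cls;
}
```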


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.
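
A read before the beginning of a table typically happens when a negative "no macro" token code is used as an index. The following sketch shows the kind of guard that avoids it; the token codes, the value -1 for TOKEN_NONE, and the table contents are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; TOKEN_NONE marks "no macro", e.g. a text node. */
enum { TOKEN_NONE = -1, TOK_DD, TOK_DT, TOK_SH, TOK_MAX };

static const char *const macro_names[TOK_MAX] = { "Dd", "Dt", "Sh" };

/* Return the macro name for tok, or NULL when tok is not a macro. */
static const char *
macro_name(int tok)
{
	if (tok < 0 || tok >= TOK_MAX)
		return NULL;  /* never index the table with TOKEN_NONE */
	return macro_names[tok];
}
```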


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
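
The rewinding walk described above can be approximated as follows. This is a simplified reading of the commit message, not the parser's actual code: the struct, the flag values, and rewind_blocks() are hypothetical, and stopping at a parent that has not ended yet is an extra assumption made to keep the sketch conservative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values standing in for MDOC_BROKEN and MDOC_ENDED. */
#define BLK_BROKEN (1 << 0)  /* still open although an ancestor has ended */
#define BLK_ENDED  (1 << 1)  /* end macro seen while a child was still open */

struct blk {
	struct blk *parent;
	int         flags;
	int         closed;  /* set once the block has been rewound */
};

/*
 * Rewind block n and every ended ancestor that was only waiting for it.
 * Returns the number of blocks closed.
 */
static int
rewind_blocks(struct blk *n)
{
	int nclosed = 0;

	assert(n != NULL);
	for (;;) {
		n->closed = 1;
		nclosed++;
		if ((n->flags & BLK_BROKEN) == 0)
			break;  /* first block that is not broken: done */
		n = n->parent;
		if (n == NULL || (n->flags & BLK_ENDED) == 0)
			break;  /* assumption: a still-open parent stops the walk */
	}
	return nclosed;
}
```

Because the walk only follows parent links and inspects two flags, there is no separate pointer that could become stale.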


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailboxd dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
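
A minimal sketch of the resulting argument handling, with a hypothetical function sm_apply(). Treating an invalid argument like a missing one follows the later rev. 1.146 entry ("move arg out and toggle mode"); moving the argument out is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the new spacing mode (1 = on, 0 = off) after ".Sm arg".
 * arg is NULL when the macro was called without an argument.
 */
static int
sm_apply(int spacing, const char *arg)
{
	assert(spacing == 0 || spacing == 1);
	if (arg != NULL && strcmp(arg, "on") == 0)
		return 1;
	if (arg != NULL && strcmp(arg, "off") == 0)
		return 0;
	return !spacing;  /* missing or invalid argument: toggle */
}
```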


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
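
The message formatting described in the first paragraph could be sketched like this. The function format_diag() and the convention that a line number of 0 means "no meaningful position" are assumptions for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a diagnostic, including the position only when it is
 * meaningful.  Assumption: line 0 means there is no position.
 */
static int
format_diag(char *buf, size_t bufsz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	assert(buf != NULL && file != NULL);
	if (line > 0)
		return snprintf(buf, bufsz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, bufsz, "mandoc: %s: %s: %s", file, level, msg);
}
```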


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
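
The overflow that reallocarray(3) guards against is the multiplication nmemb * size wrapping around, which would make a plain realloc(3) return a buffer that is too small. Since reallocarray(3) is not part of standard C, the sketch below uses a hypothetical stand-in, checked_reallocarray(), to show the idea; it is not the libc implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): refuse requests where
 * nmemb * size would wrap around instead of allocating too little.
 */
static void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```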


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
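
With strlen+malloc+snprintf, the length calculation and the format string have to be kept in agreement by hand; asprintf(3) computes the required size itself. Because asprintf(3) is a BSD/GNU extension, the sketch below uses a portable stand-in built on vsnprintf(3). The names xasprintf() and lib_title() and the output format are invented for this example and do not reproduce post_lb().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Portable stand-in for asprintf(3): measure first, then allocate
 * exactly what is needed.  Returns a malloc'ed string or NULL.
 */
static char *
xasprintf(const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return NULL;

	if ((buf = malloc((size_t)len + 1)) == NULL)
		return NULL;

	va_start(ap, fmt);
	(void)vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return buf;
}

/* Build a title string without any fixed-size buffer. */
static char *
lib_title(const char *name)
{
	assert(name != NULL);
	return xasprintf("library \"%s\"", name);
}
```

The caller owns the returned string and has to free(3) it.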


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
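
A minimal sketch of such a cache, using a function-scope static buffer as this revision did (rev. 1.305 later moved the value into a struct so that it can be freed). The function default_os(), the "sysname release" format, and the uname_calls counter are assumptions for the example; the counter exists only to make the caching observable.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

static int uname_calls;  /* illustration only: counts real uname(3) calls */

/*
 * Return "sysname release" for the running system.  uname(3) is called
 * on the first invocation only; later calls return the cached string.
 */
static const char *
default_os(void)
{
	static char cache[640];  /* empty until the first successful call */
	struct utsname uts;

	if (cache[0] == '\0') {
		uname_calls++;
		if (uname(&uts) == -1)
			return "UNKNOWN";
		(void)snprintf(cache, sizeof(cache), "%s %s",
		    uts.sysname, uts.release);
	}
	return cache;
}
```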


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
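
The parsing rule for the value argument can be sketched as follows. The function nr_apply() is hypothetical, and integer overflow is ignored for brevity.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Apply a ".nr" value argument to the old register value.
 * An optional leading sign requests increment or decrement; the
 * number ends right before the first non-digit character.
 */
static int
nr_apply(int oldval, const char *arg)
{
	int sign = 0, val = 0;

	assert(arg != NULL);
	if (*arg == '+' || *arg == '-')
		sign = (*arg++ == '+') ? 1 : -1;
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');
	return sign == 0 ? val : oldval + sign * val;
}
```

For example, starting from an old value of 5, "12" sets the register to 12, "+3" yields 8, and "7n" yields 7 because parsing stops at the "n".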


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.
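
For illustration, a register store without a hash table can be as simple as a singly linked list. The sketch below assumes such a list; the struct and the functions reg_set(), reg_get(), and reg_free() are hypothetical and do not reproduce roff_setreg() or roff_getreg(). The getter only shows how a lookup would work, since general read support was not yet available in this revision.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One number register; all names here are hypothetical. */
struct reg {
	struct reg *next;
	int         val;
	char        name[];  /* flexible array member holding the name */
};

/* Set register "name" to val, creating it if needed; 0 on success. */
static int
reg_set(struct reg **head, const char *name, int val)
{
	struct reg *r;
	size_t len;

	assert(head != NULL && name != NULL);
	for (r = *head; r != NULL; r = r->next) {
		if (strcmp(r->name, name) == 0) {
			r->val = val;
			return 0;
		}
	}
	len = strlen(name) + 1;
	if ((r = malloc(sizeof(*r) + len)) == NULL)
		return -1;
	memcpy(r->name, name, len);
	r->val = val;
	r->next = *head;
	*head = r;
	return 0;
}

/* Look up a register; return fallback if it was never set. */
static int
reg_get(const struct reg *head, const char *name, int fallback)
{
	for (; head != NULL; head = head->next)
		if (strcmp(head->name, name) == 0)
			return head->val;
	return fallback;
}

/* Release the whole list. */
static void
reg_free(struct reg *head)
{
	struct reg *next;

	for (; head != NULL; head = next) {
		next = head->next;
		free(head);
	}
}
```

Lookup is linear in the number of registers, which is usually acceptable when documents set only a handful of them.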


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.
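
The two rules can be sketched roughly as follows. The helper name
take_value() and its calling convention are hypothetical and only
illustrate the behaviour described above; the real parser is organized
differently.

```c
#include <stddef.h>

/*
 * Hypothetical helper: argv[i] is "-width" or "-offset".
 * Store the value token in *val and return the number of tokens
 * consumed (2), or store NULL and return 1 when the value is missing,
 * so the caller can report a non-fatal error and keep parsing.
 */
int
take_value(const char *const *argv, int argc, int i, const char **val)
{
	if (i + 1 >= argc) {
		*val = NULL;	/* value missing: not fatal */
		return 1;
	}
	/* Deliberately no check for a leading dash: rule 1. */
	*val = argv[i + 1];
	return 2;
}
```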

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.
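
As an illustration, a section-to-volume lookup could look like the
sketch below. The function name default_volume() is hypothetical, and
the titles are the ones commonly seen in BSD page headers; the exact
table used by mandoc may differ.

```c
#include <stddef.h>
#include <string.h>

/* Assumed default titles; check mdoc(7) for the authoritative list. */
static const struct {
	const char	*sec;
	const char	*vol;
} volumes[] = {
	{ "1", "General Commands Manual" },
	{ "2", "System Calls Manual" },
	{ "3", "Library Functions Manual" },
	{ "4", "Device Drivers Manual" },
	{ "5", "File Formats Manual" },
	{ "6", "Games Manual" },
	{ "7", "Miscellaneous Information Manual" },
	{ "8", "System Manager's Manual" },
	{ "9", "Kernel Developer's Manual" },
};

/* Return the default volume name for a section, or NULL if unknown. */
const char *
default_volume(const char *sec)
{
	size_t	 i;

	for (i = 0; i < sizeof(volumes) / sizeof(volumes[0]); i++)
		if (strcmp(sec, volumes[i].sec) == 0)
			return volumes[i].vol;
	return NULL;
}
```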

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
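
A small sketch of the formatting rule, assuming the usual "BSD" suffix
after the version number. The helper name format_bx() is hypothetical
and is not the code from this commit.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the arguments of ".Bx version variant"
 * into buf, for example "4.3" and "tahoe" become "4.3BSD-Tahoe".
 * Returns what snprintf(3) returns.
 */
int
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	if (version == NULL)
		return snprintf(buf, sz, "BSD");
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, sz, "%sBSD", version);
	/* Uppercase the first character, join with a hyphen. */
	return snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```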


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (no groff prints them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
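
The three exclusions listed above can be sketched as a single
predicate. The function name hyphen_breakable() is hypothetical, and
the check for "free-form text" is left to the caller; this is not the
actual term_word() logic.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the hyphen at word[i] is an acceptable break point
 * under the three rules above, 0 otherwise.
 */
int
hyphen_breakable(const char *word, size_t i)
{
	size_t	 len = strlen(word);

	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not next to another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash, as in "\-". */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```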

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
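
The detection rule itself is simple enough to sketch. The function name
line_ends_sentence() is hypothetical, and the sketch only looks at the
last character of the line, as described above; it ignores the
distinction between text lines and macro lines.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the input line ends in one of ".!?" and thus counts
 * as an end of sentence (EOS), 0 otherwise.
 */
int
line_ends_sentence(const char *line)
{
	size_t	 len = strlen(line);

	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

This also shows why "new sentence, new line" matters: a sentence that
ends in the middle of an input line is invisible to such a check.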


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.290 13-Sep-2019 schwarze

Improve validation of function names:
1. Relax checking to accept function types of the form
"ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.
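
As a rough illustration of the approach, assuming a GNU-compatible compiler: the fallback macro below merely stands in for what a configure script might arrange, and sketch_abort() and sketch_width() are hypothetical names, not mandoc functions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed fallback: make the attribute vanish on other compilers. */
#if !defined(__GNUC__) && !defined(__clang__)
#define __attribute__(x)
#endif

/* Hypothetical helper: report an impossible state and never return. */
static void sketch_abort(const char *what) __attribute__((__noreturn__));

static void
sketch_abort(const char *what)
{
	assert(what != NULL);
	fprintf(stderr, "internal error: %s\n", what);
	abort();
}

/*
 * Because sketch_abort() is marked noreturn, the compiler can tell
 * that the default branch never reaches the end of the function.
 */
int
sketch_width(int kind)
{
	switch (kind) {
	case 0:
		return 6;
	case 1:
		return 10;
	default:
		sketch_abort("unknown kind");
	}
}
```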

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
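
A minimal sketch of such an up-front translation follows. The token enum and function name are hypothetical stand-ins for mandoc's real token type; only the alias pairs (Lp to Pp, Ot to Ft, LP and P to PP) come from the message above.

```c
#include <assert.h>

/* Hypothetical token values, not mandoc's real enum roff_tok. */
enum sketch_tok {
	TOK_Pp, TOK_Lp, TOK_Ft, TOK_Ot,	/* mdoc(7) */
	TOK_PP, TOK_LP, TOK_P		/* man(7) */
};

/* Map an obsolete alias to its standard form; pass others through. */
enum sketch_tok
sketch_canon(enum sketch_tok tok)
{
	switch (tok) {
	case TOK_Lp:
		return TOK_Pp;
	case TOK_Ot:
		return TOK_Ft;
	case TOK_LP:
	case TOK_P:
		return TOK_PP;
	default:
		return tok;
	}
}
```

Once every node has passed through a function like this, later checks only need to compare against the standard tokens.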


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
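
The class of problem can be sketched as follows, under the assumption of a name table that covers only a sub-range of token values. All names and values here are hypothetical.

```c
#include <assert.h>

/* Hypothetical token range: the table covers TOK_FIRST..TOK_LAST only. */
enum { TOK_FIRST = 100, TOK_A = 100, TOK_B, TOK_C, TOK_LAST = TOK_C };

static const char *const sketch_names[TOK_LAST - TOK_FIRST + 1] = {
	"A", "B", "C"
};

/*
 * Undefined, even if the result is never dereferenced:
 *	const char *const *base = sketch_names - TOK_FIRST;
 * Subtracting the offset from the index instead keeps all pointer
 * arithmetic inside the array.
 */
const char *
sketch_name(int tok)
{
	assert(tok >= TOK_FIRST && tok <= TOK_LAST);
	return sketch_names[tok - TOK_FIRST];
}
```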


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.
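
For illustration only, here is one way to scan for "--" that cannot step past the terminator. This is a simplified sketch with a hypothetical function name, not the actual check in mdoc_validate.c, which has to consider more context.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if the string contains "--", 0 otherwise.
 * Each iteration advances by exactly one byte, and s[1] is read
 * only after the loop condition has shown that s[0] is not the
 * terminator, so a lone "-" is handled without overrun.
 */
int
sketch_has_dashdash(const char *s)
{
	assert(s != NULL);
	for (; *s != '\0'; s++)
		if (s[0] == '-' && s[1] == '-')
			return 1;
	return 0;
}
```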


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
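
To give a feel for what a linear time, zero space heuristic can look like, here is a sketch. It is not the function used in mandoc; the name, the resynchronization rules, and the maxdiff parameter are all assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if a and b differ by at most maxdiff simple edits.
 * Both strings are walked once; on a mismatch, one difference is
 * counted and the scan tries to resynchronize by looking one
 * character ahead. No table is allocated.
 */
int
sketch_similar(const char *a, const char *b, int maxdiff)
{
	int diff = 0;

	assert(a != NULL && b != NULL && maxdiff >= 0);
	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++diff > maxdiff)
			return 0;
		if (a[1] == b[1]) {		/* substitution */
			a++;
			b++;
		} else if (a[1] == *b)		/* extra character in a */
			a++;
		else if (*a == b[1])		/* extra character in b */
			b++;
		else {				/* give up resyncing */
			a++;
			b++;
		}
	}
	/* Whatever is left over in either string counts as differences. */
	diff += (int)(strlen(a) + strlen(b));
	return diff <= maxdiff;
}
```

A caller could compare a section title against each standard name and suggest the standard name when the two are similar but not identical.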


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).
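
Passing NULL for a %s conversion is undefined behaviour in C, so a guard is needed wherever a string may be absent. The helper and message below are hypothetical and only sketch one possible guard; the actual fixes may simply check the pointer at each call site.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: substitute a placeholder for a missing string. */
const char *
sketch_str(const char *s)
{
	return s != NULL ? s : "(none)";
}

/*
 * Format a diagnostic into buf. The arg pointer may be NULL, for
 * example when the argument is a macro rather than text.
 */
int
sketch_msg(char *buf, size_t sz, const char *macro, const char *arg)
{
	assert(buf != NULL && macro != NULL);
	return snprintf(buf, sz, "%s: skipping argument %s",
	    macro, sketch_str(arg));
}
```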


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
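
The three kinds of replacement can be sketched like this, using a hypothetical cut-down node type that has only the linkage pointers. The function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down node; only the linkage fields matter here. */
struct sketch_node {
	struct sketch_node *child;	/* first child or NULL */
	struct sketch_node *last;	/* last child or NULL */
	struct sketch_node *next;	/* next sibling or NULL */
};

/* Would replace a test like nchild == 0: one pointer check. */
int
sketch_is_empty(const struct sketch_node *n)
{
	assert(n != NULL);
	return n->child == NULL;
}

/* Would replace a test like nchild == 1: two pointer checks. */
int
sketch_has_one_child(const struct sketch_node *n)
{
	assert(n != NULL);
	return n->child != NULL && n->child == n->last;
}

/* The rare case that needs the actual number: a tiny loop. */
int
sketch_count_children(const struct sketch_node *n)
{
	const struct sketch_node *c;
	int count = 0;

	assert(n != NULL);
	for (c = n->child; c != NULL; c = c->next)
		count++;
	return count;
}
```

With no stored counter, there is no invariant between the counter and the child list that could silently go stale.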


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that it not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.
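
The rewinding walk described above can be approximated by the following sketch. The struct, the flag names, and the SK_CLOSED marker are hypothetical simplifications; the real parser does considerably more work when it rewinds a block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags modelled on MDOC_BROKEN and MDOC_ENDED. */
enum {
	SK_BROKEN = 1 << 0,	/* block is broken by another block */
	SK_ENDED  = 1 << 1,	/* block has seen its end macro */
	SK_CLOSED = 1 << 2	/* stand-in for "has been rewound" */
};

struct sketch_blk {
	struct sketch_blk *parent;
	int flags;
};

/*
 * Walk from a broken block up through its ancestors. Rewind every
 * ancestor that is already ended, and stop after processing the
 * first ancestor that is not itself broken.
 * Returns the number of blocks rewound.
 */
int
sketch_rewind(struct sketch_blk *broken)
{
	struct sketch_blk *n;
	int closed = 0;

	assert(broken != NULL);
	for (n = broken->parent; n != NULL; n = n->parent) {
		if (n->flags & SK_ENDED) {
			n->flags |= SK_CLOSED;
			closed++;
		}
		if ((n->flags & SK_BROKEN) == 0)
			break;
	}
	return closed;
}
```

The walk needs only the parent pointers and two flag bits per node, which is why a separate pending pointer becomes unnecessary.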

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailboxd dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.
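
A minimal sketch of the "one post-handler per macro" layout, assuming hypothetical macro tokens and check functions (none of these names are taken from mdoc_validate.c). Where two checks used to be chained through a static array, a trivial wrapper function combines them instead:

```c
#include <stddef.h>

struct node;				/* opaque in this sketch */
typedef int (*v_post)(struct node *);

int ncalls;				/* counts handler invocations */

/* Hypothetical checks; each returns 1 on success. */
int post_count(struct node *n) { (void)n; ncalls++; return 1; }
int post_text(struct node *n) { (void)n; ncalls++; return 1; }

/* A trivial wrapper replaces what used to be a two-element array. */
int
post_count_text(struct node *n)
{
	return post_count(n) && post_text(n);
}

enum tok { TOK_X, TOK_Y, TOK_Z, TOK_MAX };	/* hypothetical macros */

/* One handler pointer per macro, NULL if nothing needs validating. */
static const v_post post_handlers[TOK_MAX] = {
	post_count,		/* TOK_X */
	post_count_text,	/* TOK_Y */
	NULL,			/* TOK_Z */
};

int
validate_post(enum tok t, struct node *n)
{
	return post_handlers[t] == NULL ? 1 : post_handlers[t](n);
}
```

The dispatcher needs no loop over handler arrays any longer, which is where the saved lines would come from in a design like this.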


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
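
The toggling rule can be sketched as follows. The function name sm_next() is hypothetical and this is not mandoc's actual code; an invalid argument is treated like a missing one here, following rev. 1.146:

```c
#include <string.h>

/*
 * Hypothetical sketch: compute the new spacing mode
 * (1 = on, 0 = off) for ".Sm [on | off]".
 * A missing argument toggles the current mode, as in groff.
 */
int
sm_next(int cur, const char *arg)
{
	if (arg != NULL && strcmp(arg, "on") == 0)
		return 1;
	if (arg != NULL && strcmp(arg, "off") == 0)
		return 0;
	return !cur;	/* no argument or an invalid one: toggle */
}
```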


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
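
The message format change can be illustrated with a small sketch. The helper format_msg() is hypothetical and not mandoc's real reporting function; it assumes that a line number of 0 means "no meaningful position":

```c
#include <stdio.h>

/*
 * Hypothetical helper: format a diagnostic, including "line:col"
 * only when the line number is meaningful (greater than zero).
 * Returns what snprintf(3) returns.
 */
int
format_msg(char *buf, size_t sz, const char *file,
    int line, int col, const char *level, const char *msg)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, msg);
}
```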


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
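
For readers unfamiliar with reallocarray(3): its point is to refuse requests where nmemb * size would overflow. A rough sketch of that check, using a hypothetical stand-in named checked_reallocarray() for systems lacking the real function:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3):
 * fail with ENOMEM if nmemb * size would overflow size_t,
 * otherwise behave like realloc(3).
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

With plain realloc(ptr, nmemb * size), an overflowing product silently wraps around and yields a buffer that is too small.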


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
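
Why a single allocating formatter is less error prone can be seen from a sketch. Since asprintf(3) is not available everywhere, the hypothetical helper fmt_alloc() below imitates it with two vsnprintf(3) passes; it is not the code used in post_lb():

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical portable stand-in for asprintf(3):
 * measure first, then allocate exactly what is needed.
 * Returns a malloc(3)ed string, or NULL on failure.
 */
char *
fmt_alloc(const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return NULL;

	if ((buf = malloc((size_t)len + 1)) == NULL)
		return NULL;

	va_start(ap, fmt);
	(void)vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return buf;
}
```

The caller never computes a length by hand, so the buffer size and the format string cannot get out of sync.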


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
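
The caching idea, sketched with a hypothetical function default_os(). The exact string mandoc builds from the uname(3) result is an assumption here ("sysname release"):

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical sketch: return a default operating system string,
 * calling uname(3) only on first use and reusing the result later.
 * Returns NULL if uname(3) fails.
 */
const char *
default_os(void)
{
	static char cached[512];
	static int have;
	struct utsname u;

	if (have)
		return cached;
	if (uname(&u) == -1)
		return NULL;
	(void)snprintf(cached, sizeof(cached), "%s %s",
	    u.sysname, u.release);
	have = 1;
	return cached;
}
```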


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
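
The parsing rule from this entry, as a rough sketch. The function nr_apply() is hypothetical, not the code in roff.c, and it ignores integer overflow:

```c
#include <ctype.h>

/*
 * Hypothetical sketch: apply a ".nr" value string to the old
 * register value.  A leading '+' or '-' requests increment or
 * decrement; parsing stops right before the first non-digit.
 */
int
nr_apply(int oldval, const char *arg)
{
	int sign = 0, val = 0;

	if (*arg == '+' || *arg == '-')
		sign = (*arg++ == '+') ? 1 : -1;
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');
	return sign ? oldval + sign * val : val;
}
```

For instance, with an old value of 5, the string "12n" sets the register to 12 and "+3" raises it to 8.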


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: fewer function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
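
The three .Bx rules can be sketched like this. The helper format_bx() is hypothetical, and the "BSD" infix reflects the usual rendering of .Bx (for example "4.3BSD-Reno") rather than anything stated in this log entry:

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical sketch of the .Bx rules: at most two arguments,
 * the second one capitalized and appended with a hyphen.
 * Either argument may be NULL.
 */
void
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	if (version == NULL)
		version = "";
	if (variant == NULL || *variant == '\0') {
		(void)snprintf(buf, sz, "%sBSD", version);
		return;
	}
	(void)snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)variant[0]), variant + 1);
}
```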


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.
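
The data structure change, as a sketch with hypothetical struct and field names (the real libmdoc types differ). The "after" layout needs only one allocation and one free path, at the cost of every allocation having the size of the largest member:

```c
#include <stdlib.h>

/* Hypothetical per-macro data; field names are illustrative only. */
struct bd_data { int type; int compact; };
struct bl_data { int type; int ncols; };

/* Before: each node embeds a union of pointers to separate structs. */
union old_data {
	struct bd_data *bd;
	struct bl_data *bl;
};

/* After: each node holds one pointer to a union of structs. */
union node_data {
	struct bd_data bd;
	struct bl_data bl;
};

struct node {
	union node_data *data;	/* NULL for macros without extra data */
};

/* One allocation site serves every macro type; returns 1 on success. */
int
node_data_alloc(struct node *n)
{
	n->data = calloc(1, sizeof(*n->data));
	return n->data != NULL;
}

void
node_data_free(struct node *n)
{
	free(n->data);
	n->data = NULL;
}
```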


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item lists as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (no groffs prints them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
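
The three exclusions listed above can be sketched as a small predicate. This is only an illustration, not the code from the commit: the name hyph_breakable() is hypothetical, the word is assumed to be a NUL-terminated string, and the "free-form text only" condition is left out because it depends on parser context.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: may the output line be broken
 * after the hyphen at index i of word?
 */
bool
hyph_breakable(const char *word, size_t i)
{
	size_t len = strlen(word);

	if (i >= len || word[i] != '-')
		return false;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return false;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return false;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return false;
	return true;
}
```

A formatter using such a predicate would presumably remember the last index for which it returns true while the word still fits, and break there.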

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.
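
A minimal sketch of how the three levels might drive output and exit status. All identifiers here are hypothetical and are not taken from main.c; the extra DEMO_OK value and the assumption that FATAL also yields a non-zero exit are mine, the rest follows the description above.

```c
#include <stdbool.h>

/* Hypothetical names; the real enum in mandoc may look different. */
enum demo_level {
	DEMO_OK = 0,	/* no message was issued (assumed extra state) */
	DEMO_WARNING,	/* clean up the input, output is mostly OK */
	DEMO_ERROR,	/* output is attempted, but may be garbled */
	DEMO_FATAL	/* no output can be produced */
};

/* Remember the most severe level seen so far. */
enum demo_level
demo_worst(enum demo_level seen, enum demo_level next)
{
	return next > seen ? next : seen;
}

/* Only FATAL prevents output altogether. */
bool
demo_has_output(enum demo_level worst)
{
	return worst < DEMO_FATAL;
}

/* WARNING alone does not cause an error exit. */
bool
demo_exit_nonzero(enum demo_level worst)
{
	return worst >= DEMO_ERROR;
}
```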

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
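
The detection rule stated above, reduced to its core, might look like the following sketch. The function name is hypothetical, and the actual parser works on macro and text lines with more context than a bare string; this only shows the "ends in any of .!?" test.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: does this input line end in one of
 * the sentence-ending characters '.', '!' or '?'.
 */
bool
line_ends_sentence(const char *line)
{
	size_t len = strlen(line);

	if (len == 0)
		return false;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

A terminal formatter could then emit two spaces instead of one before the next word whenever the previous line satisfied this test.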


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure
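
A malloc frontend of the kind mentioned in the last item could be sketched as follows. The name demo_malloc() is hypothetical; the sketch uses perror(3) and exit(3) in line with the portability items above, and it assumes callers never request zero bytes.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: never returns NULL, so callers
 * need no error handling of their own.
 * Assumes size > 0; malloc(0) may legitimately return NULL.
 */
void *
demo_malloc(size_t size)
{
	void *p = malloc(size);

	if (p == NULL) {
		perror(NULL);
		exit(EXIT_FAILURE);
	}
	return p;
}
```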

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages
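
The string-table technique can be illustrated with a short sketch. The enum values and message texts below are invented for the example and are not the contents of libman.h or libmdoc.h.

```c
/* Hypothetical error codes; the last entry only counts the others. */
enum demo_err {
	DEMO_ENOARG,
	DEMO_EBADSEC,
	DEMO_ERR_MAX
};

/* One message per code, in the same order as the enum. */
static const char *const demo_errmsgs[DEMO_ERR_MAX] = {
	"macro requires an argument",	/* DEMO_ENOARG */
	"unknown manual section",	/* DEMO_EBADSEC */
};

/* Table lookup replaces a switch statement with one case per code. */
const char *
demo_errstr(enum demo_err code)
{
	if ((int)code < 0 || code >= DEMO_ERR_MAX)
		return "unknown error";
	return demo_errmsgs[code];
}
```

Adding a new message then means adding one enum value and one table entry, with no control flow to update.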


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.289 27-Jun-2019 schwarze

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that. Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.
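
As a rough sketch of the approach described above: the attribute is written out directly, and a fallback definition makes it vanish on compilers that do not understand it. The exact condition used by ./configure is not shown in this log, so the #if below is an assumption, and demo_abort() and demo_div() are hypothetical names.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed fallback: ignore attributes on non-GNU-compatible compilers. */
#if !defined(__GNUC__)
#define __attribute__(x)
#endif

void	 demo_abort(const char *) __attribute__((__noreturn__));

/* Report a message and terminate; control never returns to the caller. */
void
demo_abort(const char *msg)
{
	fprintf(stderr, "%s\n", msg);
	abort();
}

/* The compiler knows the den == 0 branch cannot reach the division. */
int
demo_div(int num, int den)
{
	if (den == 0)
		demo_abort("division by zero");
	return num / den;
}
```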

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
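
To illustrate the class of problem: when token numbers do not start at zero, it is tempting to bias the table pointer once so that it can be indexed by token directly. The sketch below uses invented names and values, not the actual mandoc tables; it shows the undefined variant in a comment and the well-defined alternative in code.

```c
#include <stddef.h>

/* Hypothetical token numbering that does not start at zero. */
enum demo_tok {
	DEMO_FIRST = 10,
	DEMO_DD = DEMO_FIRST,
	DEMO_DT,
	DEMO_OS,
	DEMO_MAX
};

static const char *const demo_names[DEMO_MAX - DEMO_FIRST] = {
	"Dd", "Dt", "Os"
};

/*
 * Undefined by the C standard, even if never dereferenced:
 *	const char *const *base = demo_names - DEMO_FIRST;
 * The pointer would lie before the beginning of the array.
 */

/* Well-defined: subtract the offset from the index instead. */
const char *
demo_name(int tok)
{
	if (tok < DEMO_FIRST || tok >= DEMO_MAX)
		return NULL;
	return demo_names[tok - DEMO_FIRST];
}
```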


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
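
The log does not spell out the heuristic itself. As one possible illustration of a linear-time matcher that needs no extra storage, and explicitly not a reconstruction of the committed code, the following hypothetical function accepts two strings that differ by at most one substituted, inserted, or deleted character.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check, not mandoc's heuristic: true if a and b are
 * equal or differ by a single substitution, insertion or deletion.
 */
bool
within_one_edit(const char *a, const char *b)
{
	size_t la, lb;

	/* Skip the common prefix. */
	while (*a != '\0' && *a == *b) {
		a++;
		b++;
	}
	if (*a == '\0' && *b == '\0')
		return true;

	/* The remainders must agree after skipping one character. */
	la = strlen(a);
	lb = strlen(b);
	if (la == lb)
		return strcmp(a + 1, b + 1) == 0;	/* substitution */
	if (la == lb + 1)
		return strcmp(a + 1, b) == 0;		/* extra char in a */
	if (lb == la + 1)
		return strcmp(a, b + 1) == 0;		/* extra char in b */
	return false;
}
```

Comparing a section name against each standard name with such a function would flag near misses like "SYNOPSYS" without the quadratic table a full Levenshtein distance needs.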


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
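
A minimal sketch of the pointer checks that can stand in for a child
counter. The struct and the helper names are hypothetical simplifications
for illustration, not the real struct roff_node or mandoc functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a syntax tree node. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* One pointer check replaces a test like "nchild == 0". */
static int
has_no_child(const struct node *n)
{
	assert(n != NULL);
	return n->child == NULL;
}

/* Two pointer checks replace a test like "nchild == 1". */
static int
has_one_child(const struct node *n)
{
	assert(n != NULL);
	return n->child != NULL && n->child->next == NULL;
}

/* A tiny loop covers the rare case that needs the actual count. */
static int
count_children(const struct node *n)
{
	const struct node	*c;
	int			 count;

	assert(n != NULL);
	count = 0;
	for (c = n->child; c != NULL; c = c->next)
		count++;
	return count;
}
```

Because the answers are derived from the links themselves, there is no
separate counter that could drift out of sync with the tree.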


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
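
The traversal rule described in this entry can be sketched roughly as
follows. This is an illustration under stated assumptions, not the code
from mdoc_macro.c: struct blk, the BLK_* flag names, and
rewind_ancestors() are hypothetical, and "rewinding" is reduced to
setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags mirroring the two markers named in the log. */
#define BLK_BROKEN	(1 << 0)	/* set on broken blocks */
#define BLK_ENDED	(1 << 1)	/* set on breaking blocks */
#define BLK_CLOSED	(1 << 2)	/* sketch only: block was rewound */

/* Hypothetical, simplified block node. */
struct blk {
	struct blk	*parent;
	int		 flags;
};

/*
 * Walk from a broken block to its ancestors.  Rewind every ancestor
 * marked BLK_ENDED and stop after processing the first ancestor that
 * is not marked BLK_BROKEN.  Returns the number of blocks rewound.
 */
static int
rewind_ancestors(struct blk *broken)
{
	struct blk	*n;
	int		 nclosed;

	assert(broken != NULL);
	nclosed = 0;
	for (n = broken->parent; n != NULL; n = n->parent) {
		if (n->flags & BLK_ENDED) {
			n->flags |= BLK_CLOSED;
			nclosed++;
		}
		if ((n->flags & BLK_BROKEN) == 0)
			break;
	}
	return nclosed;
}
```

The walk needs only parent pointers and two flag bits, which is why no
separate pending pointer has to be maintained per node.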


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
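
The rule about line and column numbers can be sketched as follows,
assuming that a line number of zero means "no meaningful position".
The function format_msg() and its parameters are hypothetical and do
not reflect mandoc's actual message interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a diagnostic into buf.  The position is only included
 * when line is positive; otherwise just the file name is shown.
 * Returns the snprintf(3) result.
 */
static int
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	assert(buf != NULL && file != NULL);
	assert(level != NULL && msg != NULL);

	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s",
	    file, level, msg);
}
```

Called with line 0 for /dev/null, this yields the shorter form quoted
in the entry above, without the confusing "0:1".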


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
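
A minimal sketch of this kind of caching, assuming a function-scope
static buffer. The function name default_os_name() and the fallback
string are hypothetical; only uname(3) itself is a real interface here.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Return the operating system name, calling uname(3) only on
 * the first invocation and reusing the stored result afterwards.
 */
static const char *
default_os_name(void)
{
	static struct utsname	 uts;
	static int		 cached;

	if (cached == 0) {
		if (uname(&uts) == -1)
			(void)snprintf(uts.sysname, sizeof(uts.sysname),
			    "UNKNOWN");
		cached = 1;
	}
	return uts.sysname;
}
```

Every call after the first returns the same pointer without another
system call, which matters when thousands of manuals are processed in
one run.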


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr request ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
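
The parsing rule can be illustrated with a short sketch. The function
apply_nr_value() is hypothetical and ignores integer overflow; it only
shows the optional sign and the stop at the first non-digit character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Compute the new register value from the old one and the
 * "value" argument.  A leading '+' or '-' requests an increment
 * or decrement; without a sign, the value is assigned.
 * Parsing stops right before the first non-digit character.
 */
static int
apply_nr_value(int old, const char *arg)
{
	int	 sign, val;

	assert(arg != NULL);

	sign = 0;
	if (*arg == '+' || *arg == '-') {
		sign = *arg == '+' ? 1 : -1;
		arg++;
	}
	val = 0;
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');

	return sign == 0 ? val : old + sign * val;
}
```

For example, with a register currently holding 5, the argument "12abc"
assigns 12, "+3" increments to 8, and "-2x" decrements to 3.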


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.
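
A minimal sketch of the fallback rule described above, assuming a
simplified node type. The struct, the enum and the function name
pick_meta_name() are hypothetical and are not taken from the mandoc
sources.

```c
#include <stddef.h>

/* Hypothetical, simplified node; the real structure is richer. */
enum node_type { NODE_TEXT, NODE_ELEM };

struct node {
	enum node_type	 type;
	const char	*string;	/* only meaningful for NODE_TEXT */
};

/*
 * Return the name to store in the meta data: the text of the first
 * child if it is a usable text node, "UNKNOWN" otherwise.
 */
const char *
pick_meta_name(const struct node *first_child)
{
	if (first_child == NULL ||
	    first_child->type != NODE_TEXT ||
	    first_child->string == NULL ||
	    first_child->string[0] == '\0')
		return "UNKNOWN";
	return first_child->string;
}
```

The point of the design is that the caller never has to dereference
a missing or non-text child; every path yields a usable string.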

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.
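
The two rules above can be sketched roughly as follows. This is an
illustration only: width_value() and its signature are hypothetical,
and the real parser works on mandoc's own argument structures rather
than on a plain string vector.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find "-width" in a macro argument vector and
 * return its value.  The value is whatever word follows, even if that
 * word starts with a dash.  If "-width" is the last word, set *missing
 * and return NULL so that the caller can report a non-fatal error.
 */
const char *
width_value(int argc, const char *const *argv, int *missing)
{
	int i;

	*missing = 0;
	for (i = 0; i < argc; i++) {
		if (strcmp(argv[i], "-width") != 0)
			continue;
		if (i + 1 < argc)
			return argv[i + 1];
		*missing = 1;
		return NULL;
	}
	return NULL;
}
```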

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
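
A rough sketch of the .Bx formatting rule described above, assuming
the two arguments are already available as plain strings. The
function format_bx() is hypothetical; rejecting a third argument is
not modelled here.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical formatter for ".Bx version variant".
 * Writes for example "4.3BSD-Reno" into buf and returns buf.
 * Either argument may be NULL.
 */
char *
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	int n;

	n = snprintf(buf, sz, "%sBSD", version != NULL ? version : "");
	if (variant == NULL || strlen(variant) == 0 ||
	    n < 0 || (size_t)n >= sz)
		return buf;
	/* Uppercase the first character, then append after a hyphen. */
	snprintf(buf + n, sz - n, "-%c%s",
	    toupper((unsigned char)variant[0]), variant + 1);
	return buf;
}
```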


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.
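
To illustrate the difference between the two layouts, here is a
small sketch. All type, member and function names are hypothetical
stand-ins and are not copied from libmdoc.

```c
#include <stdlib.h>

/* Hypothetical macro-specific data. */
struct bl_data { int type; int compact; };
struct bd_data { int type; int compact; };
struct an_data { int auth; };

/* Before: a union of pointers, each to separately allocated data. */
union old_data {
	struct bl_data	*Bl;
	struct bd_data	*Bd;
	struct an_data	*An;
};

/* After: one pointer to a union of structs. */
union norm_data {
	struct bl_data	 Bl;
	struct bd_data	 Bd;
	struct an_data	 An;
};

struct node {
	int		 tok;
	union norm_data	*norm;
};

/* Allocate a node together with its zeroed macro-specific data. */
struct node *
node_alloc(int tok)
{
	struct node *n;

	if ((n = calloc(1, sizeof(*n))) == NULL)
		return NULL;
	if ((n->norm = calloc(1, sizeof(*n->norm))) == NULL) {
		free(n);
		return NULL;
	}
	n->tok = tok;
	return n;
}

void
node_free(struct node *n)
{
	if (n == NULL)
		return;
	free(n->norm);
	free(n);
}
```

The memory overhead comes from the union being as large as its
largest member, whichever macro the node actually represents.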


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
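
The three exclusions listed above can be sketched as a small
predicate. This is an illustration under assumptions:
hyphen_is_breakable() is a hypothetical name, and restricting the
check to free-form text is left to the caller.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate: may the output line be broken at the
 * hyphen found at word[i]?  Returns 1 if so, 0 otherwise.
 */
int
hyphen_is_breakable(const char *word, size_t i)
{
	size_t len = strlen(word);

	if (i >= len || word[i] != '-')
		return 0;
	if (i == 0 || i + 1 == len)	/* at beginning or end of word */
		return 0;
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;		/* next to another hyphen */
	if (word[i - 1] == '\\')	/* escaped, as in \- */
		return 0;
	return 1;
}
```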

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.
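
As a sketch of how the three levels relate to the exit status, under
assumptions: the names below are hypothetical, and the value 1 merely
stands for "non-zero"; the actual exit codes are not given here.

```c
/* Hypothetical message levels, ordered by severity. */
enum msg_level { LEVEL_OK, LEVEL_WARNING, LEVEL_ERROR, LEVEL_FATAL };

struct msg_state {
	enum msg_level	 worst;	/* most severe level seen so far */
};

/* Record a message; only the most severe level is remembered. */
void
report(struct msg_state *st, enum msg_level lvl)
{
	if (lvl > st->worst)
		st->worst = lvl;
}

/*
 * Warnings alone do not cause an error exit;
 * errors and fatal errors do.
 */
int
exit_status(const struct msg_state *st)
{
	return st->worst >= LEVEL_ERROR ? 1 : 0;
}
```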

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.
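
A minimal sketch of the "first one wins" rule, assuming the list
arguments arrive as a plain string vector. The table of list types is
a hypothetical subset, pick_list_type() is an invented name, and a
counter stands in for the warning message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of list types; the real set is larger. */
static const char *const list_types[] = {
	"-bullet", "-column", "-enum", "-item", "-tag"
};

static int
is_list_type(const char *arg)
{
	size_t i;

	for (i = 0; i < sizeof(list_types) / sizeof(list_types[0]); i++)
		if (strcmp(arg, list_types[i]) == 0)
			return 1;
	return 0;
}

/*
 * Return the first list type found, or NULL if there is none.
 * Every later list type is discarded and counted in *nwarn.
 */
const char *
pick_list_type(int argc, const char *const *argv, int *nwarn)
{
	const char *first = NULL;
	int i;

	*nwarn = 0;
	for (i = 0; i < argc; i++) {
		if (!is_list_type(argv[i]))
			continue;
		if (first == NULL)
			first = argv[i];
		else
			(*nwarn)++;
	}
	return first;
}
```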

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
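
The detection rule stated above is simple enough to sketch directly.
The function name line_ends_sentence() is hypothetical, and the
sketch assumes the trailing newline has already been stripped from
the input line.

```c
#include <string.h>

/*
 * Hypothetical check: does this input line end a sentence?
 * Only the rule stated above is modelled: the last character
 * of the line is one of '.', '!' or '?'.
 */
int
line_ends_sentence(const char *line)
{
	size_t len = strlen(line);

	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

This also shows why "new sentence, new line" matters: a sentence
ending in the middle of an input line is not seen by such a check.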


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


Revision tags: OPENBSD_6_5_BASE
# 1.288 13-Mar-2019 schwarze

Contrary to what the NetBSD attribute(3) manual page suggests,
using __dead instead of __attribute__((__noreturn__)) actually
hinders portability rather than helping it.

Given that mandoc already uses __attribute__ in several files
and that in the portable version, ./configure already contains
rudimentary support for ignoring it on platforms that do not
support it, use __attribute__ directly.
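
A sketch of what using __attribute__ directly can look like. The
function names are hypothetical, and the #if fallback is only an
assumption about how a build might ignore the attribute on compilers
that lack it; it is not the ./configure logic itself.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Assumed fallback: on compilers without __attribute__ support,
 * define it away so the declaration below still compiles.
 */
#if !defined(__GNUC__) && !defined(__clang__)
#define __attribute__(x)
#endif

/* Declared directly with __attribute__ rather than via __dead. */
static void	 fail_sketch(const char *) __attribute__((__noreturn__));

static void
fail_sketch(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	abort();
}

/* No return statement is needed after the call on the error path. */
int
checked_div_sketch(int a, int b)
{
	if (b == 0)
		fail_sketch("division by zero");
	return a / b;
}
```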

This is expected to fix build failures that Stephen Gregoratto
<dev at sgregoratto dot me> reported from Arch and Debian Linux.


# 1.287 11-Mar-2019 schwarze

mark check_abort() and post_abort() as __dead;
based on a patch by Christos@ Zoulas at NetBSD


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
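
The up-front translation can be pictured roughly as follows. This is a
sketch with hypothetical token names; the real tokens are members of
enum roff_tok and the real code rewrites the node in place.

```c
/* Hypothetical token values standing in for enum roff_tok members. */
enum tok_sketch {
	TS_Pp, TS_Lp,		/* mdoc(7): Lp is an obsolete alias of Pp */
	TS_Ft, TS_Ot,		/* mdoc(7): Ot is an obsolete alias of Ft */
	TS_PP, TS_LP, TS_P	/* man(7): LP and P are aliases of PP */
};

/* Map an obsolete alias to its standard form; others pass through. */
enum tok_sketch
canon_tok_sketch(enum tok_sketch tok)
{
	switch (tok) {
	case TS_Lp:
		return TS_Pp;
	case TS_Ot:
		return TS_Ft;
	case TS_LP:
	case TS_P:
		return TS_PP;
	default:
		return tok;
	}
}
```

Once every node has passed through such a mapping, later validation
code only needs cases for Pp, Ft, and PP.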


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
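
To illustrate the class of problem (a sketch with hypothetical names
and token numbers, not the actual table from the source): forming a
pointer before the first array element is undefined, while subtracting
the offset from the index is well defined.

```c
#include <stddef.h>

#define TOK_FIRST_SKETCH 100	/* hypothetical first token number */
#define TOK_COUNT_SKETCH 3

static const char *const names_sketch[TOK_COUNT_SKETCH] = {
	"Dd", "Dt", "Os"
};

/*
 * Undefined, even if the result is only ever indexed in range:
 *	const char *const *base = names_sketch - TOK_FIRST_SKETCH;
 *	return base[tok];
 * Defined alternative: adjust the index, not the pointer.
 */
const char *
tok_name_sketch(int tok)
{
	if (tok < TOK_FIRST_SKETCH ||
	    tok >= TOK_FIRST_SKETCH + TOK_COUNT_SKETCH)
		return NULL;
	return names_sketch[tok - TOK_FIRST_SKETCH];
}
```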


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
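
For readers unfamiliar with this kind of heuristic, here is one
possible linear time, constant space check. It is an illustration
only and not the heuristic from this commit: it accepts two strings
that are equal or differ by a single substituted, inserted, or
deleted character. The name similar_sketch() is hypothetical.

```c
#include <string.h>

/*
 * Return 1 if a and b are equal or differ by exactly one
 * substituted, inserted, or deleted character; otherwise 0.
 * Each string is scanned a constant number of times and
 * nothing is allocated.
 */
int
similar_sketch(const char *a, const char *b)
{
	size_t	 la, lb;

	/* Skip the common prefix. */
	while (*a != '\0' && *a == *b) {
		a++;
		b++;
	}
	la = strlen(a);
	lb = strlen(b);
	if (la == lb)		/* equal, or one substitution */
		return la == 0 || strcmp(a + 1, b + 1) == 0;
	if (la + 1 == lb)	/* b has one extra character */
		return strcmp(a, b + 1) == 0;
	if (la == lb + 1)	/* a has one extra character */
		return strcmp(a + 1, b) == 0;
	return 0;
}
```

A full Levenshtein computation would need a table proportional to
the string lengths; a check like this trades generality for speed.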


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).
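
The general shape of such a fix, as a sketch with a hypothetical
function name and message text: passing NULL for a %s conversion is
undefined behaviour, so a placeholder is substituted before
formatting.

```c
#include <stdio.h>

/*
 * Format a diagnostic for an argument whose string may be NULL,
 * for example because the argument is a macro node without text.
 */
int
describe_arg_sketch(char *buf, size_t sz, const char *arg)
{
	return snprintf(buf, sz, "skipping argument: %s",
	    arg == NULL ? "(macro)" : arg);
}
```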


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
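
A sketch of the three replacement patterns, using a reduced,
hypothetical stand-in for struct roff_node with only the two
pointers that matter here:

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct roff_node. */
struct node_sketch {
	struct node_sketch	*child;	/* first child, or NULL */
	struct node_sketch	*next;	/* next sibling, or NULL */
};

/* Was "nchild == 0": one pointer check. */
int
has_no_child_sketch(const struct node_sketch *n)
{
	return n->child == NULL;
}

/* Was "nchild == 1": two pointer checks. */
int
has_one_child_sketch(const struct node_sketch *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* The rare case that needs the actual number: a tiny loop. */
int
count_children_sketch(const struct node_sketch *n)
{
	const struct node_sketch	*c;
	int				 cnt = 0;

	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Because the count is derived from the links on demand, there is no
separate counter that could drift out of step with the tree.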


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
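
The distinction, shown on a small hypothetical function: adjacent
labels that share one body need no comment, while a case that runs
code and then continues into the next one keeps its marker.

```c
/*
 * Classify a character.  '(' and '[' merely share a body.
 * ';' executes a statement and then deliberately continues into
 * the following case, so only that spot carries the comment.
 */
int
delim_class_sketch(int c, int *nsemicolons)
{
	switch (c) {
	case '(':
	case '[':
		return 1;	/* opening delimiter */
	case ';':
		(*nsemicolons)++;
		/* FALLTHROUGH */
	case '.':
	case ',':
		return 2;	/* closing punctuation */
	default:
		return 0;
	}
}
```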


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.
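
A much simplified sketch of the rewinding walk described above. The
struct, the flag names, and the idea of recording "closed" in a flag
are all hypothetical; the real parser operates on full syntax tree
nodes and does considerably more work per block.

```c
#include <stddef.h>

/* An enclosing block was ended while this one was still open. */
#define BLK_BROKEN_SKETCH	(1 << 0)
/* End macro already seen while a child was open; close deferred. */
#define BLK_ENDED_SKETCH	(1 << 1)
/* Set by the walk below when a block is finally rewound. */
#define BLK_CLOSED_SKETCH	(1 << 2)

struct blk_sketch {
	struct blk_sketch	*parent;
	int			 flags;
};

/*
 * Close n.  If n was broken, walk up the parent chain, closing
 * every ancestor that is marked ENDED and stopping after the
 * first ancestor that is not BROKEN.  Returns the number of
 * blocks closed.
 */
int
rewind_sketch(struct blk_sketch *n)
{
	int	 nclosed = 1;

	n->flags |= BLK_CLOSED_SKETCH;
	if ((n->flags & BLK_BROKEN_SKETCH) == 0)
		return nclosed;
	for (n = n->parent; n != NULL; n = n->parent) {
		if (n->flags & BLK_ENDED_SKETCH) {
			n->flags |= BLK_CLOSED_SKETCH;
			nclosed++;
		}
		if ((n->flags & BLK_BROKEN_SKETCH) == 0)
			break;
	}
	return nclosed;
}
```

No per-node pending pointer is consulted: the parent links plus two
flag bits carry all the information the walk needs.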

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual
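
The rule above can be sketched in C roughly as follows. This is only an
illustration: format_msg() and its parameters are hypothetical names, not
the function mandoc actually uses, and a line number of 0 is assumed to
mean "no meaningful position".

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper for illustration; not mandoc's real interface.
 * Assumption: line == 0 means that no meaningful position is known,
 * so the ":line:col" part is left out of the message.
 */
int
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, msg);
}
```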

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
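
For readers unfamiliar with why reallocarray(3) is safer than a plain
realloc(3) with a hand-written multiplication, here is a minimal sketch
of the overflow check it performs. checked_reallocarray() is a
hypothetical stand-in written for this note, not the libc
implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3), for illustration only:
 * refuse the request when nmemb * size would overflow size_t,
 * instead of silently allocating a buffer that is too small.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

On failure the original pointer stays valid, exactly as with
realloc(3), so the caller must still free it.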


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
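
The caching idea can be sketched as below. This is not the code from
this revision: default_os() is a hypothetical name, and the
"sysname release" format and the UNKNOWN fallback are assumptions made
for the example.

```c
#include <sys/utsname.h>
#include <stdio.h>

/*
 * Hypothetical sketch: call uname(3) once and reuse the result.
 * The buffer size and the output format are assumptions.
 */
const char *
default_os(void)
{
	static char cached[256];
	static int have_cached;
	struct utsname u;

	if (!have_cached) {
		if (uname(&u) == -1)
			snprintf(cached, sizeof(cached), "UNKNOWN");
		else
			snprintf(cached, sizeof(cached), "%s %s",
			    u.sysname, u.release);
		have_cached = 1;
	}
	return cached;
}
```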


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.
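
A minimal sketch of the parsing rule described above, assuming a
hypothetical helper nr_apply() that is not part of roff.c: an optional
leading sign turns the request into an increment or decrement, and the
value ends right before the first non-digit character.

```c
#include <ctype.h>

/*
 * Hypothetical helper, for illustration only.
 * oldval is the current register value, arg the "value" argument.
 * Integer overflow on very long digit strings is not handled here.
 */
int
nr_apply(int oldval, const char *arg)
{
	int sign = 0;
	int val = 0;

	if (*arg == '+' || *arg == '-')
		sign = (*arg++ == '+') ? 1 : -1;
	/* The value ends right before the first non-digit character. */
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');
	return sign != 0 ? oldval + sign * val : val;
}
```

With this sketch, an argument of "12n" sets the register to 12, while
"+3" adds 3 to the previous value.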

Reported by bentley@ on discuss at mdocml.


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
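
As an illustration of the second and third rule, the following sketch
builds the text for a .Bx line with an optional second argument.
format_bx() is a hypothetical name, and the exact output layout
(version, then "BSD", then a hyphen and the capitalized variant) is
an assumption based on how .Bx is usually rendered.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical sketch, not the code from this revision.
 * version is the first .Bx argument, variant the optional second one.
 */
int
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, sz, "%sBSD", version);
	/* Uppercase the first character and append with a hyphen. */
	return snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```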


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .D1)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset no longer defaults to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
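
The three exclusion rules above can be sketched as a small predicate.
This is an illustration only, not the code from the commit: the
function name hyph_breakable() and its interface are hypothetical,
and the additional restriction to free-form text is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the mandoc sources:
 * return 1 if the hyphen at word[i] is an acceptable break point
 * according to the three rules in the commit message, 0 otherwise.
 */
static int
hyph_breakable(const char *word, size_t i)
{
	size_t len;

	assert(word != NULL);
	len = strlen(word);
	if (i >= len || word[i] != '-')
		return 0;
	/* Rule 1: not at the beginning or end of the word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Rule 2: not immediately preceded or followed by a hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Rule 3: not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

A formatter could call such a predicate for each hyphen in a word
that does not fit, remembering the last position that both fits and
is breakable.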

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@'s end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
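
The detection rule itself is simple enough to sketch. The helper
below is hypothetical (the name line_ends_sentence() does not come
from the mandoc sources) and covers only the plain rule stated above,
not any later refinements.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the input line ends in one of
 * the characters '.', '!' or '?', 0 otherwise.
 */
static int
line_ends_sentence(const char *line)
{
	size_t len;

	assert(line != NULL);
	len = strlen(line);
	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

This also shows why "new sentence, new line" matters: a full stop in
the middle of an input line is never seen by a check like this.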


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.286 04-Mar-2019 schwarze

When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:

$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.

Friendlier error message suggested by jmc@, who also OK'ed the patch.


# 1.285 04-Mar-2019 schwarze

Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
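
The idea of normalizing aliases once, up front, can be sketched as
follows. The function canonical_macro() is hypothetical, and macro
names are represented as strings purely for illustration; the alias
pairs are the ones named in the commit message.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: map an obsolete macro alias to its standard
 * form; any other name is returned unchanged.
 */
static const char *
canonical_macro(const char *name)
{
	static const struct {
		const char *alias;
		const char *standard;
	} map[] = {
		{ "Lp", "Pp" },	/* mdoc(7) paragraph break */
		{ "Ot", "Ft" },	/* mdoc(7) function type */
		{ "LP", "PP" },	/* man(7) paragraph break */
		{ "P",  "PP" },	/* man(7) paragraph break */
	};
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++)
		if (strcmp(name, map[i].alias) == 0)
			return map[i].standard;
	return name;
}
```

Code running after such a step only needs to test for Pp, Ft, and PP.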


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
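
A minimal sketch of the pattern in question, using made-up names
(enum tok, names[], tok_name()) rather than the actual mandoc tables:
when the valid indices of a table start at a non-zero token value, it
is tempting to bias the base pointer, but the well-defined approach
is to bias the index instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token range that does not start at zero. */
enum tok { TOK_FIRST = 10, TOK_A = 10, TOK_B, TOK_C, TOK_MAX };

static const char *const names[TOK_MAX - TOK_FIRST] = { "A", "B", "C" };

/*
 * Undefined behaviour, even if the pointer is never dereferenced,
 * because it points before the beginning of the array:
 *
 *	const char *const *base = names - TOK_FIRST;
 *	... base[t] ...
 *
 * Well-defined alternative: subtract the offset from the index.
 */
static const char *
tok_name(enum tok t)
{
	assert(t >= TOK_FIRST && t < TOK_MAX);
	return names[t - TOK_FIRST];
}
```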


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
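
The commit message does not spell out the heuristic. The following
is one way a linear time, zero extra space matcher of this kind could
look; it is a sketch, not necessarily the committed code, and the
function name similar() as well as the tolerance of three differences
are assumptions.

```c
#include <assert.h>
#include <string.h>

/*
 * Illustrative fuzzy comparison: walk both strings in lockstep and
 * use one character of lookahead to classify each mismatch as a
 * replacement, transposition, insertion, or deletion.
 * Returns 1 if the strings differ in at most maxdist places.
 */
static int
similar(const char *s1, const char *s2)
{
	const int maxdist = 3;	/* assumed tolerance */
	int dist = 0;

	assert(s1 != NULL && s2 != NULL);
	while (s1[0] != '\0' && s2[0] != '\0') {
		if (s1[0] != s2[0]) {
			if (++dist > maxdist)
				return 0;
			if (s1[1] == s2[1]) {
				/* replacement: nothing extra to skip */
			} else if (s1[0] == s2[1] && s1[1] == s2[0]) {
				s1++;		/* transposition */
				s2++;
			} else if (s1[0] == s2[1]) {
				s2++;		/* extra character in s2 */
			} else if (s1[1] == s2[0]) {
				s1++;		/* extra character in s1 */
			} else
				return 0;	/* too different */
		}
		s1++;
		s2++;
	}
	/* Leftover characters at either end count as differences. */
	dist += (int)(strlen(s1) + strlen(s2));
	return dist <= maxdist;
}
```

Unlike a full Levenshtein computation, this needs no table and makes
a single pass, at the price of giving up early on strings that a
real edit distance would still consider close.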


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).
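
Passing a null pointer to a %s conversion is undefined behaviour in C:
some C libraries print "(null)", others crash. A minimal sketch of the
usual guard follows. The helper names and the message text are
hypothetical and are not taken from mdoc_validate.c.

```c
#include <stdio.h>

/* Hypothetical helper: never hand a null pointer to a %s conversion. */
const char *
printable(const char *s)
{
	return s == NULL ? "(none)" : s;
}

/* Hypothetical diagnostic formatter; the message text is made up. */
int
format_it_warning(char *buf, size_t sz, const char *argname)
{
	return snprintf(buf, sz, "skipping argument: %s",
	    printable(argname));
}
```

Routing every possibly-missing string through one helper keeps the
check in a single place instead of repeating it at each call site.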


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
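
A rough sketch of what replacing a stored child count by pointer checks
can look like. The struct below is a hypothetical, pared-down stand-in
for struct roff_node, and the mapping of "one" and "two pointer checks"
to the tests for zero and exactly one child is an assumption.

```c
#include <stddef.h>

/* Hypothetical, pared-down node; the real struct has many more members. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* Formerly a test like "nchild == 0": one pointer check. */
int
node_is_empty(const struct node *n)
{
	return n->child == NULL;
}

/* Formerly a test like "nchild == 1": two pointer checks. */
int
node_has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* An exact count needs a tiny loop over the siblings. */
size_t
node_count_children(const struct node *n)
{
	const struct node *c;
	size_t count = 0;

	for (c = n->child; c != NULL; c = c->next)
		count++;
	return count;
}
```

Because nothing is cached, there is no counter that can drift out of
sync with the actual list of children.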


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
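
To illustrate the distinction, here is a toy switch with both kinds of
fall-through. The token names and the return values are invented for
this sketch and do not come from mandoc.

```c
/* Hypothetical token names, for illustration only. */
enum tok { TOK_PP, TOK_LP, TOK_BR, TOK_SP, TOK_OTHER };

/* Toy model: how many blank lines a token requests. */
int
vspace_lines(enum tok t)
{
	int lines = 0;

	switch (t) {
	case TOK_PP:
	case TOK_LP:		/* adjacent labels: no comment needed */
		return 1;
	case TOK_SP:
		lines++;	/* code runs before falling through ... */
		/* FALLTHROUGH */
	case TOK_BR:		/* ... so the comment above is kept */
		return lines;
	default:
		return -1;
	}
}
```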


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
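
The rewinding rule described above can be sketched as follows. This is
one reading of the log entry, not the parser's actual code: the struct,
the flag values, and the BLK_CLOSED marker are hypothetical, and only
the roles of the "broken" and "ended" flags follow the description.

```c
#include <stddef.h>

/* Hypothetical flags; only their roles follow the log entry. */
#define BLK_BROKEN	(1 << 0)	/* an ancestor ended while this was open */
#define BLK_ENDED	(1 << 1)	/* end seen while a child was still open */
#define BLK_CLOSED	(1 << 2)	/* sketch only: block has been rewound */

struct blk {
	struct blk	*parent;
	int		 flags;
};

/* Rewind b and any pending ancestors; return how many blocks closed. */
int
rewind_block(struct blk *b)
{
	struct blk *p;
	int closed = 1;

	b->flags |= BLK_CLOSED;
	if ((b->flags & BLK_BROKEN) == 0)
		return closed;

	/* Walk from the broken block up through its parents. */
	for (p = b->parent; p != NULL; p = p->parent) {
		if (p->flags & BLK_ENDED) {
			p->flags |= BLK_CLOSED;
			closed++;
		}
		/* Stop after the first ancestor that is not broken. */
		if ((p->flags & BLK_BROKEN) == 0)
			break;
	}
	return closed;
}
```

For input like "Ao Bo Ac Bc", the Ao block would carry the ended flag
and the Bo block the broken flag, so closing Bo also closes Ao and then
stops.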


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
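
A minimal sketch of the position-dependent message format shown in the
first example above. The function name and signature are hypothetical,
and treating a line number of 0 as "no meaningful position" is an
assumption based on the /dev/null example.

```c
#include <stdio.h>

/*
 * Hypothetical formatter: include line and column only when the
 * message refers to an actual input position (line > 0 here).
 */
int
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *text)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, text);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, text);
}
```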


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
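
The overflow concern behind this audit can be sketched portably. The
helper below only imitates the kind of check reallocarray(3) performs
before multiplying; it is a hypothetical stand-in written for this
note, not the libc implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): refuse the request if
 * nmemb * size would wrap around instead of silently allocating a
 * buffer that is too small.
 */
void *
grow_array(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

With a plain realloc(ptr, nmemb * size), a wrapped product would
succeed with a short buffer, and the later writes would overflow it.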


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
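
The risk with strlen+malloc+snprintf is that the hand-computed size and
the format string can drift apart. asprintf(3) avoids this by letting
the formatter size the buffer. Since asprintf(3) is not part of ISO C,
the sketch below imitates the idea with two snprintf(3) calls that
share one format string. The function name and the output format are
hypothetical and do not reproduce what post_lb() prints.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative format only; not the text mandoc actually emits. */
#define LIB_FMT "library \"%s\""

/* Return a freshly allocated string, or NULL on failure. */
char *
describe_library(const char *name)
{
	char *out;
	int len;

	/* Let the formatter measure the result; no manual arithmetic. */
	len = snprintf(NULL, 0, LIB_FMT, name);
	if (len < 0)
		return NULL;
	out = malloc((size_t)len + 1);
	if (out == NULL)
		return NULL;
	(void)snprintf(out, (size_t)len + 1, LIB_FMT, name);
	return out;
}
```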


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
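
A minimal sketch of such a cache, assuming a function-scope static
buffer. The function name, the "sysname release" format, the buffer
size, and the call counter are assumptions made for illustration.

```c
#include <stdio.h>
#include <sys/utsname.h>

int uname_calls;	/* sketch only: counts real uname() calls */

/* Return the default operating system name, querying it only once. */
const char *
default_os(void)
{
	static char cached[256];	/* empty until first use */
	struct utsname uts;

	if (cached[0] != '\0')
		return cached;

	uname_calls++;
	if (uname(&uts) == -1)
		(void)snprintf(cached, sizeof(cached), "UNKNOWN");
	else
		(void)snprintf(cached, sizeof(cached), "%s %s",
		    uts.sysname, uts.release);
	return cached;
}
```

Every call after the first returns the same pointer without another
system call, which is where the saving comes from when many manuals are
processed in one run.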


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
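
The .nr value handling described above can be sketched as a small
function. The name and signature are hypothetical, and overflow is
deliberately ignored to keep the example short.

```c
#include <ctype.h>

/*
 * Hypothetical helper: compute the new register value from the old
 * value and the "value" argument of .nr.
 */
int
nr_apply(int old, const char *arg)
{
	char sign = '\0';
	int val = 0;

	/* An optional leading sign requests increment or decrement. */
	if (*arg == '+' || *arg == '-')
		sign = *arg++;

	/* The value ends right before the first non-digit character. */
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');

	if (sign == '+')
		return old + val;
	if (sign == '-')
		return old - val;
	return val;
}
```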


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.
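The two -width rules above can be sketched in C as follows. This is only an illustration under stated assumptions: width_value() and its "missing" out-parameter are hypothetical names, not mandoc's actual argument parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not mandoc's actual code: return the value
 * following the first "-width" in argv, or NULL if there is none.
 * Rule 1: the next word is consumed as the value even if it starts
 * with a dash.  Rule 2: a trailing "-width" without a value is
 * reported through *missing instead of aborting the parse.
 */
const char *
width_value(int argc, const char *const argv[], int *missing)
{
	int i;

	assert(missing != NULL);
	*missing = 0;
	for (i = 0; i < argc; i++) {
		if (strcmp(argv[i], "-width") != 0)
			continue;
		if (i + 1 < argc)
			return argv[i + 1];	/* even if it begins with '-' */
		*missing = 1;			/* non-fatal: caller may warn */
		return NULL;
	}
	return NULL;
}
```

A caller would typically emit a non-fatal message when *missing is set and then continue with a default width.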

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
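A minimal C sketch of the .Bx argument handling described above. The function name format_bx() is hypothetical and this is not mandoc's actual code; the "BSD" infix follows the rendering documented in mdoc(7), where ".Bx 4.3 Tahoe" yields "4.3BSD-Tahoe".

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper, not mandoc's actual code: format the at most
 * two .Bx arguments (version and variant) into buf.  Either argument
 * may be NULL.  Returns the snprintf(3) result.
 */
int
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	assert(buf != NULL && sz > 0);
	if (version == NULL)
		version = "";
	if (variant == NULL || variant[0] == '\0')
		return snprintf(buf, sz, "%sBSD", version);
	/* Uppercase the first character and join with a hyphen. */
	return snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)variant[0]), variant + 1);
}
```

Rejecting a third argument is left to the caller in this sketch.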


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .D1)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
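The three exclusions listed above can be expressed as a small predicate. This is a sketch only: can_break_at_hyphen() is a hypothetical name, it looks at a single word of free-form text, and it does not reproduce how mandoc actually marks breakable hyphens.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical predicate, not mandoc's actual code: may the output
 * line be broken after the hyphen at word[i]?
 */
bool
can_break_at_hyphen(const char *word, size_t i)
{
	size_t len;

	assert(word != NULL);
	len = strlen(word);
	if (i >= len || word[i] != '-')
		return false;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return false;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return false;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return false;
	return true;
}
```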

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.
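The "use the first, discard the second, and warn" policy might look roughly like this in C. The names pick_list_type() and list_types[] are hypothetical, the type table is a deliberately small subset, and the warning text is invented for illustration; for simplicity the sketch also warns when the same type is repeated.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of .Bl list type arguments. */
static const char *const list_types[] = {
	"-bullet", "-column", "-dash", "-enum", "-item", "-tag"
};
#define NTYPES (sizeof(list_types) / sizeof(list_types[0]))

/*
 * Hypothetical helper, not mandoc's actual code: return the first
 * list type found in argv, or NULL if there is none.  Every later
 * list type is discarded with a warning; *nwarn counts the warnings.
 */
const char *
pick_list_type(int argc, const char *const argv[], int *nwarn)
{
	const char *first = NULL;
	size_t j;
	int i;

	assert(nwarn != NULL);
	*nwarn = 0;
	for (i = 0; i < argc; i++) {
		for (j = 0; j < NTYPES; j++)
			if (strcmp(argv[i], list_types[j]) == 0)
				break;
		if (j == NTYPES)
			continue;		/* not a list type */
		if (first == NULL) {
			first = argv[i];	/* the first type wins */
			continue;
		}
		fprintf(stderr, "warning: ignoring list type %s, using %s\n",
		    argv[i], first);
		(*nwarn)++;
	}
	return first;
}
```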

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
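The detection rule stated above, a line ending in any of ".!?", can be sketched as a one-line check. The function name line_ends_sentence() is hypothetical, and the sketch deliberately ignores refinements such as trailing closing punctuation or the distinction between text lines and macro lines.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not mandoc's actual code: does the input line
 * end in one of the characters '.', '!' or '?'?  A single trailing
 * newline is ignored.
 */
bool
line_ends_sentence(const char *line)
{
	size_t len;

	assert(line != NULL);
	len = strlen(line);
	if (len > 0 && line[len - 1] == '\n')
		len--;
	if (len == 0)
		return false;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

With such a check, a formatter can emit two spaces instead of one before the next word, which is why starting each new sentence on a new input line matters.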


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.284 31-Dec-2018 schwarze

Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.


# 1.283 31-Dec-2018 schwarze

Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.


# 1.282 31-Dec-2018 schwarze

Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
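
The problem can be illustrated with a minimal sketch. The table and the names `names`, `TOK_FIRST`, `TOK_MAX`, and `tok_name` are hypothetical and not taken from mandoc. When token numbers do not start at zero, it is tempting to bias the array pointer once, but merely computing `names - TOK_FIRST` is already undefined. Adjusting the index instead keeps all pointer arithmetic inside the array:

```c
#include <stddef.h>

/* Hypothetical token numbering that does not start at zero. */
enum { TOK_FIRST = 10, TOK_MAX = 13 };

static const char *const names[TOK_MAX - TOK_FIRST] = {
	"Dd", "Dt", "Os"
};

/*
 * Undefined behaviour, even if the result is only ever indexed
 * with values >= TOK_FIRST:
 *	const char *const *base = names - TOK_FIRST;
 *	return base[tok];
 * Subtract the offset from the index, not from the pointer.
 */
static const char *
tok_name(int tok)
{
	if (tok < TOK_FIRST || tok >= TOK_MAX)
		return NULL;
	return names[tok - TOK_FIRST];
}
```

The range check is part of the sketch only; whether the real code needs one depends on how its callers obtain the token.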


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
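
A linear time, zero space heuristic of this kind might look like the following sketch. It is an illustration written for this log, not necessarily the code that was committed: the function name `similar`, the `maxdist` parameter, and the exact set of recognised edits are assumptions. It walks both strings once, charges one unit for each local substitution, transposition, insertion, or deletion, and gives up as soon as the strings diverge in a way it cannot explain locally.

```c
#include <string.h>

/*
 * Return 1 if a and b differ by at most maxdist simple edits, else 0.
 * One pass over both strings, no allocation, no distance matrix.
 */
static int
similar(const char *a, const char *b, int maxdist)
{
	int dist = 0;

	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++dist > maxdist)
			return 0;
		if (a[0] == b[1] && a[1] == b[0]) {
			a += 2;		/* two characters swapped */
			b += 2;
		} else if (a[1] == b[1]) {
			a++;		/* one character replaced */
			b++;
		} else if (a[0] == b[1]) {
			b++;		/* extra character in b */
		} else if (a[1] == b[0]) {
			a++;		/* extra character in a */
		} else
			return 0;	/* no local explanation */
	}
	/* Whatever is left over at the end counts as edits, too. */
	dist += (int)(strlen(a) + strlen(b));
	return dist <= maxdist;
}
```

Unlike a Levenshtein metric, this can miss some pairs that are in fact close, which seems an acceptable price for a typo hint.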


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@
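
Passing a null pointer for a `%s` conversion is undefined behaviour in C, even though some libc implementations print "(null)". A minimal sketch of the usual guard follows; the helper `format_arg` and the placeholder text "(empty)" are hypothetical and not taken from mandoc:

```c
#include <stdio.h>

/*
 * Hypothetical helper: format a macro name and its optional
 * argument for a diagnostic, never handing NULL to %s.
 */
static int
format_arg(char *buf, size_t sz, const char *macro, const char *arg)
{
	return snprintf(buf, sz, "%s %s", macro,
	    arg == NULL ? "(empty)" : arg);
}
```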


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
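
The three kinds of replacement mentioned above can be sketched as follows. The struct is a hypothetical reduction to the two links that matter here, and the function names are made up for illustration; the real struct roff_node has many more members.

```c
#include <stddef.h>

/* Hypothetical, reduced node: only the links needed here. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* "nchild == 0" becomes one pointer check. */
static int
has_no_children(const struct node *n)
{
	return n->child == NULL;
}

/* "nchild == 1" becomes two pointer checks. */
static int
has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* Where the number itself is needed, a tiny loop recovers it. */
static int
count_children(const struct node *n)
{
	const struct node *c;
	int cnt = 0;

	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Because the count is derived from the links on demand, there is no separate counter that could get out of sync with the tree.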


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().
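
Illustration only, not code from this commit: the "last one wins" rule
and the "0n" fallback described above can be sketched as follows. The
helper name effective_width() is hypothetical, and the sketch assumes
that the word following -width is always consumed as its value, as the
log entry for rev. 1.101 describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find the effective -width value in a list of
 * macro arguments.  The last -width wins.  The argument following
 * -width is always consumed as its value; if -width is the final
 * argument, the value is taken to be "0n".  Returns NULL when no
 * -width is present.
 */
const char *
effective_width(int argc, const char *const *argv)
{
	const char	*width;
	int		 i;

	assert(argc == 0 || argv != NULL);
	width = NULL;
	for (i = 0; i < argc; i++) {
		if (strcmp(argv[i], "-width") != 0)
			continue;
		width = i + 1 < argc ? argv[++i] : "0n";
	}
	return width;
}
```

For the arguments "-tag -width 5n -width Ds", this sketch returns "Ds".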


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
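
Illustration only, not code from this commit: a minimal sketch of the
toggling behaviour. The function name sm_next_mode() is hypothetical.
Treating an unrecognized argument as a toggle is an assumption taken
from the description of rev. 1.146 in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: compute the new spacing mode (1 = on, 0 = off)
 * for ".Sm [on | off]".  Without an argument, the current mode is
 * toggled.  An unrecognized argument is assumed to toggle as well.
 */
int
sm_next_mode(int current, const char *arg)
{
	assert(current == 0 || current == 1);
	if (arg != NULL) {
		if (strcmp(arg, "on") == 0)
			return 1;
		if (strcmp(arg, "off") == 0)
			return 0;
	}
	return !current;
}
```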


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
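
Illustration only, not code from this commit: the overflow that
reallocarray(3) guards against is the silent wrap-around of
nmemb * size before the product reaches the allocator. A portable
sketch of that check follows. The helper name checked_array_alloc()
is hypothetical; where reallocarray(3) is available, it performs an
equivalent check itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate room for nmemb objects of the given
 * nonzero size, failing with ENOMEM instead of wrapping around when
 * the product nmemb * size does not fit into a size_t.
 */
void *
checked_array_alloc(size_t nmemb, size_t size)
{
	assert(size != 0);
	if (nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return malloc(nmemb * size);
}
```

A plain malloc(nmemb * size) with the same arguments could return a
buffer much smaller than the caller expects.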


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
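
Illustration only, not the post_lb() code: the weakness of
strlen+malloc+snprintf is that the length computation and the format
string can drift apart. asprintf(3) avoids that by sizing the buffer
from the format itself. Since asprintf(3) is not part of ISO C, the
sketch below imitates it with two snprintf(3) calls. The helper name
format_library() and the format string are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical format; used for both measuring and writing. */
#define LIB_FMT	"library \"%s\""

/*
 * Hypothetical helper: return a newly allocated string built from
 * LIB_FMT and name, or NULL on failure.  The required length is
 * measured by snprintf(3) from the same format that is used for
 * writing, so no manual strlen(3) arithmetic is involved.
 */
char *
format_library(const char *name)
{
	char	*buf;
	int	 len;

	assert(name != NULL);
	len = snprintf(NULL, 0, LIB_FMT, name);
	if (len < 0)
		return NULL;
	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return NULL;
	(void)snprintf(buf, (size_t)len + 1, LIB_FMT, name);
	return buf;
}
```

The caller owns the returned string and has to free(3) it.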


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
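
Illustration only, not code from this commit: caching the result of
uname(3) in a function-scope static buffer can be sketched as follows.
The function name default_os_name() and the "sysname release" output
format are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: return "sysname release" for the running
 * system.  uname(3) is called on the first invocation only; later
 * calls return the cached string.  Returns NULL if uname(3) fails.
 */
const char *
default_os_name(void)
{
	/* Large enough: both fields are members of struct utsname. */
	static char	 cache[sizeof(struct utsname)];
	static int	 cached;
	struct utsname	 uts;

	if (cached)
		return cache;
	if (uname(&uts) == -1)
		return NULL;
	(void)snprintf(cache, sizeof(cache), "%s %s",
	    uts.sysname, uts.release);
	cached = 1;
	return cache;
}
```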


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
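
Illustration only, not the roff.c code: the parsing rule described
above can be sketched as follows. The function name nr_apply() is
hypothetical, and the sketch does not guard against integer overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: apply the "value" argument of an .nr request
 * to the current register value and return the new value.  A leading
 * '+' or '-' requests an increment or decrement; without a sign, the
 * register is overwritten.  Parsing ends right before the first
 * non-digit character.
 */
int
nr_apply(int current, const char *arg)
{
	int	 sign, val;

	assert(arg != NULL);
	sign = 0;
	if (*arg == '+' || *arg == '-')
		sign = *arg++ == '+' ? 1 : -1;
	val = 0;
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');
	return sign == 0 ? val : current + sign * val;
}
```

With a register currently holding 5, the argument "+3" yields 8, and
the argument "12n" yields 12 because parsing stops at the "n".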


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source of confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
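
Illustration only, not code from this commit: the formatting rule for
the second .Bx argument can be sketched as follows. The function name
format_bx() is hypothetical. That the first argument is printed
directly before "BSD" is an assumption based on how mdoc(7) formats
.Bx.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the output of ".Bx version variant"
 * into buf.  Either argument may be NULL.  The first character of
 * the variant is converted to uppercase, and the variant is appended
 * with a hyphen.  Returns 0 on success, -1 if buf is too small.
 */
int
format_bx(char *buf, size_t bufsz, const char *version, const char *variant)
{
	int	 len;

	assert(buf != NULL && bufsz > 0);
	if (version == NULL)
		version = "";
	if (variant == NULL || *variant == '\0')
		len = snprintf(buf, bufsz, "%sBSD", version);
	else
		len = snprintf(buf, bufsz, "%sBSD-%c%s", version,
		    toupper((unsigned char)*variant), variant + 1);
	return len < 0 || (size_t)len >= bufsz ? -1 : 0;
}
```

For the arguments "4.3" and "tahoe", the sketch writes "4.3BSD-Tahoe".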


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .D1)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no longer default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
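
Illustration only, not the term.c code: the three exclusion rules
above can be sketched as a predicate. The function name
hyphen_is_breakable() is hypothetical, and the word is assumed to
come from free-form text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate: may the output line be broken after the
 * hyphen at word[i]?  Returns 1 if so, 0 otherwise.
 */
int
hyphen_is_breakable(const char *word, size_t i)
{
	size_t	 len;

	assert(word != NULL);
	len = strlen(word);
	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```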

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@'s end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
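
The detection rule described above might look roughly like the
following. The function name line_ends_sentence is an assumption made
for this sketch, and the real parser has to deal with more cases
(for example trailing delimiters) than this shows.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the input line ends in one of
 * the characters '.', '!', or '?', which is treated as an end of
 * sentence (EOS), and 0 otherwise.
 */
int
line_ends_sentence(const char *line)
{
	size_t	 len;

	assert(line != NULL);
	len = strlen(line);
	if (len == 0)
		return 0;
	/* line[len - 1] is never NUL here, so strchr(3) is safe. */
	return strchr(".!?", line[len - 1]) != NULL;
}
```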


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure
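
A malloc frontend of the kind mentioned in the last item could look
roughly like this. The name xmalloc is an assumption for the sketch
and not necessarily what the commit calls it; the error reporting
sticks to perror(3) and exit(3), in line with the portability items
above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical frontend: like malloc(3), but never returns NULL.
 * On failure, report the error and terminate the program, so that
 * callers do not need their own error handling.
 */
void *
xmalloc(size_t size)
{
	void	*ptr;

	assert(size > 0);
	ptr = malloc(size);
	if (ptr == NULL) {
		perror(NULL);
		exit(EXIT_FAILURE);
	}
	return ptr;
}
```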

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages
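
Selecting messages from a string table rather than a switch statement
can be sketched as follows. The enum members and message texts here
are made up for illustration; they are not the contents of the real
enum merr.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; the real enum has other members. */
enum merr {
	ERR_NOTEXT,
	ERR_BADARG,
	ERR_MAX
};

/* One message per code, indexed directly by the enum value. */
static const char *const merr_msgs[ERR_MAX] = {
	"text required",	/* ERR_NOTEXT */
	"invalid argument",	/* ERR_BADARG */
};

const char *
merr_str(enum merr code)
{
	assert(code < ERR_MAX);
	return merr_msgs[code];
}
```

Adding a new message then means adding one enum member and one table
entry, without touching any control flow.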


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.281 30-Dec-2018 schwarze

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
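
The up-front translation can be pictured as a single mapping function
applied before any other validation. The reduced enum and the name
canonical_tok below are assumptions for this sketch; only the alias
pairs themselves come from the commit message.

```c
#include <assert.h>

/* Hypothetical, heavily reduced token enum. */
enum tok {
	MDOC_Pp, MDOC_Lp,	/* Lp is an obsolete alias of Pp */
	MDOC_Ft, MDOC_Ot,	/* Ot is an obsolete alias of Ft */
	MAN_PP, MAN_LP, MAN_P,	/* LP and P are aliases of PP */
	TOK_MAX
};

/* Map obsolete aliases to the standard form, leave the rest alone. */
enum tok
canonical_tok(enum tok tok)
{
	assert(tok < TOK_MAX);
	switch (tok) {
	case MDOC_Lp:
		return MDOC_Pp;
	case MDOC_Ot:
		return MDOC_Ft;
	case MAN_LP:
	case MAN_P:
		return MAN_PP;
	default:
		return tok;
	}
}
```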


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
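
A minimal sketch of the problem and one way around it, assuming a
table whose first valid index is not zero. The names TOK_FIRST,
names, and tok_name are hypothetical and only serve the example.

```c
#include <assert.h>
#include <stddef.h>

#define TOK_FIRST	10	/* hypothetical value of the first token */
#define TOK_COUNT	3

static const char *const names[TOK_COUNT] = { "Dd", "Dt", "Os" };

/*
 * Computing a shifted base pointer is undefined behaviour,
 * even if it is only ever indexed with valid token values:
 *
 *	const char *const *base = names - TOK_FIRST;
 *	return base[tok];
 *
 * Subtracting the offset from the index instead is well defined.
 */
const char *
tok_name(int tok)
{
	assert(tok >= TOK_FIRST && tok < TOK_FIRST + TOK_COUNT);
	return names[tok - TOK_FIRST];
}
```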


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)
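
Passing NULL for a %s conversion is undefined behaviour, so the usual
remedy is to substitute a placeholder before formatting. The helper
names and the placeholder text below are assumptions for this sketch,
not the actual fix.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: never hand NULL to a %s conversion. */
const char *
str_or(const char *s, const char *fallback)
{
	assert(fallback != NULL);
	return s != NULL ? s : fallback;
}

/* Format a diagnostic even if the node carries no string. */
int
format_name(char *buf, size_t bufsz, const char *name)
{
	assert(buf != NULL && bufsz > 0);
	return snprintf(buf, bufsz, "section %s",
	    str_or(name, "(none)"));
}
```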


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
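
The three kinds of replacement mentioned above can be sketched on a
reduced node structure. The struct and function names are
hypothetical; struct roff_node has many more members.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical version of a syntax tree node. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* One pointer check replaces "nchild == 0". */
int
has_no_children(const struct node *n)
{
	assert(n != NULL);
	return n->child == NULL;
}

/* Two pointer checks replace "nchild == 1". */
int
has_one_child(const struct node *n)
{
	assert(n != NULL);
	return n->child != NULL && n->child->next == NULL;
}

/* A tiny loop covers the rare case that needs the exact count. */
int
count_children(const struct node *n)
{
	const struct node	*c;
	int			 count = 0;

	assert(n != NULL);
	for (c = n->child; c != NULL; c = c->next)
		count++;
	return count;
}
```

Because the count is derived from the links themselves, it can no
longer get out of sync with the tree.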


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
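
The rewinding rule described in this entry can be sketched roughly as
follows. This is an illustration only, not the mandoc code: the node
layout, the flag names NODE_BROKEN, NODE_ENDED and NODE_CLOSED, and the
function rewind_ended() are hypothetical, and "rewinding" a block is
reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags and node layout, for illustration only. */
#define NODE_BROKEN	(1 << 0)	/* block was broken by another block */
#define NODE_ENDED	(1 << 1)	/* block is waiting to be rewound */
#define NODE_CLOSED	(1 << 2)	/* set by this sketch when rewinding */

struct node {
	struct node	*parent;
	int		 flags;
};

/*
 * Walk from a broken block towards the root.  Rewind every
 * ancestor marked ENDED and stop after processing the first
 * ancestor that is not BROKEN.  Returns the number of
 * ancestors rewound.
 */
int
rewind_ended(struct node *n)
{
	int	 closed = 0;

	assert(n != NULL);
	for (n = n->parent; n != NULL; n = n->parent) {
		if (n->flags & NODE_ENDED) {
			n->flags |= NODE_CLOSED;
			closed++;
		}
		if ((n->flags & NODE_BROKEN) == 0)
			break;
	}
	return closed;
}
```

Because blocks are rewound in the reverse order they were opened, a
plain walk along the parent pointers is enough in this sketch; no
separate per-node "pending" pointer has to be kept in sync.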


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailboxd dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.
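
A minimal sketch of the "concatenate the column lists" idea, assuming
a simple growable array of borrowed strings. The struct collist and
the function collist_append() are hypothetical names chosen for this
illustration; the real parser data structures differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for the column widths of one list. */
struct collist {
	const char	**cols;		/* borrowed width strings */
	size_t		  ncols;
};

/*
 * Append n width strings to the list, whether or not it
 * already holds some, instead of treating a second group of
 * widths as an error.  Returns 0 on success and -1 on
 * failure; on failure the list is left unchanged.
 */
int
collist_append(struct collist *cl, const char *const *words, size_t n)
{
	const char	**grown;
	size_t		  i;

	assert(cl != NULL && (words != NULL || n == 0));

	if (n == 0)
		return 0;
	/* Refuse sizes that would overflow the multiplication. */
	if (n > SIZE_MAX / sizeof(*grown) - cl->ncols)
		return -1;
	grown = realloc(cl->cols, (cl->ncols + n) * sizeof(*grown));
	if (grown == NULL)
		return -1;
	for (i = 0; i < n; i++)
		grown[cl->ncols + i] = words[i];
	cl->cols = grown;
	cl->ncols += n;
	return 0;
}
```

Calling collist_append() once for the widths right after -column and
once more for the widths at the end of the argument list yields a
single list in input order.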


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().
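
The first two items can be illustrated with a small sketch. The struct
arg and the function bl_width() are hypothetical and only model an
already parsed argument list: later -width arguments override earlier
ones, and a -width without a value counts as 0n rather than being
ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parsed macro argument: name plus optional value. */
struct arg {
	const char	*name;
	const char	*value;		/* NULL if no value was given */
};

/*
 * Return the effective -width of a list: the value of the
 * last -width argument, "0n" if that argument has no value,
 * or NULL if there is no -width argument at all.
 */
const char *
bl_width(const struct arg *args, size_t nargs)
{
	const char	*width = NULL;
	size_t		 i;

	assert(args != NULL || nargs == 0);
	for (i = 0; i < nargs; i++) {
		if (strcmp(args[i].name, "-width") != 0)
			continue;
		/* Last one wins, so keep overwriting. */
		width = args[i].value != NULL ? args[i].value : "0n";
	}
	return width;
}
```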


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
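
For readers unfamiliar with the motivation, the following sketch shows
the kind of check that makes an array allocation safe from overflow.
It is a portable stand-in written for this note, not the libc
implementation; the name checked_reallocarray() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Resize ptr to hold nmemb elements of the given size.
 * Fails with ENOMEM instead of letting nmemb * size wrap
 * around, which a bare realloc(ptr, nmemb * size) would
 * silently allow.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	/* In this sketch, a zero element size is a caller bug. */
	assert(size > 0);

	if (nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```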


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
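
To illustrate why a single formatting-and-allocating call is less
error prone, here is a minimal asprintf(3)-like helper built only on
standard C. It is a sketch for this note: the name fmt_alloc() is
hypothetical, and the real code simply calls asprintf(3).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a freshly allocated string of exactly the
 * right size.  The length is computed by vsnprintf(3), so
 * it cannot drift out of sync with the format string the
 * way a hand-written strlen(3) sum can.
 * Returns NULL on error; the caller frees the result.
 */
char *
fmt_alloc(const char *fmt, ...)
{
	va_list	 ap, ap2;
	char	*buf = NULL;
	int	 len;

	assert(fmt != NULL);
	va_start(ap, fmt);
	va_copy(ap2, ap);
	len = vsnprintf(NULL, 0, fmt, ap);	/* measure only */
	va_end(ap);
	if (len >= 0 && (buf = malloc((size_t)len + 1)) != NULL)
		(void)vsnprintf(buf, (size_t)len + 1, fmt, ap2);
	va_end(ap2);
	return buf;
}
```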


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.
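
The hyphen rules listed in this entry can be sketched as a small predicate.
This is only an illustration under stated assumptions, not the code that was
committed: the function name hyphen_breakable() and its signature are
hypothetical, and the "free-form text only" condition is left to the caller
because it depends on parser context.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if the hyphen at word[i] is an
 * acceptable place to break the output line, 0 otherwise.
 * The caller is assumed to invoke this only for free-form text.
 */
static int
hyphen_breakable(const char *word, size_t len, size_t i)
{
	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

A formatter following this idea would scan a word that does not fit and
remember the last index for which the predicate holds.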


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
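
The end-of-sentence test described in this entry amounts to looking at the
last character of an input line. A minimal sketch, assuming a hypothetical
helper named ends_sentence(); the real parser of that time may well have
handled more cases, and this framework was replaced in revision 1.52:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the line ends in one of ".!?",
 * which the entry above treats as an end of sentence (EOS).
 */
static int
ends_sentence(const char *line)
{
	size_t len = strlen(line);

	if (len == 0)
		return 0;
	/* Exclude the terminating NUL, which strchr() would also match. */
	return strchr(".!?", line[len - 1]) != NULL;
}
```

Because only the end of the line is inspected, a sentence that ends in the
middle of an input line is not detected, which is why the "new sentence, new
line" rule matters for this approach.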


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.280 14-Dec-2018 schwarze

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.


# 1.279 04-Dec-2018 schwarze

Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.

Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.

In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.


# 1.278 03-Dec-2018 schwarze

In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.


Revision tags: OPENBSD_6_4_BASE
# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
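
The class of problem described here can be illustrated as follows. This is a
generic sketch, not the mandoc code: the token values, the names[] table and
tok_name() are made up for the example. The point is that subtracting the
offset from the index is well defined, while forming a "biased" base pointer
that lies before the array is not.

```c
#include <stddef.h>

/* Hypothetical token numbering that does not start at zero. */
enum { TOK_FIRST = 10, TOK_A = TOK_FIRST, TOK_B, TOK_C, TOK_MAX };

static const char *const names[TOK_MAX - TOK_FIRST] = { "A", "B", "C" };

/*
 * Undefined behaviour even if never dereferenced, because the result
 * points before the beginning of names[]:
 *
 *	static const char *const *biased = names - TOK_FIRST;
 */

/* Well-defined alternative: adjust the index, not the base pointer. */
static const char *
tok_name(int tok)
{
	if (tok < TOK_FIRST || tok >= TOK_MAX)
		return NULL;
	return names[tok - TOK_FIRST];
}
```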


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
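
A rough sketch of what a linear time, zero space fuzzy comparison can
look like. This is not the code from this commit; the function name
roughly_equal() and the maxdist parameter are made up for illustration.
It walks both strings once, allocates nothing, and only looks one
character ahead to resynchronize after a mismatch.

```c
#include <string.h>

/*
 * Hypothetical helper, not taken from mandoc:
 * return 1 if a and b differ by at most maxdist simple edits, else 0.
 */
int
roughly_equal(const char *a, const char *b, int maxdist)
{
	int	 dist = 0;

	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++dist > maxdist)
			return 0;
		if (a[1] == b[1]) {		/* one character replaced */
			a++;
			b++;
		} else if (a[1] == *b) {	/* extra character in a */
			a++;
		} else if (*a == b[1]) {	/* extra character in b */
			b++;
		} else {			/* no resync, count as replaced */
			a++;
			b++;
		}
	}
	/* Whatever remains of the longer string counts as edits. */
	dist += (int)(strlen(a) + strlen(b));
	return dist <= maxdist;
}
```

Unlike a full Levenshtein metric, such a heuristic can misjudge strings
with several adjacent errors, which is an acceptable trade-off for
catching single typos in short section names.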


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).
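
For illustration only, a minimal sketch of the kind of guard that
avoids passing NULL to a "%s" conversion, which is undefined behaviour
in C. The helper name str_or() is an assumption and does not appear in
mandoc.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not taken from mandoc:
 * return s if it is a valid string, else the given fallback.
 */
const char *
str_or(const char *s, const char *fallback)
{
	return s != NULL ? s : fallback;
}
```

A caller would then write printf("%s", str_or(name, "(none)")) rather
than passing a possibly NULL pointer directly.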


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
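
A simplified sketch of how a child counter can be replaced by pointer
checks. The struct below is a stripped-down stand-in for illustration,
not the real struct roff_node, and the function names are assumptions.

```c
#include <stddef.h>

/* Simplified stand-in for a syntax tree node. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* One pointer check replaces "nchild == 0". */
int
has_no_child(const struct node *n)
{
	return n->child == NULL;
}

/* Two pointer checks replace "nchild == 1". */
int
has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* A tiny loop for the rare case where the exact count matters. */
int
count_children(const struct node *n)
{
	const struct node	*c;
	int			 cnt = 0;

	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Because the answers are derived from the links themselves, there is no
separate counter that could get out of sync with the tree.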


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailboxd dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
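
To illustrate why reallocarray(3) is safer than realloc(ptr, n * size),
here is a portable sketch of the overflow check it performs before
multiplying. The wrapper name checked_reallocarray() is an assumption;
on systems that provide reallocarray(3), that function should be used
directly.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical portable stand-in for reallocarray(3):
 * refuse the request if nmemb * size would overflow size_t.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

With a plain realloc(), an overflowing product would silently wrap
around and yield a buffer that is much smaller than the caller expects.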


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, i wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
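
The caching idea can be sketched as follows. This is a minimal illustration, not the committed code: the helper name default_os_name() and the buffer size are assumptions, and only uname(3) itself comes from the commit message.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper (name assumed): return "sysname release",
 * calling uname(3) only on the first invocation and reusing the
 * static buffer on every later call.
 */
const char *
default_os_name(void)
{
	static char	 cached[512];	/* empty string: nothing cached yet */
	struct utsname	 uts;

	if (cached[0] != '\0')
		return cached;
	if (uname(&uts) == -1)
		return "UNKNOWN";	/* failures are not cached */
	snprintf(cached, sizeof(cached), "%s %s", uts.sysname, uts.release);
	return cached;
}
```

Every caller after the first gets the same pointer back, so processing many manuals in one run costs a single system call.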


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
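
The parsing rule described above might look roughly like this. It is a sketch under stated assumptions: nr_apply() is a hypothetical name, not the function in roff.c, and overflow handling is left out.

```c
#include <ctype.h>

/*
 * Hypothetical helper (not the real roff.c code): compute the new
 * register value from the old one and the "value" argument of .nr.
 */
int
nr_apply(int oldval, const char *arg)
{
	char	 sign = '\0';
	int	 val = 0;

	/* An optional sign requests increment or decrement. */
	if (*arg == '+' || *arg == '-')
		sign = *arg++;
	/* The number ends right before the first non-digit character. */
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');
	if (sign == '+')
		return oldval + val;
	if (sign == '-')
		return oldval - val;
	return val;	/* no sign: plain assignment */
}
```

With a register currently holding 5, the argument "+3" yields 8, "-2" yields 3, and "12n" yields 12 because parsing stops at the "n".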


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
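
A rough sketch of the normalization described above. The helper name format_bx() is hypothetical, and inserting the literal "BSD" between the two arguments is an assumption based on how .Bx is usually rendered; the commit message itself only mentions the uppercase conversion and the hyphen.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper: ("4.3", "reno") yields "4.3BSD-Reno".
 * The "BSD" literal is an assumption, see the note above.
 * Returns what snprintf(3) returns.
 */
int
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, sz, "%sBSD", version);
	/* Uppercase the first character and append with a hyphen. */
	return snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```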


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.
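
The change of data layout can be illustrated like this. All type and member names below are made up for the illustration and greatly simplified; they are not the libmdoc definitions.

```c
#include <stdlib.h>

/* Hypothetical, simplified per-macro data. */
struct an_data { int auth; };
struct bl_data { int type; int compact; };

/* Old layout: a union of pointers, each pointing to its own struct. */
union data_old {
	struct an_data	*an;
	struct bl_data	*bl;
};

/* New layout: a union of structs, reached through one pointer. */
union data_new {
	struct an_data	 an;
	struct bl_data	 bl;
};

struct node {
	union data_new	*norm;	/* NULL for macros without extra data */
};

/* Allocate zeroed data for a node; returns 0 on success, -1 on failure. */
int
node_alloc_data(struct node *n)
{
	n->norm = calloc(1, sizeof(*n->norm));
	return n->norm == NULL ? -1 : 0;
}
```

Every allocation now has the size of the largest member, which is where the small memory overhead comes from; in exchange, there is one allocation and one pointer to check regardless of the macro.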


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff doesn't print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
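
The three exclusions listed above can be expressed as a small predicate. This is a sketch only: hyphen_breakable() is a hypothetical name, and the real code marks breakable hyphens during parsing rather than testing words afterwards.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate: may the output line be broken right
 * after the hyphen at word[i]?  Returns 1 if so, 0 otherwise.
 */
int
hyphen_breakable(const char *word, size_t i)
{
	size_t	 len = strlen(word);

	if (i >= len || word[i] != '-')
		return 0;
	if (i == 0 || i + 1 == len)	/* beginning or end of the word */
		return 0;
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;		/* next to another hyphen */
	if (word[i - 1] == '\\')
		return 0;		/* escaped by a backslash */
	return 1;
}
```

For example, the first hyphen in "break-at-hyphen" qualifies, while the hyphens in "-flag", "a--b" and "a\-b" do not.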

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.
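
How the three levels map to behaviour can be sketched as follows. The enum, struct and function names are hypothetical and the exit codes are placeholders; only the meaning of the levels is taken from the description above.

```c
/* Hypothetical names; the real enum and exit codes may differ. */
enum msg_level { LVL_OK = 0, LVL_WARNING, LVL_ERROR, LVL_FATAL };

struct msg_state {
	enum msg_level	 worst;	/* most severe level seen so far */
};

/* Record a message, remembering only the most severe level. */
void
msg_report(struct msg_state *st, enum msg_level lvl)
{
	if (lvl > st->worst)
		st->worst = lvl;
}

/* FATAL means no output can be produced from this input file. */
int
msg_can_output(const struct msg_state *st)
{
	return st->worst < LVL_FATAL;
}

/* WARNINGs do not cause an error exit, ERRORs and FATALs do. */
int
msg_exit_status(const struct msg_state *st)
{
	return st->worst >= LVL_ERROR ? 1 : 0;
}
```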

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@
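
The "first one wins" rule might be sketched like this. The function pick_list_type(), its warning text and the way arguments are passed are assumptions for the illustration; the list type names are the ones documented in mdoc(7).

```c
#include <stdio.h>
#include <string.h>

static const char *const list_types[] = {
	"-bullet", "-column", "-dash", "-diag", "-enum", "-hang",
	"-hyphen", "-inset", "-item", "-ohang", "-tag"
};

static int
is_list_type(const char *arg)
{
	size_t	 i;

	for (i = 0; i < sizeof(list_types) / sizeof(list_types[0]); i++)
		if (strcmp(arg, list_types[i]) == 0)
			return 1;
	return 0;
}

/*
 * Hypothetical helper: return the first list type among the .Bl
 * arguments, warn about and discard any later ones, and store the
 * number of warnings in *nwarn.  Returns NULL if no type is given.
 */
const char *
pick_list_type(const char *const *args, size_t nargs, int *nwarn)
{
	const char	*first = NULL;
	size_t		 i;

	*nwarn = 0;
	for (i = 0; i < nargs; i++) {
		if (!is_list_type(args[i]))
			continue;
		if (first == NULL) {
			first = args[i];
			continue;
		}
		fprintf(stderr, "warning: ignoring list type %s, using %s\n",
		    args[i], first);
		(*nwarn)++;
	}
	return first;
}
```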


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
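
The end-of-sentence test described above is simple enough to sketch in a few lines. The function name is hypothetical, and the real parser additionally has to decide which kinds of input lines the test applies to.

```c
#include <string.h>

/*
 * Hypothetical helper: does this input line end a sentence,
 * i.e. is its last character one of ".!?"?
 */
int
line_ends_sentence(const char *line)
{
	size_t	 len = strlen(line);

	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

Because only the last character of the line is inspected, a sentence ending in the middle of an input line goes undetected, which is why the rule "new sentence, new line" matters even more.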


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages
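
A minimal sketch of the string-table idiom mentioned above, assuming a small made-up set of error codes. The enum values, the message texts, and the name merr_str() are hypothetical and do not reproduce the tables in libman.h or libmdoc.h.

```c
#include <assert.h>

enum merr {			/* hypothetical error codes */
	ERR_NOARGS = 0,
	ERR_BADLIST,
	ERR_NONAME,
	ERR_MAX
};

/* One message per code, in the same order as the enum. */
static const char *const merr_msg[ERR_MAX] = {
	"macro requires arguments",	/* ERR_NOARGS */
	"conflicting list types",	/* ERR_BADLIST */
	"missing NAME section"		/* ERR_NONAME */
};

/* Look up the message by index instead of using a switch statement. */
static const char *
merr_str(enum merr code)
{
	assert((int)code >= 0 && code < ERR_MAX);
	return merr_msg[code];
}
```

The lookup is a single array access, and adding a new code only requires one new enum value and one new table entry.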


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.277 17-Aug-2018 schwarze

Remove more pointer arithmetic passing via regions outside the array
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.


# 1.276 16-Aug-2018 schwarze

Do not calculate a pointer to a memory location before the beginning of
a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson
point out that is undefined behaviour by the C standard even if we
never access the pointer.
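
The kind of problem described above can be illustrated with a small sketch. The token numbers and the names TOK_FIRST, TOK_MAX, names[], and tok_name() are assumptions made for this example, not the identifiers used in mandoc.

```c
#include <assert.h>

#define TOK_FIRST 10	/* hypothetical: first valid token number */
#define TOK_MAX   13	/* hypothetical: one past the last valid token */

static const char *const names[TOK_MAX - TOK_FIRST] = { "Dd", "Dt", "Os" };

/*
 * Undefined even if the result is never dereferenced, because the
 * pointer would refer to memory before the start of the array:
 *	const char *const *biased = names - TOK_FIRST;
 * Defined alternative: adjust the index, not the pointer.
 */
static const char *
tok_name(int tok)
{
	assert(tok >= TOK_FIRST && tok < TOK_MAX);
	return names[tok - TOK_FIRST];
}
```

Subtracting the offset from the index keeps every intermediate pointer inside the array, which is all the C standard permits.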


# 1.275 01-Aug-2018 schwarze

Fix an off-by-one string read access that could happen if an empty
string argument preceded a string argument beginning with "--".
Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.


# 1.274 01-Aug-2018 schwarze

Avoid a read access one byte beyond the end of an allocated string
which occurred in situations like ".Fl a Cm --"; found by
Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
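
One possible shape of such a heuristic is sketched below. It is only an illustration of a single-pass comparison without auxiliary storage, not necessarily the code that was committed; the function name similar() and the threshold maxdist are assumptions.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if s1 and s2 differ by at most maxdist simple edits
 * (replaced, extra, or missing characters), 0 otherwise.
 * Single pass over both strings, no auxiliary storage.
 */
static int
similar(const char *s1, const char *s2)
{
	const int maxdist = 3;	/* hypothetical threshold */
	int dist = 0;

	assert(s1 != NULL && s2 != NULL);
	while (*s1 != '\0' && *s2 != '\0') {
		if (*s1 == *s2) {
			s1++;
			s2++;
			continue;
		}
		if (++dist > maxdist)
			return 0;
		if (s1[1] == s2[1]) {		/* one character replaced */
			s1++;
			s2++;
		} else if (s1[0] == s2[1])	/* extra character in s2 */
			s2++;
		else if (s1[1] == s2[0])	/* extra character in s1 */
			s1++;
		else
			return 0;
	}
	/* Whatever remains in either string counts as further edits. */
	dist += (int)(strlen(s1) + strlen(s2));
	return dist <= maxdist;
}
```

Unlike a full Levenshtein computation, this looks only one character ahead, so it can miss some close matches, but it needs no table and visits each character at most once.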


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
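
The replacements described above can be sketched as follows, using a reduced node structure with only two links. The struct layout and the helper names are assumptions made for this example and are much simpler than the real struct roff_node.

```c
#include <assert.h>
#include <stddef.h>

struct node {			/* hypothetical, reduced to two links */
	struct node *child;	/* first child, or NULL */
	struct node *next;	/* next sibling, or NULL */
};

/* Replaces "nchild > 0": one pointer check. */
static int
has_children(const struct node *n)
{
	assert(n != NULL);
	return n->child != NULL;
}

/* Replaces "nchild == 1": two pointer checks. */
static int
has_one_child(const struct node *n)
{
	assert(n != NULL);
	return n->child != NULL && n->child->next == NULL;
}

/* Replaces reading nchild where the exact count matters: a tiny loop. */
static int
count_children(const struct node *n)
{
	const struct node *c;
	int cnt = 0;

	assert(n != NULL);
	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Because the count is derived from the links on demand, there is no separate field that could drift out of sync with the tree.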


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
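
The message layout described above can be sketched as follows. This is only an illustration, not the mandoc code: the helper name sketch_fmt_msg() and the rule "a line number of 0 is not meaningful" are assumptions inferred from the /dev/null example.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not the mandoc implementation:
 * include "line:col" only when the position is meaningful.
 * Assumption: a line number of 0 means "no position".
 */
static void
sketch_fmt_msg(char *buf, size_t bufsz, const char *file,
    int line, int col, const char *level, const char *msg)
{
	if (line > 0)
		(void)snprintf(buf, bufsz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	else
		(void)snprintf(buf, bufsz, "mandoc: %s: %s: %s",
		    file, level, msg);
}
```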


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
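
Why reallocarray(3) is safer than realloc(p, n * size) can be sketched in plain ISO C. The function sketch_reallocarray() below is a hypothetical stand-in for illustration only; it is neither the libc function nor mandoc code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): refuse the request
 * when nmemb * size would wrap around instead of silently
 * allocating a buffer that is too small.
 */
static void *
sketch_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

As with realloc(3), the old pointer stays valid when the call fails, so the caller should assign the result to a temporary first.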


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead
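
The idea can be sketched as below. Since asprintf(3) is not ISO C, the sketch uses a hypothetical portable stand-in, sketch_asprintf(), built on vsnprintf(3). The function sketch_libname() and its format string are likewise made up for illustration and are not the actual post_lb() code.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical portable stand-in for asprintf(3); not mandoc code. */
static int
sketch_asprintf(char **dest, const char *fmt, ...)
{
	va_list ap;
	int len;

	/* First pass: measure the formatted length. */
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0 || (*dest = malloc((size_t)len + 1)) == NULL) {
		*dest = NULL;
		return -1;
	}

	/* Second pass: format into a buffer of exactly that size. */
	va_start(ap, fmt);
	(void)vsnprintf(*dest, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return len;
}

/*
 * One call replaces the strlen + malloc + snprintf sequence, so the
 * length computation and the format string cannot get out of sync.
 */
static char *
sketch_libname(const char *name)
{
	char *buf;

	if (sketch_asprintf(&buf, "library \"%s\"", name) == -1)
		return NULL;
	return buf;	/* caller frees */
}
```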


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
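
A minimal sketch of such caching, assuming a function-scope static buffer. The name sketch_default_os(), the buffer size, and the "UNKNOWN" fallback are illustrative assumptions, not the mandoc code.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: call uname(3) only once and hand out the
 * cached "sysname release" string on every later call.
 */
static const char *
sketch_default_os(void)
{
	static char cached[512];
	static int have_cache;
	struct utsname uts;

	if (have_cache)
		return cached;

	if (uname(&uts) == -1)
		(void)snprintf(cached, sizeof(cached), "UNKNOWN");
	else
		(void)snprintf(cached, sizeof(cached), "%s %s",
		    uts.sysname, uts.release);
	have_cache = 1;
	return cached;
}
```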


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
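
The parsing rule for the .nr value can be sketched as follows. The helper sketch_nr_apply() is hypothetical and ignores integer overflow; it is not the roff.c implementation.

```c
#include <ctype.h>

/*
 * Hypothetical helper: apply the "value" argument of .nr to the
 * old register value and return the new one.
 */
static int
sketch_nr_apply(int oldval, const char *arg)
{
	int sign = 0, val = 0;

	/* An optional sign requests increment or decrement. */
	if (*arg == '+' || *arg == '-')
		sign = (*arg++ == '+') ? 1 : -1;

	/* The value ends right before the first non-digit character. */
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');

	return sign == 0 ? val : oldval + sign * val;
}
```

For example, with an old value of 10, the argument "5" sets the register to 5, "+5" yields 15, and "-3n" yields 7 because parsing stops at the "n".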


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
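
The .Bx rules above can be sketched as follows. The helper sketch_bx() is hypothetical, and placing "BSD" between the version and the variant is an assumption based on the usual rendering of .Bx, not taken from this commit.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the mandoc implementation:
 * format the arguments of ".Bx version variant" into buf.
 */
static void
sketch_bx(char *buf, size_t bufsz, const char *version,
    const char *variant)
{
	if (variant == NULL || *variant == '\0') {
		(void)snprintf(buf, bufsz, "%sBSD", version);
		return;
	}
	/* Uppercase the first character, append with a hyphen. */
	(void)snprintf(buf, bufsz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```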


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
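
The three exclusion rules above can be sketched as a small predicate.
This is only an illustration under stated assumptions: the function name
hyphen_is_breakable() is hypothetical and is not taken from the mandoc
sources, and the restriction to free-form text is not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not mandoc code): report whether the hyphen
 * at word[i] is an acceptable line-break point under the three
 * rules listed in the commit message.
 */
static bool
hyphen_is_breakable(const char *word, size_t i)
{
	size_t len = strlen(word);

	if (i >= len || word[i] != '-')
		return false;
	/* Rule 1: not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return false;
	/* Rule 2: not immediately preceded or followed by a hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return false;
	/* Rule 3: not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return false;
	return true;
}
```

A formatter built along these lines would call the predicate for each
hyphen that still fits on the output line and break at the last position
for which it returns true.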

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.
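
As a rough illustration of the three levels described above, the
following sketch tracks the worst level seen and derives the exit
status from it. All identifiers (msg_level, msg_state, msg_report, and
so on) are hypothetical and chosen for this example only; they are not
the names used in main.c.

```c
#include <stdlib.h>

/* Hypothetical message levels, ordered by severity. */
enum msg_level {
	MSG_OK = 0,
	MSG_WARNING,	/* input should be cleaned up, output mostly OK */
	MSG_ERROR,	/* output may be garbled, exit non-zero at the end */
	MSG_FATAL	/* no output can be produced for this file */
};

struct msg_state {
	enum msg_level worst;	/* most severe level reported so far */
};

/* Record a message, remembering only the most severe level. */
static void
msg_report(struct msg_state *st, enum msg_level lvl)
{
	if (lvl > st->worst)
		st->worst = lvl;
}

/* Parsing of the current file may go on unless something was fatal. */
static int
msg_can_continue(const struct msg_state *st)
{
	return st->worst < MSG_FATAL;
}

/* Warnings alone do not cause an error exit; errors and worse do. */
static int
msg_exit_status(const struct msg_state *st)
{
	return st->worst >= MSG_ERROR ? EXIT_FAILURE : EXIT_SUCCESS;
}
```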

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
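
The basic end-of-sentence test described above can be sketched as
follows. The helper name line_ends_sentence() is an assumption made for
this example; it only checks the last character of the line and does
not attempt to reproduce any additional rules the parser may apply.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not mandoc code): a line ends a sentence
 * if its last character is one of '.', '!' or '?'.
 */
static bool
line_ends_sentence(const char *line)
{
	size_t len = strlen(line);

	if (len == 0)
		return false;
	/* len > 0, so line[len - 1] is never the NUL terminator. */
	return strchr(".!?", line[len - 1]) != NULL;
}
```

Because the test only looks at the end of the input line, a sentence
ending in the middle of a line goes undetected, which is why the "new
sentence, new line" rule matters.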


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/
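
A malloc frontend that errors out on failure, as mentioned in the
"simplicity" item above, might look like the following sketch. The name
xmalloc() is hypothetical and used for illustration only; the sketch
reports failure with perror(3) and exit(3), in line with the
portability item in the same commit.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical allocation wrapper (illustrative name): never
 * returns NULL, so callers need no error handling of their own.
 * Callers are assumed to request a non-zero size.
 */
static void *
xmalloc(size_t size)
{
	void *p = malloc(size);

	if (p == NULL) {
		perror("malloc");
		exit(EXIT_FAILURE);
	}
	return p;
}
```

Centralising the check removes the repeated "if (p == NULL)" branches
from every call site.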


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages
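
The "string table instead of switch statement" technique can be
sketched like this. The enum constants and message texts below are
invented for the example and do not reproduce the real "enum merr"
values.

```c
/* Hypothetical error codes; ERR_MAX must stay last. */
enum merr_sketch {
	ERR_NO_TITLE = 0,
	ERR_BAD_DATE,
	ERR_EMPTY_BODY,
	ERR_MAX
};

/* One message per code, indexed directly by the enum value. */
static const char *const err_msgs[ERR_MAX] = {
	[ERR_NO_TITLE]   = "missing title",
	[ERR_BAD_DATE]   = "invalid date",
	[ERR_EMPTY_BODY] = "empty body"
};

/* Look up the message; out-of-range codes get a fallback string. */
static const char *
err_string(enum merr_sketch e)
{
	if ((unsigned int)e >= ERR_MAX)
		return "unknown error";
	return err_msgs[e];
}
```

Adding a new message then takes one enum constant and one table entry,
with no new case label to maintain.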


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.273 11-Apr-2018 schwarze

preserve comments before .Dd when converting mdoc(7) to man(7)
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
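
The commit message does not spell out the heuristic, so the following
is only one possible linear-time, constant-space matcher in that
spirit, not a description of the committed code. The function name
roughly_equal() and the limit of three differences are assumptions
made for this sketch.

```c
#include <string.h>

/*
 * Hypothetical fuzzy comparison: walk both strings once and
 * tolerate a few single-character replacements, insertions and
 * deletions.  Uses no extra memory beyond two counters.
 */
static int
roughly_equal(const char *s1, const char *s2)
{
	const int maxdist = 3;	/* assumed tolerance */
	int dist = 0;

	while (s1[0] != '\0' && s2[0] != '\0') {
		if (s1[0] != s2[0]) {
			if (++dist > maxdist)
				return 0;
			if (s1[1] == s2[1]) {
				/* one character replaced */
				s1++;
				s2++;
			} else if (s1[0] == s2[1] && s1[1] == s2[2]) {
				/* extra character in s2 */
				s2++;
			} else if (s1[1] == s2[0] && s1[2] == s2[1]) {
				/* character missing from s2 */
				s1++;
			} else
				return 0;
		}
		s1++;
		s2++;
	}
	/* Any leftover tail counts towards the distance. */
	dist += (int)(strlen(s1) + strlen(s2));
	return dist <= maxdist;
}
```

Unlike a full Levenshtein computation, this never backtracks, so some
typos such as transposed letters are not recognised; that is the price
of keeping the check linear.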


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
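
The replacements for the child counter can be illustrated with a
simplified stand-in for the node structure. The struct below keeps only
the two pointers needed for the example and is not the real struct
roff_node; the helper names are hypothetical.

```c
#include <stddef.h>

/* Simplified stand-in for a syntax tree node. */
struct node {
	struct node *child;	/* first child, or NULL */
	struct node *next;	/* next sibling, or NULL */
};

/* "nchild > 0" becomes a single pointer check. */
static int
has_children(const struct node *n)
{
	return n->child != NULL;
}

/* "nchild == 1" becomes two pointer checks. */
static int
has_exactly_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* Only when the exact number matters is a tiny loop needed. */
static size_t
count_children(const struct node *n)
{
	const struct node *p;
	size_t count = 0;

	for (p = n->child; p != NULL; p = p->next)
		count++;
	return count;
}
```

Deriving these answers from the pointers means there is no separate
counter that could fall out of step with the actual list of children.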


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
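
The distinction between the two kinds of fall-through can be illustrated with a small sketch. The enum and function below are hypothetical and exist only to show where the comment is kept and where it is dropped.

```c
/* Hypothetical token classifier, for illustration only. */
enum tok { TOK_A, TOK_B, TOK_C, TOK_D };

static int
classify(enum tok t)
{
	int score = 0;

	switch (t) {
	case TOK_A:	/* adjacent labels: no comment needed */
	case TOK_B:
		score = 1;
		break;
	case TOK_C:
		score = 10;	/* code runs before falling through */
		/* FALLTHROUGH */
	case TOK_D:
		score += 2;
		break;
	}
	return score;
}
```

Here `TOK_A` and `TOK_B` simply share one body, while `TOK_C` does some work of its own and then continues into the `TOK_D` body, which is the case worth flagging to the reader.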


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
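
The rewinding walk described in the second paragraph can be sketched roughly as below. This is a simplified model under assumptions, not the parser code: `struct blk`, the `F_*` flags, the `rewound` marker, and `rewind_ancestors()` are all hypothetical names, and "rewinding" is reduced to setting a flag.

```c
#include <stddef.h>

#define F_BROKEN 0x1	/* this block was broken by another block */
#define F_ENDED  0x2	/* this block broke another one and awaits rewinding */

/* Hypothetical, simplified block node. */
struct blk {
	struct blk *parent;
	int flags;
	int rewound;	/* stand-in for the real rewinding work */
};

/*
 * Walk from a broken block up through its parents, rewinding every
 * ancestor marked F_ENDED, and stop after handling the first
 * ancestor that is not itself F_BROKEN.
 * Returns the number of blocks rewound.
 */
static int
rewind_ancestors(struct blk *broken)
{
	struct blk *n;
	int cnt = 0;

	for (n = broken->parent; n != NULL; n = n->parent) {
		if (n->flags & F_ENDED) {
			n->rewound = 1;
			cnt++;
		}
		if (!(n->flags & F_BROKEN))
			break;
	}
	return cnt;
}
```

Because the walk only follows parent pointers and inspects two flags, there is no separate pointer that has to be kept consistent while blocks are opened and closed.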


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
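
The toggling behaviour can be sketched in a few lines. The function name and the integer encoding of the mode are assumptions made for this illustration. Treating an unrecognized argument as a toggle follows the rev. 1.146 note further up in this log and does not necessarily reflect the behaviour at this revision.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: compute the next spacing mode (1 = on, 0 = off)
 * from the current mode and the optional .Sm argument.
 */
static int
sm_next(int cur, const char *arg)
{
	if (arg == NULL)		/* ".Sm" alone toggles */
		return !cur;
	if (strcmp(arg, "on") == 0)
		return 1;
	if (strcmp(arg, "off") == 0)
		return 0;
	return !cur;	/* assumption: invalid argument also toggles */
}
```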


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
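
The first point, printing a position only when it is meaningful, might be sketched like this. The function name, its parameters, and the convention that a line number of 0 means "no position" are assumptions for the example, not the actual mandoc message interface.

```c
#include <stdio.h>

/*
 * Hypothetical formatter: include ":line:col" only when the line
 * number is positive, that is, when the message refers to a real
 * place in the input.  Returns what snprintf(3) returns.
 */
static int
format_msg(char *buf, size_t sz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, msg);
}
```

Calling `format_msg(buf, sizeof(buf), "/dev/null", 0, 1, "FATAL", "not a manual")` yields the shorter form quoted in the message above.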


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
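
The overflow protection that motivates the switch to reallocarray(3) can be sketched as below. `checked_reallocarray()` is a hypothetical stand-in written for this illustration. It performs a simple division-based check and is not the libc implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): refuse the request when
 * nmemb * size would overflow size_t instead of silently allocating
 * a buffer that is too small.
 */
static void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

A plain `realloc(p, n * sizeof(*p))` has no such guard, so a huge `n` can wrap around and produce a small allocation that later code overruns.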


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
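
The caching idea can be sketched with a function-scope static buffer. The function name, the buffer size, and the "sysname release" output format are assumptions for this example. Only the call to uname(3) and the fact that its result is cached come from the message above.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: return a default operating system string,
 * calling uname(3) only on the first invocation and reusing the
 * stored result afterwards.  Returns NULL if uname(3) fails.
 */
static const char *
default_os(void)
{
	static char buf[512];
	static int cached;
	struct utsname uts;

	if (!cached) {
		if (uname(&uts) == -1)
			return NULL;
		(void)snprintf(buf, sizeof(buf), "%s %s",
		    uts.sysname, uts.release);
		cached = 1;
	}
	return buf;
}
```

When many manuals are processed in one run, every call after the first returns the stored string without another uname(3) call.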


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
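
The parsing rule for the value argument can be sketched as follows. `apply_nr()` is a hypothetical helper, not roff.c code. It assumes decimal digits only and ignores integer overflow and scaling units.

```c
#include <ctype.h>

/*
 * Hypothetical helper: compute the new register value from the old
 * one and the "value" argument of .nr.  An optional leading sign
 * requests increment or decrement, and the number ends right before
 * the first non-digit character.
 */
static int
apply_nr(int old, const char *val)
{
	int sign = 0, num = 0;

	if (*val == '+' || *val == '-')
		sign = *val++;
	while (isdigit((unsigned char)*val))
		num = num * 10 + (*val++ - '0');

	if (sign == '+')
		return old + num;
	if (sign == '-')
		return old - num;
	return num;
}
```

For a register currently holding 5, `apply_nr(5, "12abc")` gives 12, `apply_nr(5, "+3")` gives 8, and `apply_nr(5, "-2x")` gives 3.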


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
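
A minimal sketch of the .Bx formatting rule described above, assuming a
hypothetical helper format_bx() that is not part of mandoc: the second
argument gets its first character uppercased and is appended after a hyphen.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the mandoc implementation.
 * Writes "<version>BSD" or "<version>BSD-<Variant>" into buf and
 * returns the snprintf(3) result.
 */
static int
format_bx(char *buf, size_t bufsz, const char *version, const char *variant)
{
	/* Without a second argument, print only the version. */
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, bufsz, "%sBSD", version);

	/* Uppercase the first character and join with a hyphen. */
	return snprintf(buf, bufsz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```

Under these assumptions, the arguments "4.3" and "reno" would be
combined into "4.3BSD-Reno".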


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.
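
The data layout change described above can be sketched as follows. All
struct, union, and member names here are hypothetical and chosen for
illustration only; they are not taken from libmdoc.

```c
#include <stdlib.h>

/* Hypothetical macro-specific data. */
struct bl_data { int type; int compact; };
struct bd_data { int type; };

/*
 * Old layout (for comparison): a union of pointers, where each
 * macro-specific struct had to be allocated and freed on its own.
 *
 *	union node_data_old { struct bl_data *Bl; struct bd_data *Bd; };
 *
 * New layout: one pointer to a union of structs.  The union is as
 * large as its largest member, which is the small memory overhead.
 */
union node_data { struct bl_data Bl; struct bd_data Bd; };

struct node {
	int		 tok;	/* which macro this node represents */
	union node_data	*norm;	/* single allocation for any macro */
};

static struct node *
node_alloc(int tok)
{
	struct node *n;

	if ((n = calloc(1, sizeof(*n))) == NULL)
		return NULL;
	if ((n->norm = calloc(1, sizeof(*n->norm))) == NULL) {
		free(n);
		return NULL;
	}
	n->tok = tok;
	return n;
}

static void
node_free(struct node *n)
{
	if (n == NULL)
		return;
	free(n->norm);	/* one free(3), whatever the macro was */
	free(n);
}
```

With a single pointer, allocation and cleanup no longer need to know
which macro a node belongs to.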


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .D1)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (no groff prints them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
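
The three exclusions listed above can be sketched as a predicate. The
function name hyphen_may_break() is an assumption made for this
illustration, and the restriction to free-form text is not modelled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the mandoc implementation.
 * Returns 1 if the hyphen at word[i] is an acceptable place to
 * break the output line, 0 otherwise.
 */
static int
hyphen_may_break(const char *word, size_t i)
{
	size_t len = strlen(word);

	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

A caller would scan the word from the right and break at the last
fitting position for which this predicate holds.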

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
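
The end-of-sentence rule described above can be sketched as a small
check on the last character of an input line. The name
line_ends_sentence() is hypothetical, and trailing white space or
closing delimiters are deliberately not handled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the mandoc implementation.
 * Returns 1 if the line ends in '.', '!' or '?', else 0.
 */
static int
line_ends_sentence(const char *line)
{
	size_t len;

	len = strlen(line);
	if (len == 0)
		return 0;
	/* The last character is never NUL here, so strchr(3) is safe. */
	return strchr(".!?", line[len - 1]) != NULL;
}
```

A parser using such a check would mark the line as EOS, and the
formatter would then emit two spaces before the next word.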


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove unused WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.272 05-Apr-2018 schwarze

use the portable \(lq and \(rq internally rather than \(Lq and \(Rq


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
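
To illustrate what a heuristic of this kind can look like, here is a minimal sketch. It is not the code from mdoc_validate.c; the function name roughly_equal() and the max_errors budget are assumptions made for this example. It walks both strings once and tolerates a bounded number of single-character substitutions, insertions, or deletions.

```c
#include <string.h>

/*
 * Hypothetical sketch: return 1 if a and b differ by at most
 * max_errors single-character edits that this greedy scan can
 * recognize, 0 otherwise.  Linear time, no allocation.
 * This is NOT a full Levenshtein distance; it can miss some
 * matches that a real edit-distance computation would find.
 */
static int
roughly_equal(const char *a, const char *b, int max_errors)
{
	int errors = 0;

	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++errors > max_errors)
			return 0;
		if (a[1] == b[1]) {		/* substitution */
			a++;
			b++;
		} else if (a[0] == b[1]) {	/* extra character in b */
			b++;
		} else if (a[1] == b[0]) {	/* extra character in a */
			a++;
		} else
			return 0;
	}
	/* Leftover characters on either side count as errors. */
	errors += (int)(strlen(a) + strlen(b));
	return errors <= max_errors;
}
```

With a budget of two errors, a misspelled header such as "DESCRIPTON" would still be matched against "DESCRIPTION", while unrelated names such as "NAME" and "SYNOPSIS" are rejected at the first mismatch.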


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.
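
A rough sketch of how a formatter might consult such flags follows. The flag names come from this commit, but the numeric values, the struct ast_node type, and both helper functions are assumptions made for illustration only.

```c
/* Hypothetical flag values; the real definitions may differ. */
#define NODE_NOSRC	(1 << 0)	/* generated, not present in the source */
#define NODE_NOPRT	(1 << 1)	/* shall not produce output */

/* Hypothetical, stripped-down syntax tree node. */
struct ast_node {
	int		 flags;
	const char	*text;
};

/* A formatter would skip nodes marked NODE_NOPRT. */
static int
should_print(const struct ast_node *n)
{
	return (n->flags & NODE_NOPRT) == 0;
}

/* Tools that care about the original input can skip generated nodes. */
static int
came_from_source(const struct ast_node *n)
{
	return (n->flags & NODE_NOSRC) == 0;
}
```

Under these assumptions, a default argument generated for .Ar would be printed but reported as not coming from the source, while a .Dd node would come from the source but not be printed in place.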


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
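
As a sketch of the idea (the struct below is a hypothetical, stripped-down node, not the real struct roff_node, and the helper names are made up for this example), a stored child counter can be replaced by checks on the link pointers:

```c
#include <stddef.h>

/* Hypothetical node; only the link fields matter here. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*last;	/* last child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* "nchild == 0" becomes one pointer check. */
static int
has_no_children(const struct node *n)
{
	return n->child == NULL;
}

/* "nchild == 1" becomes two pointer checks. */
static int
has_exactly_one_child(const struct node *n)
{
	return n->child != NULL && n->child == n->last;
}

/* Where an actual count is needed, a tiny loop suffices. */
static int
count_children(const struct node *n)
{
	const struct node *c;
	int count = 0;

	for (c = n->child; c != NULL; c = c->next)
		count++;
	return count;
}
```

Because nothing is stored separately, there is no counter that could drift out of sync with the list itself.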


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
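
The distinction can be shown with a small, made-up switch statement (the function classify() and its values are purely illustrative):

```c
/*
 * Hypothetical example: adjacent case labels need no comment,
 * but a case that runs code and then falls through keeps one.
 */
static int
classify(int c)
{
	int score = 0;

	switch (c) {
	case 'a':
	case 'b':		/* adjacent labels: no comment needed */
		score = 1;
		break;
	case 'c':
		score = 10;	/* code runs before falling through */
		/* FALLTHROUGH */
	case 'd':
		score += 2;
		break;
	default:
		break;
	}
	return score;
}
```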


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, which allows getting rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, after simply downgrading it from FATAL to ERROR,
it just works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
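
A minimal sketch of the resulting mode logic might look like this. The function name sm_next_mode() and the integer encoding of the mode are assumptions for this example; treating an unrecognized argument like a missing one follows the later "Fix .Sm with invalid arg" change in rev. 1.146.

```c
#include <string.h>

/*
 * Hypothetical sketch: compute the new spacing mode for .Sm.
 * arg is the macro's argument, or NULL when there is none.
 * Returns 1 for "spacing on", 0 for "spacing off".
 */
static int
sm_next_mode(int current, const char *arg)
{
	if (arg == NULL)
		return !current;	/* no argument: toggle */
	if (strcmp(arg, "on") == 0)
		return 1;
	if (strcmp(arg, "off") == 0)
		return 0;
	return !current;		/* invalid argument: toggle as well */
}
```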


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
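
The first change, dropping positions that carry no meaning, can be sketched as follows. The function format_msg(), its signature, and the convention that a line number of 0 means "no position" are assumptions made for illustration, not the actual mandoc interface.

```c
#include <stdio.h>

/*
 * Hypothetical formatter: include ":line:col" only when a
 * line number is known (line > 0).
 */
static int
format_msg(char *buf, size_t sz, const char *file,
    int line, int col, const char *level, const char *msg)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s",
	    file, level, msg);
}
```

Called with line 0 for /dev/null, this sketch yields the shorter form quoted in the commit message instead of the confusing "0:1" position.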


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
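
For readers unfamiliar with the motivation: multiplying a count by an element size can silently wrap around, and reallocarray(3) refuses such requests. The following is a hypothetical stand-in, named checked_reallocarray() here, that shows the kind of check involved; it is a sketch and not the libc implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): refuse the
 * request if nmemb * size would overflow size_t, instead of
 * allocating a buffer that is too small.
 */
static void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

A plain realloc(ptr, nmemb * size) with the same oversized arguments would succeed with a wrapped-around size, which is the failure mode the audit guards against.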


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
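
A minimal sketch of the caching idea, assuming a function-scope static
buffer; the name cached_osname() and the "sysname release" format are
hypothetical and not taken from the mandoc sources:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: return the operating system name,
 * calling uname(3) only on the first invocation.
 */
const char *
cached_osname(void)
{
	static char	 buf[256];	/* filled on the first call only */
	struct utsname	 uts;

	if (buf[0] == '\0') {
		if (uname(&uts) == -1)
			(void)snprintf(buf, sizeof(buf), "UNKNOWN");
		else
			(void)snprintf(buf, sizeof(buf), "%s %s",
			    uts.sysname, uts.release);
	}
	assert(buf[0] != '\0');
	return buf;
}
```

Every later call returns the same buffer without another system call,
which is where the saving comes from when many manuals are processed
in one run.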


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
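
The rule described above can be sketched as follows. This is only an
illustration under stated assumptions: nr_apply() is a hypothetical
helper, not the function in roff.c, and it ignores integer overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: compute the new register value from the
 * old value and the "value" argument of a .nr request.
 */
int
nr_apply(int oldval, const char *arg)
{
	int	 sign = 0;	/* 0: assign, +1: increment, -1: decrement */
	int	 val = 0;

	assert(arg != NULL);

	/* An optional leading sign requests increment or decrement. */
	if (*arg == '+' || *arg == '-')
		sign = (*arg++ == '+') ? 1 : -1;

	/* The value ends right before the first non-digit character. */
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');

	return sign == 0 ? val : oldval + sign * val;
}
```

With an old value of 7, the argument "12n" assigns 12, "+3" yields 10,
and "-2x" yields 5.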


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .D1)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
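
The three exclusions listed above can be sketched as a single
predicate. The function name hyphen_breakable() is hypothetical, and
the check that the word sits in free-form text is assumed to happen
elsewhere; this is not the code from the terminal formatter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the character at index i of
 * word is a hyphen where the output line may be broken, else 0.
 */
int
hyphen_breakable(const char *word, size_t i)
{
	size_t	 len;

	assert(word != NULL);
	len = strlen(word);

	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

For "break-at-hyphen", both hyphens qualify; for "-flag", "a--b", and
an escaped "\-", none do.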

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
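
A minimal sketch of the detection step, assuming only the rule stated
above (a line ending in any of ".!?"); line_ends_sentence() is a
hypothetical name, and the real parser may consider more context:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the input line ends in one of
 * the characters '.', '!', or '?', ignoring one trailing newline.
 */
int
line_ends_sentence(const char *line)
{
	size_t	 len;

	assert(line != NULL);
	len = strlen(line);
	if (len > 0 && line[len - 1] == '\n')
		len--;
	return len > 0 && strchr(".!?", line[len - 1]) != NULL;
}
```

Because only the end of the input line is inspected, a sentence that
ends in the middle of a line goes undetected, which is why the
"new sentence, new line" rule matters.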


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.271 16-Mar-2018 schwarze

Ouch, fix previous: In the edge case of a single-character string
containing nothing but a single hyphen, the pointer got incremented
twice at one point, causing a read overrun found by naddy@.


# 1.270 16-Mar-2018 schwarze

Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.
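
A check of this kind can be sketched as follows. This is an
assumption-laden illustration, not the mandoc code: the function name
has_isolated_double_hyphen() is hypothetical, and the rule used here
(a run of exactly two hyphens) is a simplification. The inner loop
advances the pointer only while it sees a hyphen, so it cannot step
past the terminating NUL, even for a string consisting of a single
hyphen.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of mandoc: return 1 if the string
 * contains a run of exactly two hyphens, 0 otherwise.
 */
int
has_isolated_double_hyphen(const char *s)
{
	size_t	 run;

	while (*s != '\0') {
		if (*s != '-') {
			s++;
			continue;
		}
		/* Count the hyphen run; this stops at the NUL at the latest. */
		run = 0;
		while (*s == '-') {
			run++;
			s++;
		}
		if (run == 2)
			return 1;
	}
	return 0;
}
```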


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
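
To illustrate what a linear time, zero space heuristic of this kind
can look like, here is a sketch. It is not the function committed to
mandoc: similar_names() and its maxdist parameter are hypothetical,
and the matching rules are an assumption chosen for illustration.
The strings are walked once, side by side, with one character of
lookahead and no allocation.

```c
#include <string.h>

/*
 * Hypothetical sketch, not the mandoc implementation: return 1 if
 * the strings a and b differ by at most maxdist simple edits.
 */
int
similar_names(const char *a, const char *b, int maxdist)
{
	int	 dist = 0;

	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++dist > maxdist)
			return 0;
		if (a[1] == b[1]) {		/* one character replaced */
			a++;
			b++;
		} else if (a[0] == b[1])	/* extra character in b */
			b++;
		else if (a[1] == b[0])		/* extra character in a */
			a++;
		else
			return 0;		/* too different, give up */
	}
	/* Leftover characters on either side count as differences. */
	dist += (int)(strlen(a) + strlen(b));
	return dist <= maxdist;
}
```

Unlike a full Levenshtein metric, a greedy scan like this can miss
some close matches, but it needs no table and never revisits input.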


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
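
The three replacement idioms can be sketched as follows. The struct
below is a hypothetical, stripped-down stand-in for struct roff_node
with only a first-child and a next-sibling pointer, and the function
names are made up for this illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified node; not the real struct roff_node. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* "No children": one pointer check. */
int
node_is_empty(const struct node *n)
{
	return n->child == NULL;
}

/* "Exactly one child": two pointer checks. */
int
node_has_one_child(const struct node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* An actual count: a tiny loop over the sibling chain. */
int
node_count_children(const struct node *n)
{
	const struct node	*c;
	int			 count = 0;

	for (c = n->child; c != NULL; c = c->next)
		count++;
	return count;
}
```

Because the answers are derived from the pointers themselves, there
is no separate counter that could get out of sync with the tree.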


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statement. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
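
The distinction can be shown with a small, made-up example that is
not taken from the mandoc sources; classify() is a hypothetical
function.

```c
/* Hypothetical example, not mandoc code. */
int
classify(int c)
{
	int	 score = 0;

	switch (c) {
	case 'a':
	case 'b':
		/* Adjacent labels above: no comment needed between them. */
		score = 1;
		break;
	case 'c':
		score = 10;
		/* FALLTHROUGH */
	case 'd':
		/* Reached directly for 'd' and by falling through for 'c'. */
		score += 2;
		break;
	default:
		break;
	}
	return score;
}
```

Only the 'c' case runs code before continuing into the next case, so
only that spot keeps the comment.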


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
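
One possible reading of the rewinding rule described above, as a
sketch only: struct blk, the BLK_* flags, and rewind_ancestors() are
hypothetical stand-ins, not the mandoc data structures, and the
"rewound" field merely records which blocks the walk would close.

```c
#include <stddef.h>

#define BLK_BROKEN	0x1	/* stand-in for MDOC_BROKEN */
#define BLK_ENDED	0x2	/* stand-in for MDOC_ENDED */

/* Hypothetical, simplified block node. */
struct blk {
	struct blk	*parent;
	int		 flags;
	int		 rewound;	/* set when the walk rewinds it */
};

/*
 * Walk from a broken block up through its ancestors, rewinding each
 * one marked ENDED, and stop after the first one not marked BROKEN.
 * Returns the number of blocks rewound.
 */
int
rewind_ancestors(struct blk *broken)
{
	struct blk	*n;
	int		 count = 0;

	for (n = broken->parent; n != NULL; n = n->parent) {
		if (n->flags & BLK_ENDED) {
			n->rewound = 1;
			count++;
		}
		if ((n->flags & BLK_BROKEN) == 0)
			break;
	}
	return count;
}
```

The walk needs only the parent pointers and two flag bits, with no
separate pointer that has to be kept consistent with the tree.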


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, making it possible to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().
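
The "last one wins" rule above can be illustrated with a minimal sketch.
It is not the code from pre_bd() or pre_bl(); struct arg and
last_arg_value() are hypothetical names invented for this example, and
the real parser stores macro arguments differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened macro argument: a name plus its value. */
struct arg {
	const char *name;	/* for example "-width" */
	const char *value;	/* for example "8n" */
};

/*
 * Return the value of the last argument called `name`,
 * or NULL if no such argument was given.
 * Later occurrences overwrite earlier ones.
 */
const char *
last_arg_value(const struct arg *argv, size_t argc, const char *name)
{
	const char *result = NULL;
	size_t i;

	assert(argv != NULL || argc == 0);
	for (i = 0; i < argc; i++)
		if (strcmp(argv[i].name, name) == 0)
			result = argv[i].value;
	return result;
}
```

Scanning the whole list and overwriting the result keeps the loop free of
auxiliary "already seen" variables.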


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.
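
A minimal sketch of the toggling behaviour, assuming the spacing mode is
kept as a plain int. The function name sm_next_mode() is hypothetical.
Treating an unrecognized argument like a missing one is an assumption of
this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the new spacing mode (1 = on, 0 = off) after an .Sm macro.
 * `arg` is the macro argument, or NULL when .Sm has no argument.
 */
int
sm_next_mode(int cur, const char *arg)
{
	assert(cur == 0 || cur == 1);
	if (arg != NULL && strcmp(arg, "on") == 0)
		return 1;
	if (arg != NULL && strcmp(arg, "off") == 0)
		return 0;
	/* No usable argument: toggle the mode, as groff does. */
	return !cur;
}
```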


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual
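
The two message shapes above could be produced along these lines.
This is only a sketch: format_msg() is a hypothetical helper, not the
mandoc message function, and using "line > 0" as the test for a
meaningful position is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a message into buf.  Line and column numbers are only
 * included when line > 0, that is, when they point into the file.
 * Returns the snprintf(3) result.
 */
int
format_msg(char *buf, size_t bufsz, const char *file, int line, int col,
    const char *level, const char *msg)
{
	assert(buf != NULL && file != NULL);
	assert(level != NULL && msg != NULL);
	if (line > 0)
		return snprintf(buf, bufsz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, bufsz, "mandoc: %s: %s: %s",
	    file, level, msg);
}
```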

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
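
For readers unfamiliar with reallocarray(3), the overflow check it adds
over a plain realloc(3) with a hand-written multiplication can be
sketched as follows. checked_reallocarray() is a hypothetical stand-in
written for illustration, not the libc implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Resize ptr to hold nmemb elements of the given size.
 * Fail with ENOMEM instead of silently wrapping around
 * when nmemb * size does not fit into a size_t.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (nmemb != 0 && size > SIZE_MAX / nmemb) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

With plain realloc(ptr, nmemb * size), an overflowing product would
request a buffer that is far too small, and later writes would run past
its end.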


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
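
The caching idea can be sketched with a function-scope static buffer.
The name default_os(), the "sysname release" output format and the
"UNKNOWN" fallback are assumptions made for this example, not taken
from the committed code.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Return a default operating system string.
 * uname(3) is only called on the first invocation;
 * later calls return the cached buffer.
 */
const char *
default_os(void)
{
	struct utsname utsname;
	/* Both fields include a NUL, so this is large enough. */
	static char cache[sizeof(utsname.sysname) +
	    sizeof(utsname.release)];
	static int cached;

	if (!cached) {
		if (uname(&utsname) == -1)
			return "UNKNOWN";	/* hypothetical fallback */
		(void)snprintf(cache, sizeof(cache), "%s %s",
		    utsname.sysname, utsname.release);
		cached = 1;
	}
	return cache;
}
```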


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
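
A rough sketch of the parsing rule described above, under the assumption
that registers hold plain int values. parse_nr_value() and apply_nr()
are hypothetical names, not the functions in roff.c, and overflow of
very long digit strings is not handled here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Parse the value argument of ".nr name value".
 * *sign is set to '+', '-' or '\0' (plain assignment),
 * *val to the number formed by the leading digits.
 * Parsing stops right before the first non-digit character.
 * Returns the number of characters consumed.
 */
size_t
parse_nr_value(const char *arg, char *sign, int *val)
{
	size_t pos = 0;

	assert(arg != NULL && sign != NULL && val != NULL);
	*sign = '\0';
	if (arg[pos] == '+' || arg[pos] == '-')
		*sign = arg[pos++];
	*val = 0;
	while (isdigit((unsigned char)arg[pos]))
		*val = *val * 10 + (arg[pos++] - '0');
	return pos;
}

/* Apply the parsed value to the current register contents. */
int
apply_nr(int cur, char sign, int val)
{
	if (sign == '+')
		return cur + val;
	if (sign == '-')
		return cur - val;
	return val;
}
```

For example, with a register currently holding 10, the argument "+3"
yields 13, while "12n" assigns 12 and leaves the trailing "n" unparsed.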


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
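
The three rules above can be sketched as follows. format_bx() is a
hypothetical helper for illustration only; the real validator rewrites
syntax tree nodes rather than formatting into a buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/*
 * Format the arguments of .Bx into buf.
 * The first character of the optional second argument is
 * converted to uppercase and the argument is appended with a hyphen.
 * Returns the snprintf(3) result.
 */
int
format_bx(char *buf, size_t bufsz, const char *version,
    const char *variant)
{
	assert(buf != NULL && version != NULL);
	if (variant == NULL || variant[0] == '\0')
		return snprintf(buf, bufsz, "%sBSD", version);
	return snprintf(buf, bufsz, "%sBSD-%c%s", version,
	    toupper((unsigned char)variant[0]), variant + 1);
}
```

Called with "4.3" and "reno", this sketch produces the string
"4.3BSD-Reno".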


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset no longer defaults to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
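
The three rules above can be sketched roughly as follows. This is a
minimal illustration, not the code from the patch: the helper names
hyphen_is_breakable() and last_fitting_hyphen() are hypothetical, and
the restriction to free-form text is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from mandoc: return 1 if the
 * hyphen at word[i] is an acceptable break point under the three
 * rules listed above, 0 otherwise.
 */
static int
hyphen_is_breakable(const char *word, size_t i)
{
	size_t	 len;

	assert(word != NULL);
	len = strlen(word);
	if (i >= len || word[i] != '-')
		return 0;
	/* Rule 1: not at the beginning or end of the word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Rule 2: not adjacent to another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Rule 3: not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}

/*
 * Return the index of the last breakable hyphen among the first
 * maxlen bytes of word, or -1 if there is none.
 */
static long
last_fitting_hyphen(const char *word, size_t maxlen)
{
	size_t	 i, len;

	assert(word != NULL);
	len = strlen(word);
	if (maxlen > len)
		maxlen = len;
	for (i = maxlen; i > 0; i--)
		if (hyphen_is_breakable(word, i - 1))
			return (long)(i - 1);
	return -1;
}
```

Scanning backwards from the last byte that still fits means the first
acceptable hyphen found is also the last fitting one.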

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.
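
As a rough sketch of how three message levels, one central table, and
the exit status could fit together: all names below (msg_level,
msg_code, msg_table, msg_report, msg_exit_status) and the message
texts are hypothetical placeholders, not the identifiers or strings
used in main.c, and the exit value 1 is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical level names, ordered by severity. */
enum msg_level {
	LEVEL_OK = 0,
	LEVEL_WARNING,	/* clean up the input; output mostly OK */
	LEVEL_ERROR,	/* output may be garbled; exit non-zero */
	LEVEL_FATAL	/* no output can be produced */
};

/* Hypothetical message codes, used as table indices. */
enum msg_code {
	MSG_EXAMPLE_WARNING = 0,
	MSG_EXAMPLE_ERROR,
	MSG_EXAMPLE_FATAL,
	MSG_MAX
};

struct msg_entry {
	enum msg_level	 level;
	const char	*text;
};

/* One central table replaces per-parser tables and switch statements. */
static const struct msg_entry msg_table[MSG_MAX] = {
	[MSG_EXAMPLE_WARNING] = { LEVEL_WARNING, "placeholder warning" },
	[MSG_EXAMPLE_ERROR]   = { LEVEL_ERROR,   "placeholder error" },
	[MSG_EXAMPLE_FATAL]   = { LEVEL_FATAL,   "placeholder fatal error" },
};

struct msg_state {
	enum msg_level	 worst;	/* most severe level seen so far */
};

/* Record the level of a message and return its text. */
static const char *
msg_report(struct msg_state *st, enum msg_code code)
{
	assert(st != NULL && code < MSG_MAX);
	if (msg_table[code].level > st->worst)
		st->worst = msg_table[code].level;
	return msg_table[code].text;
}

/* Warnings alone do not cause an error exit; errors and worse do. */
static int
msg_exit_status(const struct msg_state *st)
{
	assert(st != NULL);
	return st->worst >= LEVEL_ERROR ? 1 : 0;
}
```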

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@'s end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
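
The detection rule described above might look roughly like this. It is
only a sketch of the "ends in any of .!?" test; the function name
line_ends_sentence() is hypothetical, and inserting the token into the
AST and rendering it are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the input line ends in one of
 * '.', '!' or '?', which is treated as an end of sentence (EOS).
 */
static int
line_ends_sentence(const char *line)
{
	size_t	 len;

	assert(line != NULL);
	len = strlen(line);
	/* Ignore a trailing newline, if the caller kept one. */
	if (len > 0 && line[len - 1] == '\n')
		len--;
	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

Because only the last character of the input line is inspected, a
sentence that ends in the middle of a line is not detected, which is
why the "new sentence, new line" rule matters.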


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


# 1.269 06-Feb-2018 schwarze

Delete the "no blank before trailing delimiter" check from the
partial explicit macros. Leah Neukirchen <leah at vuxu dot org>
rightfully points out that the check makes no sense for these macros.


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
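
To illustrate what a linear time, zero space heuristic of this kind
can look like, here is a minimal greedy sketch. It is not claimed to
be the heuristic actually committed: the function name
roughly_equal(), the maxdist parameter, and the exact edit rules are
assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: return 1 if strings a and b can be aligned
 * in a single left-to-right pass using at most maxdist greedy
 * single-character edits, 0 otherwise.  No memory is allocated.
 */
static int
roughly_equal(const char *a, const char *b, int maxdist)
{
	int	 dist = 0;

	assert(a != NULL && b != NULL);
	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++dist > maxdist)
			return 0;
		if (a[1] == b[1]) {		/* substitution */
			a++;
			b++;
		} else if (a[0] == b[1])	/* extra character in b */
			b++;
		else if (a[1] == b[0])		/* extra character in a */
			a++;
		else
			return 0;
	}
	/* Leftover characters at either end count as edits, too. */
	while (*a != '\0') {
		a++;
		dist++;
	}
	while (*b != '\0') {
		b++;
		dist++;
	}
	return dist <= maxdist;
}
```

Unlike a full Levenshtein computation, this greedy pass looks ahead
only one character and can therefore reject some pairs that a true
edit distance would accept; that is the trade-off for needing no
table.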


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.
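
The three replacement idioms can be sketched as follows, using a
reduced stand-in for the node structure. The struct node type and the
helper names are hypothetical; only the idea of checking the child and
sibling pointers instead of a stored counter is taken from the text
above.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for a syntax tree node; an assumption for this sketch. */
struct node {
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/* One pointer check replaces "nchild == 0". */
static int
has_no_children(const struct node *n)
{
	assert(n != NULL);
	return n->child == NULL;
}

/* Two pointer checks replace "nchild == 1". */
static int
has_exactly_one_child(const struct node *n)
{
	assert(n != NULL);
	return n->child != NULL && n->child->next == NULL;
}

/* A tiny loop for the rare case where the actual count is needed. */
static int
count_children(const struct node *n)
{
	const struct node	*c;
	int			 cnt = 0;

	assert(n != NULL);
	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Since the answers are always derived from the pointers themselves,
there is no separate counter that could fall out of sync with the
tree.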

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
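
The rewinding rule described above can be sketched roughly as follows. This is only an illustration of the idea in the commit message, not the actual mandoc code: the struct layout, the flag values, the "rewound" marker and the function name rewind_from() are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical flag values and node layout, for illustration only. */
#define NODE_BROKEN	(1 << 0)	/* block was broken by another block */
#define NODE_ENDED	(1 << 1)	/* block was ended while still nested */

struct node {
	struct node	*parent;
	int		 flags;
	int		 rewound;	/* stands in for the real rewind work */
};

/*
 * Walk from a broken block towards the root.  Rewind every ancestor
 * marked ENDED and stop after handling the first ancestor that is
 * not itself BROKEN.  Returns the number of nodes rewound.
 */
int
rewind_from(struct node *n)
{
	int	 count = 0;

	for (n = n->parent; n != NULL; n = n->parent) {
		if (n->flags & NODE_ENDED) {
			n->rewound = 1;
			count++;
		}
		if ((n->flags & NODE_BROKEN) == 0)
			break;
	}
	return count;
}
```

No separate "pending" pointer is needed in this scheme because the parent chain already encodes the order in which blocks were opened.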


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailboxd dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
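
For readers unfamiliar with reallocarray(3), the overflow check it adds over plain realloc(3) can be sketched like this. The helper name checked_reallocarray() is made up for illustration; on systems that provide reallocarray(3), the library function should be used instead.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(3): refuse the request
 * when nmemb * size would overflow size_t, instead of silently
 * allocating a buffer that is too small.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (nmemb != 0 && size > SIZE_MAX / nmemb) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

With plain realloc(ptr, nmemb * size), a huge nmemb can wrap the product around to a small number, which is the class of bug this audit guards against.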


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwarze@ pre 5.5
tweaks by jmc@ schwarze@
ok schwarze@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
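
The caching idea can be sketched as below, assuming a function-scope static buffer and a "sysname release" format. The function name default_os() and the buffer size are assumptions made for this example, not taken from the mandoc source.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Return an operating system string built from uname(3),
 * calling uname(3) only on the first invocation.
 * Returns NULL if uname(3) fails.
 */
const char *
default_os(void)
{
	static char	 cache[1024];	/* assumed large enough */
	static int	 cached;
	struct utsname	 uts;

	if (!cached) {
		if (uname(&uts) == -1)
			return NULL;
		(void)snprintf(cache, sizeof(cache), "%s %s",
		    uts.sysname, uts.release);
		cached = 1;
	}
	return cache;
}
```

Every call after the first returns the same pointer without another system call, which is where the saving comes from when many manuals are processed in one run.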


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
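
A minimal sketch of the parsing rule described above: an optional sign, then digits up to the first non-digit character. The function name nr_apply() is hypothetical, and integer overflow is deliberately ignored to keep the example short.

```c
#include <ctype.h>

/*
 * Compute the new register value from the old value and the
 * "value" argument of .nr.  A leading '+' or '-' requests an
 * increment or decrement; parsing stops at the first non-digit.
 */
int
nr_apply(int old, const char *arg)
{
	char	 sign = '\0';
	int	 val = 0;

	if (*arg == '+' || *arg == '-')
		sign = *arg++;
	while (isdigit((unsigned char)*arg))
		val = val * 10 + (*arg++ - '0');

	if (sign == '+')
		return old + val;
	if (sign == '-')
		return old - val;
	return val;
}
```

Under these assumptions, an argument such as "12abc" sets the register to 12, while "+3" adds 3 to the previous value.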


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source of confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
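
The .Bx rule described above can be sketched roughly as follows. This is
only an illustration, not the code from this commit: the helper name
format_bx() and its buffer-based interface are hypothetical, and the
"BSD" infix is assumed from the usual rendering of .Bx.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from mandoc: build the text for
 * ".Bx version variant".  The first character of the variant is
 * converted to uppercase and the variant is appended with a hyphen.
 */
static void
format_bx(char *buf, size_t bufsz, const char *version, const char *variant)
{
	if (version == NULL)
		version = "";

	/* Without a second argument, there is nothing to append. */
	if (variant == NULL || *variant == '\0') {
		snprintf(buf, bufsz, "%sBSD", version);
		return;
	}

	snprintf(buf, bufsz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```

Under these assumptions, format_bx(buf, sizeof(buf), "4.3", "tahoe")
would leave "4.3BSD-Tahoe" in buf.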


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows to build cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
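
The three exclusions above can be sketched as a small predicate. This is
a minimal illustration under stated assumptions, not the code from this
patch: the function name can_break_at() is hypothetical, and the check
for "free-form text" is left to the caller.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate, not taken from mandoc: return 1 if the
 * hyphen at word[i] is an acceptable line break point according to
 * the rules listed in this commit message, 0 otherwise.
 */
static int
can_break_at(const char *word, size_t i)
{
	size_t	 len = strlen(word);

	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of a word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```

A formatter would presumably scan the word from the right and break at
the last position that both fits on the line and passes this test.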

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
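
The end-of-sentence test described in this entry amounts to looking at
the last character of the input line. A minimal sketch, assuming a
NUL-terminated line without the trailing newline; the function name
line_ends_sentence() is hypothetical and the real parser may handle
additional cases:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from mandoc: return 1 if the line
 * ends in one of the characters ".!?", which this commit message
 * treats as an end of sentence (EOS), 0 otherwise.
 */
static int
line_ends_sentence(const char *line)
{
	size_t	 len = strlen(line);

	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

Because only the end of the input line is inspected, a sentence that
ends in the middle of a line is not detected, which is why the "new
sentence, new line" rule matters.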


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages
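
The string table technique mentioned above can be sketched as follows.
The enum values, message texts and function name are made up for
illustration and do not reproduce the actual "enum merr":

```c
#include <stddef.h>

/* Hypothetical error codes; the real enum has different members. */
enum sketch_err {
	SKETCH_ERR_NOWIDTH,
	SKETCH_ERR_LISTTYPE,
	SKETCH_ERR_MAX
};

/* One message per code, in the same order as the enum. */
static const char *const sketch_errmsgs[SKETCH_ERR_MAX] = {
	"missing width argument",
	"conflicting list types",
};

/* Table lookup replaces a switch statement over the error code. */
static const char *
sketch_strerr(enum sketch_err code)
{
	if ((int)code < 0 || code >= SKETCH_ERR_MAX)
		return "unknown error";
	return sketch_errmsgs[code];
}
```

Compared to a switch statement, adding a message only requires one new
enum member and one new table row, as long as both stay in the same
order.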


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@


Revision tags: OPENBSD_6_2_BASE
# 1.268 12-Sep-2017 schwarze

Do not segfault when there are two .Dt macros, the first without
an architecture argument and the second with an invalid one.
Bug found by jsg@ with afl(1).


# 1.267 02-Aug-2017 schwarze

No longer use names that only occur in the SYNOPSIS section as names
for man(1) lookup. For OpenBSD base and Xenocara, that functionality
was never intended to be required, and i just fixed the last handful
of offenders using it - not counting the horribly ill-designed
interfaces engine(3) and lh_new(3) which are impossible to properly
document in the first place.

Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm,
.Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function"
still works.

This change also gets rid of a few bogus warnings "cross reference
to self" which actually are *not* to self, like in yp(8).

This former functionality was intended to help third-party software
in the ports tree and on non-OpenBSD systems containing manual pages
with incomplete or corrupt NAME sections. But it turned out it did
more harm than good, and caused more confusion than relief,
specifically for third party manuals and for maintainers of
mandoc-portable on other operating systems. So kill it.
Problems reported, among others, by Yuri Pankov (illumos).

OK jmc@


# 1.266 31-Jul-2017 schwarze

Fix an out of bounds read access to a constant array that caused
segfaults on certain hardened versions of glibc. Triggered by .sp
or blank lines right before .SS or .SH, or before the first .Sh.
Found the hard way by Dr. Markus Waldner on Debian
and by Leah Neukirchen on Void Linux.


# 1.265 20-Jul-2017 schwarze

correctly handle letters in .Nx arguments; improves for example
getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...


# 1.264 15-Jul-2017 schwarze

If -column, -diag, -inset, -item, or -ohang lists have a -width,
don't just talk about ignoring it, actually do ignore it.
No change for terminal output, improves HTML output.


# 1.263 03-Jul-2017 schwarze

report trailing delimiters after macros where they are usually a mistake;
the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>


# 1.262 02-Jul-2017 schwarze

add warning "cross reference to self"; inspired by mdoclint


# 1.261 01-Jul-2017 schwarze

Basic reporting of .Xrs to manual pages that don't exist
in the base system, inspired by mdoclint(1).

We are able to do this because (1) the -mdoc parser, the -Tlint validator,
and the man(1) manual page lookup code are all in the same program
and (2) the mandoc.db(5) database format allows fast lookup.

Feedback from, previous versions tested by, and OK jmc@.

A few features will be added to this in the tree, step by step.


# 1.260 29-Jun-2017 schwarze

warn about some non-portable idioms in .Bl -column;
triggered by a question from Yuri Pankov (illumos)


# 1.259 27-Jun-2017 schwarze

warn about .Ns macros that have no effect because they are followed
by an isolated closing delimiter; inspired by mdoclint


# 1.258 25-Jun-2017 schwarze

Catch typos in .Sh names; suggested by jmc@.

I'm using a very simple, linear time / zero space fuzzy string
matching heuristic rather than a full Levenshtein metric, to keep
the code both simple and fast.
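
To illustrate what a linear time, zero space heuristic of this kind can
look like, here is a rough sketch. It is not the function from this
commit: the name roughly_equal(), the maxdiff parameter and the exact
matching rules are assumptions. It walks both strings once, tolerating
a bounded number of single-character substitutions, insertions and
deletions, and needs no scratch memory, unlike a full Levenshtein
matrix.

```c
#include <string.h>

/*
 * Hypothetical heuristic, not taken from mandoc: return 1 if a and b
 * differ by at most maxdiff simple single-character edits, else 0.
 */
static int
roughly_equal(const char *a, const char *b, int maxdiff)
{
	int	 diff = 0;

	while (*a != '\0' && *b != '\0') {
		if (*a == *b) {
			a++;
			b++;
			continue;
		}
		if (++diff > maxdiff)
			return 0;
		if (a[1] == b[1]) {		/* one character replaced */
			a++;
			b++;
		} else if (a[1] == b[0])	/* extra character in a */
			a++;
		else if (a[0] == b[1])		/* extra character in b */
			b++;
		else
			return 0;
	}
	/* Whatever is left over on either side counts as differences. */
	diff += (int)(strlen(a) + strlen(b));
	return diff <= maxdiff;
}
```

With maxdiff set to 2, this sketch accepts "DESCRIPTON" and "SYNOPSYS"
as near misses of the standard names while rejecting unrelated section
titles. Because it is greedy, it can miss some pairs that a true edit
distance would accept, which seems an acceptable trade-off for a typo
warning.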


# 1.257 24-Jun-2017 schwarze

operating system dependent message about unknown architecture;
inspired by mdoclint


# 1.256 24-Jun-2017 schwarze

in the base system, suggest leaving .Os blank; inspired by mdoclint


# 1.255 24-Jun-2017 schwarze

Split -Wstyle into -Wstyle and the even lower -Wbase, and add
-Wopenbsd and -Wnetbsd to check conventions for the base system of
a specific operating system. Mark operating system specific messages
with "(OpenBSD)" at the end.

Please use just "-Tlint" to check base system manuals (defaulting
to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the
manuals of portable software projects you maintain that are not
part of OpenBSD base, to avoid bogus recommendations about base
system conventions that do not apply.

Issue originally reported by semarie@, solution using
an idea from tedu@, discussed with jmc@ and jca@.


# 1.254 17-Jun-2017 schwarze

style message about missing RCS ids; inspired by mdoclint


# 1.253 11-Jun-2017 schwarze

ooops, fix a glitch in the previous commit...


# 1.252 11-Jun-2017 schwarze

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).


# 1.251 11-Jun-2017 schwarze

style message about missing .Fn markup; inspired by mdoclint


# 1.250 11-Jun-2017 schwarze

Do not issue the message "no blank before trailing delimiter" for .No.
In practice, that message only matters inside .Bf, and even there, it
can occasionally be a false positive. In all other cases, it usually
is a false positive, so it is better to drop it outright.
Suggested by jmc@.


# 1.249 10-Jun-2017 schwarze

Reduce false positives for the "no blank before trailing delimiter" message.
This brings us down to one false positive for about every 18 pages.


# 1.248 10-Jun-2017 schwarze

style message about missing blank before trailing delimiter;
inspired by mdoclint(1), and jmc@ considers it useful


# 1.247 07-Jun-2017 schwarze

style checks related to .Er; inspired by mdoclint(1)


# 1.246 01-Jun-2017 schwarze

STYLE message about full stop at the end of .Nd; inspired by mdoclint(1)


# 1.245 31-May-2017 schwarze

STYLE message about missing use of Ox/Nx/Fx/Dx; OK jmc@ wiz@


# 1.244 30-May-2017 schwarze

STYLE message about useless macros we don't want (Bt Tn Ud);
not a WARNING because they don't endanger portability


# 1.243 14-May-2017 schwarze

warn about punctuation between .Xr and .Rs in SEE ALSO;
inspired by mdoclint


# 1.242 05-May-2017 schwarze

Move .sp to the roff modules. Enough infrastructure is in place
now that this actually saves code: -70 LOC.


# 1.241 05-May-2017 schwarze

move .ll to the roff modules


# 1.240 05-May-2017 schwarze

Move handling of the roff(7) .ft request from the man(7)
modules to the new roff(7) modules. As a side effect,
mdoc(7) now handles .ft, too. Of course, do not use that.


# 1.239 04-May-2017 schwarze

Parser reorg:
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.


# 1.238 29-Apr-2017 schwarze

Parser unification: use nice ohashes for all three request and macro tables;
no functional change, minus two source files, minus 200 lines of code.


# 1.237 28-Apr-2017 schwarze

Delete .Pp right before the first .Sh and right before any .Ss,
and warn about it; mdoclint(1) does so, and it makes sense.


# 1.236 24-Apr-2017 schwarze

Continue parser unification:
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].


Revision tags: OPENBSD_6_1_BASE
# 1.235 06-Mar-2017 schwarze

Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.


# 1.234 06-Feb-2017 schwarze

The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.

While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.


# 1.233 11-Jan-2017 schwarze

Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.


# 1.232 10-Jan-2017 schwarze

Use new NODE_NOSRC and NODE_NOPRT flags for .Bx and .At.
More rigorous AST and 40 lines less code.


# 1.231 10-Jan-2017 schwarze

For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.


# 1.230 10-Jan-2017 schwarze

unify names of AST node flags; no change of cpp output


# 1.229 10-Jan-2017 schwarze

Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.

These will help to make handling of text production macros more rigorous.
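
A minimal sketch of how such node flags might be consulted. The struct, the flag values, and the helper names are illustrative assumptions, not the actual definitions in roff.h:

```c
#include <stddef.h>

/* Hypothetical flag values; the real ones are defined in roff.h. */
#define SK_NODE_NOSRC (1 << 0)	/* generated node, not in the source */
#define SK_NODE_NOPRT (1 << 1)	/* node shall not produce output */

/* Hypothetical, reduced node type for illustration only. */
struct sk_node {
	int		 flags;
	const char	*string;
};

/* A formatter would skip nodes that must not print. */
int
sk_node_prints(const struct sk_node *n)
{
	return n != NULL && (n->flags & SK_NODE_NOPRT) == 0;
}

/* A tree dumper can tell generated nodes from source nodes. */
int
sk_node_from_source(const struct sk_node *n)
{
	return n != NULL && (n->flags & SK_NODE_NOSRC) == 0;
}
```

With flags like these, a default argument generated for .Ar would still print but would be reported as not coming from the source, while a .Dd node would come from the source but stay out of the output.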


# 1.228 08-Jan-2017 schwarze

Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.


# 1.227 08-Jan-2017 schwarze

Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd


# 1.226 28-Dec-2016 schwarze

Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.


# 1.225 09-Oct-2016 schwarze

Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.

I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.

Useless complication noticed by Carsten Kunze (Heirloom roff).


# 1.224 20-Aug-2016 schwarze

If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).


# 1.223 11-Aug-2016 schwarze

oops, fix stupid typo in previous


# 1.222 11-Aug-2016 schwarze

If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).


# 1.221 10-Aug-2016 schwarze

Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)


# 1.220 10-Aug-2016 schwarze

Don't printf("%s", NULL) if .It has a macro as an argument
in a list of a type where items don't take arguments.
Issue found by tb@ with afl(1).


# 1.219 10-Aug-2016 schwarze

When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).


# 1.218 09-Aug-2016 schwarze

fix printf("%s", NULL);
found while investigating an unrelated bug report from jsg@
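
Passing NULL for a %s conversion is undefined behavior in C. A sketch of the usual guard, with hypothetical helper names that are not taken from mdoc_validate.c:

```c
#include <stdio.h>

/* Hypothetical helper: substitute a placeholder for a missing string. */
const char *
sk_str_or(const char *s, const char *fallback)
{
	return s != NULL ? s : fallback;
}

/* Format a diagnostic naming a macro argument that may be absent. */
int
sk_format_arg(char *buf, size_t sz, const char *macro, const char *arg)
{
	return snprintf(buf, sz, "%s %s", macro, sk_str_or(arg, "(none)"));
}
```

The placeholder text "(none)" is an assumption made for this sketch; the point is only that the NULL check happens before the string reaches the formatting function.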


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.217 08-Jan-2016 schwarze

Delete the redundant "nchild" member of struct roff_node, replacing
most uses by one, a few by two pointer checks, and only one by a
tiny loop - not only making data smaller, but code shorter as well.

This gets rid of an implicit invariant that confused both static
analysis tools and human auditors. No functional change.
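
The three replacement patterns can be sketched as follows. The node type is a reduced, hypothetical stand-in for struct roff_node, and the function names are invented for illustration:

```c
#include <stddef.h>

/* Hypothetical, reduced node: only the links needed here. */
struct kid_node {
	struct kid_node	*child;	/* first child, or NULL */
	struct kid_node	*next;	/* next sibling, or NULL */
};

/* "no children": one pointer check */
int
kid_none(const struct kid_node *n)
{
	return n->child == NULL;
}

/* "exactly one child": two pointer checks */
int
kid_exactly_one(const struct kid_node *n)
{
	return n->child != NULL && n->child->next == NULL;
}

/* an actual count: the tiny loop */
size_t
kid_count(const struct kid_node *n)
{
	const struct kid_node	*c;
	size_t			 cnt = 0;

	for (c = n->child; c != NULL; c = c->next)
		cnt++;
	return cnt;
}
```

Because the answer is always derived from the links themselves, there is no separate counter that could drift out of sync with the tree.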


# 1.216 30-Oct-2015 schwarze

If a .Bd block has no arguments at all, drop the block and only keep
its contents. Removing a gratuitous difference to groff output
found after a related bug report from krw@.


# 1.215 21-Oct-2015 schwarze

Move all mdoc(7) node validation done before child parsing
to the new separate validation pass, except for a tiny bit
needed by the parser which goes to the new mdoc_state() module;
cleaner, simpler, and surprisingly also shorter by 15 lines.


# 1.214 20-Oct-2015 schwarze

In order to become able to generate syntax tree nodes on the roff(7)
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.


# 1.213 19-Oct-2015 schwarze

style cleanup, no functional change


# 1.212 12-Oct-2015 schwarze

Delete an assignment that is unconditionally overwritten two lines later;
found by Svyatoslav Mishyn <juef at openmailbox dot org>
with the clang static analyzer.


# 1.211 12-Oct-2015 schwarze

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
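
The distinction between the two kinds of fall-through, sketched with a hypothetical token enum that does not appear in the source file:

```c
/* Hypothetical token set, for illustration only. */
enum sk_tok { SK_A, SK_B, SK_C, SK_D };

int
sk_classify(enum sk_tok tok)
{
	int	 score = 0;

	switch (tok) {
	case SK_A:	/* adjacent labels: no comment needed */
	case SK_B:
		score = 1;
		break;
	case SK_C:
		score = 10;	/* code runs before falling through */
		/* FALLTHROUGH */
	case SK_D:
		score += 2;
		break;
	}
	return score;
}
```

Only the SK_C case keeps its comment, because only there could a reader mistake the missing break for an oversight.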


# 1.210 06-Oct-2015 schwarze

modernize style: "return" is not a function; ok cmp(1)


# 1.209 26-Sep-2015 schwarze

/* NOTREACHED */ after abort() is silly, delete it


# 1.208 14-Sep-2015 schwarze

Remove the warning about children of .Vt blocks because actually,
.Vt type global_variable No = Dv defined_constant ;
is the best way to specify in the SYNOPSIS how a global variable
is initialized in the rare case where that matters.
Issue noticed by jmc@.


Revision tags: OPENBSD_5_8_BASE
# 1.207 23-Apr-2015 schwarze

Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.


# 1.206 20-Apr-2015 schwarze

Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.


# 1.205 19-Apr-2015 schwarze

Unify some node handling functions that use TOKEN_NONE.
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.


# 1.204 19-Apr-2015 schwarze

Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.


# 1.203 19-Apr-2015 schwarze

Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.


# 1.202 18-Apr-2015 schwarze

Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.


# 1.201 02-Apr-2015 schwarze

Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.


# 1.200 02-Apr-2015 schwarze

First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.


Revision tags: OPENBSD_5_7_BASE
# 1.199 23-Feb-2015 schwarze

oops, in NAME, don't nag about the comma after .Nm


# 1.198 23-Feb-2015 schwarze

improve NAME section diagnostics;
confusing messages reported by Jan Stary <hans at stare dot cz>


# 1.197 17-Feb-2015 schwarze

Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in base by about 7%.
Reminded of the issue by naddy@.


# 1.196 16-Feb-2015 schwarze

clean up post_dt() validation function;
improved diagnostics, minus six lines of code


# 1.195 14-Feb-2015 schwarze

shut up about tabs in SYNOPSIS .Fd lines, there is no good way to avoid them


# 1.194 12-Feb-2015 schwarze

Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.

Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.

This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.

Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
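
One possible reading of the rewinding rule described above, as a heavily simplified sketch. The block type, the flag values, the extra "closed" flag, and the function name are all assumptions; the real parser does considerably more work per block:

```c
#include <stddef.h>

#define BLK_BROKEN (1 << 0)	/* another block ended inside this one */
#define BLK_ENDED  (1 << 1)	/* end macro seen, rewind was postponed */
#define BLK_CLOSED (1 << 2)	/* sketch only: block has been rewound */

/* Hypothetical, reduced block node. */
struct blk {
	struct blk	*parent;
	int		 flags;
};

/*
 * Walk from a broken block up through its ancestors, closing every
 * block whose end was postponed, and stop after processing the first
 * ancestor that is not itself broken.
 * Returns the number of blocks closed.
 */
int
blk_rewind_pending(struct blk *broken)
{
	struct blk	*n;
	int		 closed = 0;

	for (n = broken->parent; n != NULL; n = n->parent) {
		if (n->flags & BLK_ENDED) {
			n->flags |= BLK_CLOSED;
			closed++;
		}
		if ((n->flags & BLK_BROKEN) == 0)
			break;
	}
	return closed;
}
```

The walk needs nothing but parent pointers and two flag bits, which is why a separately maintained pending pointer becomes unnecessary.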


# 1.193 10-Feb-2015 schwarze

trim trailing white space, no code change;
from Svyatoslav Mishyn <juef at openmailbox dot org>, Crux Linux


# 1.192 06-Feb-2015 schwarze

replace the last legacy generic message type, "argument count wrong",
by more specific messages, improving diagnostics for .cc .tr .Bl -column


# 1.191 06-Feb-2015 schwarze

Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.


# 1.190 06-Feb-2015 schwarze

better handle .Fo and .Fd without argument
better handle .Fo with more than one argument


# 1.189 06-Feb-2015 schwarze

better handle empty .Bd .Bl .D1 .Dl blocks


# 1.188 06-Feb-2015 schwarze

better handle .In .Sh .Ss .St .Xr without arguments


# 1.187 05-Feb-2015 schwarze

fix handling of empty .An macros


# 1.186 04-Feb-2015 schwarze

Discard excess head arguments for .Bd .Bl .Bk and delete hwarn_eq0().
Discard empty .Bk blocks.
Improve related diagnostics.


# 1.185 04-Feb-2015 schwarze

improve diagnostics regarding arguments of .An .Pp .Lp .br .sp
in particular, get rid of check_count(..., CHECK_EQ, 0)


# 1.184 04-Feb-2015 schwarze

discard .Rs head arguments and improve .Rs diagnostics


# 1.183 04-Feb-2015 schwarze

more specific .Nd diagnostics, allowing to get rid of enum check_lvl
and the respective argument of check_count()


# 1.182 03-Feb-2015 schwarze

Bring .Pp/.Lp handling inside .Nm blocks closer to groff;
as a bonus, get rid of another call to rew_sub().


# 1.181 18-Dec-2014 schwarze

Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.


# 1.180 18-Dec-2014 schwarze

When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.


# 1.179 30-Nov-2014 schwarze

Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.


# 1.178 28-Nov-2014 schwarze

Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.


# 1.177 28-Nov-2014 schwarze

Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.


# 1.176 28-Nov-2014 schwarze

Remove bulky, irrelevant library description string tables
not used by a single manual in OpenBSD and just print library names;
will remain in the portable version for use by FreeBSD and NetBSD.
Removes 150 lines from source tree and 16 Kilobytes (4%) from binary.
Bloat found by deraadt@.


# 1.175 28-Nov-2014 schwarze

Simplify code by making mdoc validation handlers void.
No functional change, minus 90 lines of code.


# 1.174 27-Nov-2014 schwarze

Downgrade .Bd -file from FATAL to ERROR.
Since this was the last remaining FATAL error in this area,
this change will allow major simplifications in the mdoc(7) parser.


# 1.173 27-Nov-2014 schwarze

Fix the obsolete .Db (toggle debug mode) macro to ignore its arguments
and not trigger an assertion when there is more than one argument;
the latter found by jsg@ with afl.


# 1.172 26-Nov-2014 schwarze

remove an unreachable warning about .Sm arguments


# 1.171 17-Nov-2014 schwarze

Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.


# 1.170 30-Oct-2014 schwarze

Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.


# 1.169 13-Oct-2014 schwarze

Do not warn about declarations of functions returning function pointers,
getting rid of a false positive noticed by bentley@.


# 1.168 11-Oct-2014 schwarze

oops, don't crash when .Fo has no argument


# 1.167 11-Oct-2014 schwarze

warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@


# 1.166 12-Sep-2014 schwarze

warn about commas in function arguments; inspired by mdoclint(1)


# 1.165 11-Sep-2014 schwarze

warn about botched .Xr ordering and punctuation below SEE ALSO;
inspired by mdoclint(1)


# 1.164 07-Sep-2014 schwarze

warn about AUTHORS sections without .An macros, inspired by mdoclint(1)


# 1.163 07-Sep-2014 schwarze

Allow .ll in the prologue; Daniel Levai reports Slackware Linux uses this.


# 1.162 19-Aug-2014 schwarze

Do not dereference a NULL pointer if a .Bl macro has
no -type, -width, -offset or -compact arguments whatsoever;
this got broken in mdoc_validate.c rev. 1.156.
While here, sort headers.


# 1.161 08-Aug-2014 schwarze

Bring the handling of defective prologues even closer to groff,
in particular relaxing the distinction between prologue and body
and further improving messages.
* The last .Dd wins and the last .Os wins, even in the body.
* The last .Dt before the first body macro wins.
* Missing title in .Dt defaults to UNTITLED. Warn about it.
* Missing section in .Dt does not default to 1. But warn about it.
* Do not warn multiple times about the same mdoc(7) prologue macro.
* Warn about missing .Os.
* Incomplete .TH defaults to empty strings. Warn about it.


# 1.160 08-Aug-2014 schwarze

Simplify by allowing only one post-handler.
Saves 36 static arrays and 10 lines of code
at the expense of only five new trivial static functions.
No functional change.


# 1.159 08-Aug-2014 schwarze

Simplify by allowing only one pre-handler.
Saves 12 static arrays and 19 lines of code.
No functional change.


# 1.158 08-Aug-2014 schwarze

demacrify: get rid of man_nmsg(), man_pmsg(), mdoc_nmsg(), mdoc_pmsg()


# 1.157 08-Aug-2014 schwarze

mention requests and macros in more messages


# 1.156 08-Aug-2014 schwarze

Split MANDOCERR_IGNARGV into one message for .An and one for .Bl
and report the macro name and argument.


# 1.155 08-Aug-2014 schwarze

In .Bl -column, if some of the column width declarations are given
right after the -column argument and some at the very end of the
argument list, after some other arguments like -compact, concatenate
the column lists.
This gets rid of one of the last useless FATAL errors
and actually shortens the code by a few lines.

This fixes an issue introduced more than five years ago, at first
causing an assert() since bsd.lv mdoc_action.c rev. 1.14 (June 17, 2009),
then later a FATAL error since mdoc_validate rev. 1.130 (Nov. 30, 2010),
and marked as "TODO" ever since.


# 1.154 08-Aug-2014 schwarze

Remove the useless FATAL error "argument count wrong, violates syntax".
The last remaining instance was .It in .Bl -column with more than one
excessive .Ta. However, simply downgrading from FATAL to ERROR, it just
works fine, almost the same way as in groff, without any other changes.


# 1.153 08-Aug-2014 schwarze

Get rid of the useless FATAL error "child violates parent syntax".
When finding items outside lists, simply skip them and throw an ERROR.
Handle subsections before the first section instead of bailing out.


# 1.152 08-Aug-2014 schwarze

Remove two useless FATAL errors.
When a file contains neither text nor macros, treat it as an empty document.
When the mdoc(7) document prologue is incomplete, use some default values.


# 1.151 08-Aug-2014 schwarze

better name and wording for the last two non-generic errors


# 1.150 08-Aug-2014 schwarze

Various improvements related to .Ex and .Rv:
* let .Nm fall back to the empty string, not to UNKNOWN
* never let .Rv copy an argument from .Nm
* avoid spurious \fR after empty .Nm in -Tman
* correct handling of .Ex and .Rv in -Tman
* correct the wording of the output for .Rv without arguments
* use non-breaking spaces in .Ex and .Rv output where required
* split MANDOCERR_NONAME into a warning for .Ex and an error for .Nm


# 1.149 08-Aug-2014 schwarze

Partial implementation of .Bd -centered.

In groff, .Bd -centered operates in fill mode, which is relatively
hard to implement, while this implementation operates in non-fill
mode so far. As long as you pay attention that your lines do not
overflow, it works. To make sure that rendering is the same for
mandoc and groff, it is recommended to insert .br between lines
for now. This implementation will need improvement later.


Revision tags: OPENBSD_5_6_BASE
# 1.148 07-Jul-2014 schwarze

no need to delete any content from .Rs blocks,
and downgrade the related message from ERROR to WARNING


# 1.147 06-Jul-2014 schwarze

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.


# 1.146 05-Jul-2014 schwarze

Cleanup with respect to bad macro arguments.
* Fix .Sm with invalid arg: move arg out and toggle mode.
* Promote "unknown standard" from WARNING to ERROR, it loses information.
* Delete MANDOCERR_BADWIDTH, it would only indicate a mandoc(1) bug.
* Do not report MANDOCERR_BL_LATETYPE when there is no type at all.
* Mention macro names, arguments and fallbacks.


# 1.145 05-Jul-2014 schwarze

Cleanup regarding -offset and -width:
* Bugfix: Last one wins, not first one.
* Fix .Bl -width without argument: it means 0n, so do not ignore it.
* Report macro names, argument names and fallbacks in related messages.
* Simplify: Garbage collect auxiliary variables in pre_bd() and pre_bl().


# 1.144 04-Jul-2014 schwarze

Clean up messages regarding excess arguments:
* Downgrade ".Bf -emphasis Em" from FATAL to WARNING.
* Mention the macros, the arguments, and the fallbacks.
* Hierarchical naming.
Also fix the handling of excess .It head arguments in -Tman.


# 1.143 04-Jul-2014 schwarze

Clean up messages related to missing arguments.
* Do not warn about empty -column cells, they seem valid to me.
* Downgrade empty item and missing -std from ERROR to WARNING.
* Hierarchical naming.
* Descriptive, not imperative style.
* Mention macro names, argument names, and fallbacks.
* Garbage collect some unreachable code in post_it().


# 1.142 03-Jul-2014 schwarze

Fix formatting of empty .Bl -inset item heads.
Downgrade empty item heads from ERROR to WARNING.
Show the list type in the error message.
Choose better variable names for nodes in post_it().


# 1.141 02-Jul-2014 schwarze

Improve and test the messages about empty macros,
in particular reporting the macro names involved.


# 1.140 02-Jul-2014 schwarze

When .Sm is called without an argument, groff toggles the spacing mode,
so let us do the same for compatibility. Using this feature is of
course not recommended except in manual page obfuscation contests.


# 1.139 02-Jul-2014 schwarze

Disentangle the MANDOCERR_CHILD message, which reported three
completely different things, into three distinct messages.
Also mention the macro names we are talking about.


# 1.138 02-Jul-2014 schwarze

Clean up warnings related to macros and nesting.
* Hierarchical naming of enum mandocerr items.
* Improve the wording to make it comprehensible.
* Mention the offending macro.
* Garbage collect one chunk of ancient, long unreachable code.


# 1.137 02-Jul-2014 schwarze

Improve "skipping paragraph macro" messages,
showing which macro was skipped and before or after what.


# 1.136 02-Jul-2014 schwarze

Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,
since this is hardly more complicated than explicitly ignoring them
as we did in the past. Of course, do not use them!


# 1.135 01-Jul-2014 schwarze

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.


# 1.134 20-Jun-2014 schwarze

As suggested by jmc@, only include line and column numbers into messages
when they are meaningful, to avoid confusing stuff like this:
$ mandoc /dev/null
mandoc: /dev/null:0:1: FATAL: not a manual
Instead, just say:
mandoc: /dev/null: FATAL: not a manual

Another example this applies to is documents having a prologue,
but lacking a body. Do not throw a FATAL error for these; instead,
issue a warning and show the empty document, in the man(7) case with
the same amount of blank lines as groff does. Also downgrade mdoc(7)
documents having content before the first .Sh from FATAL to WARNING.
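
A sketch of the first change, leaving out the position when it carries no information. The function name, the signature, and the convention that line 0 means "no position" are assumptions made for illustration:

```c
#include <stdio.h>

/*
 * Hypothetical formatter: a line number of 0 means "no meaningful
 * position", in which case line and column are left out entirely.
 */
int
sk_format_msg(char *buf, size_t sz, const char *file,
    int line, int col, const char *level, const char *msg)
{
	if (line > 0)
		return snprintf(buf, sz, "mandoc: %s:%d:%d: %s: %s",
		    file, line, col, level, msg);
	return snprintf(buf, sz, "mandoc: %s: %s: %s", file, level, msg);
}
```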


# 1.133 20-Jun-2014 schwarze

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.


# 1.132 23-Apr-2014 schwarze

Audit malloc(3)/calloc(3)/realloc(3) usage.
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
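
The overflow problem that reallocarray(3) guards against can be sketched with a portable stand-in. The function below is not the libc implementation, only an illustration of the check:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Stand-in for reallocarray(3): refuse the request when
 * nmemb * size would wrap around, instead of silently
 * allocating a buffer that is too small.
 */
void *
sk_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

A plain realloc(p, n * sizeof(*p)) gives no such protection: if the multiplication wraps, the call succeeds with a short buffer and later writes run past its end.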


# 1.131 23-Apr-2014 schwarze

Audit strlcpy(3)/strlcat(3) usage.

* Repair three instances of silent truncation, use asprintf(3).
* Change two instances of strlen(3)+malloc(3)+strlcpy(3)+strlcat(3)+...
to use asprintf(3) instead to make them less error prone.
* Cast the return value of four instances where the destination
buffer is known to be large enough to (void).
* Completely remove three useless instances of strlcpy(3)/strlcat(3).
* Mark two places in -Thtml with XXX that can cause information loss
and crashes but are not easy to fix, requiring design changes of
some internal interfaces.
* The file mandocdb.c remains to be audited.
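
Since asprintf(3) is not available on every platform, the following sketch shows the measure-then-allocate idea behind it using only standard snprintf(3). The helper name is hypothetical:

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Join two strings with a separator into a freshly allocated
 * buffer of exactly the right size, so nothing is truncated.
 * Returns NULL on failure; the caller frees the result.
 */
char *
sk_join(const char *a, const char *sep, const char *b)
{
	char	*buf;
	int	 len;

	/* First pass: only measure the formatted length. */
	len = snprintf(NULL, 0, "%s%s%s", a, sep, b);
	if (len < 0)
		return NULL;
	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return NULL;
	/* Second pass: format into the exactly sized buffer. */
	(void)snprintf(buf, (size_t)len + 1, "%s%s%s", a, sep, b);
	return buf;
}
```

Compared with a fixed-size buffer filled by strlcpy(3) and strlcat(3), there is no size to guess and therefore no silent truncation to detect.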


# 1.130 20-Apr-2014 schwarze

strlen+malloc+snprintf is error prone;
rewrite post_lb() to use asprintf(3) instead


# 1.129 20-Apr-2014 schwarze

make sure static buffers for snprintf(3) are large enough
and cast snprintf return value to (void) where they are


# 1.128 20-Apr-2014 schwarze

KNF: case (FOO): -> case FOO, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change


# 1.127 15-Apr-2014 schwarze

Using macros in .Sh header lines, or having .Sm off or .Bk -words open
while processing .Sh, is not at all recommended, but it's not strictly
a syntax violation either, and in any case, mandoc must not die in an
assertion. I broke this in rev. 1.124.

Crash found while trying to read the (rather broken) original 4.3BSD-Reno
od(1) manual page.


# 1.126 31-Mar-2014 dlg

recognise the CONTEXT section. we consider it only applicable to
section 9 manpages for now.

requested by schwartz@ pre 5.5
tweaks by jmc@ schwartz@
ok schwartz@


# 1.125 30-Mar-2014 schwarze

Implement the roff(7) .ll (line length) request.
Found by naddy@ in the textproc/enchant(1) port.
Of course, do not use this in new manuals.


# 1.124 23-Mar-2014 schwarze

Retire the old concat() function.
For .Sh, it wasn't even needed at all.
For .Dd, .Nm, and .Os, use the new mdoc_deroff() instead.
This gets rid of the last limited-size static buffers in this file,
hence eliminates the last explicit MANDOCERR_MEM throwers here,
and it shortens the code by 50 lines.


# 1.123 21-Mar-2014 schwarze

avoid repetitive code for asprintf error handling


# 1.122 21-Mar-2014 schwarze

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.


Revision tags: OPENBSD_5_5_BASE
# 1.121 16-Feb-2014 schwarze

After Werner Lemberg accepted and committed some updates to the manual
page template contained in groff_mdoc(7), catch up with our own stuff.
In particular, allow ERRORS in section 4 and DIAGNOSTICS in section 9.
ok jmc@


# 1.120 11-Jan-2014 schwarze

Remove useless use of strnlen(3).
Yuckiness pointed out by deraadt@.


# 1.119 07-Jan-2014 schwarze

Cache the result of uname(3) such that we don't need to call it
over and over again for each manual; found with gprof(1).
Speeds up mandocdb(8) -Q by 3%, now at 39.5% of makewhatis(8).
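
A minimal sketch of such a cache, assuming a function-scope static buffer. The function name and the buffer size are illustrative and not taken from the source file:

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Return "sysname release", calling uname(3) only on first use.
 * Returns NULL if uname(3) fails.
 */
const char *
sk_default_os(void)
{
	static char	 cache[512];	/* assumed ample for both fields */
	static int	 cached;
	struct utsname	 uts;

	if (cached == 0) {
		if (uname(&uts) == -1)
			return NULL;
		(void)snprintf(cache, sizeof(cache), "%s %s",
		    uts.sysname, uts.release);
		cached = 1;
	}
	return cache;
}
```

Every call after the first returns the same buffer without entering the kernel, which is what matters when thousands of manuals are processed in one run.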


# 1.118 06-Jan-2014 schwarze

Another 18% speedup for mandocdb(8) -Q, found by gprof(1).
In -Q mode, refrain from validating and normalizing the format
of the date given in .Dd or .TH, as it won't be used anyway.

For /usr/share/man, mandocdb -Q now takes 45% of the time of makewhatis(8).


# 1.117 06-Jan-2014 schwarze

Joerg Sonnenberger contributed copyrightable amounts of text to
some files. To make it clear that he also put his contributions
under the ISC license, with his explicit permission, add his
Copyright notice to the relevant files. No code change.


# 1.116 15-Dec-2013 schwarze

The "value" argument to the roff(7) .nr requests ends right before
the first non-digit character. While here, implement and document
an optional sign, requesting increment or decrement, as documented
in the Ossanna/Kernighan/Ritter troff manual and supported by groff.

Reported by bentley@ on discuss at mdocml.
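
The parsing rule can be sketched as below. The function name and interface are hypothetical, and integer overflow on very long digit strings is deliberately not handled in this sketch:

```c
#include <ctype.h>

/*
 * Apply a ".nr"-style value string to a register value.
 * A leading '+' or '-' requests an increment or decrement;
 * without a sign, the register is simply overwritten.
 * Parsing stops right before the first non-digit character.
 */
int
sk_nr_apply(int current, const char *val)
{
	int	 sign = 0;
	int	 num = 0;

	if (*val == '+' || *val == '-')
		sign = *val++ == '+' ? 1 : -1;
	while (isdigit((unsigned char)*val))
		num = num * 10 + (*val++ - '0');
	return sign == 0 ? num : current + sign * num;
}
```

For a register currently holding 5, the value "12" yields 12, "+3" yields 8, and "-2x" yields 3 because parsing stops at the "x".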


# 1.115 21-Oct-2013 schwarze

There are three kinds of input lines: text lines, macros taking
positional arguments (like Dt Fn Xr) and macros taking text as
arguments (like Nd Sh Em %T An). In the past, even the latter put
each word of their arguments into its own MDOC_TEXT node; instead,
concatenate arguments unless delimiters, keeps or spacing mode
prevent that. Regarding mandoc(1), this is internal refactoring,
no output change intended.

Once we will switch mandocdb(8) from DB to SQLite in the future,
this is going to be required to support search expressions crossing
word boundaries, and it will reduce both database sizes and build
times by a bit more than 5% each.


# 1.114 06-Oct-2013 schwarze

We don't do hyphenation, but we allow breaking the line at hyphens that are
already there in the middle of words. So far, we only allowed this on text
lines. Now it turns out some macros allow this for their arguments, too,
in particular .Nd and most of the .%? citation macros.

Issue found by Franco Fichtner <franco at lastsummer dot de> while doing
systematic groff-mandoc comparisons in the DragonFly base system, THANKS!

While here, garbage collect two empty prevalidator function pointer lists
and sort a couple of function declarations.


# 1.113 06-Oct-2013 schwarze

If there is random stuff inside a .Bl block body before the first .It,
do not throw a FATAL error and do not die, but just throw a WARNING
and move the stuff out of the .Bl block.

This bug felt completely 2008-ish; meanwhile, such bugs from the
Kristaps-doesnt-like-syntax-errors-so-lets-just-give-up--Era
are becoming rare, but this was one of the last survivors.

Thanks to bentley@ for reminding me to finally fix this.


# 1.112 03-Oct-2013 schwarze

Support setting arbitrary roff(7) number registers,
preserving read support for the ".nr nS" SYNOPSIS state register;
read support for arbitrary registers is still not available.

Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013),
but implemented differently. I don't want to have yet another different
implementation of a hash table in mandoc - it would be the second one
in roff.c alone and the fifth one in mandoc grand total.
Instead, i designed and implemented roff_setreg() and roff_getreg()
to be similar to roff_setstrn() and roff_getstrn().

Once we feel the need to optimize, we can introduce one common
hash table implementation for everything in mandoc.


# 1.111 16-Sep-2013 schwarze

One of the WARNING messages has to use the word "section" twice in two
different meanings, that cannot be helped. But we can make this less
confusing by stating that the second instance refers to stuff like (2),
(3), and (9), and by adding the section header the first instance
refers to, for example ERRORS or RETURN VALUES.

Source for confusion noticed by Jan Stary <hans at stare dot cz>,
better wording suggested by jmc@, tweaked by me.


# 1.110 05-Aug-2013 schwarze

Put .%C before .%D in .Rs output
because that's the usual order in formal citations.

My patch that was accepted into groff by Werner Lemberg
uses the same order, so keep groff and mandoc consistent.

Committing now because jmc@ already starts to rely on the .%C macro,
see for example /usr/src/usr.bin/bdes/bdes.1 rev. 1.11.


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.109 17-Nov-2012 schwarze

Cleanup naming of local variables to make the code easier on the eye:
Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta"
and avoid the confusing "*m" which was sometimes this, sometimes that.
No functional change.

ok kristaps@ some time ago


# 1.108 16-Nov-2012 schwarze

Warn about unknown volume or arch in Dt macro arguments;
patch written by Nicolas Joly <njoly at pasteur dot fr>.


Revision tags: OPENBSD_5_2_BASE
# 1.107 18-Jul-2012 schwarze

Fix handling of paragraph macros inside lists:
* When they are trailing the last item, move them outside the list.
* When they are trailing any other non-compact item, drop them.

Improves formatting of 40 pages, e.g. grep(1), ksh(1), netstat(1),
ath(4), bsd.port.mk(5), pf.conf(5), mount(8), crypto(9).


# 1.106 16-Jul-2012 schwarze

Several -mdoc parser improvements related to vertical spacing:
* So far, .Pp and .Lp were removed before paragraph type blocks.
* Now also remove .br before paragraph type blocks.
* Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
* Do not treat .sp as a paragraph, don't remove anything before it.
* After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
* After .sp and .br, remove .br.


# 1.105 12-Jul-2012 schwarze

The post_nm() validation function crashed when the first .Nm child node
was a non-text node. Fix this by rewriting post_nm() to always set
the meta name to UNKNOWN when the name is missing or unusable.
While here, make MANDOCERR_NONAME an ERROR, as it usually renders
the page content unintelligible.

Bug reported by Maxim <Belooussov at gmail dot com>, thanks.


# 1.104 11-Jul-2012 schwarze

fix position and formatting of %U


# 1.103 10-Jul-2012 schwarze

* implement -Tman .Bl -item -inset -diag -ohang -dash -hyphen -enum .It
* fix -Tman .Bl -bullet .It
* adjust the -Tascii .Bl -bullet -dash -hyphen .It
default and minimum width to new groff standards,
it changed from 4n (in groff 1.15) to 2n (in groff 1.21)
* same for -Tascii -enum, it changed from 5n to 2n
* use -hang formatting for -Tascii -enum -width 2n
* for -Tascii -enum, the default is -width 3n


# 1.102 24-May-2012 schwarze

Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the
default value for the mdoc(7) .Os macro.
Needed for man.cgi on the OpenBSD website.

Problem with man.cgi first noticed by deraadt@;
beck@ and deraadt@ agree with the way to solve the issue.


# 1.101 15-Apr-2012 schwarze

Two bugfixes regarding the -width and -offset macro arguments:
1) They consume the next argument even if it starts with a dash.
2) When -width is the last argument on the line such that the
actual width argument is missing, downgrade from a fatal to a
non-fatal error, just like for -offset. The formatting still
doesn't agree with groff, but at least we don't die any longer.

Item 2 was observed and that part of the patch coded by kristaps@,
who found lots of instances of this particular formatting error
in Mac OSX manuals.


Revision tags: OPENBSD_5_1_BASE
# 1.100 03-Dec-2011 schwarze

remove useless "#ifdef __linux__" that crept in,
and trivial sync to bsd.lv (two new comments)


# 1.99 02-Dec-2011 schwarze

In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.

ok kristaps@

To not break compatibility, i wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff


# 1.98 19-Nov-2011 schwarze

Avoid a NULL pointer access if an .Rs block body contains nothing
but invalid nodes. Output still differs a lot from groff, but at
least let's not crash.
Problem found and patch provided by joerg@, thanks!


# 1.97 16-Nov-2011 schwarze

When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).


# 1.96 16-Oct-2011 schwarze

Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.


# 1.95 18-Sep-2011 schwarze

sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring

regressions are so minor that it's better to get this in
and fix them in the tree


# 1.94 18-Sep-2011 schwarze

sync to version 1.11.5:
adding an implementation of the eqn(7) language
by kristaps@

So far, only .EQ/.EN blocks are handled, in-line equations are not, and
rendering is not yet very pretty, but the parser is fairly complete.


Revision tags: OPENBSD_5_0_BASE
# 1.93 29-May-2011 schwarze

Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
- New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
- Start using mandoc_getarg() in mdoc_argv.c.
- Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.


# 1.92 24-Apr-2011 schwarze

Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions
that i will fix in separate commits right afterwards.


# 1.91 21-Apr-2011 schwarze

Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes


# 1.90 20-Mar-2011 schwarze

Import the foundation for eqn(7) support.
Written by kristaps@.

For now, i'm adding one line to each of the four frontends
to just pass the input text through to the output,
not yet interpreting any of the eqn keywords.


# 1.89 07-Mar-2011 schwarze

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
- always store dates as strings, not as seconds since the Epoch
- for input, try the three most common formats everywhere
- for unrecognized format, just pass the date through verbatim
- when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@


Revision tags: OPENBSD_4_9_BASE
# 1.88 06-Feb-2011 schwarze

If .Ns is specified on its own line, ignore it, like groff does;
from kristaps@.


# 1.87 30-Jan-2011 schwarze

Make .Bx accept not more than two arguments.
Convert the first character of the second argument to uppercase.
Append the second argument with a hyphen.
Improves chpass(1), column(1), fstat(1), ...
from kristaps@
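
As a rough illustration of the .Bx rules above: format_bx() below is a
hypothetical helper, not the committed code, and it assumes the caller
has already limited the macro to at most two arguments.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical sketch, not mandoc's actual code: build the text
 * for ".Bx version [variant]".  The first character of the
 * variant is converted to uppercase and the variant is appended
 * with a hyphen.  Returns what snprintf(3) returns.
 */
int
format_bx(char *buf, size_t sz, const char *version, const char *variant)
{
	if (variant == NULL || *variant == '\0')
		return snprintf(buf, sz, "%sBSD", version);
	return snprintf(buf, sz, "%sBSD-%c%s", version,
	    toupper((unsigned char)*variant), variant + 1);
}
```

For example, the arguments "4.3" and "tahoe" would give "4.3BSD-Tahoe"
in this sketch.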


# 1.86 30-Jan-2011 schwarze

Like in groff, if .%B is specified, quote .%T; from kristaps@.


# 1.85 22-Jan-2011 schwarze

Check argument count validation for all in_line() macros.
Most empty in_line() macros are already removed by the parser,
so there is no need to check again in mdoc_validate.c.
This also downgrades almost all remaining argument count issues
from ERROR to WARNING.
ok kristaps@


# 1.84 04-Jan-2011 schwarze

Merge kristaps@' cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.


# 1.83 03-Jan-2011 schwarze

Partial cleanup of argument count validation in mdoc(7):

* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.

Looks fine to kristaps@.

Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.


# 1.82 29-Dec-2010 schwarze

Reorg by Kristaps: In libmdoc, replace the union of pointers to structs
of macro-specific data by a pointer to a union of structs, which makes the
code simpler and more robust at the expense of a small memory overhead.
Merging was somewhat difficult because we mustn't break tbl(1) support
which the bsd.lv version does not yet have.


# 1.81 26-Dec-2010 schwarze

Behave more like groff (both old and new): Specifying both .%T and .%J in
an .Rs block causes the title to be quoted instead of underlined, such
that journal title and article title appear visually different.
Original diff from kristaps@, simplified by me, tweaked again by kristaps@.


# 1.80 21-Dec-2010 schwarze

Migrate .An to use a pointer to its data, like everybody else.
In preparation for a simpler ref-counted system for node data.
From kristaps@.


# 1.79 21-Dec-2010 schwarze

Vertical spacing improvements from kristaps@, small tweaks by me:
Add a "last child" member to struct mdoc_node.
Remove .Pp or .Lp if it is the first or last child of an .Sh or .Ss body.
Thus, no need to do the same in the front-ends any longer.
Tolerate some cases of .Pp inside .Bl.


# 1.78 09-Dec-2010 schwarze

Allow quote macros (`Op', `Aq', `Bq', `Dq', `Pq', `Ql', `Qq', `Sq', and
`Brq') to have zero arguments without warning. This makes sense because
the multi-line quote macros (`Oo/Oc' etc.) allow zero children anyway.
Furthermore, the documentation doesn't state that they're required to
have children.

Reported by Alex Kozlov, patch from kristaps@.


# 1.77 07-Dec-2010 schwarze

Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.


# 1.76 01-Dec-2010 schwarze

Merge mdoc_action.c into mdoc_validate.c, because having two places to do
basically the same things just causes code duplication and confusion.
Work by kristaps@, including a few bugfixes he found during the merge,
and reapplying OpenBSD changes on top.


# 1.75 26-Oct-2010 schwarze

Downgrade nearly 20 ERRORS to WARNINGS.
All these indicate problems in the mdoc(7) or man(7) source code,
but they can't cause relevant information loss or clobbered formatting.
While here, improve error message wording and make it more uniform,
don't throw MANDOCERR_NOWIDTHARG twice when there is one single issue,
and consolidate MANDOCERR_WIDTHARG into MANDOCERR_IGNARGV.


# 1.74 24-Oct-2010 schwarze

Do not throw FATAL errors when there is no need to:
- when encountering nested displays (.Bd containing .Bd, .D1, .Dl)
- when a block end macro was forgotten
- when ending a block that was never started
- when the uname(3) system call failed
along with a little related cleanup


# 1.73 23-Oct-2010 schwarze

use proper message in case of multiple arguments to .An
from kristaps@


# 1.72 23-Oct-2010 schwarze

cleanup mdoc(7) validation code: use real functions, not macros
from kristaps@


# 1.71 16-Oct-2010 schwarze

Support tbl(1) code embedded into mdoc(7) input files.
Very similar to what i have done in man(7) yesterday.
Allows building cpu(4) on HPPA, wi(4), and phantasia(6).
Now we are able to build all tbl code in base.


# 1.70 27-Sep-2010 schwarze

Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@:
Outside literal mode, double blank lines must both be printed.
To achieve this again after kristaps@ improvements in 1.10.6,
treat such blank lines as .sp (instead of .Pp as in 1.10.5)
and drop .Pp before .sp just like dropping .Pp before .Pp.


# 1.69 20-Sep-2010 schwarze

Make .Pp .Lp .br not FATAL when having arguments;
From kristaps@


# 1.68 20-Aug-2010 schwarze

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.


Revision tags: OPENBSD_4_8_BASE
# 1.67 31-Jul-2010 schwarze

Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml
and some optimisations in terminal output.


# 1.66 25-Jul-2010 schwarze

Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends


# 1.65 13-Jul-2010 schwarze

Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks,
but keep input lines together in the output, finally fixing
synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.


# 1.64 02-Jul-2010 schwarze

Not only for -tag lists, but for -hang, -ohang, -inset, -diag,
and -item list as well, empty bodies are OK, they do not even
warrant a warning, much less the error they were throwing.
According to kristaps, joerg@ also brought this up some time ago.
ok kristaps@ jmc@


# 1.63 27-Jun-2010 schwarze

Full .nr nS support, unbreaking the kernel manuals.

Kristaps coded this from scratch after reading my .nr patch;
it is simpler and more powerful.

Registers live in struct regset in regs.h, struct man and struct mdoc
contain pointers to it. The nS register is cleared when parsing .Sh.
Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.


# 1.62 27-Jun-2010 schwarze

minor .Bk fixes:
* do not print invalid arguments verbatim (groff does not print them, either)
* do not trigger TERMP_PREKEEP twice
* do not die from invalid arguments (groff won't die, either)
* continue to ignore even valid arguments (just like groff)
ok kristaps@ on the previous version, before removing my last bug ;)


# 1.61 26-Jun-2010 schwarze

merge release 1.10.2
* bug fixes:
- interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
- handling of roff conditionals (found by Ulrich Spoerlein)
- .Bd -offset will no more default to 6n
* maintenance:
- more caching of .Bd and .Bl arguments for efficiency
- deconstify man(7) validation routines
- add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching


# 1.60 06-Jun-2010 schwarze

Merge bsd.lv version 1.10.1 (to be released soon).

The main step forward is that this now has *much* better .Bl -column
support, now supporting many manuals that previously errored out
without producing any output.

Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages

While merging, fix one regression in .In spacing
that needs to go to bsd.lv, too.


# 1.59 06-Jun-2010 schwarze

Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release,
bringing in the OpenBSD changes to bsd.lv,
but which also has a few additional minor fixes:

* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c


# 1.58 26-May-2010 schwarze

When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
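
The three exclusions listed above could be expressed as a predicate
along these lines. hyphen_breakable() is a hypothetical name used only
for illustration; the actual patch works differently in detail, and the
restriction to free-form text is not modelled here.

```c
#include <stddef.h>

/*
 * Hypothetical predicate, not taken from the mandoc sources:
 * return 1 if the hyphen at word[i] is an acceptable place
 * to break the line, 0 otherwise.
 */
int
hyphen_breakable(const char *word, size_t len, size_t i)
{
	if (i >= len || word[i] != '-')
		return 0;
	/* Not at the beginning or end of the word. */
	if (i == 0 || i + 1 == len)
		return 0;
	/* Not immediately preceded or followed by another hyphen. */
	if (word[i - 1] == '-' || word[i + 1] == '-')
		return 0;
	/* Not escaped by a preceding backslash. */
	if (word[i - 1] == '\\')
		return 0;
	return 1;
}
```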

Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.

Where there are still differences, what we do seems to be better than
what groff does, see e.g. the chio(1) exchange and position commands
for one of the now rare examples.

idea and coding by kristaps@

Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.


# 1.57 24-May-2010 schwarze

lift 64-byte max width for Sh (now BUFSIZ); from kristaps@


# 1.56 24-May-2010 schwarze

Increase performance by saving the list type in struct mdoc_node.
This will eventually be used so that mdoc_macro can know whether to
dump list line arguments into the body (`Bl -column' overflowing).
Remove a2list() and arg_listtype() because of this.

From kristaps@.

While merging, fix a regression in mdoc_term.c, print_bvspace():
The bsd.lv version of this broke vertical spacing in .Bl -column.


# 1.55 23-May-2010 schwarze

Unified error and warning message system for all of mandoc,
featuring three message levels, as agreed during the mandoc hackathon:
* FATAL parser failure, cannot produce any output from this input file:
eventually, we hope to convert most of these to ERRORs.
* ERROR, meaning mandoc cannot cope fully with the input syntax and will
probably lose information or produce structurally garbled output;
it will try to produce output anyway but exit non-zero at the end,
which is eventually intended to make the ports infrastructure happy.
* WARNING, meaning you should clean up the input file, but output
is probably mostly OK, so this will not cause error-exit at the end.
This commit is mostly just converting the old system to the new one; before
the classification will become really reliable, we must check all messages.

In particular,
* set up a new central message string table in main.c
* drop the old message string tables from man.c and mdoc.c
* get rid of the piece-meal merr enums in libman and libmdoc
* reduce number of error/warning functions from 16 to 6 (still a lot...)

While here, handle a few problems more gracefully:
* allow .Rv and .Ex to work without a prior .Nm
* allow .An to ignore extra arguments
* allow undeclared columns in .Bl -column

Written by kristaps@.


# 1.54 15-May-2010 schwarze

allow non-numeric manual sections in -mdoc;
while here, allow LIBRARY in section 9;
by kristaps@


# 1.53 15-May-2010 schwarze

various improvements regarding errors and warnings from Joerg Sonnenberger:
* If the last -column .Bl isn't specified, it is auto-sized.
* An invalid .St argument should be a warning, not an error.
Just put the argument into the output.
* An invalid .At argument should be a warning, not an error.
Just print the argument, like new groff does.
* Remove warnings concerning manual section (like 1, 6, 8).
It was only used for .Ex and not really useful.
* Remove warnings concerning page section (like SYNOPSIS).
These were only used for .Fd and .Lb and not really useful.


# 1.52 14-May-2010 schwarze

Integrate kristaps@' end-of-sentence (EOS) framework
which is simpler and more powerful than mine, and remove mine.

* man(7) now has EOS handling, too
* put EOS detection into its own function in libmandoc
* use node and termp flags to communicate the EOS condition
* no more EOS pseudo-macro
* no more non-printable EOS marker character on the formatter level

This slightly breaks EOS detection after trailing punctuation
in mdoc(7) macros, but that will be restored soon.


# 1.51 14-May-2010 schwarze

Merge 1.9.25, keeping local patches;
this does not merge kristaps' end-of-sentences handling yet,
i will check that separately. This one includes:
* handle \*(Ba as a delimiter
* introduce ARGS_PEND for .Bl -column .It end-of-line special casing
* section ordering: expect EXIT STATUS at the right place
* line break fixes in SYNOPSIS
* allow literal contexts to have arbitrary line lengths
* the input file column number can not be used to identify the beginning
of a line because white space is allowed after the initial '.'
* proper leading spaces in -man -Tascii mode
* do not let Lb break lines in -mdoc -Thtml LIBRARY


# 1.50 14-May-2010 schwarze

merge 1.9.24, keeping local patches; some changes:
* preserve multiple consecutive space characters in input
* do not restrict .Cd and .Rv to certain sections (requested by Joerg)
* do not run lookup() on quoted words
* enum return types for mdoc_args and mdoc_argv
* fix auto-closing of LINK tag in -Txhtml (from Daniel Friesel)
* various lint and manual fixes


# 1.49 13-May-2010 schwarze

Remove the command line option -fno-ign-chars.
This option was not useful, you never want mandoc to die
just because there is an invalid character in the input file,
neither in production nor when linting: a warning is sufficient.
This was particularly annoying because it was part of -fstrict
and could not be switched off.
"less is more" kristaps@


# 1.48 07-Apr-2010 schwarze

Merge the good parts of 1.9.23,
avoid the bad parts of 1.9.23, and keep local patches.

Input in general:
* Basic handling of roff-style font escapes \f, \F.
* Quoted punctuation does not count as punctuation.

mdoc(7) parser:
* Make .Pf callable; noted by Claus Assmann.
* Let .Bd and .Bl ignore unknown arguments; noted by deraadt@.
* Do not warn when .Er is used outside certain sections.
* Replace mdoc_node_free[list] by mdoc_node_delete.
* Replace #define by enum for rew*() return values.

man(7) parser:
* When .TH is missing, use default section and date.

Output in general:
* Curly braces do not count as punctuation.
* No space after .Fl w/o args when a macro follows on the same line.

HTML output:
* Unify PAIR_*_INIT macros, introduce new PAIR_ID_INIT().
* Print whitespace after, not before .Vt .Fn .Ft .Fo.

Checked that all manuals in base still build.


# 1.47 03-Apr-2010 schwarze

no need to die from .Xr without arguments, we can just ignore it

ok deraadt@


# 1.46 03-Apr-2010 schwarze

When two conflicting list types are specified for the same list,
use the first, discard the second, and warn. No need to bail out.

ok deraadt@


# 1.45 03-Apr-2010 schwarze

* outside literal context in mdoc(7), handle blank lines like .Pp
* a missing NAME section in mdoc(7) need not be fatal

ok deraadt@


# 1.44 02-Apr-2010 schwarze

merge 1.9.22, keeping local patches
* convert mdoc tokens from #define to enum
* fix a segfault with .Xo/.Xc in explicit blocks
* Thorn is \*(Th, not \*(TH; noticed by Joerg Sonnenberger


Revision tags: OPENBSD_4_7_BASE
# 1.43 02-Mar-2010 schwarze

Proper inter-sentence spacing for mdoc(7).
When a text line or a non-block macro line in the source code ends
in any of ".!?", consider that an end of sentence (EOS).
This makes Jason's rule "new sentence, new line" even more important.
Let the parser detect the EOS and insert a token into the AST.
Let the -Tascii frontend render the EOS token as a double space before
the next word.
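
A minimal sketch of the end-of-sentence test described above, assuming
the check looks only at the final character of the input line.
line_ends_sentence() is a hypothetical name; the real parser inserts a
token into the AST rather than returning a flag.

```c
#include <string.h>

/*
 * Hypothetical check, not the committed code: return 1 if the
 * input line ends in one of ".!?", which is treated as an end
 * of sentence (EOS), and 0 otherwise.
 */
int
line_ends_sentence(const char *line)
{
	size_t	 len = strlen(line);

	if (len == 0)
		return 0;
	return strchr(".!?", line[len - 1]) != NULL;
}
```

This also shows why "new sentence, new line" matters: a sentence ending
in the middle of an input line is not detected by such a check.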


# 1.42 18-Feb-2010 schwarze

sync to release 1.9.15:
* corrected .Vt handling (spotted by Joerg Sonnenberger)
* corrected .Xr argument handling (based on my patch)
* removed \\ escape sequence (because it is for low-level roff only)
* warn about trailing whitespace (suggested by jmc@)
* -Txhtml support
* and some general cleanup and doc improvements


# 1.41 01-Jan-2010 schwarze

.Bl may have .Sm as a child.
The comment in the source code and OK by kristaps@;
merged upstream in rev. 1.55.


# 1.40 23-Dec-2009 schwarze

sync to 1.9.13: minor fixes:

correctness/functionality:
- bugfix: properly ignore lines with only a dot in -man
- bugfix: .Bl -ohang doesn't allow -width, warn about this
- improve date string handling by new function mandoc_a2time
- some HTML improvements
- significant documentation additions in man.7 and mdoc.7

portability:
- replace __dead by __attribute__((noreturn))
- bugfix: correct .Dx rendering
- some more library names for NetBSD

simplicity:
- replace hand-rolled putchar(3)-loops by fwrite(3)
- replace single-character printf(3) by putchar(3)


# 1.39 22-Dec-2009 schwarze

sync to 1.9.12, mostly portability and refactoring:

correctness/functionality:
- bugfix: do not die when overstep hits the right margin
- new option: -fign-escape
- and various HTML features

portability:
- replace bzero(3) by memset(3), which is ANSI C
- replace err(3)/warn(3) by perror(3)/exit(3), which is ANSI C
- use argv[0] instead of __progname
- add time.h to various files for FreeBSD compilation

simplicity:
- do not allocate header/footer data dynamically in *_term.c
- provide and use malloc frontends that error out on failure

for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.38 27-Oct-2009 schwarze

sync to 1.9.11: adapt printing of dates to groff conventions,
NetBSD portability fixes and some minor bugfixes and feature enhancements;
also checked that my hyphenation code still works on top of this


# 1.37 21-Oct-2009 schwarze

sync to 1.9.9, featuring:
* -Thtml output mode
* roff scaling units
* and some minor fixes
for full changelogs, see http://bsd.lv/cgi-bin/cvsweb.cgi/


# 1.36 19-Oct-2009 schwarze

sync to 1.9.6: multiple improvements to references (.Rs)
* validate and order .Rs child nodes
* underline book title (.%B) and issuer (.%I)
* enclose title of article (.%T) in quotes
* avoid calling mdoc_verr directly, use a proper error code instead


# 1.35 22-Aug-2009 schwarze

sync to 1.9.1: .Rv and .Ex accept multiple arguments


# 1.34 09-Aug-2009 schwarze

sync to 1.8.4: correct error message to complain about .An line arguments


# 1.33 09-Aug-2009 schwarze

sync to 1.8.3: In .Bl -column, handle one column in excess,
but still issue a warning


# 1.32 09-Aug-2009 schwarze

sync to 1.8.2: more .Bl -column fixes, in particular:
1) -column implies -compact
2) do not die from fewer columns than specified (more are still fatal)


# 1.31 26-Jul-2009 schwarze

sync to 1.8.1: support .br and .sp


# 1.30 19-Jul-2009 schwarze

The abbreviation for .Bf -symbolic is .Bf Sy, not .Bf Sm;
"Gah! Fixed." kristaps@


# 1.29 18-Jul-2009 schwarze

sync to 1.8.0: a bad .St argument causes an error, not a warning


# 1.28 18-Jul-2009 schwarze

sync to 1.8.0: white space fixes, no code change


# 1.27 18-Jul-2009 schwarze

sync to 1.8.0: move mdoc_a2att, mdoc_a2st, and mdoc_a2lib to libmdoc


# 1.26 18-Jul-2009 schwarze

sync to 1.8.0: avoid duplicate warning about a malformed NAME section
when the next section following NAME is a custom section


# 1.25 18-Jul-2009 schwarze

sync to 1.8.0: .Nd is now a BFI, was an ELEM,
and use \(en instead of \- for .Nd


# 1.24 13-Jul-2009 schwarze

fix a trivial pasto that crept into 1.7.23; also submitted upstream


# 1.23 13-Jul-2009 schwarze

sync to 1.7.24: mdoc_nwarn/mdoc_nerr got mixed up;
fix from joerg at netbsd via kristaps@


# 1.22 12-Jul-2009 schwarze

sync to 1.7.24: make .In handling more similar to new groff


# 1.21 12-Jul-2009 schwarze

sync to 1.7.23: pass warning code to mdoc_pwarn() instead of warning message
define additional warning macro mdoc_nwarn()
remove obsolete warning functions mdoc_warn(), pwarn(), vwarn(), nwarn()
remove various now unused "enum mdoc_warn" and "enum mwarn"


# 1.20 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_perr() instead of error string
and use the so improved mdoc_nerr() at many places;
get rid of now unused static functions perr()


# 1.19 12-Jul-2009 schwarze

sync to 1.7.23: pass error code to mdoc_nerr() instead of error string
and use the so improved mdoc_nerr() at many places


# 1.18 12-Jul-2009 schwarze

sync to 1.7.23: unify the various "enum merr" into libman.h and libmdoc.h,
use it as a new argument to mdoc_err(), the same way as for man_err(),
and use string tables instead of switch statements to select error messages


# 1.17 12-Jul-2009 schwarze

sync to 1.7.23: third step to get rid of enum mdoc_warn:
mdoc_verr is not using enum mdoc_warn, so use it at a few more places


# 1.16 12-Jul-2009 schwarze

sync to 1.7.23: second step to get rid of enum mdoc_warn:
remove type from mdoc_vwarn arguments, and use this function where appropriate


# 1.15 08-Jul-2009 schwarze

sync to 1.7.21: unified escape sequence validation for mdoc and man
checking is still incomplete, but a bit better, in particular for man
now in sync with 1.7.22: the only 1.7.22 diff was already in


# 1.14 06-Jul-2009 schwarze

remove the WDEPCOL warning that became unused in 1.7.19
ok kristaps@ and contained in 1.7.21


Revision tags: OPENBSD_4_6_BASE
# 1.13 26-Jun-2009 schwarze

the forms \*x, \*(xx and \*[xxx] are not deprecated, so revert most of 1.8;
noticed by jmc@; ok kristaps@; to be included in 1.7.21


# 1.12 23-Jun-2009 schwarze

sync to 1.7.20: like for the -man case, add an nchild counter to the -mdoc
nodes, simplifying the validation code; no functional change


# 1.11 21-Jun-2009 schwarze

sync to 1.7.19: .Bl -column now correctly handles tail entries,
for example: .Bl -column -compact -offset ... args ...


# 1.10 21-Jun-2009 schwarze

sync to 1.7.19: kristaps@ rewrote post_bf to reduce nesting
no functional change


# 1.9 19-Jun-2009 schwarze

sync to 1.7.19: more elegant section handling


# 1.8 19-Jun-2009 schwarze

sync to 1.7.19: escape sequences of the forms \*x and \*(xx are deprecated


# 1.7 18-Jun-2009 schwarze

sync to 1.7.19: improved comment handling


# 1.6 18-Jun-2009 schwarze

sync to 1.7.16: The .Er macro may also be used in sections (3) and (9).


# 1.5 18-Jun-2009 schwarze

sync to 1.7.16: use dedicated warning types for list validation
instead of hand-rolled warnings


# 1.4 18-Jun-2009 schwarze

sync to 1.7.16: make a couple of macros callable, reserve "|",
and some tweaks to .Lk


# 1.3 17-Jun-2009 schwarze

sync to 1.7.16: rename static function printwarn to warn_print


# 1.2 14-Jun-2009 schwarze

sync to 1.7.16: comments, whitespace and spelling fixes; no functional change


# 1.1 06-Apr-2009 kristaps

Initial check-in of mandoc for formatting manuals. ok deraadt@