59641728 melifaro Feb. 22, 2021, 11:37 p.m.
The routing stack control depends on quite a tree of functions to
 determine the proper attributes of a route such as a source address (ifa)
 or transmit ifp of a route.

When actually inserting a route, the stack needs to ensure that ifa and ifp
 points to the entities that are still valid.
Validity means slightly more than just pointer validity - stack need guarantee
 that the provided objects are not scheduled for deletion.

Currently, callers either ignore it (most ifp parts, historically) or try to
 use refcounting (ifa parts). Even in case of ifa refcounting it's not always
 implemented in fully-safe manner. For example, some codepaths inside
 rt_getifa_fib() are referencing ifa while not holding any locks, resulting in
 possibility of referencing scheduled-for-deletion ifa.

Instead of trying to fix all of the callers by enforcing proper refcounting,
 switch to a different model.
As the rib_action() already requires epoch, do not require any stability guarantees
 other than the epoch-provided one.
Use newly-added conditional versions of the refcounting functions
 (ifa_try_ref(), if_try_ref()) and fail if any of these fails.

Reviewed by:	donner
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D28837
7563019b melifaro Feb. 22, 2021, 11:37 p.m.
When we have an ifp pointer and the code is running inside epoch,
 epoch guarantees the pointer will not be freed.
However, the following case can still happen:

* in thread 1 we drop to refcount=0 for ifp and schedule its deletion.
* in thread 2 we use this ifp and reference it
* destroy callout kicks in
* unhappy user reports a bug

This can happen with the current implementation of ifnet_byindex_ref(),
 as we're not holding any locks preventing ifnet deletion by a parallel thread.

To address it, add if_try_ref(), allowing to return failure when
 referencing ifp with refcount=0.
Additionally, enforce existing if_ref() is with KASSERT to provide a
 cleaner error in such scenarios.

Finally, fix ifnet_byindex_ref() by using if_try_ref() and returning NULL
 if the latter fails.

MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D28836
537f92cd markj Feb. 22, 2021, 11:22 p.m.
The scheme used for early slab allocations changed in commit a81c400e75.

Reported by:	alc
Reviewed by:	alc
MFC after:	1 week
d510bf13 mav Feb. 22, 2021, 10:33 p.m.
The previous implementation was reported to try to coalesce packets
in situations when it should not, that resulted in assertion later.
This implementation better checks the first packet of the chain for
the coallescing elligibility.

MFC after:	3 days
963cf6cb jrtc27 Feb. 22, 2021, 10:27 p.m.
61c50cbc tsoome Feb. 21, 2021, 10:45 a.m.
If we start with console set to comconsole, the local
console (vidconsole, efi) is never initialized and attempt to
use the data can render the loader hung.

Reported by:	Kamigishi Rei
MFC after: 3 days
f11e9f32 imp Feb. 22, 2021, 9:39 p.m.
"in" got dropped when I shuffled things around.

Noticed by: rpokala@
MFC After: 3 days
8c09ecb2 imp Feb. 22, 2021, 9:20 p.m.
Add details about when armv6 and armv7 support was added.
23e875fd markj Feb. 22, 2021, 8:50 p.m.
Otherwise, on a powerpc64 NUMA system with hashed page tables, the
first-level superpage reservation size is large enough that the value of
the kernel KVA arena import quantum, KVA_NUMA_IMPORT_QUANTUM, is
negative and gets sign-extended when passed to vmem_set_import().  This
results in a boot-time hang on such platforms.

Reported by:	bdragon
MFC after:	3 days
5ce2d4a1 rew Feb. 22, 2021, 8:31 p.m.
Add /var/run/bhyve/ to BSD.var.dist so we don't have to call mkdir when
creating the unix domain socket for a given bhyve vm.

The path to the unix domain socket for a bhyve vm will now be
/var/run/bhyve/vmname instead of /var/run/bhyve/checkpoint/vmname

Move BHYVE_RUN_DIR from snapshot.c to snapshot.h so it can be shared
to bhyvectl(8).

Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D28783
811e27fa jamie Feb. 22, 2021, 8:27 p.m.
Add the PD_KILL flag that instructs prison_deref() to take steps
to actively kill a prison and its descendents, namely marking it
PRISON_STATE_DYING, clearing its PR_PERSIST flag, and killing any
attached processes.

This replaces a similar loop in sys_jail_remove(), bringing the
operation under the same single hold on allprison_lock that it already
has. It is also used to clean up failed jail (re-)creations in
kern_jail_set(), which didn't generally take all the proper steps.

Differential Revision:  https://reviews.freebsd.org/D28473
ab77cc9e imp Feb. 22, 2021, 8:20 p.m.
Our uefi support has included environment variable support for several years
now. Remove the bogus blanket statement saying we don't support them.

MFC After: 3 days
d1498777 dim Feb. 22, 2021, 8:01 p.m.
After 0ee0dbfb0d26cf4bc37f24f12e76c7f532b0f368 where I imported a more
recent libcxxrt snapshot, the variables 'rtn' and 'has_ret' could in
some cases be used while still uninitialized. Most obviously this would
lead to a jemalloc complaint about a bad free(), aborting the program.

Fix this by initializing a bunch variables in their declarations. This
change has also been sent upstream, with some additional changes to be
used in their testing framework.

PR:		253226
MFC after:	3 days
a805ffbc cy Feb. 22, 2021, 7:20 p.m.
LARGE_NAT is a C macro that increases
	NAT_SIZE from 127 to 2047,
	RDR_SIZE from 127 to 2047,
	HOSTMAP_SIZE from 2047 to 8191,
	NAT_TABLE_MAX from 30000 to 180000, and
	NAT_TABLE_SZ from 2047 to 16383.

These values can be altered at runtime using the ipf -T command however
some adminstrators of large firewalls rebuild the kernel to enable
LARGE_NAT at boot. This revision adds the tunable net.inet.ipf.large_nat
which allows an administrator to set this option at boot instead of build
time. Setting the LARGE_NAT macro to 1 is unaffected allowing build-time
users to continue using the old way.
e2ad10e8 cy Feb. 22, 2021, 7:20 p.m.
As of ipfilter 5.1.2 the IPv4 and IPv6 rules tables have been merged.
The ipf(8) -6 option has been a NOP since then. Currently the additional
ipf -6 load statement in rc.d/ipfilter simply added the second ipfilter
rules file to the table already populated by the previous ipf command.
Plenty of time has passed since ipfilter 5.1.2 was imported. It is time to
remove the option from rc.conf and the rc script.

Differential Revision:	https://reviews.freebsd.org/D28615