0d60e88b melifaro Aug. 10, 2022, 6:20 p.m.
This and the follow-up routing-related changes aim to remove or
 reduce `struct rt_addrinfo` usage and use the recently-landed nhop(9)
 KPI instead.
Traditionally, the `rt_addrinfo` structure has been used to propagate all necessary
information between the protocol/rtsock and the routing layer. Many
functions inside the routing subsystem use it internally. However, using
this structure has become somewhat complicated, as there are too many ways
of specifying a single state, and verifying data consistency is hard.
For example, are the routing flags consistent with the mask/gateway sockaddr pointers?
Is the mask really a host mask? Are the sockaddrs "valid" (e.g. properly zeroed, masked,
of proper length)? Are they mutable? Is the suggested interface specified
 by the interface index embedded into the sockaddr_dl gateway, or passed
 as the RTAX_IFP parameter, or directly provided by rti_ifp, or does it need to
 be derived from the ifa?
These (and other similar) questions have to be considered every time
 a function takes an `rt_addrinfo` pointer as an argument.

The new approach is to bring more control back to the protocols and
let them construct the desired routing objects themselves - in the end, it's the
protocol/subsystem that knows the desired outcome.

This specific diff changes the following:
* add explicit basic low-level radix operations:
 add_route() (renamed from add_route_nhop())
 delete_route() (factored from change_route_nhop())
 change_route() (renamed from change_route_nhop())
* remove the "info" parameter from change_route_conditional() as part
 of reducing rt_addrinfo usage in the internal KPIs
* add lookup_prefix_rt() wrapper for doing re-lookups after
 RIB lock/unlock
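
The re-lookup and conditional-change pattern mentioned above can be pictured
with a small userspace model. This is only a sketch under stated assumptions:
`toy_rib`, `toy_lookup_prefix()`, `toy_add_route()`, `toy_delete_route()` and
`toy_change_conditional()` are hypothetical names invented for illustration,
not the kernel functions or their signatures, and all locking is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel KPIs. */
#define TOY_RIB_SIZE 8

struct toy_route {
	uint32_t prefix;	/* network address, host byte order */
	uint8_t  plen;		/* prefix length */
	int      nhop_id;	/* stand-in for a nexthop pointer */
	int      in_use;
};

struct toy_rib {
	struct toy_route rt[TOY_RIB_SIZE];
};

/* Exact prefix lookup; meant to be repeated after re-taking the lock. */
static struct toy_route *
toy_lookup_prefix(struct toy_rib *rib, uint32_t prefix, uint8_t plen)
{
	for (size_t i = 0; i < TOY_RIB_SIZE; i++) {
		struct toy_route *rt = &rib->rt[i];

		if (rt->in_use && rt->prefix == prefix && rt->plen == plen)
			return (rt);
	}
	return (NULL);
}

static int
toy_add_route(struct toy_rib *rib, uint32_t prefix, uint8_t plen, int nhop_id)
{
	if (toy_lookup_prefix(rib, prefix, plen) != NULL)
		return (EEXIST);
	for (size_t i = 0; i < TOY_RIB_SIZE; i++) {
		if (!rib->rt[i].in_use) {
			rib->rt[i] = (struct toy_route){ prefix, plen, nhop_id, 1 };
			return (0);
		}
	}
	return (ENOSPC);
}

static int
toy_delete_route(struct toy_rib *rib, uint32_t prefix, uint8_t plen)
{
	struct toy_route *rt = toy_lookup_prefix(rib, prefix, plen);

	if (rt == NULL)
		return (ESRCH);
	rt->in_use = 0;
	return (0);
}

/* Change the nexthop only if the route still carries the one seen earlier. */
static int
toy_change_conditional(struct toy_rib *rib, uint32_t prefix, uint8_t plen,
    int expected_nhop, int new_nhop)
{
	/* Re-lookup: the table may have changed while the lock was dropped. */
	struct toy_route *rt = toy_lookup_prefix(rib, prefix, plen);

	if (rt == NULL)
		return (ESRCH);		/* route disappeared meanwhile */
	if (rt->nhop_id != expected_nhop)
		return (EAGAIN);	/* someone else changed it; retry */
	rt->nhop_id = new_nhop;
	return (0);
}
```

In this model the caller passes only the prefix and the nexthop it observed
earlier, rather than a catch-all request structure, so there is a single way
to express the request.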

Differential Revision: https://reviews.freebsd.org/D36070
MFC after:	2 weeks
07285bb4 glebius Aug. 10, 2022, 6:09 p.m.
This streamlines cloning of a socket from a listener.  Now we do not
drop the inpcb lock during creation of a new socket, do not do useless
state transitions, and put a fully initialized socket+inpcb+tcpcb into
the listen queue.

Before this change, we would first allocate the socket and inpcb+tcpcb via
tcp_usr_attach() as TCPS_CLOSED, link them into the global list of pcbs, unlock
the pcb and put the socket onto the incomplete queue (see 6f3caa6d815).  Then, after
sonewconn(), we would lock it again, transition into TCPS_SYN_RECEIVED,
insert it into the inpcb hash, and finalize initialization of the tcpcb.  And then we
would call into tcp_do_segment() and, upon transition to TCPS_ESTABLISHED, call
soisconnected().  This call would lock the listening socket once again
with a LOR protection sequence, then we would relocate the socket onto
the complete queue, and only then would it be ready for accept(2).

Reviewed by:		rrs, tuexen
Differential revision:	https://reviews.freebsd.org/D36064
8f5a0a2e glebius Aug. 10, 2022, 6:09 p.m.
as alternative KPI to sonewconn().  The latter has three stages:
- check the listening socket queue limits
- allocate a new socket
- call into protocol attach method
- link the new socket into the listen queue of the listening socket

The attach method, originally designed for the creation of a socket by the
socket(2) syscall, has slightly different semantics than the attach of a socket
cloned by a listener.  Make it possible for protocols to call into the
first stage, then perform a different attach, and then call into the
final stage.  The first stage, which checks limits and clones a socket,
is called solisten_clone(), and the function that enqueues the socket
is solisten_enqueue().
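
The split described above can be pictured with a small userspace model. This
is only a sketch: `toy_listen_clone()`, `toy_proto_attach()` and
`toy_listen_enqueue()` are hypothetical stand-ins, not the signatures of
solisten_clone() or solisten_enqueue(), and locking and reference counting
are left out.

```c
#include <stdlib.h>

/* Hypothetical userspace model of a listener and its queue. */
struct toy_socket {
	int attached;			/* set by the protocol attach step */
	struct toy_socket *next;
};

struct toy_listener {
	int qlen;			/* sockets currently queued */
	int qlimit;			/* maximum queue length */
	struct toy_socket *head;
};

/* First stage: check the queue limit and allocate a new socket. */
static struct toy_socket *
toy_listen_clone(struct toy_listener *l)
{
	if (l->qlen >= l->qlimit)
		return (NULL);
	return (calloc(1, sizeof(struct toy_socket)));
}

/* Protocol-specific attach, free to differ from the socket(2) path. */
static void
toy_proto_attach(struct toy_socket *so)
{
	so->attached = 1;
}

/* Final stage: link the initialized socket into the listen queue. */
static void
toy_listen_enqueue(struct toy_listener *l, struct toy_socket *so)
{
	so->next = l->head;
	l->head = so;
	l->qlen++;
}
```

The point of the model is the ordering: the protocol gets the chance to fully
initialize the socket between the clone and enqueue steps, so only a complete
object becomes visible on the queue.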

Reviewed by:		tuexen
Differential revision:	https://reviews.freebsd.org/D36063
c7a62c92 glebius Aug. 10, 2022, 6:09 p.m.
Reviewed by:		rrs, tuexen
Differential revision:	https://reviews.freebsd.org/D36062
d38a784b manu Aug. 10, 2022, 5:25 p.m.
Needed by the module.
fbc50a69 manu Aug. 10, 2022, 5:25 p.m.
Needed by the module.
87f642ac manu Aug. 10, 2022, 5:22 p.m.
Changing mode on a pin (input/output/pullup/pulldown) is a bit slow.
Improve this by caching what we can.
We need to check if the pin is in gpio mode; do that the first time
that we have a request for this pin and cache the result. We can't do
that at attach, as we are a child of rk_pinctrl and it hasn't finished
its attach by then.
Also cache the flags specific to the pinctrl (pullup or pulldown) if the
pin is in input mode.
Cache the registers that deal with input/output mode and output value. Also
remove some register reads when we change the direction of a pin or when we
change the output value since the bit changed in the registers only affect output
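
The check-on-first-request caching described above might look roughly like
the following userspace sketch. All names (`toy_gpio_softc`,
`toy_pinctrl_is_gpio()`, `toy_pin_is_gpio()`) are hypothetical, and the slow
pinctrl query is simulated with a counter rather than a real register access.

```c
#include <stdbool.h>

#define TOY_NPINS 32

/* Hypothetical per-device state; not the driver's real softc layout. */
struct toy_gpio_softc {
	bool     probed[TOY_NPINS];	/* has the mode been queried yet? */
	bool     is_gpio[TOY_NPINS];	/* cached answer */
	unsigned slow_reads;		/* counts simulated slow queries */
};

/* Stand-in for the slow query to the parent pin controller. */
static bool
toy_pinctrl_is_gpio(struct toy_gpio_softc *sc, unsigned pin)
{
	sc->slow_reads++;
	return ((pin % 2) == 0);	/* arbitrary rule for the model */
}

/* Query the mode lazily on first use, then serve it from the cache. */
static bool
toy_pin_is_gpio(struct toy_gpio_softc *sc, unsigned pin)
{
	if (pin >= TOY_NPINS)
		return (false);
	if (!sc->probed[pin]) {
		sc->is_gpio[pin] = toy_pinctrl_is_gpio(sc, pin);
		sc->probed[pin] = true;
	}
	return (sc->is_gpio[pin]);
}
```

Deferring the query to the first request sidesteps the attach-ordering
problem, since nothing is asked of the parent until a pin is actually used.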
abc7a4a0 andrew Aug. 10, 2022, 4:02 p.m.
Define PAGE_SIZE and PAGE_MASK based on PAGE_SHIFT. With this we only
need to set one value to change the page size.

While here remove the unused PAGE_MASK_* macros.
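
As an illustration of deriving both values from the shift, here is a minimal
sketch. The `TOY_` prefix is used only to avoid clashing with system headers,
the 4 KiB page size is an assumption picked for the example, and the helper
functions are hypothetical.

```c
#include <assert.h>

/* Only this value needs to change to select another page size. */
#define TOY_PAGE_SHIFT	12	/* assumed 4 KiB pages, for illustration */
#define TOY_PAGE_SIZE	(1UL << TOY_PAGE_SHIFT)
#define TOY_PAGE_MASK	(TOY_PAGE_SIZE - 1)

static_assert(TOY_PAGE_SIZE == 4096, "size follows from the shift");
static_assert(TOY_PAGE_MASK == 0xfff, "mask follows from the size");

/* Round an address down to the start of its page. */
static inline unsigned long
toy_trunc_page(unsigned long addr)
{
	return (addr & ~TOY_PAGE_MASK);
}

/* Round an address up to the next page boundary. */
static inline unsigned long
toy_round_page(unsigned long addr)
{
	return ((addr + TOY_PAGE_MASK) & ~TOY_PAGE_MASK);
}
```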

Sponsored by:	The FreeBSD Foundation
7dc4d511 emaste Aug. 10, 2022, 2:39 p.m.
Fixes INVARIANTS build with Clang 15, which previously failed due to
set-but-not-used variable warnings.

Reviewed by:	dim
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D36097
d88eb465 glebius Aug. 10, 2022, 2:32 p.m.
Imagine we are in SYN-RCVD state and two ACKs arrive at the same time,
both valid, e.g. coming from the same host and with valid sequence numbers.

The first packet would locate the listening socket in the inpcb database,
write-lock it and start expanding the syncache entry into a socket.
Meanwhile, the second packet would wait on the write lock of the listening
socket.  The first packet would create a new ESTABLISHED socket, free the
syncache entry and unlock the listening socket.  The second packet would
then call into syncache_expand(), but this time it would fail, as there
is no longer a syncache entry.  The second packet would generate an RST,
effectively resetting the remote connection.

It seems to me that it is impossible to solve this problem by
just rearranging locks, as the race happens at the wire level.

To solve the problem, for an ACK packet that arrived on a listening socket
and failed the syncache lookup, perform a second, non-wildcard lookup right
away.  That lookup may find the newborn socket.  Otherwise, we do indeed
send an RST.
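
The decision sequence described above can be modelled in userspace as
follows. This is a deliberately tiny sketch: the single-entry tables, the
`toy_` names and the verdict enum are all hypothetical, and the real code
paths (syncache_expand(), the inpcb lookups, locking) are not reproduced.

```c
#include <stdbool.h>

/* Hypothetical model: one syncache slot and one connection slot. */
struct toy_tuple {
	unsigned       faddr;	/* foreign address */
	unsigned short fport;	/* foreign port */
};

struct toy_tables {
	bool             syncache_valid;
	struct toy_tuple syncache;
	bool             conn_valid;
	struct toy_tuple conn;
};

enum toy_verdict { TOY_NEW_CONN, TOY_EXISTING_CONN, TOY_SEND_RST };

static bool
toy_tuple_eq(const struct toy_tuple *a, const struct toy_tuple *b)
{
	return (a->faddr == b->faddr && a->fport == b->fport);
}

/* Handle an ACK that matched the listening (wildcard) socket. */
static enum toy_verdict
toy_ack_on_listener(struct toy_tables *t, struct toy_tuple from)
{
	if (t->syncache_valid && toy_tuple_eq(&t->syncache, &from)) {
		/* Expand the syncache entry into a full connection. */
		t->conn = from;
		t->conn_valid = true;
		t->syncache_valid = false;
		return (TOY_NEW_CONN);
	}
	/* Syncache miss: try an exact (non-wildcard) lookup before RST. */
	if (t->conn_valid && toy_tuple_eq(&t->conn, &from))
		return (TOY_EXISTING_CONN);
	return (TOY_SEND_RST);
}
```

Feeding the same tuple twice models the two racing ACKs: the first expands
the entry and the second is matched to the freshly created connection instead
of triggering a reset.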

Tested by:		kp
Reviewed by:		tuexen, rrs
PR:			265154
Differential revision:	https://reviews.freebsd.org/D36066
f998535a melifaro Aug. 10, 2022, 2:19 p.m.

The current assumption is that kernel-handled rtadv prefixes along with
 the interface address prefixes are the only prefixes considered in
 the ND neighbor eligibility code.
Change this by allowing any non-gateway route to be eligible. This
 will allow DHCPv6-controlled routes to be correctly handled by
 the ND code.
Refactor nd6_is_new_addr_neighbor() to enable more deterministic
 performance in the "found" case and remove the unneeded
 V_rt_add_addr_allfibs handling logic.
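
The eligibility rule above can be sketched as a longest-prefix match followed
by a gateway check. This is a userspace model under stated assumptions: the
`toy_route6` structure, the `TOY_NHF_GATEWAY` flag and the interface-index
comparison are hypothetical and do not mirror the actual
nd6_is_new_addr_neighbor() implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NHF_GATEWAY	0x1	/* hypothetical "route has a gateway" flag */

struct toy_route6 {
	uint8_t  prefix[16];
	uint8_t  plen;		/* 0..128 */
	unsigned flags;
	int      ifindex;	/* outgoing interface */
};

/* Compare the first plen bits of addr and prefix. */
static bool
toy_prefix_match(const uint8_t *addr, const uint8_t *prefix, uint8_t plen)
{
	size_t full = plen / 8;
	unsigned rem = plen % 8;

	for (size_t i = 0; i < full; i++)
		if (addr[i] != prefix[i])
			return (false);
	if (rem != 0) {
		uint8_t mask = (uint8_t)(0xff << (8 - rem));

		if ((addr[full] & mask) != (prefix[full] & mask))
			return (false);
	}
	return (true);
}

/* An address is a neighbor if its best route is on-link via this interface. */
static bool
toy_is_neighbor(const struct toy_route6 *rt, size_t n, const uint8_t *addr,
    int ifindex)
{
	const struct toy_route6 *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!toy_prefix_match(addr, rt[i].prefix, rt[i].plen))
			continue;
		if (best == NULL || rt[i].plen > best->plen)
			best = &rt[i];
	}
	return (best != NULL && (best->flags & TOY_NHF_GATEWAY) == 0 &&
	    best->ifindex == ifindex);
}
```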

Reviewed By: kbowling
Differential Revision: https://reviews.freebsd.org/D23695
MFC after:	1 month
8d6b3a85 manu Aug. 10, 2022, 1:47 p.m.
Node names for gpio bank were made generic in Linux 5.16 so stop
using them to map the gpio controller to the pin controller bank unit.

Sponsored by:	Beckhoff Automation GmbH & Co. KG
c9ccf3a3 manu Aug. 10, 2022, 12:32 p.m.
Sponsored by:   Beckhoff Automation GmbH & Co. KG
9066e824 manu Aug. 10, 2022, 12:31 p.m.
e67e8565 manu Aug. 10, 2022, 12:29 p.m.
Sponsored by:   Beckhoff Automation GmbH & Co. KG