API
Standard usage
Newton.newtonsolve — Function
newtonsolve(rf!, x0::AbstractVector, [cache::NewtonCache]; tol=1.e-6, maxiter=100)
Solve the nonlinear equation (system) r(x) = 0 with the Newton-Raphson method by calling the mutating residual function rf!(r, x), with signature rf!(r::T, x::T)::T where T <: AbstractVector. x0 is the initial guess, and the optional cache can be preallocated by calling NewtonCache(x0). Note that x0 is not modified unless it is aliased to getx(cache). tol is the tolerance for norm(r) and maxiter the maximum number of iterations.
Returns x, drdx, converged::Bool, where drdx is the derivative of r with respect to x at the returned x.
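For example, a minimal sketch of the mutating interface; the residual function, starting guess, and tolerance below are invented for illustration:
using Newton

# Invented residual: r(x) = x.^2 - (1, 4), with root x = [1, 2]
function rf!(r, x)
    r .= x .^ 2 .- (1.0, 4.0)   # broadcasting against a tuple avoids allocating
    return r
end

x0 = [0.5, 3.0]          # initial guess (left unmodified here)
cache = NewtonCache(x0)  # optional; reuse it to avoid allocations across solves
x, drdx, converged = newtonsolve(rf!, x0, cache; tol = 1e-8)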
newtonsolve(rf, x0::T; tol=1.e-6, maxiter=100, [linsolver]) where {T <: Union{Number, StaticArrays.SVector, Tensors.Vec, Tensors.SecondOrderTensor}}
Solve the nonlinear equation (system) r(x) = 0 with the Newton-Raphson method by calling the residual function r = rf(x), with signature rf(x::T)::T. x0::T is the initial guess, tol the tolerance for norm(r), and maxiter the maximum number of iterations.
A non-standard linsolver can optionally be specified; please see Linear solvers for more information.
Returns x, drdx, converged::Bool, where drdx is the derivative of r with respect to x at the returned x.
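For example, a sketch using a static vector (the residual is invented for illustration):
using Newton, StaticArrays

rf(x) = x .^ 2 .- SVector(1.0, 4.0)   # invented residual, root at (1, 2)

x, drdx, converged = newtonsolve(rf, SVector(0.5, 3.0))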
Newton.NewtonCache — Type
NewtonCache(x::AbstractVector; [linsolver])
Create the cache used by newtonsolve and linsolve!. Only a copy of x will be used.
A special linsolver can optionally be given; please see Linear solvers for more information.
Newton.getx — Function
getx(cache::NewtonCache)
Extract the unknown values from the cache. This can be used to avoid allocations when defining the initial guess.
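For example, a sketch where the initial guess aliases the cache's internal vector to avoid an extra allocation (the residual is invented for illustration):
using Newton

rf!(r, x) = (r .= x .^ 3 .- 27.0; r)   # invented residual, root at x = 3

cache = NewtonCache(zeros(1))
x0 = getx(cache)   # alias the cache's unknowns
x0 .= 2.0          # fill in the initial guess in-place
x, drdx, converged = newtonsolve(rf!, x0, cache)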
Newton.logging_mode — Function
Newton.logging_mode(; enable=true)
Helper to turn logging of iterations in Newton.jl on (enable=true) or off (enable=false). Internally, this changes how Newton.@if_logging expr is evaluated: when logging mode is enabled, expr is evaluated; otherwise, expr is ignored.
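For example:
using Newton

Newton.logging_mode(enable = true)    # subsequent solves print iteration info
# ... run newtonsolve here ...
Newton.logging_mode(enable = false)   # silence logging again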
Available linear solvers
Newton.AbstractLinsolver — Type
Newton.jl comes with the following linear solvers:
Newton.StandardLinsolver — Type
Newton.StandardLinsolver()
This is the default linear solver, which provides safe operations and doesn't require any special packages to be loaded.
Newton.UnsafeFastLinsolver — Type
UnsafeFastLinsolver()
This is a special linear solver that calculates the inverse recursively using the analytical inverses of 2x2 and 3x3 matrices. This gives exceptional performance for small matrices, but it suffers from numerical errors and can be sensitive to badly conditioned matrices. When using this method, it may be advisable to (adaptively) try a slower method if the Newton iterations fail to converge.
Newton.RecursiveFactorizationLinsolver — Type
RecursiveFactorizationLinsolver()
This linear solver utilizes the LU decomposition in RecursiveFactorization.jl, which gives faster performance than the StandardLinsolver. While not as fast as UnsafeFastLinsolver, it is always accurate. It is available via an extension, requiring the user to load RecursiveFactorization.jl separately.
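As a sketch, a non-default solver is selected when constructing the cache (the residual below is invented for illustration):
using Newton

rf!(r, x) = (r .= 2.0 .* x .- 1.0; r)   # invented linear residual, root x = 0.5

x0 = zeros(3)
cache = NewtonCache(x0; linsolver = Newton.UnsafeFastLinsolver())
x, drdx, converged = newtonsolve(rf!, x0, cache)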
Fast inverse
Newton.inv! — Function
Newton.inv!(A::Matrix, cache::NewtonCache)
In-place inverse which, depending on the linsolver in cache, can be much more efficient than inv(A). Note, however, that A is used as workspace and its values should not be used after calling Newton.inv!. In some cases A becomes its inverse and the output is aliased to A, but this behavior does not hold in general and should not be relied upon.
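For example (the matrix and cache size are invented for illustration):
using Newton

A = [2.0 1.0; 0.0 4.0]
cache = NewtonCache(zeros(2))
Ainv = Newton.inv!(A, cache)   # use Ainv; treat A as scratch data afterwards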
Newton.sinv — Function
sinv(a::SMatrix{d, d}) where {d}
Fast, but numerically unstable, implementation of the inverse of a statically sized matrix. Beneficial up to d ≈ 50, but can give large floating-point errors for badly conditioned and/or large matrices.
About 4 to 5 times faster than StaticArrays for sizes ∈ [5, 20], and at least twice as fast up to 50, according to benchmarks on a MacBook with an M3 processor.
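For example (the matrix is invented and well conditioned):
using Newton, StaticArrays

a = @SMatrix [2.0 1.0; 1.0 3.0]
ainv = Newton.sinv(a)
ainv * a ≈ one(a)   # true for this well-conditioned matrix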
Newton.sinv! — Function
sinv!(K::Matrix)
Invert K in-place using the unsafe static implementation up to a size of 20x20, falling back to the generic LinearAlgebra.inv for larger sizes.
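For example (matrix invented for illustration):
using Newton

K = [3.0 1.0; 0.0 2.0]
Newton.sinv!(K)   # K now holds its own inverse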
Use inside AD-calls
Newton.ad_newtonsolve — Function
ad_newtonsolve(rf, x0, rf_dual_args::Tuple; kwargs...)
Solve rf(x, y₁(z), y₂(z), ...) = 0 to find x(y₁(z), y₂(z), ...), given the initial guess x0 as a non-dual number and the dual input numbers rf_dual_args = (y₁(z), y₂(z), ...), which all carry derivatives with respect to the same variable z. Returns x of Dual type, seeded such that it corresponds to the derivative dx/dz = ∂x/∂yᵢ ⋆ dyᵢ/dz, where ⋆ is the appropriate contraction.
Implementation: uses the adjoint, i.e. dr/dyᵢ = 0 = ∂r/∂x ⋆ ∂x/∂yᵢ + ∂r/∂yᵢ ⇒ ∂x/∂yᵢ = -[∂r/∂x]⁻¹ ⋆ ∂r/∂yᵢ, such that Newton iterations with dual numbers are avoided.
ad_newtonsolve(rf, x0, rf_args::Tuple; kwargs...)
If rf_args does not contain dual numbers, the standard Newton solver is simply called on f(x) = rf(x, y₁, y₂, ...), and the solution x is returned. This allows writing generic code where the gradient is sometimes required, but not always.
Example
using Newton, Tensors, ForwardDiff, BenchmarkTools
rf(x::Vec, a::Number) = a * x - (x ⋅ x) * x
function myfun!(outputs::Vector, inputs::Vector)
x0 = ones(Vec{2}) # Initial guess
a = inputs[1] + 2 * inputs[2]
x, converged = ad_newtonsolve(rf, x0, (a,))
outputs[1] = x ⋅ x
outputs[2] = a * x[1]
return outputs
end
out = zeros(2); inp = [1.2, 0.5]
ForwardDiff.jacobian(myfun!, out, inp)
gives
2×2 Matrix{Float64}:
1.0 2.0
1.57321 3.14643
This approach is faster than naively differentiating a call that includes a newtonsolve, as we avoid iterating with Dual numbers.
using Newton, Tensors, ForwardDiff, BenchmarkTools
rf(x::Vec, a::Number) = a * x - (x ⋅ x) * x
function myfun!(outputs::Vector, inputs::Vector)
x0 = ones(Vec{2}) # Initial guess
a = inputs[1] + 2 * inputs[2]
x, converged = ad_newtonsolve(rf, x0, (a,))
outputs[1] = x ⋅ x
outputs[2] = a * x[1]
return outputs
end
function myfun2!(outputs::Vector, inputs::Vector)
x0 = ones(Vec{2}) # Initial guess
a = inputs[1] + 2 * inputs[2]
x, _, converged = newtonsolve(x -> rf(x, a), x0)
outputs[1] = x ⋅ x
outputs[2] = a * x[1]
return outputs
end
J = zeros(2,2)
out = zeros(2); inp = [1.2, 0.5]
cfg = ForwardDiff.JacobianConfig(myfun!, out, inp)
cfg2 = ForwardDiff.JacobianConfig(myfun2!, out, inp)
# Call the standard function using newtonsolve
@btime myfun2!($out, $inp); # 143.381 ns (0 allocations: 0 bytes)
# Differentiate through newtonsolve
@btime ForwardDiff.jacobian!($J, $myfun2!, $out, $inp, $cfg2); # 285.662 ns (0 allocations: 0 bytes)
# Call the function which uses ad_newtonsolve (no difference)
@btime myfun!($out, $inp); # 143.381 ns (0 allocations: 0 bytes)
# Differentiate through ad_newtonsolve
@btime ForwardDiff.jacobian!($J, $myfun!, $out, $inp, $cfg); # 183.359 ns (0 allocations: 0 bytes)
showing that, in this microbenchmark, the differentiated call gets quite close to the computational time of a regular non-differentiating call.
Internal API
Newton.linsolve! — Function
linsolve!(K::AbstractMatrix, b::AbstractVector, cache::NewtonCache)
Solves the linear equation system Kx = b, mutating both K and b. b is mutated to the solution x.
linsolve!(linsolver, K::AbstractMatrix, b::AbstractVector, cache::NewtonCache)
The default implementation calls this signature, which should be overloaded for a different linsolver passed to cache upon construction.
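A hedged sketch of hooking in a custom solver; MyBackslashSolver is a hypothetical name, and subtyping Newton.AbstractLinsolver is an assumption not confirmed on this page:
using Newton, LinearAlgebra

# Hypothetical solver type (subtyping AbstractLinsolver is an assumption)
struct MyBackslashSolver <: Newton.AbstractLinsolver end

# Overload the internal hook: mutate b into the solution of K x = b
function Newton.linsolve!(::MyBackslashSolver, K::AbstractMatrix, b::AbstractVector, cache::Newton.NewtonCache)
    b .= K \ b
    return b
end

cache = NewtonCache(zeros(2); linsolver = MyBackslashSolver())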
Newton.linsolve — Function
linsolve(linsolver, K, b)
Solve the equation (system) b = K ⋆ x, where ⋆ is the appropriate contraction between K and x, and b and x have the same type and size. The following combinations of K and x are supported:
| K | b, x |
| --- | --- |
| Number | Number |
| SMatrix{N, N} | SVector{N} |
| SecondOrderTensor{d} | AbstractTensor{o, d} |
| FourthOrderTensor{d} | AbstractTensor{o, d} |
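For example, with the static-array combination from the table (values invented for illustration):
using Newton, StaticArrays

K = SMatrix{2, 2}(2.0, 0.0, 0.0, 4.0)   # column-major entries of [2 0; 0 4]
b = SVector(2.0, 8.0)
x = Newton.linsolve(Newton.StandardLinsolver(), K, b)   # ≈ SVector(1.0, 2.0)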
Newton.extract_submatrix — Function
extract_submatrix(::Type{SMatrix{d1, d2}}, m::SMatrix, start_row, start_col)
Efficiently extract s::SMatrix{d1, d2} such that s == m[start_row:(start_row + d1 - 1), start_col:(start_col + d2 - 1)].
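For example (values invented for illustration):
using Newton, StaticArrays

m = SMatrix{3, 3}(1.0:9.0...)
s = Newton.extract_submatrix(SMatrix{2, 2}, m, 2, 2)   # equals m[2:3, 2:3]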
Newton.join_submatrices — Function
join_submatrices(a11, a12, a21, a22)
Efficiently join the submatrices to return a::SMatrix = [a11 a12; a21 a22].
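For example, joining invented 2x2 blocks into a 4x4 matrix:
using Newton, StaticArrays

a11 = @SMatrix ones(2, 2)
a12 = @SMatrix zeros(2, 2)
a21 = @SMatrix zeros(2, 2)
a22 = @SMatrix ones(2, 2)
a = Newton.join_submatrices(a11, a12, a21, a22)   # 4×4 block matrix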