API
Standard usage
Newton.newtonsolve — Function
newtonsolve(rf!, x0::AbstractVector, [cache::NewtonCache]; tol=1.e-6, maxiter=100)
Solve the nonlinear equation (system) r(x) = 0 with the Newton-Raphson method by calling the mutating residual function rf!(r, x), with signature rf!(r::T, x::T)::T where T <: AbstractVector. x0 is the initial guess, and the optional cache can be preallocated by calling NewtonCache(x0). Note that x0 is not modified unless it is aliased to getx(cache). tol is the tolerance for norm(r) and maxiter the maximum number of iterations.
Returns x, drdx, converged::Bool, where drdx is the derivative of r with respect to x at the returned x.
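For example, a minimal sketch of the mutating interface; the residual function, starting guess, and tolerance below are invented for illustration:
using Newton

# Invented residual: r(x) = x.^2 - (1, 4), with root x = [1, 2]
function rf!(r, x)
    r .= x .^ 2 .- (1.0, 4.0)   # broadcasting against a tuple avoids allocating
    return r
end

x0 = [0.5, 3.0]          # initial guess (left unmodified here)
cache = NewtonCache(x0)  # optional; reuse it to avoid allocations across solves
x, drdx, converged = newtonsolve(rf!, x0, cache; tol = 1e-8)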
newtonsolve(rf, x0::T; tol=1.e-6, maxiter=100, [linsolver]) where {T <: Union{Number, StaticArrays.SVector, Tensors.Vec, Tensors.SecondOrderTensor}}
Solve the nonlinear equation (system) r(x) = 0 with the Newton-Raphson method by calling the residual function r = rf(x), with signature rf(x::T)::T. x0::T is the initial guess, tol the tolerance for norm(r), and maxiter the maximum number of iterations.
A non-standard linsolver can optionally be specified; please see Linear solvers for more information.
Returns x, drdx, converged::Bool, where drdx is the derivative of r with respect to x at the returned x.
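For example, a sketch using a static vector (the residual is invented for illustration):
using Newton, StaticArrays

rf(x) = x .^ 2 .- SVector(1.0, 4.0)   # invented residual, root at (1, 2)

x, drdx, converged = newtonsolve(rf, SVector(0.5, 3.0))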
Newton.NewtonCache — Type
NewtonCache(x::AbstractVector; [linsolver])
Create the cache used by newtonsolve and linsolve!. Only a copy of x will be used.
A special linsolver can optionally be given; please see Linear solvers for more information.
Newton.getx — Function
getx(cache::NewtonCache)
Extract the unknown values from the cache. This can be used to avoid allocations when defining the initial guess.
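For example, a sketch where the initial guess aliases the cache's internal vector to avoid an extra allocation (the residual is invented for illustration):
using Newton

rf!(r, x) = (r .= x .^ 3 .- 27.0; r)   # invented residual, root at x = 3

cache = NewtonCache(zeros(1))
x0 = getx(cache)   # alias the cache's unknowns
x0 .= 2.0          # fill in the initial guess in-place
x, drdx, converged = newtonsolve(rf!, x0, cache)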
Newton.logging_mode — Function
Newton.logging_mode(; enable=true)
Helper to turn logging of iterations in Newton.jl on (enable=true) or off (enable=false). Internally, this changes how Newton.@if_logging expr is evaluated: when logging mode is enabled, expr is evaluated; otherwise, expr is ignored.
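For example:
using Newton

Newton.logging_mode(enable = true)    # subsequent solves print iteration info
# ... run newtonsolve here ...
Newton.logging_mode(enable = false)   # silence logging again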
Available linear solvers
Newton.AbstractLinsolver — Type
Newton.jl comes with the following linear solvers:
Newton.StandardLinsolver — Type
Newton.StandardLinsolver()
This is the default linear solver, which provides safe operations and doesn't require any special packages to be loaded.
Newton.UnsafeFastLinsolver — Type
UnsafeFastLinsolver()
This is a special linear solver that calculates the inverse recursively using the analytical inverses of 2x2 and 3x3 matrices. This gives exceptional performance for small matrices, but it suffers from numerical errors and can be sensitive to badly conditioned matrices. When using this method, it may be advisable to (adaptively) try a slower method if the Newton iterations fail to converge.
Newton.RecursiveFactorizationLinsolver — Type
RecursiveFactorizationLinsolver()
This linear solver utilizes the LU decomposition in RecursiveFactorization.jl, which gives faster performance than the StandardLinsolver. While not as fast as UnsafeFastLinsolver, it is always accurate. It is available via an extension, requiring the user to load RecursiveFactorization.jl separately.
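As a sketch, a non-default solver is selected when constructing the cache (the residual below is invented for illustration):
using Newton

rf!(r, x) = (r .= 2.0 .* x .- 1.0; r)   # invented linear residual, root x = 0.5

x0 = zeros(3)
cache = NewtonCache(x0; linsolver = Newton.UnsafeFastLinsolver())
x, drdx, converged = newtonsolve(rf!, x0, cache)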
Fast inverse
Newton.inv! — Function
Newton.inv!(A::Matrix, cache::NewtonCache)
In-place inverse which, depending on the linsolver in cache, can be much more efficient than inv(A). Note, however, that A is used as workspace and its values should not be used after calling Newton.inv!. In some cases A becomes its inverse and the output is aliased to A, but this behavior does not hold in general and should not be relied upon.
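For example (the matrix and cache size are invented for illustration):
using Newton

A = [2.0 1.0; 0.0 4.0]
cache = NewtonCache(zeros(2))
Ainv = Newton.inv!(A, cache)   # use Ainv; treat A as scratch data afterwards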
Newton.sinv — Function
sinv(a::SMatrix{d, d}) where {d}
Fast, but numerically unstable, implementation of the inverse of a statically sized matrix. Beneficial up to d ≈ 50, but can give large floating-point errors for badly conditioned and/or large matrices.
About 4 to 5 times faster than StaticArrays for sizes ∈ [5, 20], and at least twice as fast up to 50, according to benchmarks on a MacBook with an M3 processor.
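For example (the matrix is invented and well conditioned):
using Newton, StaticArrays

a = @SMatrix [2.0 1.0; 1.0 3.0]
ainv = Newton.sinv(a)
ainv * a ≈ one(a)   # true for this well-conditioned matrix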
Newton.sinv! — Function
sinv!(K::Matrix)
Invert K in-place using the unsafe static implementation up to a size of 20x20, falling back to the generic LinearAlgebra.inv for larger sizes.
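For example (matrix invented for illustration):
using Newton

K = [3.0 1.0; 0.0 2.0]
Newton.sinv!(K)   # K now holds its own inverse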
Use inside AD-calls
Newton.ad_newtonsolve — Function
ad_newtonsolve(rf, x0, rf_dual_args::Tuple; kwargs...)
Solve rf(x, y₁(z), y₂(z), ...) = 0 to find x(y₁(z), y₂(z), ...), given the initial guess x0 as a non-dual number and the dual input numbers rf_dual_args = (y₁(z), y₂(z), ...), which all carry derivatives with respect to the same variable z. Returns x of Dual type, seeded such that it corresponds to the derivative dx/dz = ∂x/∂yᵢ ⋆ dyᵢ/dz, where ⋆ is the appropriate contraction.
Implementation: uses the adjoint, i.e. dr/dyᵢ = 0 = ∂r/∂x ⋆ ∂x/∂yᵢ + ∂r/∂yᵢ ⇒ ∂x/∂yᵢ = -[∂r/∂x]⁻¹ ⋆ ∂r/∂yᵢ, such that Newton iterations with dual numbers are avoided.
ad_newtonsolve(rf, x0, rf_args::Tuple; kwargs...)
If rf_args does not contain dual numbers, the standard Newton solver is simply called on f(x) = rf(x, y₁, y₂, ...), and the solution x is returned. This allows writing generic code where the gradient is sometimes required, but not always.
Example
using Newton, Tensors, ForwardDiff, BenchmarkTools
rf(x::Vec, a::Number) = a * x - (x ⋅ x) * x
function myfun!(outputs::Vector, inputs::Vector)
x0 = ones(Vec{2}) # Initial guess
a = inputs[1] + 2 * inputs[2]
x, converged = ad_newtonsolve(rf, x0, (a,))
outputs[1] = x ⋅ x
outputs[2] = a * x[1]
return outputs
end
out = zeros(2); inp = [1.2, 0.5]
ForwardDiff.jacobian(myfun!, out, inp)
gives
2×2 Matrix{Float64}:
1.0 2.0
1.57321 3.14643
This approach is faster than naively differentiating a call that includes a newtonsolve, as we avoid iterating with Dual numbers.
using Newton, Tensors, ForwardDiff, BenchmarkTools
rf(x::Vec, a::Number) = a * x - (x ⋅ x) * x
function myfun!(outputs::Vector, inputs::Vector)
x0 = ones(Vec{2}) # Initial guess
a = inputs[1] + 2 * inputs[2]
x, converged = ad_newtonsolve(rf, x0, (a,))
outputs[1] = x ⋅ x
outputs[2] = a * x[1]
return outputs
end
function myfun2!(outputs::Vector, inputs::Vector)
x0 = ones(Vec{2}) # Initial guess
a = inputs[1] + 2 * inputs[2]
x, _, converged = newtonsolve(x -> rf(x, a), x0)
outputs[1] = x ⋅ x
outputs[2] = a * x[1]
return outputs
end
J = zeros(2,2)
out = zeros(2); inp = [1.2, 0.5]
cfg = ForwardDiff.JacobianConfig(myfun!, out, inp)
cfg2 = ForwardDiff.JacobianConfig(myfun2!, out, inp)
# Call the standard function using newtonsolve
@btime myfun2!($out, $inp); # 143.381 ns (0 allocations: 0 bytes)
# Differentiate through newtonsolve
@btime ForwardDiff.jacobian!($J, $myfun2!, $out, $inp, $cfg2); # 285.662 ns (0 allocations: 0 bytes)
# Call the function which uses ad_newtonsolve (no difference)
@btime myfun!($out, $inp); # 143.381 ns (0 allocations: 0 bytes)
# Differentiate through ad_newtonsolve
@btime ForwardDiff.jacobian!($J, $myfun!, $out, $inp, $cfg); # 183.359 ns (0 allocations: 0 bytes)
showing that, in this microbenchmark, the differentiated call gets quite close to the computational time of a regular non-differentiating call.
Internal API
Newton.linsolve! — Function
linsolve!(K::AbstractMatrix, b::AbstractVector, cache::NewtonCache)
Solves the linear equation system Kx = b, mutating both K and b. b is mutated to the solution x.
linsolve!(linsolver, K::AbstractMatrix, b::AbstractVector, cache::NewtonCache)
The default implementation calls this signature, which should be overloaded for a different linsolver passed to cache upon construction.
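A hedged sketch of hooking in a custom solver; MyBackslashSolver is a hypothetical name, and subtyping Newton.AbstractLinsolver is an assumption not confirmed on this page:
using Newton, LinearAlgebra

# Hypothetical solver type (subtyping AbstractLinsolver is an assumption)
struct MyBackslashSolver <: Newton.AbstractLinsolver end

# Overload the internal hook: mutate b into the solution of K x = b
function Newton.linsolve!(::MyBackslashSolver, K::AbstractMatrix, b::AbstractVector, cache::Newton.NewtonCache)
    b .= K \ b
    return b
end

cache = NewtonCache(zeros(2); linsolver = MyBackslashSolver())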
Newton.linsolve — Function
linsolve(linsolver, K, b)
Solve the equation (system) b = K ⋆ x, where ⋆ is the appropriate contraction between K and x, and b and x have the same type and size. The following combinations of K and x are supported:
| K | b, x |
| --- | --- |
| Number | Number |
| SMatrix{N, N} | SVector{N} |
| SecondOrderTensor{d} | AbstractTensor{o, d} |
| FourthOrderTensor{d} | AbstractTensor{o, d} |
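For example, with the static-array combination from the table (values invented for illustration):
using Newton, StaticArrays

K = SMatrix{2, 2}(2.0, 0.0, 0.0, 4.0)   # column-major entries of [2 0; 0 4]
b = SVector(2.0, 8.0)
x = Newton.linsolve(Newton.StandardLinsolver(), K, b)   # ≈ SVector(1.0, 2.0)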
Newton.extract_submatrix — Function
extract_submatrix(::Type{SMatrix{d1, d2}}, m::SMatrix, start_row, start_col)
Efficiently extract s::SMatrix{d1, d2} such that s == m[start_row:(start_row + d1 - 1), start_col:(start_col + d2 - 1)].
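For example (values invented for illustration):
using Newton, StaticArrays

m = SMatrix{3, 3}(1.0:9.0...)
s = Newton.extract_submatrix(SMatrix{2, 2}, m, 2, 2)   # equals m[2:3, 2:3]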
Newton.join_submatrices — Function
join_submatrices(a11, a12, a21, a22)
Efficiently join the submatrices to return a::SMatrix = [a11 a12; a21 a22].
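For example, joining invented 2x2 blocks into a 4x4 matrix:
using Newton, StaticArrays

a11 = @SMatrix ones(2, 2)
a12 = @SMatrix zeros(2, 2)
a21 = @SMatrix zeros(2, 2)
a22 = @SMatrix ones(2, 2)
a = Newton.join_submatrices(a11, a12, a21, a22)   # 4×4 block matrix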