Tight interval arithmetic via convexity

or, an excuse to explore metaprogramming in Julia

Recall from the interval arithmetic lectures the example

$ x \in [-1, 2] $

$ y = x^2 - x $

$ y \gets [-2, 5]$
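Tracing the naive evaluation term by term, each occurrence of $x$ is treated as an independent interval:

$ x^2 \in [0, 4], \qquad -x \in [-2, 1] $

$ x^2 - x \in [0 + (-2),\; 4 + 1] = [-2, 5] $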

This is a wild overestimate! Plotting that function over that range:

In [3]:
using Plots
pyplot()
x = -1:0.1:2
plot(x, x.^2 - x, label="y")
ylims!(-2, 5)
Out[3]:

What's at issue, as David said, is "the dependency problem (the fact that x is repeated twice in the expression, and it does not know that the xs are the same)."

The dependency problem occurs when the code is not aware of the mathematical structure of the expression, but Julia can parse expressions quite nicely; how, then, might we use metaprogramming to reduce these overestimates?

For this project I made a little macro to parse simple convex and concave expressions of intervals into intervals tight up to solver tolerance. More about solver tolerance later; for now the thing to note is that these intervals are a guaranteed span: if you position your input variables just right within their intervals, you can achieve both ends of the output interval (though not necessarily everything in between).

In [15]:
x = Interval(-1, 2)
y = @nanotol  x^2 - x
Out[15]:
[-0.25000000100006126, 2.0]

There are several ways to describe the specialness of convex functions, but for now let's just think of them geometrically: the line segment from any point on a convex surface to any other point lies on or above the surface.
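In symbols, this chord condition says $f$ is convex when, for all points $a, b$ in its domain and all $\lambda \in [0, 1]$,

$ f(\lambda a + (1 - \lambda) b) \le \lambda f(a) + (1 - \lambda) f(b) $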

This means a few things: first, that the hyperplane through the corners of our intervals is an upper bound on the function within them. Graphed:

In [5]:
x = -1:0.1:1
plot(x, x.^2 - x, linestyle=:dashdot, label="y")
plot!(x,-x+1, label="upper bound")
plot!([-1], [2], color="black", markershape=:circle, label="high point")
ylims!(-2, 5)
Out[5]:
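To check against the plot (which uses the subinterval $x \in [-1, 1]$): the chord runs from $(-1, f(-1)) = (-1, 2)$ to $(1, f(1)) = (1, 0)$, so

$ y_{\mathrm{chord}}(x) = 2 + \frac{0 - 2}{1 - (-1)} (x + 1) = -x + 1 $

which is exactly the upper-bound line plotted above.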

Secondly, it means that the tangent hyperplanes at those corners are lower bounds on the function.
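For our example, $f'(x) = 2x - 1$, so at the left endpoint $x = -1$ the tangent has slope $f'(-1) = -3$:

$ y_{\mathrm{tan}}(x) = f(-1) + f'(-1)(x + 1) = 2 - 3(x + 1) = -3x - 1 $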

In [6]:
x = -1:0.1:1
plot(x, x.^2 - x, linestyle=:dashdot, label="y")
plot!(x,-x+1, label="upper bound")
plot!([-1], [2], color="black", markershape=:circle, label="high point")
plot!(x,-3x-1, color="green", label="lower bound")
plot!(x,3x-3, color="green", label="lower bound")
plot!([1/3], [-2], color="black", markershape=:circle, label="low point")
ylims!(-2, 5)
Out[6]:

$y \gets [-2, 2]$

So far, compared to naive (dependency-blind) interval arithmetic, this has improved our upper-bound estimate considerably but our lower-bound estimate not at all.

But a convex function restricted to a subinterval is still convex, so we can recurse! Let's evaluate the gradient at our most extreme lower bound.
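Concretely, at $x = 1/3$ (the $x$-coordinate of the current low point) we have $f(1/3) = -2/9$ and $f'(1/3) = -1/3$, giving the new tangent

$ y(x) = -\frac{2}{9} - \frac{1}{3}\left(x - \frac{1}{3}\right) = -\frac{x}{3} - \frac{1}{9} $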

In [23]:
x = -1:0.1:1
plot(x, x.^2 - x, linestyle=:dashdot, label="y")
xlo = -1:0.01:0.34
plot!(xlo, -(5/3)*xlo + 1/3, color="red", label="upper bound")
xhi = 0.34:0.01:1
plot!(xhi, (1/3)*(xhi - 1), color="red", label="upper bound")
plot!([-1], [2], color="black", markershape=:circle, label="high point")
plot!(x,-3x-1, color="green", label="lower bound")
plot!(x,3x-3, color="green", label="lower bound")
plot!(x,-x/3-1/9, color="green", label="lower bound")
plot!([0.866], [-0.4], color="black", markershape=:circle, label="low point")

ylims!(-2, 5)
Out[23]:

$y \gets [-0.4, 2]$

Much better! It's clear that we could keep doing this up to arbitrary precision, saving the user time if they only needed a particular accuracy.

Initially I thought it'd be fun to implement this very simple first-order method for minimizing a convex function, but then I realized Convex.jl could do most of it for me. It turns out to be a whole library that specializes in checking whether an expression is convex or concave, and optimizing it if it is.

Before we get to that, let's define some things:

In [8]:
type Interval
    lb::Float64
    ub::Float64
end

import Base.show
show(io::IO, x::Interval) = print(io, "[$(x.lb), $(x.ub)]");

Import some other things:

In [9]:
import Convex 
import SCS
import ECOS

A little helper function to pull symbols out of algebraic expressions...

(probably far from the best way to do it)

In [10]:
function getsyms(expr, symbols)
    for arg in expr.args[2:end]
        if isa(arg, Symbol)
            push!(symbols, arg)
        elseif isa(arg, Expr)
            getsyms(arg, symbols)
        end
    end
    return symbols
end;
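For example (reproducing `getsyms` so this sketch stands alone), pulling the variables out of a three-variable expression:

```julia
# getsyms as defined above: walk an Expr tree, collecting every Symbol
# that appears as an argument (args[1] of a :call is the function name,
# which we skip).
function getsyms(expr, symbols)
    for arg in expr.args[2:end]
        if isa(arg, Symbol)
            push!(symbols, arg)
        elseif isa(arg, Expr)
            getsyms(arg, symbols)
        end
    end
    return symbols
end

syms = unique(getsyms(:((x - y + z)^2 + exp(x + 3 * y + z)), []))
println(syms)  # symbols in first-appearance order: [:x, :y, :z]
```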

And finally, since I'm a terrible macro author, I'm actually just using the macro as an interface that automatically quotes its argument. There's definitely a better way to do this.
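That pattern is easy to see in isolation. Here is a hypothetical `@quoted` macro (not part of the project) showing that a macro receives its argument as an unevaluated `Expr`, so the call site needs no explicit `:( )` quoting:

```julia
# A macro's argument arrives as syntax (an Expr), not a value.
# Returning QuoteNode(e) makes the expansion evaluate to that Expr itself.
macro quoted(e)
    return QuoteNode(e)
end

ex = @quoted x^2 - x   # x need not even be defined here
@assert ex == :(x^2 - x)   # we got the expression as data
@assert ex.head == :call   # its root is a call (to -)
```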

In [11]:
function parse_expr(expr)
    # Get symbols and make Convex.jl Variables
    syms = unique(getsyms(expr, []))
    n = length(syms)
    symvals = [eval(s) for s in syms]
    symvars = [Convex.Variable() for s in syms]
    
    # create a function of the expression
    eval(Expr(:(=), Expr(:call, :exprfn, syms...), expr))
    # and call it to determine vexity
    objective = exprfn(symvars...)
    vex = Convex.vexity(objective)
    
    # evaluate each corner, hilariously
    function corner(i)
        bitrep = bits(UInt8(i))[end:-1:9-n]  # use UInt8 for now lol
        return Float64[bitrep[j] == '1' ? symvals[j].ub : symvals[j].lb for j in 1:n]
    end
    corner_vals = Float64[exprfn(corner(i)...) for i in 0:2^n-1]
    
    # Make the affine-appropriate interval
    out_interval = Interval(minimum(corner_vals), maximum(corner_vals))
    
    if vex == Convex.AffineVexity()
        nothing
    elseif vex == Convex.ConvexVexity()
        # There's room at the bottom!
        p = Convex.minimize(objective)
        for i in 1:n
            p.constraints +=  symvals[i].lb <= symvars[i]
            p.constraints +=  symvars[i] <= symvals[i].ub
        end
        # can use ECOSSolver with abstol=1e-9
        Convex.solve!(p, ECOS.ECOSSolver(verbose=0, abstol=1e-9))
        out_interval.lb = p.optval - 1e-9
#         for i in 1:n
#             println(syms[i], " ", symvars[i].value)
#         end
    elseif vex == Convex.ConcaveVexity()
        # There's room at the top!
        p = Convex.maximize(objective)
        for i in 1:n
            p.constraints +=  symvals[i].lb <= symvars[i]
            p.constraints +=  symvars[i] <= symvals[i].ub
        end
        Convex.solve!(p, ECOS.ECOSSolver(verbose=0, abstol=1e-9))
        out_interval.ub = p.optval + 1e-9
    else
        # can't handle this
        error("Expression was not convex or concave, and so cannot be",
              " evaluated to an interval tolerance.")
    end
    
    return out_interval
end

macro nanotol(expr) parse_expr(expr) end
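The corner enumeration above leans on `bits(UInt8(i))`, which caps things at eight variables; an equivalent, size-independent sketch (hypothetical `corner` helper, not the project's code) uses bit shifts directly:

```julia
# Bit j of i chooses variable j's lower or upper bound, so
# i = 0 : 2^n - 1 enumerates all 2^n corners of the interval box.
corner(lbs, ubs, i) = [(i >> (j - 1)) & 1 == 1 ? ubs[j] : lbs[j]
                       for j in eachindex(lbs)]

lbs, ubs = [-1.0, 0.0], [2.0, 1.0]
corners = [corner(lbs, ubs, i) for i in 0:3]
println(corners)  # [[-1.0, 0.0], [2.0, 0.0], [-1.0, 1.0], [2.0, 1.0]]
```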

Alright, we made it through that function. Let's parse a wacky 3-variable interval to celebrate! (Any ideas for other things to parse?)

In [12]:
x = Interval(-1, 1)
y = Interval(-1, 1)
z = Interval(-1, 1)
g = @nanotol  (x - y + z)^2 + exp(x+3*y+z) + max(x, z)
Out[12]:
[-0.5482862740147436, 150.4131591025766]

We know this is very slow compared to the ValidatedNumerics library, but just how slow?

In [13]:
using BenchmarkTools

function testfn()
    global x, y, z
    x = Interval(-5*rand(), 5*rand())
    y = Interval(-5*rand(), 5*rand())
    z = Interval(-5*rand(), 5*rand())
    return parse_expr(:(abs(x - y - z)^2 + exp(x+y+z) + 0.001*max(x, z)))
end

@benchmark testfn()
WARNING: Problem status Suboptimal; solution may be inaccurate.
WARNING: Problem status Suboptimal; solution may be inaccurate.
WARNING: Problem status Suboptimal; solution may be inaccurate.
WARNING: Problem status Suboptimal; solution may be inaccurate.
Out[13]:
BenchmarkTools.Trial: 
  samples:          250
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  449.49 kb
  allocs estimate:  9531
  minimum time:     16.02 ms (0.00% GC)
  median time:      17.52 ms (0.00% GC)
  mean time:        20.00 ms (1.03% GC)
  maximum time:     37.15 ms (0.00% GC)

Ok, well, that's pretty slow. It's also less accurate than ValidatedNumerics, as solvers generally bound the gap between the primal and dual problems, not (as we'd like) the gap between the lower bound and the actual result (or for SCS, not even that).

We could fix this by doing one step of the gradient-bounding method originally outlined, using dual variables to get the gradient.

Also, Convex.jl can't parse all convex expressions; in particular, Geometric Programming provides a class of convex expressions that would be quite useful for physics expressions involving positive quantities.

With geometric programming support, this could be a tool for engineering calculations where uncertain parameters are modeled as a fixed range; this is surprisingly common, especially in back-of-matlab-napkin code.

Thanks for reading. Any questions?

Ned Burnell (eburn@mit.edu)