The Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is a quasi-Newton method for smooth, unconstrained optimization problems; it builds an approximation to the inverse Hessian from gradient evaluations rather than computing second derivatives directly.


Definition and Syntax

bool bfgs(arma::vec& init_out_vals, std::function<double (const arma::vec& vals_inp, arma::vec* grad_out, void* opt_data)> opt_objfn, void* opt_data);
bool bfgs(arma::vec& init_out_vals, std::function<double (const arma::vec& vals_inp, arma::vec* grad_out, void* opt_data)> opt_objfn, void* opt_data, algo_settings_t& settings);

Function arguments:

  • init_out_vals: a column vector of initial values, which will be replaced by the solution values.
  • opt_objfn: the function to be minimized, taking three arguments:
    • vals_inp: a vector of input values;
    • grad_out: a vector to store the gradient; and
    • opt_data: additional parameters passed to the function.
  • opt_data: additional parameters passed to the function.
  • settings: parameters controlling the optimization routine; see below.
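
For illustration, the sketch below shows one way to thread extra parameters through opt_data: the caller passes a pointer to a user-defined struct, and the objective function casts it back. The names quad_data_t and quad_fn are hypothetical and not part of OptimLib; the function signature matches the one documented above.

#include "optim.hpp"

// hypothetical objective f(x) = (x - c)'(x - c), with the target vector c
// supplied through opt_data rather than captured globally
struct quad_data_t {
    arma::vec c;    // extra parameter: the target vector
};

double quad_fn(const arma::vec& vals_inp, arma::vec* grad_out, void* opt_data)
{
    quad_data_t* data = reinterpret_cast<quad_data_t*>(opt_data);

    arma::vec diff = vals_inp - data->c;
    double obj_val = arma::dot(diff, diff);

    if (grad_out) {
        *grad_out = 2.0 * diff;    // analytic gradient of (x - c)'(x - c)
    }

    return obj_val;
}

The optimizer is then invoked as, e.g., optim::bfgs(x, quad_fn, &data), where data is an instance of quad_data_t.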

Optimization control parameters:

  • double err_tol: the tolerance value controlling how small $\| \nabla f \|$ should be before 'convergence' is declared.
  • int iter_max: the maximum number of iterations/updates before the algorithm exits.
  • bool vals_bound: whether the search space is bounded. If true, then
    • arma::vec lower_bounds: this defines the lower bounds of the search space.
    • arma::vec upper_bounds: this defines the upper bounds of the search space.
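
As a minimal sketch of the second overload above, the snippet below sets only the control parameters listed here (field names as documented); it assumes the booth_fn objective and the x_2 initial vector from the full example further down.

optim::algo_settings_t settings;

settings.err_tol  = 1e-8;    // declare convergence once || grad f || < 1e-8
settings.iter_max = 2000;    // give up after 2000 updates

settings.vals_bound   = true;                   // restrict the search space
settings.lower_bounds = arma::zeros(2) - 10.0;  // lower bound of -10 in each coordinate
settings.upper_bounds = arma::zeros(2) + 10.0;  // upper bound of +10 in each coordinate

bool success = optim::bfgs(x_2, booth_fn, nullptr, settings);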

Details

Let $x^{(i)}$ denote the function input values at stage $i$ of the algorithm. The BFGS updating rule is as follows.

  • The descent direction is computed using $$d^{(i)} = -W^{(i)} \nabla f^{(i)}$$ where $W^{(i)}$ is the approximate inverse Hessian of $f$ at stage $i$, and $\nabla f^{(i)}$ is the gradient of $f$ evaluated at $x^{(i)}$.
  • The input values are then updated: $$x^{(i+1)} = x^{(i)} + \alpha^{(i)} d^{(i)}$$ where $\alpha^{(i)}$ is found via line search: $$\alpha^{(i)} = \arg \min_\alpha f(x^{(i)} + \alpha d^{(i)})$$ The line search method used in OptimLib is that of Moré and Thuente (1994).
  • $W^{(i)}$ is updated for stage $i+1$ using $$ W^{(i+1)} = \left(I - \dfrac{sy'}{y's}\right) W^{(i)} \left(I - \dfrac{sy'}{y's}\right)' + \dfrac{ss'}{y's} $$ where \begin{align*} s &:= x^{(i+1)} - x^{(i)} \\ y &:= \nabla f^{(i+1)} - \nabla f^{(i)} \end{align*}

The algorithm stops when $\| \nabla f \|$ is less than err_tol, or the total number of iterations exceeds iter_max.
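
For intuition, here is a pedagogical sketch of a single BFGS update written with Armadillo. It uses a fixed step size in place of the Moré-Thuente line search, so it illustrates the updating rule above rather than OptimLib's actual implementation; bfgs_step and grad_fn are hypothetical names.

#include <armadillo>

// One BFGS update: descent direction, (simplified) step, inverse-Hessian update.
// grad_fn is any callable returning the gradient of f at a given point;
// alpha is a fixed step size standing in for a proper line search.
template <typename GradFn>
void bfgs_step(arma::vec& x, arma::mat& W, GradFn grad_fn, const double alpha = 0.1)
{
    const arma::vec grad_old = grad_fn(x);

    const arma::vec d = - W * grad_old;       // d = -W * grad f
    const arma::vec x_new = x + alpha * d;    // x^{(i+1)} = x^{(i)} + alpha * d

    const arma::vec s = x_new - x;                    // s := x^{(i+1)} - x^{(i)}
    const arma::vec y = grad_fn(x_new) - grad_old;    // y := grad f^{(i+1)} - grad f^{(i)}

    const double rho = 1.0 / arma::dot(y, s);
    const arma::mat I = arma::eye(x.n_elem, x.n_elem);

    // W^{(i+1)} = (I - s y'/(y's)) W^{(i)} (I - s y'/(y's))' + s s'/(y's)
    W = (I - rho * s * y.t()) * W * (I - rho * s * y.t()).t() + rho * s * s.t();

    x = x_new;
}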


Examples

As examples, consider minimization of:

  • the Sphere function $$\min_{x \in \mathbb{R}^n} \sum_{i=1}^n x_i^2$$ The solution is the zero vector.
  • Booth's function $$\min_{x \in [-10,10]^2} \left\{ (x_1 + 2x_2 - 7)^2 + (2 x_1 + x_2 - 5)^2 \right\}$$ The global minimum is located at $(1,3)$.
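
For reference, the analytic gradients coded in the example program below follow directly from these definitions: for the Sphere function $\nabla f(x) = 2x$, and for Booth's function $$\nabla f(x) = \begin{pmatrix} 2(x_1 + 2x_2 - 7) + 4(2x_1 + x_2 - 5) \\ 4(x_1 + 2x_2 - 7) + 2(2x_1 + x_2 - 5) \end{pmatrix}$$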

Below are plots of each example, restricted to the two-dimensional case where relevant.

[Plot: Sphere function]

[Plot: Booth's function]


Code solving each example using OptimLib is given below.

#include "optim.hpp"

double sphere_fn(const arma::vec& vals_inp, arma::vec* grad_out, void* opt_data)
{
    double obj_val = arma::dot(vals_inp,vals_inp);
    //
    if (grad_out) {
        *grad_out = 2.0*vals_inp;
    }
    //
    return obj_val;
}

double booth_fn(const arma::vec& vals_inp, arma::vec* grad_out, void* opt_data)
{
    double x_1 = vals_inp(0);
    double x_2 = vals_inp(1);

    double obj_val = std::pow(x_1 + 2*x_2 - 7.0,2) + std::pow(2*x_1 + x_2 - 5.0,2);
    //
    if (grad_out) {
        (*grad_out)(0) = 2*(x_1 + 2*x_2 - 7.0) + 2*(2*x_1 + x_2 - 5.0)*2;
        (*grad_out)(1) = 2*(x_1 + 2*x_2 - 7.0)*2 + 2*(2*x_1 + x_2 - 5.0);
    }
    //
    return obj_val;
}

int main()
{
    //
    // sphere function

    const int test_dim = 5;

    arma::vec x = arma::ones(test_dim,1); // initial values (1,1,...,1)

    bool success = optim::bfgs(x,sphere_fn,nullptr);

    if (success) {
        std::cout << "bfgs: sphere test completed successfully." << std::endl;
    } else {
        std::cout << "bfgs: sphere test completed unsuccessfully." << std::endl;
    }

    arma::cout << "bfgs: solution to sphere test:\n" << x << arma::endl;

    //
    // Booth function

    arma::vec x_2 = arma::zeros(2,1); // initial values (0,0)

    bool success_2 = optim::bfgs(x_2,booth_fn,nullptr);

    if (success_2) {
        std::cout << "bfgs: Booth test completed successfully." << std::endl;
    } else {
        std::cout << "bfgs: Booth test completed unsuccessfully." << std::endl;
    }

    arma::cout << "bfgs: solution to Booth test:\n" << x_2 << arma::endl;

    return 0;
}
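
Assuming OptimLib has been built and installed as a shared library alongside Armadillo, a compile command along the following lines should work; the file name and paths are placeholders and will vary by system.

g++ -O2 -std=c++11 -I/path/to/optim/include bfgs_example.cpp -o bfgs_example -L/path/to/optim/lib -loptim -larmadillo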