If m=F, n=F+3 and q has the form (q1,q2,q3,0,0,..,0) then the geometrical interpretation of the problem is that we are trying to find the point inside an F-faced convex hull closest to an exterior plane having normal (q1,q2,q3).
The primal simplex algorithm operates by hoping from vertex to adjacent vertex, always moving closer to the desired plane; calculating the vertex positions from the given set of planes ("facets.").
We will illustrate its use using the same numeric example discussed
in "Algorithms" (Robert Sedgewick, Addison-Wesley 1984) but approached in a
slightly different way. The problem as presented by Sedgewick is:
Maximise x1 + x2 + x3 subject to the constraints
-x1 | + | x2 | £ 5 |
x1 | + | 4x2 | £ 45 |
2x1 | + | x2 | £ 27 |
3x1 | - | 4x2 | £ 24 |
x3 | £ 4 |
-x1 | + | x2 | + | x4 | = | 5 |
x1 | + | 4x2 | + | x5 | = | 45 |
2x1 | + | x2 | + | x6 | = | 27 |
3x1 | - | 4x2 | + | x7 | = | 24 |
x3 | + | x8 | = | 4 |
A = | æ | -1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | ö |
ç | 1 | 4 | 0 | 0 | 1 | 0 | 0 | 0 | ÷ | |
ç | 2 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | ÷ | |
ç | 3 | -4 | 0 | 0 | 0 | 0 | 1 | 0 | ÷ | |
è | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | ø |
Viewed geometrically, the inequalities for x1,x2 and x3 (including x1 ³ 0,x2 ³ 0,x3 ³ 0) describe a 3D convex hull having the origin (0,0,0) as one vertex. The slack variable x4 gives the distance of the point (x1,x2,x3) from the plane -x1 + x2 = 5, x5 gives the distance from x1 + 4x2 = 45, and so on. x1,x2 and x3 may themselves be thought of as describing the distances of the point (x1,x2,x3) from the planes x1=0, x2=0, and x3=0 respectively
The first step of the Primal Simplex Algorithm is to find a feasible solution to the problem, that is, find an x>0 satisfying Ax=b. If any solutions exist, ie. if the problem is solvable, then there will be an optimal solution having at most m non-zero terms in x (and thus at least nm zero terms), so we look for a feasible solution of this form.
Geometrically, this is saying that there will always be an optimal
point for the objective function that is a vertex of the hull.
An obvious candidate solution (used in Sedgewick) is the "origin"
x=(0,0,0,5,45,27,24,4) but for the sake of generality we will start with x=(8,0,0,13,37,11,0,4).
We call the m non-zero variables in x
(x1,x4,x5,x6 and x8)
basis variables and need to rearrange our equations into canonical form with
respect to the basis variables. This means that each basis variable can
appear in only one equation and within that equation must have coefficient 1.
When the equations are arranged in canonical form, it is clear what
value x takes since the basis variables each appear in only one
equation and the non-basis variables are known to be zero.
(If we had taken x4,x5,x6,x7, and x8 as our basis variables as Sedgewick
does then the equations would already be canonical form)
We can rearrange the fourth constraint as x1 - 1.33x2+ 0.33x7 = 8
and then replace x1 in the other equations with 8+1.33x2-0.33x7
thus:
Maximise 2.33x2 + x3 -0.33x7 = M-8
subject to the constraints
-0.33x2 | + | x4 | + | 0.33x7 | =13 |
5.33x2 | + | x5 | - | 0.33x7 | =37 |
3.67x2 | + | x6 | - | 0.67x7 | =11 |
x1 | - | 1.33x2 | + | 0.33x7 | =8 |
x3 | + | x8 | =4 |
We can express these equations more succinctly using an (m+1)×(n+1)
matrix thus:
B= | æ | 0.00 | -2.33 | -1.00 | 0.00 | 0.00 | 0.00 | 0.33 | 0.00 | 8.00 | ö |
ç | 0.00 | -0.33 | 0.00 | 1.00 | 0.00 | 0.00 | 0.33 | 0.00 | 13.00 | ÷ | |
ç | 0.00 | 5.33 | 0.00 | 0.00 | 1.00 | 0.00 | -0.33 | 0.00 | 37.00 | ÷ | |
ç | 0.00 | 3.67 | 0.00 | 0.00 | 0.00 | 1.00 | -0.67 | 0.00 | 11.00 | ÷ | |
ç | 1.00 | -1.33 | 0.00 | 0.00 | 0.00 | 0.00 | 0.33 | 0.00 | 8.00 | ÷ | |
è | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 4.00 | ø |
The top (0th) row of B is the negation of the objective function; the
rightmost entry of this row (B0n) contains the current value of the objective
function. The remainder of the rightmost column contains the values of
the basis variables.
Note that we will be counting rows from zero, but columns from one.
Suppose we wish to change our basis variables from
x1,x4,x5,x6,x8
to
x1,x2,x4,x5,x8
, ie. to replace x6 with x2.
To rearrange the equations into canonical form with respect to the new basis
we rewrite the third constraint as
x2 + 0.27x6 - 0.18x7 = 3
and then plug it into the other equations to obtain
æ | 0.00 | 0.00 | -1.00 | 0.00 | 0.00 | 0.64 | -0.09 | 0.00 | 15.00 | ö |
ç | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.09 | 0.27 | 0.00 | 14.00 | ÷ |
ç | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | -1.45 | 0.64 | 0.00 | 21.00 | ÷ |
ç | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.27 | -0.18 | 0.00 | 3.00 | ÷ |
ç | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.36 | 0.09 | 0.00 | 12.00 | ÷ |
è | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 4.00 | ø |
which exemplifies the feasible solution (12,3,0,14,21,0,0,4). This is equivalent to multiplying the third row by 1/3.67 and then adding appropriate multiples of it to the other rows in order to zero their second entries - an operation very similar to that used in Gaussian Elimination.
If we next wanted to replace x8 with x3
in our basis we would use the fifth row (the row which defines x8)
to zero the third entry of each of the other rows to move to the solution (12,3,4,14,21,0,0,0).
This operation is known as "pivoting about" B53.
Pivoting about Bij >0 corresponds to bringing xj
into the basis instead of xk, where k is the unique value in {1,2,..,n} for
which Bik ¹ 0.
We force xk to be zero and allow xj to be nonzero.
Geometrically this corresponds to moving (x1,x2,x3) away from the (j+4)th
face and into the (k+4)th face.
This pivoting procedure gives an easy way of obtaining successive
potential solutions to the problem. If, after pivoting, the rightmost
column is non-negative then the new solution is feasible but as yet we
have no way of guaranteeing that it is "better" than the last solution.
This is done by careful choice of pivot point with reference to the top
row.
If xi is any basis variable then B0i = 0.
Thus if B0j < 0 then making xj
positive nonzero at the expense of xi will increase the objective
function.
The rightmost column will transform as
Bk(n+1)' = Bk(n+1) BkjBi(n+1)/Bij so
we can ensure it remains nonnegative by choosing i to minimise Bi(n+1)/Bij
over those i such that Bij >0.
Bij £ 0 for all i Î {1,2,..,n} can only occur if the
constaint A x=b is unbounded, ie. permits some xi to be infinite.
At each stage, therefore, we first choose a column j satisfying B0j <0.
If there are more than one such then we can choose the one with the
smallest entry in row zero (greatest increment); the one that would give
the greatest increase in the objective function (steepest descent); or
select at random.
Having selected j we then choose i to minimise Bi(n+1)/Bij
over those i such that Bij >0. If there are two or more such, then we
select at random or choose the one that removes the variable with lowest
index from the basis (ie. minimises k where k is uniquely defined by
Bik¹0). It is important to choose randomly rather than, say, the first or
last ones found in order to avoid cycling, an infinite sequence of pivots
that do not increase the objective function.
Once there are no negative entries in the top row then we have an optimal solution.
The cost of each pivot is (n-m+1)D+(m(n-m+1))M., however all the D's are by the same value and one is into 1 so we can replace them with
1R+(nm)M giving 1R+((n-m)(m+1)+m)M.
The primal simplex algorithm as presented above needs a feasible solution having at most m nonzero values in order to begin. If such a solution cannot be found by inspection then we can use the primal simplex algorithm to find one by introducing n new variables xn+1,xn+2,..,xn+m, (adding one to each of the m constraint equations) and replacing the objective function with åi=n+1n+m xi.
If we can achieve an
objective value of 0 for this new problem then (possibly after a few
further pivots to remove any remaining "new" variables from the basis) we
will have a feasible solution to the original constraints.
Application to 3D Convex Hulls
The standard form for the primal simplex algorithm is not quite the
form of the problem that arises when trying to find the vertex of a
convex hull H =(H,d,V) having centre c and orientation A nearest the
plane r.q=g since we do not necessarily have (0,0,0) as a vertex.
We take variables x1=r1,x2=r2,x3=r3, and slack variables x4,x5,..,xF+3.
Our equations are
Maximise q1x1+q2x2+q3x3
Subject to
(Ani)1x1+(Ani)2x2+(Ani)3x3+ x3+i £ ei ; i=1,2,..F.
where ei = di+(Ani).c ; i=1,2,..F.
We do not impose x1,x2,x3 ³ 0 and do not necessarily have 0 Î H. We can
assume that d ³ 0 however.
The matrix for these equations is the (F+1)×(F+4) matrix
B= | æ | -q1 | -q2 | -q3 | 0 | 0 | 0..0 | 0 | ö |
ç | 1 | 0 | 0..0 | e1 | ÷ | ||||
ç | AH | 0 | 1 | 0..0 | e2 | ÷ | |||
ç | . | . | .... | .. | ÷ | ||||
è | 0 | 0 | 0..1 | eF | ø |
x1 | = | (Avh1, , ) | |||||
x2 | = | (Avh2, , ) | |||||
x3 | = | (Avh3, , ) | |||||
x3+i | = | ei-ni.vh | i=1 | 2 | .. | F |
(AN)T | æ | x1 | ö | + | æ | x3+ | t | ö | = | æ | es | ö
| | ç | x2 | ÷ | | ç | x3 | + | t | ÷ | | ç | et | ÷
| | è | x3 | ø | | è | x3 | + | u | ø | | è | eu | ø
| |
æ | x1 | ö | + | M | æ | x3+ | t | ö | = | M | æ | es | ö
| ç | x2 | ÷ | | | ç | x3 | + | t | ÷ | | | ç | et | ÷
| è | x3 | ø | | | è | x3 | + | u | ø | | | è | eu | ø
| |
1 | 2 | 3 | ... | s | t | u | F+4 | ||||||
( | 1 | 0 | 0 | ... | m11 | 0..0 | m12 | 0..0 | m13 | 0..0 | es'=m11es+m12et+m13eu | ) | |
( | 0 | 1 | 0 | ... | m21 | 0..0 | m22 | 0..0 | m23 | 0..0 | et'=m21es+m22et+m23eu | ) | |
and | ( | 0 | 0 | 1 | ... | m31 | 0..0 | m32 | 0..0 | m33 | 0..0 | eu'=m31es+m32et+m33eu | ) |
1 | 2 | 3 | ... | s | ... | t | ... | i | ... | u | ... | F+4 | |||||||
( | bi1 | bi2 | bi3 | 0 | ... | 0 | 0 | ... | 0 | 0 | ... | 1 | 0 | ... | 0 | 0 | ... | ei | ) |
with | |||||||||||||||||||
( | 0 | 0 | 0 | 0 | ... | bis' | 0 | ... | bit' | 0 | ... | 1 | 0 | ... | biu' | 0 | ... | ei' | ) |
1 | 2 | 3 | ... | s | ... | t | ... | i | ... | u | ... | F+4 | |||||||
( | -q1 | -q2 | -q3 | 0 | ... | 0 | 0 | ... | 0 | 0 | ... | 0 | 0 | ... | 0 | 0 | ... | 0 | ) |
with | |||||||||||||||||||
( | 0 | 0 | 0 | 0 | ... | qs' | 0 | ... | qt' | 0 | ... | 1 | 0 | ... | qu' | 0 | ... | Q | ) |
We cannot use the algorithm exactly as given above because the
assumption x ³ 0 does not hold for the first three components of x.
We have "lost" the three planes corresponding to the
first three variables (x1=0, x2=0, and x3=0)
and so must be more careful in our choice of pivot.
We get round this by keeping
x1,x2, and x3 in the basis at all times.
Recalling that pivoting about Bij
>0 corresponds to bringing xj into the basis at the expense of xk, where k is the unique value in {1,2,..,n} for which Bik ¹; 0, we see that we will
never want to pivot about Bij if j=1,2, or 3 and must avoid pivoting about Bij
if any of Bi1,Bi2, or Bi3 is nonzero.
This means that we cannot, as before, for a given choice of j choose i
to minimise Bi(n+1)/Bij over those i such that Bij >0 so we cannot guarantee
that the rightmost column will remain nonnegative.
However, if we choose i to minimise Bi(n+1)/Bij over those i such that
Bij >0 and Bi1=Bi2=Bi3=0 then the only entries in the rightmost column that
can become negative are the first second and third which is okay since
this corresponds merely to vertices having negative coordinates.
On each iteration we will have a choice of three pivot columns, the
cost of choosing the pivot row is (F-3)D or less, and the cost of the
pivot operation is 4D+4FM simplifying to 1R+(4F+3)M.
Generating the Vertices of a Convex Hull
The Primal simplex algorithm's ability to moves from vertex to vertex of a convex
hull is potentially useful as a means to generate the vertices V from H
and d. Might AV be so derived from AH and d faster than by explicitly rotationg each vertex?
A "Vertex Generation Algorithm" has no need for an objective function
so we drop the zeroth row of B. At each stage we will have a free choice
of three pivot columns, however (apart from on the first iteration) one
of these would take us back to the vertex we came from.
Topologically, we need to visit each node of a trinary graph, a
depth first recursive traversal will serve but requires B to be placed on
a stack every bifurcation. Each vertex is the intersection of three
planes (the i,j, and kth say). On travelling from one vertex <i,j,k> to
another <i,j,l> we flag <i,j> as a "traversed edge-pair". We begin our first
iteration with no traversed pairs. At each subsequent iteration we will
be at a vertex <i,j,k> considering the i,j, and kth columns as possible pivot
columns.
If <i,j> is a traversed pair we know not to choose the kth column
because <i,j,k> must have been already visited. Likewise we do not choose
the jth column if <i,k> has been traversed, and reject the ith column if <j,k>
has been traversed. If we reject all three pivot columns we exit from the
recursive procedure; if only one column is acceptable we choose it; if
two (or all three) are acceptable we must choose both (or all three) and
impliment them one after another by preserving B and then calling the
traversal procedure recursively.
Though this method will generate V it has two drawbacks. Firstly, it
is recursive and requires the large matrix B to be preserved over the
recursion. Secondly, it will generate any vertex at which more than three
planes intersect (the peak of a square based pyramid for example) more
than once.
It is not easy to solve either of these problems if all that is known
about H is H and d (possibly the case for the hull-compiler), but if the connectivity of the hull is known (likely for the hull-rotater) then we
can eliminate both these problems if we provide the algorithm with an
"edge-walk" {i1,i2,...iP}, a sequence of pivot columns that specify a non-
recursive traversal of the hull that traverses all the edges of the hull
at the expense of possibly visting some vertices more than once.
We can save the algorithm a further (F-3)D per iteration by providing
the pivot row as well, effectively providing a "vertex-walk"
{(i1,j1),(i2,j2),..(iP,jP)}.
Each iteration will take 1R+(4F-1)M since we no longer have the top
row of B to maintain.
This compares badly with the 9M (or 6M+3T) required to rotate a v Î V by A.
We conclude therefore that it is faster to rotate hull vertices
than to recreate them from pre-rotated planes.