Primal Simplex Algorithm
"Fast cars, fast women, fast algorithms... what more could a man want?" -- Joe Mattis.

The Basic Algorithm
This is a technique used to solve the linear programming problem:
Maximise a linear objective function q.x subject to the linear constraints Ax = b over x ≥ 0, where q and x are n-dimensional vectors, b is an m-dimensional vector, and A is an m×n matrix.

If m=F, n=F+3 and q has the form (q₁,q₂,q₃,0,0,..,0) then the geometrical interpretation of the problem is that we are trying to find the point inside an F-faced convex hull closest to an exterior plane having normal (q₁,q₂,q₃).

The primal simplex algorithm operates by hopping from vertex to adjacent vertex, always moving closer to the desired plane, and calculating the vertex positions from the given set of planes ("facets").

We will illustrate its use using the same numeric example discussed in "Algorithms" (Robert Sedgewick, Addison-Wesley 1984) but approached in a slightly different way. The problem as presented by Sedgewick is:
Maximise x₁ + x₂ + x₃ subject to the constraints

-x₁ + x₂ ≤ 5
x₁ + 4x₂ ≤ 45
2x₁ + x₂ ≤ 27
3x₁ - 4x₂ ≤ 24
x₃ ≤ 4
over x₁,x₂,x₃ ≥ 0.
We reexpress this in standard form by introducing slack variables x₄,x₅,x₆,x₇,x₈ thus
Maximise M=x₁ + x₂ + x₃ subject to the constraints

-x₁ + x₂ + x₄ = 5
x₁ + 4x₂ + x₅ = 45
2x₁ + x₂ + x₆ = 27
3x₁ - 4x₂ + x₇ = 24
x₃ + x₈ = 4
over x₁,x₂,x₃,x₄,x₅,x₆,x₇,x₈ ≥ 0.
This is the standard form of the problem with q=(1,1,1,0,0,0,0,0)^T, b=(5,45,27,24,4)^T and

A = ⎛ -1  1 0 1 0 0 0 0 ⎞
    ⎜  1  4 0 0 1 0 0 0 ⎟
    ⎜  2  1 0 0 0 1 0 0 ⎟
    ⎜  3 -4 0 0 0 0 1 0 ⎟
    ⎝  0  0 1 0 0 0 0 1 ⎠

Viewed geometrically, the inequalities for x₁,x₂ and x₃ (including x₁ ≥ 0, x₂ ≥ 0, x₃ ≥ 0) describe a 3D convex hull having the origin (0,0,0) as one vertex. The slack variable x₄ gives the distance of the point (x₁,x₂,x₃) from the plane -x₁ + x₂ = 5, x₅ gives the distance from x₁ + 4x₂ = 45, and so on. x₁,x₂ and x₃ may themselves be thought of as describing the distances of the point (x₁,x₂,x₃) from the planes x₁=0, x₂=0, and x₃=0 respectively.

The first step of the Primal Simplex Algorithm is to find a feasible solution to the problem, that is, to find an x ≥ 0 satisfying Ax=b. If any solutions exist, ie. if the problem is solvable, then there will be an optimal solution having at most m non-zero terms in x (and thus at least n-m zero terms), so we look for a feasible solution of this form.

Geometrically, this is saying that there will always be an optimal point for the objective function that is a vertex of the hull.

An obvious candidate solution (used in Sedgewick) is the "origin" x=(0,0,0,5,45,27,24,4) but for the sake of generality we will start with x=(8,0,0,13,37,11,0,4).
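Both candidate starting points can be checked mechanically. The following is a minimal sketch in Python; the function name is_basic_feasible is our own and not part of the original presentation. It tests that Ax=b holds, that x ≥ 0, and that x has at most m non-zero entries.

```python
# Standard-form data for the Sedgewick example (slack variables x4..x8 included).
A = [
    [-1,  1, 0, 1, 0, 0, 0, 0],
    [ 1,  4, 0, 0, 1, 0, 0, 0],
    [ 2,  1, 0, 0, 0, 1, 0, 0],
    [ 3, -4, 0, 0, 0, 0, 1, 0],
    [ 0,  0, 1, 0, 0, 0, 0, 1],
]
b = [5, 45, 27, 24, 4]

def is_basic_feasible(A, b, x):
    """True if A x = b, x >= 0 and x has at most m non-zero entries."""
    m = len(A)
    satisfies = all(sum(a * v for a, v in zip(row, x)) == rhs
                    for row, rhs in zip(A, b))
    nonnegative = all(v >= 0 for v in x)
    nonzero = sum(1 for v in x if v != 0)
    return satisfies and nonnegative and nonzero <= m

origin = [0, 0, 0, 5, 45, 27, 24, 4]   # Sedgewick's starting point
start = [8, 0, 0, 13, 37, 11, 0, 4]    # the starting point used here
print(is_basic_feasible(A, b, origin), is_basic_feasible(A, b, start))
```

Both calls report True; each candidate has exactly m=5 non-zero entries.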

We call the m non-zero variables in x (x₁,x₄,x₅,x₆ and x₈) basis variables and need to rearrange our equations into canonical form with respect to the basis variables. This means that each basis variable can appear in only one equation and within that equation must have coefficient 1.
When the equations are arranged in canonical form, it is clear what value x takes since the basis variables each appear in only one equation and the non-basis variables are known to be zero.
(If we had taken x₄,x₅,x₆,x₇, and x₈ as our basis variables, as Sedgewick does, then the equations would already be in canonical form.)
We can rearrange the fourth constraint as x₁ - 1.33x₂+ 0.33x₇ = 8 and then replace x₁ in the other equations with 8+1.33x₂-0.33x₇ thus:

Maximise 2.33x₂ + x₃-0.33x₇ = M-8 subject to the constraints

-0.33x₂ + x₄ + 0.33x₇ =13
  5.33x₂ + x₅ - 0.33x₇ =37
  3.67x₂ + x₆ - 0.67x₇ =11
      x₁ - 1.33x₂ + 0.33x₇ =8
        x₃ + x₈ =4
over x₁,x₂,x₃,x₄,x₅,x₆,x₇,x₈ ≥ 0.
Geometrically, the non-basis variables correspond to the planes that meet at the vertex we are currently at.

We can express these equations more succinctly using an (m+1)×(n+1) matrix thus:

B = ⎛ 0.00 -2.33 -1.00 0.00 0.00 0.00  0.33 0.00  8.00 ⎞
    ⎜ 0.00 -0.33  0.00 1.00 0.00 0.00  0.33 0.00 13.00 ⎟
    ⎜ 0.00  5.33  0.00 0.00 1.00 0.00 -0.33 0.00 37.00 ⎟
    ⎜ 0.00  3.67  0.00 0.00 0.00 1.00 -0.67 0.00 11.00 ⎟
    ⎜ 1.00 -1.33  0.00 0.00 0.00 0.00  0.33 0.00  8.00 ⎟
    ⎝ 0.00  0.00  1.00 0.00 0.00 0.00  0.00 1.00  4.00 ⎠

The top (0^th) row of B is the negation of the objective function; the rightmost entry of this row (B_0(n+1)) contains the current value of the objective function. The remainder of the rightmost column contains the values of the basis variables.
Note that we will be counting rows from zero, but columns from one.

Suppose we wish to change our basis variables from x₁,x₄,x₅,x₆,x₈ to x₁,x₂,x₄,x₅,x₈ , ie. to replace x₆ with x₂.
To rearrange the equations into canonical form with respect to the new basis we rewrite the third constraint as
x₂ + 0.27x₆ - 0.18x₇ = 3
and then plug it into the other equations to obtain

⎛ 0.00 0.00 -1.00 0.00 0.00  0.64 -0.09 0.00 15.00 ⎞
⎜ 0.00 0.00  0.00 1.00 0.00  0.09  0.27 0.00 14.00 ⎟
⎜ 0.00 0.00  0.00 0.00 1.00 -1.45  0.64 0.00 21.00 ⎟
⎜ 0.00 1.00  0.00 0.00 0.00  0.27 -0.18 0.00  3.00 ⎟
⎜ 1.00 0.00  0.00 0.00 0.00  0.36  0.09 0.00 12.00 ⎟
⎝ 0.00 0.00  1.00 0.00 0.00  0.00  0.00 1.00  4.00 ⎠

which exemplifies the feasible solution (12,3,0,14,21,0,0,4). This is equivalent to multiplying the third row by 1/3.67 and then adding appropriate multiples of it to the other rows in order to zero their second entries - an operation very similar to that used in Gaussian Elimination.

If we next wanted to replace x₈ with x₃ in our basis we would use the fifth row (the row which defines x₈) to zero the third entry of each of the other rows to move to the solution (12,3,4,14,21,0,0,0).
This operation is known as "pivoting about" B₅₃.
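The pivot operation can be sketched in a few lines of Python. This is an illustrative sketch rather than a definitive implementation: the function name pivot is our own, rows are counted from zero and columns from one as in the text, and exact fractions are used so that the rounded two-decimal entries shown above can be compared with exact values. Starting from Sedgewick's "origin" tableau, pivoting about B₄₁, B₃₂ and B₅₃ in turn passes through the three solutions discussed above.

```python
from fractions import Fraction

def pivot(B, i, j):
    """Pivot about B_ij: row i is counted from zero, column j from one."""
    c = j - 1
    p = B[i][c]
    B[i] = [v / p for v in B[i]]              # scale the pivot row so B_ij = 1
    for k in range(len(B)):
        if k != i:
            f = B[k][c]
            B[k] = [a - f * r for a, r in zip(B[k], B[i])]   # zero column j elsewhere

q = [1, 1, 1, 0, 0, 0, 0, 0]
A = [
    [-1,  1, 0, 1, 0, 0, 0, 0],
    [ 1,  4, 0, 0, 1, 0, 0, 0],
    [ 2,  1, 0, 0, 0, 1, 0, 0],
    [ 3, -4, 0, 0, 0, 0, 1, 0],
    [ 0,  0, 1, 0, 0, 0, 0, 1],
]
b = [5, 45, 27, 24, 4]

# Row zero is the negated objective; the tableau is canonical for the basis x4..x8.
B = [[Fraction(-v) for v in q] + [Fraction(0)]]
B += [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]

pivot(B, 4, 1)                        # x1 replaces x7: solution (8,0,0,13,37,11,0,4)
first = [row[-1] for row in B]        # rightmost column: 8, 13, 37, 11, 8, 4
pivot(B, 3, 2)                        # x2 replaces x6: solution (12,3,0,14,21,0,0,4)
second = [row[-1] for row in B]       # rightmost column: 15, 14, 21, 3, 12, 4
pivot(B, 5, 3)                        # x3 replaces x8: solution (12,3,4,14,21,0,0,0)
print([str(v) for v in B[0]])
```

After the second pivot the exact entries 7/11 and -1/11 in row zero are the values shown rounded as 0.64 and -0.09 above.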

Pivoting about B_ij >0 corresponds to bringing x_j into the basis instead of x_k, where k is the unique value in {1,2,..,n} for which B_ik ≠ 0. We force x_k to be zero and allow x_j to be nonzero. Geometrically this corresponds to moving (x₁,x₂,x₃) away from the (j+4)th face and into the (k+4)th face.

This pivoting procedure gives an easy way of obtaining successive potential solutions to the problem. If, after pivoting, the rightmost column is non-negative then the new solution is feasible but as yet we have no way of guaranteeing that it is "better" than the last solution. This is done by careful choice of pivot point with reference to the top row.
If x_i is any basis variable then B_0i = 0. Thus if B_0j < 0 then making x_j positive nonzero at the expense of x_i will increase the objective function.

The rightmost column will transform as B_k(n+1)' = B_k(n+1) - B_kjB_i(n+1)/B_ij so we can ensure it remains nonnegative by choosing i to minimise B_i(n+1)/B_ij over those i such that B_ij >0.
B_ij ≤ 0 for all i ∈ {1,2,..,m} can only occur if the region defined by the constraint Ax=b is unbounded, ie. permits some x_i to be infinite.

At each stage, therefore, we first choose a column j satisfying B_0j <0. If there is more than one such column then we can choose the one with the smallest entry in row zero (greatest increment); the one that would give the greatest increase in the objective function (steepest descent); or select at random. Having selected j we then choose i to minimise B_i(n+1)/B_ij over those i such that B_ij >0. If there are two or more such, then we select at random or choose the one that removes the variable with lowest index from the basis (ie. minimises k where k is uniquely defined by B_ik ≠ 0). It is important to break ties in one of these ways rather than, say, always taking the first or last one found, in order to avoid cycling: an infinite sequence of pivots that do not increase the objective function.

Once there are no negative entries in the top row then we have an optimal solution.
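The complete loop can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the names pivot and simplex and the basis list are our own; the entering column is taken to be the lowest-indexed column with a negative entry in row zero (one of several permissible choices); and ties in the ratio test are broken by removing the basis variable of lowest index. Exact fractions avoid rounding problems.

```python
from fractions import Fraction

def pivot(B, i, j):
    """Pivot about B_ij: row i is counted from zero, column j from one."""
    c = j - 1
    p = B[i][c]
    B[i] = [v / p for v in B[i]]
    for k in range(len(B)):
        if k != i:
            f = B[k][c]
            B[k] = [a - f * r for a, r in zip(B[k], B[i])]

def simplex(B, basis):
    """Pivot until row zero has no negative entries.
    basis[i] is the column of the basis variable defined by row i (basis[0] is unused)."""
    n = len(B[0]) - 1
    while True:
        cols = [j for j in range(1, n + 1) if B[0][j - 1] < 0]
        if not cols:
            return                                   # optimal
        j = cols[0]
        rows = [i for i in range(1, len(B)) if B[i][j - 1] > 0]
        if not rows:
            raise ValueError("objective function is unbounded")
        # Ratio test; ties remove the basis variable of lowest index.
        i = min(rows, key=lambda r: (B[r][-1] / B[r][j - 1], basis[r]))
        pivot(B, i, j)
        basis[i] = j

q = [1, 1, 1, 0, 0, 0, 0, 0]
A = [
    [-1,  1, 0, 1, 0, 0, 0, 0],
    [ 1,  4, 0, 0, 1, 0, 0, 0],
    [ 2,  1, 0, 0, 0, 1, 0, 0],
    [ 3, -4, 0, 0, 0, 0, 1, 0],
    [ 0,  0, 1, 0, 0, 0, 0, 1],
]
b = [5, 45, 27, 24, 4]
B = [[Fraction(-v) for v in q] + [Fraction(0)]]
B += [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
basis = [None, 4, 5, 6, 7, 8]                        # start at Sedgewick's "origin"

simplex(B, basis)
x = [Fraction(0)] * len(q)
for i in range(1, len(B)):
    x[basis[i] - 1] = B[i][-1]
print(B[0][-1], [str(v) for v in x])
```

For the example problem this stops after four pivots at x=(9,9,4,5,0,0,33,0) with objective value 22, having passed through the solutions (8,0,0,13,37,11,0,4), (12,3,0,14,21,0,0,4) and (12,3,4,14,21,0,0,0) on the way.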

The cost of each pivot is (n-m+1)D+(m(n-m+1))M, where D denotes a division and M a multiplication. However, all the divisions are by the same value and one of them is a division into 1 (a reciprocal, R), so we can replace them with 1R+(n-m)M, giving 1R+((n-m)(m+1)+m)M.

The primal simplex algorithm as presented above needs a feasible solution having at most m nonzero values in order to begin. If such a solution cannot be found by inspection then we can use the primal simplex algorithm to find one by introducing m new variables x_n+1,x_n+2,..,x_n+m (adding one to each of the m constraint equations) and replacing the objective function with the sum Σ_i=n+1^n+m x_i, which is to be minimised (equivalently, its negation is maximised).

If we can achieve an objective value of 0 for this new problem then (possibly after a few further pivots to remove any remaining "new" variables from the basis) we will have a feasible solution to the original constraints.
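A sketch of this initialisation step is given below. It is illustrative only and rests on several assumptions of ours: b ≥ 0 (so that the new variables form a feasible starting basis), the sum of the new variables is minimised by maximising its negation, the names phase_one, simplex and pivot are hypothetical, and any "new" variables left in the basis at value zero are simply ignored rather than pivoted out.

```python
from fractions import Fraction

def pivot(B, i, j):
    """Pivot about B_ij: row i is counted from zero, column j from one."""
    c = j - 1
    p = B[i][c]
    B[i] = [v / p for v in B[i]]
    for k in range(len(B)):
        if k != i:
            f = B[k][c]
            B[k] = [a - f * r for a, r in zip(B[k], B[i])]

def simplex(B, basis):
    """Pivot until row zero has no negative entries (lowest-index rules)."""
    n = len(B[0]) - 1
    while True:
        cols = [j for j in range(1, n + 1) if B[0][j - 1] < 0]
        if not cols:
            return
        j = cols[0]
        rows = [i for i in range(1, len(B)) if B[i][j - 1] > 0]
        if not rows:
            raise ValueError("objective function is unbounded")
        i = min(rows, key=lambda r: (B[r][-1] / B[r][j - 1], basis[r]))
        pivot(B, i, j)
        basis[i] = j

def phase_one(A, b):
    """Return some x >= 0 with A x = b, or None if there is none. Assumes b >= 0."""
    m, n = len(A), len(A[0])
    rows = [[Fraction(v) for v in row]
            + [Fraction(int(k == i)) for k in range(m)]      # one new variable per row
            + [Fraction(rhs)]
            for i, (row, rhs) in enumerate(zip(A, b))]
    # Row zero for "maximise -(sum of new variables)", already reduced so that
    # the new (basis) variables have zero entries in it.
    top = ([-sum(r[c] for r in rows) for c in range(n)]
           + [Fraction(0)] * m
           + [-sum(r[-1] for r in rows)])
    B = [top] + rows
    basis = [None] + [n + 1 + i for i in range(m)]
    simplex(B, basis)
    if B[0][-1] != 0:
        return None                                          # constraints are infeasible
    x = [Fraction(0)] * n
    for i in range(1, m + 1):
        if basis[i] <= n:
            x[basis[i] - 1] = B[i][-1]
    return x

A = [
    [-1,  1, 0, 1, 0, 0, 0, 0],
    [ 1,  4, 0, 0, 1, 0, 0, 0],
    [ 2,  1, 0, 0, 0, 1, 0, 0],
    [ 3, -4, 0, 0, 0, 0, 1, 0],
    [ 0,  0, 1, 0, 0, 0, 0, 1],
]
b = [5, 45, 27, 24, 4]
x = phase_one(A, b)
print([str(v) for v in x])
```

Which feasible solution is returned depends on the pivot rules chosen; the only property relied upon is that it satisfies Ax=b with x ≥ 0.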

Application to 3D Convex Hulls
    The standard form for the primal simplex algorithm is not quite the form of the problem that arises when trying to find the vertex of a convex hull H =(H,d,V) having centre c and orientation A nearest the plane r.q=g since we do not necessarily have (0,0,0) as a vertex.
We take variables x₁=r₁,x₂=r₂,x₃=r₃, and slack variables x₄,x₅,..,x_F+3.
Our equations are
Maximise q₁x₁+q₂x₂+q₃x₃
Subject to (An_i)₁x₁+(An_i)₂x₂+(An_i)₃x₃+ x_3+i = e_i ; i=1,2,..F.
where e_i = d_i+(An_i).c ; i=1,2,..F.
We do not impose x₁,x₂,x₃ ≥ 0 and do not necessarily have 0 ∈ H. We can assume that d ≥ 0 however.
The matrix for these equations is the (F+1)×(F+4) matrix

B = ⎛ -q₁ -q₂ -q₃   0 0 0..0   0   ⎞
    ⎜               1 0 0..0   e₁  ⎟
    ⎜      AH       0 1 0..0   e₂  ⎟
    ⎜               . . ....   ..  ⎟
    ⎝               0 0 0..1   e_F ⎠
Our initial solution is r=Av_h where v_h ∈ V is one of the vertices (preferably one likely to be only a few edges away from the optimal vertex) so that

x₁ = (Av_h)₁
x₂ = (Av_h)₂
x₃ = (Av_h)₃
x_3+i = e_i-n_i.v_h ; i=1,2,..F.
Since v_h is a vertex, it must lie in at least three of the planes and so at least three of the slack variables x_3+i are zero (non-basis). Let these be x_3+s,x_3+t, and x_3+u.
The B presented above is in canonical form with respect to the basis x₄,x₅,...,x_3+F. We need to rearrange it so that it is in canonical form with respect to the basis formed by replacing x_3+s,x_3+t, and x_3+u with x₁,x₂, and x₃. The s^th,t^th, and u^th equations taken together enable us to do this by expressing x₁,x₂, and x₃ in terms of x_3+s,x_3+t, and x_3+u. The three equations are
(An_s)₁x₁ + (An_s)₂x₂ + (An_s)₃x₃ + x_3+s = e_s
(An_t)₁x₁ + (An_t)₂x₂ + (An_t)₃x₃ + x_3+t = e_t
(An_u)₁x₁ + (An_u)₂x₂ + (An_u)₃x₃ + x_3+u = e_u
which we can rewrite as

(AN)^T ⎛ x₁ ⎞ + ⎛ x_3+s ⎞ = ⎛ e_s ⎞
       ⎜ x₂ ⎟   ⎜ x_3+t ⎟   ⎜ e_t ⎟
       ⎝ x₃ ⎠   ⎝ x_3+u ⎠   ⎝ e_u ⎠
where N=(n_s,n_t,n_u) is bound to be invertible since the three normals must be linearly independent (or the planes would not have met in the single point v_h).
Let M=((AN)^T)^-1=(N^TA^T)^-1=AN^-T. We have

⎛ x₁ ⎞ + M ⎛ x_3+s ⎞ = M ⎛ e_s ⎞
⎜ x₂ ⎟     ⎜ x_3+t ⎟     ⎜ e_t ⎟
⎝ x₃ ⎠     ⎝ x_3+u ⎠     ⎝ e_u ⎠
and so we can replace the sth, tth, and uth rows of B with the rows

1 2 3 ... s t u F+4
( 1 0 0 ... m₁₁ 0..0 m₁₂ 0..0 m₁₃ 0..0 e_s'=m₁₁e_s+m₁₂e_t+m₁₃e_u )
( 0 1 0 ... m₂₁ 0..0 m₂₂ 0..0 m₂₃ 0..0 e_t'=m₂₁e_s+m₂₂e_t+m₂₃e_u )
and ( 0 0 1 ... m₃₁ 0..0 m₃₂ 0..0 m₃₃ 0..0 e_u'=m₃₁e_s+m₃₂e_t+m₃₃e_u )
(Cost 9M)
Having done this we subtract multiples of these three rows from all the other rows. For i ≠ 0, s, t, u we replace the ith row

1 2 3 ... s ... t ... i ... u ... F+4
( b_i1 b_i2 b_i3 0 ... 0 0 ... 0 0 ... 1 0 ... 0 0 ... e_i )
with
( 0 0 0 0 ... b_is' 0 ... b_it' 0 ... 1 0 ... b_iu' 0 ... e_i' )
where
b_is' = -b_i1m₁₁-b_i2m₂₁-b_i3m₃₁
b_it' = -b_i1m₁₂-b_i2m₂₂-b_i3m₃₂
b_iu' = -b_i1m₁₃-b_i2m₂₃-b_i3m₃₃
e_i'  = e_i-b_i1e_s'-b_i2e_t'-b_i3e_u'
(Cost 12M for each of F-3 rows).
We replace the top row

1 2 3 ... s ... t ... i ... u ... F+4
( -q₁ -q₂ -q₃ 0 ... 0 0 ... 0 0 ... 0 0 ... 0 0 ... 0 )
with
( 0 0 0 0 ... q_s' 0 ... q_t' 0 ... 0 0 ... q_u' 0 ... Q )
where
q_s' = q₁m₁₁+q₂m₂₁+q₃m₃₁
q_t' = q₁m₁₂+q₂m₂₂+q₃m₃₂
q_u' = q₁m₁₃+q₂m₂₃+q₃m₃₃
Q  = q₁e_s'+q₂e_t'+q₃e_u'
(Cost 12M).
Having performed this "simultaneous triple-pivot", we have B in canonical form with respect to a vertex of the hull and we can proceed with the primal simplex algorithm.
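The core of the triple-pivot is the 3×3 inverse M. The sketch below is illustrative only: the names inverse3 and triple_pivot are our own, the normals passed in are assumed to be already rotated (ie. they are the An_i), planes are indexed from zero, and the four side planes and the top plane of the Sedgewick example are reused as a five-faced "hull" without the planes x₁=0, x₂=0, x₃=0. It computes M, the vertex coordinates e_s', e_t', e_u', and the slack of every plane at that vertex.

```python
from fractions import Fraction

def inverse3(T):
    """Inverse of a 3x3 matrix by the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = T
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("the three planes do not meet in a single point")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[Fraction(v) / det for v in row] for row in adj]

def triple_pivot(normals, offsets, s, t, u):
    """Vertex where planes s, t, u meet, with the slack of every plane there."""
    # The rows of (AN)^T are the three (rotated) normals.
    M = inverse3([normals[s], normals[t], normals[u]])
    rhs = [offsets[s], offsets[t], offsets[u]]
    x = [sum(M[r][c] * rhs[c] for c in range(3)) for r in range(3)]   # e_s', e_t', e_u'
    slack = [e_i - sum(n_k * x_k for n_k, x_k in zip(n, x))
             for n, e_i in zip(normals, offsets)]
    return M, x, slack

normals = [(-1, 1, 0), (1, 4, 0), (2, 1, 0), (3, -4, 0), (0, 0, 1)]
offsets = [5, 45, 27, 24, 4]
M, x, slack = triple_pivot(normals, offsets, 2, 3, 4)
print([str(v) for v in x], [str(v) for v in slack])
```

With planes 2, 3 and 4 (counting from zero) this gives the vertex (12,3,4) with slacks (14,21,0,0,0), agreeing with the solution (12,3,4,14,21,0,0,0) reached by ordinary pivoting in the worked example; the entries 4/11, 1/11, 3/11 and -2/11 of M are the values shown rounded as 0.36, 0.09, 0.27 and -0.18 in that tableau.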

We cannot use the algorithm exactly as given above because the assumption x ≥ 0 does not hold for the first three components of x. We have "lost" the three planes corresponding to the first three variables (x₁=0, x₂=0, and x₃=0) and so must be more careful in our choice of pivot. We get round this by keeping x₁,x₂, and x₃ in the basis at all times. Recalling that pivoting about B_ij >0 corresponds to bringing x_j into the basis at the expense of x_k, where k is the unique value in {1,2,..,n} for which B_ik ≠ 0, we see that we will never want to pivot about B_ij if j=1,2, or 3 and must avoid pivoting about B_ij if any of B_i1,B_i2, or B_i3 is nonzero.
This means that we cannot, as before, for a given choice of j choose i to minimise B_i(n+1)/B_ij over those i such that B_ij >0 so we cannot guarantee that the rightmost column will remain nonnegative.
However, if we choose i to minimise B_i(n+1)/B_ij over those i such that B_ij >0 and B_i1=B_i2=B_i3=0 then the only entries in the rightmost column that can become negative are those belonging to x₁, x₂, and x₃, which is acceptable since this corresponds merely to vertices having negative coordinates.
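The restricted row choice can be sketched as follows. The function name choose_pivot_row is hypothetical, and the tableau used is the one reached in the worked example after pivoting about B₅₃ (exact fractions in place of the rounded entries), in which x₁, x₂ and x₃ are all in the basis.

```python
from fractions import Fraction

def choose_pivot_row(B, j, free=(1, 2, 3)):
    """Row i minimising B_i(n+1)/B_ij over rows with B_ij > 0 whose entries in
    the free columns are all zero; None if no row qualifies.
    Rows are counted from zero (row zero is the objective), columns from one."""
    c = j - 1
    rows = [i for i in range(1, len(B))
            if B[i][c] > 0 and all(B[i][f - 1] == 0 for f in free)]
    if not rows:
        return None
    return min(rows, key=lambda i: B[i][-1] / B[i][c])

rows_text = [
    "0 0 0 0 0 7/11 -1/11 1 19",
    "0 0 0 1 0 1/11 3/11 0 14",
    "0 0 0 0 1 -16/11 7/11 0 21",
    "0 1 0 0 0 3/11 -2/11 0 3",
    "1 0 0 0 0 4/11 1/11 0 12",
    "0 0 1 0 0 0 0 1 4",
]
B = [[Fraction(v) for v in line.split()] for line in rows_text]

print(choose_pivot_row(B, 7), choose_pivot_row(B, 6), choose_pivot_row(B, 8))
```

For column 7 the restricted and unrestricted tests agree on row 2. For column 6 the unrestricted test would pick row 3, removing x₂ from the basis, whereas the restricted test picks row 1. For column 8 the only candidate row defines x₃, so no row qualifies.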
On each iteration we will have a choice of three pivot columns, the cost of choosing the pivot row is (F-3)D or less, and the cost of the pivot operation is 4D+4FM simplifying to 1R+(4F+3)M.

Generating the Vertices of a Convex Hull

The primal simplex algorithm's ability to move from vertex to vertex of a convex hull is potentially useful as a means to generate the vertices V from H and d. Might AV be so derived from AH and d faster than by explicitly rotating each vertex? A "Vertex Generation Algorithm" has no need for an objective function so we drop the zeroth row of B. At each stage we will have a free choice of three pivot columns; however (apart from on the first iteration) one of these would take us back to the vertex we came from.

Topologically, we need to visit each node of a trinary graph. A depth-first recursive traversal will serve but requires B to be placed on a stack at every bifurcation. Each vertex is the intersection of three planes (the i,j, and kth say). On travelling from one vertex <i,j,k> to another <i,j,l> we flag <i,j> as a "traversed edge-pair". We begin our first iteration with no traversed pairs. At each subsequent iteration we will be at a vertex <i,j,k> considering the i,j, and kth columns as possible pivot columns.
If <i,j> is a traversed pair we know not to choose the kth column because <i,j,k> must have been already visited. Likewise we do not choose the jth column if <i,k> has been traversed, and reject the ith column if <j,k> has been traversed. If we reject all three pivot columns we exit from the recursive procedure; if only one column is acceptable we choose it; if two (or all three) are acceptable we must choose both (or all three) and implement them one after another by preserving B and then calling the traversal procedure recursively.
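The recursive traversal can be sketched as below. This is a simplified illustration under assumptions of ours, not a definitive implementation: the name hull_vertices is hypothetical; rows and columns are both counted from zero here because there is no objective row; the first three planes are assumed to meet at a vertex of the hull and to allow the three initial diagonal pivots; exactly three planes are assumed to meet at every vertex; and B is copied at every step rather than only at bifurcations.

```python
from fractions import Fraction

def pivot(B, i, j):
    """Pivot about B[i][j] (both indices counted from zero in this sketch)."""
    p = B[i][j]
    B[i] = [v / p for v in B[i]]
    for k in range(len(B)):
        if k != i:
            f = B[k][j]
            B[k] = [a - f * r for a, r in zip(B[k], B[i])]

def hull_vertices(normals, offsets):
    """Vertices of the hull {x : n_i . x <= e_i}, found by walking its edges."""
    F = len(normals)
    # Row r: n_r . x + slack_r = e_r. Columns 0..2 hold x, columns 3..F+2 the slacks.
    B = [[Fraction(v) for v in n] + [Fraction(int(r == c)) for c in range(F)] + [Fraction(e)]
         for r, (n, e) in enumerate(zip(normals, offsets))]
    basis = [3 + r for r in range(F)]
    for r in range(3):                 # bring x1, x2, x3 into the basis (see assumptions)
        pivot(B, r, r)
        basis[r] = r
    found, traversed = set(), set()

    def visit(B, basis):
        point = {c: B[r][-1] for r, c in enumerate(basis) if c < 3}
        found.add((point[0], point[1], point[2]))
        planes = [c for c in range(3, F + 3) if c not in basis]   # planes meeting here
        for j in planes:
            edge = frozenset(planes) - {j}      # the two planes we stay in
            if edge in traversed:
                continue
            # Restricted ratio test: only rows that define slack variables.
            rows = [r for r in range(F) if basis[r] >= 3 and B[r][j] > 0]
            if not rows:
                continue                        # unbounded edge
            i = min(rows, key=lambda r: B[r][-1] / B[r][j])
            traversed.add(edge)
            B2, basis2 = [row[:] for row in B], basis[:]
            pivot(B2, i, j)
            basis2[i] = j
            visit(B2, basis2)

    visit(B, basis)
    return found

# A cube of side 2 centred on the origin.
normals = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (-1, 0, 0), (0, -1, 0), (0, 0, -1)]
cube = hull_vertices(normals, [1] * 6)
print(len(cube))
```

For the cube all twelve edges are traversed once each and the eight corners (±1,±1,±1) are collected.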

Though this method will generate V it has two drawbacks. Firstly, it is recursive and requires the large matrix B to be preserved over the recursion. Secondly, it will generate any vertex at which more than three planes intersect (the peak of a square based pyramid for example) more than once.
It is not easy to solve either of these problems if all that is known about H is H and d (possibly the case for the hull-compiler), but if the connectivity of the hull is known (likely for the hull-rotater) then we can eliminate both these problems if we provide the algorithm with an "edge-walk" {i₁,i₂,...i_P}, a sequence of pivot columns that specifies a non-recursive traversal of the hull that traverses all the edges of the hull at the expense of possibly visiting some vertices more than once.
We can save the algorithm a further (F-3)D per iteration by providing the pivot row as well, effectively providing a "vertex-walk" {(i₁,j₁),(i₂,j₂),..(i_P,j_P)}.
Each iteration will take 1R+(4F-1)M since we no longer have the top row of B to maintain. This compares badly with the 9M (or 6M+3T) required to rotate a v ∈ V by A.

We conclude therefore that it is faster to rotate hull vertices than to recreate them from pre-rotated planes.

Copyright (c) Ian C G Bell 1998
Web Source: www.iancgbell.clara.net/maths or www.bigfoot.com/~iancgbell/maths
18 Nov 2006.
