
I'm reading Introduction to Linear Optimization by D. Bertsimas and J. Tsitsiklis. In Chapter 2, Section 2, the authors provide the following definition for a basic solution:

"Consider a polyhedron $P$ defined by linear equality and inequality constraints, and let $x^* \in \mathbb{R}^n$. The vector $x^*$ is a basic solution if:

  • (i) All equality constraints in $P$ are active at $x^*$.
  • (ii) Out of all constraints in $P$ which are active at $x^*$, $n$ of them are linearly independent."

However, upon reading other texts, I have encountered a different definition for a basic solution, specifically for polyhedra in standard form:

"Let $P = \{x \in \mathbb{R}^n : Ax = b, x \geq 0\}$ be a polyhedron in standard form, where $A$ is an $m \times n$ matrix with $m \leq n$. Let $x^* \in \mathbb{R}^n$. Then $x^*$ is a basic solution if:

  • (a) $Ax^* = b$
  • (b) The columns of $A$ corresponding to nonzero components of $x^*$ are linearly independent."

My problem:

I'm struggling to see how these two definitions are equivalent for polyhedra in standard form. Specifically, how does condition (ii) in the first definition (having $n$ linearly independent active constraints) imply condition (b) in the second definition (linear independence of columns corresponding to nonzero components)?

Any clarification or proof would be greatly appreciated.


1 Answer

Both definitions are equivalent.

We begin with some simplification. First, by adding slack variables, all inequality constraints (other than the nonnegativity constraints) can be converted to equality constraints, so that the only remaining inequality constraints are the nonnegativity constraints. Further, the matrix $A$ can be assumed to have rank $m \le n$, because any linearly dependent rows would correspond to constraints that are either redundant (and can be dropped) or inconsistent (in which case there are no solutions, basic or otherwise). Since the column rank and row rank of a matrix are equal, $A$ therefore has $m$ linearly independent columns.
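The slack-variable conversion mentioned above can be sketched numerically. Here is a minimal example in Python with NumPy; the matrices $A$, $b$, and the point $x$ are made-up illustrations, not taken from the text:

```python
import numpy as np

# Made-up example of the slack-variable conversion: the inequality system
# A x <= b becomes the standard-form system [A  I][x; s] = b with s >= 0.
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([4.0, 5.0])
m = A.shape[0]

A_std = np.hstack([A, np.eye(m)])     # one slack column per inequality

x = np.array([1.0, 1.0])              # A x = (3, 4) <= b componentwise
s = b - A @ x                         # slacks s = b - A x = (1, 1) >= 0
assert np.allclose(A_std @ np.concatenate([x, s]), b)
assert np.all(s >= 0)
```

Any point satisfying $Ax \le b$ extends to a solution of the enlarged equality system in this way, with the slacks nonnegative.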

Definition 1 implies Definition 2

The constraints $Ax=b$ give $m$ linearly independent active constraints, so the remaining $n-m$ constraints in the linearly independent set must be nonnegativity constraints that are active at $x$, i.e. they hold with equality and set the corresponding variables to $0$. Reorder the variables so that the variables corresponding to these active nonnegativity constraints come last. We can then partition the vector $x$ as $[x_B \mid x_N]$, where $x_B$ has dimension $m$, $x_N$ has dimension $n-m$, and $x_N=0$. Correspondingly, we can partition $A$ as $[A_B \mid A_N]$, where $A_B$ is $m \times m$ and $A_N$ is $m \times (n-m)$. The equation $x_N=0$ can be written as $I\,x_N=0$, where $I$ is the identity matrix of dimension $(n-m) \times (n-m)$, or alternatively

$$ \begin{bmatrix} 0 & I \end{bmatrix} \begin{bmatrix} x_B\\ x_N \end{bmatrix} =0 $$

Stacking these together, we can write:

$$ \begin{bmatrix} A_B & A_N\\ 0 & I \end{bmatrix} \begin{bmatrix} x_B\\ x_N \end{bmatrix} = \begin{bmatrix} b\\ 0 \end{bmatrix} $$

Definition 1 says that the matrix

$$C = \begin{bmatrix} A_B & A_N\\ 0 & I \end{bmatrix}$$ is of full rank.

By subtracting a suitable linear combination of the rows of $[0 \mid I]$, we can reduce each row of $A_N$ to zero. Since the entries to the left of $I$ are all zero, these row operations do not affect $A_B$. This reduces $C$ to the block-diagonal form

$$\begin{bmatrix} A_B & 0\\ 0 & I \end{bmatrix}$$

Since this matrix is of full rank, it follows that $A_B$ is of full rank and hence has linearly independent columns. Since the nonzero elements of $x$ are a subset of $x_B$, the columns of $A$ corresponding to these nonzero elements are a subset of the columns of $A_B$ and are therefore linearly independent. Definition 2 is therefore satisfied.
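As a sanity check of this direction, here is a small numerical sketch in Python with NumPy. The matrix $A$ and the point $x$ are made-up, not from the text; the code builds the stacked matrix $C$ from the proof and verifies that full rank of $C$ forces full rank of $A_B$:

```python
import numpy as np

# Made-up example: A is 2x4 with rank 2, and x_star = (2, 1, 0, 0)
# satisfies A x = b with the nonnegativity constraints x3 = x4 = 0 active.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 2.0, 0.0, 1.0]])
x_star = np.array([2.0, 1.0, 0.0, 0.0])
b = A @ x_star
m, n = A.shape

# Partition as in the proof: x_B = (x1, x2), x_N = (x3, x4).
A_B, A_N = A[:, :m], A[:, m:]

# Stack Ax = b with the active constraints x_N = 0 into the matrix C.
C = np.block([[A_B, A_N],
              [np.zeros((n - m, m)), np.eye(n - m)]])

# Definition 1: n linearly independent active constraints, i.e. rank(C) = n.
assert np.linalg.matrix_rank(C) == n
# The row reduction in the proof leaves the block-diagonal matrix, so A_B
# itself must have full rank m -- which yields Definition 2's condition.
assert np.linalg.matrix_rank(A_B) == m
```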

Definition 2 implies Definition 1

Let $x_P$ denote the subset of $k \le m$ positive variables of $x$. Definition 2 says that the columns of $A$ corresponding to $x_P$ are linearly independent. Since $A$ has rank $m$, we can add $m-k$ further linearly independent columns of $A$ to obtain a set of $m$ linearly independent columns that includes the columns corresponding to $x_P$. Let the variables corresponding to this set of $m$ columns be $x_B$, and again reorder the variables if necessary so that $x_B$ consists of the first $m$ variables of $x$, i.e. $x=[x_B \mid x_N]$. We have $x_N=0$.

As before, we can stack the equations together to get:

$$ \begin{bmatrix} A_B & A_N\\ 0 & I \end{bmatrix} \begin{bmatrix} x_B\\ x_N \end{bmatrix} = \begin{bmatrix} b\\ 0 \end{bmatrix} $$

To show that definition 1 is satisfied, we need to show that the matrix

$$C = \begin{bmatrix} A_B & A_N\\ 0 & I \end{bmatrix}$$ is of full rank.

We know that $A_B$ is of full rank ($m$), so every column of $A_N$ can be expressed as a linear combination of the columns of $A_B$. In other words, we can perform column operations to reduce $A_N$ to zero. Since the entries below $A_B$ are all zero, performing these column operations on $C$ does not affect the $I$ block at the bottom right. This reduces $C$ to the block-diagonal matrix

$$\begin{bmatrix} A_B & 0\\ 0 & I \end{bmatrix}$$

Since both the blocks $A_B$ and $I$ are of full rank, it follows that the whole matrix $C$ is also of full rank, i.e. of rank $n$. In other words, there are $n$ linearly independent active constraints, and therefore Definition 1 is satisfied.
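This converse direction can be checked the same way. The example below is made-up (not from the text): a point whose positive components pick out linearly independent columns of $A$, from which the stacked matrix $C$ is built and shown to have rank $n$:

```python
import numpy as np

# Made-up example for the converse: x_star = (3, 0, 2, 0) solves A x = b,
# and the columns of A for its positive entries are linearly independent.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
x_star = np.array([3.0, 0.0, 2.0, 0.0])
b = A @ x_star
m, n = A.shape

pos = np.flatnonzero(x_star)                          # indices of x_P
assert np.linalg.matrix_rank(A[:, pos]) == len(pos)   # Definition 2 holds

# Here x_P already has m elements, so x_B = x_P; in general one would
# append further linearly independent columns of A until the set has size m.
basic = pos
nonbasic = np.setdiff1d(np.arange(n), basic)
A_B, A_N = A[:, basic], A[:, nonbasic]
C = np.block([[A_B, A_N],
              [np.zeros((n - m, m)), np.eye(n - m)]])

# Both diagonal blocks have full rank, so rank(C) = n: Definition 1 holds.
assert np.linalg.matrix_rank(C) == n
```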

Notes

  1. Those familiar with the simplex algorithm will recognize the subscripts $B$ and $N$ as referring to basic and nonbasic variables, but the proof above does not require any knowledge of the simplex algorithm.

  2. Some variables in $x_B$ might be zero (in simplex terminology, this is the phenomenon of degeneracy). To understand this situation, rewrite $Ax=b$ as $A_B x_B=b$ (since $x_N=0$). Since $A_B$ is of full rank, this gives $x_B=A_B^{-1}b$, which shows that any zero elements of $x_B$ are a consequence of $Ax=b$ itself. Therefore, although $x_\ell=0$ for every variable $x_\ell$ that is in $x_B$ but not in $x_P$, the constraint $x_\ell=0$ is not linearly independent of the others. This is why we must augment $x_P$ to obtain $x_B$ and add the constraints $x_i=0$ only for variables $x_i$ in $x_N$.
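The degeneracy phenomenon in Note 2 can be seen in a tiny made-up example (the basis matrix and $b$ below are illustrative, not from the text): $b$ is a multiple of one basic column, so $x_B = A_B^{-1} b$ comes out with a zero component even though that variable is basic:

```python
import numpy as np

# Made-up degenerate example: b = 2 * (first column of A_B), so the second
# basic variable comes out zero even though it belongs to x_B.
A_B = np.array([[1.0, 1.0],
                [0.0, 1.0]])          # full-rank basis matrix for x_B
b = np.array([2.0, 0.0])

# With x_N = 0, Ax = b reduces to A_B x_B = b, so x_B = A_B^{-1} b.
x_B = np.linalg.solve(A_B, b)
# x_B = (2, 0): the zero is forced by Ax = b itself, not by an
# additional linearly independent constraint x2 = 0.
assert np.allclose(x_B, [2.0, 0.0])
```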

