Least Squares Solutions to Over- or Underdetermined Systems Least Squares Solutions to Over- or Underdetermined Systems
It often happens in applications that a linear system of equations Ax = b either does not have a solution or has infinitely many solutions. Applications often use least... Least Squares Solutions to Over- or Underdetermined Systems

It often happens in applications that a linear system of equations Ax = b either does not have a solution or has infinitely many solutions. Applications often use least squares to create a problem that has a unique solution.

Overdetermined systems

Suppose the matrix A has dimensions m by n and the right hand side vector b has dimension m. Then the solution x, if it exists, has to have dimension n. If m > n, i.e. we have more equations than unknowns, the system is overdetermined. The equations will be inconsistent in general. This is typical, for example, in linear regression.

In this case, you can use the least square criterion to determine a solution. Instead of demanding that Ax = b, we look for an x than makes the difference between Ax and bas small as possible, as measured by the 2-norm (root mean square). That is, we pick xto minimize

|| Ax - b||_2^2

meaning we solve the system as best we can, best as measured by the 2-norm.

Underdetermined systems

If the number of rows in the matrix A, i.e. the number of equations, is less than the number of columns, i.e. the number of unknowns, then the system is underdetermined. In general there were be infinitely many solutions. It’s also possible that there are no solutions because the equations are inconsistent.

In this case, we can use least squares to assure that a solution exists, and to decide which of the many possible solutions to choose. Here we want to find the x that minimizes

|| Ax - b ||_2^2 + ||x||_2^2

If there are values of x that satisfy Ax = b then this will chose the solution with least norm.

Least squares solutions and SVD

If the singular value decomposition (SVD) of a real matrix A is given by

A = U \Sigma V^T
and A has rank r, then the least squares solution to the system Ax = b is given explicitly by

x = \sum_{i=1}^r \frac{u_i^T b}{\sigma_i} v_i

where the u‘s are the columns of U and the v‘s are the columns of v.

Note that the denominators are not zero; the fact that A has rank r means that it has rpositive singular values.

Furthermore, the least squares residual, i.e. the degree to which Ax differs from b, is given by

||Ax -b ||_2^2 = \sum_{i = r+1}^m (u_i^Tb)^2

Note that if the matrix A has rank m, then the least squares problem can be solved exactly, and the right side above is an empty sum.

See, for example, Golub and Van Loan’s classic book Matrix Computations.


 

Original Source

John Cook

John Cook

Companies come to me for help with probability, machine learning, mathematical modeling, … anything that falls under applied math, statistics, and computing. Clients have included large companies such as Amazon, Google, Microsoft, and Amgen, as well as a number of law firms, start-ups, and smaller businesses. I help companies take advantage of their data by combining it with expert opinion, uncovering latent insights, creating mathematical models, overcoming computational difficulties, and interpreting the results.