I'm solving an underdetermined system $Ax=b$ with the additional constraint that the entries of $x$ are non-negative.
This can be done with QuadraticOptimization by
1. putting $||Ax-b||^2$ in the objective term,
2. putting $Ax=b$ in the constraint term, or
3. putting it in both the objective and the constraint term.
Choosing 1. gives a smoother solution, and a better guess of the truth, than either formulation involving constraints. Any idea why? Is it possible to encode this bias explicitly?
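For intuition, the same effect can be sketched outside Mathematica: when a least-squares solver is applied to an underdetermined system, the minimizer it returns is the minimum-norm exact solution, so formulation 1 carries an implicit smallest-$\|x\|$ bias that a pure feasibility search (formulation 2) need not have. A minimal Python analogue (the matrix sizes and random data are illustrative assumptions, not taken from this post):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 20                      # underdetermined: fewer equations than unknowns
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# lstsq on an underdetermined system returns the minimum-norm exact solution
x_min, *_ = np.linalg.lstsq(A, b, rcond=None)

# Any other exact solution differs by a null-space component of A
null_basis = np.linalg.svd(A)[2][m:]          # rows spanning the null space of A
x_other = x_min + null_basis.T @ rng.standard_normal(n - m)

print(np.allclose(A @ x_min, b), np.allclose(A @ x_other, b))  # True True
print(np.linalg.norm(x_min) < np.linalg.norm(x_other))          # True
```

Because `x_min` lies in the row space of `A` and the added component lies in its null space, the two pieces are orthogonal, so every other exact solution has strictly larger norm.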
Background: Hausdorff moment problem: part 2, numeric stability post
(*Recover PDF of ArcSinDistribution[] at n points from m moments*)
{n, m} = {100, 20};
xvals = .5 (Cos[(Range[n] - .5) Pi/n] + 1); (*Chebyshev spacing*)
yvals = PDF[ArcSinDistribution[], #] & /@ xvals;
dec[yvals_] := {xvals, yvals}\[Transpose]
originalPlot =
ListLinePlot[dec@yvals, PlotLabel -> "Original Problem",
ImageSize -> 300, Ticks -> None];
(*Form linear system Ax=b*)
{ii, zeros} = {IdentityMatrix[n], ConstantArray[0, n]};
A = Table[
ChebyshevT[k, 2 p - 1], {p, xvals}, {k, 0, m - 1}]\[Transpose];
b = A . yvals;
(*Three ways of solving Ax=b with x>0 constraint*)
sol1 = {"||Ax-b||^2",
QuadraticOptimization[{Transpose[A] . A, -Transpose[A] . b}, {ii,
zeros}, {}]};
sol2 = {"Ax=b", QuadraticOptimization[0, {ii, zeros}, {A, -b}]};
sol3 = {"both",
QuadraticOptimization[{Transpose[A] . A, -Transpose[A] . b}, {ii,
zeros}, {A, -b}]};
{solNames, sols} = {sol1, sol2, sol3}\[Transpose];
solutionsPlot = ListPlot[dec /@ sols, Joined -> {True, False, True},
PlotLabel -> "Solutions", Filling -> {2 -> Axis},
PlotLegends -> solNames, ImageSize -> 300, Ticks -> None];
Grid[{{originalPlot, solutionsPlot}}]
nf[val_] := NumberForm[val, {3, 3}];
TableForm[{nf[Norm[A . # - b]], nf@Norm[# - yvals]} & /@ sols,
TableHeadings -> {solNames, {"train error", "test error"}}]
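For reference, the pair `{q, c} == {Transpose[A] . A, -Transpose[A] . b}` passed to `QuadraticOptimization` encodes the least-squares objective up to an additive constant, since the solver minimizes $\frac12 x\cdot q\cdot x + c\cdot x$:

$$\tfrac12\, x^\top A^\top A\, x - b^\top A\, x \;=\; \tfrac12\,\|Ax-b\|^2 - \tfrac12\,\|b\|^2,$$

so minimizing it over $x \ge 0$ is equivalent to minimizing $\|Ax-b\|^2$ over $x \ge 0$.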

Comments:

- `QuadraticOptimization` takes your linear constraint VERY seriously, and searches for the best fit within a restricted, much smaller parameter subspace than in case 1, where there are no linear constraints.
- Regarding `sol2`: you have `0` as the first argument in `QuadraticOptimization[0, ...]`. But what should then be minimized, when the quadratic objective is constant zero for every possible `x`? That makes no sense to me.
- The constraint is `A.x - b == 0`, so there is no surprise that `sol2` and `sol3` have smaller error, because `sol1` did not contain that constraint. But if you compute the error relative to `originalPlot`, then `sol1` has the smallest error.
- Do you mean that `1/2 x . q . x + c . x` (where `{q, c} == {Transpose[A] . A, -Transpose[A] . b}`) is the same as `||Ax-b||^2`?
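That last question can be checked numerically: minimizing $\frac12 x\cdot q\cdot x + c\cdot x$ with $q = A^\top A$, $c = -A^\top b$ over $x \ge 0$ reproduces the nonnegative least-squares minimizer of $\|Ax-b\|^2$. A small Python sketch (the random test data is an illustrative assumption):

```python
import numpy as np
from scipy.optimize import nnls, minimize

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
b = rng.standard_normal(8)
Q, c = A.T @ A, -A.T @ b          # the {q, c} pair of the quadratic form

# Reference: nonnegative least squares, min ||Ax-b||^2 s.t. x >= 0
x_nnls, _ = nnls(A, b)

# Minimize 1/2 x.Q.x + c.x subject to x >= 0 via a bound-constrained solver
res = minimize(lambda x: 0.5 * x @ Q @ x + c @ x,
               np.zeros(5), jac=lambda x: Q @ x + c,
               bounds=[(0, None)] * 5)

print(np.allclose(res.x, x_nnls, atol=1e-4))  # True: same minimizer
```

The two objectives differ only by the constant $\frac12\|b\|^2$, so they share the same minimizer.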