A failed attempt to use an SDP to find a nonorthogonal factorization

I’m looking at a nonorthogonal factorization problem: given a symmetric matrix A, write it as A = GDG^t where the columns of G all have \ell_\infty norm less than one, D is diagonal, and G,D are chosen to achieve  \vert\vert\vert A \vert\vert\vert = \min \|\text{diag}(D)\|_1 , where the minimization is taken over all such G,D.

Why do I care about this problem? My advisor suggested it when I mentioned my interest in copositive programming, because it’s likelier that I’ll be able to prove something in this area than in that— the positivity constraints in completely positive programs make them NASTY.

What’s the usefulness of such a factorization? For one, it would allow statisticians to use nonorthogonal factor models, and for two, it might be useful (if you generalize it to the case A = UDV^t) for collaborative filtering type problems like the Netflix problem: here V would represent categories, U would represent user weights, and D would represent the universal importance of the factors.

The “norm” is like a dictionary norm, where the dictionary is a subset of the rank one matrices– minimizing the norm corresponds intuitively to minimizing the complexity of the explanation of A. Note that the number of columns of G aren’t specified. My advisor hinted that you can assume this norm is achieved when G is square — I have no idea why that should be the case. Two more interesting questions: why use the \ell_1 norm, and why constrain the \ell_\infty norm of the columns of G? I don’t think, as has been suggested to me, that the \ell_1 norm is there for its sparsity promoting properties, since one would always expect \text{diag}(D) to have no zeros (since we could just drop the columns of G corresponding to zero entries on D‘s diagonal).

I attempted using the following SDP to find the desired factorization:
 \displaystyle \min \sum_{i=1}^k (\frac{1}{P_{kk}}-\frac{1}{N_{kk}})
 \displaystyle \text{ subject to} \begin{pmatrix} A & G & 0 & 0 \\ G^t & P+N & 0 & 0 \\ 0 & 0 & -A & G \\ 0 & 0 & G^t & -(P+N)  \end{pmatrix} \succcurlyeq 0
 \displaystyle \text{ and } -1 \leq G \leq 1 \text{ (entrywise)}
 \displaystyle \text{ and } P, N \text{ diagonal}

Here’s a CVX formulation:

n = 3; % size of A
k = n; % number of columns in G
 
% construct an example so I know a decomposition exists
Gorig = randi(10, n, k);
Dorig = diag(randi(10, 1, k));
A = Gorig*Dorig*Gorig.';
 
% try to find the best decomposition for this example
cvx_begin
    variable Dpos(k,k) diagonal;
    variable Dneg(k,k) diagonal;
    variable G(n,k);
    minimize sum(inv_pos(diag(Dpos - Dneg)))
    subject to
        [A,   G,         zeros(n, n+k); ...
         G.', Dpos+Dneg, zeros(k, n+k); ...
         zeros(n, n+k),   -A , G; ...
         zeros(k, n+k),   G.', -(Dpos+Dneg)] == semidefinite(2*(n+k));
         G < = ones(n,k);
         G >= -ones(n,k);
cvx_end

Unfortunately, this doesn’t work in practice: CVX says the program is infeasible. Obviously it’s not, since I explicitly constructed the example so there’s at least one feasible point. The moral is that just because you can write something as an SDP doesn’t mean you can solve it: rewriting A = GDG^t as A-GDG^t \succcurlyeq 0 and  GDG^t-A \succcurlyeq 0 isn’t efficient enough.

Update: Nope, the moral was that I should refresh my linear algebra. Obviously a matrix can be psd only if its diagonal blocks are, and -A is nsd, so the program is infeasible. (in forming the program, I forgot that not only must the Schur complement be positive, the top principal matrix must also be positive).

Possibly relevant posts:

Tags: , , , , , ,
No comments yet.

Leave a comment

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="">
CommentLuv Enabled