Graph layout by Stress Majorization by Optimization of A Complicated Function

I haven’t had much to talk about lately– the Probabilistic Graphical Models course is eating up my time like crazy: I haven’t had time to do anything but that and prepare my lectures and homeworks for the Matlab/Mathematica course– but I have been doing a lot of coding for the course I’m teaching and the course I’m taking. Most of it isn’t fit for public consumption, but some of it is worth sharing/archiving here.

The code below takes the weighted adjacency matrix of a graph and returns a planar embedding using stress majorization by optimization of a complicated function (a simple algorithm with a fancy name, originally developed by multidimensional scaling people). The idea is that ideally you want an embedding with the property that the distance between the points of the embedding approximates the distance between the corresponding vertices in the graph. This concept gives rise to a stress function, which compares these two distances. The stress function isn’t convex, so you majorize it at each iteration by a convex quadratic and use the minimizer of that quadratic as the next embedding. Continue until the quality of the embedding doesn’t change much.

Surprisingly, Matlab doesn’t seem to have built in graph layout routines (I still don’t get why gplot doesn’t have one built in), except for trees, so I find this routine useful when I need to quickly visualize a graph. You can also graph the m-file: smoacp.m

%SMOACP finds a layout for a graph that as much as possible is an isometry
%from the graph to the plane (local minimum of the stress function).
%
% [P, SIGMA] = SMOACP(A, EPSILON) A is the weighted adjacency matrix for
% the graph, and EPSILON is a stopping tolerance. Once the relative
% decrease in the stress drops below EPSILON, terminate. P is the n-by-2
% matrix of vertex coordinates and SIGMA is the stress of P
%
% [P, SIGMA] = SMOACP(A, EPSILON, PGUESS) uses PGUESS as the initial guess
% for P
%
% See Also SPECTRALGRAPHLAYOUT
%
 
function [P, sigma] = smoacp(A, epsilon, P)
 
    disp('Computing a graph layout that minimizes stress');
    tic
 
    % useful stuff
    n = length(A);
    D = floydwarshall(A);
    remove1strow = @(M) M(2:end, :);
 
    % calculate the LGhat matrix
    Dinv2 = D.^-2;
    Dinv2(1:size(Dinv2,2)+1:end) = 0; % reset the diagonals to 0;
    LG = diag(sum(Dinv2)) - Dinv2;
    LGhat = LG(2:n, 2:n);
 
    % inner function to calculate the stress; note that it can access D,
    % the matrix of intervertex distances, and n even though they're 
    % defined outside of it
    function sigma = stress(P)
        sigma = 0;
        for i = 1:n
           for j = i+1:n
               sigma = sigma + D(i,j)^2*(norm(P(i,:) - P(j,:)) - D(i,j))^2;
           end
        end
    end
 
    % inner function to calculate L^Z(P)
    function M = LZP(Z, P)
 
        % fill in the off diagonal terms
        for i = 1:n
            for j = 1:n
                if i==j
                    continue;
                end
                M(i,j) = -D(i,j)^-1*pinv(norm(Z(i,:) - Z(j,:)));
            end
        end
 
        % fill in the diagonal terms
        M = M - diag(sum(M));
        M = M*P;
    end
 
    % initialize if guess not provided
    if nargin == 2
        P = randn(n, 2);
    end
 
    % recursively minimize the quadratic stress majorizer
    while true
       fprintf('Current layout has stress %f\n', stress(P));
       oldP = P;
 
       % slightly inefficient solve of linear system: could do a little
       % better with linsolve and a precomputed cholesky decomposition
       P = LGhat\remove1strow(LZP(oldP, oldP));
       P = [0, 0; P];
 
       % check the two termination conditions
       if norm(P - oldP) < epsilon
           disp('Terminating because embedding didn''t change much');
           break;
       end
       if  (stress(oldP) - stress(P))/stress(oldP) < epsilon
           disp('Terminating because relative stress didn''t change much');
           %break;
       end
    end
 
    sigma = stress(P);
    fprintf('Took %f seconds to find stress-minimizing layout', toc);
end
 
%FLOYDWARSHALL calculates the distances between vertices in weighted graph
%
% D = FLOYDWARSHALL(A) given an n-by-n weighted adjacency graph A, returns
% the n-by-n matrix of distances between vertices.
%
 
function Dc = floydwarshall(A)
    disp('Computing the intervertex distances')
    tic
 
    n = length(A);
 
    % initialize d(0,i,j)
    Dp = A;
    Dp(A == 0) = Inf;
    A(1:size(A,2):end) = 0; % zero diagonals
 
    % recursively calculate d(k, i, j)
    for k=1:n
        for i=1:n
            for j=1:n
                if i == j
                    continue;
                end
                Dc(i,j) = min(Dp(i,j), Dp(i,k) + Dp(k, j));
            end
        end
        Dp = Dc;
    end
    fprintf('Took %f seconds to calculate distances\n', toc);
end

Possibly relevant posts:

Nov 6th, 2009 | Posted in Mathematics, Programming
Tags:
No comments yet.

Leave a comment

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="">
CommentLuv Enabled