Archive for May, 2006

Simulated Annealing

Friday, May 26th, 2006

This semester, since I was working on a computer vision project, I downloaded a bunch of articles on shape recognition, tracking, color correction, and such. A lot of the algorithms in these papers depend on some optimization algorithm, and these optimization algorithms are assumed to be known to the reader. It seems that simulated annealing, a method of optimization based on the simulation of certain physical mechanisms, can be used as a simple equivalent replacement algorithm for the more esoteric ones.

Here’s how simulated annealing works, as per “The Annealing Algorithm” by Otten and van Ginneken: we are given an instance of the problem, a pair (S, \epsilon) consisting of a set of feasible configurations (aka configuration space) S, and a score (aka objective) function \epsilon, and our aim is to find an optimum state s, satisfying

 \forall_{s^\prime \in S}: \epsilon(s) \leq \epsilon(s^\prime).

There are several ways to conduct this search: blind random searching, in which random configurations are generated and evaluated until some stopping condition is reached; iterative refinement, in which random configurations are generated and evaluated, then tested against their neighbors; in this method, a new starting random configuration considered when all the current one’s neighbors have been tested, and is switched to only if its score is lower than the current low score. Then there are probabilistic hill climbing methods, which keep the testing of neighbors present in the iterative refinement algorithm, but use an acceptance function to determine whether to switch to a new random starting point; this acceptance function has a nonzero probability of allowing a switch to a starting configuration with a higher score. By visualizing the score vs. configuration surface, you can see why probabilistic hill climbing methods are appropriately named: while iterative refinement algorithms can only move down the surface, these methods allow moving up hills; the advantage of allowing this is that it makes the algorithm less sensitive to local minima.

Simulated annealing is a probabilistic hill climbing optimization method motivated by physical principles– I won’t go into the details, because they’re not too important: the method is straightforward enough by itself. To use this method, you need a single parameter

Not a frame, but Bessel

Friday, May 26th, 2006

I mentioned the frame question that came up in the last post to Papadakis, and he gave me this one:

Let \varphi \in L^2(\R^2) be such that \hat{\varphi} is continuous and \text{supp } \hat{\varphi} \subseteq \left[-\frac{1}{2}, \frac{1}{2}\right]^2. Then \left\{T^k\varphi\right\}_{k \in \Z^2} is not a frame, but it is a Bessel sequence.

Looks interesting… I haven’t attempted to show it isn’t a frame yet, and I’m stuck on showing it’s Bessel. But, what else is a three day weekend for?

Update: I finally figured this out after spending the whole weekend on it, and found it is trivial. I just discussed my solution with Juan, and we were misunderstanding each other, because he was thinking of the problem as it should have been stated, as Papadakis probably meant to. Here’s the solution to the problem as stated: to show it is not a frame, simply note that if \text{supp }\hat{v} \not\subseteq \left[-\frac{1}{2}, \frac{1}{2}\right]^2 then \left\langle T^k\varphi, v \right\rangle = \left\langle e^{-2\pi i k \cdot \omega}\hat{\varphi}, \hat{v}\right\rangle = 0, so the lower bound condition is violated.

To show it is Bessel is a little more involved, but nonetheless straightforward: again note \left \langle T^k \varphi, v \right\rangle = \left \langle e^{-2\pi i k \cdot \omega} \hat{\varphi}, \hat{v} \right\rangle . Then consider g = \overline{\hat{\varphi}}\hat{v}|_{\left[-\frac{1}{2}, \frac{1}{2}\right]^2}; g satisfies \|g\|_{\left[-\frac{1}{2}, \frac{1}{2}\right]^2} \leq \|\varphi\|_\infty \|v\|_{L^2(\R^2)}, so we can consider g to be a function in L^2(\left[-\frac{1}{2}, \frac{1}{2}\right]^2). Then we have

 \displaystyle \|g\|^2 = \sum_{k \in \Z^2} \left| \left\langle e^{-2\pi i k \cdot \omega}, \overline{\hat{\varphi}}\hat{v}\right\rangle \right|^2 = \sum_{k \in \Z^2} \left| \left\langle e^{-2\pi i k \cdot \omega} \hat{\varphi}, \hat{v}\right\rangle \right|^2 = \sum_{k \in \Z^2} \left| \left\langle T^k\varphi, v\right\rangle \right|^2 \leq \|\hat{\varphi}\|_\infty^2 \|v\|^2,

so the upper bound can be taken to be \|\hat{\varphi}\|_\infty.

The fishy thing about this problem is that it doesn’t require the continuity of \hat{\varphi}, only that it be in L^\infty. Per Juan, this is because the statement of the problem should be

Let \varphi \in L^2(\R^2) be such that \hat{\varphi} is continuous and \text{supp } \hat{\varphi} \subseteq \left[-\frac{1}{2}, \frac{1}{2}\right]^2. Then \left\{T^k\varphi\right\}_{k \in \Z^2} is not a frame in the closed subspace it generates, but it is a Bessel sequence.

Then the continuity of \hat{\varphi} is needed…

Every finite spanning set is a frame

Friday, May 26th, 2006

I spent way too much time trying to remember why every linear function on a finite-dimensional vector space is bounded. (because if \|x\|=1 and x = \sum_{i=1}^k \lambda_i e_i, then \|f(x)\| \leq \sum_{i=1}^k \|\lambda_i\|\|f(e_i)\|, so \|f\| \leq \sum_{i=1}^k \|f(e_i)\|).

I need this result to show that every finite spanning set in a finite dimensional Hilbert space is a frame. That is, if \{v_j\}_{j=1}^n \subset H spans H then there are 0 \leq c \leq C such that

\displaystyle c\|x\|^2 \leq \sum_{j=1}^n \left|\langle v_j, x \rangle\right|^2 \leq C\|x\|^2,

for each x \in H.

The function \theta: H \rightarrow \R^k defined by \theta(x) = \left(\langle v_1, x\rangle, \ldots, \langle v_n, x \rangle\right) is linear, so by the above result is bounded with some constant \sqrt{C}.

Now I just need show that the inverse is continuous, so there is a c

This problem came up when I was talking to a fellow graduate just now about what I’m doing in Papadakis’ group, and I tried introducing the concept of a frame. The first question he asked was if each spanning set is not necessarily a frame. It seems that this is true in the finite dimensional case, but I’m sure it’s not in general Hilbert spaces, because then what would be the point of defining a frame? I’ll see if I can find a spanning set that isn’t a frame in L^2(\R).

Update:
The lower bound c exists because the inverse function {\theta^\star} is again a linear function on a finite dimensional vector space, so it is bounded by some constant B. We can take c=\frac{1}{B}.

It is easy to find a spanning set in l^2 that is not a frame, because it is not a Bessel sequence (a sequence which defines a bounded operator \theta): simply let \{v_n = n \delta_n \}.

The measure of a manifold

Wednesday, May 24th, 2006

I’ve started reading Milnor’s “Topology from a Differentiable Viewpoint”, and I’m really enjoying it. I’ve made it up to the part where he uses manifold theory to prove the fundamental theorem of algebra– it’s amazing not only that it’s possible to do this, but that someone took the time to figure it out. Incidentally, while choosing a topology book, I came across another unexpected use of topological methods: a proof of the infinitude of the primes.

I’ve set myself the following goal, relating to the problems I mentioned earlier about the probability of the invertibility of random matrices: let M \subset \R^n be a smooth m-dimensional manifold, with  m < n, then m(M) = 0 where m is the Lebesgue measure. Hopefully I’ll have all the machinery I need to examine this question after chapter 2.

Boothby’s “An Introduction to Differentiable Manifolds and Riemannian Geometry”

Tuesday, May 23rd, 2006

I should start a section on books; I would if I read a significant portion of even a tenth of the ones I pick up. I just found a good one on Juan’s desk, actually several: the first two volumes of Spivaks’s Comprehensive Introduction to Differential Geometry, and the above mentioned book. I’m surprised, because Juan doesn’t strike me as a diff. geom. kind of guy; I thought he was strictly into functional analysis. But then again, he took these up from the piles of books professors leave outside of their door occasionally. Starting today, I will troll the halls… noone will get them before me.

LOTR

Saturday, May 20th, 2006

What the hell is it with people and LOTR? I was over at Amazon, looking through various peoples’ list of fantasy books… almost every single one of them mentioned LOTR. Was it ever this popular before the movie came out?

I don’t get the deal– I couldn’t even read the first book in the series, much less finish the entire boring epic. And the movies were better only because you didn’t have to make a sustained effort to pay attention; you could space out for a while out of sheer boredom or confusion, and the movie would have proceeded on without you. That’s how I spent way too much time watching them.

What is so special about LOTR? I would much rather read the Prydain Chronicles, or The Dark is Rising sequence… something with bite.

Solution to probability of matrix invertibility

Thursday, May 18th, 2006

One of the great things about math is that anyone can eventually understand any proof (I’m rounding up from ‘all reasonably intelligent people’ here), yet not anyone can construct one from scratch. And being able to solve one problem does not guarantee that the next one will be any less daunting. There’s a definite asymmetry of information: if you start looking, you can find many hard problems, yet you’re lucky to find one that is answerable.

This was brought to mind by Robert’s elegant solution to the problem on the invertibility of random matrices (with bounded entries) he posed two days ago:

First, the 2×2 case:
Let each of A, B, C, and D be uniformly distributed random variables over some interval (or region in the complex case). Then we will call M = [A B; C D] a random 2×2 matrix. Note that P[M is not invertible] = P[AD - BC = 0]. Since each of A, B, C, and D is a random variable, AD-BC is a random variable. It’s distribution is whatever the distribution of a product of uniform distributions is. I’m not sure what that is, but I am sure that its
density function is continuous over its domain which is an interval. Thus P[AD-BC = 0] is zero.

In general, given n*n uniformly distributed random variables, the determinant of the random matrix generated by those random variables is a random variable with a density function that is continuous over its interval domain. Thus the probability that the determinant is a specific value (namely 0) is zero.

Sweet and simple. The only deal here is answering the question of whether P(AB-CD) is continuous, but I think it is.

I had thought of approaching this problem by showing that the set of zeros of the determinant function, considering the domain to be \C^{n^2}, had measure zero, but couldn’t get a handle on that, since the set of zeros is some complex variety. Now, using Robert’s technique, it seems straightforward to show that the measure of any affine variety is zero.

Open Geometric problem

Wednesday, May 17th, 2006

Craig Larson, a lecturer here at UH wrote an article that made it into Notices of the AMS on the current inadequacies in our recruitment of science and math teachers: the fact that they aren’t required to know the subject before they are allowed to teach it, and the money issues which cause those who would make the best teachers to find another field to work in. It’s a short article, and worth reading.

That plug aside, here’s an interesting and possibly elementary open problem from the SIAM Problemslist:

A sphere B \subset \R^n centered at the origin satisfies the following property:

|B \cap (x + \epsilon B)| = M(\epsilon), \quad x \in \partial B, \quad \epsilon > 0,

where |\cdot| denotes Lebesgue measure and M is a real function. That is, the volume of the intersection of B and a scaled version of B translated to one of its boundary points does not depend upon which boundary point is chosen; it is solely dependent upon the scale \epsilon. This property can be shown to hold for ellipsoids by appropriately scaling the coordinates of a sphere.

Are there other domains B \subset \R^n centered at the origin which satisfy the above equation?

I changed the question somewhat, since I don’t understand what the terms homothety or homothetic center refer to, but it retains the spirit of the original.

Probability of invertibility of matrices

Tuesday, May 16th, 2006

Robert Rosenbaum posted a really nice question to the UH math mailing list:

Given an n \times n matrix of integers, what is the probability that it is
invertible?

What if we put a bound on the magnitude of the integers (so that we consider only a finite number of matrices)? In other words, given an integer n > 0 and an integer M, what is the probability that an n \times n matrix A of integers each less than M in absolute value is invertible? Is there an elementary way to express this value in terms of M and n? I don’t know.

If we don’t restrict ourselves to integer valued matrices and consider all real or complex valued matrices, I think that the probability would be exactly 1. Can someone give an argument for this claim?

I got the last question, where you consider all of \C^{n\times n}: from an earlier problem I posted about, it is clear that the set of noninvertible matrices has empty interior, so measure zero; therefore a random matrix is invertible with probability one.

The discrete case is probably a lot harder…

Update: It seems I made an unwarranted assumption. Just because the set of noninvertible matrices has an empty interior doesn’t mean it has measure zero. I still think that’s true, but the proof is not complete.