Archive for January, 2006

My first mug shot

Saturday, January 28th, 2006

I was searching my sister’s room for interesting books when I came across her webcam (does this sound like too gross an invasion of privacy? :)). Since we need a webcam for our senior design project– right now we’re using an Apple iSight, which has an unfortunate requirement that the machine it’s being used on be an Apple, as well as a tendency to blur our upclose pics of resistors– I decided to try hers out. I was especially ecstatic when I discovered it has 1024×768 still shot capability; that was a feature that I’d been looking for– the iSight only seems to do 640×480. After a little trouble, I got it installed, and working with python; however, even though with the software that comes with it, it can reach 1280×960, with python I can only set it up to 640×480. Maybe that’s because I also can’t tell it to go to a ’still’ mode– it stays in constant 30 fps mode– or maybe the resolution gain is done through software, or maybe for both these reasons.

Anyhow, I got something working with python, so that’s proof of concept and one mark for my side: the one member of our team who has a strong opinion on the subject wants to build the system to run on an Apple, with the iSight. My objection to that stems from the fact that part of our marketing strategy is to place the system in the school’s circuits lab, and all the machines there run windows. Anyhow, it’s always a good idea to go with Windows for these types of projects, so you don’t confuse the profs unnecessarily.

Incidentally, I realized that I’ve never had my picture taken on a webcam before, and never had one posted to the site, so here’s one picture of a very handsome man:

Absorbing and balanced sets

Saturday, January 28th, 2006

If X is a vector space over a field \mathbb{F} and A \subseteq X, we have the following definitions to consider:

  1. A is convex if t y + (1-t)z \in A whenever y,z \in A and 0 < t < 1. Here \mathbb{F} is \R or \C.
  2. If \alpha A \subseteq A whenever |\alpha| \leq 1, then A is balanced.
  3. The set A is absorbing if, for each x \in X, there is a positive number s_x such that x \in tA whenever t > s_x.

An introduction to Banach Space Theory. Robert E. Megginson

Some consequences of sets’ having these properties are well known or intuitively obvious, especially concerning convexity. Here are some interesting questions, from the same source:

  1. Identify all the balanced subsets of \C. Do the same for \R^2.
  2. Suppose that A is balanced. Prove that A is absorbing iff for each x \in X there is a positive number t_x such that  x \in t_x A.
  3. Prove that each convex absorbing subset of \C contains a neighborhood of 0. Is this true for nonconvex absorbing subsets of \C?

In relation to the last question above, consider the polar graph of the relation  2\pi r - (2\pi - \theta) \cos \theta = 0

Optimization problems

Tuesday, January 24th, 2006

Two easy :) problems. The first one illustrates a general idea that is useful (it can be used to help in analytically deriving the FTA, for instance): show that an even polynomial achieves its minimum on \R. The second, while interesting, doesn’t have any immediately interesting applications I know of:

Define the convex (a.k.a. Fenchel) conjugate of a function f: \R^n \rightarrow \R \cup \{\infty\} to be the function f^\star : \R^n \rightarrow \R \cup \{\infty\} defined by

\displaystyle f^{\star}\left(x^\star\right) = \sup\left\{\left\langle x^\star,x\right\rangle - f\left(x\right) : x \in \mathbb{R}^n \right\} = - \inf\left\{f\left(x\right) - \left\langle x^\star,x\right\rangle : x \in \mathbb{R}^n \right\}

show that if f_p = \frac{|x|^p}{p} and \frac{1}{p} + \frac{1}{q} = 1 with p>1 but not necessarily an integer, then f^\star = f_q.

Numb3rs

Sunday, January 22nd, 2006

I finally saw a complete episode of Numb3rs last night– yes, I am way behind the times. In this episode, Charlie was helping the FBI track two escaped fugitives. The quality of the show surprised me: I was expecting a watered down show, one with all hints of adult content washed out for the kiddies, and the mathematical sophistication of at most a college freshman. There was not any of the gratuitous cursing that mars a lot of today’s reality dramas, and I did notice an emphasis on family dynamics that reminded of Judging Amy, but this episode was not overly saccharine: there was an aborted sex scene, when Charlie’s brother interrupted one of the convicts and his girlfriend mid-coitus. And although the main character repreated certain words way more than necessary–e.g., Bayesian, Markov chain–, the eventual application of math seemed to justify the mentioning of them.

The only two bitter notes left from the show are: 1) the fact that the FBI didn’t even consider that the vehicular collision that gave the convicts their opportunity was staged until Charlie mentioned it, and 2) the writers’ insistence on drawing out Charlie’s discovery that the accident had been rigged. I suspect the latter was done because Charlie only had to apply simple algebra and high school physics to make that discovery; to make it smell more mathematical, they had him talk about modeling the accident using a Markov chain. I’m not even sure what that entails: exactly what aspects of the accident could be modeled in a useful manner using Markov chains, and how would they be relevant to the investigation? The convicts were already loose, and we already knew the accident was staged. At least Charlie’s episodic fascination with Markov chains, and as he refers to it ’statistical probability’ (as opposed to the non-statistical kind, I suppose), paid off in the manner in which he tracked one of the fugitives down.

That was the core instance of applied math: from the FBI tipline reportings, collated information on recent crimes that he might be responsible for, and the location of family and friends, Charlie made a time series of possible locations of the fugitive. Then, he used some kind of Bayesian inferential techniques to denoise the data– mainly to counteract the uncertainty and corruption in the data coming from the tiplines– and come up with a pattern to the convicts’ movements. It would have been nice if he’d specified in more detail exactly what methods he used, but I suspect then the show would’ve turned into a math lecture. Like Simon Alexander, the newest postdoc in the wavelets group said in reference to explaining harmonic analysis to a Cal I class: “you can either say very little, or very much; there’s not much in between.”

As long as it illustrates one or more non-obvious application of advanced mathematics per episode, I’ll continue to think Numb3rs is worth watching.

Interpolating splines

Sunday, January 22nd, 2006

Note: I spent a weekend writing a simple implementation of the inefficent spline paradigm described here, in PyGTK; check it out: get the python spline fitting and evaluation code, and PyGTK stub interface. The feature set is small: add new points in the interface by left clicking, and middle click to clear.

Here’s the motivating problem: you have points \{x_i, y_i\} observed at times \{t_i\}– say, samples of the position function of a particle tracing out a path– and you’d like to fit a smooth curve \vec{g}(t) to these points. An intuitive approach to this problem is to represent \vec{g}(t) = (g_x(t), g_y(t)) parametrically, and fit smooth curves to the abscissa and ordinates; this reduces the problem to that of fitting curves to points, which we can solve in many different ways (Lagrange interpolation, etc.) One approach which has several desirable features is that of interpolating splines.

Interpolating splines can be developed under the energy minimization paradigm, in an interesting manner. Under this paradigm, the problem is to, given the points y = \{t_i, \xi_i\}, find a smooth function \hat{g} satisfying:

 \displaystyle \hat{g} = \text{argmin}_g U(g\:|\:y),

where U(g\:|\:y), the energy function, represents some measure of the undesirability of using the function g to represent y. The energy function has two components, U_1(g), the internal energy, which corresponds to a measure of the desirability of g, and U_2(g\:|\:y), the external energy, which corresponds to the compatibility of y and g. To develop splines, we use

 \displaystyle U(g\:|\:y) = \alpha U_1(g) + U_2(g\:|\:y) = \alpha \int \left(\left(D^m g\right)(t)\right)^2\:dt + \sum_{i=1}^n \left(\xi_i - g(t_i)\right)^2,

that is, we look for functions whose variation (measured as the L^2 norm of their m-th derivative) is minimal, and which fit the data points in a least means squared sense. The parameter \alpha controls the relative importance we assign to U_1 and U_2; in what follows, when it matters, we assume \alpha is fixed.

Here’s the first amazing result: Such \hat{g} exist, and furthermore

  1. The curve \hat{g} is a polynomial of degree m+1 on each interval (t_i, t_{i+1}).
  2. At each point t_i, the first m derivatives of \hat{g} are continuous, while the (m+1)-th can be discontinuous.
  3. In each of the intervals (-\infty, t_1) and (t_n, \infty), D^m\hat{g}(t) = 0.

I’d love to see a proof of this theorem– I suspect it uses the variational calculus. Certainly (ii) and (iii) are obvious, but (i) is not to me.

Now, we can define splines:

A curve that satisfies (i) and (ii) with m=2 is called a cubic or bending spline with knots {\cal T} = \{t_i\}. If it also satisfies (iii), it is called a natural cubic spline. An interpolation spline is a cubic spline which passes through the points \{t_i, \xi_i\}.

So the above theorem guarantees the existence of interpolating cubic splines. But we can do even better, with the next theorem, which shows us how to construct such splines:

Let \xi=\{\xi_i\} be n values set according to t_i. There is only one natural spline g interpolating these \xi_i values.

Proof:
By (i) above, on each interval (t_i, t_{i+1}), g has the form

 \displaystyle g(t) = a_i(t - t_i)^3 + b_i(t-t_i)^2 + c_i(t-t_i) + d_i.

For convenience, set h_{i+1} = t_{i+1} - t_i and M_i = D^2g(t_i), then you can show


\displaystyle
b_i = M_i/2 \quad a_i = (M_{i+1} - M_i)/(6h_{i+1}) \\
d_i = \chi_i \quad c_i = \frac{\xi_{i+1} - \xi_i}{h_{i+1}} - h_{i+1}\frac{2M_i + M_{i+1}}{6},

and from (ii) above, we haveD^1g(t_i^-) = D^1g(t_i^+) for i=2,\ldot,n-1, giving

 \displaystyle M_{i-1}h_i + 2M_i(h_i + h_{i+1}) + M_{i+1}h_{i+1} = 6 \left( \frac{\xi_{i+1} - \xi_i}{h_{i+1}} - \frac{\xi_i - \xi_{i-1}}{h_i}\right).

This gives us n-2 equations we can solve for the n-2 unknowns M_i, \: i=1,\ldots,n (we know by (iii) M_1 = M_n = 0); these relations can be written in the linear system RM = Q\xi, whereR is a tridiagonal n-2 \times n-2 matrix, Q is a tridiagonal n-2 \times n matrix, M is the n-2-vector of 2nd derivatives of g, and \xi is the n-vector of the values g should interpolate. This system can be solved for M = R^-1 Q \xi, and these values can be used to calculate the a_i, b_i, c_i, d_i

See my code for implementation of this scheme for 2d interpolation. At some point, I’m going to show how the energy minimization scheme can be adapted from interpolation to smoothing.

New Problem: locating resistors

Wednesday, January 18th, 2006

This morning my senior design group met to discuss our pre-proposal, and surprisingly, we found that our number three choice became our number one. Now we’re proposing to develop an automated sorting machine– in this case, our choice of things to sort is resistors. Why the switch? This project is useful, and much easier conceptually.

Our approach breaks the task down into two tasks: design of an algorithm to locate desired resistors in an image, and managing a robotic pincer to physically isolate the desired resistors once found. My team seems willing to let me handle the image processing part, and pretty much do nothing on the robotic part. Although designing a control algorithm for the pincer may be an interesting problem– is an algorithm really needed? we’re hoping it’ll be more of a point and grasp thing– I’m relieved to be free from dealing with hardware.

Here’s my tentative algorithm for the pre-proposal: run an edge detector on the image, isolate line segments (fortuitously for me, we’re using the resistors that are color coded with linear stripes) to locate resistors, and maybe curve extraction to determine the orientation of the resistor, for more information on how to grasp it. Once you’ve located the color stripes, you can check their color to see if the resistor is one of the ones desired. It seems pretty straightforward: even the curve extraction isn’t really needed– once you know the location of the color strips and their inclination, you pretty much know all you need to about the resistor. But… everything looks harder when you’re down in the dirt getting it working, so maybe writing the algorithm will be harder than writing the GUI after all :)

I’m thinking of using the Hough transform to do the line segment detection.

A Motion Capture Problem

Tuesday, January 17th, 2006

We’re in the process of bidding for projects in my Senior Design class– we have 15 projects to choose from, and have to select our top 3 choices. My team’s number one choice was the ‘Mini-Me’ Lego robot project, in which we are to create a robot that mimics the movements of a human being monitored with a video camera. I thank the powers that be that nowhere on our list is any of the electronic/circuit/solid-state type projects that dominated the list– as interesting as it may be to design a laser system for measuring the stresses in thin film depositions, I have neither the desire nor copious free time required for any of those hardware construction/lab-intensive projects. Nope, all our choices have a strong signal processing component, just the way I like it.

The question is, how best to attack this problem? My first idea, and probably the most realistic given our time constraints, is to model people as cylinders connected with kinematic constraints, slap a few markers on people and track their movements through those, degrade the movements to a level that the robot can handle, and change them into control signals to the motors. But what would be a more mathematically elegant way of approaching this? The first thing that came to mind was active contours, but even if I could manage to get one to fit the human body perfectly, it’d still be hard to get useful information on local movements from such a model.

It’s an interesting problem to ponder… any bright ideas?

Chebyshev’s inequality (continuous, general version)

Sunday, January 15th, 2006

I’ve been having a very hard time reading Grafakos’ book: I’ve spent probably four hours reading only about 6 pages. Part of the reason for this, I hope, is because he expects the introductory material on L^p and weak L^p spaces to be a review– while for me it is new material. If this is what the entire book is like, then I’m in for a rough semester.

One of the points that stumped me for a long time is his proof of the fact that L^p \subset L^{p,\infty}, which uses Chebyshev’s inequality. He states that it is a simple consequence of Chebyshev’s inequality that:

 \displaystyle \alpha^pd_f(\alpha) \leq \int_{\{x : |f(x)| > \alpha\}} |f(x)|^p d\mu(x),

which shows that \|f\|_{L^{p,\infty}} \leq \|f\|_{L^p}.

At first I thought he was referring to the probabilistic statement of the inequality:

 \displaystyle P[ |X - \overline{X}| > \epsilon ] \leq \frac{\text{ var }X}{\epsilon^2},

which I had problems even finding (even planetmath lists the Chebyshev inequality as a sum inequality, with no mention of the continuous version). Even when I did dredge it out of a reference book, I was puzzled– in this case, what would X be, and how could I massage this formula to get Grafakos’?

I just found the answer to my question: the general measure theoretic statement of Chebyshev’s inequality as per wikipedia is: if g is a nonnegative extended real-valued measurable function, nondecreasing on the range of f, then

\displaystyle \mu(\{x \in X: |f(x)| > t\}) \leq \frac{1}{g(t)} \int_X g \circ f d\mu,

which looks more pertinent.

Clearly I need to learn more measure and integration theory than I have so far.

Distribution function

Saturday, January 14th, 2006

Let f \in L^p(X, \mu), and define the distribution function of f,

\displaystyle d_f(\alpha) = \mu\left(\left\{x: |f(x)|> \alpha \right\}\right).

It has the following properties:

  1.  |g| \leq |f| \mu-a.e. implies that d_g \leq d_f
  2. d_{cf} = \frac{1}{|c|} d_f for all c \in \C \setminus \{0\}
  3. d_{f+g}(\alpha + \beta) \leq d_f(\alpha) + d_g(\beta)
  4. d_{fg}(\alpha \beta) \leq d_f(\alpha) + d_g(\beta)

Here’s a neat result from Grafakos’ ‘Classical and Modern Fourier Analysis’: if \phi : [0, \infty) \rightarrow \R is increasing and C^1, then

 \displaystyle \int_X \phi\left(|f|\right) d\mu = \int_0^\infty \phi^\prime(\alpha)d_f(\alpha) d\alpha.

Actually, I think you also need to state \phi(0)=0. As a special case,

 \displaystyle\|f\|^p_p = p \int_0^\infty \alpha^{p-1} d_f(\alpha) d\alpha .

The proof consists of writing \phi(|f|) as an integral in terms of \phi^\prime, extending the limits of integration of that integral to 0, \infty by inserting an appropriate characteristic function \chi(x,\alpha), using Fubini’s theorem– then the trickiest part is recognizing that the original characteristic function is exactly the same as the one needed to get d_f as your interior integral. I’m too lazy to write it out, but it’s a good exercise: it took me embarassingly long to figure out how Grafakos applied the Fubini theorem (namely, the equivalence of the characteristic functions).

The interesting thing about the distribution function is that it is not unique to the function (as a trivial example, under Lebesgue measure, d_f is the same for all translates of f), yet it determines the value of these particular functionals at f \in \L^p.

As an example, let

\displaystyle \phi(x) = \begin{cases}
\sin(x) & x \in [0, 1] \\
0 & \text{ otherwise}
\end{cases}
\quad
f(x) = \begin{cases}
1-x^2 & x \in [0,1] \\
0 & \text { otherwise }
\end{cases}.

You can use the above result to show  \int_0^1 \sin(1- x^2) dx = \int_0^1 \cos(\alpha) \sqrt{1 - \alpha} d\alpha . A non-trivial result!