Archive for July, 2007

ID3 and audio fingerprinting

Tuesday, July 31st, 2007

Today I read a brief synopsis of the ID3v2 tagging technology (ID3 tags are the embedded metadata that let your MP3 player determine the name of a song in a file, the artist, album name, etc.). I was surprised to see that ID3v2 supports the embedding of time-stamped lyrics– none of the free tagging software I saw provides a way to edit or even look at lyric data, but I did see about two commercial ones with those capabilities. Haven’t found any lyrics databases on the net, however.

I was also thinking about audio fingerprinting: since I saw a commercial advertising a cell phone that could identify a song based on a clip, I’ve been wondering how you’d go about designing such a system. One obvious thought that came to mind was using time-frequency decompositions, like wavelets, to compress the data in some way consonant with human auditory perception… but how to translate this broad idea into a specific algorithm? I chanced across libFooID, an opensource implementation of a fingerprinting algorithm; the linked page has a nice synopsis of the algorithm. This algorithm isn’t aimed on identifying songs based on arbitrary fragments, but it’s impressive and useful as is.

On a related note, some researchers at HP wrote a short report on using semantic analysis of song lyrics to identify similar music. The conclusion is that automated auditory similiarity analysis is more accurate, but the two approaches are potentially complementary. The report is worth reading just as a showcase application a probablistic version (ps) of latent semantic indexing (pdf), a technique used in automatic document indexing. Just scanning those papers on semantic analysis has given me a contact high: beautiful math concepts (like principal component analysis and relative entropy) find natural applications here.

Building a ‘mollified’ transition function

Tuesday, July 31st, 2007

Find a C^\infty mapping \phi : [0,1] \rightarrow [a,b] so that all orders of derivatives vanish at 0 and 1. I’m giving it a try…

(more…)

A note on running mathematica remotely using Xming

Monday, July 30th, 2007

Related: notes on setting up mathematica fonts for remote runs under *nixes

I’ve been trying to run mathematica remotely off the Unix cluster at school (using the excellent Xming X Window server under Windows), but apparently fonts that remote applications call need to be installed locally. There are two ways to solve this: install the fonts locally, or use a font server. If Caltech or my department in particular is running a font server, I can’t find any reference to it on the IT sites, so let’s have a go at the first option.

Surprising to me, Wolfram provides the latest versions of the Mathematica font files: Windows fonts, Unix fonts.

Get Xming, install it, and get Xming-fonts, install that too. Get the Unix fonts from above and unarchive them. You have two options here: keep the Mathematica fonts in a separate location from Xming’s, or mix them in. I prefer keeping them separate– updating is easier that way. Follow the appropriate instructions for initializing the font directories (running mkfontscale) for whichever option you choose.

If you installed the fonts to the Xming font directories, you can run Xming as usual and start mathematica over a SSH-tunneled X session. If you didn’t, you can either always start Xming from the command line using e.g.:

Xming -fp "built-ins,C:/Program Files/Xming/fonts/misc/,
               C:/Program Files/Xming/fonts/TTF/,
               C:/Program Files/Xming/fonts/Type1/,
               C:/Program Files/Xming/fonts/75dpi/,
               C:/Program Files/Xming/fonts/100dpi/,
               C:/Mathematica Fonts/Type1,
               C:/Mathematica Fonts/BDF"
         -multiwindow -clipboard

or add the appropriate directories to the font-dirs file in C:\Program Files\Xming. Also, if you’ve already started an SSH session without doing either of the above, you can still add the fonts by using xset from within the SSH session:
xset fp+ "C:/.../Fonts/Type1, C:/.../Fonts/BDF"

I don’t know what the AFM directory is in the Mathematica font distribution: you can leave it out of the font paths and Mathematica will run fine. Maybe it just needs to be there, relative to the Type1 and BDF dirs?

REQ: Development book recommendations

Sunday, July 29th, 2007
PHP, Java, C, C++, Perl, Python, Scheme, HTML, CSS, Javascript, Ajax, Swing, GTK, MySQL, Ant

All development technologies I’d like to have much more hands-on experience in. Best-of-the-best book recommendations enthusiastically welcomed– and by best-of-the-best, I mean on par with the Perl Book, Programming Python, or Javascript: The Definitive Guide– rather more leading towards huge tomes of specifications than tutorials.

Exactness of differential forms

Saturday, July 28th, 2007

Remember a one-form \omega = p(x,y) dx + q(x,y) dy can be identified with a functional

 \displaystyle \gamma \mapsto \int_ \gamma \omega = \int_a^b p(x(t), y(t)) \frac{dx}{dt} + q(x(t),y(t))\frac{dy}{dt} \; dt

on the set of piecewise differentiable curves \gamma(t) = (x(t), y(t));  a \leq t \leq b . It can also be identified with a vector field.

In special cases, a one-form can also represent the total differential of a function: if there is a function f(x,y) such that \frac{\partial f}{\partial x} = p and \frac{\partial f}{\partial y} = q, then since the chain rule gives  \frac{d f(x(t), y(t))}{dt} = p \frac{dx}{dt} + \frac{dy}{dt}, the one-form df = p dx + q dy is called the differential of the function f. A one-form \omega is called exact if there is some f such that \omega = df.

Note that if \omega = df, then

\int_\gamma \omega =  \int_a^b \frac{d f(\gamma(t))}{dt} \; dt = f(\gamma(b)) - f(\gamma(a)),

which shows the integral of an exact one-form is independent of the path taken– it depends only on the endpoints. In particular, the integral of \omega around any closed path is 0. In physics, conservative forces (fields) are exact one-forms.

What if someone gives you a one-form \omega and asks you if it is exact? How could one determine an answer, or maybe even find a function with \omega as its differential?

First, if \omega = df, (assuming f is smooth), then \frac{\partial p}{\partial y} = \frac{\partial q}{\partial x}. The differential of a one-form is the two-form d\omega = \left(\frac{\partial p}{\partial y} - \frac{\partial q}{\partial x}\right) dx dy . So, if \omega = df, then d\omega = 0. One-forms such that d\omega = 0 are called closed.

By Stokes’ theorem,

 \displaystyle \int_{\partial R} \omega  = \int_R d\omega

if R is a simple region, where \int_R d\omega = \int_R \left(\frac{\partial p}{\partial y} - \frac{\partial q}{\partial x}\right) dx dy . Consequently, for closed forms, \int_{\partial R} \omega = 0. Applying Stoke’s theorem to the illustration below gives that \int_{\gamma_1} \omega = \int_{-\gamma_2} \omega if \omega is a closed form, so closedness implies path independence.


un := 1in;
o := (0,0);
z0 = o;
z1 = (1un, .5un);
z2 = (1un, 2un);
z3 = (.25un, .75un);
path p[];
label.top(btex {\Huge R} etex, (.75un, .75un));
pickup pencircle scaled 2;
p0 = z0{dir 60}..z1{right}..{dir 100}z2{dir -100};
p1 = z2{dir -100}..{dir -120}z3..{down}z0;
ahlength := 6;
drawarrow subpath (0,1) of p0;
drawarrow subpath (0,1) of p1;
draw p0;
draw p1;
label.rt(btex \Large $\gamma_1$ etex, z1 shifted (0, -.1un) );
label.lft(btex \Large $\gamma_2$ etex, z3 shifted (-.1un, 0) );

Clearly from an earlier comment, the first prerequisite for being exact is closedness. In fact, you can show that in star-shaped domains, closedness implies exactness. Basically, you fix a value f(p) at p, the center of the star, and define  f(x) = f(p) + \int_\sigma \omega for any point x in the domain, where \sigma is a path from p to x. Taking derivatives shows any such f has \omega as a differential.

If \omega is exact on two domains U_1 and U_2, and U_1 \cap U_2 is connected, then \omega is exact on U_1 \cup U_2 . The idea is that if df_i = \omega on U_i for i=1,2, then d(f_1 - f_2) = 0 on U_1 \cap U_2 , which is connected, so f_1 = f_2 + c on U_1 \cap U_2 . Let f_2^\prime = f_2 - c , then the function f which takes the values of f_1 on U_1 and f_2^\prime on U_2 satisfies df = \omega on U_1 \cup U_2.

A powerful consequence of the fact that a closed one-form is exact in a star-shaped domain is that locally any closed one-form is exact. It then follows readily that a path integral of a closed one-form can be calculated in ‘exact patches’. More specifically, if \omega is a closed one-form on U and \gamma is a smooth path in U, then there is a subdivision a = t_0 < t_1 < \cdots < t_n = b and a collection of open subsets U_1, \ldots, U_n of U such that \gamma maps [t_{i-1}, t_i] into U_i, and the restriction of \omega to U_i is the differential of a function f_i. For any such choice of subdivision and open sets, if P_i = \gamma(t_i),

 \displaystyle \int_\gamma \omega = \sum_{i=1}^n \left( f_i(P_i) - f_i(P_{i-1}) \right)

The proof of this assertion is topological. First, for each point P in \gamma([a,b]), there is a neighborhood U_P in which the form is exact. The preimages of these neighborhoods under \gamma forms an open cover of [a,b], of which a finite subcover can be chosen.

An application of the Lebesgue number lemma tells us that there’s a fine enough refinement of this cover into sets [t_i, t_{i+1}] such that the image of each interval is contained in an U_i. Applying local exactness gives the result.

1-forms and topology

Friday, July 27th, 2007

I ended up not reading much algebraic topology on the way here (trying to wend your way through proofs on a bus, even ones that have already been worked out for you, turned out to be slow and unfruitful work), but I’ve started reading Fulton’s Algebraic Topology since I arrived.

One of the basic issues in algebraic topology seems to be the connection between integration (or more generally, any sort of analysis process) and the topology of a manifold. As an exemplar question: when can a given 1-form \omega = p(x,y) dx + q(x,y) dy be written as the differential of a function f, \omega = df?

I don’t know yet, and the suspense is killing me. In the meantime, here’s a problem: note that if f = \tan^{-1}(y/x) , then in the right half plane df = \frac{-y dx + x dy}{x^2 + y^2} . However, the one form \omega_\nu = \frac{-y dx + x dy}{x^2 + y^2} cannot be written as the differential of a function if the domain of definition is \R^2 \setminus (0,0). What if the domain of definition is the set
\{(re^t \cos(t), r e^t \sin(t)), 0<t&lt;4\pi, \frac{1}{2} < r < 2\}?

Back to Houston

Friday, July 27th, 2007

I got into Houston about 4am on Wednesday. The trip wasn’t as torturous as I had feared, but it was plenty tortuous, so I made the decision about half way through to just buy a plane ticket back. Then on the bus between Phoenix and San Antonio, the bus driver kept stopping to find somewhere to put air in the tires; after about ten unsuccessful attempts at this, we were about an hour and a half late catching the connection to Houston, and had to rush off the bus to the waiting one. In the haste, I left my MP3 player on the seat. So, I’m a littled peeved right now.

Looks like I won’t be flying home, since I want to replace it without turning this trip into a complete money drain: I bought $182 in clothes already (the idea was to buy clothes here and keep them here so I don’t have to lug along luggage everytime I come to visit) and still have to get socks and find acceptably priced underwear ($24 for 4 pairs of boxers at Kohls– wtf? did I grow breasts and am I buying a bra? I think not.) and yesterday was my Dad’s birthday, with an accompanying cash outflow. Then there are the social expenses of catching up with my friends and my sister to look forward to.

So yeah, money’s on the mind right now. And the weather is typically mercurial– this morning I drove my mom’s truck back from dropping her to work, in the kind of rain that makes you wish the windshield wiper had two extra settings. One of the cats loves water, so I tried taking him outside to look at the rain– he escaped my arms and scampered out onto the lawn, then turned around and hightailed it back inside at the first roll of thunder. Geeze did I miss these cats– I can talk to my family on the phone, and even not miss them much, but I really miss Gideon and Wil when I’m not here.

Travel reading

Saturday, July 21st, 2007

I realized that instead of one day, my Greyhound trip to Houston is going to be two days! Two days of torture.

I’ll be spending my awake time fruitfully, studying the math that I want to — since I have yet to begin studying for my quals in earnest, and will be doing that once I reach Houston. The appellation ‘math I want to study’ refers to differential geometry, differential forms, manifolds, tensors, sheaves, connections, bundles, and what not. I bought Lee’s solid treatise on Smooth Manifolds recently, and picked up some other books from the library which might make for lighter reading: Bachman’s A Geometric Approach to Differential Forms, Ramanan’s Global Calculus, and Tornehave and Madsen’s From Calculus to Cohomology; the trip should be more edifying than if I gave into temptation and instead brought fiction to read. You get two carry-ons; one of mine will be nothing but math books, those mentioned above, and the ones I’ll need to study for the quals.

Wish me progress!

Regular Series

Friday, July 20th, 2007

I ran into an interesting paper on the convergence of regular series while browsing for convergence criterion for double series. A regular series is a series of the form \sum_{n=1}^\infty u(n) where u is analytic at infinity.
Equivalently, there is an integer L and coefficients \{c_k\} such that n \geq L implies u(n) = c_0 + \frac{c_1}{n} + \frac{c_2}{n^2} + \cdots

The first result is pretty intuitive:

\sum_{n=L}^\infty \left[ u(n) - c_0 - \frac{c_1}{n} \right] = \sum_{n=L}^\infty \sum_{k=2}^\infty \frac{c_k}{n^k},

where both series are absolutely convergent.
As a corollary, you can see that a regular series converges iff c_0 = c_1 = 0; you also get the formula

 \sum_{n=1}^\infty \left[ u(n) - c_0 - \frac{c_1}{n} \right] = \sum_{n=1}^{L-1} \left[ u(n) - c_0 - \frac{c_1}{n} \right] + \sum_{n=L}^\infty \sum_{k=2}^\infty \frac{c_k}{n^k} .

The final and most interesting development in the paper is an application of the above results to approximating regular series:

Any convergent regular series for which the coefficients c_n are real and nonnegative has a value S given by

 S = \sum_{n=1}^{G+h} u(n) + \theta u(G)

where  G \geq L , h \geq G^2 - G , and 0 < \theta &lt;1 depends on G and h.