Wednesday, April 25, 2007

Homework #5 & Talagrand

Solutions are posted.

Regarding Question #2: I referred to this as "Talagrand's Lemma", and that's how I personally think of it, having learned it first as Proposition 2.2 in Michel Talagrand's ingenious paper How much are increasing sets positively correlated?.

I think that, really, it should count as a folklore result. For example, in arithmetic combinatorics I think it's sometimes called Chang's Theorem. It is also the pith of the very technical proof of Bourgain's Theorem on the Fourier tails of boolean functions (not available online, but appearing in a paper of Khot and Naor). Bourgain uses a different proof, though he doesn't quite state the lemma explicitly. That proof relies on hypercontractivity and is extremely short:

Given $f\colon\{-1,1\}^n \to \{0,1\}$, let $L$ be the degree-1 part of $f$. Then by Plancherel, $W^1(f) = \langle f, L\rangle \le \|f\|_{q'}\,\|L\|_q$ for any $1/q + 1/q' = 1$, by Hölder. Use hypercontractivity on $\|L\|_q$, noting that $L$ has degree 1. One gets $\|L\|_q \le \sqrt{q-1}\,\|L\|_2 = \sqrt{q-1}\,\sqrt{W^1(f)}$. On the other hand, we have $\|f\|_{q'} = \mathbf{E}[f^{q'}]^{1/q'} \le p^{1/q'} = p^{1-1/q}$ where $p = \mathbf{E}[f]$, since $f \le 1$ and $q' \ge 1$. Putting it together, we get $\sqrt{W^1(f)} \le p^{1-1/q}\sqrt{q-1}$, which implies $W^1(f) \le p^2\, q\, (1/p)^{2/q}$. Now we just need to choose $q$ to balance $q$ and $(1/p)^{2/q}$; the optimal choice is $q = \Theta(\log(1/p))$, and this completes the proof, giving $W^1(f) \le O(p^2 \log(1/p))$.
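
(A quick sanity check, not from the lecture or the homework: the Python sketch below brute-forces the Fourier expansion of a few indicator functions and compares the level-1 weight $W^1(f)$ against $p^2 \log(1/p)$. The AND-of-$k$-coordinates family and the particular $n$ are just my illustrative choices.)

```python
import itertools
import math

def level1_weight(f_vals, n):
    """Given f: {-1,1}^n -> {0,1} as a dict keyed by points of {-1,1}^n,
    return (p, W1) with p = E[f] and W1 = sum over |S|=1 of fhat(S)^2."""
    points = list(itertools.product([-1, 1], repeat=n))
    N = len(points)
    p = sum(f_vals[x] for x in points) / N
    W1 = 0.0
    for i in range(n):
        coeff = sum(f_vals[x] * x[i] for x in points) / N   # fhat({i}) = E[f(x) x_i]
        W1 += coeff ** 2
    return p, W1

# Example family: f = indicator that the first k coordinates all equal 1,
# so p = 2^{-k}.  Talagrand's Lemma predicts W1(f) = O(p^2 log(1/p)).
n = 10
for k in range(1, n + 1):
    f = {x: int(all(xi == 1 for xi in x[:k]))
         for x in itertools.product([-1, 1], repeat=n)}
    p, W1 = level1_weight(f, n)
    bound = p ** 2 * math.log(1 / p)
    print(f"k={k:2d}  p={p:.4g}  W1={W1:.4g}  p^2 ln(1/p)={bound:.4g}  ratio={W1 / bound:.3f}")
```

For this family the ratio stays at the constant $1/\ln 2 \approx 1.44$, illustrating that the $p^2\log(1/p)$ bound is tight up to a constant.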

Tuesday, April 24, 2007

Randomized DT complexity of monotone graph properties

I posted the lecture notes for today; they contain a bit more information about Yao's Conjecture ($f$ a monotone graph property $\Rightarrow$ $R(f) \ge \Omega(v^2)$).

Edinburgh

Just got back from a workshop on Geometry and Algorithms in Edinburgh. Was there any Fourier analysis? But of course! What would geometry and algorithms be without it?

Unfortunately, though, it's hard to get it into one's talk in 50 minutes, since one has to cover the geometry and algorithms parts. Guy Kindler mentioned some Fourier analysis in connection with UGC-hardness for Max-Cut. Assaf Naor talked a bit about his upcoming magnum opus with Manor Mendel on "Metric Cotype", which will appear in the Annals of Math. That paper is pretty much all Fourier analysis for functions on $\mathbb{Z}_m^n$, but with $m > 2$. And Robi Krauthgamer came closest to actually doing some Fourier analysis in his slides, while talking about his work on the communication complexity and embeddability of Edit Distance. Unfortunately, his paper (joint with Alexandr Andoni) doesn't seem to be available yet.

I did my best to work some in while talking about the parallel repetition/foams stuff I talked about at the ACO Seminar in March, but no dice. I do think the Fourier analysis we used in that result was pretty cute though; basically: (1) Talagrand's Lemma from Homework 5 Problem 2; and (2) the reversed form of the Hypercontractive Inequality, namely:

If $f\colon\{-1,1\}^n \to \mathbb{R}^{\ge 0}$ and $-\infty < q \le p \le 1$, then $\|T_\rho f\|_q \ge \|f\|_p$ so long as $0 \le \rho \le \sqrt{(1-p)/(1-q)}$.


You know -- just like the usual inequality, but reversed.
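
(To see the reversed inequality in action, here is a small brute-force check of my own, not from the course materials. It evaluates $T_\rho f$ exactly on $\{-1,1\}^4$ for a random strictly positive $f$ and compares the two norms at the critical $\rho = \sqrt{(1-p)/(1-q)}$; the choices $p = 1/2$, $q = -1$ are just for illustration.)

```python
import itertools
import math
import random

def pnorm(vals, p):
    """||f||_p = (E[f^p])^(1/p) for strictly positive f and any nonzero real p."""
    return (sum(v ** p for v in vals) / len(vals)) ** (1.0 / p)

def T_rho(f, n, rho):
    """Exact noise operator on {-1,1}^n: (T_rho f)(x) = E[f(y)], where each
    y_i independently equals x_i with probability (1+rho)/2."""
    pts = list(itertools.product([-1, 1], repeat=n))
    return {x: sum(f[y] * math.prod((1 + rho * xi * yi) / 2
                                    for xi, yi in zip(x, y))
                   for y in pts)
            for x in pts}

random.seed(0)
n = 4
pts = list(itertools.product([-1, 1], repeat=n))
f = {x: random.uniform(0.1, 2.0) for x in pts}   # a strictly positive f

p, q = 0.5, -1.0                                 # -inf < q <= p <= 1
rho = math.sqrt((1 - p) / (1 - q))               # the critical rho
Tf = T_rho(f, n, rho)
lhs = pnorm(list(Tf.values()), q)
rhs = pnorm(list(f.values()), p)
print(f"||T_rho f||_q = {lhs:.4f}  vs  ||f||_p = {rhs:.4f}  (expect lhs >= rhs)")
```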

Friday, April 20, 2007

Homework #5

Hope you enjoyed Avrim on Tuesday and a day off on Thursday.
Please post any questions or comments about Homework #5 here in the comments.

Sunday, April 08, 2007

Homework #4 graded

The average was 35/40, with a high of 38. As usual, pretty much everybody knew what was going on.

The most popular questions seemed to be #1 and #2, even though those were definitely the longest (especially #1!). #5 is definitely the fastest, if you see how to do it.

Homework #4

Solutions are posted on the course web page.

Thursday, April 05, 2007

Geometry of correlated Gaussians

In class we were discussing the geometry of ρ-correlated, n-dimensional Gaussians; call them g, h.

We have that $\|g\|^2 = \sum_{i=1}^n g_i^2$. This is the sum of $n$ very well-behaved independent random variables, each with mean 1 and variance 2, so its value will be $n \pm O(\sqrt{n})$ with very high probability. (To be more accurate, say $n \pm O(\sqrt{n \log n})$ with probability at least $1 - n^{-O(1)}$.) Hence $\|g\|$ will be $\sqrt{n}\,(1 \pm O(1/\sqrt{n}))$ with high probability. Since $h$ is also distributed as a standard $n$-dimensional Gaussian, the same is true of $\|h\|$.

Now imagine $g$ is fixed and we choose $h$ to be $\rho$-correlated to $g$. Then $g \cdot h = \sum_{i=1}^n g_i h_i = \sum_{i=1}^n \bigl(\rho g_i^2 + \sqrt{1-\rho^2}\, g_i g_i'\bigr)$, where the $g_i'$ random variables are independent standard normals (writing $h_i = \rho g_i + \sqrt{1-\rho^2}\, g_i'$). I.e., the dot product is $\rho\|g\|^2$ plus an independent one-dimensional normal, $\sqrt{1-\rho^2}\, N(0, \sum_i g_i^2)$. Since $\|g\|^2 = \sum_i g_i^2 = n \pm O(\sqrt{n})$ with high probability, the dot product is $\rho n \pm O\bigl((\rho + \sqrt{1-\rho^2})\sqrt{n}\bigr)$ with high probability.

So the cosine of the angle between $g$ and $h$ will be $(\rho n \pm O(\sqrt{n}))/(n \pm O(\sqrt{n})) = \rho \pm O(1/\sqrt{n})$, and hence the angle will be $\arccos \rho \pm O(1/\sqrt{n})$ with high probability (assuming $\rho$ is treated as a constant in $(0,1)$).
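
(Here's a quick numpy simulation of these estimates -- my own sketch, using the standard construction $h = \rho g + \sqrt{1-\rho^2}\, g'$ of a $\rho$-correlated copy; the values of $n$, $\rho$, and the number of trials are arbitrary choices.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 10_000, 0.6

def correlated_pair():
    g = rng.standard_normal(n)
    h = rho * g + np.sqrt(1 - rho**2) * rng.standard_normal(n)  # rho-correlated with g
    return g, h

g, h = correlated_pair()
print("||g||^2 / n   ~", g @ g / n)        # ~ 1 +- O(1/sqrt(n))
print("(g . h) / n   ~", g @ h / n)        # ~ rho +- O(1/sqrt(n))

angles = []
for _ in range(200):
    g, h = correlated_pair()
    cos = g @ h / (np.linalg.norm(g) * np.linalg.norm(h))
    angles.append(np.arccos(cos))
print("mean angle    ~", np.mean(angles), "  vs arccos(rho) =", np.arccos(rho))
print("angle stddev  ~", np.std(angles), "  vs 1/sqrt(n)    =", 1 / np.sqrt(n))
```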

Now this doesn't really yet prove that $g$ and $h$ are distributed like a random pair on the surface of the radius-$\sqrt{n}$ sphere with angle $\arccos \rho$. We just know that their angle will be about $\arccos \rho$ and they will both be close to the sphere. We should really look at the distribution of $h - \rho g$. But this is just a scaled $n$-dimensional Gaussian, so its distribution is spherically symmetric. I guess this more or less justifies the overall claim.
Actually, I believe that $h - \rho g$ will be close to orthogonal to $\rho g$ with high probability. The dot product between the two will be distributed like $\rho\sqrt{1-\rho^2}\, N(0, \sum_i g_i^2)$, which will typically be on the order of $\rho\sqrt{1-\rho^2}\,\sqrt{n}$. However, the product of the lengths of the two vectors will be like $\rho\sqrt{1-\rho^2}\, n$. So the cosine of the angle between them will be like $1/\sqrt{n}$, and the angle will be close to 90 degrees.
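
(And a companion check for this last claim, same caveats as above: the cosine between $h - \rho g$ and $\rho g$ indeed comes out on the order of $1/\sqrt{n}$.)

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho = 10_000, 0.6

g = rng.standard_normal(n)
h = rho * g + np.sqrt(1 - rho**2) * rng.standard_normal(n)
w = h - rho * g                                   # the residual vector

cos = w @ (rho * g) / (np.linalg.norm(w) * np.linalg.norm(rho * g))
print("cos(angle between h - rho*g and rho*g) ~", cos)
print("angle in degrees                       ~", np.degrees(np.arccos(cos)))
print("1/sqrt(n), for comparison              =", 1 / np.sqrt(n))
```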

Wednesday, April 04, 2007

Approximating not-nice functions by nice functions

In the last class I asserted that it was possible to approximate certain not-nice functions like $1_{t < t_0}$ and $|t|$ by $B$-nice functions. For example, I asserted that for all $t_0$ and $0 < \lambda < 1/2$ there exists a function $\Delta_{t_0,\lambda}\colon \mathbb{R} \to \mathbb{R}$ which is $O(1/\lambda^4)$-nice and approximates the $t_0$-step-function in the following sense: $\Delta_{t_0,\lambda}(t) = 1$ for $t < t_0 - \lambda$; $\Delta_{t_0,\lambda}(t) = 0$ for $t > t_0 + \lambda$; and, $0 \le \Delta_{t_0,\lambda}(t) \le 1$ for $|t - t_0| \le \lambda$.

There was a question in class as to whether this could really be possible. Specifically, if $\Delta_{t_0,\lambda}$ is 0 for all $t > t_0 + \lambda$, and it's smooth, then it has all derivatives equal to 0 for all $t > t_0 + \lambda$. Shouldn't this make it 0 everywhere?

Well, not so. It's true that for any $t > t_0 + \lambda$, all derivatives are 0. But Taylor's theorem does not force such a function to be 0 everywhere. Remember that Taylor (for smooth functions) just says that for any $r$, $f(x+\varepsilon) = f(x) + \varepsilon f'(x) + \frac{\varepsilon^2}{2!} f''(x) + \cdots + \frac{\varepsilon^{r-1}}{(r-1)!} f^{(r-1)}(x) + \frac{\varepsilon^r}{r!} f^{(r)}(y)$ for some $y \in [x, x+\varepsilon]$.

For a concrete example, consider the function $\varphi(t)$ which is $\exp(-1/(1-t^2))$ for $|t| < 1$, and is 0 for $|t| \ge 1$. It's easy to see that $\varphi$ is smooth on $(-1,1)$, and it's not hard to check that its derivatives, of all orders, at $\pm 1$ (when approached from inside) are 0.
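
(If you want to convince yourself numerically, here's a little sympy computation -- my own illustration -- showing that $\varphi$ is not identically zero yet its first few derivatives become fantastically small as $t$ approaches 1 from inside.)

```python
import sympy as sp

t = sp.symbols('t')
phi = sp.exp(-1 / (1 - t**2))            # the bump function on (-1, 1)

print("phi(0) =", phi.subs(t, 0))        # exp(-1): not identically zero
for k in range(3):                       # phi, phi', phi''
    dphi = sp.diff(phi, t, k)
    for j in (2, 3, 4):
        pt = 1 - sp.Rational(1, 10**j)   # points approaching 1 from inside
        print(f"phi^({k})(t={pt}) ~ {dphi.subs(t, pt).evalf(20)}")
```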

In fact, functions like $\varphi$ (known as bump functions or mollifiers) are what one uses to create functions like $\Delta_{t_0,\lambda}$ -- essentially, by taking $\Delta_{t_0} * \varphi_\lambda$, where $\Delta_{t_0}$ denotes the actual discontinuous step function, $*$ denotes convolution, and $\varphi_\lambda$ denotes a compressed version of $\varphi$, viz., $\varphi_\lambda(t) = \varphi(t/\lambda)/\lambda$ (suitably normalized so that it has total integral 1).
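
(Here's a rough numerical version of this construction -- my own sketch, with arbitrary choices of $t_0$, $\lambda$, and grid, and with $\varphi_\lambda$ rescaled to have total mass 1 so the convolution lands in $[0,1]$.)

```python
import numpy as np

t0, lam = 0.0, 0.3
ts = np.linspace(-2, 2, 4001)
dt = ts[1] - ts[0]

def bump(u):
    """phi(u) = exp(-1/(1-u^2)) on (-1,1), and 0 outside."""
    out = np.zeros_like(u)
    inside = np.abs(u) < 1
    out[inside] = np.exp(-1 / (1 - u[inside] ** 2))
    return out

phi_lam = bump(ts / lam) / lam
phi_lam /= phi_lam.sum() * dt                  # normalize to total integral 1

step = (ts < t0).astype(float)                 # the discontinuous step Delta_{t0}
smooth = dt * np.convolve(step, phi_lam, mode='same')   # Delta_{t0} * phi_lambda

valid = np.abs(ts) <= 1.5                      # ignore edge effects of the finite grid
print("min on t < t0 - lam  :", smooth[valid & (ts < t0 - lam)].min())   # ~ 1
print("max on t > t0 + lam  :", smooth[valid & (ts > t0 + lam)].max())   # ~ 0
print("range on |t-t0|<=lam :", smooth[valid & (np.abs(ts - t0) <= lam)].min(),
      smooth[valid & (np.abs(ts - t0) <= lam)].max())                    # within [0, 1]
```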