18-859S: Analysis of Boolean Functions: March 2007

Tuesday, March 27, 2007

Ben Green and Terry Tao

Seems like Ben Green and (Fields Medalist) Terry Tao are into the analysis of boolean functions more and more these days. Check out Terry's recent blog post on Freiman's theorem in $\mathbb{F}_2^n$, AKA $\{-1,1\}^n$, and Ben's earlier guest post on the same blog on the Polynomial Freiman-Ruzsa conjecture. Their latest two papers (mentioned in Terry's post) have all of our favorites -- the boolean cube, Fourier analysis, monotone functions, shifting, quasirandomness, linear threshold functions.

If only I understood them...

As usual, they look impeccably written, so it's just a matter of putting in the time to read them. If any of you wants to do this and then explain them to me, that would be great :)

For a warmup, I always recommend Ben Green's terrific survey article, Finite Field Models In Additive Combinatorics. The main difference between their world and ours is that we obsess over the low-degree Fourier coefficients, treating them as most notable. On the other hand, they are perfectly egalitarian about $\hat{f}(S)$ across $S$; from their arithmetic point of view, all parity functions are created equal.

Friday, March 23, 2007

How much are increasing sets positively correlated?

There's a paper of Talagrand, one of the masters of Fourier analysis of boolean functions, called "How much are increasing sets positively correlated?" which I've always kind of known about but never really fully read. A few days ago I studied it more closely, and it turns out to be really nice, with proofs that are not too hard (although the notation took me a while to decipher). Maybe I'll show some of it in a future class.

It proves some nice facts, including:

If $f : \{-1,1\}^n \to \{-1,1\}$ has $Var[f] \leq p$, then $W_1(f) = \sum_{|S| = 1} \hat{f}(S)^2 \leq O(p \log (1/p))$. (This is a fact that I seem to use all the time in my work...)
If $\mathbb{I}\mathbb{I}(f)$ denotes $\sum_i Inf_i(f)^2$, then $W_2(f) = \sum_{|S| = 2} \hat{f}(S)^2 \leq O(\mathbb{I}\mathbb{I}(f) \log(1/\mathbb{I}\mathbb{I}(f)))$. (This is the main technical lemma/theorem in the paper.)
There is a monotone function $f : \{-1,1\}^n \to \{-1,1\}$ for which a constant fraction of all $x$ have the property that $f$ has $\Omega(\sqrt{n})$ sensitive coordinates on $x$. This immediately implies that the function has $\mathbb{I}(f) \geq \Omega(\sqrt{n})$, but in a much different way than $\mathbb{I}(Maj_n) \geq \Omega(\sqrt{n})$. With Majority, most $x$'s have $0$ sensitive coordinates, but a $\Omega(1/\sqrt{n})$ fraction of them have $(n+1)/2$ sensitive coordinates. With this function, the distribution on the number of sensitive coordinates is much simpler -- a constant fraction have $\Omega(\sqrt{n})$. The function is constructed randomly -- a random DNF of width $\sqrt{n}$ and size $(\ln 2) 2^{\sqrt{n}}$.

Actually, this last function, one can check, gives an example of a monotone function that has $NS_{O(1/\sqrt{n})}(f) \geq \Omega(1)$; and as we saw when studying hardness amplification, $\Theta(1/\sqrt{n})$ is the least noise rate to which a monotone function can be sensitive.

An important improvement to the second bullet point above came from a notable paper of Benjamini, Kalai, and Schramm: They showed that in fact $W_k(f) \leq O(\mathbb{I}\mathbb{I} (f) \log^{k-1}(1/\mathbb{I}\mathbb{I}(f))$. I'm trying to use this fact right now for a problem I'm working on...

Wednesday, March 21, 2007

Homework #4

No time like the present to get started on Homework #4. Post comments/questions here.

Tuesday, March 20, 2007

Homework #3 graded

I graded Homework #3. The mean and median were both 39/50, with a high score of 48/50.

Kudos to Runting Shi and Yi Wu for (implicitly, by giving correct answers) noticing that the weighting function in (6b) is a sum of Kravchuk polynomials, not a Kravchuk polynomials as I mistakenly wrote in the solutions.

Kudos also to Ryan Williams for improving the upper bound in problem 1 in some cases: He showed $\mathbb{I}(f) \leq (2 - 2\alpha/(1+\alpha)) w$ when $f$ is a DNF of width $w = \alpha n$. Note that this is tight in the case $\alpha = 1$, by the Parity function.

Open problem note: I'm guessing that indeed $\mathbb{I}(f) \leq w$ for any width-$w$ DNF. Might be a nice thing to try to prove, although I couldn't guarantee that it isn't already known.

Monday, March 19, 2007

Homework #4 is out

Hi all. Homework #4 is posted on the course homepage; it's due Tuesday April 3. I'll bring hard copies to class tomorrow. I should also have Homework #3 graded by tomorrow.

Friday, March 16, 2007

History of the Hypercontractivity Theorem

Hope you're having a good spring break. I put up the lecture notes for Lecture 16, on the Hypercontractivity Theorem. I find the history of the theorem quite fascinating, so I've put a slightly briefer version below. If you read French, you may also want to look at Djalil Chafai's version. Gross also has a nice survey about the history, but unfortunately I can't find it right now...

---

The history of the Hypercontractivity Theorem is quite involved, and
interesting. To understand the history, it's important to first note
that it has a "Gaussian'' version. In this setting it applies to functions $f : \R^n \to \R$, where $\R^n$ is thought of as having the Gaussian distribution and the $T_\rho$ operator is an appropriately generalized linear operator (specifically, the "Ornstein-Uhlenbeck operator" from mathematical physics, which we'll encounter later). Gross observed in 1975 that this Gaussian version follows from our boolean version, by a straightforward argument using the Central Limit Theorem.

An early version of the theorem in the Gaussian setting was proved by Edward Nelson of Princeton in the mid-60's [Nel:66]; I think it was the $q = 4$, $p = 2$ case, possibly with a constant bigger than $1$ on the right side. This was in an important paper on quantum field theory. Several works in subsequent years (e.g., Glimm '68, Federbush '69) improved on the result, and the culmination was the complete proof of the theorem in the Gaussian setting, due to Nelson again [Nel:73]. Nelson's two papers won him the Steele Prize. He is an interesting character, having gone on to work on foundations of mathematics, bounded arithmetic, and automatic proof verification; he is now well-known for having invented Internal Set Theory, a partial axiomatization of Nonstandard Analysis.

In 1973, Leonard Gross proved a limiting version of the theorem called a Logarithmic Sobolev Inequality, and deduced Nelson's Hypercontractive theorem from it [Gro:75]. His proof was in the boolean setting, getting the Gaussian setting via the Central Limit Theorem. However, it turned out that the proof of the Hypercontractivity Theorem (in the boolean setting) was not new; it was first proved by Aline Bonami [Bon:70] in 1979. In fact, Bonami had proved the $q = 4$, $p = 2$ case in the boolean setting even earlier [Bon:68].

The history of the theorem can be traced even further back; Bonami's work was informed by that of Walter Rudin, who proved [Rud:60] similar inequalities in the setting of $\Z_n$ rather than $\{-1,1\}^n$ (``one of my favorite papers'' -- Rudin). Further, a version of the log-sobolev inequality in the Euclidean (rather than Gaussian) setting was proved by A. J. Stam [Sta:59] in 1959, in work on Fisher/Shannon information theory -- much closer to the world of computer science!

Finally, the Hypercontractivity Theorem was introduced to the world of theoretical computer science in the groundbreaking work of Kahn, Kalai, and Linial [KKL:88]. Unfortunately, they attributed the theorem to William Beckner [Bec:75], which I'm not sure is exactly an accurate accreditation. Beckner's work was in fact important followup work on the work of Nelson and Gross, making extensions to Euclidean and complex settings.

In the computer science theory literature, the Hypercontractivity Theorem is often called "Beckner's Theorem". Lately there has been a move towards "Bonami-Beckner Theorem", although "Bonami Theorem" would respect the original discoverer and "Bonami-Gross Theorem" might more properly respect the independent discovery. To sidestep the issue, we will simply call it the Hypercontractivity Theorem.

Tuesday, March 13, 2007

Solutions for Homework 3

Are posted.

Thursday, March 08, 2007

Shouts-out

On the topic of hypercontractivity, I just wanted to give a couple of shouts-out. Back when I learned this stuff, I didn't have any intuition at all about the hypercontractivity theorem, and would ask all and sundry, "Do you have any idea what's really going on here?" "Do you have any idea what's really going on here?"

Krzysztof Oleszkiewicz showed me the slick, simple induction proof of $(2, 4, 1/\sqrt{3}^d)$-hypercontractivity for boolean functions with degree at most $d$. I asked him, "Who is this proof due to? You?" He said, "No no no, not me. Folklore, I guess."

Top Analysis-of-Boolean-Functions expert Guy Kindler showed me the nice level-sets, geometrical-picture proof of the Two-Point Inequality.

So -- thanks to both of them!

Wednesday, March 07, 2007

Thursday's class: a warning

Hi all. I know we're all in the School of Computer Science, but Thursday's class is going to be all math, all the time. We've already proved most (all?) of the hypercontractivity stuff we'll need. But you can't have a class on Fourier Analysis of Boolean Functions without actually doing the full-blown Bonami-Beckner theorem. I mean, just for culture's sake.

So I'm just warning you, it's going to be all hot Minkowski Inequality and Generalized Binomial Theorem action on Thursday. I think I'll scribe myself.

Tuesday, March 06, 2007

Homework 3 Corrections

Hi -- two more corrections for Homework 3. I guess this is what you get when running a course for the first time...

Question 1: Eliminate the part "and show that this is tight". In fact, I strongly believe it's not tight. For extra credit, improve on the constant factor 2.

Question 3a: This one is too hard. I thought I knew an easy way to do it, but I was mistaken. For the story on this problem, see this paper of Amano and Maruoka (but note that the theorem in their paper improving on KKL was already known).

A corrected version of the homework is on the course web page.

--------

Two small additional clarifications: In 4a, you may assume the algorithm knows $\|\hat{f}\|_1$. In 5a, as Aaron pointed out, it would probably be more sensible if I asked for you to show the estimate is within $\sqrt{\epsilon/n^d}$ rather than $\epsilon$; however, it doesn't really matter since $\epsilon$ is a parameter.

Monday, March 05, 2007

Homework correction + the Ajtai-Linial Function

First, Aaron correctly pointed out a mistake on the problem set. In 4a, you should also assume the algorithm has query access to $f$. The corrected version of the homework is on the course web page.

Now for the real topic of the post. Recall that KKL proved that if $f : \{-1,1\}^n \to \{-1,1\}$ is any roughly balanced function, there is coalition $J \subseteq [n]$ of size $O(n / \log n)$ with influence .999 on $f$.

A construction of Ajtai and Linial (The influence of large coalitions, Combinatorica 13, 1993; annoyingly, not available online) shows this is close to optimal.

The [AL] construction is randomized. Recall the Tribes function, which breaks up the input into disjoint blocks ("tribes") of size about $\log n - C \log \log n$ and takes the OR across blocks of the AND of the blocks. Here $C$ is a carefully chosen constant (around $1$) picked so that the overall function is roughly balanced.

Let's say a random nonmonotone Tribes function of width $w$ is given by randomly partitioning $[n]$ into blocks of size $w$, randomly choosing whether to negate each variable (50-50), and then again ANDing within blocks and ORing across blocks.

Choose $w$ so that these functions will be True with probability around $1 - (\ln 2)/n$. (This involves taking $C \sim 2$.)

Finally, the [AL] construction is this: Choose $n$ random nonmonotone Tribes functions with this width, and AND them together.

--------

Here is a sketch of the properties of this (randomly chosen) function:

1. The probability it is True is very close to $1/2$, with very high probability.

2. It's hard for small coalitions to force the function towards True. Because to do this, they have to force many of the random Tribes functions towards True, and for each such function ($\sim n$ of them), they have to force one of its Tribes (about $\log n$ variables) towards True. By the random construction, the different Tribes don't overlap much, so the coalition has to be nearly linear in size.

3. It's hard for small coalitions to force the function towards False. To do this, they have to force at least one of the Tribes functions towards False, and this requires fixing at least one variable in each tribe within the function ($\sim n/\log n$ variables).

--------

The actual theorem is that with high probability over the choice of $f$, $f$ is near-balanced and every coalition $J$ of size $o(n/\log^2 n)$ has influence $o(1)$ on $f$.

Oddly enough, the hardest part of the proof is showing that the functions are near-balanced with very high probability.

--------

Some open problems:

1. Close the gap between $n/\log^2 n$ and $n/\log n$. [AL] speculated that the latter is probably the truth, but I'm not so sure...

2. Give an explicit function where all $o(n/\log^2 n)$-size coalitions have $o(1)$ influence. In fact, I believe it's open even to do this for $c \sqrt{n}$-size coalitions for large enough $c$.

18-859S: Analysis of Boolean Functions