tag:blogger.com,1999:blog-367637312024-03-08T07:23:34.307-05:0018-859S: Analysis of Boolean FunctionsThe course blog for <a href="http://www.cs.cmu.edu/~odonnell/boolean-analysis/">Carnegie Mellon Computer Science course 18-859S</a>, offered in Spring 2007 with instructor Ryan O'Donnell.Ryan O'Donnellhttp://www.blogger.com/profile/01760886084136827344noreply@blogger.comBlogger50125tag:blogger.com,1999:blog-36763731.post-67488179230385887262007-05-04T11:08:00.000-04:002007-05-04T11:09:17.053-04:00ThanksWell, the course is over -- thanks to everyone for coming and making it a fun affair; I really appreciate it. Don't forget to fill in the <a href="http://www.cmu.edu/fce">course evaluation forms</a>.<br /><br />Have a good summer.Unknownnoreply@blogger.com4tag:blogger.com,1999:blog-36763731.post-36801175414082725622007-05-03T10:53:00.000-04:002007-05-04T11:08:07.936-04:00Open ProblemsThe final set of lecture notes are posted -- they contain open problems. Some are definitely harder than others; some are definitely more interesting than others. If you have any more questions about them, feel free to comment. Also, if you are interested in working on any of them, let me know! If you are looking for an easyish-to-think-about problem, one that you can think about idly with a pen and paper while riding the bus, I might suggest the problem of extending the decision tree/influences inequality to real-valued functions.<br /><br />I probably missed half a dozen better open problems out there. I may add to this list as they occur to me. To paraphrase Mitch Hedberg, these are the open problems I could think of <i>today</i>.Unknownnoreply@blogger.com3tag:blogger.com,1999:blog-36763731.post-63115776472339549662007-05-03T10:52:00.001-04:002007-05-03T10:52:34.521-04:00Homework #5 gradedHomework 5 is graded. The average was 90%, with two people getting 39/40.<br /><br />Most problems were done pretty well; interestingly, only one person attempted #7, even though it has the shortest solution after #3. A number of people did #2 but no one really got part (b) exactly correctly. The issue there is that you have a <i>real</i>-valued (not integer-valued) random variable and you know that it has exponentially small tails; you want to show that the contribution to its $L_1$-norm from large numbers is exponentially small. There is a battle here, between, the contribution of $|u|$ and the probability $O(e^{-u^2/2})$, which of course the exponentially small quantity is going to win; still, one has to integrate, as shown in the solutions. <br /><br />Also, <i>nobody</i> got #6(c) quite correct. The Tribes function is <i>not</i> exactly balanced; its empty Fourier coefficient is just known to be $\pm O(\log n / n)$. Thus $\mathbb{N}\mathbb{S}_{1/2 - \gamma}(T)$ will <i>always</i> be at most $1/2 - \Omega(\log^2 n / n^2)$, which is why the condition $\gamma \geq 1/n$ is necessary.<br /><br /><br />I'm also pleased to report that the overall homework average was 83%. Good job, y'all.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-28714650102709282042007-05-01T15:43:00.001-04:002007-05-01T15:44:11.425-04:00Course evaluationPlease take a minute to fill in the <a href="http://www.cmu.edu/fce">course evaluation forms</a>!Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-36763731.post-61045125642547038402007-05-01T15:41:00.000-04:002007-05-01T15:42:40.521-04:00Szemeredi's Regularity Lemma, full versionThe lecture notes for today are up; they include a full proof of the Szemeredi Regularity Lemma for $\mathbb{F}_2^n$ and also the associated result about testing "triangle"-freeness of subsets of $\mathbb{F}_2^n$.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-39976635738355999512007-04-25T15:57:00.000-04:002007-04-25T16:33:32.667-04:00Homework #5 & TalagrandSolutions are posted.<br /><br />Regarding Question #2: I referred to this as "Talagrand's Lemma", and that's how I personally think of it, having learned it first as Proposition 2.2 in Michel Talagrand's ingenious paper <span style="font-style:italic;"><a href="http://www.math.ohio-state.edu/~talagran/preprints/positivecor.dvi">How much are increasing sets positively correlated?</a></span>. <br /><br />I think that, really, it should count as a folklore result. For example, in arithmetic combinatorics I think it's sometimes called <a href="http://math.ucr.edu/mathweb/People/IndividualPages/Chang.html">Chang's Theorem</a>. It is also the pith of the very technical proof of Bourgain's Theorem on the Fourier tails of boolean functions (not available online, but appearing in a <a href="http://www.cims.nyu.edu/~naor/homepage%20files/nonembed-final-new.pdf">paper of Khot and Naor</a>). Bourgain uses a different proof, but doesn't quite derive it explicitly. The different proof relies on hypercontractivity and is extremely short: <br /><br />Given $f : \{-1,1\}^n \to \mathbb{R}$, let $L$ be the degree-1 part of $f$. Then by Plancherel, $W_1(f) = \langle f, L \rangle \leq \|f\|_{q'} \|L\|_{q}$ for any $1/q + 1/q' = 1$, by Holder. Use hypercontractivity on $\|L\|_q$, noting that it has degree 1. One gets $\|L\|_q \leq \sqrt{q-1} \|L\|_2 = \sqrt{q-1} \sqrt{W_1(f)}$. On the other hand, we have $\|f\|_{q'} = E[|f|^{q'}]^{1/q'} \leq p^{1/q'} = p^{1 - 1/q}$ where $p = E[|f|]$, since $|f| \leq 1$ and $q' \geq 1$. Putting it together, we get $\sqrt{W_1(f)} \leq p^{1-1/q} \sqrt{q - 1}$, which implies $W_1(f) \leq p^2 \cdot q(1/p)^{2/q}$. Now we just need to choose $q$ to balance $q$ and $(1/p)^{2/q}$; the optimum choice is $q = \Theta(\log(1/p))$, and this completes the proof.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-59187217662982747852007-04-24T13:54:00.001-04:002007-04-24T13:59:24.283-04:00Randomized DT complexity of monotone graph propertiesI posted the lecture notes for today; they contain a bit more information about Yao's Conjecture ($f$ a monotone graph property $\Rightarrow R(f) \geq \Omega(v^2)$).Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-80433383350034790972007-04-24T08:55:00.000-04:002007-04-24T09:15:25.565-04:00EdinburghJust got back from a <a href="http://icms.org.uk/workshop.php?id=3">workshop on Geometry and Algorithms</a> in Edinburgh. Was there any Fourier analysis? But of course! What would geometry and algorithms be without it?<br /><br />Unfortunately, though, it's hard to get it into one's talk in 50 minutes, since one has to cover the geometry and algorithms parts. Guy Kindler mentioned some Fourier analysis in connection with UGC-hardness for Max-Cut. Assaf Naor talked a bit about his <a href="http://www.cims.nyu.edu/%7Enaor/homepage%20files/COTYPE.pdf">upcoming magnum opus</a> with Manor Mendel on "Metric Cotype" which will appear in the Annals of Math. That paper is pretty much all Fourier analysis for functions $f : {\mathbb{Z}_m}^n \to \mathbb{R}$, but with $m > 2$. And Robi Krauthgamer came closest to actually doing some Fourier analysis in his slides, while talking about his work on the communication complexity and embeddability of Edit Distance. Unfortunately, his paper (joint with Alexandr Andoni) doesn't seem to be available yet.<br /><br />I did my best to get there while talking about the parallel repetition/foams stuff I talked about at the <a href="http://magic.aladdin.cs.cmu.edu/2007/03/07/aco-seminar-2007-03-08/">ACO Seminar</a> in March, but no dice. I do think the Fourier analysis we used in that result was pretty cute though; basically: (1) Talagrand's Lemma from Homework 5 Problem 2; and (2) the <span style="font-style: italic;">reversed</span> form of the Hypercontractive Inequality, namely:<br /><br /><blockquote>If $f : \{-1,1\}^n \to \mathbb{R}^{\geq 0}$ and $-\infty < q \leq p \leq 1$, then $\|T_\rho f\|_q \geq \|f\|_p$ so long as $0 \leq \rho \leq \sqrt{(1-p)/(1-q)}$.</blockquote><br /><br />You know -- just like the usual inequality, but reversed.Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-36763731.post-7341135580319356122007-04-20T09:16:00.000-04:002007-04-20T09:18:35.559-04:00Homework #5Hope you enjoyed Avrim on Tuesday and a day off on Thursday. <br />Please post any questions or comments about Homework #5 here in the comments.Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-36763731.post-78639344845069300202007-04-08T23:47:00.000-04:002007-04-08T23:50:36.384-04:00Homework #4 gradedThe average was 35/40, with a high of 38. As usual, pretty much everybody knew what was going on.<br /><br />The most popular questions seemed to be #1 and #2, even though those were definitely the longest (especially #1!). #5 is definitely the fastest, if you see how to do it.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-45474819616521075352007-04-08T19:04:00.001-04:002007-04-08T19:04:47.023-04:00Homework #4Solutions are posted on the course web page.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-39905127375721701392007-04-05T13:58:00.000-04:002007-04-05T14:22:24.513-04:00Geometry of correlated GaussiansIn class we were discussing the geometry of $\rho$-correlated, $n$-dimensional Gaussians; call them $\vec{g}$, $\vec{h}$.<br /><br />We have that $\| \vec{g} \|^2 = \sum_{i=1}^n g_i^2$. This is the sum of $n$ very well-behaved independent random variables with mean $1$ and variance $2$, so its distribution value will be $n \pm O(\sqrt{n})$ with very high probability. (To be more accurate, say $n \pm O(\sqrt{n} \log n)$ with probability at least $1 - n^{-O(1)}$.) Hence $\|\vec{g}\|$ will be $\sqrt{n}(1 + O(1/\sqrt{n})$ with high probability. Since $\vec{h}$ is also distributed as a standard $n$-dimensional Gaussian, the same is true of $\vec{h}$.<br /><br />Now imagine $\vec{g}$ is fixed and we choose $\vec{h}$ to be $\rho$-correlated to $\vec{g}$. Then $\vec{g} \cdot \vec{h} = \sum_{i=1}^n g_i h_i = \sum_{i=1}^n (\rho g_i^2 + \sqrt{1-\rho^2} g_i g_i')$, where the $g_i'$ random variables are independent standard normals. I.e., the dot product is $\rho \|\vec{g}\|^2$ plus an independent one-dimensional normal, $\sqrt{1-\rho^2} \cdot N(0, \sum_i g_i^2)$. Since $\|\vec{g}\|^2 = \sum_i g_i^2 = n \pm O(\sqrt{n})$ with high probability, the dot product is $\rho n \pm O((\rho + \sqrt{1-\rho^2})\sqrt{n})$ with high probability.<br /><br />So the cosine of the angle between $\vec{g}$ and $\vec{h}$ will be $(\rho n \pm O(\sqrt{n}))/(n \pm O(\sqrt{n})) = \rho \pm O(1/\sqrt{n})$, and hence the angle will be $\arccos \rho \pm O(1/\sqrt{n})$ with high probability (assuming $\rho$ is treated as a constant in $(0,1)$).<br /><br />Now this doesn't really yet prove that $\vec{g}$ and $\vec{h}$ are distributed like a random pair on the surface of the $\sqrt{n}$-radius sphere with angle $\arccos \rho$. We just know that their angle will be $\arccos \rho$ and they will both be on the sphere. We should really look at the distribution of $\vec{h} - \rho \vec{g}$. But this is just a scaled $n$-dimensional Gaussian, so its distribution is spherically symmetric. I guess this more or less justifies the overall claim. <br />Actually, I believe that $\vec{h} - \rho \vec{g}$ will be close to orthogonal to $\rho \vec{g}$ with high probability. The dot product between the two will be distributed like $\sqrt{1-\rho^2} \cdot N(0, \sum_i g_i^2)$, which will probably be on the order of $\sqrt{1-\rho^2}\sqrt{n}$. However the product of the lengths of the two vectors will be like $\rho \sqrt{1-\rho^2} n$. So the cosine of the angle between them will be like $1/(\rho \sqrt{n})$, and the angle will be close to $90$ degrees.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-60500216408142970992007-04-04T19:24:00.000-04:002007-04-04T23:04:18.659-04:00Approximating not-nice functions by nice functionsIn the last class I asserted that it was possible to approximate certain not-nice functions like $1_{t < t_0}$ and $|t|$ by $B$-nice functions. For example, I asserted that for all $t_0 \in \mathbb{R}$ and $0 < \lambda < 1/2$ there exists a function $\Delta_{t_0,\lambda} : \mathbb{R} \to \mathbb{R}$ which is $O(1/\lambda^4)$-nice and approximates the $t_0$-step-function in the following sense: $\Delta_{t_0, \lambda}(t) = 1$ for $t < t_0 - \lambda$; $\Delta_{t_0, \lambda}(t) = 0$ for $t > t_0 + \lambda$; and, $0 \leq \Delta_{t_0, \lambda}(t) \leq 1$ for $|t - t_0| \leq \lambda$.<br /><br />There was a question in class as to whether this could really be possible. Specifically, if $\Delta_{t_0, \lambda}$ is $0$ for all $t > t_0 + \lambda$, and it's smooth, then it has all derivatives equal to $0$ for all $t > t_0 + \lambda$. Shouldn't this make it $0$ everywhere?<br /><br />Well, not so. It's true that for any $t > t_0 + \lambda$, all derivatives are $0$. But Taylor's theorem does not force such a function to be $0$ everywhere. Remember that Taylor (for smooth functions) just says that for any $r \in \mathbb{N}$, $f(x + \epsilon) = f(x) + \epsilon f'(x) + \frac{\epsilon^2}{2!} f''(x) + \cdots + \frac{\epsilon^{r-1}}{(r-1)!} f^{(r-1)}(x) \epsilon^{r-1} + \frac{\epsilon^r}{r!} f^{(r)}(y)$ for some $y \in [x, x + \epsilon]$.<br /><br />For a concrete example, consider the function $\phi(t)$ which is $\exp(-1/(1-t^2))$ for $|t| < 1$, and is $0$ for $|t| \geq 1$. It's easy to see that $\phi(t)$ is smooth on $(-1,1)$ and it's not hard to check that its derivatives, of all orders, at $\pm 1$ (when approached from inside) are $0$.<br /><br />In fact, functions like $\phi$ (known as <i>bump functions</i> or <a href="http://en.wikipedia.org/wiki/Mollifier"><i>mollifiers</i></a>) are what one uses to <i>create</i> functions like $\Delta_{t_0, \lambda}$ -- essentially, by taking $\Delta * \phi_\lambda$, where $\Delta_{t_0}$ denotes the actual discontinuous step function, $*$ denotes convolution, and $\phi_\lambda$ denotes a compressed version of $\phi$, viz., $\phi_(t/\lambda)/\lambda$.Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-36763731.post-34169302301495336592007-03-27T18:05:00.001-04:002007-03-27T18:14:16.280-04:00Ben Green and Terry TaoSeems like Ben Green and (Fields Medalist) Terry Tao are into the analysis of boolean functions more and more these days. Check out Terry's <a href="http://terrytao.wordpress.com/2007/03/22/freimans-theorem-in-finite-fields-via-extremal-set-theory/">recent blog post</a> on Freiman's theorem in $\mathbb{F}_2^n$, AKA $\{-1,1\}^n$, and Ben's <a href="http://terrytao.wordpress.com/2007/03/11/ben-green-the-polynomial-freiman-ruzsa-conjecture/">earlier guest post</a> on the same blog on the Polynomial Freiman-Ruzsa conjecture. Their latest two papers (mentioned in Terry's post) have all of our favorites -- the boolean cube, Fourier analysis, monotone functions, shifting, quasirandomness, linear threshold functions.<br /><br />If only I understood them...<br /><br />As usual, they look impeccably written, so it's just a matter of putting in the time to read them. If any of you wants to do this and then explain them to me, that would be great :)<br /><br />For a warmup, I always recommend Ben Green's terrific survey article, <a href="http://www.arxiv.org/pdf/math.NT/0409420">Finite Field Models In Additive Combinatorics</a>. The main difference between their world and ours is that we obsess over the <span style="font-style: italic;">low-degree </span>Fourier coefficients, treating them as most notable. On the other hand, they are perfectly egalitarian about $\hat{f}(S)$ across $S$; from their arithmetic point of view, all parity functions are created equal.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-41590196923150505982007-03-23T17:08:00.000-04:002007-03-23T17:24:25.343-04:00How much are increasing sets positively correlated?There's a paper of Talagrand, one of the masters of Fourier analysis of boolean functions, called "<a href="http://www.proba.jussieu.fr/users/talagran/preprints/positivecor.dvi">How much are increasing sets positively correlated?</a>" which I've always kind of known about but never really fully read. A few days ago I studied it more closely, and it turns out to be really nice, with proofs that are not too hard (although the notation took me a while to decipher). Maybe I'll show some of it in a future class.<br /><br />It proves some nice facts, including:<br /><ul><li>If $f : \{-1,1\}^n \to \{-1,1\}$ has $Var[f] \leq p$, then $W_1(f) = \sum_{|S| = 1} \hat{f}(S)^2 \leq O(p \log (1/p))$. (This is a fact that I seem to use all the time in my work...)</li><li>If $\mathbb{I}\mathbb{I}(f)$ denotes $\sum_i Inf_i(f)^2$, then $W_2(f) = \sum_{|S| = 2} \hat{f}(S)^2 \leq O(\mathbb{I}\mathbb{I}(f) \log(1/\mathbb{I}\mathbb{I}(f)))$. (This is the main technical lemma/theorem in the paper.)</li><li>There is a monotone function $f : \{-1,1\}^n \to \{-1,1\}$ for which a <span style="font-style: italic;">constant fraction</span> of all $x$ have the property that $f$ has $\Omega(\sqrt{n})$ sensitive coordinates on $x$. This immediately implies that the function has $\mathbb{I}(f) \geq \Omega(\sqrt{n})$, but in a much different way than $\mathbb{I}(Maj_n) \geq \Omega(\sqrt{n})$. With Majority, most $x$'s have $0$ sensitive coordinates, but a $\Omega(1/\sqrt{n})$ fraction of them have $(n+1)/2$ sensitive coordinates. With this function, the distribution on the number of sensitive coordinates is much simpler -- a constant fraction have $\Omega(\sqrt{n})$. The function is constructed randomly -- a random DNF of width $\sqrt{n}$ and size $(\ln 2) 2^{\sqrt{n}}$.<br /></li></ul>Actually, this last function, one can check, gives an example of a monotone function that has $NS_{O(1/\sqrt{n})}(f) \geq \Omega(1)$; and as we saw when studying hardness amplification, $\Theta(1/\sqrt{n})$ is the least noise rate to which a monotone function can be sensitive.<br /><br /><br />An important improvement to the second bullet point above came from a notable paper of B<a href="http://front.math.ucdavis.edu/math.PR/9811157">enjamini, Kalai, and Schramm</a>: They showed that in fact $W_k(f) \leq O(\mathbb{I}\mathbb{I} (f) \log^{k-1}(1/\mathbb{I}\mathbb{I}(f))$. I'm trying to use this fact right now for a problem I'm working on...Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-72906845509646917832007-03-21T22:12:00.000-04:002007-03-21T22:13:03.913-04:00Homework #4No time like the present to get started on Homework #4. Post comments/questions here.Unknownnoreply@blogger.com3tag:blogger.com,1999:blog-36763731.post-11826755361543590472007-03-20T11:14:00.000-04:002007-03-20T11:19:07.387-04:00Homework #3 gradedI graded Homework #3. The mean and median were both 39/50, with a high score of 48/50.<br /><br />Kudos to Runting Shi and Yi Wu for (implicitly, by giving correct answers) noticing that the weighting function in (6b) is a sum of Kravchuk polynomials, not a Kravchuk polynomials as I mistakenly wrote in the solutions.<br /><br />Kudos also to Ryan Williams for improving the upper bound in problem 1 in some cases: He showed $\mathbb{I}(f) \leq (2 - 2\alpha/(1+\alpha)) w$ when $f$ is a DNF of width $w = \alpha n$. Note that this is tight in the case $\alpha = 1$, by the Parity function.<br /><br />Open problem note: I'm guessing that indeed $\mathbb{I}(f) \leq w$ for any width-$w$ DNF. Might be a nice thing to try to prove, although I couldn't guarantee that it isn't already known.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-71422495348054759572007-03-19T17:07:00.000-04:002007-03-19T17:08:53.504-04:00Homework #4 is outHi all. Homework #4 is posted on the course homepage; it's due Tuesday April 3. I'll bring hard copies to class tomorrow. I should also have Homework #3 graded by tomorrow.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-14741163149749392272007-03-16T23:31:00.001-04:002007-03-16T23:47:37.479-04:00History of the Hypercontractivity TheoremHope you're having a good spring break. I put up the lecture notes for Lecture 16, on the Hypercontractivity Theorem. I find the history of the theorem quite fascinating, so I've put a slightly briefer version below. If you read French, you may also want to look at Djalil Chafai's <a href="http://hal.archives-ouvertes.fr/docs/00/04/18/00/PDF/sobolog-djalil.pdf">version</a>. Gross also has a nice survey about the history, but unfortunately I can't find it right now...<br /><br />---<br /><br />The history of the Hypercontractivity Theorem is quite involved, and<br />interesting. To understand the history, it's important to first note<br />that it has a "Gaussian'' version. In this setting it applies to functions $f : \R^n \to \R$, where $\R^n$ is thought of as having the Gaussian distribution and the $T_\rho$ operator is an appropriately generalized linear operator (specifically, the "Ornstein-Uhlenbeck operator" from mathematical physics, which we'll encounter later). Gross observed in 1975 that this Gaussian version <span style="font-style: italic;">follows</span> from our boolean version, by a straightforward argument using the Central Limit Theorem.<br /><br />An early version of the theorem in the Gaussian setting was proved by <a href="http://www.math.princeton.edu/%7Enelson/">Edward Nelson </a>of Princeton in the mid-60's [Nel:66]; I think it was the $q = 4$, $p = 2$ case, possibly with a constant bigger than $1$ on the right side. This was in an important paper on quantum field theory. Several works in subsequent years (e.g., Glimm '68, Federbush '69) improved on the result, and the culmination was the complete proof of the theorem in the Gaussian setting, due to Nelson again [Nel:73]. Nelson's two papers won him the <a href="http://www.ams.org/notices/199511/steele.pdf">Steele Prize</a>. He is an interesting character, having gone on to work on foundations of mathematics, bounded arithmetic, and automatic proof verification; he is now well-known for having invented <a href="http://en.wikipedia.org/wiki/Internal_set_theory">Internal Set Theory</a>, a partial axiomatization of Nonstandard Analysis.<br /><br />In 1973, <a href="http://www.math.cornell.edu/People/Faculty/grossl.html">Leonard Gross</a> proved a limiting version of the theorem called a Logarithmic Sobolev Inequality, and deduced Nelson's Hypercontractive theorem from it [Gro:75]. His proof was in the boolean setting, getting the Gaussian setting via the Central Limit Theorem. However, it turned out that the proof of the Hypercontractivity Theorem (in the boolean setting) was not new; it was first proved by <a href="http://www.univ-orleans.fr/mapmo//membres/bonami/welcome.html">Aline Bonami</a> [Bon:70] in 1979. In fact, Bonami had proved the $q = 4$, $p = 2$ case in the boolean setting even earlier [Bon:68].<br /><br />The history of the theorem can be traced even further back; Bonami's work was informed by that of <a href="http://en.wikipedia.org/wiki/Walter_Rudin">Walter Rudin</a>, who proved [Rud:60] similar inequalities in the setting of $\Z_n$ rather than $\{-1,1\}^n$ (``one of my favorite papers'' -- Rudin). Further, a version of the log-sobolev inequality in the Euclidean (rather than Gaussian) setting was proved by <a href="http://genealogy.math.ndsu.nodak.edu/html/id.phtml?id=25515">A. J. Stam</a> [Sta:59] in 1959, in work on Fisher/Shannon information theory -- much closer to the world of computer science!<br /><br />Finally, the Hypercontractivity Theorem was introduced to the world of theoretical computer science in the groundbreaking work of Kahn, Kalai, and Linial [KKL:88]. Unfortunately, they attributed the theorem to <a href="http://www.ma.utexas.edu/users/beckner/">William Beckner</a> [Bec:75], which I'm not sure is exactly an accurate accreditation. Beckner's work was in fact important followup work on the work of Nelson and Gross, making extensions to Euclidean and complex settings.<br /><br />In the computer science theory literature, the Hypercontractivity Theorem is often called "Beckner's Theorem". Lately there has been a move towards "Bonami-Beckner Theorem", although "Bonami Theorem" would respect the original discoverer and "Bonami-Gross Theorem" might more properly respect the independent discovery. To sidestep the issue, we will simply call it the Hypercontractivity Theorem.Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-36763731.post-91077542353986526962007-03-13T18:07:00.001-04:002007-03-13T18:07:52.202-04:00Solutions for Homework 3Are posted.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-40841192020287561192007-03-08T16:11:00.000-05:002007-03-08T16:17:47.595-05:00Shouts-outOn the topic of hypercontractivity, I just wanted to give a couple of shouts-out. Back when I learned this stuff, I didn't have any intuition at all about the hypercontractivity theorem, and would ask all and sundry, "Do you have any idea what's really going on here?" "Do you have any idea what's really going on here?"<br /><br /><a href="http://www.google.com/search?q=%22krzysztof+oleszkiewicz%22">Krzysztof Oleszkiewicz</a> showed me the slick, simple induction proof of $(2, 4, 1/\sqrt{3}^d)$-hypercontractivity for boolean functions with degree at most $d$. I asked him, "Who is this proof due to? You?" He said, "No no no, not me. Folklore, I guess."<br /><br />Top Analysis-of-Boolean-Functions expert <a href="http://dimacs.rutgers.edu/%7Egkindler/">Guy Kindler</a> showed me the nice level-sets, geometrical-picture proof of the Two-Point Inequality.<br /><br />So -- thanks to both of them!Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-59527057716721947082007-03-07T10:18:00.000-05:002007-03-07T10:25:09.674-05:00Thursday's class: a warningHi all. I know we're all in the School of Computer Science, but Thursday's class is going to be all math, all the time. We've already proved most (all?) of the hypercontractivity stuff we'll need. But you can't have a class on Fourier Analysis of Boolean Functions without actually doing the full-blown Bonami-Beckner theorem. I mean, just for culture's sake.<br /><br />So I'm just warning you, it's going to be all hot <a href="http://en.wikipedia.org/wiki/Minkowski%27s_inequality">Minkowski Inequality</a> and <a href="http://en.wikipedia.org/wiki/Binomial_theorem">Generalized Binomial Theorem</a> action on Thursday. I think I'll scribe myself.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-58431300202926268972007-03-06T08:58:00.000-05:002007-03-06T09:05:32.519-05:00Homework 3 CorrectionsHi -- two more corrections for Homework 3. I guess this is what you get when running a course for the first time...<br /><br />Question 1: Eliminate the part "and show that this is tight". In fact, I strongly believe it's not tight. For extra credit, improve on the constant factor 2.<br /><br />Question 3a: This one is too hard. I thought I knew an easy way to do it, but I was mistaken. For the story on this problem, see this paper of <a href="http://www.maruoka.ecei.tohoku.ac.jp/%7Eama/paper/ama_tcs_final.ps">Amano and Maruoka</a> (but note that the theorem in their paper improving on KKL was already known).<br /><br />A corrected version of the homework is on the course web page.<br /><br />--------<br /><br />Two small additional clarifications: In 4a, you may assume the algorithm knows $\|\hat{f}\|_1$. In 5a, as Aaron pointed out, it would probably be more sensible if I asked for you to show the estimate is within $\sqrt{\epsilon/n^d}$ rather than $\epsilon$; however, it doesn't really matter since $\epsilon$ is a parameter.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-36763731.post-20707156463086226882007-03-05T10:27:00.000-05:002007-03-05T10:44:08.690-05:00Homework correction + the Ajtai-Linial FunctionFirst, Aaron correctly pointed out a mistake on the problem set. In 4a, you should also assume the algorithm has <span style="font-weight: bold;">query</span> access to $f$. The corrected version of the homework is on the course web page.<br /><br />Now for the real topic of the post. Recall that KKL proved that if $f : \{-1,1\}^n \to \{-1,1\}$ is any roughly balanced function, there is coalition $J \subseteq [n]$ of size $O(n / \log n)$ with influence .999 on $f$.<br /><br />A construction of Ajtai and Linial (<span style="font-style: italic;">The influence of large coalitions</span><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b>, </b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b>Combinatorica 13, 1993; annoyingly, not available online) shows this is close to optimal.<br /><br />The [AL] construction is randomized. Recall the Tribes function, which breaks up the input into disjoint blocks ("tribes") of size about $\log n - C \log \log n$ and takes the OR across blocks of the AND of the blocks. Here $C$ is a carefully chosen constant (around $1$) picked so that the overall function is roughly balanced.<br /><br />Let's say a <span style="font-style: italic;">random nonmonotone Tribes function of width </span>$w$ is given by randomly partitioning $[n]$ into blocks of size $w$, randomly choosing whether to negate each variable (50-50), and then again ANDing within blocks and ORing across blocks. <br /><br />Choose $w$ so that these functions will be True with probability around $1 - (\ln 2)/n$. (This involves taking $C \sim 2$.) <br /><br />Finally, the [AL] construction is this: Choose $n$ random nonmonotone Tribes functions with this width, and AND them together.<br /><br />--------<br /><br />Here is a sketch of the properties of this (randomly chosen) function:<br /><br />1. The probability it is True is very close to $1/2$, with very high probability.<br /><br />2. It's hard for small coalitions to force the function towards True. Because to do this, they have to force many of the random Tribes functions towards True, and for each such function ($\sim n$ of them), they have to force one of its Tribes (about $\log n$ variables) towards True. By the random construction, the different Tribes don't overlap much, so the coalition has to be nearly linear in size.<br /><br />3. It's hard for small coalitions to force the function towards False. To do this, they have to force at least one of the Tribes functions towards False, and this requires fixing at least one variable in each tribe within the function ($\sim n/\log n$ variables).<br /><br />--------<br /><br />The actual theorem is that with high probability over the choice of $f$, $f$ is near-balanced and every coalition $J$ of size $o(n/\log^2 n)$ has influence $o(1)$ on $f$.<br /><br />Oddly enough, the hardest part of the proof is showing that the functions are near-balanced with very high probability.<br /><br />--------<br /><br />Some open problems:<br /><br />1. Close the gap between $n/\log^2 n$ and $n/\log n$. [AL] speculated that the latter is probably the truth, but I'm not so sure...<br /><br />2. Give an <span style="font-style: italic;">explicit</span> function where all $o(n/\log^2 n)$-size coalitions have $o(1)$ influence. In fact, I <span style="font-style: italic;">believe</span> it's open even to do this for $c \sqrt{n}$-size coalitions for large enough $c$.<br /><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b><b> </b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b></b>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-36763731.post-73158791266362384542007-02-28T13:07:00.000-05:002007-02-28T13:20:27.260-05:00Homework 3Hi all. Any comments/questions about Homework #3 can go in the comments here.Unknownnoreply@blogger.com6