At the end of last class I stated a proposition we'll need, which says that a quasirandom function passes the Håst-Odd test with probability not much more than $1/2$.
I meant to prove it at the beginning of tomorrow's class but, surprise surprise, it looks like we'll be pressed for time even without it. So it goes, as Vonnegut would say.
Here, then, is the analysis. Really, it's pretty easy; it basically proves itself.
Tuesday, January 30, 2007
Boosting the correctness probability in testing
In defining what it means to be a tester for property $\mathcal{P}$, I wrote: if $f$ satisfies $\mathcal{P}$ then $\Pr[\text{tester accepts}] \geq 2/3$, and if $f$ is $\epsilon$-far from satisfying $\mathcal{P}$ then $\Pr[\text{tester accepts}] \leq 1/3$. I then added that the $2/3$ and $1/3$ here are "arbitrary" and could be "boosted".
This phrasing is pretty imprecise, and an anonymous commenter asked what exactly I meant. What's meant is this: suppose you have a tester $T$ making $q$ queries that satisfies the given definition. Then for any constant $\delta > 0$, there is another tester $T'$ making $O(q \log(1/\delta))$ queries that satisfies the definition with $1 - \delta$ and $\delta$ in place of $2/3$ and $1/3$. So if you don't care about constant factors in the number of queries, then the choice of $2/3$ and $1/3$ is arbitrary. In "testing" we don't usually care about constant factors on the number of queries, although in "local testing" we often do.
The way to get $T'$ from $T$ is to just run $T$ independently $O(\log(1/\delta))$ times and then accept iff at least half of the runs accepted. Then the Chernoff bound implies the claim.
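To make this concrete, here is a minimal sketch of the boosting wrapper in Python. The names `boost` and `toy_tester` are hypothetical illustrations, not anything from the course:

```python
import random

def boost(tester, reps):
    # Run a (2/3, 1/3)-tester independently `reps` times and accept
    # iff at least half of the runs accepted.
    accepts = sum(tester() for _ in range(reps))
    return 2 * accepts >= reps

# Toy stand-in for a tester in the "f satisfies P" case: it accepts
# with probability exactly 2/3.  After boosting with 25 repetitions,
# the acceptance probability is already much closer to 1.
toy_tester = lambda: random.random() < 2 / 3
trials = 10_000
print(sum(boost(toy_tester, 25) for _ in range(trials)) / trials)
```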
Details from the end of Lecture 5
Many thanks to Daniel Golovin for providing the Book proof of $m(1-\delta)^{m-1} \leq 1/\delta$ for each $m \geq 1$, the claim we needed to show that any function $f$ has at most $1/(\epsilon\delta)$ variables with $\delta$-attenuated influence at least $\epsilon$. His proof:
Fix any $m$. If $k \leq m-1$ then certainly $(1-\delta)^k \geq (1-\delta)^{m-1}$. Sum this fact over $k = 0, 1, \ldots, m-1$ to conclude $\sum_{k=0}^{m-1} (1-\delta)^k \geq m(1-\delta)^{m-1}$. But $\sum_{k=0}^{m-1} (1-\delta)^k \leq \sum_{k=0}^{\infty} (1-\delta)^k = 1/\delta$.
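If you want to sanity-check the inequality numerically, here is a brute-force check (just a sketch; it is of course not part of the proof):

```python
# Brute-force check of m*(1-d)^(m-1) <= 1/d for a few attenuation
# parameters d and all m up to 10000.
for d in (0.01, 0.1, 0.5, 0.9):
    assert all(m * (1 - d) ** (m - 1) <= 1 / d for m in range(1, 10001))
print("inequality holds in all checked cases")
```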
---
Re one thing I mentioned at the end of class: The constant functions $+1$ and $-1$ pass the Håst-Odd test with probability $1/2$ (for every noise rate $\delta$). The reason is, the test picks some $x$, $y$, and $z$, but then also picks random signs $a, b, c \in \{-1,1\}$ and tests "$a f(ax) \cdot b f(by) \cdot c f(cz) = 1$", which is equivalent to "$f(ax) f(by) f(cz) = abc$". Now if $f$ is constant then so is $f(ax) f(by) f(cz)$, and then $abc$ is a random sign, which agrees with the constant with probability $1/2$.
Note also that the "acceptance constraint" in the Håst-Odd test is either of the form "$f(x)f(y)f(z) = 1$" or "$f(x)f(y)f(z) = -1$". I.e., it's a 3-variable "linear" equation where the right-hand side is either "0" or "1" (if you go back and interpret bits as $\{0,1\}$ rather than $\{-1,1\}$).
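Here is a tiny simulation of the point about constant functions, under my reading of the folded test described above (the name `haast_odd_on_constant` is a hypothetical illustration):

```python
import random

def haast_odd_on_constant(c, trials=100_000):
    # When f is the constant c, the folded check "f(ax) f(by) f(cz) = abc"
    # reads "c = (product of the three random signs)", and that product
    # is a uniformly random sign; so the pass probability is 1/2.
    passes = 0
    for _ in range(trials):
        a, b, s = (random.choice((-1, 1)) for _ in range(3))
        passes += (c == a * b * s)
    return passes / trials

print(haast_odd_on_constant(+1))  # roughly 0.5
print(haast_odd_on_constant(-1))  # roughly 0.5
```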
Sunday, January 28, 2007
Homomorphism testing
Suresh asks, "Can we make use of the same BLR test to check whether a given mapping between two groups is a homomorphism or not?"
The answer is: Yes.
The question is: Let $G$ and $H$ be groups. A homomorphism between $G$ and $H$ (this is a definition from group theory) is a function $f : G \to H$ that satisfies $f(x \circ y) = f(x) \ast f(y)$ for all $x, y \in G$, where $\circ$ denotes the group operation in $G$ and $\ast$ denotes the group operation in $H$.
A natural question is, does the BLR test "work" for functions $f : G \to H$? I.e., if $f$ is $\epsilon$-far from being a homomorphism (i.e., you need to change its values on an $\epsilon$ fraction of inputs to make it a true homomorphism), will the BLR test reject with probability $\Omega(\epsilon)$?
We showed the answer is yes, with rejection probability at least $\epsilon$, when $G = \mathbb{Z}_2^n$ and $H = \mathbb{Z}_2$.
The answer is still yes for any finite abelian groups $G$ and $H$. Possibly one could prove it with appropriately generalized Fourier analysis, but it's much easier to prove in this general case with pure combinatorics. This was originally proved by BLR for cyclic groups; later, Coppersmith gave a nice proof which also works for all abelian groups. See Section 4.1 of Rubinfeld's Testing Survey. Note that the constant factors necessarily get worse, because of certain bad examples over cyclic groups.
In fact, the BLR test still works even for nonabelian finite groups; this was claimed in BLR and then proved by Ben-Or, Coppersmith, Luby, and Rubinfeld.
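For concreteness, here is a small Monte Carlo sketch of the BLR test over a general finite group, with a toy example over $\mathbb{Z}_7$. All the names here are hypothetical illustrations, not from BLR or the survey:

```python
import random

def blr_reject_rate(f, G, g_op, h_op, trials=50_000):
    # One BLR trial: pick x, y uniformly from G and reject unless
    # f(x o y) = f(x) * f(y), where o and * are the group operations.
    rejects = 0
    for _ in range(trials):
        x, y = random.choice(G), random.choice(G)
        if f(g_op(x, y)) != h_op(f(x), f(y)):
            rejects += 1
    return rejects / trials

# Toy example: G = H = Z_7, both operations addition mod 7.
m = 7
G = list(range(m))
mod_add = lambda u, v: (u + v) % m
hom = lambda u: (3 * u) % m                     # a genuine homomorphism
corrupted = lambda u: 5 if u == 0 else hom(u)   # one value changed

print(blr_reject_rate(hom, G, mod_add, mod_add))        # 0.0
print(blr_reject_rate(corrupted, G, mod_add, mod_add))  # noticeably positive
```

Note that the test itself only needs the two group operations; the abelian/nonabelian distinction is entirely about the analysis, not about how the test is run.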
Tuesday, January 23, 2007
Also rejecting 1
At the end of class we considered running Håstad's test independently many times and then accepting iff the empirical fraction of passes was at least a suitable threshold. We saw that this gave a test for the class {dictators} $\cup$ {the constant $1$}. I asked how we could also reject functions close to the constant $1$ function, but hurried over the explanation.
Aaron's suggestion was a very good one since we can do it making zero additional queries. Look at all the $x$'s and $y$'s we queried (not the $z$'s) and reject also if the fraction of times the answer was $1$ is at least, say, $3/4$. Note that all the $x$'s and $y$'s are independent uniformly random strings.
Now certainly dictators (which are $1$ on half of all inputs and $-1$ on the other half) will pass this extra test except with probability exponentially small in the number of runs, so their acceptance probability is essentially as large as it was before (which is fine).
On the other hand, suppose $f$ is $\epsilon$-far from the set of dictators. Then it's also $\epsilon$-far from the set {dictators} $\cup$ {$1$} (and so the Håstad tests will reject it with good probability) unless it's $\epsilon$-close to the constant $1$. But in this case, the additional test will reject except with probability exponentially small in the number of runs, so we're again fine.
The alternate fix I suggested involved using local decodability of parity functions. After all the Håstad tests, if we still haven't rejected then (with high probability) $f$ is close to either the constant $1$ or a dictator function - in particular, to a linear function. Now use 2 more queries to do local decoding of $f$ at the string $(-1, -1, \ldots, -1)$, and reject if the decoded answer is $1$. Dictators will pass this decoding step with probability $1$, so overall they still pass with large enough probability. And any function $\epsilon$-close to the constant $1$ will pass it with probability at most $2\epsilon$, so overall it will pass with small probability, again fine (assuming $\epsilon$ is small, say).
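Here is a quick sketch of that decoding step as a standalone 2-query test, assuming (as above) that the point being decoded is the all-$(-1)$s string; the name `odd_decoding_test` is a hypothetical illustration:

```python
import random

def odd_decoding_test(f, n, trials=20_000):
    # Local decoding of a (near-)parity at the all-(-1)s string:
    # pick x uniformly, decode f there as f(x) * f(-x), and "pass"
    # iff the decoded value is -1 (i.e. f looks odd, like a dictator).
    passes = 0
    for _ in range(trials):
        x = [random.choice((-1, 1)) for _ in range(n)]
        neg_x = [-xi for xi in x]
        passes += (f(x) * f(neg_x) == -1)
    return passes / trials

n = 8
dictator = lambda x: x[0]
const_one = lambda x: 1
print(odd_decoding_test(dictator, n))   # 1.0: dictators always pass
print(odd_decoding_test(const_one, n))  # 0.0: the constant 1 always fails
```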
Two good exercises for you:
1. Understand why it is without loss of generality to assume .
2. In the local decoding of $f$ at $(-1, -1, \ldots, -1)$ we are really doing the following 2-query test: Pick $x \in \{-1,1\}^n$ at random, and accept iff $f(x) \neq f(-x)$. Express the probability that this test passes in terms of the Fourier coefficients of $f$.
Monday, January 22, 2007
Getting ready for Tuesday's class
Just wanted to remind you of a few things for next class:
1. It might not have been clear in the last class, but we are using the convention that the "empty product" is defined to be $1$. I.e., $\prod_{i \in \emptyset} x_i$ is taken to mean $1$. In particular, the "empty parity" $\chi_\emptyset$ is the function that is $1$ on every input string.
2. A trivial fact that one can sometimes forget: If $f$ is a parity function itself, say $f = \chi_S$, then its Fourier expansion is just $\chi_S$. In other words, $\hat{f}(T) = 1$ if $T = S$ and $\hat{f}(T) = 0$ if $T \neq S$. (A quick numerical check of items 1 and 2 appears after this list.)
3. We'll use the Chernoff bound given on the homework, as follows: If $|\Pr[\text{Test passes in case A}] - \Pr[\text{Test passes in case B}]| \geq \eta$, then you can tell with high confidence whether you're in case A or case B after $O(1/\eta^2)$ tests.
4. One of you will be scribing!
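As promised, here is a brute-force check of items 1 and 2 in Python (the names are mine, not from the homework):

```python
from itertools import product

def fourier_coefficient(f, S, n):
    # \hat{f}(S) = E[f(x) * chi_S(x)], averaged over all of {-1,1}^n.
    total = 0
    for x in product((-1, 1), repeat=n):
        chi = 1
        for i in S:
            chi *= x[i]
        total += f(x) * chi
    return total / 2 ** n

n = 4
parity_02 = lambda x: x[0] * x[2]  # f = chi_S for S = {0, 2}

print(fourier_coefficient(parity_02, (0, 2), n))  # 1.0 (T = S)
print(fourier_coefficient(parity_02, (1,), n))    # 0.0 (any T != S)
print(fourier_coefficient(lambda x: 1, (), n))    # 1.0: the constant 1 is the empty parity
```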
Sunday, January 21, 2007
Diffusion on the discrete cube
I wanted to say a few more words about the "diffusion process" on the discrete cube, mentioned in Lecture 1. Actually, on any graph you can set up an analogous process - basically, a "continuous-time" random walk on the graph.
Given a graph $G = (V, E)$, its Laplacian is usually defined to be the symmetric matrix $L$ that has a $-1$ in entries $(u, v)$ and $(v, u)$ when $\{u, v\}$ is an edge, $\deg(u)$ in the diagonal entry $(u, u)$, and zeroes elsewhere.
In our case we considered the discrete cube, with $V = \{-1,1\}^n$ and $\{x, y\} \in E$ iff $x$ and $y$ have Hamming distance $1$.
You can see now that the diffusion equation I gave in Lecture 1 says that for functions $f : \{-1,1\}^n \times \mathbb{R}_{\geq 0} \to \mathbb{R}$, the rate of change of $f$ with respect to time is defined to be $-\frac{1}{n} L f$. (The minus sign is important here, but the factor $\frac{1}{n}$ is not; it just slows down time by a factor of $n$.) The same diffusion process can be set up for any graph.
Further, the "basic solutions" to this differential equation - by which I mean those that just decay with time but don't change their "shape" - are the eigenvectors of the matrix ; i.e., the functions such that for some constant . On the discrete cube, it's easy to check that the parity functions are all eigenvectors, and since they are linearly independent in a space of dimension , linear algebra tells us this is all of them.
I'm not sure if there's a more principled way to derive the fact that the parity functions are the eigenvectors of $L$, as opposed to just observing that they are and that they are sufficient in number. Maybe someone can figure this out.
For way more on this than you might want to know, see this paper by Bobkov and Tetali. It can be hard reading, but you might want to at least check that the quantity $\mathcal{E}$ they introduce on line 4 is our "energy" of $f$, up to a constant. (Indeed, $\mathcal{E}$ stands for "energy" in their paper.)
Thursday, January 18, 2007
Voting with real numbers
In a comment, Yi suggested that voters use real numbers, say, in $[-1,1]$. This raises a number of interesting questions. Are our voting schemes now functions $f : [-1,1]^n \to \{-1,1\}$? Presumably here a vote of $-1$ means you really really like candidate $-1$, a vote of $.9$ means you really like candidate $1$, a vote of $.01$ means you are basically indifferent with a very slight preference towards candidate $1$, etc. (It's a bit like rating the candidates on the Hot or Not scale.)
In some cases (where the scheme can be expressed via $f$'s multilinear polynomial representation) this is basically the same as usual voting; one can just have the voting machine convert a vote of $p \in [-1,1]$ to a random vote from $\{-1,1\}$ chosen to have expectation $p$. But in other cases it's not the same.
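A two-line sketch of that conversion (the name `randomized_round` is a hypothetical illustration):

```python
import random

def randomized_round(p):
    # Turn a real vote p in [-1, 1] into a random vote in {-1, 1}
    # with expectation exactly p: output +1 with probability (1+p)/2.
    return 1 if random.random() < (1 + p) / 2 else -1

p = 0.3
votes = [randomized_round(p) for _ in range(100_000)]
print(sum(votes) / len(votes))  # close to 0.3
```

When $f$ is multilinear, independence of the rounded votes gives $\mathbf{E}[f(\text{rounded votes})] = f(p_1, \ldots, p_n)$, which is exactly why the two schemes coincide in that case.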
The KKL theorem can be extended (with an appropriate new definition of "influence") to the case of functions on general product spaces such as $[0,1]^n$, but certain corollaries cease to hold. We may see this result in the class; it's by "BKKKL" -- the B is Fields Medalist Jean Bourgain and the extra K is Yitzhak Katznelson.
Wednesday, January 17, 2007
Stealing from Avrim
I was checking out the materials for Avrim and Nina's Learning Theory course today. Two things:
1. I swear I didn't steal the "spam" example for learning from him!
2. I like his idea of dividing the homework into "exercises" (relatively straightforward problems you should really do yourself) and "problems" (harder ones, where collaboration would be fine). On our homework, #2 -- #6 would be exercises, #7 a problem, and #1 halfway in between (probably closer to a problem).
In particular, if you're looking at the homework and thinking, "Man, I can't even do #1!", try moving on to #2 through #6.
Tuesday, January 16, 2007
Voting power
The slides from Lecture 1 are on the course web site now.
I mentioned John F. Banzhaf III in class; he's got quite a web site there, in which he proudly declaims about being the "Osama bin Laden of torts", among other things. I guess he's famous for suing McDonald's (over obesity) and tobacco companies. Apparently he's part of the reason for smoking bans in the UK and other countries. He also shows up in the news these days as a commenter on the Duke Lacrosse happening.
Anyway, he's frequently credited with describing the notion of influences of voters on elections we discussed in class. That came about when he sued the Nassau County Board for having an unfair voting scheme; they passed proposals according to the following boolean function:
$f(x_1, \ldots, x_6) = 1$ iff $31 x_1 + 31 x_2 + 28 x_3 + 21 x_4 + 2 x_5 + 2 x_6 \geq 58$ (here $x_i \in \{0,1\}$, the weights are the six municipalities' numbers of votes, and $58$ is a simple majority of the $115$ total).
You can check that for this $f$, $\mathrm{Inf}_4(f) = \mathrm{Inf}_5(f) = \mathrm{Inf}_6(f) = 0$.
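Here is a brute-force check, using the weights as given above in my rendering of the Nassau County example (a sketch; the point is just the zero/nonzero pattern of the influences):

```python
from itertools import product

weights = [31, 31, 28, 21, 2, 2]  # the six municipalities' votes
quota = 58                        # simple majority of 115

def f(x):  # x in {0,1}^6
    return int(sum(w * xi for w, xi in zip(weights, x)) >= quota)

# Influence of voter i: the probability over a uniformly random x
# that flipping x_i changes the outcome.
for i in range(6):
    pivots = 0
    for x in product((0, 1), repeat=6):
        y = list(x)
        y[i] = 1 - y[i]
        pivots += (f(x) != f(y))
    print(f"Inf_{i+1} = {pivots / 2 ** 6}")
# The last three voters (weights 21, 2, 2) have influence 0.0:
# every winning coalition already contains two of the big three blocks.
```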
More interesting info on measuring the power of voters is here:
Banzhaf Power Index
Voting Power