## The good guys and bad guys algorithm

A good introduction to the mistery of the P vs. NP problem is to examine polynomial-time algorithms which ‘almost’ solve an NP-complete problem. Usually such algorithms solve the problem for most inputs, and ‘only’ fail at some quite rare ‘evil’ special inputs. Here I present such an algorithm of mine. I don’t even know for which kind of evil inputs it fails. (It probably fails for some inputs of course, it is just harder to provide counterexamples than to come up with the algorithm itself.) The reason that I present this algorithm is that I want to demonstrate a rarely considered possibility: Maybe P equals NP, maybe we will even find the algorithm which is not that incredibly complicated, we will use the algorithm in practice, but it will be crazily hard to prove that the algorithm is correct because the algorithm is based on pseudorandom behaviour (a.k.a Monte Carlo method) and some kind of ’emergent’ behaviour instead of a simple direct, easily analyzible behaviour.

So here is an algorithm which tries to find an N/2 clique in a graph. (A clique with half of the vertices.)

Please note that it is enough to find one vertex which is surely not in the clique. Then we can recursively use the algorithm for the remaining vertices, and it is easy to see that we will still be in polynomial time.

At the beginning let’s erase those vertices which has less then N/2-1 neighbours.

Let’s call the vertices ‘guys’. The ones in the N/2 sized clique are the good guys, the other N/2 guys are the bad guys. The good guys form a clique. The bad guys decide how they connect with the good guys and with each-other. They intentionally try to ‘game’ the system, they intentionally connect in tricky ways to make our algorithm fail.

Now we randomly distribute the vertices into two N/2 sized distinct sets:

– The good set.

We repeat the following for a reasonable time: (let’s say N^5 times)

We randomly select two guys in the good set which are not connected. We replace them with two randomly selected guys from the bad set.

Please note that when we do such a replacement, then we do one of the following 2 things (without knowing which case we are dealing with):

– We throw out two bad guys from the good set.

– We throw out one bad guy and one good guy from the good set.

So we never throw out two good guys!

During this  N^5 steps, we note for each vertex for how many time it stayed in the good set. Let’s call it T(n). It is easy to see from the above that the good guys together have a better summa T(n) than the bad guys.

But of course the bad guys can do evil trics. They can create some ‘bad superstars’. These are connected with almost every good guys and almost every bad guys. These can have very good T(n). This does not hurt us though, because we are only interested in the ‘worst’ guy. But they can do even more evil tricks. They can chose two good guys which they don’t connect with. And they connect with good guys and each other very much, on the borderline of creating another N/2 clique (but not creating, because they are not allowed to.) These bad guys, when get into the good set, they throw out these two good guys (while throwing themselves out, but they are in superior numbers.) The two poor good guys can end up having the worst T(n)-s.

But we don’t give up. The T(n)-s represent something about ‘goodness’. We just have to extract some minimal tiny bit of information from this.

We order all the guys according to their T(n)-s.

We mark the best N/2 the ‘goods’, the others the ‘bads’.

Now we give ‘points’ to vertices according to how many good neighbours they have. (Maybe we should somehow penalize bad neighbours, but I am not sure, because this way bad guys could just make harm to good guys by connecting to them.)

Now we order the guys according to points, we call the best N/2 the goods, the other N/2 the bads…

We repeat this let’s say N^5 times. The idea is that good guys are that guys who know very much good guys. Those two poor good guys in the previous example are coming up…

We assume, that the guy with the least points in the end is a bad guy.

This algorithm probably fails for some inputs. (I don’t know for what kind of inputs yet.) The real question is that: can it be fixed? Is this really fundamentally impossible, or just a bit more tricky than we first thought?

I am a Hungrarian programmer interested in programming, math and entrepreneurship.
This entry was posted in Uncategorized. Bookmark the permalink.

### 8 Responses to The good guys and bad guys algorithm

1. Istvan says:

If the average degree is around N/2, and you start with a random set, the number of ‘bad’ guys in the ‘good’ set will be approximately the number of ‘good’ guys in the same set. Furthermore, in most of the time you will replace one good guy and one bad guy with one good guy and one bad guy. What is worse, while you indeed cannot remove 2 good guys from the ‘good’ set, you might remove one good and one bad, and replace them with 2 bad guys.

So why do you think that spending time in the ‘good’ set is a good measure for ‘goodness’, when any point spends approximately the same time in the ‘good’ set, independently from being good or bad.

The conceptual problem is that you split the set of vertices into two parts, and you start believing that one of them is the good set without any reason.

2. Istvan says:

Did you give up?

No sorry, I did not give it up, I was just not aware that comments are waiting for my approval (I was not familiar with wordpress).
So let’s assume, that you are right: the probability of a random guy in the good set being a good guy is 0.5.

Let’s mark the good guys with letter ‘g’, the bad guys with letter ‘b’. A replacement can be written the following way:

xy -> vw

means that xy is changed to vw

Here are the possible changes:

bb->bb
bb->bg
bb->gb
bb->gg

bg->bb
bg->bg
bg->gb
bg->gg

gb->bb
gb->bg
gb->gb
gb->gg

All these possibilities have the same probability: 1/12. This means that

with 5/12 probability the ratio does not change
with 2/12 probability the ratio of the goods in the good set decreases
with 5/12 probability the ratio of the goods in the good set increases

This means the sooner or later the ratio of the goods must increase in the good set, so it cannot be 0.5

So I think the T(n) represents the information which I attributed to it.

On the other hand I am afraid that the second part of the algorithm can be attacked: probably even if T(n) correlates with goodness, it is incredibly hard (or impossible) to extract the relevant information from it: which vertex is surely not good? In this sense I’ve given up: I cannot come up with an algorithm for which I can correctly prove that it can find this surely bad vertex for sure.

So my summary: This algorithm can measure a metric which correlates with goodness of vertices, but the part which determines the surely bad guy is not a proven algorithm, just a heuristic, which is intuitively seems to be good, but I cannot prove that it is good, and I am afraid it is not good.

• Istvan says:

“All these possibilities have the same probability: 1/12. ” Oh, yesss….

Especially, if the ‘bad’ set contains 60% of bad guys and 40% of good guys, and the probability that a good guy is connected with a bad one is 0.5 approximately.

This is what I am saying: you believe that your algorithm is good, and you adjust mathematics to that, disregarding whether or not it is true…

The 1/12 probability did not make sense, but my original statement is still true.

Let’s discuss it more precisely:

We haveN/2 good guys and N/2 bad guys.
We have _any_ D distribution regarding how these are connected.You can choose this D distribution! (Provided that the good guys form the one and only N/2 clique in the graph.)

I state that in this case the probability of a good guy to be in the good set is greater than 0.5.

The proof is indirect: I assume that the probability of a good guy to be in the good equals to 0.5. P=0.5. We will see that this cannot be true.

If a gb combination is chosen in the good set (we mark the probability of this P(gb)) then you can see that the expected value of the ratio is not changing. Please think about this before replying: this does not depend on the deistribution of the edges. (Remember we are in an indirect proof where we consider P=0.5 (so the vertex distribution is fixed), but we don’t assume anything about the edge distribution!)

The other case is that a bb combination is chosen in the good set. (We mark the probability of this P(bb). Note that P(gb) + P(bb) = 1) In this case the expected value of the ratio of the goods must increase.

So the only case, when P colud remain 0.5 would be that if P(bb) = 0.
This could only happen if all bad guys would be connected, which is a contradiction.

Summary:
What I state about T(n) correlating with the goodnes is true.
That’s another question that this is probably just a trivial thing, and does not lead us closer to creating a polynomial algorithm for our task.

“you believe that your algorithm is good, and you adjust mathematics to that, disregarding whether or not it is true…”
I don’t adjust mathematics to anything. If my explanation don’t make sense to you then please write the simulation. You will experience that P will be greater than 0.5 for any kind of graph you choose (if you run the simulation for a reasonable time.)

Now that I think about it, I suggest you to intuitively think about it this way: Let’s forget my algorithm for a while. Probably simply the rank of vertices is even correlated with goodness for any edge distribution. You can make some bad guys well connected, but on average they must be less connected than good guys. My statement is very similar to that, just a bit more complicated because of my algorithm.

3. Istvan says:

“If a gb combination is chosen in the good set (we mark the probability of this P(gb)) then you can see that the expected value of the ratio is not changing.” <- This is not true as you can replace a gb with a bb (two bad guys might be connected).

I constructed you a counterexample, where the bad guys got a higher score in expectation. Consider that there are 5N vertices. The 2N good guys form a 2N click. The 3N bad guys are arranged around a polygon, and everybody is connected with everybody except the neighbors around the polygon. So it almost looks like a 3N click. However, the largest click you can make is a 1.5N click: indeed, any pair of vertices that are neighbors around the polygon cannot be in a click as they are not connected.

You will remove one bad and one good guy from the good set almost with probability 1, the probability of choosing two bad guys is o(1/N). However, you will replace them with two bad guys with a probability larger than 0.5 if there are more bad guys in the bad set than good guys.

So in limit, 5N/4 good guys will be in the bad set, and only 3N/4 in the good set. Due to symmetry, each good guy will spend less than half of its time in the good set. On the other hand, 5N/4 bad guys will be in the bad set, and 7N/4 in the good set.

This example highlights where your intuition is wrong. The click property is very much a global property that cannot be caught with local properties like neighborhood. It is very easy to destroy a click, as can you see, in a way to get a significantly smaller click. Probably this highlight why the maximum click problem is so hard.

4. Istvan says:

Concerning the case where there is an N/2 click (if there are N vertices): it might be easy to find that, and that might be a consequence of the Erdős-Gallai theorem, see http://mathworld.wolfram.com/GraphicSequence.html

I am not sure, I am just guessing. But if that is the case, then you can find it with a deterministic algorithm.

What will you do if the largest click is larger than N/2?

5. CsizmásKandúr says:

What a cool theory. I always proud of you! XD XD