Single Number II : How to come up with Bit Manipulation Formula

Problem Description

Given an array of integers, every element appears three times except for one. Find that single one.

Note:
Your algorithm should have a linear running time complexity. Could you implement it without using extra memory?

Solutions
By following up Single Number problem we can solve this problem by using Hash Table. So we will not discuss that solution here.
At first, in order to solve the problem, we try to make it simpler. We only consider the right most bit of each number, and count the number of bits that equals to 1. Call "mars" is the number of 1 bits. We also call "elon" is the number that appear only once, and "elonBit" is the rightmost bit of "elon" (The terms are just for fun because Elon Musk is my idol :)).

Our observation is that if "mars" is a multiple of 3 then "elonBit" must be 0, otherwise it is 1. We can repeat this procedure for all other bits. This is what we have in the following code.

I. Counting the number of bit 1.

public class Solution {
    public int singleNumber(int[] nums) {
        
        int result = 0;
        for(int i = 0; i<32; i++){
            int elonBit = 1 << i;
            int mars = 0;
            for(int j=0; j<nums.length; j++){
                if((nums[j] & elonBit) != 0) // if it is bit 1
                    mars++;
            }
            if (mars % 3 == 1) result |= elonBit;
        }
        return result;
    }
}

However, we see that the actual value of "mars" is not really important. Every time "mars" equal to 3, we can reset "mars" to 0. Since we are working with bits, we can represent "mars" by 2 bits p and q.

Since with 2 bits, we can represent 4 numbers, so we have different ways to represent the states from 0 to 2 (3 states). We will pick the natural way: 00 for state 0, 01 for state 1, and 10 for 2 (NOTE: if we choose another way to represent, we may come up with different solution later, so you should try it ^_^). With this way of representing, we see that when mars % 3 = 0, pq = 00, and when mars % 3 = 1, pq = 01. Therefore, the final value of q will be exactly the value of "elonBit".

Now we call r is the corresponding bit of a number in the array. If r = 1, we have the following transitions: (00 + 1 = 01), (01 + 1 = 10), (10 + 1 = 00). (The + means the transition with condition that we met bit 1).

We have the following transition table, where (old_p, old_q) represents the state before we examine a number, and (p, q) represents the state after that.
_____________________________
|old_p |old_q | r | p | q |
--------------------------------------------
| 0 | 0 | 0 | 0 | 0 |
--------------------------------------------
| 0 | 1 | 0 | 0 | 1 |
--------------------------------------------
| 1 | 0 | 0 | 1 | 0 |
--------------------------------------------
| 0 | 0 | 1 | 0 | 1 |
--------------------------------------------
| 0 | 1 | 1 | 1 | 0 |
--------------------------------------------
| 1 | 0 | 1 | 0 | 0 |
--------------------------------------------

Based on the above transition table, we can derive the formula of p and q based on old_p, old_q, and r.
p = (old_p & ~old_q & ~r) | (~old_p & old_q & r)
q = (~old_p & old_q& ~r) | (~old_p & ~old_q & r)

Based on this, we jump directly to coding and have the following code.
II. Method 2

public class Solution {
    public int singleNumber(int[] nums) {
        
       int p = 0;
       int q = 0;
       
       int old_p, old_q, r;
       for(int i = 0; i<nums.length; i++){
           old_p = p; old_q = q;
           r = nums[i]; //this to make it consistent with our analysis.
           p = (old_p & ~old_q & ~r) | (~old_p & old_q & r);
           q = (~old_p & old_q& ~r) | (~old_p & ~old_q & r);
       }
       return q;
    }
}

Now, we see that in this assignment:
p = (old_p & ~old_q & ~r) | (~old_p & old_q & r)
we have old_q = q, and old_p is p before that assignment.
So the formula become:
p = (p & ~q& ~r) | (~p& q& r)

As a consequence, we have the code:
III. Method 3: Improvement on Method 2

public class Solution {
    public int singleNumber(int[] nums) {
       int p = 0;
       int q = 0;
       int r, old_p;
       for(int i = 0; i<nums.length; i++){
          r = nums[i]; //this to make it consistent with our analysis. 
          old_p = p;
          p = (p & ~q& ~r) | (~p& q& r);
          q = (~old_p & q& ~r) | (~old_p & ~q & r);
       }
       return q;
    }
}

In the above code, it is still ugly because we need to use old_p. Why we cannot use p to replace old_p? Let's look at the triple (old_q, r, p) in the transition table. Its values include (0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1), (0, 1, 0). The triple (0, 1, 0) is duplicated, therefore information is lost for this case.
We can try different transition tables!
|old_p |old_q | r | p | q |
--------------------------------------------
| 0 | 0 | 0 | 0 | 0 |
--------------------------------------------
| 0 | 1 | 0 | 0 | 1     |
--------------------------------------------
| 1 | 1 | 0 |    1   | 1 |
--------------------------------------------
| 0 | 0 | 1 | 0 | 1     |
--------------------------------------------
| 0 | 1 | 1 |   1     | 1 |
--------------------------------------------
| 1 | 1 | 1 | 0 | 0 |
--------------------------------------------
It is pretty beautiful that in this table the triple (old_q, r, p) is unique, so we can use p to calculate q. And we are still able to use q to represents "elonBit" because for state 1, (p,q) = (0, 1)
p = (p & q & ~ r) | (~p & q & r)
q = (q & ~r & ~p) | (q & ~r & p) | (~q & r & ~p) | (q & r & p)
After reducing the above equations, we have:
p = q & (p ^ r)
q = p | (q ^ r)
IV. Method 4

public class Solution {
    public int singleNumber(int[] nums) {
       int p = 0;
       int q = 0;
       int r;
       for(int i = 0; i<nums.length; i++){
          r = nums[i]; //this to make it consistent with our analysis. 
          p = q & (p ^ r);
          q = p | (q ^ r);
       }
       return q;
    }
}

After understanding all of this process, we can solve different problems. For example, given an array in which all elements appear 3 times but one element is missing. For example [0,0,0, 1,1, 2, 2, 2, 3, 3, 3] where one "1" is missing.
Since the state 2 is represented as (1,1), hence instead of "return q", we can simply return "p&q" in the above code.

Training dragons the hard way - Programming Every Day!

Sunday, August 30, 2015

Single Number II : How to come up with Bit Manipulation Formula

3 comments: