Computing the binary expression for isDigit
In the previous article we built a program that would use genetic algorithms to determine the best expression to tell if a character is a digit using only bitwise operations. We got so close, only 2 inputs were providing invalid results. However, this is not something that can be deployed, so today we will look at how to do this exactly using bits of logic.
Matching a single binary pattern 🔗
To begin with, let’s first write a simple function that would return true if the input is equal to a specific value, and false otherwise:
bool Matches(char target, char value) {
return target == value;
}
Well, this is trivial, but we also cheated. Let’s replace equality with bitwise operations. We need some operation that given two bits \(b_1\) and \(b_2\), returns \(1\) if the bits are equal and \(0\) otherwise. This is XNOR, the negation of the exclusive or.
\(b_1\) | \(b_2\) | \(x = XNOR(b_1, b_2)\) |
---|---|---|
0 | 0 | 1 |
0 | 1 | 0 |
1 | 0 | 0 |
1 | 1 | 1 |
For our function, we can combine the xor operator ^
with the negation
operator !
to obtain:
bool MatchesBits(char target, char value) {
return !(target ^ value);
}
Note that we have to use !
instead of ~
because we want to
return false
as soon as one bit does not match, but using ~
will
still return non-zero (that is, true
-ish) values.
Minimizing and matching all 10 digits 🔗
The naive way to match all 10 digits is to call MatchesBits
for all 10
values, returning true
if at least one of them returns true
(that is, using the ||
operator, to also benefit from short-circuiting
properties).
But this is going to be a very large expression, so let’s try to first minimize it. For problems with small number of values involved, we could use the Karnaugh map to identify adjacent blocks with the same value. However, here we have 8 bits in the input, so this method is not too practical.
Instead, we can use the Quine-McCluskey algorithm: we list all possible values together with the desired output and collapse states that differ in only a single input bit but have the same output. The differing bit will has a special “don’t care” value. In our case, we don’t even need to go over all 256 inputs, since the digits are clustered together in the ASCII table. So, we can start with a few collapsed states:
x |
isDigit(x) |
---|---|
0000---- |
0 |
0001---- |
0 |
0010---- |
0 |
00110000 |
1 |
00110001 |
1 |
00110010 |
1 |
00110011 |
1 |
00110100 |
1 |
00110101 |
1 |
00110110 |
1 |
00110111 |
1 |
00111000 |
1 |
00111001 |
1 |
00111010 |
0 |
00111011 |
0 |
00111100 |
0 |
00111101 |
0 |
00111110 |
0 |
00111111 |
0 |
01------ |
0 |
1------- |
0 |
If we group the patterns together, according to the algorithm, we reach the following result:
x |
isDigit(x) |
---|---|
0000---- |
0 |
0001---- |
0 |
0010---- |
0 |
00110--- |
1 |
0011100- |
1 |
0011101- |
0 |
001111-- |
0 |
01------ |
0 |
1------- |
0 |
We have two rows that produce the true
output, we just need to combine
them using ||
, just like we were discussing about combining
MatchesBits
. But first, we still have one question to solve: how do we
handle the “don’t care” bits?
We have (at least) two options, with equal results in terms of quality but
different resulting expressions: we can set them all to 0 using the bitwise
&
operator or we can set them all to 1 using the |
operator.
Then, we just require this result to match a specific mask.
Let’s do the case where we use |
, the other one is left as an exercise
to the reader. For the first pattern, 00110---
, we have to set the last 3
bits to 1, so we can do x | 0x07
. The corresponding mask is thus
00110111
, or 0x37
. Similarly, the other branch means doing a bitwise
or with 0x01
and using 0x39
as the mask. Hence:
bool isDigit(char x) {
return !((x | 0x07) ^ 0x37) || !((x | 0x01) ^ 0x39);
}
This is the expression we represented in the first tree in the last article, the expression we wanted the genetic algorithm to discover (or an equivalent form of it). In a future article, we will look into why this was not the case in our previous exploration.
Comments:
There are 0 comments (add more):