Bitmask and Bitwise Operations in Ruby

Here's a solution to a problem I was trying to solve in Ruby. I wanted to create a 23 bit data structure, that would hold three values. A 14 bit year, a 4 bit month and a 5 bit day. Each of these bit sizes are the minimum number of bits required to support a maximum value of 9999 for years, 12 for months and 31 for days (with months and days optional - hence the custom data structure). Using the bitwise & operator, you can either mask (protect) or zero out bit values. And then using the bitwise | operator - you can turn bits back on. In Ruby you can create and manipulate binary literals directly using the 0b prefix.

Below is the binary representation of the complete data structure that has been assigned to a bits variable with no values set, along with the required masks:

bits = 0b00000000000000000000000

YEAR_MASK  = 0b11111111111111000000000
MONTH_MASK = 0b00000000000000111100000
DAY_MASK   = 0b00000000000000000011111

ZERO_YEAR_MASK  = 0b00000000000000111111111
ZERO_MONTH_MASK = 0b11111111111111000011111
ZERO_DAY_MASK   = 0b11111111111111111000000

Let's set a year in our 23 bit data structure. We want to place the year value in the leftmost 14 bits (right aligned). Here's how we do it:

bits = (bits & ZERO_YEAR_MASK) | (2012 << 9)
# => 1030144
bits.to_s(2)
# => "11111011100000000000"

You can ignore the numerical result - 1030144. What we're interested in is whether 2012 has been placed in the 14 bit segment. The bits.to_s(2) command will print the number as a binary value - with leading 0s removed. If we pad this to our full 23 bits - it looks like our year has been placed in the correct segment.

00011111011100 0000 00000

To set these bits, we applied a mask that zeroed out the year bits first. We have to zero out values every time we set a new value, otherwise the bitwise | operator will create a bit value that is the combination of the old and new values. We then left shifted the numerical value 2012 by 9 bits - to move it into place. Let's try and get our year out again.

year = (bits & YEAR_MASK) >> 9
# => 2012

w00t! - we got our year back. In this case we applied the year mask  to retrieve just the year segment of 14 bits (although the mask wasn't strictly necessary in this case), and then we right shifted those bits back 9 places to get our original value. Doing the same for month and day values is exactly the same, with their respective masks and shift values:

# Month in
bits = (bits & ZERO_MONTH_MASK) | (12 << 5)
# => 1030528 # again ignore this number - it's meaningless.
bits.to_s(2)
#=> "11111011100110000000"
# which when padded is 00011111011100 1100 00000
# and sure enough 1100 in the month segment
# is the binary value for 12.

# Month out
month = (bits & MONTH_MASK) >> 5
# => 12 # w00t! - we got our month back

# Day in
bits = (bits & ZERO_DAY_MASK) | 3 # we don't need to shift
# => 1030531 # again - ignore this number

#Day out
day = (bits & DAY_MASK) # again no need to shift here
# => 3

So far so good. But what do we do if we want to store this value in a meaningful (and ideally human readable) format someplace else? If we create helper methods for each of our bit operations, and then apply a little math - we can create an integer value for our stored date that looks like this: 20121203.

date = (get_year(bits) * 10000) + (get_month(bits) * 100) + get_day(bits)
# => 20121203

And that's exactly what I did here - https://github.com/58bits/partial-date. I needed a solution for storing partial dates, where a complete date may be unknown, and I decided it was as good an opportunity as any for some bit twiddling in Ruby, allowing my date class to hold a single integer value as its backing store (and bit register), as well as only requiring a single column or attribute value in whatever persistence store is used if the complete date value needs to be saved.

Category
Tags

Comments