Sunday, February 8, 2015

HOWTO, Stamp Collecting Software

I just bought a new scanner and started to digitize my stamp collection. My goal is to create a database in which each stamp is stored individually. After one day of work, I can say that this will be a very time-consuming and tedious task. Unless I find a way to get myself 48 hours in a day, I need to do something to speed up the process.

A long time ago I worked on the development of image processing software, so I thought this might be a great opportunity to refresh my knowledge and have some fun too.

Let’s define our input as an image with more or less randomly positioned stamps.

Input image.

Step 1: Edge detection 

There are a number of edge detection filters to choose from, such as the homogeneity filter, the difference filter, and the Sobel and Canny edge detectors. I decided to begin with the simplest one: the difference filter, which finds edges by calculating the maximum difference between pixels in four directions around the pixel being processed.

Edge detection - red line. 
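Here is a minimal sketch of the difference filter in plain Python. It assumes the scan has already been converted to a grayscale grid (a list of rows of 0-255 integers). The function name `difference_edges` and the choice of comparing opposite neighbors in each direction are illustrative assumptions, not the exact code behind the image above.

```python
def difference_edges(gray):
    """Return an edge map for a grayscale image given as a list of rows.

    Assumption: for every interior pixel, the edge strength is the largest
    absolute difference between opposite neighbors in four directions.
    Border pixels are left at 0.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            edges[y][x] = max(
                abs(gray[y][x - 1] - gray[y][x + 1]),          # horizontal
                abs(gray[y - 1][x] - gray[y + 1][x]),          # vertical
                abs(gray[y - 1][x - 1] - gray[y + 1][x + 1]),  # diagonal
                abs(gray[y - 1][x + 1] - gray[y + 1][x - 1]),  # anti-diagonal
            )
    return edges
```

Thresholding the result (for example, keeping values above some cut-off) gives a binary edge image. The right cut-off depends on the scan and has to be tuned by hand.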

Step 2: Quadrilateral detection

Quadrilaterals (rectangles in our case) can be detected by using the Hough line transform to find lines and then looking for pairs of those lines that intersect at an angle of approximately 90 degrees.

Quadrilateral detection - yellow line.
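The pairing part of this step can be sketched as follows. The sketch assumes the Hough transform has already produced lines in the usual (rho, theta) form, where x*cos(theta) + y*sin(theta) = rho. The function names and the 5 degree tolerance are hypothetical choices for illustration. Grouping the resulting corners into complete rectangles is left out.

```python
import math


def intersection(line_a, line_b):
    """Intersect two (rho, theta) lines; return None if they are parallel."""
    (rho_a, theta_a), (rho_b, theta_b) = line_a, line_b
    det = (math.cos(theta_a) * math.sin(theta_b)
           - math.sin(theta_a) * math.cos(theta_b))
    if abs(det) < 1e-9:
        return None
    x = (rho_a * math.sin(theta_b) - rho_b * math.sin(theta_a)) / det
    y = (rho_b * math.cos(theta_a) - rho_a * math.cos(theta_b)) / det
    return x, y


def right_angle_corners(lines, tolerance_deg=5.0):
    """Return (i, j, point) for every pair of lines meeting at about 90 degrees."""
    tolerance = math.radians(tolerance_deg)
    corners = []
    for i in range(len(lines)):
        for j in range(i + 1, len(lines)):
            # Angle between the two line normals, folded into [0, pi).
            between = abs(lines[i][1] - lines[j][1]) % math.pi
            if abs(between - math.pi / 2) <= tolerance:
                corners.append((i, j, intersection(lines[i], lines[j])))
    return corners
```

Each returned point is a candidate stamp corner. Four corners that share lines pairwise form one quadrilateral.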

Step 3: Extract individual item

From the quadrilaterals we can calculate bounding boxes and simply cut the stamps out. All of the following steps are applied to each cut-out.

Stamp extracted.
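A small sketch of the cut-out, assuming a quadrilateral is a list of four (x, y) corner points and the image is a list of pixel rows. The names `bounding_box` and `crop` are made up for this example.

```python
import math


def bounding_box(corners):
    """Smallest axis-aligned box (left, top, right, bottom) around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))


def crop(image, box):
    """Cut the box out of the image, clamping it to the image borders."""
    left, top, right, bottom = box
    height, width = len(image), len(image[0])
    left, top = max(left, 0), max(top, 0)
    right, bottom = min(right, width - 1), min(bottom, height - 1)
    # Both box edges are inclusive, hence the +1 in the slices.
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

Because the stamp is usually tilted, the bounding box is a little larger than the stamp itself and still contains some background in the corners. The next two steps deal with that.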

Step 4: Deskew (rotate) item

The rotation angle can be calculated from the sides of the quadrilateral (as the mean of all their angles). The Hough line transform can also be used to supply a correction factor. In practice, getting the angle is the most complicated operation in the whole processing pipeline. But once we have the angle, the rotation itself is trivial.

Stamp rotated.
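One possible way to get the mean angle is sketched below. It assumes the four corners are given in order around the stamp, and it folds every side's angle into the range from -45 to 45 degrees before averaging, so that horizontal and vertical sides agree. That folding is my own simplification, `skew_angle` is a hypothetical name, and the Hough-based correction is not included.

```python
import math


def skew_angle(corners):
    """Estimate the tilt of a roughly rectangular quadrilateral, in degrees.

    Corners must be ordered around the shape. In image coordinates the
    y axis points down, so a positive result is a clockwise tilt on screen.
    """
    folded = []
    for k in range(len(corners)):
        x0, y0 = corners[k]
        x1, y1 = corners[(k + 1) % len(corners)]
        angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
        # Fold into [-45, 45) so all four sides vote for the same tilt.
        folded.append((angle + 45.0) % 90.0 - 45.0)
    return sum(folded) / len(folded)
```

Rotating the cut-out by the negative of this angle straightens the stamp. Note that a plain average becomes unreliable for tilts close to 45 degrees, where the folded values jump between the two ends of the range.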

Step 5: Fill background with transparent color

In the last step we simply fill the background with a transparent color. A flood fill algorithm is the way to go. Note that the stamp image should be saved in a format that supports transparency. My preferred format is PNG.

Stamp with transparent background.
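A minimal flood fill sketch, assuming the cut-out is a grid of (r, g, b, a) tuples and that the seed point (for example a corner of the image) lies on the background. The function name and the color tolerance of 30 are illustrative guesses and would need tuning for a real scan.

```python
from collections import deque


def flood_fill_transparent(pixels, seed, tolerance=30):
    """Make the region connected to seed transparent, in place.

    A pixel belongs to the region when each of its RGB channels is within
    `tolerance` of the seed pixel's color.
    """
    height, width = len(pixels), len(pixels[0])
    seed_x, seed_y = seed
    target = pixels[seed_y][seed_x][:3]

    def similar(pixel):
        return all(abs(a - b) <= tolerance for a, b in zip(pixel[:3], target))

    queue = deque([seed])
    seen = {seed}
    while queue:
        x, y = queue.popleft()
        r, g, b, _ = pixels[y][x]
        pixels[y][x] = (r, g, b, 0)  # keep the color, drop the opacity
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and (nx, ny) not in seen and similar(pixels[ny][nx])):
                seen.add((nx, ny))
                queue.append((nx, ny))
    return pixels
```

After rotation the background may be split into separate corner regions, so it is safest to run the fill once from each of the four corners.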

I am now able to digitize my stamps about fifty times faster! And having the images in this format opens up a great many possibilities. Custom layouts are just one of them.

Row alignment layout.

Compact layout.