Monthly Computer Vision Meetup Roundup #6
At the end of April we hosted the 6th Computer Vision Meetup in our office “Die Manege”. We had the honour of having one of the meetup’s most regular attendees, Dr. Gayane Shalunts, present her PhD thesis on the “Classification of the architectural style of a Romanesque, Gothic or Baroque building facade through a Computer Vision Tool”. So let’s have a look!
Dr. Shalunts has always been very interested in architecture and photography. No wonder that the pictures she took of historical buildings all over Europe gave Gayane the idea for her thesis topic. A quick peek at Google Image Search showed exactly the problem she had in mind – famous buildings like St. Stephen’s Basilica in Budapest were found very easily, whereas rather ordinary neo-baroque buildings, as they occur in Vienna, could not be matched. Just have a look at the search results below.
The mission was to build a software tool based on computer vision that can classify the architectural style of a building from an image of its facade alone.
It wouldn’t be that challenging if there were no problems. You may know that every style has its own lifecycle of different phases. Take Gothic: the “original” Gothic period lasted from about 1300 to 1500, and the Stephansdom is Austria’s famous example. The Votivkirche is another church in the Gothic style, but it was built in the nineteenth century. To clarify: the architectural style of the Votivkirche is neo-Gothic (historicism), and the two are really difficult to tell apart. Have a look at the images.
Of course, the style also depends on the regional signature, and facades are often a mixture of different styles. Gayane decided to use the Baroque, Romanesque and Gothic styles for her thesis, with sub-styles counted as part of their main style. And last but not least, there was no image database labeled by architectural style, and the intra-class variability is large.
Dr. Shalunts introduced her “Algorithm of the Voting of Architectural Elements”
The images used ranged from complete facades to parts of facades and from landmarks to non-landmarks. The algorithm consists of three steps. The first is to segment the images by architectural element, in this case towers, domes, windows, pediments, traceries and balustrades. Step two is to classify the elements: Gayane built visual codebooks and assigned the possible styles. And finally, step three is the voting of the architectural elements.
Step 1: The Segmentation of Architectural Elements
The Segmentation of a Dome:
The purpose of this step is to segment the elements by type only (tower or dome), not by style (Gothic, Romanesque). The main idea here is that domes are strongly symmetric about a vertical axis and have a specific elliptic shape. The symmetry axis is found by first detecting feature points and then keeping those points that appear to be reflected across a line. Once the axis is detected, the algorithm rotates the image so that the axis lies vertically. Afterwards, the image is segmented into foreground (building) and background (usually sky) and cut along the symmetry axis, so that the axis lies at one edge of the image, and the larger of the two parts is kept. Finally, the top part of the building is found by checking the horizontal edges, and the roundness of the thresholded connected component is checked. If the top part is round enough, the algorithm knows that it’s a dome and passes this information on to step two, where the architectural style will be classified.
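To make the “round enough” check a bit more tangible, here is a minimal sketch. The talk did not say which roundness measure the thesis uses, so this assumes one common definition, the isoperimetric quotient 4πA/P², computed on a polygon outline. The function names and the threshold of 0.7 are our own illustration, not taken from Gayane’s implementation.

```python
import math

def polygon_area(points):
    """Absolute area of a simple polygon (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Length of the closed outline."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def roundness(points):
    """Isoperimetric quotient 4*pi*A / P^2: 1.0 for a circle, lower otherwise."""
    return 4.0 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2

def looks_like_dome(points, threshold=0.7):
    """Hypothetical decision rule; the threshold is an assumption."""
    return roundness(points) >= threshold

# Half-disc outline as a stand-in for a dome silhouette (65 points on the arc).
dome = [(math.cos(math.pi * k / 64), math.sin(math.pi * k / 64)) for k in range(65)]
# Narrow triangle as a stand-in for a pointed spire.
spire = [(0.0, 0.0), (2.0, 0.0), (1.0, 6.0)]

print(round(roundness(dome), 3), looks_like_dome(dome))
print(round(roundness(spire), 3), looks_like_dome(spire))
```

With this measure the half-disc scores clearly higher than the spire, which is the kind of separation the roundness check relies on.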
The Segmentation of a Tower and Double Tower:
The procedure is essentially the same as finding the domes, but instead of checking the roundness of the connected component, the algorithm checks the solidity.
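Solidity is usually defined as the area of a shape divided by the area of its convex hull. Assuming that this standard definition is what the thesis uses (the talk did not spell it out), a small sketch on polygon outlines could look like this. The two example silhouettes are made up for illustration.

```python
def polygon_area(points):
    """Absolute area of a simple polygon (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def convex_hull(points):
    """Convex hull via Andrew's monotone chain, counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def solidity(points):
    """Shape area divided by convex hull area: 1.0 for convex shapes."""
    return polygon_area(points) / polygon_area(convex_hull(points))

# A single tower as a plain rectangle, and a double tower as a U-shaped outline.
single_tower = [(0, 0), (2, 0), (2, 6), (0, 6)]
double_tower = [(0, 0), (6, 0), (6, 6), (4, 6), (4, 2), (2, 2), (2, 6), (0, 6)]

print(solidity(single_tower))
print(round(solidity(double_tower), 3))
```

A convex silhouette such as the rectangle has a solidity of 1, while the gap between the two towers pushes the value of the U-shaped outline below 1.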
To give an idea of the size of the database: 550 images of domes were used, with an average detection and segmentation rate of 96%, and 544 images of towers, with an average rate of 94.77%. The results show that the segmentation is robust to strong perspective distortions, complex scenes and different weather conditions, and that it is independent of resolution and color (it even works at night).
Step 2: The Classification of Architectural Elements
Up to this point, the algorithm has collected information about the presence or absence of domes and towers. The next step is to detect the special characteristics of the architectural style. A “bag of visual words” is built by computing feature points (e.g. SIFT), cutting out the local image patches corresponding to these features, and clustering them (e.g. with k-means). For each new, unseen sample of a facade, a histogram describing the similarity to the individual words is computed, and the maximum value indicates the best class. The styles used for the classification in Gayane’s thesis are Romanesque, Gothic and Baroque. The histogram responses to different architectural styles can be seen below.
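The histogram idea can be sketched in a few lines. This is a toy version under several assumptions of ours: the descriptors are 2-D points instead of real 128-dimensional SIFT descriptors, the codebook is written by hand instead of being learned with k-means, and each visual word carries exactly one style label. None of the names or numbers come from the thesis.

```python
import math
from collections import Counter

# Hypothetical codebook: (cluster centre, style label) pairs.
CODEBOOK = [
    ((0.0, 0.0), "Romanesque"),
    ((0.0, 1.0), "Romanesque"),
    ((5.0, 5.0), "Gothic"),
    ((5.0, 6.0), "Gothic"),
    ((10.0, 0.0), "Baroque"),
]

def nearest_word(descriptor, codebook):
    """Index of the visual word closest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i][0]))

def style_histogram(descriptors, codebook):
    """Normalised histogram of styles hit by the descriptors' nearest words."""
    if not descriptors:
        raise ValueError("need at least one descriptor")
    counts = Counter(codebook[nearest_word(d, codebook)][1] for d in descriptors)
    total = sum(counts.values())
    return {style: count / total for style, count in counts.items()}

def classify(descriptors, codebook):
    """The style with the highest histogram bin wins."""
    hist = style_histogram(descriptors, codebook)
    return max(hist, key=hist.get)

# Made-up descriptors of one unseen window image.
window = [(4.8, 5.1), (5.2, 6.3), (0.1, 0.2), (4.9, 5.4)]
print(style_histogram(window, CODEBOOK))
print(classify(window, CODEBOOK))
```

Three of the four toy descriptors land on Gothic words, so the Gothic bin is the largest and the sample is labeled Gothic.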
To sum up, the classification rate of windows reached 95.16% on a test set of 310 images. The traceries, pediments and balustrades reached 96.67% on a test set of 420 images, just 1.51 percentage points more than the windows. And the domes finally got a rate of 90.24% after including additional golden color detection, which was an increase of nearly 3%.
Step 3: The Voting
The voting mechanism classifies the style of the building from the architectural elements that were found. It works with style mixtures too, because the important elements are weighted higher and influence the result more strongly. That’s it!
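A weighted vote like this could be sketched as follows. Which elements count as “important” and how much more they weigh was not part of the talk, so the weights below are purely hypothetical, as are the function and variable names.

```python
from collections import defaultdict

# Hypothetical weights: larger elements count more than single windows.
ELEMENT_WEIGHTS = {
    "tower": 3.0,
    "dome": 3.0,
    "tracery": 2.0,
    "pediment": 2.0,
    "balustrade": 2.0,
    "window": 1.0,
}

def vote(elements, weights=ELEMENT_WEIGHTS):
    """Rank styles by their share of the weighted votes.

    `elements` is a list of (element_type, style) pairs from step two.
    Returns (style, share) pairs, strongest style first.
    """
    scores = defaultdict(float)
    for element_type, style in elements:
        scores[style] += weights.get(element_type, 1.0)
    total = sum(scores.values())
    ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(style, score / total) for style, score in ranking]

# A made-up facade with mixed styles: two Baroque windows, a Gothic tower and tracery.
facade = [
    ("window", "Baroque"),
    ("window", "Baroque"),
    ("tower", "Gothic"),
    ("tracery", "Gothic"),
]
for style, share in vote(facade):
    print(style, round(share, 2))
```

Returning the full ranking instead of only the winner keeps the style mixture visible: here Gothic leads, but Baroque still shows up with its share of the votes.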
In summary, it can be stated that this is the first general algorithm for classifying the architectural style of facades. It is built on just three steps – segmentation, classification and voting – and was evaluated on the Romanesque, Gothic and Baroque styles. Possible next steps would be adding new architectural styles and elements as well as implementing the three steps as independent modules.
Give a talk yourself!
Do you have a project or topic you would like to talk about, or do you know someone who would like to share their experience and knowledge? Please contact us!
Oh – and don’t forget to join our meetup group! ;)