Semantic Segmentation

Short Description: Semantic Segmentation is a computer vision task used to categorise each pixel in an image into a class or object.
Data: Variable
Suggested tools:

Overview

Semantic Segmentation is a fundamental task in computer vision that aims to assign a class label to each pixel in an image, based on the content that pixel represents. Convolutional Neural Networks (CNNs) are commonly used for this task because of their ability to learn complex patterns from image data. These networks typically employ encoder-decoder architectures: the encoder extracts features at progressively coarser spatial resolutions, while the decoder reconstructs a full-resolution segmentation map from them.
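
To make the encoder-decoder pattern concrete, the sketch below defines a deliberately small PyTorch network. The name TinySegNet and all layer sizes are illustrative choices, not the architecture of any model mentioned here: strided convolutions downsample the image into feature maps, and transposed convolutions upsample them back to a per-pixel map of class scores.

```python
import torch
import torch.nn as nn


class TinySegNet(nn.Module):
    """Minimal encoder-decoder network for per-pixel classification."""

    def __init__(self, num_classes: int = 19):
        super().__init__()
        # Encoder: strided convolutions reduce spatial resolution while
        # extracting increasingly abstract features.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Decoder: transposed convolutions restore the input resolution and
        # predict one score per class for every pixel.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, num_classes, kernel_size=2, stride=2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


# A 512x512 RGB image yields a class-score map of the same spatial size.
logits = TinySegNet()(torch.randn(1, 3, 512, 512))
print(logits.shape)  # torch.Size([1, 19, 512, 512])
```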

By leveraging computer vision models trained on streetscape imagery, a multitude of applications become feasible in urban analytics and urban health. These models, trained on diverse datasets covering a variety of urban environments and streetscapes, can accurately delineate semantic information from street-level images. For instance, they can be used to quantify the amount of greenery along streets, estimate the sky view index to understand urban canopy cover, and assess road features to evaluate infrastructure and pedestrian accessibility.
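
As a sketch of how such indicators can be derived from a model's output, the snippet below computes the share of pixels labelled as vegetation (a simple green view index) and as sky (a simple sky view index) from a per-pixel label map. The class ids assume the Cityscapes train-id convention (vegetation = 8, sky = 10) and should be adapted to the label set of the model actually used.

```python
import numpy as np

# Assumption: Cityscapes train ids, where vegetation = 8 and sky = 10.
VEGETATION_ID = 8
SKY_ID = 10


def green_view_index(label_map: np.ndarray) -> float:
    """Share of pixels classified as vegetation in a street-level image."""
    return float((label_map == VEGETATION_ID).mean())


def sky_view_index(label_map: np.ndarray) -> float:
    """Share of pixels classified as sky, a proxy for street openness."""
    return float((label_map == SKY_ID).mean())


# Example with a random label map standing in for real model output.
labels = np.random.randint(0, 19, size=(1024, 1024))
print(green_view_index(labels), sky_view_index(labels))
```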

By integrating computer vision-based semantic segmentation into urban analytics workflows, stakeholders gain a comprehensive understanding of the urban built environment at a granular street level. This enables informed decision-making regarding factors that can contribute positively to urban health, including urban design and active transportation planning, thereby leading to the development of healthier cities.

Example of a Computer Vision Model:

SegFormer-B5, pretrained on the ImageNet-1K dataset and fine-tuned on the Cityscapes dataset: https://huggingface.co/nvidia/segformer-b5-finetuned-cityscapes-1024-1024
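
A minimal inference sketch with this checkpoint, using the Hugging Face transformers library, might look as follows; the file street_view.jpg is a placeholder for your own street-level photograph.

```python
import torch
from PIL import Image
from transformers import SegformerForSemanticSegmentation, SegformerImageProcessor

checkpoint = "nvidia/segformer-b5-finetuned-cityscapes-1024-1024"
processor = SegformerImageProcessor.from_pretrained(checkpoint)
model = SegformerForSemanticSegmentation.from_pretrained(checkpoint)

# "street_view.jpg" is a placeholder path for a street-level image.
image = Image.open("street_view.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Logits come out at a reduced resolution; upsample to the original image size
# before taking the per-pixel argmax to obtain the class label map.
logits = torch.nn.functional.interpolate(
    outputs.logits, size=image.size[::-1], mode="bilinear", align_corners=False
)
label_map = logits.argmax(dim=1)[0].numpy()

# Map the class ids present in the prediction to their Cityscapes names.
print({i: model.config.id2label[i] for i in set(label_map.flatten().tolist())})
```

From a label map like this, per-image indicators such as the green view index sketched earlier can be computed directly.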