These deep learning models are trained to extract building footprints from high-resolution aerial imagery and a LiDAR-derived normalised digital surface model (nDSM) with a spatial resolution of 20 cm or finer. The models were used to extract footprints in formal residential, industrial, and informal settlement zones within the City of Cape Town. Note that the trained Mask R-CNN models can be scaled to extract building footprints across other South African metropolitan areas, because formal and informal zones co-exist in those areas and they share similar environmental settings.
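
As an illustration of the inputs described above, the following is a minimal sketch of how an nDSM could be derived and paired with imagery before inference. It uses the standard definition of an nDSM as the digital surface model (DSM) minus the digital terrain model (DTM). Everything else is an assumption made for illustration and is not taken from the description: the function names, the clipping of negative heights to zero, and the choice to stack the nDSM as a fourth channel next to RGB are hypothetical, and the models may combine the two inputs differently.

```python
import numpy as np

# 20 cm threshold taken from the description; the constant name is hypothetical.
MAX_RESOLUTION_M = 0.20


def compute_ndsm(dsm, dtm):
    """Return height above ground: DSM minus DTM.

    Clipping negative values to zero is an assumption, not a documented step.
    """
    return np.clip(dsm - dtm, 0.0, None)


def stack_inputs(rgb, ndsm):
    """Stack an (H, W, 3) RGB image with an (H, W) nDSM into (H, W, 4).

    Using the nDSM as a fourth channel is an assumed input layout.
    """
    if rgb.shape[:2] != ndsm.shape:
        raise ValueError("RGB and nDSM must share the same grid")
    return np.dstack([rgb.astype(np.float32), ndsm.astype(np.float32)])


def resolution_ok(pixel_size_m):
    """Check that the ground sampling distance is 20 cm or finer."""
    return pixel_size_m <= MAX_RESOLUTION_M
```

For example, `resolution_ok(0.08)` accepts 8 cm imagery, while `resolution_ok(0.25)` rejects 25 cm imagery, which is coarser than the resolution the models were trained on.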