Hieroglyphs image dataset along with Language Model!

This dataset is built from the hieroglyphs found in 10 different pictures from the book "The Pyramid of Unas" (Alexandre Piankoff, 1955). We therefore urge you to have access to this book before using the dataset.

The ten pictures used throughout this dataset are: 3, 5, 7, 9, 20, 21, 22, 23, 39, 41 (these are the picture numbers used in the book "The Pyramid of Unas").

Each hieroglyph is manually annotated and labelled according to the Gardiner Sign List. The images are stored with their label and number in their name.

TotalImages = 4210 (of which 179 are labelled as UNKNOWN)
TotalClasses = 171 (excluding the UNKNOWN class)

NOTE: The labelling may not be 100% correct. This is out of my knowledge as an Egyptian. The hieroglyphs that I was unable to identify are labelled as "UNKNOWN".

Aside from the manual annotation, we used a text-detection method to extract the hieroglyphs automatically. The results are shown in Dataset/Automated/. The labels on automatically detected images are based on a comparison with the manual detection, and are assigned according to the Pascal VOC overlap criterion (50% overlap).
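The Pascal VOC overlap criterion mentioned for the automatically detected images can be sketched as follows. This is an illustration, not the code used to build the dataset: the box format (x_min, y_min, x_max, y_max), the function names, and the choice to return None when no manual annotation overlaps enough are all assumptions. The overlap is the area of intersection divided by the area of union, and a detection takes the label of a manual annotation only if that ratio exceeds 50%.

```python
from typing import List, Optional, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_detection(
    detected: Box,
    manual: List[Tuple[Box, str]],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the label of the best-overlapping manual box, or None.

    A label is assigned only if the best overlap exceeds the threshold
    (50% by default). Returning None otherwise is an assumption.
    """
    best_label: Optional[str] = None
    best_iou = 0.0
    for box, label in manual:
        score = iou(detected, box)
        if score > best_iou:
            best_label, best_iou = label, score
    return best_label if best_iou > threshold else None
```

For example, a detection at (0, 0, 10, 10) compared with a manual box at (1, 0, 11, 10) has an intersection of 90 and a union of 110, so the overlap is about 0.82 and the manual label is taken over. Shifting the manual box to (5, 0, 15, 10) drops the overlap to 1/3, and no label is assigned.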
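Because each image carries its label and number in its file name, class counts can be recovered from the names alone. The sketch below assumes a hypothetical naming pattern of the form <number>_<label>.png, which is not specified in this README; adjust the regular expression to match the actual files. The function names are likewise only illustrative.

```python
import re
from collections import Counter
from pathlib import Path
from typing import Iterable, Optional, Tuple

# Hypothetical pattern: "<number>_<label>", e.g. "12_G43" (assumption,
# not confirmed by the README).
NAME_PATTERN = re.compile(r"^(?P<number>\d+)_(?P<label>[A-Za-z0-9]+)$")


def parse_name(filename: str) -> Optional[Tuple[str, str]]:
    """Split a file name into (number, label); None if it does not match."""
    match = NAME_PATTERN.match(Path(filename).stem)
    if match is None:
        return None
    return match.group("number"), match.group("label")


def count_labels(filenames: Iterable[str]) -> Counter:
    """Count images per label, skipping names that do not match."""
    counts: Counter = Counter()
    for name in filenames:
        parsed = parse_name(name)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts
```

Applied to the file names of the manually annotated images, the number of keys other than "UNKNOWN" should correspond to TotalClasses, and the sum of all counts to TotalImages, provided the assumed pattern matches the real names.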