PrintScanned Face Documents Dataset

Authors: Anselmo Ferreira, Mauro Barni, Ehsan Nowroozi

VIPPrint: A Large Scale Dataset for Colored Printed Documents Authentication and Source Linking


The possibility of carrying out a meaningful forensic analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography pictures, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation, and even the synthetic nature of an image, because the artifacts commonly found in manipulated and synthetic images disappear once the images are printed and scanned. A problem hindering research in this area is the lack of large-scale reference datasets for algorithm development and benchmarking. Motivated by this issue, we share a new dataset composed of a large number of synthetic and natural printed face images. The dataset can be used with a range of computer vision and machine learning approaches for two tasks: identifying the source printer of a document and detecting printed deepfake pictures.

When using the dataset, don't forget to cite our paper:

