Facebook allows access to dataset for AI researchers
15 April 2021
New York, 4/15/2021
Facebook has released an open-source dataset to help AI researchers test how their algorithms perform across age, gender and skin color. The “Casual Conversations” dataset is intended for evaluating computer vision and audio machine learning models.
Facebook emphasized that it is releasing the dataset as part of its “ongoing commitment to improving the fairness and accountability of AI systems.” The 10 terabytes of data include videos recorded by 3,011 paid U.S. participants, who were asked to provide their own age and gender for the tags.
Each person recorded 15-minute segments in which they answered pre-selected questions, for a total of 45,186 videos. Tags for skin tone and lighting conditions were assigned by trained annotators, who used the Fitzpatrick scale classification scheme for skin color. The same actors also participated in the creation of the dataset for Facebook’s Deepfake Detection Challenge.
Facebook encourages its own teams to use the dataset internally. The company claims it is the first dataset of its kind in which participants provide their own age and gender, rather than having them estimated by third parties or models. The dataset also differs in that participants have consented to take part, unlike certain key facial recognition training datasets that use people’s images without consent.
You can access the dataset here