We are proud to offer the Sama-Coco dataset, a relabelling of the Coco-2017 dataset by our own in-house Sama associates (here’s more information about our people!). We invite the Machine Learning (ML) community to use it for anything you would like to do – all free of charge and ungated.
This is part of our ongoing effort to redefine data quality for the modern age, and to contribute to the wider research and development efforts of the ML community. Here are the ungated links to the two datasets (both covered by the Creative Commons license) so that you can get started right away.


In open-world action-adventure games, voice acting bridges the gap between gameplay and narrative. Ghost of Tsushima , set during the first Mongol invasion of Japan (1274), faced a unique challenge: should its characters speak modern English for mass market appeal, or period-authentic Japanese? The solution was a dual-language pack, released at launch, offering both English and Japanese voiceovers. This paper explores how that choice impacts player experience.
The language pack creates a central tension: ghost of tsushima language pack
The language pack offers player agency: purists can choose Japanese for immersion, while casual players can opt for English. This paper explores how that choice impacts player
Here are some key features of the Ghost of Tsushima language pack: Key features include:
The Japanese language pack in Ghost of Tsushima is not a simple subtitle overlay. Key features include: