TNTC Project
Translation Metalanguages (TML)
We release the following sets of metalanguages.
The development and use of the metalanguages are described in Miyata et al. (2022) and Yamada et al. (2020).
Translation-related Datasets
MultiEnJa
A set of 46 examples of English source documents with several types of translation-related derivatives, including professional translation and post-edited machine translation outputs.
ParaNatCom
A parallel English-Japanese corpus of abstracts from Nature Communications articles.
Staged PE Dataset
Examples of translation issues and their revisions collected through 2-stage post-editing (PE) of machine translation (MT) outputs.
Software
To be released.
References