Saturday, 2 June 2012

WIT3: a new collection of parallel texts!

We are very glad to announce WIT3 (read "wit cube"), an acronym for Web Inventory of Transcribed and Translated Talks: a ready-to-use version of the TED talks (wit3.fbk.eu).

TED is a nonprofit organization that invites experts to give the talk of their lives. TED records all talks and posts them on its website (www.ted.com). Currently, more than 1100 talks are listed there, all subtitled in English. Translations of the transcripts are also available in many languages (up to 90).

To make this collection of parallel texts easier for the machine translation (MT) research community to use, we have developed WIT3, a website hosting this multilingual corpus of talks, aligned at the sentence level, alongside benchmarks, processing tools, and reference MT results.

We hope WIT3 will serve the research community well. For more information and to download the data, please visit wit3.fbk.eu