Bilingual term pairs extracted from comparable Web resources using the TaaS Bilingual Term Extraction System


1 Last update: 2015-10-30

ID:

TAAS-FMC-1

The resource contains bilingual term pairs automatically extracted from comparable resources found on the Web using the TaaS Bilingual Term Extraction System. The bilingual term extraction workflow consisted of:
1) Focussed Monolingual Crawler for collecting comparable corpora from the Web and extracting plaintext.
2) DictMetric for cross-lingual document-level alignment of the collected comparable corpora.
3) Tilde's Wrapper System for CollTerm for identifying terms in plaintext documents.
4) Term normalisation tools developed by Tilde and the University of Sheffield for deriving normalised (canonical) forms of terms that occur in different surface forms.
5) MPAligner for extracting bilingual term pairs (aligning terms) from term-tagged Wikipedia document pairs.
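
The last two steps of the workflow above (normalising terms, then aligning them across languages) can be pictured with a toy sketch. It does not reproduce the TaaS tools: `normalise`, `align_terms`, `TermPair`, the seed lexicon and the sample terms are all hypothetical stand-ins invented for illustration, and the real normalisation and alignment components are considerably more sophisticated than a lowercase lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermPair:
    """One bilingual entry: a source-language term and its target-language match."""
    source: str
    target: str


def normalise(term: str) -> str:
    # Toy stand-in for step 4: lowercase and collapse runs of whitespace.
    # (Assumption: the real tools derive canonical forms, which this does not.)
    return " ".join(term.lower().split())


def align_terms(source_terms, target_terms, lexicon):
    # Toy stand-in for step 5: keep a pair only if the seed lexicon links the
    # normalised source term to a term actually tagged in the target document.
    targets = {normalise(t) for t in target_terms}
    pairs = []
    for s in source_terms:
        candidate = lexicon.get(normalise(s))
        if candidate in targets:
            pairs.append(TermPair(normalise(s), candidate))
    return pairs


# Invented sample data: terms as they might be tagged in one aligned
# English-Latvian document pair after step 3.
english_terms = ["Machine  Translation", "term extraction"]
latvian_terms = ["Mašīntulkošana", "korpuss"]
seed_lexicon = {"machine translation": "mašīntulkošana"}

pairs = align_terms(english_terms, latvian_terms, seed_lexicon)
print(pairs)
```

Only the first English term produces a pair here, because the second has no lexicon entry. The sketch keeps normalisation separate from alignment to mirror the ordering of steps 4 and 5.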

Distribution Availability

Available - Unrestricted Use

Licence

CC - BY

IPR Holder

[TAAS]

Contact Person

Mārcis Pinnis

text

Lexical Conceptual Resource General Information

Terminological Resource

Creation

Creation mode: Automatic

Multilingual text lexicalConceptualResourceLanguages

Slovak
Slovenian
Swedish
English
Greek, Modern (1453-)
German
Estonian
Spanish
Bulgarian
Danish
Czech
French
Finnish
Hungarian
Croatian
Lithuanian
Italian
Dutch; Flemish
Latvian
Portuguese
Polish
Russian
Romanian

Linguality

Linguality type: Multilingual

Multi-linguality type: Parallel

Size

632 121 Entries

Resource Creation

Resource Creator

University of Sheffield

Funding Project

Terminology as a service (TaaS project - TaaS)

URL: http://www.taas-proj...

Funding Type: EU Funds

Funder: European Commission

Project duration: 06/01/2012 - 05/31/2014

Metadata

Created: 07/03/2014

Last Updated: 07/03/2014

Metadata Language: English (en)

Metadata Creator

Mārcis Pinnis
