Acquisition of Character Translation Rules for Supporting SNOMED CT Localizations.
Journal:
Studies in health technology and informatics
PMID:
25991218
Abstract
Translating huge medical terminologies like SNOMED CT is costly and time consuming. We present a methodology that acquires substring substitution rules for single words, based on the known similarity between medical words and their translations, due to their common Latin / Greek origin. Character translation rules are automatically acquired from pairs of English words and their automated translations to German. Using a training set with single words extracted from SNOMED CT as input we obtained a list of 268 translation rules. The evaluation of these rules improved the translation of 60% of words compared to Google Translate and 55% of translated words that exactly match the right translations. On a subset of words where machine translation had failed, our method improves translation in 56% of cases, with 27% exactly matching the gold standard.