Inside DGT, most translators work with Trados Studio; only a minority use OmegaT. Situations where a translator using OmegaT needs to interact with another using Trados are becoming more and more frequent. The following extensions are recent developments that enable OmegaT users to work better with files provided by Trados Studio.
A more detailed, more technical description of all of this is in the dedicated document.
We already described this feature as it worked for DGT-2 here. In DGT-OmegaT 3.1, whose main feature was integration with Trados Studio, the general idea is that when the source files are in SDLXLIFF format, "view source" or "view target" should not display the SDLXLIFF file (which would open in Trados Studio, if you have it) but the file in its native format. More concretely:
The XLIFF filter provided in the OmegaT core only supports files where <target> is filled with the original text. SDLXLIFF files do not, by default, follow this convention.
The filter provided by Okapi is a true bilingual file filter: it distinguishes between the <source> and <target> elements of the file. But it has another drawback. When you have tags inside a segment, for example <x id="158" />, OmegaT will see <x158/>. And SDLXLIFF files, in particular, always contain a unique number for each tag.
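To make the distinction concrete, here is a minimal Python sketch, not the Okapi or OmegaT code, that reads a tiny XLIFF 1.2 fragment, keeps <source> and <target> apart, and writes inline tags in the short form that OmegaT displays. The sample document and the helper names `to_omegat` and `read_units` are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Hypothetical sample: one translation unit with a paired tag and a placeholder.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file original="sample.docx" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Sample <g id="18">Text</g><x id="158" /></source>
        <target><g id="18">Texte</g> d'exemple<x id="158" /></target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def to_omegat(elem):
    """Flatten an element's content, writing inline tags as <g18>...</g18> or <x158/>."""
    parts = [elem.text or ""]
    for child in elem:
        name = f"{local(child.tag)}{child.get('id', '')}"
        if local(child.tag) == "x":
            # <x/> is a standalone placeholder
            parts.append(f"<{name}/>")
        else:
            # paired tag such as <g>: recurse into its content
            parts.append(f"<{name}>{to_omegat(child)}</{name}>")
        parts.append(child.tail or "")
    return "".join(parts)


def read_units(xliff_text):
    """Return a list of (source, target) pairs; target is None when absent."""
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        source = unit.find(f"{{{XLIFF_NS}}}source")
        target = unit.find(f"{{{XLIFF_NS}}}target")
        units.append((
            to_omegat(source),
            to_omegat(target) if target is not None else None,
        ))
    return units


for source_text, target_text in read_units(SAMPLE):
    print("source:", source_text)
    print("target:", target_text)
```

The tag numbers in the output are copied from the `id` attributes of the file, which is where the document-level numbering discussed below comes from.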
Compare (in green the feature we like, in red the one we don't like):
OmegaT filter | Okapi filter |
<g0>Texte</g0> d'exemple | Sample <g18>Text</g18> <segment 0010> <g18>Texte</g18> d'exemple <end segment> |
<g0>Texte</g0> d'exemple | Sample <g24>Text</g24> <segment 0015> <g24>Texte</g24> d'exemple <end segment> |
Reads <target> as if it was the source | Reads <source> or <seg-source> as source segment |
Segment appears as untranslated | Reads <target> as auto-populated translation |
Tag is reset at paragraph level | Tag is unique at document-level |
Segments are recognized as identical | Segments are not recognized as identical |
Tag type is g (as in xliff) in sdlxliff (not in some XLIFF files) | Tag type is g (as in xliff) |
As a consequence, it is virtually impossible to get a 100% match from an external TMX file, because the numbering inside the XLIFF will always be different from the numbering inside the TMX. Even an SDLXLIFF file containing repetitions (or a previous version of the same document) would have different numbers! Even worse, segments from tm/auto will be totally ignored, because tm/auto works only with 100% matches!
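The mismatch can be reproduced in a few lines of Python. The sketch below illustrates the general idea only and is not the code shipped with DGT-OmegaT: it assumes tags are written in OmegaT's short form (`<g18>`, `</g18>`, `<x158/>`) and renumbers them per segment in order of first appearance. The function name `renumber_tags` is hypothetical.

```python
import re

# Matches short-form tags such as <g18>, </g18> or <x158/>.
TAG = re.compile(r"<(/?)([a-z]+)(\d+)(/?)>")


def renumber_tags(segment):
    """Renumber tag ids from 0, in order of first appearance within the segment."""
    mapping = {}

    def replace(match):
        closing, name, number, standalone = match.groups()
        key = (name, number)
        if key not in mapping:
            mapping[key] = len(mapping)
        return f"<{closing}{name}{mapping[key]}{standalone}>"

    return TAG.sub(replace, segment)


# Two repetitions of the same text, with document-level tag numbers.
first = "<g18>Texte</g18> d'exemple"
second = "<g24>Texte</g24> d'exemple"

print(first == second)                                # raw comparison fails
print(renumber_tags(first) == renumber_tags(second))  # normalised comparison succeeds
```

With document-level numbers the two strings differ, so no exact match is possible; once the numbers are local to the segment, both become `<g0>Texte</g0> d'exemple`.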
A ticket has already been submitted to the Okapi core team, but nothing has happened so far.
To avoid this problem, two approaches have been tested. Both are available not only inside DGT-OmegaT since release 3.1, but also as separate packages that you can install into standard OmegaT (compatible with OmegaT 2.6 or later), which is why we describe them on dedicated pages:
Common features in both approaches:
Some further features, compared to other OmegaT filters, have been added only to the StaX filter; they have no equivalent in other filters and cannot be achieved by renumbering:
In Trados, segmentation rules are stored only in translation memory files (SDLTM), which are in reality SQLite databases. When you open a project, the rules stored in the first memory are applied or, if none is available, the default rules, which are hardcoded and not accessible. Under such conditions, extracting Trados rules, even ones modified by a team inside our organisation, and producing something usable in another CAT tool is very difficult.
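Because an SDLTM file is an SQLite database, it can at least be opened with any SQLite client. The following Python sketch only lists the tables it finds. It assumes nothing about the SDLTM schema, and where exactly the segmentation rules are stored is not covered here. The function name `list_tables` is hypothetical.

```python
import sqlite3
from pathlib import Path


def list_tables(path):
    """Return the names of all tables in an SQLite file, sorted alphabetically."""
    # Open read-only so that the translation memory cannot be modified by accident.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    connection = sqlite3.connect(uri, uri=True)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

Calling `list_tables("memory.sdltm")`, where `memory.sdltm` is a placeholder path, returns the table names as a list of strings. Knowing the tables is only a first step: turning their content into rules usable by another CAT tool remains the hard part described above.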
As if things were not complicated enough, when you create a project with Trados, the SDLXLIFF file is not segmented! You will have one translation unit per paragraph, which is usual, but the <seg-source> element will not be correctly filled using Trados's own rules unless you open the file in Studio and edit at least one segment.
All existing filters (OmegaT's original one, Okapi, and ours) are capable of using the segmentation from <seg-source> if it exists, in which case you should deactivate "sentence-level segmenting". But if it is not there, either you work by paragraph or you activate "sentence-level segmenting"... but then it works using OmegaT's rules, and you lose one of the biggest advantages of using SDLXLIFF!
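Whether "sentence-level segmenting" should be switched off can therefore be decided by looking for <seg-source> in the file. A minimal Python sketch of such a check follows. It assumes the standard XLIFF 1.2 namespace and counts a translation unit as segmented when its <seg-source> contains at least one <mrk mtype="seg"> element. The function name `count_segmented_units` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample: the first unit is segmented, the second is not.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file original="demo.docx" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>First sentence. Second sentence.</source>
        <seg-source><mrk mtype="seg" mid="1">First sentence.</mrk> <mrk mtype="seg" mid="2">Second sentence.</mrk></seg-source>
      </trans-unit>
      <trans-unit id="2">
        <source>Not segmented yet. Still one paragraph.</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def count_segmented_units(xliff_text):
    """Return (segmented, total) counts of trans-units in an XLIFF 1.2 document."""
    root = ET.fromstring(xliff_text)
    units = root.findall(".//x:trans-unit", NS)
    segmented = 0
    for unit in units:
        # A unit counts as segmented if <seg-source> holds at least one segment marker.
        if unit.findall("x:seg-source//x:mrk[@mtype='seg']", NS):
            segmented += 1
    return segmented, len(units)


segmented, total = count_segmented_units(SAMPLE)
print(f"{segmented} of {total} translation units carry <seg-source> segmentation")
```

If every unit is segmented, the file's own segmentation can be used and OmegaT's sentence-level segmenting turned off; if none is, the file was probably never opened and edited in Studio.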
To solve this problem, the command-line executable has been modified so that it can call any of the "tasks" provided by the Trados API. The "Analyze" task, which is not executed when you create a project, can then be called to produce a correctly segmented SDLXLIFF file. This can be done through the Wizard (latest version, in Java only) or, alternatively, through a Groovy script we provide, which does nothing more than call the command-line executable. In any case, if you want to use it, don't forget to install the command-line executable (which means that you must have Trados installed) and to update the script to tell it where the executable is.