More often than not, Office DOCX documents contain many useless tags.
DGT developed the Tagwipe application in-house; it removes most, if not all, redundant tags from DOCX documents. It also improves segmentation.
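To give an idea of what a redundant tag can look like, the sketch below merges adjacent runs (`w:r`) of a WordprocessingML paragraph when their formatting is identical. This is not Tagwipe's algorithm, which is not described here. It only assumes that a typical source of useless tags is Word splitting a piece of uniformly formatted text into several runs. The helper names (`signature`, `merge_identical_runs`) and the sample paragraph are hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used in word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def signature(elem):
    """Describe an element by tag, attributes and children (text is ignored)."""
    if elem is None:
        return None
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        tuple(signature(child) for child in elem),
    )


def is_plain_text_run(run):
    """True if the run holds only formatting (w:rPr) and at least one w:t."""
    allowed = (f"{{{W}}}rPr", f"{{{W}}}t")
    return run.find("w:t", NS) is not None and all(c.tag in allowed for c in run)


def merge_identical_runs(paragraph):
    """Merge adjacent text runs that carry identical formatting (illustrative only)."""
    previous = None
    for run in list(paragraph):
        if run.tag != f"{{{W}}}r" or not is_plain_text_run(run):
            previous = None  # anything else breaks the chain of mergeable runs
            continue
        same_format = previous is not None and signature(
            previous.find("w:rPr", NS)
        ) == signature(run.find("w:rPr", NS))
        if same_format:
            target = previous.find("w:t", NS)
            extra = "".join(t.text or "" for t in run.findall("w:t", NS))
            target.text = (target.text or "") + extra
            target.set(XML_SPACE, "preserve")  # keep leading/trailing spaces
            paragraph.remove(run)
        else:
            previous = run


# Hypothetical paragraph: one bold word split into two runs, then a plain run.
SAMPLE = (
    f'<w:p xmlns:w="{W}">'
    "<w:r><w:rPr><w:b/></w:rPr><w:t>Trans</w:t></w:r>"
    "<w:r><w:rPr><w:b/></w:rPr><w:t>lation </w:t></w:r>"
    "<w:r><w:t>memory</w:t></w:r>"
    "</w:p>"
)

paragraph = ET.fromstring(SAMPLE)
merge_identical_runs(paragraph)
runs = paragraph.findall("w:r", NS)
texts = [run.find("w:t", NS).text for run in runs]
print(texts)  # ['Translation ', 'memory']
```

In this toy case three runs become two, so a translation editor would have one tag boundary fewer to show. A real cleaner has to deal with much more (revision marks, fields, bookmarks, language attributes), which is presumably why a tool like Tagwipe offers several cleaning levels.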
The user can choose the cleaning level in Tagwipe, from level 0 to level 8. By default, the cleaning level defined in the installation folder is the second-lowest one ("level.1"), which is the most conservative cleaning level.
In the example below, Tagwipe eliminated almost 90% of the tags (from 121 tags to 14 tags) and improved segmentation. The remaining tags are meaningful.
[Screenshots: display in OmegaT Editor, without Tagwipe and with Tagwipe]
Tagwipe has been used at DGT since 2012, in a Windows 7 environment, for DGT-OmegaT projects, and is therefore quite stable. On the DGT-OmegaT website it is also available for Unix (with an installer that tries to work on any Linux distribution or other Unix-like environment), but it has not been thoroughly tested there.