CAT-Tools/DéjàVu X/Japanese

Bugs (Specific languages: Japanese as source language)

1. Scan function (Ctrl+S)

Added: --Loek van Kooten 18:19, 5 February 2007 (UTC)

Version: 7.0.284

System: Windows XP SP2

Description: The Scan function (known as Concordance in other CAT tools) does not work reliably: words in the Japanese source text are not always found. No pattern has been established yet.

Reproduction:

Workaround:


Status:

Not read by Atril yet.


Comments from other users:

2. Portions Found window

Added: --Loek van Kooten 18:19, 5 February 2007 (UTC)

Version: 7.0.284

System: Windows XP SP2

Description: Japanese source words do not appear in the Portions Found window if they are immediately preceded or followed by a punctuation mark (、 or 。).

Reproduction:

Workaround:


Status:

Not read by Atril yet.


Comments from other users:

3. Auto-propagate

Added: --Loek van Kooten 18:19, 5 February 2007 (UTC)

Version: 7.0.284

System: Windows XP SP2

Description: DVX cannot handle the number conversion that translations from Japanese require. As a result, it copies the numbers of auto-propagated strings as they are, without checking whether the numbers in the auto-propagated 'copy' are equal to the numbers in the auto-propagated 'original'.

Example:

平成19年2月2日 (the year Heisei 19, 2nd month, 2nd day) is translated as February 2, 2007.

平成19年1月1日 (the year Heisei 19, 1st month, 1st day) is auto-propagated as February 2, 2007 (it should be January 1, 2007).

While it is understandable that DVX cannot come up with the right translation for the auto-propagated copy, the danger lies in the fact that DVX marks the copy as auto-propagated while in fact the source text is different. Turning auto-propagation off is the other extreme, as that also cancels auto-propagation of strings that would otherwise be propagated correctly.
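The kind of check that would catch this case can be sketched in a few lines of Python. This is only an illustration of comparing the numbers in two source segments, not a description of how DVX works internally; the names `numbers_in` and `safe_to_propagate` are hypothetical. It assumes numbers are written with Arabic digits (half-width or full-width) and does not handle kanji numerals such as 十九.

```python
import re
import unicodedata

# One or more consecutive digits; applied after normalization, so the
# matches are plain ASCII digits.
DIGITS = re.compile(r"\d+")


def numbers_in(segment: str) -> list[str]:
    """Return the digit runs found in a segment, in order of appearance."""
    # NFKC folds full-width digits (１９) into their ASCII form (19),
    # so both spellings compare as equal.
    normalized = unicodedata.normalize("NFKC", segment)
    return DIGITS.findall(normalized)


def safe_to_propagate(original_source: str, candidate_source: str) -> bool:
    """Hypothetical check: propagate only if both sources carry the same numbers."""
    return numbers_in(original_source) == numbers_in(candidate_source)


# The two source segments from the example above differ in month and day,
# so the check refuses to propagate.
print(safe_to_propagate("平成19年2月2日", "平成19年1月1日"))
```

With a check like this, a segment whose numbers differ would be left for the translator instead of being marked as auto-propagated, while segments with identical numbers would still propagate.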

Solution: DVX should not auto-propagate strings if a) the source text is Japanese AND b) numbers in the auto-propagated copy are not equal to numbers in the auto-propagated original.

Reproduction:

Workaround:


Status:

Not read by Atril yet.


Comments from other users:

4. Import from Microsoft Word

Added: --Loek van Kooten 15:54, 6 February 2007 (UTC)

Version: 7.0.284

System: Windows XP SP2

Description: After importing Japanese Word documents, the text in DVX is infested with rogue codes. Japanese often features both Chinese ideograms (kanji) and the Roman alphabet/Arabic numerals. Since standard Roman alphabet/Arabic numerals in Japanese fonts are quite ugly, most Japanese texts in Word use two different fonts (one for kanji and one for the Roman alphabet/Arabic numerals), which are constantly alternated. Each alternation results in a rogue code.

The problem is that if the target text is in a Western language, these rogue codes make no sense at all, as the target text will not feature kanji and is therefore written in one font (the font used for the Roman alphabet/Arabic numerals). In other words, these rogue codes are completely superfluous in the target text, slow down the translation process and pollute your databases.

When translating from Japanese to a western language, DVX should therefore ignore all font changes and opt for a standard font instead (like Arial, or maybe even user-defined). Automatic recognition of the fonts used in the Japanese source text seems very complicated, as "font sets" (consisting of a Japanese and a Roman font) might be used and "paired" inconsistently.

It seems many rogue codes are actually not caused by alternating fonts, but by alternating code sets, as even Word documents that have been forced to use one font give inconsistent rogue codes in DVX:

{\loch\af25\hich\af25\dbch\af25 \loch\af25\hich\af25\dbch\f25 平成}{\loch\af25\hich\af25\dbch\af25 \hich\af25\dbch\af25\loch\f25 18年12月19日}

instead of merely 平成18年12月19日
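To make the difference concrete, the following Python sketch strips the codes from the example segment and recovers the plain text. It is not a DVX feature and not a recommended workaround, only an illustration under stated assumptions: the function name `strip_font_codes` is hypothetical, and the pattern assumes the codes always have the form shown above (a backslash, lowercase letters, an optional number, and an optional trailing space, wrapped in braces). Real segment text containing backslashes or braces would be damaged by it.

```python
import re

# A code of the form seen in the example: backslash, lowercase letters,
# an optional (possibly negative) number, and the single space that may
# terminate the code.
CONTROL_WORD = re.compile(r"\\[a-z]+-?\d* ?")


def strip_font_codes(segment: str) -> str:
    """Remove font-switching codes and group braces from a segment."""
    without_codes = CONTROL_WORD.sub("", segment)
    return without_codes.replace("{", "").replace("}", "")


sample = (
    r"{\loch\af25\hich\af25\dbch\af25 \loch\af25\hich\af25\dbch\f25 平成}"
    r"{\loch\af25\hich\af25\dbch\af25 \hich\af25\dbch\af25\loch\f25 18年12月19日}"
)
print(strip_font_codes(sample))
```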

The rogue codes are inconsistent: if a code appears after 平成 (where the Japanese changes from kanji to Arabic numerals), you would expect a similar code after 年, 月 or 日, but this is not the case.

Reproduction:

Workaround:


Status:

Not read by Atril yet.


Comments from other users: