Indic Unicode issues in Windows

The following issues with Windows XP's implementation of Indian languages were noticed during Baraha development. Some of these issues have been fixed in later versions of Windows.


Issue 1: (fixed in Windows XP service pack 2)

In the following Kannada examples, Windows XP's rendering is incorrect.

The correct rendering is as shown in the example below.


Issue 2: (fixed in Windows 7)

In Kannada, consonant clusters with 'ra' (0CB0) as the first consonant can be written in two ways, as shown in the example below.

Method a) uses the arkavottu (repha) symbol at the end of the combination. This method is inherited from Sanskrit and is commonly used in the Kannada script.

Method b), which is the natural way of rendering consonant clusters in Kannada, uses the 'ra' consonant followed by the half form(s) of the following consonant(s). In certain cases method b) is more appropriate than method a), as shown in the example below.

Baraha supports both methods, but Windows XP supports only method a). There is a proposal to distinguish these two forms using the ZWJ character, as shown in the example below.
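In code point terms, the two forms might be encoded as in the minimal sketch below. It assumes that the ZWJ (200D) goes between 'ra' and the virama (0CCD), which is how the Unicode Standard describes the Kannada case; the text above does not spell out the exact sequence of the proposal. The helper names `ra_cluster` and `code_points` are hypothetical, and 'ka' (0C95) is only a sample second consonant.

```python
RA = "\u0CB0"      # KANNADA LETTER RA
VIRAMA = "\u0CCD"  # KANNADA SIGN VIRAMA
ZWJ = "\u200D"     # ZERO WIDTH JOINER
KA = "\u0C95"      # KANNADA LETTER KA, used only as a sample second consonant


def ra_cluster(following, use_arkavottu=True):
    """Build a 'ra' + consonant cluster (hypothetical helper).

    use_arkavottu=True  -> method a): RA, VIRAMA, consonant
    use_arkavottu=False -> method b): RA, ZWJ, VIRAMA, consonant
    The ZWJ position is an assumption; the proposal's exact
    sequence is not given in the text.
    """
    if use_arkavottu:
        return RA + VIRAMA + following
    return RA + ZWJ + VIRAMA + following


def code_points(text):
    """Format a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


print(code_points(ra_cluster(KA)))                       # 0CB0 0CCD 0C95
print(code_points(ra_cluster(KA, use_arkavottu=False)))  # 0CB0 200D 0CCD 0C95
```

Whether the second sequence actually displays as method b) depends on the shaping engine and font, so the sketch only shows the stored text, not the rendering.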


Issue 3: (fixed in Windows 7)

In the following Kannada examples, Windows XP incorrectly places the arkavottu after the anuswara.

The correct rendering is as shown in the example below.


Issue 4: (fixed in Windows 7)

There is an inconsistency in the placement of the arkavottu in the following Kannada examples.

The correct rendering is shown in the picture below.


Issue 5:

The correct sort order for the Tamil script is shown in the example below. As of Baraha 7.0, Baraha uses this sort order for Tamil.

Unicode, however, has a different sort order for Tamil, as shown in the example below.
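The difference between the two orders can be illustrated with a custom sort key. This is only a sketch under two assumptions: that the Unicode order meant here is plain code point order, and that the intended order is the conventional Tamil consonant sequence with the Grantha letters placed last. The `TRADITIONAL` string is based on that convention and is not taken from Baraha's own table; vowels and vowel signs are left out for brevity, and `traditional_key` is a hypothetical name.

```python
# Assumed conventional order of Tamil consonants, Grantha letters last.
# This is an illustration, not Baraha's actual collation table.
TRADITIONAL = "கஙசஞடணதநபமயரலவழளறனஜஷஸஹ"
RANK = {ch: i for i, ch in enumerate(TRADITIONAL)}


def traditional_key(word):
    """Sort key: rank each character by its position in TRADITIONAL.

    Characters outside the table sort after all listed letters,
    in code point order.
    """
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in word]


letters = ["ன", "ப", "ஜ", "ச", "ஞ"]

# Plain sorted() compares code points, i.e. the Unicode chart order.
print("".join(sorted(letters)))                       # சஜஞனப
print("".join(sorted(letters, key=traditional_key)))  # சஞபனஜ
```

In code point order ஜ falls between ச and ஞ and ன precedes ப, whereas the assumed conventional order moves ன to the end of the native consonants and ஜ after it.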


Issue 6: (need clarifications)

During the implementation of the Bengali script, Baraha used the book "Bengali self-taught" by Suniti Kumar Chatterji as a reference. Baraha follows the rule that YA-PHALA is the half form of the consonant 09DF (য়). In the Windows implementation, however, YA-PHALA is the half form of the consonant 09AF (য). Because of this inconsistency, when a Baraha document is converted to Unicode, the output differs as shown in the pictures below. Baraha welcomes feedback from Bengali linguists on this issue.
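The mismatch can be stated in terms of code point sequences. The sketch below assumes that a converted Baraha document encodes YA-PHALA as virama (09CD) followed by 09DF, while Windows expects virama followed by 09AF; the exact sequence Baraha emits is not stated in the text, and `to_windows_ya_phala` is a hypothetical helper, not part of Baraha.

```python
VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA (hasanta)
YA = "\u09AF"      # BENGALI LETTER YA
YYA = "\u09DF"     # BENGALI LETTER YYA
KA = "\u0995"      # BENGALI LETTER KA, used only as a sample base consonant


def to_windows_ya_phala(text):
    """Rewrite virama + YYA as virama + YA (hypothetical helper).

    Assumes the source text encodes YA-PHALA as VIRAMA + 09DF.
    A standalone 09DF (not preceded by a virama) is left unchanged.
    """
    return text.replace(VIRAMA + YYA, VIRAMA + YA)


baraha_style = KA + VIRAMA + YYA
windows_style = to_windows_ya_phala(baraha_style)
print(" ".join(f"{ord(ch):04X}" for ch in baraha_style))   # 0995 09CD 09DF
print(" ".join(f"{ord(ch):04X}" for ch in windows_style))  # 0995 09CD 09AF
```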

Baraha rendering:


Windows XP rendering:

Excerpts from the "Bengali self-taught" book:


Issue 7: (fixed in Windows 7)

The Bengali letters hU and hRu display the same output in Windows XP as shown in the picture below.


Issue 8: (fixed in Windows 7)

The Unicode documents specify that the Marathi "eyelash ra" should be obtained as the half form of character 0x0931. Baraha follows this convention. However, Windows XP uses the half form of 0x0930 to render "eyelash ra". Because of this inconsistency, when a Baraha document is converted to Unicode, the output differs as shown in the pictures below.
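A conversion between the two conventions can be sketched as a simple string replacement. This assumes that "half form of 0x0931" means 0931 followed by the virama (094D), and that "half form of 0x0930" means 0930, virama, ZWJ (200D); the text above does not give the exact sequences, so both are assumptions, and the function names are hypothetical.

```python
RA = "\u0930"      # DEVANAGARI LETTER RA
RRA = "\u0931"     # DEVANAGARI LETTER RRA
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWJ = "\u200D"     # ZERO WIDTH JOINER
YA = "\u092F"      # DEVANAGARI LETTER YA, used only as a sample following consonant

# Assumed encodings of "eyelash ra" under each convention.
EYELASH_FROM_RRA = RRA + VIRAMA
EYELASH_FROM_RA = RA + VIRAMA + ZWJ


def to_rra_convention(text):
    """Rewrite RA + VIRAMA + ZWJ as RRA + VIRAMA."""
    return text.replace(EYELASH_FROM_RA, EYELASH_FROM_RRA)


def to_ra_zwj_convention(text):
    """Rewrite RRA + VIRAMA as RA + VIRAMA + ZWJ.

    Naive on purpose: it converts every RRA + VIRAMA, including one
    at the end of a word.
    """
    return text.replace(EYELASH_FROM_RRA, EYELASH_FROM_RA)


sample = EYELASH_FROM_RRA + YA
converted = to_ra_zwj_convention(sample)
print(" ".join(f"{ord(ch):04X}" for ch in sample))     # 0931 094D 092F
print(" ".join(f"{ord(ch):04X}" for ch in converted))  # 0930 094D 200D 092F
```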

Baraha rendering:

Windows XP rendering:


Issue 9: (fixed in Windows 7)

The Kannada extended characters that use the Nukta (0x0CBC) are not implemented in Windows XP. Because of this deficiency, when a Baraha document is converted to Unicode, incorrect output is displayed, as shown in the pictures below.
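An extended character of this kind is stored as a base consonant followed by the Nukta. The sketch below builds two such letters and finds them in a string, which is one way to spot text that would be affected. The choice of 'ja' (0C9C) and 'pha' (0CAB) as bases is an assumption for illustration, and `nukta_letters` is a hypothetical helper.

```python
import unicodedata

NUKTA = "\u0CBC"  # KANNADA SIGN NUKTA
JA = "\u0C9C"     # KANNADA LETTER JA (sample base, assumed)
PHA = "\u0CAB"    # KANNADA LETTER PHA (sample base, assumed)
KA = "\u0C95"     # KANNADA LETTER KA (plain letter, no nukta)


def nukta_letters(text):
    """Return every base letter + nukta pair found in text."""
    return [text[i - 1:i + 1]
            for i, ch in enumerate(text)
            if ch == NUKTA and i > 0]


sample = JA + NUKTA + KA + PHA + NUKTA
for pair in nukta_letters(sample):
    # Print the Unicode names of the base letter and the nukta sign.
    print(" + ".join(unicodedata.name(ch) for ch in pair))
```

For the sample string this finds two pairs, 'ja' with Nukta and 'pha' with Nukta, and skips the plain 'ka'.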

Baraha rendering:

Windows XP rendering: