ANSI and Unicode

ANSI:

ANSI is an acronym for the American National Standards Institute. One of the ANSI standards in computing is ASCII (American Standard Code for Information Interchange), a character set comprising the letters, numbers, and symbols that a computer uses. ASCII was designed originally for English script and defines 128 characters. The "ANSI character set" commonly referred to on Windows extends ASCII to 256 different characters, each represented by a single byte, and is supported by almost all operating systems and applications.
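The one-byte-per-character rule can be checked with a short Python sketch. It assumes that Python's "cp1252" codec (Windows-1252) is a reasonable stand-in for the ANSI character set described above; the actual code page depends on the system's regional settings.

```python
# Assumption: "cp1252" (Windows-1252) stands in for the ANSI character set.
ANSI = "cp1252"

for ch in ["A", "7", "é"]:
    data = ch.encode(ANSI)
    # Every character that exists in the table is stored as exactly one byte.
    print(repr(ch), "-> code 0x" + data.hex(), "| bytes used:", len(data))

# Plain ASCII covers only codes 0-127, so "é" (code 0xE9) does not fit in it.
try:
    "é".encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False

print("'é' fits in ASCII:", fits_in_ascii)
```

The letter "A" keeps the same code (0x41) in both ASCII and the 8-bit table; only the codes above 127 belong to the extension.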

While ANSI is a standard for storing data, the visual presentation of the data is handled by fonts. Applications that use ANSI data use TrueType fonts to display the text. Most existing Indian language software, including the Baraha editor, is based on the ANSI character set and TrueType fonts. Such software uses the positions (codes) in the ANSI table that are meant for English script to store the characters (glyphs) of an Indian script. Therefore, if you view ANSI text containing Indian language data in a text editor such as Notepad, you will see junk text. The text is displayed correctly only when you apply the corresponding Indian language font. Different Indian language software packages use different ANSI codes to represent the characters, so if you view text from one package using a font from another, you will again see junk text.
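The effect described above can be imitated with a small Python sketch. The two font tables below are hypothetical and are not the real layout of the Baraha fonts or of any other product; they only illustrate that the stored bytes stay the same while the font decides what is drawn.

```python
# The stored data: three single-byte ANSI codes.
stored = bytes([0x41, 0x42, 0x43])

# A plain text editor treats the codes as Latin letters ("junk" to the reader).
# Assumption: cp1252 stands in for the ANSI character set.
as_latin = stored.decode("cp1252")

# Hypothetical font tables (NOT real vendor layouts): each font draws its own
# Kannada glyph at a position that the ANSI table reserves for an English letter.
FONT_X = {0x41: "ಕ", 0x42: "ನ", 0x43: "ಡ"}
FONT_Y = {0x41: "ಮ", 0x42: "ಕ", 0x43: "ರ"}


def render(data: bytes, font: dict[int, str]) -> str:
    """Return the glyphs a font would draw for each stored byte."""
    return "".join(font.get(code, "?") for code in data)


print("Without the font:", as_latin)
print("With font X:     ", render(stored, FONT_X))
print("With font Y:     ", render(stored, FONT_Y))
```

Because the meaning lives in the font rather than in the stored codes, the same file reads differently under each font, which is why text from one package looks wrong with another package's font.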

All the Baraha fonts are TrueType fonts and can be used for ANSI text only. See: Baraha fonts

Unicode:

ANSI has only 256 different characters. However, some writing systems have thousands of characters, and many applications need to display text from multiple scripts at the same time. So, engineers designed a single character set, called Unicode, that can hold the characters of all the languages of the world. Today, Unicode defines well over 100,000 characters. The number of bytes used for a character depends on the encoding: UTF-8 uses 1 to 4 bytes per character, and UTF-16 uses 2 or 4 bytes.
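As a rough illustration of how storage size varies, the following Python sketch counts the bytes needed for three sample characters in the common Unicode encodings. The sample characters and the helper name byte_lengths are chosen here for illustration only; "cp1252" is again assumed as a stand-in for ANSI.

```python
# Sample characters: a Latin letter, a Kannada letter, and an emoji.
LATIN_A = "A"              # U+0041
KANNADA_KA = "\u0c95"      # U+0C95, KANNADA LETTER KA
EMOJI = "\U0001F600"       # U+1F600, outside the 16-bit range


def byte_lengths(ch: str) -> dict[str, int]:
    """Return how many bytes one character needs in each Unicode encoding."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


for ch in (LATIN_A, KANNADA_KA, EMOJI):
    print(f"U+{ord(ch):04X}", byte_lengths(ch))

# The Kannada letter has no position in the 256-entry ANSI table at all.
try:
    KANNADA_KA.encode("cp1252")
    fits_in_ansi = True
except UnicodeEncodeError:
    fits_in_ansi = False

print("Kannada KA fits in ANSI:", fits_in_ansi)
```

The "-le" variants are used so that the counts exclude the byte order mark that Python's plain "utf-16" and "utf-32" codecs prepend.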

Unicode required new functions for displaying complex scripts such as the Indian scripts, which led to the development of a new font standard called OpenType. Unicode-compliant applications use OpenType fonts to display Unicode text.

Not all operating systems support Unicode as of today. Even if an operating system supports Unicode, not all applications running under it may support Unicode and/or Indian languages. Microsoft introduced Unicode into its operating systems with Windows NT, but support for Indian languages appeared only from Windows 2000 onwards.

The BarahaPad and BarahaDirect (BarahaIME) programs can be used to create documents in Unicode.

See: Unicode system requirements for Indian languages.