Frustration characterizes the experience of
academic PC users of Middle Eastern languages who need to work with several
Middle Eastern and European languages in a single word-processing document and
who need to exchange documents with colleagues. Colleagues collaborating
long-distance on projects have had to contend with converting unintelligible
Arabic, Hebrew and Persian documents and e-mail. Equally frustrating is deciding
which transliteration font, word processor or operating system to use for Middle
Eastern languages while ensuring that colleagues can open, edit and print one's
documents. Promising solutions and products of the 1990s have failed to provide
the level of multilingual text editing capabilities required by academic users.
Until recently, those who needed to work extensively in more than one RTL
(right-to-left) Middle Eastern language (Arabic, Hebrew, Persian, Syriac and
Ottoman Turkish) had two choices. One was a self-contained solution such as a
Macintosh with WorldScript add-ins and a word processor like Nisus Writer. The
other was a bug-ridden, archaic PC word processor or add-in program, sparse in
features and dependent on internal fonts incompatible with more widely used
applications such as MS Word. Now academic users have a real alternative in the
Microsoft Windows 2000 Professional operating system and Microsoft Office 2000
Professional. In addition to reviewing these and other
software products, this essay explores a number of recent innovations and
solutions in software technology, which will benefit non-specialist and
specialist users.
The Unicode Standard
While the issue of international standards in multilingual computing has no
direct bearing on the average academic user who desires practical solutions that
work, such a user may be left behind as colleagues, publishers, and academic
institutions adopt new and more efficient standards in word-processing and the
exchange of data. The internationally recognized Unicode encoding standard (www.unicode.org),
a means of representing the world's modern and ancient languages in a single
character set (currently supporting approximately 6,000 languages, including the
International Phonetic Alphabet and special characters for transliterating
Middle Eastern languages), has rendered obsolete the 256-character extended
ASCII code pages.1 Unicode includes support for Arabic and Hebrew scripts
with additional support for Persian, Ottoman Turkish, Urdu and Yiddish
characters. Support for ancient Near Eastern scripts like Egyptian hieroglyphs
and ancient Aramaic is forthcoming. Users of Syriac and Windows 2000 will
benefit from the eleven free Unicode-enabled fonts (beta release) available
through the Syriac Computing Institute, although they must first register online
(www.egroups.com/group/syrcom).
Further information on the Syriac project is available in the July 2000
electronic journal Hugoye: Journal of Syriac Studies (syrcom.cua.edu/Hugoye).
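The practical difference between a single character set and the older 256-character code pages can be illustrated with a short sketch. The Python example below is only an illustration and is not part of any product reviewed here; the sample words and the helper names `can_encode` and `report` are assumptions chosen for the example. It checks whether a piece of text can be stored in several legacy 8-bit encodings and in UTF-8, one of the Unicode encoding forms.

```python
# Sample words are illustrative only; escapes are used so the code reads left to right.
TRANSLIT = "kit\u0101b"                  # "kitab" with a macron over the a
ARABIC = "\u0643\u062a\u0627\u0628"      # Arabic word for "book"
HEBREW = "\u05e1\u05e4\u05e8"            # Hebrew word for "book"
PERSIAN = "\u067e\u062f\u0631"           # Persian word beginning with the letter peh
MIXED = " ".join([TRANSLIT, ARABIC, HEBREW, PERSIAN])

# Python codec names for plain ASCII, Western European Windows,
# Windows Arabic, Windows Hebrew, ISO Arabic and ISO Hebrew.
LEGACY_CODECS = ["ascii", "cp1252", "cp1256", "cp1255", "iso8859_6", "iso8859_8"]


def can_encode(text, codec):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


def report(text):
    """Map each legacy codec, plus UTF-8, to whether it can represent text."""
    return {codec: can_encode(text, codec) for codec in LEGACY_CODECS + ["utf_8"]}


if __name__ == "__main__":
    for codec, ok in report(MIXED).items():
        print(f"{codec:10} {'ok' if ok else 'cannot represent this text'}")
```

Each legacy code page handles its own script, but none can hold the mixed line, which is the situation faced by anyone combining several Middle Eastern languages and transliteration in one document.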
Formal adoption of the Unicode standard by software manufacturers is beginning
to produce significant results. No longer are the multilingual needs of academic
users being ignored. Windows 2000 Professional (reviewed below), Microsoft's
flagship operating system intended for organizational users, is the first
operating system to effectively integrate Unicode multilingual support. This OS
contains a built-in Unicode script processor called Uniscribe, which supports
complex tasks relating to how particular characters are displayed. PC users can
receive and convert Macintosh Arabic and Hebrew documents on a Windows 2000
platform. Word 2000, part of MS Office 2000 (reviewed below), is designed to work
with Unicode-enabled as well as traditional TrueType and bitmap fonts. The
reader should be warned, however, that not all software packages provide Unicode
support. For example, it is still not possible to enter transliterated
characters for Middle Eastern languages and Arabic and Hebrew scripts in
bibliographic software like EndNote. Although web-based library and
bibliographic databases are common, Unicode support for transliterated Middle
Eastern languages along with Arabic and Hebrew scripts has yet to be
implemented. Nevertheless, with many library database formats available, the
most effective way to support the display of non-Roman and transliterated
scripts is through adopting the Unicode standard.
In the field of academic publishing, full
implementation of the Unicode standard is a year or two away, as desktop
publishing applications do not yet provide multilingual Unicode support. Publishers
are receptive to this standard and realize the potential for making the
publishing process from the manuscript stage to the printing process more
efficient and cost-effective.
Windows 2000 Professional (NT 5)
Gone are the days of the invidious blue screen system crash to which Windows
95 and 98 were susceptible. While Microsoft Windows 98/ME (Arabic, Hebrew and
enabled versions) was adequate for users who required only one RTL language,
Windows 2000 has broken the language barrier with its implementation of the
Unicode standard and support for a wide range of Windows, PC and Macintosh
encoding formats. With support for 120 language groups (including regional
variants of Arabic, English, French, German and so forth), from Afrikaans,
Arabic and Basque to Sanskrit and Thai, Windows 2000 currently offers the best
solution for those who need to use more than one Middle Eastern language. Syriac
will also be officially supported in MS Office 10 and in the Windows NT
operating system slated for release next year. No other operating system offers
comparable functionality, multilingual capabilities and versatility in
exchanging documents across encoding formats so that texts remain readable.
Microsoft Windows 2000 is essentially an
operating system for organizations. Indeed, most users will not have reason to
take advantage of the many administrative and utility features packed into the
OS. However, it has much to offer academic users and institutions. A
multilingual version intended for academic institutions is slated for release in
the second quarter of this year. It differs from the Professional release only
in allowing the language of the user interface and menus to be changed, a
feature most individual users will not need. Windows 2000 can
be deployed in language and learning labs, particularly for advanced students
who need to produce essays in Middle Eastern languages as well as for visiting
scholars who may prefer working in their native language. Windows 2000 is also
an ideal platform for the development of language learning applications, which
take advantage of Unicode support and the advanced features of Windows 2000.
Academic institutions in the Middle East will
find the multilingual or localized versions of Windows 2000 (Arabic, Hebrew,
Turkish) to be superior to Windows 98/ME in every respect. Most users who need
only Arabic or Hebrew along with European languages will continue to be well
served by the Arabic or Hebrew version of Windows 98/ME. Unfortunately, users
running Word 2000 under Windows 98/ME cannot easily use Persian and Hebrew
together, because in Windows 98/ME language support depends on the version of
the operating system installed. In Windows 2000, by contrast, additional
language groups can be installed as
needed through Regional Options in the Control Panel folder (e.g., Arabic
[includes Persian and Urdu], Armenian, Greek, Hebrew, Turkic, and so forth).
Additional input languages and keyboard layouts can be added by clicking the
“Input Locales” tab.
Before upgrading to Windows 2000, users should
consider the hardware requirements. In practice, at least 64 MB of RAM, a
Pentium 233 and a 2-3 GB hard drive are needed, even though Microsoft specifies
only a Pentium 133 and a 2 GB hard drive. While it is
possible to install Windows 2000 over Windows 95 or 98, a clean install is
highly recommended.
Transliteration
Many of us have experienced the difficulty of using extended ASCII
transliteration fonts and converting them for simultaneous use on Macintosh and
the PC. Cross-platform exchange of documents containing
complex scripts was virtually impossible. Currently, one of the best commercial
transliteration fonts available for Arabic, Hebrew, and Persian is Linguist’s
Software’s Semitic Transliterator, which is available in six different
typefaces. Linguist’s has been dedicated to providing specialized fonts and
other products along with reliable technical support to linguists, Semiticists
and scholars of Biblical languages. Although not Unicode-compliant, Semitic
Transliterator, which was updated for compatibility with the latest application
software in 1999, is useful for scholars of Biblical languages and Ancient Near
Eastern languages such as Akkadian. The installation diskettes come with
additional transliteration fonts and keyboard drivers for Akkadian. Although
the set-up process can be discouraging for the computer novice, Linguist's
simplifies this process by providing two detailed step-by-step manuals covering
installation and troubleshooting issues. In addition to the font, Linguist's
keyboard drivers need to be installed, and Linguist’s provides a keyboard chart
showing the key combinations. As with older versions, the updated version of
Semitic Transliterator also requires disabling AutoCorrect features in Word 97
and Word 2000. Unfortunately, the updated TrueType font displays poorly in
Word 2000 at 12 point and 100% magnification; working at 125% or higher
magnification is recommended. Despite this drawback,
laser printing produces high-quality output. Experienced users of Windows 2000
need not install the accompanying keyboard drivers, but instead may opt to
access the special characters by creating a simple macro using the Symbol menu
option in Word 2000 and assigning keyboard shortcuts. Although a Macintosh
version of this font is available, the character mappings are not identical and
finding and replacing characters in converted documents is not a simple process.
While Semitic Transliterator is not Unicode-compliant, another font is.
Monotype Corporation’s Arial Unicode MS
(Helvetica style), which is included with Microsoft Office 2000, accommodates
the scripts of many of the world's languages. Arial Unicode MS produces high
quality output to screen and printer. Unfortunately, Microsoft and Monotype
currently do not have plans to add the necessary Unicode ranges to Courier New,
Times New Roman and other standard fonts. Arial Unicode MS does include IPA and
transliteration characters with macron, breve, circumflex, caron, dieresis, dot
and other diacritic marks. It also includes all the characters for
transliterating Arabic, Hebrew, Persian and Ottoman Turkish. Nevertheless,
accessing these special characters even in Windows 2000 requires a little
effort. The easiest way is to create keyboard shortcuts (assigning Alt, Ctrl,
Alt Gr and Shift key combinations to a character) in a new document using the
Symbol sub-menu item on the Insert menu and saving the modified configuration as
a Word template. Fortunately, users need not disable AutoCorrect features. Arial
Unicode MS also works well in MS Access 2000 with a keyboard macro. One
foreseeable problem with adopting a new transliteration font is converting
word-processing files that contain another font. There is no easy solution. One
might create a find-and-replace macro in MS Word 2000 or perform a global search
and replace. In order to avoid unintentionally replacing characters from
European languages, it is best to approve each change. It is to be hoped that
font manufacturers in conjunction with academic institutions will continue to
produce Unicode-enabled fonts to meet the multilingual needs of the academic
community.
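The conversion problem described above can be sketched in code. The following Python fragment assumes a document already broken into (font name, text) runs, and a legacy transliteration font that stores its special characters at the positions of ordinary accented European letters. The font name `LegacyTranslit` and the mapping table are hypothetical; they do not reproduce the layout of Semitic Transliterator or of any other commercial font. Restricting the replacement to runs set in the legacy font plays the role of approving each change by hand, since European words set in other fonts are left alone.

```python
# Hypothetical legacy font name and character layout, for illustration only.
LEGACY_FONT = "LegacyTranslit"
UNICODE_FONT = "Arial Unicode MS"

LEGACY_TO_UNICODE = {
    "\u00e5": "\u0101",  # a with ring    -> a with macron
    "\u00ee": "\u012b",  # i circumflex   -> i with macron
    "\u00fb": "\u016b",  # u circumflex   -> u with macron
    "\u00f1": "\u1e25",  # n with tilde   -> h with dot below
    "\u00e7": "\u1e63",  # c with cedilla -> s with dot below
}
TABLE = str.maketrans(LEGACY_TO_UNICODE)


def remap_runs(runs):
    """Convert legacy-font runs to Unicode; leave all other runs untouched."""
    converted = []
    for font, text in runs:
        if font == LEGACY_FONT:
            converted.append((UNICODE_FONT, text.translate(TABLE)))
        else:
            converted.append((font, text))
    return converted


if __name__ == "__main__":
    document = [
        ("Times New Roman", "Se\u00f1or Garc\u00eda cites "),  # genuine Spanish text
        (LEGACY_FONT, "mu\u00f1ammad"),                        # legacy transliteration
    ]
    for font, text in remap_runs(document):
        print(font, "|", text)
```

A blind global search and replace over the same document would also have altered the Spanish name, which is why the essay recommends approving each change.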
Microsoft Office 2000 Professional
Office 2000 Professional includes a word processor, database, presentation
manager, spreadsheet and e-mail applications in addition to the multilingual,
RTL-capable Internet Explorer 5 web browser (also downloadable for free from the
Microsoft web-site). All Office applications allow the input of Arabic and
Hebrew. This review focuses on two of these applications—MS Word 2000
(word-processing) and Outlook 2000 (e-mail). In the past, PC users relied either
on programs with built-in Arabic and Hebrew fonts that were incompatible with
Microsoft Word, or on add-in programs that allowed only a limited
amount of text in a Middle Eastern language to be imported. Word 97 and Word
2000, run as stand-alone applications under Windows 98/ME, make it possible to
use only one Middle Eastern language apart from Turkish. For Arabic or Hebrew
this means using an enabled or localized version, because RTL languages are
operating system dependent. The
powerful combination of Office 2000 and Windows 2000 allows users to customize
the user interface via the Microsoft Office Language Settings panel (accessible
through the Windows Start menu) in one of sixty-three languages and variations,
including Arabic and Hebrew. Institutional users can acquire the MultiLanguage
Pack (a set of seven CDs available through Microsoft institutional licensing
programs) with which they can install proofing tools (spelling checker, grammar
checker, hyphenation, thesaurus, translation dictionary for Arabic and Hebrew,
and localized templates) and change the interface language (display, menus and
help features) of all Office 2000 applications, except for Outlook (e-mail) in
editions other than the Arabic and Hebrew ones. Individual users can purchase
Microsoft Office
2000 Proofing Tools from Microsoft’s on-line shop or an academic retailer
for around USD 70. Unlike the MultiLanguage Pack, the Proofing Tools do not
include the ability to customize the interface language. Among the most useful
Office language features are the translation dictionaries (Arabic-English,
English-Arabic, Hebrew-English, English-Hebrew). While the Hebrew and Arabic
spelling checkers are fairly impressive, the grammar checker needs improvement.
For instance, it cannot detect errors in subject-verb agreement. The English
(US) version of Office 2000 comes with proofing tools for English, French, and
Spanish. In switching languages, Word 2000 automatically detects the text
language of a document. With Windows 2000 and Office 2000, users can choose from
over fifty Arabic/Persian and fifty Hebrew fonts, including OpenType (OTF)
fonts, the latest in font technology, which can contain the characters of a
number of languages.
OpenType fonts include Arial, Courier New, Lucida Sans Unicode, Tahoma and Times
New Roman. MS Office 2000 users can also install a visual keyboard, which can be
displayed using any font in all MS applications. Clicking on a key will copy the
character to the application window. Users who only require Arabic or Hebrew can
purchase an Arabic or Hebrew localized version of Office 2000 instead of the
additional proofing tools.
In Word 2000, the user can open and save
documents in various PC and Macintosh formats with conversion to and from
Macintosh Arabic and Hebrew formats (though no converters are currently
available for Nisus Writer documents). To save a document in another encoding,
select “Save As” from the File menu
and select “encoded text” under “Save as type.” Files can be opened the
same way by first ensuring that conversion confirmation is checked in Word
Options and then selecting the appropriate encoding. For Arabic and Hebrew
Macintosh formats, document formatting is lost in the process. The user can name
documents in Arabic, Hebrew, Persian, Turkish and the other languages
installed on Windows 2000. Office also comes with a stand-alone code page and
text layout conversion utility (Conv Text), which includes a variety of PC
formats for converting Arabic and Hebrew text files.
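For plain text files, the same kind of code page conversion can be scripted. The sketch below uses Python's standard codecs; the function names are assumptions made for illustration, and this is not how Word or the Office conversion utility is implemented. The codec names `cp1256` and `cp1255` are Python's names for the Windows Arabic and Hebrew code pages, and `iso8859_6` and `iso8859_8` for the corresponding ISO encodings. As with Word's encoded-text option, only the characters survive; formatting does not.

```python
from pathlib import Path


def transcode_bytes(data: bytes, source: str, target: str = "utf_8") -> bytes:
    """Decode bytes from one encoding and re-encode them in another."""
    return data.decode(source).encode(target)


def transcode_file(src: Path, dst: Path, source: str, target: str = "utf_8") -> None:
    """Convert a plain text file from one encoding to another."""
    dst.write_bytes(transcode_bytes(src.read_bytes(), source, target))


if __name__ == "__main__":
    salam = "\u0633\u0644\u0627\u0645"        # Arabic greeting "salam"
    windows_bytes = salam.encode("cp1256")    # as stored in the Windows Arabic code page
    # Converting with the correct source encoding recovers the word.
    print(transcode_bytes(windows_bytes, "cp1256").decode("utf_8") == salam)
    # Declaring the wrong source encoding produces different letters.
    print(windows_bytes.decode("iso8859_6", errors="replace") == salam)
```

The second check shows why the appropriate encoding has to be selected when a file is opened: the same bytes read under the wrong code page decode to other letters, which is the unintelligible text colleagues have had to contend with.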
Outlook
Sending Arabic and Hebrew e-mail has never been easier than with Outlook 2000.
Using Outlook 2000 will facilitate closer cooperation among colleagues in the
Middle East and elsewhere. This e-mail program comes with an impressive
repertoire of productivity features, including an appointment/meeting calendar
with international holidays, task manager and address book. Users who wish to
send multilingual text can do so using a Unicode-enabled font such as Arial
Unicode MS or Times New Roman, or by attaching a Word 2000 document to the e-mail
message. In addition to Unicode (UTF-7, UTF-8), Outlook supports sending and
receiving e-mail in Arabic (ISO, Windows) and Hebrew (ISO-logical, Windows)
encodings. Colleagues who do not own Outlook 2000 can use the free scaled-down
version—Outlook Express (downloadable from the Microsoft site) on an Arabic or
Hebrew-enabled platform or Windows 2000. The principal fault with Outlook 2000
is that, unlike other MS Office applications, its visual display is not
Unicode-based and cannot be updated with the institutional MultiLanguage Pack.
In addition, the address book in editions other than the Arabic and Hebrew ones
cannot store Arabic and
Hebrew names. Although MS Office 2000 Professional with the Proofing Tools is a
bit pricey, the combination is worth it for someone working in multiple Middle
Eastern languages.
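How a mail program labels the encoding of a message can be shown with a small sketch. The Python example below uses the standard `email` package rather than Outlook, so it illustrates the general mechanism only; the helper names `build_message` and `read_body` and the addresses are invented for the example. The same Arabic greeting is packaged once as UTF-8 and once in the Windows Arabic code page, and the receiving side recovers the text by reading the declared character set.

```python
from email.message import EmailMessage

GREETING = "\u0645\u0631\u062d\u0628\u0627"   # Arabic greeting "marhaban"


def build_message(body, charset):
    """Create a plain-text message whose body is encoded in the given charset."""
    msg = EmailMessage()
    msg["From"] = "sender@example.org"        # placeholder address
    msg["To"] = "colleague@example.org"       # placeholder address
    msg["Subject"] = "Greeting"
    msg.set_content(body, charset=charset)
    return msg


def read_body(msg):
    """Decode the body using the charset declared in the message headers."""
    return msg.get_content().rstrip("\n")


if __name__ == "__main__":
    for charset in ("utf-8", "windows-1256"):
        msg = build_message(GREETING, charset)
        print(msg["Content-Type"], "->", read_body(msg) == GREETING)
```

Either encoding works as long as the label travels with the message; text becomes unreadable when the label is missing or the recipient's software does not know the encoding.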
Conclusions
Adoption of the Unicode standard in institutional computing and in the
publishing market will lead to further globalization of information
technology. Users desire the ability to access databases and download
multilingual text in universal and readily exchangeable formats. Companies like
Microsoft have made this possible. A pressing need exists for Unicode support in
bibliographic software as well as library databases and for users to be able to
access and exchange data in as few formats as possible. With the adoption of
these innovative solutions, colleagues in the Middle East and the West will be
able to work more closely on collaborative projects.
1 The author would like to thank Dr. Kenneth Whistler of the Unicode Consortium
and Sybase for this figure.