The FarsiWeb Project
www.farsiweb.info
These are the informational pages of the FarsiWeb Project, a research project at the Computing Center of Sharif University of Technology. The project started in January 1999, but the team members have been in the field since long before that.
FarsiWeb has two main sponsors, the Science and Arts Foundation (SAF) and the High Council of Informatics of Iran (HCI). SAF has been the main financial sponsor, providing us with computers, books, and wages, and always answering our growing requests, for both guidance and new accessories. HCI has been the main power behind the official acceptance of our work, both in Iran and internationally, trying very hard to help us get past bureaucratic barriers. There have also been other sponsors, including but not limited to UNDP and Sharif University of Technology.
One of the project's main goals is pushing the Iranian and Persian computing community to use "The Unicode Standard", the universal character set. Unicode is synchronised with the international standard "ISO/IEC 10646, Universal Multiple-Octet Coded Character Set (UCS)".
The FarsiWeb Project is a member of the Unicode Consortium as a liaison from HCI, and has representatives in the World Wide Web Consortium and ISO.
1 Lists and Contacts
For inquiries or comments about the project, or any help we may be able to provide, you can contact The FarsiWeb Project Group at FWPG@sharif.edu. Please note that we are not always able to answer.
If you are able to participate in mailing list discussions, there are two available:
- FarsiWeb: for discussions about Unicode, Persian standards, using Persian in World Wide Web (like HTML, XML, ...).
- PersianComputing: for general discussions about Persian computing.
The official language of the lists is English, but you should also be able to understand Persian, since some comments may be in Persian. To subscribe or to access the archives, click on the mailing list name above and follow the instructions.
2 HOWTOs
We maintain a list of HOWTO documents on specific subjects. This list will grow with time.
- How to set up a standard Persian keyboard for Windows 2000
- How to set up a Persian terminal (XTerm) for Linux
3 Products
FarsiWeb people have developed many products since the project's birth. These were patches for enabling Persian support in various programs, or sometimes original programs by themselves. Here is an incomplete list:
- FriBidi: A Free Implementation of Unicode Bidirectional Algorithm.
- Nesf: A Unicode font based on Nimrooz, suitable for Persian web pages (by Mehran Mehr): zipped version; uncompressed version.
- Jalali: A program to convert between Jalali (Hejri-e Shamsi) and Gregorian date systems: C source; PHP source; Windows executable and sources (ported by Mehrdad Sabetzadeh); Palm OS 4.0 executable and sources; PocketPC executable and sources and its screenshot.
- Standard Persian keyboard for Windows 2000, to solve some of the problems created by Microsoft's Farsi keyboard.
- KDE Persian support: We are helping KDE and Qt projects to have proper Persian handling, and Iranian localization support. Latest versions (Qt 3.0.2 and later, and KDE 3.0.0 and later) include our patches, but the effort is not finished yet.
- Pango Persian support: We are helping the development of Pango (used in GNOME) to complete its Persian support. Latest versions (1.0 and later) include our patches, but the effort is not finished yet.
- Mozilla Persian support: We are helping in the development and quality assurance of Mozilla, for proper Persian support. Latest versions (0.9.6 and later) include our patches, but the effort is not finished yet. Please always consider using the latest version if possible.
- XFree86 Persian support: We are improving XFree86 to have Persian data entry and automatic font selection support. Latest versions (4.2.0 and later) include our patches, but the effort is not finished yet.
- XTerm with Persian support: developed by Robert Brady, using specifications and feedback from the project.
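As a rough illustration of what a Jalali converter like the one listed above has to do, here is a minimal Python sketch of Gregorian-to-Jalali conversion. It is not the project's C or PHP source. It assumes a commonly used arithmetic approximation based on a 33-year leap cycle, and the function name `gregorian_to_jalali` is only illustrative. It gives the expected results for contemporary dates, but it has not been checked for dates far in the past or future.

```python
# Days elapsed before the start of each Gregorian month (non-leap year).
_DAYS_BEFORE_MONTH = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def gregorian_to_jalali(gy, gm, gd):
    """Convert a Gregorian date to a (year, month, day) Jalali tuple.

    Assumption: uses an arithmetic 33-year-cycle approximation of the
    Jalali leap years, not an astronomical calculation.
    """
    # Leap-day corrections count the current year only after February.
    gy2 = gy + 1 if gm > 2 else gy
    days = (355666 + 365 * gy
            + (gy2 + 3) // 4 - (gy2 + 99) // 100 + (gy2 + 399) // 400
            + gd + _DAYS_BEFORE_MONTH[gm - 1])

    # 12053 days = one 33-year cycle; 1461 days = one 4-year group.
    jy = -1595 + 33 * (days // 12053)
    days %= 12053
    jy += 4 * (days // 1461)
    days %= 1461
    if days > 365:
        jy += (days - 1) // 365
        days = (days - 1) % 365

    # The first six Jalali months have 31 days, the rest 30 (or 29).
    if days < 186:
        jm, jd = 1 + days // 31, 1 + days % 31
    else:
        jm, jd = 7 + (days - 186) // 30, 1 + (days - 186) % 30
    return jy, jm, jd


print(gregorian_to_jalali(2000, 3, 20))  # (1379, 1, 1)
```

The example date is Nowruz (the first day) of the Jalali year 1379. A reverse conversion would follow the same idea: count days from a fixed epoch, then split them into Gregorian years and months.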
4 Local Info
Following is the list of data files and specifications available from the project. Please note that the files marked with an asterisk (*) are referring to deprecated standards, and you should normally not use them. You will only need them if you want your software to support these old formats.
Specifications
The final version of the national Iranian standard ISIRI 6219:2002 titled "Information Technology — Persian Information Interchange, using Unicode" is available. A first status report (in Persian) for the FarsiWeb project is also available.
The text of ISIRI 6219:200
Tables
These are the tables and data files produced in the project. Some of these have been submitted to global projects, including the GNU C library, and the CSets archive.
- 2901-unicode.txt: ISIRI 2901 keyboard layout, using Unicode
- iran-holidays.txt: List of official holidays of Iran
- iri.eps: Official symbol of Islamic Republic of Iran in EPS format, based on the exact definition from ISIRI 1
- iri.svg: Official symbol of Islamic Republic of Iran in SVG format, based on the exact definition from ISIRI 1
- * 2901.txt: ISIRI 2901 keyboard layout, using ISIRI 3342
- * 3342.txt: a mapping from ISIRI 3342 to Unicode
- * iransystem.txt: a mapping from Iran System to Unicode
- * farsitex.txt: a mapping from FarsiTeX file format to Unicode
- * 2900.txt: a mapping from ISIRI 2900 to Unicode
5 Informational Links
We are maintaining a list of links here, on Internationalization (i18n for short) and bi-directional scripts (Bidi for short). We will highlight any part related to the Perso-Arabic script.
- The Unicode Standard, which is the most widely accepted character set and supports all common scripts of the world, including but not limited to Latin, Greek, Cyrillic, Hebrew, Arabic, Armenian, various Indic scripts, Thai, Lao, Tibetan, Georgian, Chinese, Japanese, and Korean, as well as mathematical and technical symbols. The standard is published by The Unicode Consortium, which was founded by major computer companies like IBM, Microsoft, Apple, HP, and Oracle. The FarsiWeb project also participates in the development of the Unicode Standard, as a representative from HCI.
There is another standard, namely ISO/IEC 10646, which encodes the same characters as Unicode but provides less accompanying information.
These are the important parts of the Unicode site you should visit:
- The Unicode Standard, Version 3.0, which is really a good book to buy (a non-printable online version is also available). It addresses many problems, ranging from right-to-left behavior to guidelines for implementing sorting and searching.
- Unicode Technical Reports, of which UAX #9, The Bidirectional Algorithm, and UTS #10, Unicode Collation Algorithm (which specifies sorting) are of special importance.
- Code charts, where you can find character codes, names and shapes. The Perso-Arabic script is encoded in the 0600..06FF range (hexadecimal).
Note: Unicode has made provisions for all Persian characters. Apart from "Pe", "Che", "Zhe", and "Gaf", there are codes for Persian-specific "Kaf" and "Ye". There was a single character in the Iranian Information Interchange Standard, ISIRI 3342, "Rial Sign", which was not in Unicode. The FarsiWeb Project submitted a proposal for adding it to the Unicode Standard, and Rial Sign was encoded in Unicode 3.2 as U+FDFC.
- The World Wide Web Consortium (W3C) is another gathering of companies, this one working on World Wide Web standards. Using Unicode (or ISO/IEC 10646) is recommended in all of the W3C standards, including HTML 4 and XML. Mr Roozbeh Pournader, one of the members of the FarsiWeb Project, is participating in W3C activities as an invited expert to its Internationalization Interest Group.
- The HTML 4.01 specification. The Persian-related parts are: Chapter 5, HTML Document Representation, which addresses character set issues, and Chapter 8, Language information and text direction, which tells how to specify languages in HTML and specifies HTML bidirectional behaviour.
- The WWW Internationalization page, which links to i18n info on the web site.
- These are the browsers that support the Perso-Arabic script. The support is still buggy, but is becoming better with time:
- Internet Explorer 5 or later, from Microsoft.
- Netscape 6 and its Open Source engine Mozilla are in the course of improving Bidi support, and the FarsiWeb project is working with the mozilla-i18n group to ensure correct Persian behaviour. You can take a look at the Mozilla i18n and l10n guidelines if you are interested in Mozilla's engineering details.
- Konqueror, from the KDE team.
- PMosaic, AraMosaic, and AraZilla, from Langbox.
- Tango, from Alis Technologies (it seems that it is no longer available).
- We recommend trying SC UniPad as one of the best Unicode-aware editors available on Windows.
- To get some ideas about Unicode and Linux, take a look at Markus Kuhn's UTF-8 and Unicode FAQ for Unix/Linux or Roman Czyborra's homepage, both of which contain useful information and many links. They may not always be up to date in this moving world of i18n and Linux, but are a good resource for all, from the beginner to the expert.
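The Persian-specific letters and the Rial Sign mentioned in the note on code charts can be inspected with a few lines of Python. The sketch below uses only the standard `unicodedata` module and prints each character's code point, Unicode name, bidirectional class (the property that UAX #9 works from), and UTF-8 bytes. The labels in `PERSIAN_CHARS` and the helper name `describe` are ours, chosen for illustration. The mapping of "Kaf" and "Ye" to U+06A9 and U+06CC is our reading of which characters the note means.

```python
import unicodedata

# Illustrative labels mapped to the code points we assume the note refers to.
PERSIAN_CHARS = {
    "Pe": "\u067E",
    "Che": "\u0686",
    "Zhe": "\u0698",
    "Gaf": "\u06AF",
    "Persian Kaf": "\u06A9",
    "Persian Ye": "\u06CC",
    "Rial Sign": "\uFDFC",
}


def describe(ch):
    """Return basic Unicode facts about a single character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        # Bidirectional class, e.g. "AL" for Arabic letters, "L" for Latin.
        "bidi_class": unicodedata.bidirectional(ch),
        # UTF-8 encoding as a hexadecimal string.
        "utf8": ch.encode("utf-8").hex(),
    }


for label, ch in PERSIAN_CHARS.items():
    info = describe(ch)
    print(f"{label:12} {info['code_point']}  {info['name']:26} "
          f"bidi={info['bidi_class']}  utf8={info['utf8']}")
```

All of the letters fall inside the 0600..06FF block, while the Rial Sign sits at U+FDFC outside it. Their bidirectional class is what makes a Bidi implementation lay them out right to left.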
6 Other Persian Free Software
The following are other projects that have released Persian Free Software. Please send a notice to FWPG@sharif.edu if you know of any others.