Advanced topics on Computing in Greek
This page is intended to cover advanced topics in using Greek fonts. It
is not intended for novice users, AOL users, WebTV users, or people with no
programming background. It is heavily Unix-oriented, primarily because the
topics addressed have emerged during the past three years of developing HR-Net
and the HR-Net custom news archiving and distribution system on a Unix
platform.
This page is not "supported". We invite contributions, corrections and
discussions with sysadmins of Greek systems or systems that use Greek
fonts, but cannot promise to help solve a problem, develop new code, or
modify our existing code to suit third-party needs. Please direct all
correspondence regarding these topics to the feedback form.
Warning: Anything on this page may be in error.
It may not work. It may be incompatible with the RFCs. (Don't know what
an RFC is? You're looking at the wrong page; try http://www.hri.org/fonts/ instead.) We'll be
happy to correct any mistakes you identify, but we offer no guarantees that
anything here actually works.
Character Tables
- ISO-8859-7 character set
- Codepage 437 character set
- Unicode 2.0: Greek Block U+0370-U+03FF
- Unicode 2.0: Greek Extended U+1F00-U+1FFF
Code Grab-bag
- 2ascii.pl
- A CGI script in Perl, which uses webcopy to get a given WWW page,
and applies a filter to convert from ISO-8859-7 to ASCII Greek
before returning the page to the user. Useful for occasionally accessing
Greek pages from a machine that does not have Greek fonts installed.
In public use on HR-Net: www.hri.org/cgi-bin/2ascii
- decode.pl
- A Perl script which converts Base64- and Quoted-Printable-encoded
messages, sections of messages, or attachments to 8-bit-clean text.
Used as a pre-filter to the conversion from ISO-8859-7 to ASCII
that makes "twin-lists" possible.
qpdec.pl: Quoted-Printable decoding. A standalone section of the code in decode.pl.
- elot2ascii.pl
- Perl filter for converting ISO-8859-7 to ASCII Greek according to an
ad-hoc conversion table. The
ad-hoc definition of the ASCII Greek created by elot2ascii.pl is indicated by "Latin:".
- e2a.pl
- A conversion similar to the previous one, except that the output looks as
if you had typed Greek text without using a Greek keyboard driver.
For example, the Greek word for "good morning" converts to "kalhm;era".
We have used it from within an editor such as vi, as in:
:.!e2a.pl
to convert a single line.
- a2e.pl
- The exact opposite of the previous filter. If you lack a Greek
keyboard driver, you can type in the text pretending you are using one.
For example, you type:
O ;hliow anat;elei ap;o thn anatol;h
Then you pass the text through this filter to generate ELOT 928
(ISO-8859-7) text.
- a2e2.pl
- An extended version of a2e.pl, this script tries to cover multiple versions
of ASCII Greek, as they might appear in email distribution lists where
discussion is principally in ASCII Greek.
Note: This program does not currently handle accents.
- 4372elot.pl
- Perl filter for converting antiquated Codepage 437 Greek to ISO-8859-7
according to an ad-hoc definition of Codepage 437.
- web2elot.pl
- Ever create an HTML file using an HTML editor (we don't actually use
these contraptions, but a lot of stuff that comes our way has been
produced using them) only to find it full of cryptic, illegible and
sometimes flat-out wrong &-style entity encoding, such as ê,
ġ or even Ώ? The first two examples occur
when your editor does not know that you want the output to be
8-bit ISO-8859-7. These encoding errors are particularly heinous,
since they multiply the size of a regular HTML file by a factor of 4 or 5.
The third example is actually valid (it is really a Unicode
encoding), but since all our other tools are ISO-8859-7, we'll hold
off on the switch until this becomes more prevalent [*]. web2elot.pl clears up all the
junk, reverting your file to nice, clean, 8-bit ISO-8859-7.
Warning: This code has not really been
tested. If you see anything missing, please write us.
- mactrans or better yet: trans
- On a Macintosh, you might be using what looks like an ELOT 928 font,
yet, when it comes to converting to HTML code, nothing looks
right. The Mac has its own standards. To convert to PC or Unix
standards, after the text has reached its final platform and is
illegible, use this variant of tr to fix it.
- convert-greek-rtf-21.hqx
- Use this package to translate Greek Word documents written in Mac format
to RTF documents in PC format. Follow the instructions that come with
the package.
- elta
- Some mailers cannot handle 8-bit messages and don't do anything
about it. The result is apparent loss of information and the
recipient receives something like:
To je_lemo aut| dem ha ckit~sei. Wqei\fetai bo^heia ap| to HR-Net cia
ma diabaste_.
Use this variant of tr to recover parts of the text that
were originally 8-bit characters of ELOT 928. There is no way for
the program to know which parts were originally plain 7-bit characters,
i.e., had the most significant bit low. This means that you have to
separate the message headers as well as any parts of the damaged
message that you can already read before trying to recover the original.
Thus, the above example becomes:
Το κείμενο αυτό δεν θα γλιτώσει. Χρειάζεται βοήθεια από το ΘR-Ξετ για
να διαβαστεί.
- grconv.bas
- Visual Basic routines for converting various types of files to different Greek encodings, thanks to Panagiotis Louridas. Available also in zip format.
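As a rough illustration of the recovery described under elta above, the following sketch (written in Python, not the original tr variant) sets the most significant bit of each character and keeps the result only where ISO-8859-7 defines a character at that position. The function name recover_high_bit and the 0x40-0x7E range are assumptions inferred from the sample message, in which spaces, periods and hyphens come through unchanged; the real tool's table may well differ.

```python
def recover_high_bit(text: str) -> str:
    """Guess the original ISO-8859-7 text for a message whose high bits were stripped."""
    out = []
    for ch in text:
        code = ord(ch)
        # Assumed range: only letters and nearby punctuation are remapped.
        if 0x40 <= code <= 0x7E:
            try:
                # Put the high bit back and read the byte as ISO-8859-7.
                out.append(bytes([code | 0x80]).decode("iso8859_7"))
                continue
            except UnicodeDecodeError:
                pass  # position unassigned in ISO-8859-7 (e.g. 0xD2): keep as-is
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(recover_high_bit("To je_lemo aut| dem ha ckit~sei."))
    # A word that was plain 7-bit all along gets mangled too:
    print(recover_high_bit("HR-Net"))
```

Applied to "HR-Net", the sketch produces the same mangled form seen in the recovered sample above, which is why the readable parts of a message have to be set aside before running such a filter.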
"Twin-lists"
Twin lists is an approach to Greek mailing lists, proposed by HR-Net and
used for some supported lists, as well as our internal communications. The
principle is simple: Create two lists, one with users that can read Greek,
and one with users that cannot. The second list is a member of the first
one, but receives its messages after any Greek content is converted from
ISO-8859-7 to ASCII Greek.
If you are using Majordomo software, the member of the main list which
represents the Greek-impaired list should be defined in /etc/aliases as
list-asc: "|/path/decode.pl|/path/elot2ascii.pl|/path/wrapper resend..."
Both decode.pl and elot2ascii.pl are available for
download. decode.pl is used to convert all text and text attachments to
8-bit-clean, in order to allow elot2ascii.pl to remain as simple and fast
as possible.
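The two filter stages in that alias can be sketched roughly as follows. This is a Python stand-in, not the Perl scripts themselves: the function names are hypothetical, only Quoted-Printable bodies are handled (no headers, Base64 or attachments), and because the ad-hoc table of elot2ascii.pl is not reproduced on this page, the sketch borrows the Greek-keyboard-style mapping described for e2a.pl.

```python
import quopri

# Assumed table: lowercase Greek letter -> key on a Greek keyboard layout.
_TABLE = dict(zip("αβγδεζηθικλμνξοπρσςτυφχψω",
                  "abgdezhuiklmnjoprswtyfxcv"))
# Accented vowels become ';' followed by the vowel's key, as in "kalhm;era".
_TABLE.update(zip("άέήίόύώ", [";a", ";e", ";h", ";i", ";o", ";y", ";v"]))


def to_ascii_greek(text: str) -> str:
    """Transliterate Greek letters to ASCII; everything else passes through."""
    out = []
    for ch in text:
        low = ch.lower()
        if low in _TABLE:
            latin = _TABLE[low]
            # Preserve case for capital Greek letters.
            out.append(latin.upper() if ch != low else latin)
        else:
            out.append(ch)
    return "".join(out)


def twin_list_filter(body: bytes) -> str:
    """Stage 1: undo Quoted-Printable. Stage 2: ISO-8859-7 to ASCII Greek."""
    raw = quopri.decodestring(body)                    # back to 8-bit bytes
    text = raw.decode("iso8859_7", errors="replace")   # interpret as Greek
    return to_ascii_greek(text)


if __name__ == "__main__":
    print(twin_list_filter(b"=EA=E1=EB=E7=EC=DD=F1=E1"))  # kalhm;era
```

Keeping the decoding step separate from the transliteration step mirrors the reasoning given above: the second stage only ever sees 8-bit-clean text and can stay a simple character-by-character table lookup.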
Unicode: We are currently evaluating how the Greek Internet community
might switch to Unicode, which seems to be the inevitable next step in
internationalization.
First attempts at UTF-8 scripts
- iso2uni.pl
- Converts ISO-8859-7 compliant text to UTF-8 Unicode. STRICT adherence to ISO-8859-7.
- win2uni.pl
- Converts Windows-1253 compliant text to UTF-8 Unicode. STRICT adherence to Windows-1253.
- uni2iso.pl
- Converts UTF-8 Unicode to ISO-8859-7. Rough code; does not handle Unicode characters outside ISO-8859-7.
- utf82iso.pl
- Converts UTF-8 Unicode to ISO-8859-7 using a different character mapping. Very simple code; does not handle Unicode characters outside ISO-8859-7, and possibly other cases.
- uni2win.pl
- Converts UTF-8 Unicode to Windows-1253. Rough code; does not handle Unicode characters outside Windows-1253.
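For comparison, here is a minimal sketch of the same conversions in Python, whose standard codecs include both ISO-8859-7 ("iso8859_7") and Windows-1253 ("cp1253"). It is not a port of the scripts listed above, and the function names and the "?" fallback are assumptions. It shows one possible way to deal with the limitation noted for the reverse direction: characters that have no place in the 8-bit set are replaced rather than left unhandled.

```python
def to_utf8(data: bytes, codec: str = "iso8859_7") -> bytes:
    """Strict 8-bit Greek to UTF-8: undefined byte values raise UnicodeDecodeError."""
    return data.decode(codec).encode("utf-8")


def from_utf8(data: bytes, codec: str = "iso8859_7") -> bytes:
    """UTF-8 to 8-bit Greek; characters outside the target set become '?'."""
    return data.decode("utf-8").encode(codec, errors="replace")


if __name__ == "__main__":
    greek = b"\xea\xe1\xeb\xe7"          # four ISO-8859-7 bytes
    utf8 = to_utf8(greek)
    print(utf8.decode("utf-8"))
    print(from_utf8(utf8) == greek)      # round trip is lossless for Greek text
    # The same letter sits at different byte values in the two 8-bit sets:
    print(to_utf8(b"\xb6") == to_utf8(b"\xa2", "cp1253"))
```

Passing "cp1253" as the codec gives the Windows-1253 variants; the strict decode in to_utf8 is what rejects byte values that the chosen character set leaves undefined.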