

perl scripts converting Hungarian characters with diacritics

Since ASCII does not encode characters with diacritics, several conventions for representing them have evolved. Each is more or less useful for some purpose, but one usually wants to use the same text for several purposes, such as posting it on the web or printing a hard copy with a text formatter. Conversion is therefore unavoidable. The perl scripts below do just this for all the characters with diacritics used in Hungarian.

The characters involved
á, Á, ä, Ä, é, É, ë, Ë, í, Í, ó, Ó, ö, Ö, ô, Ô, ú, Ú, ü, Ü, û, Û
The encoding systems
Prószéky, TeX, Cork (8-bit TeX), ISO-8859, DOS style, html, abc (q.v.)
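To illustrate the kind of conversion involved, here is a minimal sketch of a table-driven converter. It is not taken from the scripts: it is written in Python rather than perl, the names TEX, HTML and convert are hypothetical, the input is assumed to be a Unicode string, and only the lowercase Hungarian letters are covered. The TeX accent commands and HTML entities used are the standard ones.

```python
# Hypothetical lookup tables (not from the scripts on this page).
# Keys are Unicode characters; values are the target notation.
TEX = {
    "á": r"\'a", "é": r"\'e", "í": r"\'{\i}", "ó": r"\'o", "ú": r"\'u",
    "ö": r'\"o', "ü": r'\"u',
    "ő": r"\H{o}", "ű": r"\H{u}",   # double acute ("Hungarian umlaut")
}

HTML = {
    "á": "&aacute;", "é": "&eacute;", "í": "&iacute;",
    "ó": "&oacute;", "ú": "&uacute;",
    "ö": "&ouml;", "ü": "&uuml;",
    "ő": "&#337;", "ű": "&#369;",   # numeric references for U+0151, U+0171
}


def convert(text, table):
    """Replace every character found in `table`; leave the rest untouched."""
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(convert("tűzoltó", TEX))   # t\H{u}zolt\'o
    print(convert("tűzoltó", HTML))
```

A real converter would also need the uppercase letters and, for the reverse direction, a pattern match on the multi-character sequences rather than a per-character lookup.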

There is a README file containing more details. The names of the scripts are mostly self-explanatory: select those you need below, or take them all (to unpack the archive under DOS you need the DOS versions of gzip and tar; the README file is included in the all-package).

to/from Prószéky
abc2p / p2abc
cork2p / p2cork
dos2p / p2dos
html2p / p2html
iso2p / p2iso
sgml2p / p2sgml
tex2p / p2tex
utf2extp / extp2utf
utf2p
more TeX related
cork2iso / iso2cork
dos2tex / tex2dos
html2tex / tex2html
to2cork
more html related
dos2html / html2dos
others
dnew2dold / dold2dnew
dos2iso / iso2dos
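Most of the script pairs above go to or from the Prószéky encoding, which suggests it can serve as a pivot: two encodings with no direct converter can still be connected by chaining two steps. The sketch below illustrates that idea only. It is Python, not the perl of the scripts, and the function names are hypothetical. It assumes the Prószéky convention of a digit after the vowel (1 for acute, 2 for umlaut, 3 for double acute); the README is the authority on what the scripts actually implement. Lowercase letters only.

```python
import re

# Assumed Prószéky-style codes: vowel + digit (1 acute, 2 umlaut, 3 double acute).
UTF_TO_P = {
    "á": "a1", "é": "e1", "í": "i1", "ó": "o1", "ú": "u1",
    "ö": "o2", "ü": "u2", "ő": "o3", "ű": "u3",
}

# Standard TeX accent commands for the same letters.
P_TO_TEX = {
    "a1": r"\'a", "e1": r"\'e", "i1": r"\'{\i}", "o1": r"\'o", "u1": r"\'u",
    "o2": r'\"o', "u2": r'\"u', "o3": r"\H{o}", "u3": r"\H{u}",
}


def to_pivot(text):
    """Unicode text -> pivot notation (hypothetical helper)."""
    return "".join(UTF_TO_P.get(ch, ch) for ch in text)


def pivot_to_tex(text):
    """Pivot notation -> TeX; unknown vowel+digit pairs are left as they are."""
    return re.sub(r"[aeiou][123]",
                  lambda m: P_TO_TEX.get(m.group(0), m.group(0)),
                  text)


def to_tex_via_pivot(text):
    """Chain the two steps instead of writing a direct converter."""
    return pivot_to_tex(to_pivot(text))


if __name__ == "__main__":
    print(to_pivot("könyv"))          # ko2nyv
    print(to_tex_via_pivot("könyv"))
```

One limitation of this naive version: a vowel followed by a literal digit 1, 2 or 3 in the source text is indistinguishable from a coded letter once it is in pivot form, so such input would need escaping.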

Comments welcome



© Péter Szigetvári, page last touched Sun Jan 27 00:01:25 CET 2002  best viewed with any browser