RFC: Improving Xft font selection [was: Xft comments]

Wednesday, 23 July 2008

        Raymond Toy (RT/EUS) writes:
...
 >>>>> "Stephen" == Stephen J Turnbull <stephen(a)xemacs.org> writes:
...
     Stephen> I suspect that the problem is here in the array `charset_table' in
     Stephen> src/objects-xlike-inc.c:
...
     Stephen>     { &Vcharset_thai_tis620, "Thai", "th" },
     Stephen>     { &Vcharset_arabic_iso8859_6, "Arabic", "ar" },
     Stephen>     { &Vcharset_hebrew_iso8859_8, "Hebrew", "he" },

 Hmm.  I tried that.  Thai and Arabic still don't work.  I also see
 that the entry for cyrillic_iso8859_5 appears to work, but the
 charset_table entry is NULL there.

Yeah, I'm not sure what's going on with Cyrillic.  There should be a
fallback to try to find a font that fits the character, but I'm not
sure how well it's implemented.  I've updated Thai and Arabic as
above, and Russian as you suggest, in the Hg repo.  That last *does*
make a difference for me.

Please test and see if it regresses for you.  In particular, I wonder
if you'll get a different font.

...
 It could very well be that I didn't set up Xft correctly.
[...]
...
 I'm pretty sure the fonts are correct.  At the very least those are
 the fonts in /usr/openwin/lib/locale/{ar,th}/X11/fonts/TrueType.

I don't see any problems.

...
 I also don't see any messages about checking if font foo handles bar
 like:

 checking if Courier New-12 handles Greek
 Xft font Courier New-12 supports el
 checking if Courier New Hebrew-12 handles Hebrew
 Xft font Courier New Hebrew-12 supports he

 There are no messages about Thai or Arabic.

Yeah, the problem is that these are Mule-specific charsets (xtis-0 and
Mule-Arabic-1, respectively), not the ISO ones in my list.  I'll see
what I can do about a quick fix (probably not that quick actually,
because these charsets are defined in Lisp which isn't available when
this table gets built), but clearly the fixed charset_table approach
is insufficient even on its own terms.

Here's the rationale, for anybody who wants to think about how we
could do it better.

In Mule, the charsets used by the coding-system-for-read are the
best indicator we have of the language.  Users of different languages
do have different font preferences, and multilingual users get
acculturated to each language's font customs.  So we do want to use
language information, and not just character repertoires.

Now in current Mule, we *always* have such charset information, so
that's the approach I started with.  As a hack, I just put in a C
array of structs (i.e., a table) keyed by charset.  The charsets I've
used are the standard ones, plus Big5.  To fix Arabic, Thai, and
Tigrigna, I just need to introduce the appropriate private charsets in
the table.
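
To make the table idea concrete, here is a minimal sketch in C.  It is
not the code in src/objects-xlike-inc.c: the real table is keyed by
pointers to charset objects (&Vcharset_...), while this sketch uses
plain strings so that it stands alone.  The struct name, the function
name, and the key spellings (including the two private-charset rows)
are assumptions made for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for charset_table: one row per charset. */
struct charset_lang {
    const char *charset;   /* charset key (illustrative spelling) */
    const char *language;  /* human-readable language name */
    const char *tag;       /* language tag handed to the font matcher */
};

static const struct charset_lang charset_lang_table[] = {
    /* Standard charsets, as in the rows quoted earlier. */
    { "thai-tis620",      "Thai",   "th" },
    { "arabic-iso8859-6", "Arabic", "ar" },
    { "hebrew-iso8859-8", "Hebrew", "he" },
    /* Private (Mule-specific) charsets need rows of their own,
       otherwise the lookup below finds nothing for them.
       These keys are assumed names, not real identifiers. */
    { "xtis-0",           "Thai",   "th" },
    { "mule-arabic-1",    "Arabic", "ar" },
};

/* Return the language tag for CHARSET, or NULL if it has no row. */
static const char *charset_lang_tag(const char *charset)
{
    size_t n = sizeof charset_lang_table / sizeof charset_lang_table[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(charset_lang_table[i].charset, charset) == 0)
            return charset_lang_table[i].tag;
    return NULL;
}
```

A NULL result corresponds to the case seen above for Thai and Arabic:
no row matches, so no language is ever offered to the font matcher.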

More generally, I think this table approach is an appropriate
heuristic in many circumstances, since legacy charsets are often
available, but it needs to be accessible from Lisp.

On the other hand we will be getting lots of Unicode texts with no
such information, so we will need to deduce language information from
the document's character repertoire.  Suggestions for algorithms to
accomplish this, and for detecting language switches within documents
:-), would be appreciated.
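
As one possible starting point, and only a rough sketch: count the
characters that fall in each Unicode script block and take the most
frequent one.  The block ranges below are the standard Unicode ones,
but mapping a script to a single language tag is a guess (Cyrillic is
not only Russian, Ethiopic is not only Tigrigna), and the names
script_ranges and guess_language are hypothetical.  Latin text is
deliberately ignored, since it says little about the language.

```c
#include <stddef.h>
#include <stdint.h>

/* One Unicode block and the language tag we guess for it. */
struct script_range {
    uint32_t lo, hi;
    const char *tag;
};

static const struct script_range script_ranges[] = {
    { 0x0370, 0x03FF, "el" },  /* Greek and Coptic */
    { 0x0400, 0x04FF, "ru" },  /* Cyrillic (language is a guess) */
    { 0x0590, 0x05FF, "he" },  /* Hebrew */
    { 0x0600, 0x06FF, "ar" },  /* Arabic */
    { 0x0E00, 0x0E7F, "th" },  /* Thai */
    { 0x1200, 0x137F, "ti" },  /* Ethiopic (language is a guess) */
};

#define N_SCRIPTS (sizeof script_ranges / sizeof script_ranges[0])

/* Return the tag of the most frequent listed script among the N code
   points in CPS, or NULL if none of them falls in a listed block.
   On a tie, the script listed first in the table wins. */
static const char *guess_language(const uint32_t *cps, size_t n)
{
    size_t counts[N_SCRIPTS] = {0};
    size_t best = N_SCRIPTS;

    for (size_t i = 0; i < n; i++)
        for (size_t s = 0; s < N_SCRIPTS; s++)
            if (cps[i] >= script_ranges[s].lo && cps[i] <= script_ranges[s].hi) {
                counts[s]++;
                break;
            }

    for (size_t s = 0; s < N_SCRIPTS; s++)
        if (counts[s] > 0 && (best == N_SCRIPTS || counts[s] > counts[best]))
            best = s;

    return best == N_SCRIPTS ? NULL : script_ranges[best].tag;
}
```

Running the same count over a sliding window of characters rather
than the whole buffer would be one crude way to notice a language
switch inside a document.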
