Tinkering with locales on Linux

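There is plenty of material online about locales on Linux, so there was no real need to add more. I couldn't resist poking around anyway, so consider this a set of notes. Everything below was done on CentOS 6.5.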

1. Viewing the current locale


  [chenhj@node1 ~]$ locale
  LANG=en_US.UTF-8
  LC_CTYPE="zh_CN.UTF8"
  LC_NUMERIC="zh_CN.UTF8"
  LC_TIME="zh_CN.UTF8"
  LC_COLLATE="zh_CN.UTF8"
  LC_MONETARY="zh_CN.UTF8"
  LC_MESSAGES="zh_CN.UTF8"
  LC_PAPER="zh_CN.UTF8"
  LC_NAME="zh_CN.UTF8"
  LC_ADDRESS="zh_CN.UTF8"
  LC_TELEPHONE="zh_CN.UTF8"
  LC_MEASUREMENT="zh_CN.UTF8"
  LC_IDENTIFICATION="zh_CN.UTF8"
  LC_ALL=zh_CN.UTF8
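The output above lists the locale categories the system supports, such as date/time and currency formatting, along with the locale currently in effect for each one.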
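
Each category can also be inspected on its own. As a small sketch (show_category is a helper name made up for these notes, not a standard command), locale -k prints the keywords of one category together with their values:

```shell
#!/bin/sh
# show_category is a hypothetical helper, not part of any package.
# Usage: show_category LOCALE CATEGORY
show_category() (
    # The function body runs in a subshell, so changing LC_ALL here
    # does not affect the caller's environment.
    LC_ALL=$1
    export LC_ALL
    locale -k "$2"
)

# The C locale always exists, so this runs on any glibc system.
show_category C LC_NUMERIC
```

Swapping C for zh_CN.utf8 and LC_NUMERIC for LC_TIME or LC_MONETARY shows what actually differs between locales.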


2. Setting the locale

The locale can be changed at any time by setting environment variables. Precedence, from highest to lowest:
LC_ALL > LC_* > LANG
[root@node1 ~]# export LC_ALL=zh_CN.utf8
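
The precedence rule can be written down as a few lines of shell. This is only a sketch of the rule as stated above, not what glibc literally executes; effective_locale is a made-up helper name, and the final fallback to POSIX assumes the usual behaviour when none of the variables is set.

```shell
#!/bin/sh
# effective_locale is a hypothetical helper that mirrors LC_ALL > LC_* > LANG.
# Usage: effective_locale CATEGORY    (for example LC_TIME)
effective_locale() (
    category=$1
    # 1. LC_ALL overrides everything when it is set and non-empty.
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
        return 0
    fi
    # 2. Otherwise the category's own variable is used.
    eval "value=\${$category:-}"
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    # 3. Otherwise LANG supplies the default.
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    # 4. Assumption: with nothing set, the default C/POSIX locale applies.
    else
        printf 'POSIX\n'
    fi
)

# LC_ALL is empty here, so LC_TIME wins over LANG.
( LC_ALL=""; LC_TIME=C; LANG=POSIX; effective_locale LC_TIME )   # prints C
```

Empty variables are treated the same as unset ones, which is why LC_ALL="" does not override anything.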

Alternatively, edit /etc/sysconfig/i18n to change the system-wide default:

  [root@node1 ~]# cat /etc/sysconfig/i18n
  LANG="en_US.UTF-8"
  SYSFONT="latarcyrheb-sun16"
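
The file is just a list of shell variable assignments, so it can be read by sourcing it. A minimal sketch, assuming that format; read_i18n_lang is a made-up helper, and the sample file below is a temporary copy of the listing above so that the example also runs on systems without /etc/sysconfig/i18n:

```shell
#!/bin/sh
# read_i18n_lang is a hypothetical helper: it sources FILE in a subshell
# and prints the LANG value the file assigns (empty if it assigns none).
read_i18n_lang() (
    unset LANG
    . "$1" 2>/dev/null
    printf '%s\n' "${LANG:-}"
)

# Temporary stand-in for /etc/sysconfig/i18n with the contents shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
LANG="en_US.UTF-8"
SYSFONT="latarcyrheb-sun16"
EOF

read_i18n_lang "$sample"
rm -f "$sample"
```

Because the sourcing happens in a subshell, the current shell's LANG is left alone.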

3. Listing the locales the system supports

For example, to list all the zh_CN locales:

  [chenhj@node1 ~]$ locale -a|grep zh_CN
  zh_CN
  zh_CN.gb18030
  zh_CN.gb2312
  zh_CN.gbk
  zh_CN.utf8
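
Note that locale -a prints zh_CN.utf8 while the locale is often written zh_CN.UTF-8. A script that checks whether a locale is installed therefore has to compare the names loosely. The sketch below does this in a simplified way, by lowercasing both names and dropping hyphens; has_locale and normalize_locale_name are made-up helper names, and the normalization is an approximation rather than glibc's exact rule.

```shell
#!/bin/sh
# Both helpers are hypothetical names used only for this sketch.

# Lowercase the name and remove hyphens: zh_CN.UTF-8 -> zh_cn.utf8
normalize_locale_name() {
    printf '%s\n' "$1" | LC_ALL=C tr '[:upper:]' '[:lower:]' | LC_ALL=C tr -d '-'
}

# Succeeds if LOCALE appears in the output of locale -a.
has_locale() (
    wanted=$(normalize_locale_name "$1")
    locale -a 2>/dev/null |
        LC_ALL=C tr '[:upper:]' '[:lower:]' |
        LC_ALL=C tr -d '-' |
        LC_ALL=C grep -xF -e "$wanted" >/dev/null
)

for name in C zh_CN.UTF-8; do
    if has_locale "$name"; then
        echo "$name: available"
    else
        echo "$name: not installed"
    fi
done
```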

4. Where are locales defined?

The definitions live in the /usr/share/i18n/locales directory. Take zh_CN as an example:
/usr/share/i18n/locales/zh_CN:

  ...
  LC_CTYPE
  % This is a copy of the "i18n" LC_CTYPE with the following modifications:
  % - Additional classes: hanzi

  copy "i18n"

  translit_start
  include "translit_combining";""
  translit_end

  class "hanzi"; /
  % <U3400>..<U4DBF>;/
          <U4E00>..<U9FA5>;/
          <UF92C>;<UF979>;<UF995>;<UF9E7>;<UF9F1>;<UFA0C>;<UFA0D>;<UFA0E>;/
          <UFA0F>;<UFA11>;<UFA13>;<UFA14>;<UFA18>;<UFA1F>;<UFA20>;<UFA21>;/
          <UFA23>;<UFA24>;<UFA27>;<UFA28>;<UFA29>
  END LC_CTYPE

  % ISO 14651 collation sequence
  LC_COLLATE
  copy "iso14651_t1_pinyin"
  END LC_COLLATE
  ...

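The LC_CTYPE section above defines the character class "hanzi" for Simplified Chinese, and the LC_COLLATE section selects pinyin ordering for Chinese characters.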


We can also open the file that defines the pinyin ordering:
/usr/share/i18n/locales/iso14651_t1_pinyin

  LC_COLLATE

  copy "iso14651_t1_common"

  script <HAN>

  order_start <HAN>;forward;forward;forward;forward,position
  <U5416> <U5416>;IGNORE;IGNORE;IGNORE #吖104
  <U814C> <U814C>;IGNORE;IGNORE;IGNORE #腌185
  <U9312> <U9312>;IGNORE;IGNORE;IGNORE #錒0
  <U9515> <U9515>;IGNORE;IGNORE;IGNORE #锕7
  <U963F> <U963F>;IGNORE;IGNORE;IGNORE #阿23237
  <U55C4> <U55C4>;IGNORE;IGNORE;IGNORE #嗄60
  <U554A> <U554A>;IGNORE;IGNORE;IGNORE #啊16566
  <U54C0> <U54C0>;IGNORE;IGNORE;IGNORE #哀4070
  <U54CE> <U54CE>;IGNORE;IGNORE;IGNORE #哎2473
  ...

A glance at the comments makes it clear: the entries really are listed in pinyin order.
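
The effect of LC_COLLATE is easy to see with sort. The sketch below sorts three of the characters from the listing, first in the C locale (plain byte order) and then, only if it is installed, in zh_CN.utf8. sort_in_locale is a made-up helper name, and the script is assumed to be saved as UTF-8.

```shell
#!/bin/sh
# sort_in_locale is a hypothetical helper.
# Usage: sort_in_locale LOCALE WORD...
sort_in_locale() (
    LC_ALL=$1
    export LC_ALL
    shift
    # One word per line, sorted under the chosen collation rules.
    printf '%s\n' "$@" | sort
)

# C locale: comparison by byte value, which gives 哀 啊 阿 for UTF-8 input.
sort_in_locale C 阿 啊 哀

# zh_CN.utf8: uses the compiled LC_COLLATE rules, when that locale exists.
if locale -a 2>/dev/null | LC_ALL=C grep -x 'zh_CN.utf8' >/dev/null; then
    sort_in_locale zh_CN.utf8 阿 啊 哀
fi
```

With the definition file shown above, the zh_CN.utf8 run should follow the order of the entries in that file instead of the byte order; the exact result depends on the glibc version.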

5. Where are character sets defined?

Character sets are all defined under the /usr/share/i18n/charmaps directory. Take GB2312 as an example:
/usr/share/i18n/charmaps/GB2312.gz

  <code_set_name> GB2312
  <mb_cur_max> 2
  <mb_cur_min> 1
  <comment_char> %
  <escape_char> /
  % Chinese charmap for EUC-CN = GB2312 = union of ASCII and GB_2312-80
  % version: 1.0
  % Contact: ha_shao
  % Email: hashao@china.com
  % Distribution and use is free, even for comercial purpose.
  %
  CHARMAP
  <U0000> /x00 NULL (NUL)
  <U0001> /x01 START OF HEADING (SOH)
  <U0002> /x02 START OF TEXT (STX)
  <U0003> /x03 END OF TEXT (ETX)
  <U0004> /x04 END OF TRANSMISSION (EOT)
  <U0005> /x05 ENQUIRY (ENQ)
  <U0006> /x06 ACKNOWLEDGE (ACK)
  <U0007> /x07 BELL (BEL)
  <U0008> /x08 BACKSPACE (BS)
  <U0009> /x09 CHARACTER TABULATION (HT)
  ...
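
Each CHARMAP line maps a Unicode code point to the byte sequence used in that character set. The same mapping can be observed from the outside with iconv. A small sketch, assuming the script is saved as UTF-8 and that iconv supports the GB2312 name on the system; hex_in_charset is a made-up helper:

```shell
#!/bin/sh
# hex_in_charset is a hypothetical helper.
# Usage: hex_in_charset CHARSET TEXT
# Converts TEXT from UTF-8 to CHARSET and prints the resulting bytes in hex.
hex_in_charset() {
    printf '%s' "$2" | iconv -f UTF-8 -t "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

hex_in_charset GB2312 A     # 41: ASCII is a single byte
hex_in_charset GB2312 啊    # b0a1: U+554A takes two bytes in GB2312
```

The one-byte and two-byte results match the <mb_cur_min> 1 and <mb_cur_max> 2 header values in the charmap.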



6. Creating a locale

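The locale definitions and character set definitions described above are the equivalent of source code. What we actually use is a compiled locale, built from one locale definition plus one character set definition. Locales are compiled with localedef.
From man localedef: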

  ...
  The localedef program reads the indicated charmap and input files, compiles them to a form usable by the locale(7) functions in the C library, and places the six output files in the outputpath directory.
  ...

Let's define one and try it out!

  [root@node1 ~]# localedef -f UTF-8 -i zh_CN myzh
  [root@node1 ~]# locale -a|grep myzh
  myzh
  myzh.utf8
The new locale has been added to /usr/lib/locale/locale-archive:

  [root@node1 ~]# grep myzh /usr/lib/locale/locale-archive
  Binary file /usr/lib/locale/locale-archive matches
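
For experiments it can be handy to leave the system archive alone. As far as I know, when the output name given to localedef contains a slash, the compiled files are written to that directory instead of the archive, and glibc will look there if LOCPATH points at the parent directory. The sketch below relies on both of those assumptions; build_private_locale is a made-up helper, and it does nothing useful on systems where localedef or the /usr/share/i18n sources are not installed.

```shell
#!/bin/sh
# build_private_locale is a hypothetical helper.
# Usage: build_private_locale SOURCE CHARMAP OUTDIR
build_private_locale() (
    src=$1 charmap=$2 outdir=$3
    i18n=/usr/share/i18n
    if ! command -v localedef >/dev/null 2>&1 || [ ! -f "$i18n/locales/$src" ]; then
        echo "localedef or $i18n/locales/$src is missing" >&2
        return 2
    fi
    # A path containing a slash makes localedef write a directory, not the archive.
    case $outdir in
        */*) ;;
        *) outdir=./$outdir ;;
    esac
    mkdir -p "$outdir" || return 2
    # localedef can exit non-zero for mere warnings, so judge by the output file.
    localedef -f "$charmap" -i "$src" "$outdir" >/dev/null 2>&1 || true
    [ -f "$outdir/LC_CTYPE" ]
)

demo_root=$(mktemp -d)
if build_private_locale zh_CN UTF-8 "$demo_root/myzh"; then
    # LOCPATH makes glibc search $demo_root for a locale named myzh.
    LOCPATH=$demo_root LC_ALL=myzh locale charmap
fi
rm -rf "$demo_root"
```

No root access is needed this way, and cleaning up is just a matter of deleting the directory.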

7. Getting localized messages


  [root@node1 ~]# export LC_ALL=myzh.utf8
  [root@node1 ~]# ls xx
  ls: cannot access xx: No such file or directory

Why is the message still in English?
Let's see what ls is actually doing!

  [root@node1 ~]# strace -eopen ls xx
  open("/etc/ld.so.cache", O_RDONLY) = 3
  open("/lib64/libselinux.so.1", O_RDONLY) = 3
  open("/lib64/librt.so.1", O_RDONLY) = 3
  open("/lib64/libcap.so.2", O_RDONLY) = 3
  open("/lib64/libacl.so.1", O_RDONLY) = 3
  open("/lib64/libc.so.6", O_RDONLY) = 3
  open("/lib64/libdl.so.2", O_RDONLY) = 3
  open("/lib64/libpthread.so.0", O_RDONLY) = 3
  open("/lib64/libattr.so.1", O_RDONLY) = 3
  open("/proc/filesystems", O_RDONLY) = 3
  open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
  open("/usr/share/locale/locale.alias", O_RDONLY) = 3
  open("/usr/share/locale/myzh.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  open("/usr/share/locale/myzh/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  ls: cannot access xxopen("/usr/share/locale/myzh.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  open("/usr/share/locale/myzh/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  : No such file or directory
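So the problem is that the localized message catalogs (coreutils.mo and libc.mo) cannot be found for the new locale. localedef only defines the basic content of a locale; the localized resources each application uses have to be added separately.
For now, let's borrow them from an existing locale as a stopgap!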

  [root@node1 ~]# ls /usr/share/locale/myzh
  ls: cannot access /usr/share/locale/myzh: No such file or directory
  [root@node1 ~]# ln -sf /usr/share/locale/zh_CN /usr/share/locale/myzh

Try again, and now it works.

  [root@node1 ~]# ls xx
  ls: 无法访问xx: 没有那个文件或目录
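
The lookups seen in the strace output follow a simple pattern: first the locale name as given, then the same name with the codeset suffix removed. The sketch below reproduces only those two steps (the real gettext lookup may try further variants); mo_candidates is a made-up helper name.

```shell
#!/bin/sh
# mo_candidates is a hypothetical helper.
# Usage: mo_candidates LOCALE DOMAIN [ROOT]
# Prints the catalog paths in the order they appeared in the strace output.
mo_candidates() (
    loc=$1 domain=$2 root=${3:-/usr/share/locale}
    printf '%s/%s/LC_MESSAGES/%s.mo\n' "$root" "$loc" "$domain"
    case $loc in
        # Only names with a codeset suffix (myzh.utf8) get a second candidate (myzh).
        *.*) printf '%s/%s/LC_MESSAGES/%s.mo\n' "$root" "${loc%%.*}" "$domain" ;;
    esac
)

# Report which catalogs exist for a given locale and text domain.
mo_candidates myzh.utf8 coreutils | while IFS= read -r path; do
    if [ -e "$path" ]; then
        echo "found:   $path"
    else
        echo "missing: $path"
    fi
done
```

This also explains why the symlink named myzh is enough: the second candidate resolves through it to the zh_CN catalogs.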

Finally, delete the temporary locale:

  [root@node1 ~]# localedef --delete-from-archive myzh
  [root@node1 ~]# rm -f /usr/share/locale/myzh




