Archive for September, 2006

Improving the memorizability of EDICT entries

2006 September 16

The data for my word list comes from EDICT. The major hurdle in literally memorizing the English meanings is the lack of convention for related words. One example:

英語 [えいご]: the English language
日本語 [にほんご]: Japanese language

Why does the word “the” precede “English language” but not “Japanese language”?

Another example:

一日 [いちにち]: one day/first of month
二日 [ふつか]: two days/second day of the month
三日 [みっか]: three days/the third day (of the month)
四日 [よっか]: four days/4th day of month

“One day”, “two days”, “three days”, “four days”, no problem with that. However, the format is inconsistent for the other sense.

To improve the memorizability, I often modify entries in my word list. For the first example, I remove the “the” from “the English language”, making it use the same convention as “Japanese language”. For the second example, I consistently apply the format “x day of the month”. So, they become “first day of the month”, “second day of the month”, “third day of the month”, “fourth day of the month”, etc.

The latest round of changes concerns the use of “e.g.” vs. “etc.”. Example:

士気 [しき]: morale (of troops, team, etc.)
図 [ず]: figure (e.g., Fig 1)
開く [ひらく]: to open (e.g., a bank-account, festival, etc.)

Some use “e.g.”, some use “etc.”, and some use both. Using both is clearly redundant. “e.g.” means “for example”, which implies non-exhaustiveness. Therefore “etc.” (“and others”), which also implies non-exhaustiveness, is not needed.

I decided to use “e.g.”, and here are the entries that are affected:

Old -> New
morale (of troops, team, etc.) -> morale (e.g., troops, team)
to take (e.g., time, money, etc) -> to take (e.g., time, money)
times (three times, each time, etc.) -> times (e.g., three times, each time)
to flow (liquid, time, etc.), to be washed away -> to flow (e.g., liquid, time), to be washed away
to open (e.g., a bank-account, festival, etc.) -> to open (e.g., bank account, festival)
(Why the need for “a” and “-” anyway?)
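The mechanical part of this clean-up could be scripted. Below is a minimal Python sketch, assuming each gloss is a plain string; the function name `normalize_gloss` is hypothetical. It only moves a parenthetical from the “etc.” style to the “e.g.” style. Wording changes such as dropping “of” or “a”, or replacing the hyphen in “bank-account”, would still be done by hand.

```python
import re

_PAREN = re.compile(r"\(([^()]*)\)")        # one parenthetical, no nesting
_ETC = re.compile(r",?\s*\betc\.?\s*$")     # trailing ", etc." or ", etc"
_EG = re.compile(r"^e\.g\.,?\s*")           # leading "e.g.," or "e.g."


def normalize_gloss(gloss: str) -> str:
    """Rewrite '(a, b, etc.)' and '(e.g., a, b, etc.)' as '(e.g., a, b)'."""

    def fix(match: re.Match) -> str:
        inner = match.group(1)
        # Leave parentheticals alone unless they use "etc." or "e.g.".
        if not (_ETC.search(inner) or _EG.match(inner)):
            return match.group(0)
        inner = _EG.sub("", _ETC.sub("", inner)).strip()
        return "(e.g., " + inner + ")"

    return _PAREN.sub(fix, gloss)


print(normalize_gloss("to take (e.g., time, money, etc)"))
print(normalize_gloss("to flow (liquid, time, etc.), to be washed away"))
```

Parentheticals that contain neither marker, such as “(of the month)”, are left untouched.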

Every time I make a change to my word list, I also have to synchronize the entries in Mnemosyne. What a disintegrated mess.

PS: “e.g.” is from the Latin “exempli gratia”.

私のfirst 1500 words (another edition)

2006 September 16

The first printout of my first 1500 (or so) words is optimized to memorize the English meaning of a Japanese word. This is because the format is:


So, to drill the words, I cover the English meaning (and kana, if kanji exists) with a piece of paper and then try to answer it.

For drilling the reverse, the ideal format is:

ENGLISH     KANJI_1 (KANA_1), KANJI_2 (KANA_2), ...

This time, it’s the kanji and kana that are covered and I must translate from English to Japanese.

Of course, we can use the first format to drill English->Japanese by covering the kanji and kana. However, there are two shortcomings:

  • The entries are sorted by kanji/kana. Therefore the sound of the answers will be similar if we drill from top to bottom. This provides an unwanted clue.
  • Entries with the same English (for example 本日 and 今日 for “today”) are separated. Therefore there is ambiguity when trying to answer some items.

Thus the need for another format.

Preparing the first one isn’t hard because that’s exactly the format I use in my ods file. To prepare the second format, I wrote a filter for my text transformation program, LineFilter. The result is here.
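The transformation itself is small. Here is a minimal Python sketch of the idea (not the actual LineFilter filter), assuming each entry is a (kanji, kana, english) tuple with an empty kanji for kana-only words; the function name and the sample rows are illustrative. Grouping by the English meaning puts synonyms on one line, and sorting by English removes the sound-alike clue.

```python
from collections import defaultdict


def reverse_format(entries):
    """Group (kanji, kana, english) entries by their English meaning."""
    by_english = defaultdict(list)
    for kanji, kana, english in entries:
        # Kana-only words have no kanji column, so show the kana alone.
        by_english[english].append(f"{kanji} ({kana})" if kanji else kana)
    # Sort by the English text so neighbouring answers no longer sound alike.
    return [f"{english}\t{', '.join(words)}"
            for english, words in sorted(by_english.items())]


entries = [
    ("今日", "きょう", "today"),
    ("本日", "ほんじつ", "today"),
    ("明日", "あした", "tomorrow"),
]
for line in reverse_format(entries):
    print(line)
```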

Setting print area in Calc

2006 September 16
Format -> Print Ranges -> Define

After doing the above step, printing-related operations like “Page Preview” and “Print” (with the “selection” option selected for “Print range”) will act upon the defined print area.

Authentic Japanese

2006 September 16

This morning (September 15th), I went to SIC. When I was heading to the entrance, I saw three Asian foreigners. They were tall and wore black suits.

I didn’t really care, but when I passed them, one of them said something that was recognizably Japanese:


The sound was heavy and most important of all, authentic! Hearing it spoken by a native speaker just some inches away was a really gratifying experience.

Keeping a memorized pro game memorized

2006 September 16

(Written on… Uhm… No more “written on” stuff. Too much of a hassle.)

I’ve memorized a lot of Takemiya games. However, as time passed, I gradually forgot the moves. Only general impressions like “Takemiya likes the san-ren-sei opening” and “Takemiya sometimes replies to a keima-gakari with a kosumi” remain.

That is undesirable. Ideally, I want to remember the games forever. Therefore, periodic review is a must. Mnemosyne, anyone?

Mnemosyne is a program that helps you memorize items. It will smartly schedule reviews for you. Items that you already know well will be asked rarely, while items that aren’t memorized well will be asked more frequently. I use Mnemosyne to memorize Japanese words, among others.

We can use simple HTML tags like <b> and <i> in Mnemosyne, so I thought applets could be supported. My first idea was to create a **cough** Java applet myself and embed it inside Mnemosyne. However, following the principle of “don’t reinvent the wheel” and considering that I had to dig through a lot of documentation just to get a “Hello World” applet running (almost no past experience making applets), I tried finding other people’s applets first.

So, I dug through Hiroki Mori’s “Interactive Way To Go” because I remembered there was a game replayer there. This is what I found:

<applet codebase="./../java" code="mori/go/FreeBoardApplet.class"
   width=300 height=350 align=left border=10>
	<param name=demo value=true>
	<param name=size value=13>
	<param name=init value="B[cc]W[kk]B[dj]W[kd]">
	<param name=moves value="B[fk]W[ki]B[id]W[ic]B[hc]W[jc]B[gd]W[cf]B[ch]W[dc]">
</applet>

The parameters were easy enough to decode: give it a board size, initial moves, and navigable moves. Coordinates are in SGF style, which is <column><row> with “aa” at the top left (the letter ‘i’ is used, not skipped).
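To make the coordinate convention concrete, here is a small Python sketch that converts between SGF points and zero-based (column, row) pairs counted from the top left. The function names are hypothetical, and the letter range a–s assumes boards no larger than 19x19.

```python
def sgf_to_xy(coord):
    """Convert an SGF point such as "cc" to zero-based (column, row)."""
    if len(coord) != 2 or not all("a" <= c <= "s" for c in coord):
        raise ValueError(f"not an SGF point: {coord!r}")
    # 'a' is 0, 'b' is 1, ... and 'i' is 8; no letter is skipped.
    return ord(coord[0]) - ord("a"), ord(coord[1]) - ord("a")


def xy_to_sgf(col, row):
    """Convert zero-based (column, row) back to an SGF point."""
    return chr(ord("a") + col) + chr(ord("a") + row)


print(sgf_to_xy("cc"))   # third column, third row from the top left
print(xy_to_sgf(8, 8))   # uses the letter 'i'
```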

I usually memorize the first 50 moves of a game. My idea is to divide the moves into 10-move chunks. So, a Mnemosyne item will test moves 1-10, another moves 11-20, and so on.

However, if I want to test moves 11-20, it helps if move 10 is marked. Since the applet only marks the last move played (from the “moves” parameter), move 10 shouldn’t be in “init” but in “moves”. It will be a dummy move, just to provide the mark.
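The split between “init” and “moves” can be sketched in Python as follows. This is an illustration only: the function name `applet_params` and the zero-based `part` argument are assumptions, and `moves` is taken to be a list of strings such as "B[fk]" in game order.

```python
def applet_params(moves, part, chunk=10):
    """Return (init, moves) parameter strings for one review part.

    part 0 tests moves 1-10, part 1 tests moves 11-20, and so on.
    """
    first = part * chunk        # index of the first move to be tested
    lead = max(first - 1, 0)    # index of the dummy move (none for part 0)
    init = "".join(moves[:lead])                 # placed without navigation
    shown = "".join(moves[lead:first + chunk])   # dummy move + tested moves
    return init, shown


# Made-up move list, only to show the slicing.
sample = [f"{'BW'[i % 2]}[{chr(97 + i % 19)}{chr(97 + i // 19)}]"
          for i in range(50)]
init, shown = applet_params(sample, 1)
print(init)    # moves 1-9
print(shown)   # move 10 (the dummy) followed by moves 11-20
```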

Sadly, the applet wouldn’t appear inside Mnemosyne. Therefore I changed the usage scenario to the following:

  • A Mnemosyne item will contain a kifu review code, for example: “Kifu review: ab” which means game ‘a’ part ‘b’ (part ‘a’ is moves 1-10, part ‘b’ moves 11-20, etc.)
  • Upon seeing that code, I will go to Firefox and type “k ab” (‘k’ stands for “Kifu review”).
  • Firefox will be configured such that ‘k’ corresponds to a bookmark, such as “http://localhost/kifureviewer/?code=cc”.
  • A page will appear with the applet mentioned before. I will replay the kifu and report the result back to Mnemosyne.
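Decoding the review code is simple arithmetic. A minimal Python sketch, assuming two-character codes with a lowercase part letter; the function name is hypothetical:

```python
def parse_review_code(code, chunk=10):
    """Split a code such as "ab" into (game, first_move, last_move)."""
    if len(code) != 2 or not ("a" <= code[1] <= "z"):
        raise ValueError(f"bad review code: {code!r}")
    game, part = code[0], ord(code[1]) - ord("a")
    first = part * chunk + 1          # part 'a' starts at move 1
    return game, first, first + chunk - 1


print(parse_review_code("ab"))   # game 'a', moves 11-20
```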

The process is quite disintegrated, going back and forth between Mnemosyne and Firefox.

For the web application, I decided to use ASP.NET. This is because I planned to load the game data from the SGF file, and I’m already comfortable with .NET’s file I/O. I used Mono 1.1.7’s XSP as the web server.

I had almost no ASP.NET experience, so I needed to peek at the documentation even to get a Hello World running. Here’s a sample:

<%@ Page Language="C#" %>
<html>
	<head><title><% Response.Write(DateTime.Now); %></title></head>
	<body><%
		for(int i = 0; i < 10; i++)
			Response.Write("<p>Hello world!</p>");
	%></body>
</html>

Other than that, I searched for how to import namespaces (I needed System.IO, for example, and didn’t want to type it out everywhere). The answer is to put something like…

<%@ Import Namespace="System.IO" %>

…below the “Page Language” thing.

Next, I searched for where to define classes and static functions. The answer is to put them inside…

<script runat="server">

…which is put before the <html> tag.

The last info I needed was how to fetch the HTTP GET variables. This example…


…will get the value of the HTTP GET variable named “code”.

From there it wasn’t that hard, just parsing the SGF file and giving the correct parameters to the applet. Normal C# coding, which I’m already comfortable with.

I didn’t read the SGF format specification, but guessing the tag meanings wasn’t that hard. I assumed the SGF to be non-branching, and probably made a lot of other simplifying assumptions. Below are the first few lines of an SGF file:

(;DT[2006-09-01]EV[3rd Toyota Cup]RO[semi-final]
PB[Lee Sedol]BR[9p]PW[Lee Changho]WR[9p]
KM[6.5]RE[B+R]SO[Moyo Go Studio]
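Under the same simplifying assumptions (no branches, no passes, no move-like text inside comments), the moves can be pulled out with a single regular expression. This is a Python sketch for illustration, not the C# used for the page, and the three moves in the sample string are made up.

```python
import re

# A move node looks like ";B[qd]" or ";W[dc]". Header properties such as
# PB[...] or WR[...] are not matched because they do not directly follow ';'
# as a lone B or W.
MOVE = re.compile(r";\s*([BW])\[([a-s]{2})\]")


def main_line_moves(sgf_text):
    """Return the moves of a non-branching SGF as strings like 'B[qd]'."""
    return [f"{color}[{point}]" for color, point in MOVE.findall(sgf_text)]


sample_sgf = (
    "(;DT[2006-09-01]EV[3rd Toyota Cup]RO[semi-final]\n"
    "PB[Lee Sedol]BR[9p]PW[Lee Changho]WR[9p]\n"
    "KM[6.5]RE[B+R]SO[Moyo Go Studio]\n"
    ";B[qd];W[dc];B[pq])"   # hypothetical moves, not the real game
)
print(main_line_moves(sample_sgf))
```

The resulting strings can be joined directly into the applet’s “init” and “moves” parameters.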

The mapping from game codes to SGF file names is in the file “database.txt” (location and file name configurable in the aspx file) which contains for example:

a	20060901_Lee-Sedol_Lee-Changho.sgf
b	20060830_Lee-Sedol_Hane-Naoki.sgf
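Reading that file amounts to splitting each line at the tab. A minimal Python sketch, assuming one tab-separated pair per line; the function name is hypothetical:

```python
def parse_database(text):
    """Map game codes to SGF file names from tab-separated lines."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue                      # ignore blank lines
        code, filename = line.split("\t", 1)
        mapping[code.strip()] = filename.strip()
    return mapping


database = (
    "a\t20060901_Lee-Sedol_Lee-Changho.sgf\n"
    "b\t20060830_Lee-Sedol_Hane-Naoki.sgf\n"
)
print(parse_database(database)["a"])
```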

The path of the SGF file is also configurable in the aspx file.

However, testing quickly revealed a critical problem:

Illegal Go position

It is clear that when placing stones from the “init” parameter, the applet doesn’t check for captures.

There were two alternatives if I still wanted to use that applet. The first is to ask for the source code of the applet and modify it. The second is to do the capture checking myself before giving the parameters to the applet.

I chose the latter, using a class from Sai, the Go-playing program I wrote with Fuad and Awang. A little hack here, a little hack there, and the board was displayed properly:

Legal Go position

Of course the prisoner count was wrong :). I don’t think there is a way to tell the applet about the initial prisoner count.
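For readers unfamiliar with capture checking, the usual approach is a flood fill: after each stone is placed, look at every adjacent enemy group and remove it if it has no empty neighbouring point left. The Python sketch below illustrates that idea; it is not the code from Sai. The board is assumed to be a dict from (column, row) to "B" or "W", suicide and ko are ignored, and all names are hypothetical.

```python
def neighbors(x, y, size):
    """Yield the on-board points orthogonally adjacent to (x, y)."""
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < size and 0 <= ny < size:
            yield nx, ny


def group_and_liberties(board, start, size):
    """Flood-fill the group containing start; return (stones, liberties)."""
    color = board[start]
    group, liberties, stack = {start}, set(), [start]
    while stack:
        x, y = stack.pop()
        for n in neighbors(x, y, size):
            if n not in board:
                liberties.add(n)
            elif board[n] == color and n not in group:
                group.add(n)
                stack.append(n)
    return group, liberties


def play(board, color, point, size):
    """Place a stone, remove captured enemy groups, return stones captured."""
    board[point] = color
    captured = 0
    for n in neighbors(*point, size):
        # A neighbour may already be gone if its group was just removed.
        if n in board and board[n] != color:
            group, libs = group_and_liberties(board, n, size)
            if not libs:
                for p in group:
                    del board[p]
                captured += len(group)
    return captured


board = {}
play(board, "W", (0, 0), 9)          # white stone in the corner
play(board, "B", (1, 0), 9)
print(play(board, "B", (0, 1), 9))   # black takes the last liberty
```

Replaying the “init” moves through `play` and emitting only the stones left on the board gives a legal position, and summing the return values per colour would give the prisoner count that the applet cannot be told about.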

Anyway, it supports handicap:

Handicap support in KifuReviewer

However, being a Frankenstein solution, there are many things to improve:

  • The applet should be inside Mnemosyne. (at least I informed the maintainer)
  • Dummy move should be eliminated. The board should start with the last move marked.
  • User should guess by clicking on the desired coordinate, not by clicking the next button.
  • Prisoner count should be correct.

At least now I have a not-completely-manual means of keeping a game memorized indefinitely. One game is already in Mnemosyne, and a lot more will certainly come…

Unicode support in desktop blog clients

2006 September 12

What blogging tool can work with the Japanese hiragana character あ (‘a’) properly? Let’s test all the listed programs. This post is meant to raise awareness of the dismal Unicode support in today’s blogging programs.

First of all, the test system is Windows 98. This is the OS used by the computers in the Student Internet Center (SIC), and SIC is my gateway to the internet. Unicode works completely fine in Firefox, IE, and Notepad++ here. It’s OK if a program requires Java because it is installed on this computer. It’s fine if a program requires .NET 1.1 because installing .NET 1.1 doesn’t take much time. A program that requires .NET 2.0 is unacceptable because installing it takes ages on this slow computer (it compiles many files to native code using ngen.exe). Mono 1.1.17 can’t be used as a replacement for .NET 2.0 because Mono can only be installed on NT machines.

These are the programs that aren’t reviewed due to the aforementioned restrictions:

  • BlogWriter 1.0.29 (Zoundry): Couldn’t start on Windows 98 (ZBLOGWRITER caused an exception 10H in module PYTHON23.DLL).
  • ecto 2.1: Needs .NET 2.0.
  • Elicit 1.1.7: Needs .NET 2.0.
  • Windows Live Writer Beta: Won’t run on Windows 98 (I’ve already installed .NET 1.1 (does it need 2.0?). It pinvokes a nonexistent function in Windows 98’s kernel32.dll)
  • JBlogEditor: Crashes before even posting.

Now on to those that could run. Some programs mercilessly convert あ to ‘?’. This happens everywhere: in the main and/or code editor and the title text box. Those programs are:

  • BlogDesk 2.6
  • w.bloggar 4.00 (as a bonus, it will crash when exiting)
  • Semagic (btw the version I downloaded (for Windows 98) can only be used for livejournal)
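A likely cause of the ‘?’ (an assumption, not something verified for these particular programs) is that the text is squeezed through a legacy single-byte code page somewhere between the editor and the server. The Python sketch below shows the effect; `cp1252` is picked only as an example code page, and the function name is hypothetical.

```python
def through_code_page(text, code_page="cp1252"):
    """Round-trip text through an encoding, replacing what it cannot hold."""
    return text.encode(code_page, errors="replace").decode(code_page)


print(through_code_page("あ"))            # the character is lost
print(through_code_page("あ", "utf-8"))   # the character survives
print("あ".encode("utf-8"))               # three bytes in UTF-8
```

Once the character has been replaced, nothing downstream can recover it, which matches the behaviour seen in the title text boxes.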

Some programs work better:

  • BlogJet: あ is displayed fine in the main editor. It is displayed as a square in the code editor. It turns into a question mark (?) when the post is sent to WordPress. あ is converted into ‘?’ in the title text box.
  • Post2Blog 1.23.3: あ is displayed fine in the main editor. It will be converted into squares when you enter the code editor. However, no character corruption occurs when the post is uploaded (just remember to not enter the code editor). あ is converted into ‘?’ in the title text box.
  • WB Editor 2.5.1: Same imperfections as Post2Blog. Weird for a .NET program. Doesn’t .NET use Unicode (UTF-16) internally?
  • Qumana 3.0.0: あ is displayed as a square in this program. However posting it to WordPress works.

Two programs are perfect, which means that the program displays あ correctly in the title text box, main editor, and code editor, and uploads the post with no character corruption. Here are the winners:

  • Flock (a Firefox-based browser with integrated blogging support, among others)
  • Performancing 1.3 (a Firefox extension)

There you go… Most of the blogging clients I reviewed failed. We really live in a primitive age of computing where software developers don’t care about supporting Unicode…

PS: I won’t use Flock or Performancing. Performancing treats newlines in the code editor as real line breaks, so my post will contain extra lines unless I do some unnatural deleting. Flock mangles the <pre> (or <PRE>) tag by converting the starting tag into <pRE> and the ending tag into </PRE>, destroying all newlines in the process (I’ve already reported those bugs, btw). I’m currently using Qumana to post this blog entry. The Japanese characters showing up as squares are tolerable because I edit the HTML file at home.

Time of day

2006 September 12

(Written on 9:42 PM 9/11/2006 GMT+7)

While memorizing Japanese words, I encounter lots of English words related to the time of the day. Some examples are morning, evening, daytime, and night.

I’ve previously taken those words for granted. However, revisiting them in a studious moment made me question my understanding. What is the definition of "morning"? When does "evening" start and end?

Using the definitions in Oxford Advanced Genie (this was a long time ago), I compiled a visual diagram using Inkscape. The PNG image is here. The SVG file is here but Firefox 1.5 can’t display it properly (because the SVG implementation is incomplete).

Those using a text-based browser can still read the relevant definitions:

night:

  1. the time of darkness between one day and the next
  2. the evening until you go to bed

daytime: the period during the day between the time when it gets light and the time when it gets dark

dawn/sunrise/daybreak: the time of day when light first appears

noon/midday: 12 o’clock in the middle of the day

dusk: the time of day when the light has almost gone, but it is not yet dark

midnight: 12 o’clock at night

morning: (…)

  1. the early part of the day from the time when people wake up until midday or before lunch
  2. the part of the day from midnight to midday

afternoon: the part of the day from 12 midday until about 6 o’clock

evening: the part of the day between the afternoon and the time you go to bed

Sorting strings: Calc, Explorer vs. Nautilus, dir vs. ls

2006 September 12

(Written on 9:26 PM 9/11/2006 GMT+7)

Sorting Japanese words in Calc

While sorting my Japanese words in OOo Calc, I noticed that the katakana ア was sorted in among the hiragana あ. After a curious investigation, I concluded that OOo Calc doesn’t distinguish between a hiragana and its corresponding katakana for sorting purposes. Uppercase and lowercase Latin letters are also regarded as the same.

Therefore, the starting condition will determine the "sorted" condition. For example, the following column won’t change if sorted:


But the same is true for this column:


Explorer works the same way as OOo Calc, treating capitals the same as their lowercase counterparts and hiragana the same as katakana:

Sorting in Explorer
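This kind of ordering can be imitated with a sort key that folds katakana onto hiragana and ignores letter case. The Python sketch below is only an illustration of the behaviour, not how Calc or Explorer are actually implemented; the function name `fold` is hypothetical. Because Python’s sort is stable, items with equal keys keep their starting order, which is why both あ, ア and ア, あ count as already sorted.

```python
def fold(text):
    """Sort key: katakana folded to hiragana, letter case ignored."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0x30A1 <= code <= 0x30F6:   # katakana ァ..ヶ
            ch = chr(code - 0x60)      # matching hiragana ぁ..ゖ
        out.append(ch)
    return "".join(out).casefold()


print(sorted(["あ", "ア"], key=fold))   # unchanged
print(sorted(["ア", "あ"], key=fold))   # also unchanged

names = ["ab", "イa", "Bb", "あa", "ba", "アb", "Aa", "いb"]
print(sorted(names, key=fold))
```

With this key the eight test file names come out in the same order as the `dir /o:n` listing.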

However, the dir program sorts katakana after hiragana, which is inconsistent with Explorer:

E:\Temp\sorting test>dir
 Volume in drive E is Archive
 Volume Serial Number is A809-0E48

 Directory of E:\Temp\sorting test

09/08/2006  08:13 PM    <DIR>          .
09/08/2006  08:13 PM    <DIR>          ..
09/08/2006  07:47 PM                 0 Aa
09/08/2006  07:47 PM                 0 ab
09/08/2006  07:47 PM                 0 ba
09/08/2006  07:47 PM                 0 Bb
09/08/2006  07:47 PM                 0 あa
09/08/2006  07:47 PM                 0 いb
09/08/2006  07:47 PM                 0 アb
09/08/2006  07:47 PM                 0 イa

But the behavior will change if we use /o:n (sort by name):

E:\Temp\sorting test>dir /o:n
 Volume in drive E is Archive
 Volume Serial Number is A809-0E48

 Directory of E:\Temp\sorting test

09/08/2006  08:13 PM    <DIR>          .
09/08/2006  08:13 PM    <DIR>          ..
09/08/2006  07:47 PM                 0 Aa
09/08/2006  07:47 PM                 0 ab
09/08/2006  07:47 PM                 0 ba
09/08/2006  07:47 PM                 0 Bb
09/08/2006  07:47 PM                 0 あa
09/08/2006  07:47 PM                 0 アb
09/08/2006  07:47 PM                 0 イa
09/08/2006  07:47 PM                 0 いb

This is weird because by default dir already sorts Latin letters by name (in other words, the default behavior should match /o:n).
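One possible explanation, offered as an assumption and not something established here: without /o, dir may simply list entries in the order the file system hands them over, and on NTFS that order is by upper-cased character code, not by a linguistic collation. Upper-casing merges A with a, but hiragana (U+3042 onward) and katakana (U+30A2 onward) stay in separate blocks. The Python sketch below reproduces the default listing under that assumption; the function name is hypothetical.

```python
def code_point_order(names):
    """Sort by upper-cased text, comparing raw character codes."""
    return sorted(names, key=str.upper)


names = ["ab", "イa", "Bb", "あa", "ba", "アb", "Aa", "いb"]
print(code_point_order(names))
```

The result matches the plain `dir` listing: Latin names compare case-insensitively, while every hiragana name precedes every katakana name.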

So how does Ubuntu 6.06 fare? I booted the Live CD and here’s Nautilus in action:

Broken sorting in Nautilus

Total mess! Why are kana interspersed among Latin letters? I couldn’t figure out how that program sorts…

ls (the console command "el-es") is no better:

ubuntu@ubuntu:/media/ntfs/Temp/sorting test$ ls -l
total 0
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 あa
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 イa
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 Aa
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 ab
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 いb
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 アb
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 ba
-r-xr-xr-x 1 root root 0 2006-09-08 12:47 Bb

I’ve reported those bugs to Ubuntu’s Launchpad.

Translation of IGN “Goama” 21

2006 September 11

Here is a translation of issue 21 of the online Go magazine IGN "Goama". The original article can be obtained from

The Samsung Cup is one of the international Go tournaments. Its format is a 32-player knockout, with a prize of 200,000,000 won (about $200,000). The quarterfinals have now been reached. All 4 Japanese players have been eliminated, and only players from Korea and China remain:

Paek Hong-seok, 4-dan (KR) vs Yu Bin, 9-dan (CN)
Ch’oe Ch’eol-han, 9-dan (KR) vs Chang Hao, 9-dan (CN)
Yi Ch’ang-ho (Lee Changho), 9-dan (KR) vs Piao Wenyao, 4-dan (CN)
Seo Pong-su, 9-dan (KR) vs Wang Yao, 6-dan (CN)

The tournament table (with kifu) can be viewed at:

In this issue we present interviews with Seo Pong-su, 9-dan and Paek Hong-seok, 4-dan.

"Seo Pong-su, 9-dan – Slayer of young players?"

Seo Pong-su, 9-dan, of Korea, aged 53, caused the main sensation by defeating two young Chinese players, Zhang Wei, 5-dan (the youngest player in the tournament) and Chen Yaoye, 5-dan (born 1989). This interview was taken right after his second win:

Q: Are you tired?
A: Only a little. They changed the main time from 3 hours to 2 hours, and that is good for me.

Q: What do you think of your opponent, Chen Yaoye, 5-dan?
A: My mind is clear. I am enjoying Go rather than trying to win, so it does not matter who my opponent is.

Q: What is the difference between young Korean and Chinese players?
A: I rarely get the chance to play young Korean players. Lately I always lose in the early rounds of every tournament.

Q: I know that you played more than 50 blitz games (10 seconds per move) with Park Seunghyeon, 5-dan at Kwon Kapyong’s school (the biggest Go school in Korea). Have you been playing a lot of Go lately?
A: Yes, he has helped me a lot. We played a lot of blitz to prepare for the Korean Baduk League. Usually I only study kifu (game records). It is my hobby: sometimes I do it all day. Very interesting!

Q: What are your plans for the future?
A: I have never thought about it. I am enjoying Go very much, and it is a great pleasure for me to be able to play with the young masters.

Another quarterfinalist, Paek Hong-seok, 4-dan (Korea), is the dark horse of this tournament.

Q: Congratulations! You defeated Luo Xihe, 9-dan, the previous Samsung Cup winner. How was the game?
A: I am happy. This is my first quarterfinal! I played a poor opening, but Luo, 9-dan blundered in the middle game. After that he made many overplays (excessive moves) and I punished them well.

Q: What do you think of the pairing after the 1/16 final? (A new pairing is made each round, separating players from the same country)
A: I tried to avoid the strong players in the first round, but in the 1/16 final their strength was more or less the same. I understood that I had to manage my time well against Luo Xihe, 9-dan.

Q: Is it true that Luo Xihe, 9-dan plays very fast?
A: Yes, he spent less than 10 seconds on some moves, playing the 50 opening moves in 2 minutes. After that he thought for quite a long time at the important moments. I usually play fast too, so his play put no pressure on me.

Q: What are your plans for the future?
A: I met my target of reaching the quarterfinals. Now I want to win one more game.

Translated by Agro Rachmatullah.

"You are welcome to republish any text material from the IGN "Goama" without commercial purposes: please note the source and put the link to"

My first 1500 words

2006 September 9

(Written on 11:34 AM 9/8/2006 GMT+7)

After tidying my word database, my word count dropped from 1542 to 1513. I’ve printed the words into a book using FinePrint. The book is titled "私のfirst 1500 words".

An initial check revealed that the meanings of many words didn’t immediately "click" in my mind. That’s why I’m going to drill those words.

All those words can be accessed here (sorted by kanji, then by kana).