Archive for September, 2006

Starting a new process on the .NET Framework

2006 September 30

To run another program from a .NET program, call the System.Diagnostics.Process.Start method. An example C# program is:

using System.Diagnostics;

class RunNewProcessTest
{
	// Launches Notepad as a new process.
	static void Main() { Process.Start("notepad"); }
}

In the example, Notepad will be invoked.

A document can also be passed to the Start method (for example “c:\test.txt”). If an application is associated with the file type, that application is launched; otherwise an exception is thrown.

If arguments must be passed to the invoked process, put them in the second argument of the Start method. An example is:

Process.Start("notepad", "c:\\test.txt");

(Note: in a C# string literal, the backslash (‘\’) must be escaped.)

In this regard, the Process.Start method differs from the system function of C and C++ (declared in stdlib.h for C, or cstdlib for C++), which takes the program and its arguments as a single command string:

system("notepad c:\\test.txt");

Embedding a resource into a .NET assembly and loading it

2006 September 30

This post has been moved to “Embedding a resource into a .NET assembly and loading it”. Please visit the new server.

McCune-Reischauer Converter

2006 September 27

I like to use the McCune-Reischauer form of Korean Go players’ names because that’s what Sensei’s Library uses for page titles. Therefore, it is “Yi Se-tol” not “Lee Sedol” and “Yi Ch’ang-ho” not “Lee Changho”. Unlike other romanizations, we can reconstruct the hangul correctly from its McC-R form.

It was a pain in the arm to write articles containing Korean Go player names because I had to repeatedly look at the reference and do lots of “find & replace”. Therefore I made a small tool to do that job:

$ mcrconverter --help
This program will change Korean Go player names in a file into
their McCune-Reischauer form.

mcrconverter [options] file:
    change the contents of a file
mcrconverter [options] folder:
    change the contents of SGF files in a folder

Options can be:
    also changes the file/folder name
    search for files inside subfolders recursively

Besides being useful for writing articles, this tool can also mass-replace the names inside SGF files, and even in the file names themselves.

The source and executable are here (requires .NET Framework 2.0).
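To give a rough idea of the kind of replacement involved, here is a minimal shell sketch. It is not the mcrconverter source: the function name to_mcr is made up for this illustration, and it only knows the two name pairs mentioned above. PB and PW are the standard SGF properties for the black and white player names.

```shell
#!/bin/bash
# Sketch only, not the mcrconverter source: to_mcr is a hypothetical name,
# and the two name pairs are the examples given in this post.

# Reads text on standard input and writes it with the names replaced.
to_mcr() {
    sed -e 's/Lee Sedol/Yi Se-tol/g' \
        -e 's/Lee Changho/Yi Ch’ang-ho/g'
}

# Example: convert the player names in a fragment of SGF.
echo 'PB[Lee Sedol]PW[Lee Changho]' | to_mcr
```

The example line prints PB[Yi Se-tol]PW[Yi Ch’ang-ho]. A real converter needs a full table of names, which is the part the tool takes care of.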

I’m a computer scientist, not your tech support

2006 September 27

Problems worthy of attack
Prove their worth by fighting back
    – Piet Hein

A Princeton computer science major writes about his/her annoyance at being asked to fix computer problems.

What are problems in the field called “Computer Science” anyway? It’s NOT about how to make a web site. It’s NOT about fixing a machine that won’t boot up. It’s NOT about getting rid of worms infesting your computer. It’s NOT about making the best hardware purchase.

Wikipedia has a list of unsolved computer science problems. It includes the famous P=NP. There are also similar, significantly longer, lists for mathematics and physics. This probably reflects the relatively young age of computer science. (Those who think mathematics is finished are dead wrong; the amount of mathematical research activity is in fact growing over time.)

Anyway, does the reverse happen? (That is, do people enroll in CS expecting to be taught how to fix computer problems?)

Kanji as a form of data compression

2006 September 24

Using kanji, many ideas can be expressed using just a few characters. For example, here’s how we write the 12 months in various ways:

Kanji Hiragana Roomaji English Indonesian
一月 いちがつ ichigatsu January Januari
二月 にがつ nigatsu February Februari
三月 さんがつ sangatsu March Maret
四月 しがつ shigatsu April April
五月 ごがつ gogatsu May Mei
六月 ろくがつ rokugatsu June Juni
七月 しちがつ shichigatsu July Juli
八月 はちがつ hachigatsu August Agustus
九月 くがつ kugatsu September September
十月 じゅうがつ juugatsu October Oktober
十一月 じゅういちがつ juuichigatsu November November
十二月 じゅうにがつ juunigatsu December Desember
Average character count: kanji 2.17, hiragana 4.17, roomaji 8.83, English 6.17, Indonesian 6.25

Note that the average character count drops from roomaji to hiragana. That is expected, since each hiragana symbol represents a mora, which for this discussion can be regarded as a syllable. If we use roomaji, most syllables must be written using two or more characters. Therefore hiragana can be thought of as compressing roomaji. As a character set, hiragana is higher-level than roomaji.

The average character count drops again when we go from hiragana to kanji. Kanji is even higher-level than hiragana. Each kanji expresses a certain idea. Because most kanji expand to more than one character when written in hiragana, kanji can be thought of as compressing hiragana.
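The averages in the table can be checked with a few lines of shell. This is only a sketch: the function names count_chars and average_length are made up for this illustration, and it assumes the script is saved as UTF-8. Characters are counted by dropping the UTF-8 continuation bytes, so that exactly one byte per character remains.

```shell
#!/bin/bash
# Sketch only: count_chars and average_length are hypothetical helper names.
# Assumes this script and its arguments are encoded in UTF-8.

# Counts the characters in a UTF-8 string: in the C locale, tr deletes the
# continuation bytes (octal 200-277), leaving one byte per character.
count_chars() {
    printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | wc -c
}

# Prints the average character count of its arguments to two decimals.
average_length() {
    local total=0 word
    for word in "$@"; do
        total=$(( total + $(count_chars "$word") ))
    done
    awk -v total="$total" -v n="$#" 'BEGIN { printf "%.2f\n", total / n }'
}

average_length 一月 二月 三月 四月 五月 六月 七月 八月 九月 十月 十一月 十二月
average_length いちがつ にがつ さんがつ しがつ ごがつ ろくがつ しちがつ はちがつ くがつ じゅうがつ じゅういちがつ じゅうにがつ
average_length ichigatsu nigatsu sangatsu shigatsu gogatsu rokugatsu shichigatsu hachigatsu kugatsu juugatsu juuichigatsu juunigatsu
```

The three calls print 2.17, 4.17 and 8.83, which matches the kanji, hiragana and roomaji averages in the table.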

I’ve heard people say, “kanji is sooo ancient. They should abolish it and replace it with something simpler and modern like the Latin alphabet.” It eventually boils down to an unwillingness to memorize lots of high-level symbols.

However, kanji is a form of pictogram. What they don’t realize is that they use some pictograms too. Ever seen 1, 2, 3, 4, 5, 6, 7, 8, 9, and 0? Great, let’s abolish them. Then we can all have fun writing “sixty five thousand five hundred thirty six” or “enam puluh lima ribu lima ratus tiga puluh enam”.

Anyway, it is natural to ask, “can we define even higher-level elements?”. I don’t see that happening in natural language, but there is one language in which simpler concepts (encoded in symbols) are used to successively build more complex ones: mathematics.

In modern mathematics, everything starts with set theory. There we see symbols like “{“, “}”, “,”, and “⊆”. From sets, we can define things such as the natural numbers, and naturally (no pun intended) new symbols like “1” and “0” appear.

Going even higher, there is calculus, in which symbols like “∫” appear. Calculus is so high-level that, using vector calculus, all electromagnetic phenomena can be written in only four equations (the so-called “Maxwell’s Equations”).

I think it is astonishing that, using the even higher-level symbols of Clifford Algebra, Maxwell’s Equations can be written as a single equation.
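For reference, here is a sketch of the two forms. Both are written in natural units (c = 1, with the electric and magnetic constants set to 1), which is an assumption made here to keep them comparable. The first block is the usual vector calculus form. The second is the form used in spacetime algebra, a Clifford algebra, where F is the electromagnetic field, J is the spacetime current, I is the unit pseudoscalar, and ∇ is the spacetime vector derivative.

```latex
% Vector calculus form: four equations (natural units)
\begin{align}
\nabla \cdot \mathbf{E} &= \rho \\
\nabla \cdot \mathbf{B} &= 0 \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \\
\nabla \times \mathbf{B} &= \mathbf{J} + \frac{\partial \mathbf{E}}{\partial t}
\end{align}

% Spacetime algebra form: one equation (natural units)
\begin{equation}
\nabla F = J, \qquad F = \mathbf{E} + I \mathbf{B}
\end{equation}
```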

Digitizing kanji mnemonics

2006 September 24

When studying grade 2 kanji, I made mnemonics which relate their shape to their meaning. Tony Buzan, in his book “Make the Most of Your Mind”, advocates using mnemonics to make things easier to memorize. Here are some examples:

For the kanji 半 which means “half”, the mnemonic is “a border (|) is seen dividing the island into HALF; we can see a river (二) and signs (\ /) on each area”. The illustration is below.

mnemonic for 半

For the kanji 弱 which means “weak”, the mnemonic is “the twin snakes (弓) are WEAK, therefore they avoid the sharp thorns”. It is illustrated below.

mnemonic for 弱

For the kanji 当 which means “hit”, the mnemonic is “a fierce HIT (ヨ) that destroys the wall (\|/)”. Illustration is below:

mnemonic for 当

Inventing mnemonics is a creative right-brain activity, which dispels the dullness traditionally associated with memorizing. It also exploits the way the human brain remembers things better when they have striking (e.g. funny, weird) associations with other things.

All the mnemonics (around 160) were written in a book, and I have just finished transcribing them to the computer. A tedious task, and one which I cannot automate.

Conditional execution in bash

2006 September 24

For some reason, I needed to run a program only if another program (possibly the same one) was not already running.

After googling for a bash tutorial, I was able to create the script:

ps aux > psinfo
cat psinfo | grep FORBIDDEN_PROGRAM > programinfo

if [ `wc -l < programinfo` = "0" ]
then
    PROGRAM_TO_RUN    # placeholder for the program to start
fi

ps is used to list all processes running on the system. grep is used to display all occurrences of the string FORBIDDEN_PROGRAM in the output of ps. The output of ps isn’t piped directly to grep (as in “ps aux | grep FORBIDDEN_PROGRAM”) because the result is unpredictable: both commands in a pipeline are started at about the same time, and grep’s own command line contains the string FORBIDDEN_PROGRAM. If grep is already running when ps takes its snapshot, ps lists it and the count is 1 even though FORBIDDEN_PROGRAM is not running. If grep starts later, the count is 0.

wc is used to count the number of lines in a file (and is able to count other things too).
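A possible variation on the script above, as a minimal sketch: asking ps for command names only (the POSIX “-o comm=” format) keeps grep’s arguments out of the listing, so the direct pipe gives a stable answer and no temporary files are needed. The function names is_running and run_unless_running are made up for this illustration. The sketch assumes a Linux-style ps, where this column holds the bare program name cut to 15 characters, so a longer name has to be given in its truncated form.

```shell
#!/bin/bash
# Sketch only: is_running and run_unless_running are hypothetical helper names.
# Assumes a Linux-style ps whose comm column holds the bare program name.

# Succeeds if some process has exactly the given command name.
# "ps -e -o comm=" prints one command name per line, without arguments,
# so the grep in this pipeline cannot match its own command line.
is_running() {
    ps -e -o comm= | grep -qxF -- "$1"
}

# Runs the given command only when the named program is not running.
run_unless_running() {
    local forbidden=$1
    shift
    if is_running "$forbidden"; then
        echo "$forbidden is running, not starting $1" >&2
        return 1
    fi
    "$@"
}

# Example: print a message unless FORBIDDEN_PROGRAM is running.
run_unless_running FORBIDDEN_PROGRAM echo "safe to start"
```

The check and the start are still two separate steps, so a program launched in between can slip through. For occasional use that is usually acceptable.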

Translation of the main article of IGN “Goama” 22

2006 September 19

Yu Ch’ang-hyeok’s Baduk (Go) School

(Translated into English from the Korean magazine “Baduk World”, May 2004, by Alexandre Dinerchtein)

Yu Ch’ang-hyeok, 9-dan (one of Korea’s strongest players in the mid-1990s), opened a children’s Go school in April 2004 together with his best friend, Ch’oe Kyu-pyeong, 9-dan. This is not their first school. They previously founded a club for young professional players, which ran very successfully for 16 years. We visited their school and interviewed its founders.

The school is divided into several small rooms. On entering the first room, we saw 4 pro players: Park Yeong-hun, 5-dan (now 9-dan), Yi Cheong-u, 4-dan (5-dan in 2005), Kim Eunseon, and Kim Sesil (both women, both 1-dan). Ch’oe Kyu-pyeong, 9-dan, was Park’s main teacher when he was still an amateur. Apparently, Park is now the stronger of the two.

At present there are only 7 students. They are young and talented. “We have some child geniuses, but other Go clubs have them too. It is hard to be sure whether they can reach pro level; everything will depend on their effort,” said Ch’oe Kyu-pyeong, 9-dan.

The school is open from 8 in the morning until 10 in the morning. Yu Ch’ang-hyeok, 9-dan, is always with his students unless he has a tournament scheduled.

We asked, “How do you teach them?” He answered, “They can easily learn openings, joseki, and life-and-death problems from books. So I try to create a “self-learning” atmosphere in this school. The main thing I do is review and comment on their games. Many of my students read very well, and even high-level pros cannot make them stronger at life-and-death problems. But we can teach them many other things.”

Some think it is too early for Yu Ch’ang-hyeok, 9-dan, to teach children, because they believe he can still achieve results on the professional stage. Yu Ch’ang-hyeok, 9-dan, explained his decision: “It is true that I still play seriously, but I will not be at the top forever. I thought a lot about my future and decided that I would enjoy teaching Go more than any other job. My dream is to find students who can become the elite of the Korean Go world.”

Yu, 9-dan, also said a few things about his students: “Someone said that we only take insei from the Korean Baduk Association, but that is not true. We prefer to teach children who are still very young, even if they are not as strong as insei. In my view, it is not very good to invite strong students away from other clubs and teachers, and in such cases we always ask permission from their previous place of study. We did not found this school to make money, so we always uphold our morals. If we see that a child cannot possibly become a pro, we contact the parents and ask them to look for another future. I believe that in some cases it is better to find another path and continue Go only as a hobby.”

Finally, we asked Yu, 9-dan, about the situation in the modern Go world.

“The Japanese masters are far behind. China is trying to reach our level, but that is not easy. Their country is very large, and it is hard to build a high-level teaching system like the one in Korea. Several times they have contacted us and suggested that we accept students from China, but we declined, mainly for language reasons,” said Yu Ch’ang-hyeok, 9-dan.

Yu Ch’ang-hyeok, 9-dan, and Ch’oe Kyu-pyeong, 9-dan, are great teachers, and we will see how they change the situation in the modern Go world.

Translated into Indonesian by Agro Rachmatullah.

“You are welcome to republish any text material from the IGN “Goama” without commercial purposes: please note the source and put the link to”

Character variants in Unicode

2006 September 19

In Unicode, there are several code points for fullwidth characters. Here’s a comparison between the normal ASCII characters and their fullwidth counterparts (the normal one is written first):


Superscript characters like ² are also display variants of normal characters like 2.

Another amusing thing is the existence of language-specific characters. An example is the Greek capital letter eta (Η, U+0397) and the Cyrillic capital letter en (Н, U+041D). On my machine, they look exactly like the Latin capital letter H (which is ASCII 72 or U+0048).
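Although the three letters look the same, they are stored as different bytes. A small shell sketch makes that visible. The function name utf8_hex is made up for this illustration, and the script is assumed to be saved as UTF-8.

```shell
#!/bin/bash
# Sketch only: utf8_hex is a hypothetical helper name.
# Assumes this script is saved as UTF-8.

# Prints the bytes of a string as lowercase hex, without separators.
utf8_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

utf8_hex 'H'   # Latin capital letter H, U+0048
utf8_hex 'Η'   # Greek capital letter eta, U+0397
utf8_hex 'Н'   # Cyrillic capital letter en, U+041D
```

The three calls print 48, ce97 and d09d: one byte for the ASCII letter, and two different two-byte UTF-8 sequences for the Greek and Cyrillic look-alikes.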

I actually have mixed feelings about including display variants in a character set. In light of HTML and various text-formatting utilities (TeX, office suites), display variants can be regarded as a waste of code points. For example, in HTML superscripts can be achieved using the tag <sup>, and specific fonts (for example fullwidth) can be chosen using CSS (or the old-style <font> tag). As for language variants, HTML again makes them unnecessary because there is the “lang” (or “xml:lang”) attribute.

However, variants have some merits. One use of those variants is of course for plain text files. For example, with the character “²” I can write “a² + b² = c²” nicely in a plain text file. The other benefit is space efficiency. For example, “²” is one character, while “<sup>2</sup>” takes twelve.

What I hate about language variants is that they conflict with one major theme in the Unicode work: CJK (Chinese, Japanese, Korean) character unification. In Unicode, there is no such thing as the Japanese 人, Chinese 人, and Korean 人. There is only one character for all three languages: 人. This is in spite of differences in how some of the characters are drawn! Thus, it is not possible to convey the difference in a plain text file.

For example, here is the CJK character for “now”, displayed differently (if your computer is set up correctly) because of the “lang” attribute: 今 (Japanese) vs. 今 (Chinese). Both are U+4ECA. On my computer it looks like this:

Japanese vs. Chinese 今

See the HTML source code for more info.

Counting from 1 to 59 in Japanese

2006 September 19

Because some number kanji can be read in several ways, counting can be confusing to a beginner. The kanji for 44 is simply 四十四. However, how do we read it? よんじゅうし, よんじゅうよん, しじゅうよん, or しじゅうし? Is there only one correct reading, or more than one?

Tanaka Reina in Futarigoto

I found a real-life example that should shed a little light on this matter. In the TV show Futarigoto, Tanaka Reina of Morning Musume shows her ability to stand using her hands and head. While doing so, she times herself, counting from 1 onwards.

This example won’t invalidate other readings, but it does give us one correct way to count. Here’s how she counts:

1: いち
2: に
3: さん
4: し
5: ご
6: ろく
7: しち
8: はち
9: く
10: じゅう
11: じゅういち
12: じゅうに
13: じゅうさん
14: じゅうし
15: じゅうご
16: じゅうろく
17: じゅうしち
18: じゅうはち
19: じゅうく
20: にじゅう
21: にじゅういち
… (same pattern)
30: さんじゅう
31: さんじゅういち
… (same pattern)
40: よんじゅう
41: よんじゅういち
… (same pattern)
50: ごじゅう
51: ごじゅういち
… (same pattern)
59: ごじゅうく

There are two things to note, probably specific to this kind of situation. First, “じゅう” is spoken as only 1 mora, i.e. “じゅ”. Second, if the number consists of 2 or more mora and ends with a vowel, the final vowel is often dropped or barely audible. Therefore it sounds like “ich”, “ni”, “san”, “shi”, “go”, “rok”, “shich”, “hach”, “ku”, “ju”, “juich”, “juni”, “jusan”, “jush”, “jugo”, “jurok”, “jushich”, “juhach”, “juk”, …

(This is like “tu, wa, ga, pat, …, dua satu, dua dua, …” instead of “satu, dua, tiga, empat, …, dua puluh satu, dua puluh dua, …” in Indonesian.)
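The list of readings can be generated with a short shell sketch. The function name reading is made up for this illustration. It assumes that “(same pattern)” means the readings of 1 to 9 repeat unchanged after each multiple of ten, so it tells us nothing about other valid readings. Note that 4 is し in the units position but よん in よんじゅう.

```shell
#!/bin/bash
# Sketch only: reading is a hypothetical helper name. It reproduces the
# readings listed in this post and covers nothing beyond 59.

# Prints the hiragana reading of a number from 1 to 59.
reading() {
    local tens units
    case $(( $1 / 10 )) in
        0) tens= ;;
        1) tens=じゅう ;;
        2) tens=にじゅう ;;
        3) tens=さんじゅう ;;
        4) tens=よんじゅう ;;
        5) tens=ごじゅう ;;
        *) echo "reading: only 1 to 59 are covered" >&2; return 1 ;;
    esac
    case $(( $1 % 10 )) in
        0) units= ;;
        1) units=いち ;;
        2) units=に ;;
        3) units=さん ;;
        4) units=し ;;
        5) units=ご ;;
        6) units=ろく ;;
        7) units=しち ;;
        8) units=はち ;;
        9) units=く ;;
    esac
    echo "$tens$units"
}

for n in 4 14 40 44 59; do
    echo "$n: $(reading "$n")"
done
```

Under that assumption, the loop prints し, じゅうし, よんじゅう, よんじゅうし and ごじゅうく for 4, 14, 40, 44 and 59.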

I’ve prepared the audio file (396 KB) of Reina counting. The file is in the free Ogg/Vorbis format, and Windows users may need to download the codec. It was made using Audacity, a free digital audio editor.

PS: There are some interjections between the counting. The first is between 19 and 20 (“It is getting tough”) and the second is after 59 (“I passed 1 minute!”). After 59, she counts from 1 again. In the end she says “Let’s stop now.”