2005-12-23

The ugliness of C++ locale

I'm very happy with how C++ handles most things, both in the language and in the standard library. One monstrosity is the handling of locales. Look at the following piece of code, which I came across on the Boost users mailing list:

time_facet* of = new time_facet();
of->set_iso_extended_format();
std::cout.imbue(std::locale(std::locale::classic(), of));

In my opinion, this is horrible. I believe that it's very flexible, like everything else in the standard library, but still... I won't even try to imagine how you would set the locale to some particular language's collation sequence and character encoding.
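For the collation half of that question, here is a minimal sketch using only the standard library. The helper names (make_locale_or_classic, sorted_with) are made up for illustration, and any locale name you pass in is an assumption about what the platform has installed: names such as "de_DE.UTF-8" are system-specific, and std::locale throws std::runtime_error when the name is unknown, so the sketch falls back to the classic "C" locale.

```cpp
#include <algorithm>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: try to construct a named locale and fall back to
// the classic "C" locale if the platform does not provide it.
// Which names are valid depends entirely on the operating system.
std::locale make_locale_or_classic(const char* name) {
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Hypothetical helper: sort strings by the collation rules of `loc`.
// std::locale::operator() compares two strings through the locale's
// std::collate facet, so a locale object can be passed as the comparator.
std::vector<std::string> sorted_with(std::vector<std::string> words,
                                     const std::locale& loc) {
    std::sort(words.begin(), words.end(), loc);
    return words;
}
```

Calling sorted_with(words, make_locale_or_classic("de_DE.UTF-8")) would then sort by German rules where that locale exists, and by plain character codes where it does not. Note that this only covers collation of narrow strings; character encoding conversion is a separate facet (std::codecvt) and a separate headache.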

I think that there must be a better way of handling locales in programs. Today, it's simply a mess throughout the computer world (e.g. I shiver when I remember the time I had to convince Oracle to chew up data in some particular encoding). As for the above code, I could write date/time formatting code by hand much faster than I could figure out the above piece of code.
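To back up that last claim, here is roughly what the hand-written version might look like. It is only a sketch: the function name iso_extended is made up, it takes a plain std::tm rather than a Boost time type, and it uses the ISO 8601 extended form with a 'T' separator, which is not guaranteed to match the facet's output byte for byte.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical helper: format a broken-down time in ISO 8601 extended
// form, e.g. "2005-12-23T14:05:09". No locale machinery is involved.
std::string iso_extended(const std::tm& t) {
    char buf[80];  // comfortably larger than the longest possible output
    std::snprintf(buf, sizeof buf, "%04d-%02d-%02dT%02d:%02d:%02d",
                  t.tm_year + 1900,  // tm_year counts years since 1900
                  t.tm_mon + 1,      // tm_mon is zero-based
                  t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec);
    return std::string(buf);
}
```

It is not flexible at all, of course, which is exactly the trade-off: one fixed format in a handful of lines versus a general mechanism that takes longer to understand than to replace.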

Localization, as it is today, simply sucks big time. It should be transparent to both programmers and users. It should work out of the box. The way it is today, it just creates headaches for everyone: increased effort for programmers (look at the above code!) and a plethora of confusing settings for users (character sets, keyboard layouts, date/time formats, decimal points, etc.). Yes, it's flexible, and all those settings should stay under some advanced options, but the system should work seamlessly with mixed locales and character sets. The user should not be forced to use a single locale system-wide (as in Windows) or within a single application (as in Unix).

I see many open problems in mixed locales, but I don't think that they are unsolvable. They are just very hard. Some "locale grand unification framework" is missing. And until it appears, I'm afraid that the state of affairs in I18N will remain the mess it is now. Maybe (or should I say, probably!), with a bit of bad luck, it'll just get worse.
