Was contemplating migrating an old D5 application into the 21st century this weekend. This was the side effect...

http://softwareonastring.com/2014/10/06/20-resources-on-migrating-to-unicode-with-delphi

Comments

  1. I see people on the Embarcadero forum say they don't have the time to rewrite their code for the new string base for mobile etc., but they have the time to write post after post on that forum about how they don't have the time...?

  2. I've done my share of ports, and it has been far easier than feared. There is a lot of FUD about this. If you absolutely need 8-bit chars, you are already working with a defined charset, so migrating to AnsiChar with a fixed code page should be fairly trivial.

  3. Great collection. I guess we'll advertise your blog post a bit!

  4. Lars Fosdal But if you need UTF-8 or RawByteString, for any of a variety of reasons, you're pretty much screwed.

  5. I can still remember the pain of migrating D5 code to D2009 at my previous job.

    Another department at my new office is currently migrating their D2007 code to XE4. Your blog might be a big help for them. Thanks.

  6. Marco Cantù Thanks. Please do :)

  7. Mason Wheeler - Are you talking about mobile?

  8. Lars Fosdal  Yes, as it got brought up in the first reply to this topic.

  9. Mason Wheeler - Why would you want to use those on a phone where the native string is UTF-16?

  10. Lars Fosdal  For several reasons.  First, because every smartphone today (I'm pretending WP doesn't exist, which really isn't much of a stretch) is a *NIX box and so the native string type is UTF-8.

    Second, UTF-8 is a far better format for both connectivity (an important component of almost every mobile app) and storage.

    Third, because it preserves compatibility with existing Delphi code that uses those string types.  For example, I'm trying to port DWS to Delphi for Android, and it uses RawByteString and UTF8String quite a bit in various places.

    Fourth, because UTF-8 is a less bug-prone format to use. Have a look at how many unresolved issues remain in Delphi regarding its handling of non-BMP characters. This is because Delphi tries to pretend that 1 WideChar = 1 character, whereas in UTF-8, no one is under that particular illusion.

    Really, the question is not "why not use UTF-16;" the appropriate question to ask is "why not use UTF-8?"

  11. http://developer.android.com/reference/java/nio/charset/Charset.html
    This does say that "The platform's default charset is UTF-8" - and that UTF-16 (with variants) also is supported - among others.

    Most Android apps run in a Java VM and are written in Java, and AFAIK Java also uses UTF-16 for its internal string representation? It used to be UCS-2.

    I do agree that UTF-8 in theory is a better format (less space used) for transmission and storage, but with compression and encryption - does it matter?

    As for one WideChar always being treated as two bytes - I'll take your word for that.
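
The storage question raised above (UTF-8 versus UTF-16, with and without compression) is easy to measure. The following is a minimal sketch in Python rather than Delphi, chosen only because it is easy to run; the helper name `encoded_sizes` and the sample text are made up for illustration, and the numbers will vary with the input. For mostly-ASCII text UTF-8 is half the raw size of UTF-16, while for text dominated by CJK characters UTF-8 needs three bytes per character against two for UTF-16.

```python
import zlib


def encoded_sizes(text):
    """Return {encoding: (raw_bytes, zlib_compressed_bytes)} for the text.

    Hypothetical helper for illustration; not part of any library.
    """
    sizes = {}
    for name in ("utf-8", "utf-16-le"):
        raw = text.encode(name)
        sizes[name] = (len(raw), len(zlib.compress(raw)))
    return sizes


# Mostly-ASCII sample text (an assumption; real payloads will differ).
sample = "Migrating Delphi code to Unicode. " * 50
sizes = encoded_sizes(sample)

for name, (raw, packed) in sizes.items():
    print(f"{name}: {raw} bytes raw, {packed} bytes after zlib")
```

Whether the remaining difference after compression matters depends on the payload, so it is worth measuring with representative data instead of assuming either way.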

  12. Lars Fosdal  I didn't say one widechar is always treated as two bytes; I said one widechar is treated as one character, even though this is not correct for anything outside of the BMP, leading to all sorts of problems in many, many different software products that implement UTF-16, including Delphi.
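
The "one WideChar is not one character" point can be made concrete with a character outside the BMP. This is a small sketch in Python rather than Delphi, used only because it is easy to run; the helper names `utf16_code_units` and `utf8_bytes` are hypothetical. It counts 16-bit code units the way a UTF-16 string type would store them.

```python
import struct


def utf16_code_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def utf8_bytes(text):
    """Number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


bmp_char = "\u00e9"          # U+00E9, inside the BMP
non_bmp_char = "\U0001F600"  # U+1F600, outside the BMP

print(utf16_code_units(bmp_char), utf8_bytes(bmp_char))          # 1 2
print(utf16_code_units(non_bmp_char), utf8_bytes(non_bmp_char))  # 2 4

# The two code units of the non-BMP character form a surrogate pair.
high, low = struct.unpack("<2H", non_bmp_char.encode("utf-16-le"))
print(hex(high), hex(low))  # 0xd83d 0xde00
```

Code that indexes or truncates a UTF-16 string by code unit can split such a pair in the middle, which is the class of bug described above. UTF-8 has the same hazard at the byte level, but code handling it rarely assumes one byte per character.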

