Understanding Encoding methods?

  • 15-08-2012 08:20AM
    #1
    Registered Users, Registered Users 2 Posts: 7,546 ✭✭✭


    I've read the documentation but can't figure out why the Encoding methods are not working as I expected.
    I'm receiving a byte array which can contain characters from the UTF-8 character set.

    If I decode the byte array using the below and open the result in Notepad, all the Unicode characters are replaced by a black diamond with a question mark inside it:

    Encoding.UTF8.GetString(byteArray)

    However, if I decode it using the below it works fine:

    Encoding.Default.GetString(byteArray)

    Can anyone explain this to me? I'm just not getting why the first one does not work.
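
A minimal sketch of how this symptom can arise, assuming the bytes are in a single-byte code page rather than UTF-8 (on .NET Framework, `Encoding.Default` is the system's ANSI code page). The sample bytes and the choice of ISO-8859-1 are hypothetical stand-ins, not the poster's actual data:

```csharp
using System;
using System.Text;

// Hypothetical sample: "café" in a single-byte encoding (0xE9 = é).
byte[] byteArray = { 0x63, 0x61, 0x66, 0xE9 };

// 0xE9 on its own is not a complete UTF-8 sequence, so the decoder
// substitutes U+FFFD, which editors draw as a diamond with a question mark.
string asUtf8 = Encoding.UTF8.GetString(byteArray);
bool hasReplacementChar = asUtf8.IndexOf('\uFFFD') >= 0;
Console.WriteLine(hasReplacementChar);

// Decoding with the encoding the sender actually used recovers the text.
// ISO-8859-1 is an assumption made for this example only.
string asLatin1 = Encoding.GetEncoding("iso-8859-1").GetString(byteArray);
Console.WriteLine(asLatin1 == "caf\u00E9");
```

If the decoded string contains U+FFFD, the bytes were not valid UTF-8 to begin with, so the thing to check is which encoding the sender used.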


Comments

  • Registered Users, Registered Users 2 Posts: 7,157 ✭✭✭srsly78


    Coz unicode chars are 16 bits in size. UTF8 is therefore mangling them.
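
For reference, a short sketch of how many bytes the same characters take in UTF-8 and in UTF-16 (`Encoding.Unicode` in .NET). Both encode the same Unicode code points; UTF-8 is variable-width and uses between one and four bytes per code point. The sample characters are chosen only for illustration:

```csharp
using System;
using System.Text;

// Sample characters: ASCII, Latin-1 range, BMP symbol, non-BMP emoji.
string[] samples = { "A", "\u00E9", "\u20AC", "\U0001F600" };

int[] utf8Counts = new int[samples.Length];
int[] utf16Counts = new int[samples.Length];

for (int i = 0; i < samples.Length; i++)
{
    // GetByteCount reports how many bytes the encoded form would occupy.
    utf8Counts[i] = Encoding.UTF8.GetByteCount(samples[i]);
    utf16Counts[i] = Encoding.Unicode.GetByteCount(samples[i]);
    Console.WriteLine($"UTF-8: {utf8Counts[i]} bytes, UTF-16: {utf16Counts[i]} bytes");
}
```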


  • Registered Users, Registered Users 2 Posts: 7,546 ✭✭✭BrokenArrows


    srsly78 wrote: »
    Coz unicode chars are 16 bits in size. UTF8 is therefore mangling them.

    Close.

    I ended up just decoding the original byte array with every possible encoding to see which one worked.

    Turns out I was being sent the byte array in UTF-7, not UTF-8.

    Muppets!!

    So I'm now just doing a convert and all is good in the world:
    Encoding.Convert(Encoding.UTF7, Encoding.UTF8, byteArray);
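
A rough sketch of what `Encoding.Convert` does with the bytes: it decodes them with the source encoding and re-encodes the result with the target encoding. The sample bytes and the ISO-8859-1 source encoding are assumptions made so the example is self-contained; the call above passes `Encoding.UTF7` as the source instead.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical sample: "café" as ISO-8859-1 bytes (0xE9 = é).
byte[] sourceBytes = { 0x63, 0x61, 0x66, 0xE9 };
Encoding source = Encoding.GetEncoding("iso-8859-1");

// One step: convert the raw bytes from the source encoding to UTF-8.
byte[] converted = Encoding.Convert(source, Encoding.UTF8, sourceBytes);

// Two steps with the same result: decode to a string, then encode again.
string text = source.GetString(sourceBytes);
byte[] twoStep = Encoding.UTF8.GetBytes(text);

Console.WriteLine(converted.SequenceEqual(twoStep));
Console.WriteLine(BitConverter.ToString(converted)); // é becomes the two bytes C3-A9
```

Note that the result of `Encoding.Convert` is still a byte array. If all that is needed is a string, calling `GetString` on the source encoding is enough and the conversion step can be skipped.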


  • Moderators, Sports Moderators, Regional Abroad Moderators, Paid Member Posts: 2,692 Mod ✭✭✭✭TrueDub


    An excellent article on character-encoding:

    http://www.joelonsoftware.com/articles/Unicode.html


  • Registered Users, Registered Users 2 Posts: 7,157 ✭✭✭srsly78


    BrokenArrows wrote: »
    Close.

    I ended up just decoding the original byte array with every possible encoding to see which one worked.

    Turns out I was being sent the byte array in UTF-7, not UTF-8.

    Muppets!!

    So I'm now just doing a convert and all is good in the world:
    Encoding.Convert(Encoding.UTF7, Encoding.UTF8, byteArray);

    This isn't unicode then. edit: derp ok it is encoded unicode >.<

