Issue
I have uploaded a CSV file with some german words. I can access the file via this link . When I try to get the file via postman, the response is not fully utf converted.
After downloading the file, I open the file with excel or notepad++ I get the expected result.
Again I try to get the file via angular HttpClient. I get this response in Chrome's network preview section.
But when I console.log the response, Now the response is incorrect.
I try to add Accept: application/csv
and Accept-Charset: UTF-8
header with postman call, But get same result.
I make a stackblitz sample with angular.
Solution
you can change the "responseType" to "arraybuffer" and then decode the array with whatever encoding you need. In this case the iso-8859-2
look at the example.
Since a byte can only have 256 distinct values, and the set of all the world's languages is more than 256 characters, UTF-8 needs to use more than one byte to represent some characters. In general, accented characters like the word “bênção” are represented as 2 bytes in UTF-8 (as opposed to just 1 in ISO 8859-1):
b | ê | n | ç | ã | o
ISO 8859-1: 62 | EA | 6E | E7 | E3 | 6F
UTF-8: 62 | C3 AA | 6E | C3 A7 | C3 A3 | 6F
This is a problem when you try to read text written in one encoding as if it were another encoding: if the text was written as UTF-8 but read as ISO 8859-1, it appears as “bênção”; the opposite appears as “b�n��o”
example in JavaScript:
var bytesUtf8 = new Uint8Array([0x62, 0xc3, 0xaa, 0x6e, 0xc3, 0xa7, 0xc3, 0xa3, 0x6f]);
var bytesIso = new Uint8Array([0x62, 0xea, 0x6e, 0xe7, 0xe3, 0x6f]);
var decIso = new TextDecoder("iso-8859-1")
var decUtf8 = new TextDecoder("utf-8")
var decBytesIsoCorrect = decIso.decode(bytesIso)
var decBytesUtf8Correct = decUtf8.decode(bytesUtf8)
var decBytesIsoInUtf8 = decUtf8.decode(bytesIso)
var decBytesUtf8InIso = decIso.decode(bytesUtf8)
console.log('iso > iso', decBytesIsoCorrect)
console.log('utf-8 > utf-8', decBytesUtf8Correct)
console.log('iso > utf-8', decBytesIsoInUtf8)
console.log('utf-8 > iso', decBytesUtf8InIso)
note: can be used iso-8859-n
Answered By - Lucas Rissi
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.