Flat File Assembler Encoding Problem

Asked By F.Mondelo - http://felixmondelo.blogspot.com/
17-Jan-08 02:52 AM
Hi, I'm trying to assemble a flat file with target charset iso-8859-1.
I use the XMLNorm.TargetCharset message context property for that.

The target file encoding is right, but Spanish characters (N tilde, U
umlaut, ...) are represented with question marks (3F in hexadecimal).

Does anyone have a solution for this issue?

Asked By Tomas Restrepo [MVP]
14-Jan-08 02:05 PM
Hi Felix,

Are you sure the problem is in the output file, and not that the characters
got corrupted during input from another source?

Also, what happens if you try other encodings (like one of the Unicode
ones)? Do the characters appear correctly?


--
Tomas Restrepo
http://www.devdeo.com/
http://www.winterdom.com/weblog/

Asked By F.Mondelo - http://felixmondelo.blogspot.com/
17-Jan-08 02:52 AM
I have a solution that works for now, but it does not use TargetCharset
on the flat file assembler. I wrote a custom pipeline component that
calls the flat file assembler and then converts the result into
iso-8859-1:

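protected override IBaseMessage Assemble(IPipelineContext pc)
{
    // Let the flat file assembler produce its default (UTF-8) output.
    IBaseMessage retMsg = _assembler.Assemble(pc);
    System.Text.Encoding encoder =
        System.Text.Encoding.GetEncoding("iso-8859-1");

    System.IO.Stream strm = retMsg.BodyPart.Data;
    VirtualStream vStrm = new VirtualStream();

    // Decode with a StreamReader so that a multi-byte UTF-8 sequence
    // split across two reads is still decoded correctly, and write only
    // the characters actually read (not the whole buffer).
    System.IO.StreamReader reader =
        new System.IO.StreamReader(strm, System.Text.Encoding.UTF8);
    System.IO.StreamWriter writer =
        new System.IO.StreamWriter(vStrm, encoder);
    char[] buffer = new char[1024];
    int charsRead;

    while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        writer.Write(buffer, 0, charsRead);
    }

    // Flush but do not dispose the writer: disposing would close vStrm.
    writer.Flush();

    vStrm.Position = 0;
    retMsg.BodyPart.Data = vStrm;

    return retMsg;
}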

This works: after the flat file assembler the output is correct in
utf-8 (for example, N tilde is represented by bytes C3 91), and I can
then convert it to iso-8859-1 (N tilde is converted from C3 91 to D1).
But if I set XMLNorm.TargetCharset = "iso-8859-1" directly, the
assembler outputs Spanish characters as question marks (byte 3F).

I think the problem is with the assembler's encoding handling, because
without setting any charset (utf-8 by default) it works correctly.
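
The byte values mentioned above can be checked with a small stand-alone console program. This is only a sketch for verifying the expected bytes outside BizTalk. The class name EncodingCheck is made up for illustration, and the ASCII case is shown only as one well-known way a 3F byte can arise (a character replaced because the encoder cannot represent it). It is not a claim about what the flat file assembler does internally.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of BizTalk or of the component above.
class EncodingCheck
{
    static void Main()
    {
        string s = "\u00D1"; // N tilde
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        byte[] utf8Bytes = Encoding.UTF8.GetBytes(s);
        byte[] latin1Bytes = latin1.GetBytes(s);
        // ASCII cannot represent N tilde, so the default fallback
        // substitutes a question mark.
        byte[] asciiBytes = Encoding.ASCII.GetBytes(s);

        Console.WriteLine(BitConverter.ToString(utf8Bytes));   // C3-91
        Console.WriteLine(BitConverter.ToString(latin1Bytes)); // D1
        Console.WriteLine(BitConverter.ToString(asciiBytes));  // 3F

        // Same conversion the custom component performs, on a complete buffer.
        byte[] converted = Encoding.Convert(Encoding.UTF8, latin1, utf8Bytes);
        Console.WriteLine(BitConverter.ToString(converted));   // D1
    }
}
```

If the file written by the send port contains 3F where D1 is expected, the character was already replaced before or during encoding, which matches the symptom described in this thread.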