Wednesday, May 8, 2013

Remember to set the child delimiter even for single-element lines in a flat file schema if using wrap characters

I briefly ran into this when creating a larger flat file schema for a message. The input message had line tags that I could use to identify the different rows. Most of the rows had several distinct fields to read, but there were rows with only the tag and a string of text encapsulated in quotation marks.

As an example, we'll use an input file like so:

#FIRST 12345 "This is the first string"
#SECOND "This is the second string"


This can then be used to create a schema that identifies the two distinct rows based on their tags (#FIRST and #SECOND respectively) and then splits the fields on a delimiter of 0x20 (the space character).

The schema could look like this:

<?xml version="1.0" encoding="utf-16"?>
<xs:schema xmlns="http://TestProject.Schemas" xmlns:b="http://schemas.microsoft.com/BizTalk/2003" elementFormDefault="qualified" targetNamespace="http://TestProject.Schemas" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:annotation>
    <xs:appinfo>
      <schemaEditorExtension:schemaInfo namespaceAlias="b" extensionClass="Microsoft.BizTalk.FlatFileExtension.FlatFileExtension" standardName="Flat File" xmlns:schemaEditorExtension="http://schemas.microsoft.com/BizTalk/2003/SchemaEditorExtensions" />
      <b:schemaInfo standard="Flat File" codepage="65001" default_pad_char=" " pad_char_type="char" count_positions_by_byte="false" parser_optimization="speed" lookahead_depth="3" suppress_empty_nodes="false" generate_empty_nodes="true" allow_early_termination="false" early_terminate_optional_fields="false" allow_message_breakup_of_infix_root="false" compile_parse_tables="false" root_reference="Root" />
    </xs:appinfo>
  </xs:annotation>
  <xs:element name="Root">
    <xs:annotation>
      <xs:appinfo>
        <b:recordInfo structure="delimited" child_delimiter_type="hex" child_delimiter="0xD 0xA" child_order="postfix" sequence_number="1" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" />
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:annotation>
          <xs:appinfo>
            <groupInfo sequence_number="0" xmlns="http://schemas.microsoft.com/BizTalk/2003" />
          </xs:appinfo>
        </xs:annotation>
        <xs:element name="First">
          <xs:annotation>
            <xs:appinfo>
              <b:recordInfo tag_name="#FIRST" structure="delimited" child_delimiter_type="hex" child_delimiter="0x20" child_order="prefix" sequence_number="1" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" />
            </xs:appinfo>
          </xs:annotation>
          <xs:complexType>
            <xs:sequence>
              <xs:annotation>
                <xs:appinfo>
                  <groupInfo sequence_number="0" xmlns="http://schemas.microsoft.com/BizTalk/2003" />
                </xs:appinfo>
              </xs:annotation>
              <xs:element name="Id" type="xs:string">
                <xs:annotation>
                  <xs:appinfo>
                    <b:fieldInfo justification="left" sequence_number="1" />
                  </xs:appinfo>
                </xs:annotation>
              </xs:element>
              <xs:element name="Text" type="xs:string">
                <xs:annotation>
                  <xs:appinfo>
                    <b:fieldInfo justification="left" sequence_number="2" wrap_char_type="char" wrap_char="&quot;" />
                  </xs:appinfo>
                </xs:annotation>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="Second">
          <xs:annotation>
            <xs:appinfo>
              <b:recordInfo tag_name="#SECOND" structure="delimited" child_order="prefix" sequence_number="2" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" />
            </xs:appinfo>
          </xs:annotation>
          <xs:complexType>
            <xs:sequence>
              <xs:annotation>
                <xs:appinfo>
                  <groupInfo sequence_number="0" xmlns="http://schemas.microsoft.com/BizTalk/2003" />
                </xs:appinfo>
              </xs:annotation>
              <xs:element name="Text" type="xs:string">
                <xs:annotation>
                  <xs:appinfo>
                    <b:fieldInfo justification="left" sequence_number="1" wrap_char_type="char" wrap_char="&quot;" />
                  </xs:appinfo>
                </xs:annotation>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>


This will, however, generate output like this:

<Root xmlns="http://TestProject.Schemas">
    <First>
        <Id>12345</Id>
        <Text>This is the first string</Text>
    </First>
    <Second>
        <Text>"This is the second string"</Text>
    </Second>
</Root>


Notice the quotation marks that are still left in the string, even though we have defined them as a wrap character in the schema for that field.

To make them disappear from the field in the same manner as in the first line of text, we have to set the Child Delimiter property (along with its Child Delimiter Type) on the "Second" child record, even though that record contains only a single field:

<xs:element name="Second">
    <xs:annotation>
        <xs:appinfo>
            <b:recordInfo tag_name="#SECOND" structure="delimited" child_order="prefix" sequence_number="2" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" child_delimiter_type="hex" child_delimiter="0x20" />
        </xs:appinfo>
    ...


This in turn makes the quotation marks disappear from our element data.
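
For comparison, with child_delimiter_type="hex" and child_delimiter="0x20" set on the "Second" record, parsing the same two-line input should give output along these lines. This is the expected result inferred from how the first line is handled, not a captured run:

<Root xmlns="http://TestProject.Schemas">
    <First>
        <Id>12345</Id>
        <Text>This is the first string</Text>
    </First>
    <Second>
        <Text>This is the second string</Text>
    </Second>
</Root>

The only difference from the earlier output is the Text element under Second, which no longer carries the surrounding quotation marks.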