Jitterbit: replace a non-breaking space
07 Oct 2022
Some visitor typed a non-breaking space into their address when filling out a form. 123 Sun St, New York, NY 10001
is just 30 characters long, but as I tried to data-load an address into an Oracle VARCHAR2(30)
-typed column (THE_ADDRESS
) with Jitterbit, Oracle complained: ORA-12899: value too large for column ... (actual: 31, maximum: 30)
. Jitterbit logs showed the actual address submitted in my flat-file source column “address
” as 123 Sun St, New York, NY 10001
.
“Allow truncation of character fields to avoid field length errors” was already checked on the database target side of the Jitterbit transformation … so what was this all about?
I looked up “ 
” and found out it’s the non-breaking space character. Somehow, Jitterbit was reading the non-breaking space character as only 1 character but the database was complaining, “No, that’s 2 characters.”
No worries! It’s just a non-breaking space in a single-line address … it can’t be that big of a deal … I’ll just replace it with an ordinary space.
Here’s my original transformation of data on its way into THE_ADDRESS
– that is to say, there wasn’t anything special about it.
<trans>
address
</trans>
Problem is, “ 
” is merely the way Jitterbit decided to represent the non-breaking space when e-mailing me. It wasn’t clear how to grab it when configuring my transformation of address
into THE_ADDRESS
so that I could replace it with an ordinary space.
In the end, I came up with this: convert the whole address into Hex.
Using Jitterbit’s Script Pad, I was able to validate that the output of this:
<trans>
StringToHex("123 Sun St, New York, NY 10001");
</trans>
Is this:
3132332053756e2053742c204e657720596f726b2c204e59203130303031
But the output of this, when I loaded the file that wasn’t working in as sample data for Jitterbit:
<trans>
StringToHex(address);
</trans>
Was this:
3132332053756e2053742c204e657720596f726b2c204e59c382c2a03130303031
Comparing the two in Notepad++, I was able to see that the difference was “20
” for an ordinary space vs. “c382c2a0
” for the non-breaking one. _(Hint: most of the time, 1 character in “plain” text will be 2 digits of “hexified” text. So I had a pretty good guess counting backwards from the end by 10 digits, chopping off “3130303031”)
Since StringToHex(...)
technically just returns a string made up of numbers, but it’s still a string, I could do Replace()
on it.
And then I could HexToString
it back into 123 Sun St, New York, NY 10001
, which Jitterbit and Oracle agree is only 30 characters.
So the final transformation of data on its way into Oracle’s THE_ADDRESS
field is now:
<trans>
// So there is this one really nuisance address with non-breaking spaces -- we'll just tidy it up here.
// Replace non-breaking spaces with ordinary spaces
spaceclean = HexToString(Replace(StringToHex(address), 'c382c2a0', '20'));
// Return cleaned data
spaceclean;
</trans>