Special Character Handling in Inbound Data Files

How Toolio handles Unicode special characters across inbound data fields, including what's supported, where exceptions apply, and how to avoid matching failures.

Toolio supports Unicode special characters — such as trademark (™), registered trademark (®), copyright (©), and similar symbols — across most inbound data fields. However, a handful of specific contexts treat characters differently. This article explains what's fully supported, where exceptions apply, and how to avoid common mismatches.

Use Cases

  • Your source system exports product names or attributes containing Unicode symbols and you want to confirm they'll import cleanly.

  • You're troubleshooting ID matching failures between two feeds that use different versions of the same value (e.g. a name with a symbol in one feed and without it in another).

  • You're preparing a new data feed and want to understand any field-level restrictions before go-live.

What Is Fully Supported

The following field types accept Unicode special characters with no restrictions. Values are stored exactly as provided.

  • Product titles and variant names: Symbols such as ™, ®, ©, °, and – are allowed and stored as-is.

  • Custom attribute fields (TL_col fields): No character filtering is applied.

  • External IDs for products, variants, locations, and choices: Special characters are preserved; lookups only apply trim and lowercasing.

Toolio's database uses utf8mb4 encoding with utf8mb4_unicode_ci collation, which supports the full Unicode character set. There is no transformation or loss of fidelity between what arrives in your data and what gets stored.

Exceptions to Be Aware Of

ID Matching Is Character-Sensitive

Toolio matches external IDs (product, variant, location, choice) using a sanitize function that trims whitespace and lowercases values — but does not normalize or remove special characters.

This means a product ID with a symbol and the same ID without it are treated as two different records and will not match. Records link correctly only when every feed uses the same characters in its IDs, so either include the symbols in every source system or leave them out of all of them.
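The matching rule can be sketched as follows. This is a minimal illustration, not Toolio's actual implementation: the `sanitize` and `ids_match` functions are hypothetical stand-ins, and the sketch assumes trimming and lowercasing behave like Python's `str.strip` and `str.lower`.

```python
def sanitize(external_id: str) -> str:
    """Hypothetical stand-in for the sanitize step: trim whitespace, then lowercase."""
    return external_id.strip().lower()


def ids_match(a: str, b: str) -> bool:
    """Two external IDs link only if their sanitized forms are identical."""
    return sanitize(a) == sanitize(b)


# Case and surrounding whitespace are ignored.
print(ids_match("  SKU-100® ", "sku-100®"))  # True

# The symbol is significant, so these are treated as different records.
print(ids_match("SKU-100®", "SKU-100"))      # False
```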

Single-Select Option Values Must Match Exactly

For single-select attribute values used in filters or exclusion rules (e.g. promo causal values, location exclusion rules), matching follows the same sanitize logic: case-insensitive and whitespace-trimmed, but character-sensitive.

A value with a symbol in your feed must use the same characters, including the symbol, as the corresponding value in your Options configuration. Only case and surrounding whitespace are ignored. If the characters differ, the values will not match and the record may be skipped or rejected.

Carriage Returns Are Stripped in Some Importers

For the Option Attribute Values, Column Meta, and Import Item importers, carriage returns (\r) are removed from field values before processing. Standard newlines (\n) are kept in product and variant data. If your source system uses Windows-style (\r\n) or classic Mac-style (\r) line endings, multi-line values may be altered slightly, though this rarely affects visible data.
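To see what removing carriage returns does to each line-ending style, here is a small sketch. It assumes the importers simply delete every \r character, as described above; `strip_carriage_returns` is a hypothetical name used only for illustration.

```python
def strip_carriage_returns(value: str) -> str:
    # Delete every carriage return; line feeds are left untouched.
    return value.replace("\r", "")


windows_value = "Line one\r\nLine two"    # Windows-style line ending
classic_mac_value = "Line one\rLine two"  # classic Mac-style line ending

print(repr(strip_carriage_returns(windows_value)))      # 'Line one\nLine two'
print(repr(strip_carriage_returns(classic_mac_value)))  # 'Line oneLine two'
```

Under this assumption, Windows-style values keep their line break, while values that use a bare \r as the line separator end up on a single line.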

Newlines and Single Quotes Are Modified in Mapper Formulas

When import mappers use formula-based column substitution, two transformations are applied:

  • Single quotes (') are escaped as \'

  • Newlines (\n) and carriage returns (\r) are removed

If values containing single quotes or multi-line strings are passed through a mapper formula, the quotes will be escaped and line breaks will be stripped. The stored value will reflect these changes.
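The two transformations can be approximated with the sketch below. The function name `prepare_for_formula` is hypothetical, and the order in which the real mapper applies the two steps is an assumption (the result is the same either way, since the steps touch different characters).

```python
def prepare_for_formula(value: str) -> str:
    # Escape each single quote as \' (a backslash followed by the quote).
    escaped = value.replace("'", "\\'")
    # Remove newlines and carriage returns entirely.
    return escaped.replace("\n", "").replace("\r", "")


print(prepare_for_formula("Men's Tee\nBlue"))  # Men\'s TeeBlue
```

Note that the line break is removed rather than replaced with a space, so adjacent words join together.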

Numeric Fields Reject Non-Numeric Characters

Fields that expect numeric input, such as promo causal lift or discount values, reject values containing non-numeric characters. Sending "10%" instead of a plain number such as 0.10 will cause a validation error.

Use plain numbers for all numeric fields. Percentages should be expressed as decimals (e.g. 0.10 for 10%).
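If your source system emits percentage strings, one option is to convert them before the file is sent. The sketch below is a source-side helper, not part of Toolio: `to_plain_number` is a hypothetical name, and it assumes a trailing % always means "divide by 100".

```python
from decimal import Decimal, InvalidOperation


def to_plain_number(raw: str) -> str:
    """Return a plain decimal string, converting a trailing % to a fraction."""
    text = raw.strip()
    is_percent = text.endswith("%")
    if is_percent:
        text = text[:-1].strip()
    try:
        number = Decimal(text)
    except InvalidOperation:
        raise ValueError(f"not a numeric value: {raw!r}") from None
    # Decimal also parses "NaN" and "Infinity"; treat those as invalid too.
    if not number.is_finite():
        raise ValueError(f"not a numeric value: {raw!r}")
    if is_percent:
        number = number / 100
    # The "f" format avoids scientific notation in the output.
    return format(number, "f")


print(to_plain_number("10%"))   # 0.1
print(to_plain_number("0.10"))  # 0.10
```

Using `Decimal` rather than `float` keeps values such as 0.10 from picking up binary rounding noise when they are written back out.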

Downloaded Filenames Are ASCII-Only

When Toolio generates a filename for a downloadable file, non-ASCII characters are stripped from the filename. Unicode symbols will not appear in the downloaded file's name. The file content itself is unaffected.
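As a rough illustration of the effect, the sketch below drops every non-ASCII character from a name. It assumes characters are deleted rather than replaced; `ascii_only_filename` is a hypothetical name, and Toolio's actual filename logic may differ in detail.

```python
def ascii_only_filename(name: str) -> str:
    # Encoding to ASCII with errors="ignore" silently drops non-ASCII characters.
    return name.encode("ascii", errors="ignore").decode("ascii")


print(ascii_only_filename("Brand™ Report.csv"))  # Brand Report.csv
```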

Quick Reference

| Scenario | Recommendation |
| --- | --- |
| Product titles and names with Unicode symbols | Fully supported; no action needed |
| External IDs with special characters | Supported, but must be identical across all feeds |
| Single-select / filter option values | Ensure Options config and feed values use exactly the same characters |
| Numeric fields (lift, discount, price) | Use numbers only, with no % or other non-numeric symbols |
| CSV encoding | Save as UTF-8; avoid legacy encodings like Windows-1252 or Latin-1 |
| Excel exports | Use "Save as UTF-8 CSV" when exporting from Excel |
| Newlines in mapper formula values | Avoid; they will be stripped during formula substitution |

FAQs

What's the Most Common Cause of Matching Failures When Using Special Characters?

Inconsistency across feeds. If your product master includes a symbol in an ID or name but your sales or inventory feed does not, Toolio will treat them as different records. The fix is to standardize the value — with or without the symbol — across every source system feeding Toolio.
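A pre-flight check on your side can surface these cases before go-live. The sketch below flags feed IDs that fail to match the product master but would match if non-ASCII symbols were ignored. All function names are hypothetical, and `sanitize` only approximates the trim-and-lowercase behaviour described in this article.

```python
def sanitize(value: str) -> str:
    # Approximation of the matching rule: trim whitespace, then lowercase.
    return value.strip().lower()


def strip_symbols(value: str) -> str:
    # For comparison only: drop every non-ASCII character.
    return "".join(ch for ch in value if ch.isascii())


def find_symbol_mismatches(master_ids, feed_ids):
    """Return (feed_id, master_id) pairs that differ only by non-ASCII symbols."""
    master = {sanitize(i) for i in master_ids}
    by_bare = {}
    for m in sorted(master):
        by_bare.setdefault(strip_symbols(m), m)
    mismatches = []
    for raw in feed_ids:
        key = sanitize(raw)
        if key in master:
            continue  # already matches, nothing to report
        candidate = by_bare.get(strip_symbols(key))
        if candidate is not None:
            mismatches.append((raw, candidate))
    return mismatches


mismatches = find_symbol_mismatches(
    ["SKU-100®", "SKU-200"],
    ["sku-100", "SKU-200 ", "SKU-300"],
)
# mismatches == [("sku-100", "sku-100®")]
```

Here "SKU-200 " matches after sanitizing, "SKU-300" has no counterpart at all, and only "sku-100" is reported as a likely symbol mismatch.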

What If Our Source System Exports Garbled Characters Instead of the Expected Symbols?

This typically happens when the exported file is not in UTF-8 encoding. Many source systems default to a legacy encoding (such as Windows-1252) that can't represent Unicode characters correctly. The fix should happen on the source side — ensure your export is configured for UTF-8. Toolio stores whatever it receives and cannot repair encoding issues upstream.
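If the export encoding can't be changed at the source, the file can be converted before it is sent. The sketch below assumes the file really is Windows-1252 (`cp1252` in Python); `reencode_to_utf8` is a hypothetical helper, and guessing the wrong source encoding will produce different garbling rather than fix it.

```python
def reencode_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from a legacy encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# What a Windows-1252 export of the text "Brand™" contains.
legacy_bytes = "Brand™".encode("cp1252")

# Reading those bytes as UTF-8 loses the symbol (it becomes U+FFFD).
garbled = legacy_bytes.decode("utf-8", errors="replace")

# Converting with the correct source encoding preserves it.
repaired = reencode_to_utf8(legacy_bytes).decode("utf-8")
# repaired == "Brand™"
```

Working on raw bytes (for example via `Path.read_bytes` and `Path.write_bytes`) rather than text-mode file handles leaves the file's line endings unchanged during conversion.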

Do Special Characters Affect How Values Are Stored vs. How They're Displayed?

No. Apart from the exceptions described in this article, values are stored as received (after whitespace trimming), and what you send in the file is what appears in Toolio.
