XML Extractor
A Flatfile plugin for parsing XML files, flattening nested structures and attributes, and converting them into tabular format for easy import into Flatfile Sheets.
The XML Extractor plugin processes .xml files uploaded to Flatfile by parsing XML data, flattening its nested structure and attributes, and converting it into a tabular format. The plugin automatically detects the main collection of records within the XML file and transforms each record into a row, making it ideal for importing structured XML data from legacy systems, APIs, or data exports.
Installation
Install the XML Extractor plugin using npm:
Configuration & Parameters
The XML Extractor accepts the following configuration options:
| Parameter | Type | Default | Description |
|---|---|---|---|
| separator | string | "/" | String used to join keys when flattening nested XML elements |
| attributePrefix | string | "#" | Prefix added to keys derived from XML element attributes |
| transform | function | undefined | Function applied to each parsed data row before it is passed to Flatfile |
| chunkSize | number | 10000 | Number of records to process in each batch |
| parallel | number | 1 | Number of chunks to process concurrently |
| debug | boolean | false | Enables verbose logging for debugging |
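As a reading aid, the options in the table can be written down as a TypeScript type with the documented defaults. This is only a sketch: the names `XmlExtractorOptions`, `DEFAULTS`, and `resolveOptions` are hypothetical and are not taken from the plugin's source, which may name or merge its options differently.

```typescript
// Hypothetical type mirroring the configuration table; the plugin's own
// exported type may have a different name or shape.
type Row = Record<string, unknown>;

interface XmlExtractorOptions {
  separator?: string;
  attributePrefix?: string;
  transform?: (row: Row) => Row;
  chunkSize?: number;
  parallel?: number;
  debug?: boolean;
}

// Default values as listed in the table (transform defaults to undefined).
const DEFAULTS = {
  separator: "/",
  attributePrefix: "#",
  chunkSize: 10000,
  parallel: 1,
  debug: false,
};

// Assumed merge strategy: user-supplied options override the defaults.
function resolveOptions(options: XmlExtractorOptions = {}) {
  return { ...DEFAULTS, ...options };
}

const resolved = resolveOptions({ separator: ".", parallel: 2 });
console.log(resolved.separator, resolved.attributePrefix, resolved.chunkSize, resolved.parallel);
```

Options that are not supplied keep the defaults from the table, so only the values that differ need to be listed.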
Default Behavior
By default, the plugin:
- Flattens nested elements using "/" as a separator (e.g., <country><name>USA</name></country> becomes "country/name")
- Prefixes attributes with "#" (e.g., <country code="US"> becomes "country#code")
- Processes files in chunks of 10,000 records with sequential processing
- Automatically infers headers from all unique keys found across records
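The flattening and header-inference rules above can be illustrated with a small self-contained sketch. It does not parse XML and is not the plugin's implementation: `XmlNode`, `flattenNode`, and `inferHeaders` are hypothetical names, and the node shape is an assumed stand-in for an already parsed element.

```typescript
// Minimal stand-in for a parsed XML element (hypothetical shape).
interface XmlNode {
  attributes?: Record<string, string>;
  text?: string;
  children?: Record<string, XmlNode>;
}

// Flatten one record into a single-level row of string values.
function flattenNode(
  node: XmlNode,
  path = "",
  separator = "/",
  attributePrefix = "#",
  out: Record<string, string> = {}
): Record<string, string> {
  // Attributes become "<path><prefix><name>", e.g. "country#code".
  for (const [name, value] of Object.entries(node.attributes ?? {})) {
    out[`${path}${attributePrefix}${name}`] = value;
  }
  // Text content is stored under the element's own path.
  if (node.text !== undefined && path !== "") {
    out[path] = node.text;
  }
  // Child elements extend the path with the separator, e.g. "country/name".
  for (const [name, child] of Object.entries(node.children ?? {})) {
    const childPath = path === "" ? name : `${path}${separator}${name}`;
    flattenNode(child, childPath, separator, attributePrefix, out);
  }
  return out;
}

// Headers are the union of keys across all rows, in first-seen order.
function inferHeaders(rows: Record<string, string>[]): string[] {
  const headers = new Set<string>();
  for (const row of rows) {
    for (const key of Object.keys(row)) headers.add(key);
  }
  return Array.from(headers);
}

const records: XmlNode[] = [
  { children: { country: { attributes: { code: "US" }, children: { name: { text: "USA" } } } } },
  { children: { country: { children: { name: { text: "Canada" } } }, capital: { text: "Ottawa" } } },
];

const rows = records.map((record) => flattenNode(record));
const headers = inferHeaders(rows);
console.log(rows);
console.log(headers); // ["country#code", "country/name", "capital"]
```

Because headers are the union of keys, the second record simply has no value for "country#code", and the first has none for "capital".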
Usage Examples
Basic Usage
Custom Configuration
Using the Standalone Parser
Error Handling
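One way to handle the documented "No root xml object found" error is to wrap the parse step in a try/catch and surface the message to the caller. The sketch below uses a stand-in function, `standInParse`, that only reproduces the documented message; it is hypothetical and does not reflect how the plugin actually locates the root element. `safeParse` and `ParseResult` are likewise illustrative names.

```typescript
type ParseResult =
  | { ok: true; rows: unknown[] }
  | { ok: false; message: string };

// Stand-in for the real parser (hypothetical): it takes an already parsed
// document and throws the documented message when there is no root element.
function standInParse(doc: Record<string, unknown[]>): unknown[] {
  const roots = Object.keys(doc);
  if (roots.length === 0) {
    throw new Error("No root xml object found");
  }
  return doc[roots[0]] ?? [];
}

// Convert a thrown error into a value the caller can inspect.
function safeParse(doc: Record<string, unknown[]>): ParseResult {
  try {
    return { ok: true, rows: standInParse(doc) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, message };
  }
}

const failed = safeParse({});
const succeeded = safeParse({ countries: [{ name: "USA" }, { name: "Canada" }] });
console.log(failed, succeeded);
```

Returning a result object instead of rethrowing keeps the failure reason available for logging or for reporting back to the user who uploaded the file.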
Troubleshooting
Fields Not Appearing as Expected
Check the separator and attributePrefix configuration. The default flattening behavior might not match your desired output format.
Handling Repeated Tags
If tags within a record are repeated (e.g., multiple <email> tags), the extractor creates indexed headers. The first email tag becomes the email field, the second becomes email/0, the third email/1, and so on.
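The naming scheme for repeated tags can be sketched as follows. `assignRepeated` is a hypothetical helper, not a function exported by the plugin, and it assumes the index is joined with the configured separator, as the "email/0" example suggests.

```typescript
// Assign values from repeated tags to row keys: the first keeps the bare tag
// name, the second gets index 0, the third index 1, and so on.
function assignRepeated(
  tag: string,
  values: string[],
  separator = "/"
): Record<string, string> {
  const row: Record<string, string> = {};
  values.forEach((value, i) => {
    const key = i === 0 ? tag : `${tag}${separator}${i - 1}`;
    row[key] = value;
  });
  return row;
}

const emails = assignRepeated("email", ["a@example.com", "b@example.com", "c@example.com"]);
console.log(emails);
// { email: "a@example.com", "email/0": "b@example.com", "email/1": "c@example.com" }
```

Note that the index lags the position by one: "email/0" holds the second address, not the first.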
Debugging Row Data
Use the transform function to inspect intermediate parsed output:
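A minimal sketch of such a transform, assuming it receives one flattened row and must return the row to pass on. `debugTransform` and the `options` object are illustrative; how the options are handed to the plugin is not shown here.

```typescript
type Row = Record<string, unknown>;

// Log each parsed row, then return it unchanged so the import is unaffected.
function debugTransform(row: Row): Row {
  console.log("parsed row:", JSON.stringify(row));
  return row;
}

// Hypothetical options object using the parameter names from the
// configuration table.
const options = { transform: debugTransform, debug: true };

// Simulate what the extractor would do with one flattened record.
const sample: Row = { "country/name": "USA", "country#code": "US" };
const result = options.transform(sample);
```

Returning the row unmodified matters: a transform that logs but returns nothing would likely drop the data it was meant to inspect.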
Notes
- The plugin is designed for server-side environments and should be used in Flatfile listeners deployed on the Platform
- XML files must have a single root element containing an array of similar child elements (records)
- Not suitable for arbitrary or document-style XML with highly irregular structures
- Headers are automatically inferred from all unique keys found across all records
- The primary error condition is an invalid XML structure: if no root element containing records is found, the parser throws "No root xml object found"