What Are Custom Extractors?
Custom extractors are specialized plugins that enable you to handle file formats that aren’t natively supported by Flatfile’s existing plugins. They process uploaded files, extract structured data, and provide that data for mapping into Sheets as Records. This guide covers everything you need to know to build custom extractors.
Common use cases include:
- Legacy system data exports (custom delimited files, fixed-width formats)
- Industry-specific formats (healthcare, finance, manufacturing)
- Multi-format processors (handling various formats in one extractor)
- Binary file handlers (images with metadata, proprietary formats)
Architecture Overview
Core Components
Custom extractors are built using the @flatfile/util-extractor utility, which provides a standardized framework for file processing. Once registered, the extractor listens for the file:created event and processes your files.
Handling Multiple File Extensions
To support multiple file extensions, use a RegExp pattern instead of a string, for example /\.(custom|special)$/i.
Key Architecture Elements
Component | Purpose | Required |
---|---|---|
File Extension | String or RegExp of supported file extension(s) | ✓ |
Extractor Type | String identifier for the extractor type | ✓ |
Parser Function | Core logic that converts file buffer to structured data | ✓ |
Options | Configuration for chunking, parallelization, and customization | - |
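The File Extension component accepts either a string or a RegExp. The sketch below illustrates how those two forms could be compared against an uploaded file name. The matchesExtension helper is hypothetical and written only for illustration; it is not Flatfile's internal matching code.

```typescript
// Hypothetical helper (not part of any Flatfile package): checks whether a
// file name matches a file-extension setting given as a string or a RegExp.
function matchesExtension(fileName: string, fileExt: string | RegExp): boolean {
  if (typeof fileExt === "string") {
    // Accept ".psv" or "psv" and compare case-insensitively.
    const ext = fileExt.startsWith(".") ? fileExt : `.${fileExt}`;
    return fileName.toLowerCase().endsWith(ext.toLowerCase());
  }
  return fileExt.test(fileName);
}

// One RegExp covers several extensions at once.
const pipeFiles = /\.(pipe|psv)$/i;
console.log(matchesExtension("orders.PSV", pipeFiles)); // true
console.log(matchesExtension("orders.csv", pipeFiles)); // false
console.log(matchesExtension("orders.pipe", ".pipe")); // true
```

Anchoring the pattern with $ keeps a name such as orders.psv.bak from matching by accident.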
Data Flow
- File Upload → Flatfile receives file with matching extension
- Event Trigger → file:created event fires
- Parser Execution → Your parser function processes the file buffer
- Data Structuring → Raw data is converted to WorkbookCapture format and provided to Flatfile for mapping into Sheets as Records
- Job Completion → Processing status is reported to user
Getting Started
Remember that custom extractors are powerful tools for handling unique data formats. Start with simple implementations and gradually add complexity as needed.
Prerequisites
Install the required packages. You may also want to review our Coding Tutorial if you haven’t created a Listener yet.
Basic Implementation
Let’s create a simple custom extractor for a pipe-delimited format. It will process files with the .pipe or .psv extension, in which the fields on each line are separated by a pipe character (|).
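A minimal sketch of a parser for such a pipe-delimited file might look like the following. The Cell, SheetCapture, and WorkbookCapture types are local stand-ins with assumed shapes, the sample data is invented, and the sheet name Sheet1 is an arbitrary choice. The function accepts a Uint8Array; Node's Buffer is a subclass of Uint8Array, so a Buffer can be passed in directly.

```typescript
// Local stand-ins for the capture types; assumed shapes, not imported from Flatfile.
type Cell = { value: string };
type SheetCapture = { headers: string[]; data: Record<string, Cell>[] };
type WorkbookCapture = Record<string, SheetCapture>;

// Convert a pipe-delimited file (first line = headers) into a single sheet.
function parsePipeDelimited(buffer: Uint8Array): WorkbookCapture {
  const text = new TextDecoder("utf-8").decode(buffer);
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) {
    return { Sheet1: { headers: [], data: [] } };
  }
  const headers = lines[0].split("|").map((h) => h.trim());
  const data = lines.slice(1).map((line) => {
    const fields = line.split("|");
    const record: Record<string, Cell> = {};
    headers.forEach((header, i) => {
      // Missing trailing fields become empty strings.
      record[header] = { value: (fields[i] ?? "").trim() };
    });
    return record;
  });
  return { Sheet1: { headers, data } };
}

// Invented sample content for illustration.
const sample = "name|email|age\nAda|ada@example.com|36\nAlan|alan@example.com|41\n";
const capture = parsePipeDelimited(new TextEncoder().encode(sample));
console.log(capture.Sheet1.headers.join(",")); // name,email,age
console.log(capture.Sheet1.data[1].email.value); // alan@example.com
```

With the real package, a function like this would be supplied as the parser function, together with a file extension such as /\.(pipe|psv)$/i and an extractor type string, as listed in the Reference section.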
Advanced Examples
Multi-Sheet Parser
Let’s construct an Extractor to handle files that contain multiple data sections. It will process files with the .multi or .sections extension, turning each section into its own sheet.
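The sketch below shows one way a multi-section file could be split into several sheets. The file layout is an assumption made for this example: each section starts with a line such as [customers], followed by a comma-separated header line and data rows. The types are local stand-ins, not Flatfile imports.

```typescript
// Local stand-ins for the capture types; assumed shapes.
type Cell = { value: string };
type SheetCapture = { headers: string[]; data: Record<string, Cell>[] };
type WorkbookCapture = Record<string, SheetCapture>;

// Assumed layout: "[section name]" line, then a header line, then data rows.
function parseSections(text: string): WorkbookCapture {
  const workbook: WorkbookCapture = {};
  let current: SheetCapture | null = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;
    const section = /^\[(.+)\]$/.exec(line);
    if (section) {
      // A new section starts a new sheet.
      current = { headers: [], data: [] };
      workbook[section[1]] = current;
      continue;
    }
    if (current === null) continue; // ignore rows before the first section
    const fields = line.split(",").map((f) => f.trim());
    if (current.headers.length === 0) {
      current.headers = fields; // first line of a section holds the headers
      continue;
    }
    const record: Record<string, Cell> = {};
    current.headers.forEach((header, i) => {
      record[header] = { value: fields[i] ?? "" };
    });
    current.data.push(record);
  }
  return workbook;
}

const sampleText = ["[customers]", "id,name", "1,Ada", "", "[orders]", "id,total", "10,99.50"].join("\n");
const workbook = parseSections(sampleText);
console.log(Object.keys(workbook).join(",")); // customers,orders
console.log(workbook.orders.data[0].total.value); // 99.50
```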
Binary Format Handler
This example processes binary files that contain structured data, using the .bin or .dat extension. Because the format is binary, we can’t easily present a sample import here.
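As a rough illustration, the sketch below reads a made-up binary layout with DataView. The layout is purely hypothetical: four magic bytes "DEMO", a little-endian uint32 record count, then fixed 8-byte records holding a uint32 id and an int32 amount in cents. A real format would need its own offsets and field types.

```typescript
// Local stand-ins for the capture types; assumed shapes.
type Cell = { value: string | number };
type SheetCapture = { headers: string[]; data: Record<string, Cell>[] };
type WorkbookCapture = Record<string, SheetCapture>;

const HEADER_SIZE = 8; // 4 magic bytes + uint32 record count (hypothetical layout)
const RECORD_SIZE = 8; // uint32 id + int32 amountCents

function parseBinary(buffer: Uint8Array): WorkbookCapture {
  if (buffer.byteLength < HEADER_SIZE) {
    throw new Error("File is too short to contain a header");
  }
  const magic = new TextDecoder("utf-8").decode(buffer.subarray(0, 4));
  if (magic !== "DEMO") {
    throw new Error(`Unexpected magic bytes: "${magic}"`);
  }
  // Respect byteOffset so views into a larger ArrayBuffer are read correctly.
  const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  const count = view.getUint32(4, true);
  if (buffer.byteLength < HEADER_SIZE + count * RECORD_SIZE) {
    throw new Error("File is truncated: fewer records than the header declares");
  }
  const data: Record<string, Cell>[] = [];
  for (let i = 0; i < count; i++) {
    const offset = HEADER_SIZE + i * RECORD_SIZE;
    data.push({
      id: { value: view.getUint32(offset, true) },
      amountCents: { value: view.getInt32(offset + 4, true) },
    });
  }
  return { Records: { headers: ["id", "amountCents"], data } };
}

// Build a two-record sample in memory.
const bytes = new Uint8Array(HEADER_SIZE + 2 * RECORD_SIZE);
bytes.set(new TextEncoder().encode("DEMO"), 0);
const writer = new DataView(bytes.buffer);
writer.setUint32(4, 2, true);
writer.setUint32(8, 1, true);
writer.setInt32(12, 1250, true);
writer.setUint32(16, 2, true);
writer.setInt32(20, -300, true);
console.log(parseBinary(bytes).Records.data[1].amountCents.value); // -300
```

Checking the magic bytes and the declared length before reading records lets the parser fail with a clear message instead of an out-of-range read.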
Configuration-Driven Extractor
Create a flexible extractor that can be configured for different formats, handling different delimiters, line endings, and other formatting options. For example, one configuration can process files with the .custom extension while transforming date and amount values, and another can process files with the .pipe or .special extension.
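One possible shape for a configuration-driven parser is a factory that takes the formatting options and returns a parser. Everything named here (ParserConfig, createDelimitedParser, the transform helpers, and the sample date and amount formats) is a hypothetical illustration rather than part of the Flatfile API.

```typescript
// Local stand-ins for the capture types; assumed shapes.
type Cell = { value: string };
type SheetCapture = { headers: string[]; data: Record<string, Cell>[] };
type WorkbookCapture = Record<string, SheetCapture>;

// Hypothetical configuration object for this sketch.
type ParserConfig = {
  delimiter: string;
  lineEnding: string;
  transforms?: Record<string, (raw: string) => string>; // keyed by header name
};

function createDelimitedParser(config: ParserConfig) {
  return (text: string): WorkbookCapture => {
    const lines = text.split(config.lineEnding).filter((line) => line.trim() !== "");
    if (lines.length === 0) {
      return { Sheet1: { headers: [], data: [] } };
    }
    const headers = lines[0].split(config.delimiter).map((h) => h.trim());
    const data = lines.slice(1).map((line) => {
      const fields = line.split(config.delimiter);
      const record: Record<string, Cell> = {};
      headers.forEach((header, i) => {
        const raw = (fields[i] ?? "").trim();
        const transform = config.transforms?.[header];
        record[header] = { value: transform ? transform(raw) : raw };
      });
      return record;
    });
    return { Sheet1: { headers, data } };
  };
}

// Example transforms: MM/DD/YYYY to YYYY-MM-DD, and "$1,234.50" to "1234.50".
const toIsoDate = (raw: string): string => {
  const m = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(raw);
  return m ? `${m[3]}-${m[1]}-${m[2]}` : raw; // leave unrecognized values untouched
};
const toPlainAmount = (raw: string): string => raw.replace(/[$,]/g, "");

const parseCustom = createDelimitedParser({
  delimiter: ";",
  lineEnding: "\r\n",
  transforms: { date: toIsoDate, amount: toPlainAmount },
});
const parsed = parseCustom("date;amount\r\n03/15/2024;$1,234.50\r\n");
console.log(parsed.Sheet1.data[0].date.value); // 2024-03-15
console.log(parsed.Sheet1.data[0].amount.value); // 1234.50
```

Keeping the format details in a config object means a second format, such as a pipe-delimited one, only needs a different config rather than a second parser.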
Reference
API
Parameter | Type | Description |
---|---|---|
fileExt | string or RegExp | File extension to process (e.g., ".custom" or /\.(custom|special)$/i) |
extractorType | string | Identifier for the extractor type (e.g., “custom”, “binary”) |
parseBuffer | ParserFunction | Function that converts Buffer to WorkbookCapture |
options | Record<string, any> | Optional configuration object |
Options
Option | Type | Default | Description |
---|---|---|---|
chunkSize | number | 5000 | Records to process per batch |
parallel | number | 1 | Number of concurrent processing chunks |
debug | boolean | false | Enable debug logging |
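To make the chunkSize and parallel options more concrete, the sketch below shows the general idea of batching records and limiting how many batches are in flight at once. It is a conceptual illustration with hypothetical helper names (toChunks, processInChunks), not a description of how the utility implements these options internally.

```typescript
// Split items into batches of at most `chunkSize`.
function toChunks<T>(items: T[], chunkSize: number): T[][] {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new Error("chunkSize must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    chunks.push(items.slice(i, i + chunkSize));
  }
  return chunks;
}

// Process the batches with at most `parallel` of them running concurrently.
async function processInChunks<T>(
  items: T[],
  chunkSize: number,
  parallel: number,
  handle: (chunk: T[], index: number) => Promise<void>
): Promise<void> {
  const chunks = toChunks(items, chunkSize);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed chunk.
  const worker = async (): Promise<void> => {
    while (next < chunks.length) {
      const index = next++;
      await handle(chunks[index], index);
    }
  };
  const workerCount = Math.max(1, Math.min(parallel, chunks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
}

// 12 records in batches of 5, two batches at a time.
const records = Array.from({ length: 12 }, (_, i) => i + 1);
const sizes: number[] = [];
processInChunks(records, 5, 2, async (chunk, index) => {
  sizes[index] = chunk.length;
}).then(() => console.log(sizes.join(","))); // 5,5,2
```

Smaller chunks lower peak memory per batch, while a higher parallel value trades memory for throughput.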
Parser Function Options
Your parseBuffer function receives additional options beyond what you pass to Extractor:
Option | Type | Description |
---|---|---|
fileId | string | The ID of the file being processed |
fileExt | string | The file extension (e.g., “.csv”) |
headerSelectionEnabled | boolean | Whether header selection is enabled for the space |
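As an illustration of using these options, the sketch below picks a delimiter from fileExt and includes fileId in its error message. The ParserOptions type mirrors the table above but is declared locally, and the delimiter map, the parseByExtension function, and the file ID value are all hypothetical.

```typescript
// Local stand-ins; field names follow the table above, shapes are assumed.
type Cell = { value: string };
type SheetCapture = { headers: string[]; data: Record<string, Cell>[] };
type WorkbookCapture = Record<string, SheetCapture>;
type ParserOptions = { fileId: string; fileExt: string; headerSelectionEnabled: boolean };

// Hypothetical mapping from extension to delimiter.
const DELIMITERS: Record<string, string> = { ".csv": ",", ".psv": "|", ".tsv": "\t" };

function parseByExtension(text: string, options: ParserOptions): WorkbookCapture {
  const delimiter = DELIMITERS[options.fileExt.toLowerCase()];
  if (delimiter === undefined) {
    // Naming the file makes the failure easier to trace.
    throw new Error(`File ${options.fileId}: unsupported extension "${options.fileExt}"`);
  }
  const lines = text.split(/\r?\n/).filter((line) => line !== "");
  if (lines.length === 0) {
    return { Sheet1: { headers: [], data: [] } };
  }
  const headers = lines[0].split(delimiter);
  const data = lines.slice(1).map((line) => {
    const fields = line.split(delimiter);
    const record: Record<string, Cell> = {};
    headers.forEach((header, i) => {
      record[header] = { value: fields[i] ?? "" };
    });
    return record;
  });
  return { Sheet1: { headers, data } };
}

const result = parseByExtension("sku\tqty\nA-1\t3", {
  fileId: "file_123", // made-up ID for the example
  fileExt: ".tsv",
  headerSelectionEnabled: false,
});
console.log(result.Sheet1.data[0].qty.value); // 3
```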
Data Structures
WorkbookCapture Structure
The parser function must return a WorkbookCapture object.
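A minimal sketch of what such an object could look like is shown below. The type definitions are assumptions written for this example (a map from sheet name to headers plus row data) and should be checked against the types exported by the package. The findUnknownKeys helper is hypothetical and only serves as a quick sanity check.

```typescript
// Assumed shapes, declared locally for illustration.
type CellValue = { value: string | number | boolean };
type SheetCapture = { headers: string[]; data: Record<string, CellValue>[] };
type WorkbookCapture = Record<string, SheetCapture>;

// One sheet named "Contacts" with two rows.
const capture: WorkbookCapture = {
  Contacts: {
    headers: ["name", "email"],
    data: [
      { name: { value: "Ada" }, email: { value: "ada@example.com" } },
      { name: { value: "Alan" }, email: { value: "alan@example.com" } },
    ],
  },
};

// Sanity check: list record keys that are missing from the sheet's headers.
function findUnknownKeys(sheet: SheetCapture): string[] {
  const known = new Set(sheet.headers);
  const unknown = new Set<string>();
  for (const record of sheet.data) {
    for (const key of Object.keys(record)) {
      if (!known.has(key)) unknown.add(key);
    }
  }
  return Array.from(unknown);
}

console.log(findUnknownKeys(capture.Contacts).length); // 0
```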
Cell Value Objects
Each cell value should use the Flatfile.RecordData format.
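The sketch below shows the general idea of wrapping raw field values in cell objects keyed by header. The CellValue and RecordData types are simplified local stand-ins, not the Flatfile.RecordData definition itself, and toRecordData is a hypothetical helper.

```typescript
// Simplified local stand-ins for the cell and record shapes.
type CellValue = { value: string };
type RecordData = Record<string, CellValue>;

// Wrap each raw field in a { value } object keyed by its header.
function toRecordData(headers: string[], fields: string[]): RecordData {
  const record: RecordData = {};
  headers.forEach((header, i) => {
    record[header] = { value: fields[i] ?? "" };
  });
  return record;
}

const row = toRecordData(["sku", "qty"], ["A-1", "3"]);
console.log(JSON.stringify(row)); // {"sku":{"value":"A-1"},"qty":{"value":"3"}}
```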
Message Types
Type | Description | UI Effect |
---|---|---|
error | Validation error | Red highlighting, blocks Actions with the hasAllValid constraint |
warning | Warning message | Yellow highlighting, allows submission |
info | Informational message | Mouseover tooltip, allows submission |
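The sketch below illustrates attaching messages to cells and checking whether a record carries any error. The type names are taken from the table above, while the messages field layout and the addMessage and hasErrors helpers are assumptions made for this example; the actual message object may carry additional fields.

```typescript
// Assumed shapes, declared locally; type names follow the Message Types table.
type MessageType = "error" | "warning" | "info";
type CellMessage = { type: MessageType; message: string };
type CellValue = { value: string; messages?: CellMessage[] };
type RecordData = Record<string, CellValue>;

// Return a copy of the cell with one more message attached.
function addMessage(cell: CellValue, type: MessageType, message: string): CellValue {
  return { ...cell, messages: [...(cell.messages ?? []), { type, message }] };
}

// A record counts as invalid here if any cell carries an error message.
function hasErrors(record: RecordData): boolean {
  return Object.keys(record).some((key) =>
    (record[key].messages ?? []).some((m) => m.type === "error")
  );
}

const contact: RecordData = {
  email: addMessage({ value: "" }, "error", "Email is required"),
  phone: addMessage({ value: "555-0100" }, "info", "Formatted from raw input"),
};
console.log(hasErrors(contact)); // true
```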
TypeScript Interfaces
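The declarations below are a sketch reconstructed from the tables in this Reference section. They are local stand-ins under stated assumptions, not the types exported by the package, which may differ. The resolveOptions and describeFile functions are hypothetical and exist only to show the types in use; the default values come from the Options table.

```typescript
// Assumed data shapes.
type CellValue = { value: string | number | boolean };
type SheetCapture = { headers: string[]; data: Record<string, CellValue>[] };
type WorkbookCapture = Record<string, SheetCapture>;

// Options from the Options table; all are optional.
type ExtractorOptions = { chunkSize?: number; parallel?: number; debug?: boolean };

// Options passed to the parser, per the Parser Function Options table.
type ParserOptions = ExtractorOptions & {
  fileId: string;
  fileExt: string;
  headerSelectionEnabled: boolean;
};

// Node's Buffer is a Uint8Array subclass, so Uint8Array is used here.
type ParserFunction = (buffer: Uint8Array, options: ParserOptions) => WorkbookCapture;

// Defaults as listed in the Options table.
const DEFAULT_OPTIONS: Required<ExtractorOptions> = { chunkSize: 5000, parallel: 1, debug: false };

function resolveOptions(options: ExtractorOptions = {}): Required<ExtractorOptions> {
  return { ...DEFAULT_OPTIONS, ...options };
}

// A trivial parser that satisfies ParserFunction: one row describing the file.
const describeFile: ParserFunction = (buffer, options) => ({
  Sheet1: {
    headers: ["fileId", "bytes"],
    data: [{ fileId: { value: options.fileId }, bytes: { value: buffer.byteLength } }],
  },
});

console.log(resolveOptions({ chunkSize: 1000 }).parallel); // 1
```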
Troubleshooting Common Issues
Files Not Processing
Symptoms: Files upload but no extraction occurs
Solutions:
- Verify file extension matches fileExt configuration
- Check Listener is properly deployed and running
- Enable debug logging to see processing details
Parser Errors
Symptoms: Jobs fail with parsing errors
Solutions:
- Add try-catch blocks in parser function
- Validate input data before processing
- Return helpful error messages
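The three suggestions above can be sketched as a small wrapper that validates input, catches failures, and rethrows them with context. The withErrorContext helper and the "json-extractor" label are hypothetical names for this example, and JSON.parse stands in for any parser that may throw.

```typescript
// Wrap a parser so that empty input and parse failures produce clear messages.
function withErrorContext<T>(parse: (text: string) => T, label: string): (text: string) => T {
  return (text: string): T => {
    // Validate input before doing any real work.
    if (text.trim() === "") {
      throw new Error(`${label}: file is empty`);
    }
    try {
      return parse(text);
    } catch (err) {
      // Keep the original reason, but add which extractor failed.
      const reason = err instanceof Error ? err.message : String(err);
      throw new Error(`${label}: failed to parse file (${reason})`);
    }
  };
}

const safeJsonParse = withErrorContext((text) => JSON.parse(text) as unknown, "json-extractor");

try {
  safeJsonParse("{ not valid json");
} catch (err) {
  console.log((err as Error).message.startsWith("json-extractor: failed to parse file")); // true
}
```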
Memory Issues
Symptoms: Large files cause timeouts or memory errors
Solutions:
- Reduce chunk size for large files
- Implement streaming for very large files
- Use parallel processing carefully
Performance Problems
Symptoms: Slow processing, timeouts
Solutions:
- Optimize parser algorithm
- Use appropriate chunk sizes
- Consider parallel processing for I/O-bound operations