Delimiter Extractor Plugin

The Delimiter Extractor plugin parses text files that use non-standard delimiters to separate values. It automatically detects and extracts structured data from these files when they are uploaded to Flatfile, and it supports the single-character delimiters ;, :, ~, ^, and #. Use it when data arrives in custom formats that the Flatfile platform’s standard CSV, TSV, and PSV parsers do not handle natively. The plugin runs in a server-side listener: it triggers on the file:created event, processes the file, identifies headers, and structures the data into records for import.

Installation

Install the plugin using npm:

npm install @flatfile/plugin-delimiter-extractor

Configuration & Parameters

Required Parameters

fileExt

string

required

The file extension (e.g., “.txt”, “.dat”) that the plugin should listen for. This is the first argument to the DelimiterExtractor function.

options.delimiter

string

required

The character used to separate values in the file. Supported delimiters are ;, :, ~, ^, #.

Optional Parameters

options.dynamicTyping

boolean

default: false

If set to true, the plugin will attempt to convert numeric and boolean strings into their corresponding types. For example, “123” becomes 123 and “true” becomes true.
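
The conversion described above can be pictured with a small sketch. The helper name coerceCell and its exact matching rules are assumptions made for illustration; the plugin's own conversion logic may treat edge cases (whitespace, exponents, very large numbers) differently.

```javascript
// Hypothetical helper: illustrates the kind of conversion dynamicTyping describes.
// It is not the plugin's implementation.
function coerceCell(value) {
  if (typeof value !== "string") return value;
  if (value === "true") return true;
  if (value === "false") return false;
  // Only treat plain decimal numerals as numbers; leave everything else as text.
  if (/^-?\d+(\.\d+)?$/.test(value)) return Number(value);
  return value;
}

const row = ["123", "true", "4.5", "abc", ""];
// "123" -> 123, "true" -> true, "4.5" -> 4.5; "abc" and "" stay strings
console.log(row.map(coerceCell));
```

With dynamicTyping left at false, every cell in this row would remain a string.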

options.skipEmptyLines

boolean | 'greedy'

default: false

Controls how empty lines in the file are handled:

true: Skips lines that are completely empty
'greedy': Skips lines that contain only whitespace characters
false: Includes all lines, even empty ones
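
To make the difference between the three modes concrete, here is a minimal sketch. The function filterLines is hypothetical and only mirrors the behavior listed above; it is not taken from the plugin.

```javascript
// Hypothetical helper mirroring the three documented modes; not the plugin's code.
function filterLines(lines, skipEmptyLines) {
  if (skipEmptyLines === true) {
    // Drop only lines that are completely empty.
    return lines.filter((line) => line !== "");
  }
  if (skipEmptyLines === "greedy") {
    // Also drop lines that contain nothing but whitespace.
    return lines.filter((line) => line.trim() !== "");
  }
  // false: keep every line.
  return lines;
}

const lines = ["id;name", "", "   ", "1;Ada"];
console.log(filterLines(lines, false).length);    // 4
console.log(filterLines(lines, true).length);     // 3
console.log(filterLines(lines, "greedy").length); // 2
```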

options.transform

function

default: identity function

A function that is applied to each individual cell value during parsing. The return value of the function will replace the original value. This is applied before dynamicTyping.
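
The ordering matters when a transform changes whether a value looks numeric. The sketch below uses hypothetical stand-ins (transform, toTyped, processCell) to show the order only; the simplified number check is an assumption and is stricter than a real parser might be.

```javascript
// Hypothetical per-cell pipeline: transform first, then type conversion.
const transform = (value) => (typeof value === "string" ? value.trim() : value);

// Simplified stand-in for dynamicTyping (plain decimal numbers only).
const toTyped = (value) =>
  typeof value === "string" && /^-?\d+(\.\d+)?$/.test(value) ? Number(value) : value;

const processCell = (value) => toTyped(transform(value));

console.log(processCell(" 42 ")); // 42 (number): trimmed first, then converted
console.log(toTyped(" 42 "));     // " 42 " (string): this stand-in does not convert padded text
```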

options.chunkSize

number

default: 10000

The number of records to process in each batch or chunk when inserting data into Flatfile.

options.parallel

number

default: 1

The number of chunks to process concurrently.
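
As a rough mental model of how chunkSize and parallel interact, consider the sketch below. The helpers toChunks and processInChunks are hypothetical and written for illustration; the plugin's internal batching may be organized differently.

```javascript
// Hypothetical helpers; the plugin's internal batching may differ.
function toChunks(records, chunkSize) {
  const chunks = [];
  for (let i = 0; i < records.length; i += chunkSize) {
    chunks.push(records.slice(i, i + chunkSize));
  }
  return chunks;
}

// Process chunks with at most `parallel` handlers in flight at a time.
async function processInChunks(records, { chunkSize = 10000, parallel = 1 }, handler) {
  const chunks = toChunks(records, chunkSize);
  let next = 0;
  async function worker() {
    // Each worker repeatedly claims the next unprocessed chunk.
    while (next < chunks.length) {
      const index = next++;
      await handler(chunks[index], index);
    }
  }
  const workers = Array.from({ length: Math.min(parallel, chunks.length) }, worker);
  await Promise.all(workers);
  return chunks.length;
}

// 25 records with chunkSize 10 yield chunks of 10, 10, and 5 records.
const records = Array.from({ length: 25 }, (_, i) => ({ id: i }));
processInChunks(records, { chunkSize: 10, parallel: 2 }, async (chunk, index) => {
  console.log(`chunk ${index}: ${chunk.length} records`);
});
```

Raising parallel can shorten total processing time for large files, at the cost of more concurrent requests; the defaults (10000 and 1) process one chunk at a time.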

options.headerDetectionOptions

object

An advanced configuration object that controls the header detection strategy. It can specify explicit headers, look for headers in specific rows, or select a different detection algorithm. By default, the ‘default’ algorithm is used: it selects the row with the most non-empty cells within the first 10 rows as the header.
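
The default strategy described above can be sketched as follows. The function detectHeaderRow, its rowsToSearch parameter, and its tie-breaking rule are assumptions for illustration only, not the plugin's implementation.

```javascript
// Hypothetical sketch of the documented 'default' strategy; not the plugin's code.
function detectHeaderRow(rows, rowsToSearch = 10) {
  let bestIndex = 0;
  let bestCount = -1;
  rows.slice(0, rowsToSearch).forEach((row, index) => {
    const count = row.filter((cell) => String(cell ?? "").trim() !== "").length;
    // Strict ">" keeps the earliest row on ties (an assumption of this sketch).
    if (count > bestCount) {
      bestCount = count;
      bestIndex = index;
    }
  });
  return { index: bestIndex, headers: rows[bestIndex] ?? [] };
}

const rows = [
  ["Inventory export", "", "", ""],
  ["product_id", "product_name", "quantity", "price"],
  ["1001", "Widget", "", "9.99"],
];
console.log(detectHeaderRow(rows).index); // 1
```

In this sample the title row has one non-empty cell and the data row has three, so the second row (index 1) with four non-empty cells is chosen.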

options.guessDelimiters

string[]

default: [',', '|', '\t', ';', ':', '~', '^', '#']

An array of delimiter characters to try if a specific delimiter is not provided. The parser will use the first one that successfully parses the data.

options.debug

boolean

default: false

Enables debug logging.

Usage Examples

Basic Usage

Configure the listener to use the plugin for any .txt file, specifying that the data is separated by a colon:

import { listener } from "@flatfile/platform";
import { DelimiterExtractor } from "@flatfile/plugin-delimiter-extractor";

listener.use(DelimiterExtractor(".txt", { delimiter: ":" }));

Advanced Configuration

This example shows a more detailed configuration for .data files with type conversion, empty line handling, and value transformation:

import { listener } from "@flatfile/platform";
import { DelimiterExtractor } from "@flatfile/plugin-delimiter-extractor";

const options = {
  delimiter: "#",
  dynamicTyping: true,
  skipEmptyLines: 'greedy',
  transform: (value) => {
    if (typeof value === 'string') {
      return value.toUpperCase();
    }
    return value;
  },
};

listener.use(DelimiterExtractor(".data", options));

Custom Header Detection

This example demonstrates how to use advanced header detection options to explicitly define headers:

import { listener } from "@flatfile/platform";
import { DelimiterExtractor } from "@flatfile/plugin-delimiter-extractor";

const advancedOptions = {
  delimiter: "~",
  headerDetectionOptions: {
    algorithm: 'explicitHeaders',
    headers: ['product_id', 'product_name', 'quantity', 'price'],
    skip: 1 // Skip the first row in the file
  }
};

listener.use(DelimiterExtractor(".inv", advancedOptions));

Direct Buffer Parsing

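For advanced use cases, you can parse a buffer directly with delimiterParser:

import * as fs from 'fs';
import { delimiterParser } from "@flatfile/plugin-delimiter-extractor";

async function parseLocalFile() {
  // Read the whole file into a Buffer
  const fileBuffer = fs.readFileSync('my-data.txt');
  // Use one of the supported delimiters; ';' assumes a semicolon-separated file
  const options = { delimiter: ';', dynamicTyping: true };

  const workbookData = await delimiterParser(fileBuffer, options);
  console.log(workbookData.Sheet1.headers); // detected header names
  console.log(workbookData.Sheet1.data[0]); // first data record
}

// Surface read or parse failures instead of leaving the rejection unhandled
parseLocalFile().catch(console.error);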

Troubleshooting

No Data Appears After Upload

If a file is uploaded but no data appears, check the following:

File Extension: Ensure the file extension matches the one configured in DelimiterExtractor(fileExt, ...)
Delimiter: Verify that the delimiter option matches the actual delimiter used in the file
Empty Files: If the file is empty or contains no parsable data, the plugin will log “No data found in the file” to the console and produce no records
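
When the delimiter is the suspect, a quick local check of the file's first line can help. The helper below is a hypothetical diagnostic, not part of the plugin: it counts how often each documented delimiter appears on the first non-blank line.

```javascript
// Delimiters the documentation lists as supported.
const SUPPORTED_DELIMITERS = [";", ":", "~", "^", "#"];

// Hypothetical diagnostic: count each supported delimiter on the first non-blank line.
function countDelimiters(text) {
  const firstLine = text.split(/\r?\n/).find((line) => line.trim() !== "") ?? "";
  const counts = {};
  for (const delimiter of SUPPORTED_DELIMITERS) {
    counts[delimiter] = firstLine.split(delimiter).length - 1;
  }
  return counts;
}

const sample = "product_id~product_name~quantity\n1001~Widget~3\n";
console.log(countDelimiters(sample)); // "~" occurs twice, every other delimiter zero times
```

If the delimiter configured in DelimiterExtractor shows a count of zero while another one does not, the configuration most likely does not match the file.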

Unsupported File Types Error

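Passing a natively supported extension (.csv, .tsv, or .psv) to DelimiterExtractor throws an error when the extractor is created:

try {
  // This will throw an error
  const csvExtractor = DelimiterExtractor(".csv", { delimiter: "," });
} catch (e) {
  console.error(e.message);
  // -> ".csv is a native file type and not supported by the delimiter extractor."
}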

Notes

Limitations

The plugin explicitly does not support file types that are natively handled by Flatfile: .csv (comma-separated), .tsv (tab-separated), and .psv (pipe-separated)
The list of supported delimiters is fixed to: ;, :, ~, ^, #
This plugin is intended to be run in a server-side listener environment within the Flatfile Platform
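
These limitations can be checked before registering the extractor. The sketch below is a hypothetical pre-flight helper (checkExtractorConfig and its messages are invented for illustration); it encodes only the two lists stated above and is not part of the plugin.

```javascript
// Lists taken from the limitations above.
const NATIVE_EXTENSIONS = [".csv", ".tsv", ".psv"];
const SUPPORTED_DELIMITERS = [";", ":", "~", "^", "#"];

// Hypothetical pre-flight check; returns a list of human-readable problems.
function checkExtractorConfig(fileExt, delimiter) {
  const problems = [];
  if (NATIVE_EXTENSIONS.includes(fileExt.toLowerCase())) {
    problems.push(`${fileExt} is handled natively by Flatfile`);
  }
  if (!SUPPORTED_DELIMITERS.includes(delimiter)) {
    problems.push(`"${delimiter}" is not in the documented delimiter list`);
  }
  return problems;
}

console.log(checkExtractorConfig(".txt", ":")); // no problems reported
console.log(checkExtractorConfig(".csv", ",")); // two problems reported
```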

Error Handling

The main DelimiterExtractor function includes a guard clause that throws an Error if an unsupported native file type is provided
The internal parsing function uses try-catch blocks to handle parsing errors, which are logged to the console and re-thrown, causing the associated Flatfile job to fail

Default Behavior

By default, the plugin does not perform type conversion (dynamicTyping: false)
Empty lines are included in the output unless explicitly configured otherwise (skipEmptyLines: false)
The plugin processes 10,000 records per chunk with no parallel processing (chunkSize: 10000, parallel: 1)
Header detection uses the ‘default’ algorithm, selecting the row with the most non-empty cells within the first 10 rows
