
Handling CSV validation errors before they hit your database
============================================================

Manch Minasyan · April 21, 2026 · 11 min read

 CSV import validation in Laravel is nothing like form validation, and treating it the same way is the root cause of most import failures in production. A form has one user submitting one payload. A CSV file has 10,000 rows, each wrong in its own way, authored by someone who has never seen your database schema.

Row 47 has an empty string where you expected a number. Row 1,203 has "15/04/2026" -- is that April 15th, or the 15th month of... wait, there is no 15th month. European day-first or American month-first? Row 8,412 has "$1.234,56" because the person who exported the file uses German locale conventions. Row 6,000 has "yes" in a boolean column, row 6,001 has "Y", and row 6,002 has "1". All three mean the same thing, but a naive validator rejects two of them.

These are not hypothetical. They are the default state of real-world CSV data. The question is not whether your import will encounter these problems, but what happens when it does.

Three common approaches and why they all fail
---------------------------------------------

Most Laravel CSV import implementations fall into one of three patterns for handling invalid data. Each one solves a small piece of the problem while creating a worse one.

### Validate-and-reject

The simplest approach: run the file through validation, and if any row fails, reject the entire upload.

```
use Illuminate\Support\Facades\Validator;

foreach ($rows as $index => $row) {
    $validator = Validator::make($row, [
        'email' => ['required', 'email'],
        'salary' => ['required', 'numeric'],
        'hired_at' => ['required', 'date'],
    ]);

    if ($validator->fails()) {
        // Bails on the first bad row; the user never sees the rest.
        return back()->withErrors([
            'csv' => "Row {$index}: {$validator->errors()->first()}",
        ]);
    }
}

```

This fails on first error with no context. The user sees "Row 47: The salary field must be a number" and has no idea that rows 48 through 312 also have problems. They fix row 47, re-upload, hit row 48. For a file with 200 bad rows, this becomes a multi-hour loop of re-uploading and guessing. Every round trip forces them out of your application and into a spreadsheet editor.

### Validate-and-log

The more sophisticated version: process the entire file, skip rows that fail validation, and give the user a report at the end. This is how Filament's built-in Import Action handles it -- failed rows are collected and offered as a downloadable CSV.

The user uploads 10,000 contacts. The import finishes. They get a notification: "8,247 rows imported. 1,753 failed. Download failed rows." They download the CSV, fix errors in Excel, and re-upload just the failed rows.

Better than reject-on-first-error, but still painful. The user context-switches to a spreadsheet editor and loses all mapping configuration from the first import. If failed rows include relationship data (a "Company" column that needs to match existing records), the re-upload has no memory of the original match resolution. And between the first import and the re-upload, database state has changed -- what was a valid email on the first pass might now be a duplicate.

### Validate-and-skip

The worst pattern, and unfortunately common: silently skip bad rows and report a summary.

"Imported 8,247 of 10,000 rows." Where did the other 1,753 go? The user has no idea which rows were skipped, why they were skipped, or whether the skipped rows contained critical data. This is silent data loss dressed up as a success message.

In production, users do not notice missing rows until weeks later when someone asks "where is the record for Acme Corp?" and nobody can explain why it vanished during the import.

The better pattern: validate-and-correct
----------------------------------------

The fundamental insight is that validation and execution should be separate steps with a user decision point between them. Instead of validate-then-import or validate-then-reject, the flow becomes: validate, show errors, let the user correct, then execute.

This works in four stages:

1. **Parse and store**: Read the CSV, store every row as raw data in the database. Nothing is imported yet.
2. **Validate per column**: Run validation across all rows for each mapped column. Store errors as structured data alongside each row.
3. **Review and correct**: Show the user a table of rows with inline error indicators. They click on a red cell, correct the value, and the correction is stored separately from the raw data.
4. **Execute**: Process only valid rows (original values plus corrections). Failed rows that the user chose not to fix are skipped with full visibility.

The critical difference from the three failing patterns: the user fixes data inside your application, not in a spreadsheet. They see every error at once, not one at a time. And corrections are tracked separately from the original data, so you always know what changed and why.

### Why column-level validation matters

A naive implementation validates row by row: iterate through 10,000 rows, run each through a validator. For a file with 15 mapped columns, that is 150,000 individual validation calls.

The column-level approach flips this. Instead of validating row 1 across all columns, then row 2 across all columns, you validate all rows for the "email" column, then all rows for the "salary" column. This has two significant advantages.

First, it is parallelizable. Each column's validation is independent -- the email column does not need to know about the salary column. You can dispatch one job per column and run them concurrently as a batch.

Second, and more important: you can validate unique values instead of every row. If 10,000 rows contain only 847 unique email addresses, you validate 847 values instead of 10,000. For columns with low cardinality (a "status" field with 5 possible values), the savings are dramatic.

```
// Fetch only distinct, uncorrected values for this column.
// $columnKey is a JSON path such as '$.email'. Note: json_extract() is
// SQLite/MySQL syntax; PostgreSQL spells it differently. Tapix uses a
// JsonQuery helper that generates the correct SQL for all three engines.
$uniqueValues = ImportRow::where('import_id', $importId)
    ->selectRaw("DISTINCT json_extract(raw_data, ?) as value", [$columnKey])
    ->pluck('value')
    ->filter()
    ->all();

// Validate each unique value once
$validator = new ColumnValidator;

foreach ($uniqueValues as $value) {
    $error = $validator->validate($column, $value);
    $results[] = [
        'raw_value' => $value,
        'validation_error' => $error?->toStorageFormat(),
    ];
}

// Apply results to all rows sharing that value
foreach ($results as $result) {
    // Same note: raw json_extract() shown for clarity; Tapix abstracts
    // this through JsonQuery for cross-database compatibility.
    ImportRow::where('import_id', $importId)
        ->whereRaw("json_extract(raw_data, ?) = ?", [$columnKey, $result['raw_value']])
        ->get()
        ->each(function (ImportRow $row) use ($columnKey, $result) {
            $validation = $row->validation?->toArray() ?? [];

            if ($result['validation_error'] === null) {
                unset($validation[$columnKey]);
            } else {
                $validation[$columnKey] = $result['validation_error'];
            }

            $row->update(['validation' => $validation === [] ? null : $validation]);
        });
}

```

The validation state is stored as a JSON column on each `ImportRow`. Keys are column identifiers, values are error messages. A row with no validation errors has `null` for its validation column. This makes querying trivial: `ImportRow::whereNotNull('validation')` gives you every row with at least one error.

Implementation patterns in Laravel
----------------------------------

### Type-aware validation

Generic Laravel validation rules (`required`, `email`, `numeric`) are a starting point, but CSV data needs type-specific parsing that understands format ambiguity. A `ColumnValidator` should dispatch to specialized validators based on field type:

```
final class ColumnValidator
{
    public function validate(ColumnData $column, string $value): ?ValidationError
    {
        $type = $column->getType();

        return match (true) {
            $type->isDate() => $this->validateDate($column, $value),
            $type->isNumeric() => $this->validateNumber($column, $value),
            $type === FieldType::Boolean => $this->validateBoolean($value),
            default => $this->validateText($column, $value),
        };
    }

    private function validateBoolean(string $value): ?ValidationError
    {
        $normalized = strtolower(trim($value));

        if (in_array($normalized, ['1', '0', 'true', 'false', 'yes', 'no', 'on', 'off'], true)) {
            return null;
        }

        return ValidationError::message(
            'The value must be true or false (accepted: true, false, 1, 0, yes, no).'
        );
    }
}

```

Boolean validation is a good example of the gap between database types and human-authored CSV data. A boolean column in PostgreSQL accepts `true` or `false`. A human typing into a spreadsheet might write "yes", "Y", "1", "on", or "TRUE". Your validator needs to accept all of these and normalize them before they reach the database.
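The normalization half of that job could be sketched like this (a hypothetical `normalizeBoolean` helper, not Tapix's actual API):

```php
/**
 * Normalize a human-authored truthy/falsy string to a real bool,
 * or null when the value is unrecognized.
 * (Illustrative helper; the name and accepted set are assumptions.)
 */
function normalizeBoolean(string $value): ?bool
{
    $normalized = strtolower(trim($value));

    return match ($normalized) {
        '1', 'true', 'yes', 'y', 'on' => true,
        '0', 'false', 'no', 'n', 'off' => false,
        default => null, // unparseable; validation should already have flagged it
    };
}
```

Running validation first means the `null` branch is only ever hit by rows the user explicitly chose not to fix.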

Numbers are even more complex. The string "$1,234.56" is a valid number in US locale, but "1.234,56" is the same number in German locale. The validation layer needs to know which format the user selected during column mapping, then parse accordingly -- stripping currency symbols, thousands separators, and normalizing the decimal separator to a format PHP and the database understand.

Dates carry the most ambiguity. The string "01/02/2024" means January 2nd in American format and February 1st in European format. There is no way to determine the correct interpretation from the value alone. The user must specify the date format during mapping, and the validator uses that context: ISO (`2024-01-02`), European (`02/01/2024`, day first), or American (`01/02/2024`, month first). Each format has multiple parse patterns to handle variations like `d-m-Y`, `d/m/Y`, and `d.m.Y`.
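A parser that uses the mapping context might look like the following sketch (the format names and pattern lists are illustrative, not Tapix's internals):

```php
/**
 * Try each candidate pattern for the user-selected date format until one
 * parses cleanly. PHP's createFromFormat() is lenient (month 15 silently
 * overflows into the next year), so warnings must be checked explicitly.
 */
function parseDate(string $value, string $format): ?DateTimeImmutable
{
    $patterns = match ($format) {
        'iso'      => ['Y-m-d'],
        'european' => ['d/m/Y', 'd-m-Y', 'd.m.Y'],
        'american' => ['m/d/Y', 'm-d-Y', 'm.d.Y'],
        default    => [],
    };

    foreach ($patterns as $pattern) {
        // '!' resets unspecified fields (time) to zero instead of "now".
        $date = DateTimeImmutable::createFromFormat('!' . $pattern, trim($value));
        $errors = DateTimeImmutable::getLastErrors();

        $clean = $errors === false
            || ($errors['warning_count'] === 0 && $errors['error_count'] === 0);

        if ($date !== false && $clean) {
            return $date;
        }
    }

    return null; // no pattern matched: a validation error, not an exception
}
```

Note that "15/04/2026" parses under the European format and fails under the American one, which is exactly the behavior the mapping step should produce.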

### The correction layer

When a user fixes a value in the review step, the correction is stored separately from the raw data. The `ImportRow` model has both a `raw_data` JSON column (the original CSV values, never modified) and a `corrections` JSON column (user-provided fixes, keyed by column).

During execution, the system merges the two: if a correction exists for a column, use it; otherwise use the raw value. This separation matters for auditability -- you can always trace what the original file contained versus what the user changed.
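The merge itself is small; a minimal sketch, assuming `raw_data` and `corrections` have already been decoded to arrays keyed by column:

```php
/**
 * Resolve the final value for each mapped column: a user correction wins,
 * otherwise the original CSV value is used.
 */
function resolveRowValues(array $rawData, array $corrections): array
{
    $resolved = [];

    foreach ($rawData as $columnKey => $rawValue) {
        // array_key_exists (not isset) so an explicit null correction,
        // i.e. "clear this cell", still overrides the raw value.
        $resolved[$columnKey] = array_key_exists($columnKey, $corrections)
            ? $corrections[$columnKey]
            : $rawValue;
    }

    return $resolved;
}
```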

It also matters for re-validation. When the user corrects a date field, the system clears the validation error for that column and re-validates the corrected value. If the correction is also invalid, the error reappears. The user knows immediately whether their fix worked.

Cross-database gotchas that break production imports
----------------------------------------------------

This section exists because of a pattern that catches nearly every team building CSV imports for the first time: it works on their local machine and breaks in production.

### PostgreSQL is strict, SQLite is not

Most Laravel developers run SQLite locally and PostgreSQL in production. SQLite is forgiving to a fault. Insert an empty string into an integer column? SQLite stores it as an empty string. Insert "yes" into a boolean column? SQLite stores the string "yes".

PostgreSQL rejects all of this. An empty string is not a valid integer. An empty string is not a valid boolean. An empty string is not a valid date. Your import passes every test locally and throws `SQLSTATE[22P02]: Invalid text representation` the moment it hits production.

The fix is normalization before database insertion. Every value from a CSV is a string. Before it reaches an INSERT or UPDATE query, typed fields must be cast to the appropriate PHP type, and empty strings must become `null`:

```
public function castValue(mixed $value): mixed
{
    // Empty strings become null so strict engines like PostgreSQL
    // accept them for typed (numeric, boolean, date) columns.
    if ($value === null || $value === '') {
        return null;
    }

    // Numeric fields: parse through format-aware parser
    if ($this->getType()->isNumeric() && is_string($value)) {
        $format = $this->numberFormat ?? NumberFormat::POINT;

        return $format->parse($value); // Returns float or null
    }

    return $value;
}

```

The `NumberFormat::parse()` method strips currency symbols (`$`, `EUR`, `GBP`), removes thousands separators, normalizes the decimal point, and returns a `float` or `null`. The `null` return is important -- it means the value was empty or unparseable, and `null` is valid for nullable numeric columns in every database engine.
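A standalone sketch of what such a parser does (an assumption for illustration, not Tapix's actual `NumberFormat` implementation; `$decimalSeparator` is `'.'` for US-style input and `','` for German-style input):

```php
/**
 * Parse a human-formatted number string into a float, or null when the
 * value is empty or unparseable.
 */
function parseLocalizedNumber(string $value, string $decimalSeparator = '.'): ?float
{
    // Strip currency symbols, currency codes, and surrounding whitespace.
    $value = trim(str_replace(['$', '€', '£', 'EUR', 'GBP', 'USD'], '', $value));

    if ($value === '') {
        return null; // empty cell: valid for nullable numeric columns
    }

    if ($decimalSeparator === ',') {
        // German-style: '.' is the thousands separator, ',' the decimal point.
        $value = str_replace('.', '', $value);
        $value = str_replace(',', '.', $value);
    } else {
        // US-style: ',' is the thousands separator.
        $value = str_replace(',', '', $value);
    }

    return is_numeric($value) ? (float) $value : null;
}
```

With this shape, "$1,234.56" and "1.234,56" both normalize to the same float once the user has told the mapper which convention the file uses.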

### MySQL's silent truncation

MySQL has its own trap. With `STRICT_TRANS_TABLES` disabled (still common in older configurations), MySQL silently truncates values that do not fit. A 300-character string in a `varchar(255)` column becomes 255 characters without error. "not-a-date" in a date column becomes `0000-00-00`. Validation errors that PostgreSQL catches loudly get swallowed silently. Your validation layer must catch these before the query, regardless of which database engine is running.
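A length guard in the application layer closes this hole regardless of the server's `sql_mode`. A minimal sketch, assuming the maximum length comes from the column mapping (the helper name and default are illustrative):

```php
/**
 * Return a validation error message when a value would be silently
 * truncated by a varchar column, or null when it fits.
 */
function validateLength(string $value, int $maxLength = 255): ?string
{
    // mb_strlen counts characters, not bytes, matching what the user sees.
    if (mb_strlen($value) > $maxLength) {
        return sprintf(
            'The value is %d characters long but the column allows at most %d.',
            mb_strlen($value),
            $maxLength
        );
    }

    return null; // no error
}
```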

### The practical impact

In a cross-database compatible import system, every value must pass through a normalization pipeline:

1. **Empty string to null**: `"" -> null` for all typed fields (numeric, boolean, date, datetime)
2. **Format parsing**: Currency symbols stripped, decimal separators normalized, date formats resolved
3. **Type casting**: String values cast to PHP `float`, `bool`, `Carbon`, or left as `string`
4. **Null filtering**: On the create path, null values for non-required fields are stripped from the attribute array entirely, letting the database default take effect

This pipeline runs after validation and before execution. Validation catches values that cannot be normalized. The pipeline normalizes everything that can be.
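Condensed into one function, the four steps might look like this (field type labels and plain casts are simplifications for illustration; real parsers are format-aware):

```php
/**
 * Minimal sketch of the normalization pipeline for one row on the
 * create path. $types maps column keys to 'numeric', 'boolean', or 'string'.
 */
function normalizeRow(array $values, array $types): array
{
    $attributes = [];

    foreach ($values as $key => $value) {
        $type = $types[$key] ?? 'string';

        // Step 1: empty string to null for all typed fields.
        if ($value === '' && $type !== 'string') {
            $value = null;
        }

        // Steps 2-3: format parsing and type casting (simplified).
        $value = match (true) {
            $value === null     => null,
            $type === 'numeric' => (float) str_replace(',', '', $value),
            $type === 'boolean' => in_array(strtolower(trim($value)), ['1', 'true', 'yes', 'on'], true),
            default             => $value,
        };

        // Step 4: drop nulls on the create path so database defaults apply.
        if ($value !== null) {
            $attributes[$key] = $value;
        }
    }

    return $attributes;
}
```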

Building this yourself vs. using it off the shelf
-------------------------------------------------

The validate-and-correct pattern is not conceptually difficult. The difficulty is in the surface area: JSON-backed row storage, parallel column validation jobs, format-aware parsers for numbers and dates, a review UI with inline editing, a correction flow with re-validation, cross-database normalization, and a JSON query abstraction layer (because `json_extract()`, `->>'key'`, and `jsonb_extract_path_text()` are all different syntax). Each piece is 50-200 lines. Together they take weeks to build and months to harden.

For more on why Tapix takes this approach, see [Why we're building Tapix](/blog/why-we-are-building-tapix). For the full import pipeline, see the [complete guide to CSV imports in Laravel](/blog/complete-guide-csv-imports-laravel).

Row 8,412 with German locale numbers doesn't have to be anyone's problem. That is what the pipeline handles.

Tapix handles per-column validation, inline correction, and cross-database normalization out of the box. The wizard's review step shows every validation error grouped by column, lets users edit values in place, and re-validates corrections before execution begins. [Check it out](/).

