
Importing relational data from CSV files in Laravel
===================================================

Manch Minasyan · April 24, 2026 · 13 min read

 The easy part of CSV importing is flat fields. First name goes into `first_name`, email goes into `email`, done. The hard part starts the moment a column references something that lives in another table. A "Company" column that needs to become a `company_id` foreign key. A "Tags" column with comma-separated values that map to a polymorphic many-to-many relationship. A "Category" field pointing to a record that might not exist yet.

This post walks through the real patterns you need: BelongsTo lookups, MorphToMany tag resolution, the match-or-create decision, match priority strategies, and multi-tenant scoping. Each section includes code you can adapt. If you are new to CSV imports in Laravel, [The complete guide to CSV imports in Laravel](/blog/complete-guide-csv-imports-laravel) covers the full landscape of approaches first.

[\#](#the-problem-csv-is-flat-databases-are-not "Permalink")The problem: CSV is flat, databases are not
-------------------------------------------------------------------------------------------------------

A CSV file is a grid. Every value is a string in a cell. There are no foreign keys, no join tables, no polymorphic relationships. But the database you are importing into has all of those things.

Consider a contacts CSV:

```
First Name,Last Name,Email,Company,Tags
Jane,Doe,jane@example.com,Acme Corp,"laravel, php, backend"
John,Smith,john@example.com,Acme Corp,"laravel, devops"
Sarah,Chen,sarah@example.com,Globex Inc,"php, frontend"

```

Three columns here reference other tables:

- **Company** is a BelongsTo relationship. "Acme Corp" needs to resolve to a row in the `companies` table and its `id` stored as `company_id` on the contact.
- **Tags** is a MorphToMany relationship. "laravel, php, backend" needs to be split into individual tags, each resolved or created in the `tags` table, then attached via a pivot table.
- Both Jane and John reference "Acme Corp". Both Jane and Sarah reference "php". The import needs to create each related record once, not once per row.

Every one of these is a lookup, a potential creation, and a deduplication problem. And they all interact: you cannot insert the contact row until you have resolved the company ID, and you cannot attach tags until the contact row exists.
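To make the flatness concrete: parsing that file with PHP's built-in CSV functions yields nothing but strings. A quick sketch using `str_getcsv` on the first data row:

```php
<?php

$csv = <<<CSV
First Name,Last Name,Email,Company,Tags
Jane,Doe,jane@example.com,Acme Corp,"laravel, php, backend"
CSV;

$lines  = explode("\n", $csv);
$header = str_getcsv(array_shift($lines));
$row    = array_combine($header, str_getcsv($lines[0]));

// Every value is a plain string -- "Acme Corp" is not a company_id,
// and the tags are one undifferentiated string, not three records.
echo $row['Company'], "\n"; // Acme Corp
echo $row['Tags'], "\n";    // laravel, php, backend
```

Everything downstream of this point -- lookups, creation, deduplication -- is about turning those strings into IDs and pivot rows.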

[\#](#simple-case-belongsto-lookup "Permalink")Simple case: BelongsTo lookup
----------------------------------------------------------------------------

The most common relationship in CSV imports is BelongsTo. A column contains a human-readable name, and you need the corresponding foreign key.

Here is how most developers first write this, typically inside a Laravel Excel `ToModel` import:

```
use App\Models\Contact;
use App\Models\Company;
use Maatwebsite\Excel\Concerns\ToModel;
use Maatwebsite\Excel\Concerns\WithHeadingRow;

class ContactImport implements ToModel, WithHeadingRow
{
    public function model(array $row): Contact
    {
        $company = Company::where('name', $row['company'])->first();

        return new Contact([
            'first_name'  => $row['first_name'],
            'last_name'   => $row['last_name'],
            'email'       => $row['email'],
            'company_id'  => $company?->id,
        ]);
    }
}

```

This works when every company in the CSV already exists in the database. But what happens when it does not?

You have three options, and the first two are bad:

**Option A: Skip the row.** If "Globex Inc" does not exist, set `company_id` to null or skip the entire contact. This causes silent data loss. The user uploaded 500 contacts and only 430 made it. The 70 that referenced unknown companies vanished without explanation.

**Option B: Always create.** Use `firstOrCreate` instead of `first`. Now every unknown company name becomes a new company record. This sounds reasonable until the CSV has typos. "Acme Corp", "Acme Corp.", "ACME Corp", and "acme corp" all become separate companies. Your companies table fills with garbage that someone has to clean up manually.

**Option C: Let the user decide.** Show the user which company names matched and which did not. Let them choose per-relationship whether unmatched values should be created, skipped, or corrected before the import runs. This is the right answer, and it is significantly harder to build.

Option C requires that relationship resolution happens as a discrete step, not inline during row insertion. The import system needs to collect all unique relationship values from the file, attempt to match them against existing records, present the results to the user, and then execute based on their decision. This is why a multi-step import wizard exists -- a single-pass `model()` method cannot support this workflow.
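A rough sketch of that first pass, in plain PHP with no framework dependencies (the `$rows` array and `collectRelationshipValues` name are illustrative, standing in for your parsed CSV and whatever the wizard calls this step):

```php
<?php

// Collect the unique, normalized values for each relationship column
// so they can be matched in bulk and shown to the user for review.
function collectRelationshipValues(array $rows, array $relationshipColumns): array
{
    $unique = [];

    foreach ($rows as $row) {
        foreach ($relationshipColumns as $column) {
            $value = trim($row[$column] ?? '');

            if ($value === '') {
                continue;
            }

            // Key by lowercase so "Acme Corp" and "acme corp" collapse,
            // but keep the first-seen original casing for display.
            $unique[$column][strtolower($value)] ??= $value;
        }
    }

    return $unique;
}

$rows = [
    ['company' => 'Acme Corp'],
    ['company' => 'acme corp'],
    ['company' => 'Globex Inc'],
];

$values = collectRelationshipValues($rows, ['company']);
// $values['company'] holds two entries: "Acme Corp" and "Globex Inc".
```

The output of this pass is what the review step renders: one row per unique value, annotated with whether it matched an existing record.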

[\#](#complex-case-morphtomany-and-tags "Permalink")Complex case: MorphToMany and tags
--------------------------------------------------------------------------------------

Tags are the second most common relationship pattern in CSV imports, and they compound every problem from the BelongsTo case.

A typical tags column looks like: `"laravel, php, backend"`. To import this, you need to:

1. Split the string on commas
2. Trim whitespace from each value
3. Normalize for matching (lowercase, strip extra spaces)
4. Look up each tag in the database
5. Create any that do not exist (if the import is configured to allow that)
6. Attach all of them to the contact via the pivot table
7. Do this after the contact record exists, since you need its ID for the pivot

Here is a manual implementation:

```
public function afterModel(Contact $contact, array $row): void
{
    if (empty($row['tags'])) {
        return;
    }

    $tagNames = array_map('trim', explode(',', $row['tags']));
    $tagIds = [];

    foreach ($tagNames as $name) {
        if ($name === '') {
            continue;
        }

        $tag = Tag::firstOrCreate(
            ['name' => strtolower($name)],
        );

        $tagIds[] = $tag->id;
    }

    $contact->tags()->syncWithoutDetaching($tagIds);
}

```

This works for small files. It falls apart at scale for one reason: deduplication.

If your CSV has 500 rows and 400 of them include "laravel" in the tags column, the code above runs `Tag::firstOrCreate(['name' => 'laravel'])` four hundred times. The first call creates the tag. The next 399 each run a SELECT query, find the existing row, and return it. That is 399 unnecessary database queries.

The fix is an in-memory cache that tracks what you have already resolved within the current import:

```
private array $tagCache = [];

private function resolveTag(string $name): int
{
    $cacheKey = "tag:" . strtolower(trim($name));

    if (isset($this->tagCache[$cacheKey])) {
        return $this->tagCache[$cacheKey];
    }

    $tag = Tag::firstOrCreate(
        ['name' => strtolower(trim($name))],
    );

    $this->tagCache[$cacheKey] = $tag->id;

    return $tag->id;
}

```

The cache key uses a `{linkKey}:{normalized_name}` pattern. The `linkKey` prefix (here `tag:`) prevents collisions when you resolve multiple relationship types in the same import. The normalized name strips whitespace and lowercases so that "Laravel", " laravel ", and "LARAVEL" all hit the same cache entry.

This pattern is essential for any import that touches relationship tables. Without it, an import of 10,000 rows with a tags column generates tens of thousands of redundant queries. With it, you issue one query per unique tag value in the entire file.
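Stripped of the Eloquent call, the same cache generalizes to any relationship type. A sketch with the lookup injected as a callable, so it runs without a database (the callable stands in for `firstOrCreate`; `ResolutionCache` is an illustrative name, not a package class):

```php
<?php

// Generic per-import resolution cache. $resolve receives the normalized
// value and returns an ID -- in a real import it would wrap firstOrCreate.
class ResolutionCache
{
    private array $cache = [];

    public function resolve(string $linkKey, string $value, callable $resolve): int
    {
        $normalized = strtolower(trim($value));
        $cacheKey   = "{$linkKey}:{$normalized}";

        // One lookup per unique value per relationship type, no matter
        // how many rows reference it.
        return $this->cache[$cacheKey] ??= $resolve($normalized);
    }
}

$calls  = 0;
$cache  = new ResolutionCache();
$lookup = function (string $name) use (&$calls): int {
    $calls++;            // would be Tag::firstOrCreate(...)->id
    return crc32($name); // stand-in for a database ID
};

$cache->resolve('tag', 'Laravel', $lookup);
$cache->resolve('tag', ' laravel ', $lookup);   // cache hit, no query
$cache->resolve('company', 'Laravel', $lookup); // different linkKey, separate entry
// $calls is 2: the second "laravel" never reached the lookup.
```

Injecting the lookup also makes the cache trivial to unit test, which matters once the same cache instance is shared between a tags resolver and a company resolver in one import.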

[\#](#the-match-or-create-decision "Permalink")The match-or-create decision
---------------------------------------------------------------------------

The examples above hardcode the behavior: `firstOrCreate` always creates missing records, `first` never does. But in production, the right behavior depends on the data and the user's intent.

Consider three scenarios for a "Company" column during a contact import:

1. **Importing into a mature CRM.** Most companies already exist. You want to link contacts to existing companies and flag any that do not match. Creating new companies from CSV data would introduce unverified records.
2. **Seeding a new application.** Nothing exists yet. Every company name in the CSV should become a new company record.
3. **Ongoing data sync.** Some companies exist, some are new. You want to match existing ones and create the rest.

These map directly to three behaviors:

- **MatchOnly** -- Only look up existing records. If no match is found, leave the relationship empty or skip the row entirely. The imported data never creates side effects in related tables.
- **MatchOrCreate** -- Look up first. If a match exists, link to it. If not, create a new record and link to that. This is the most common behavior for user-facing imports.
- **Create** -- Skip the lookup entirely. Always create a new related record. Useful for one-time bulk seeding where you know nothing exists yet.

The key insight is that this decision belongs to the user, not the developer. A well-designed import system exposes it as a configuration choice per relationship, not a hardcoded behavior buried in an import class.
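One way to encode those three behaviors is a plain enum plus a resolver that takes the lookup and creation steps as callables. This is a sketch of the concept, not a real package API -- the names `MatchBehavior` and `resolveWithBehavior` are illustrative:

```php
<?php

enum MatchBehavior
{
    case MatchOnly;
    case MatchOrCreate;
    case Create;
}

// Resolve one relationship value according to the configured behavior.
// $find and $create are callables returning an ID (or null for $find).
function resolveWithBehavior(
    MatchBehavior $behavior,
    string $value,
    callable $find,
    callable $create,
): ?int {
    return match ($behavior) {
        MatchBehavior::MatchOnly     => $find($value),                    // null if unmatched
        MatchBehavior::MatchOrCreate => $find($value) ?? $create($value),
        MatchBehavior::Create        => $create($value),                  // never looks up
    };
}

$existing = ['acme corp' => 1];
$find     = fn (string $v): ?int => $existing[strtolower($v)] ?? null;
$create   = fn (string $v): int => 99; // stand-in for an INSERT returning the new ID

resolveWithBehavior(MatchBehavior::MatchOnly, 'Globex Inc', $find, $create);     // null
resolveWithBehavior(MatchBehavior::MatchOrCreate, 'Globex Inc', $find, $create); // 99
resolveWithBehavior(MatchBehavior::MatchOrCreate, 'Acme Corp', $find, $create);  // 1
```

Because the behavior is a value rather than hardcoded logic, it can come from user input in the mapping step and be stored on the import record alongside the column mapping.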

[\#](#match-resolution-strategies "Permalink")Match resolution strategies
-------------------------------------------------------------------------

When you do perform a lookup, the next question is: what field do you match on?

Matching a company by name seems obvious until you realize that names are the least reliable identifier. "Acme Corp" and "Acme Corporation" are the same company. "ABC Inc." might appear twice in your database with different addresses. Names are ambiguous by nature.

Better identifiers exist, and they have a natural priority order:

| Match field | Priority | Why |
| --- | --- | --- |
| Record ID | 100 | Deterministic. If the CSV includes the database ID, there is no ambiguity. |
| Email | 90 | Nearly unique in practice. Best general-purpose identifier for people and organizations. |
| Domain | 80 | Good for companies. Extract from email or use a dedicated column. |
| Phone | 70 | Useful but format-sensitive. "+1 (555) 123-4567" and "5551234567" must normalize to the same value. |
| Name | 10 | Last resort. Ambiguous, typo-prone, and format-inconsistent. |

In code, this priority system looks like:

```
public function matchableFields(): array
{
    return [
        MatchableField::id(),                                           // priority: 100
        MatchableField::email('email', MatchBehavior::MatchOrCreate),   // priority: 90
        MatchableField::domain('domain', MatchBehavior::MatchOrCreate), // priority: 80
        MatchableField::phone('phone', MatchBehavior::MatchOrCreate),   // priority: 70
        MatchableField::name(),                                         // priority: 10
    ];
}

```

The system tries each matcher in priority order. If the CSV row contains an ID column and the ID matches an existing record, that match wins regardless of whether the name or email also appears in the data. If there is no ID, it falls back to email, then domain, and so on.

Each `MatchableField` carries its own `MatchBehavior`, which means you can have different create-or-skip policies per field. Matching by ID is always `MatchOnly` -- if you provide an ID and it does not exist, something is wrong and you should not create a record with a potentially conflicting primary key. Matching by email might be `MatchOrCreate` because a new email legitimately represents a new contact. Matching by name is typically `Create` because names alone are too ambiguous for reliable lookups.

This priority-based approach lets the import system make the best decision it can with whatever data the CSV provides, without requiring the user to specify which column to match on every time.
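A minimal sketch of that fallback loop, with matchers as priority/callable pairs. This is an illustration of the idea, not the actual `MatchableField` internals:

```php
<?php

// Each matcher maps a row to an existing record ID, or null if it
// cannot match. Matchers are tried from highest priority down; the
// first hit wins, exactly like the ID -> email -> domain fallback.
function matchRecord(array $row, array $matchers): ?int
{
    usort($matchers, fn (array $a, array $b): int => $b['priority'] <=> $a['priority']);

    foreach ($matchers as $matcher) {
        $id = ($matcher['match'])($row);
        if ($id !== null) {
            return $id; // highest-priority hit wins
        }
    }

    return null; // no matcher succeeded: unmatched row
}

$byEmail = ['jane@example.com' => 7];

$matchers = [
    ['priority' => 10, 'match' => fn (array $r): ?int => null], // name: no match here
    ['priority' => 90, 'match' => fn (array $r): ?int => $byEmail[$r['email'] ?? ''] ?? null],
];

matchRecord(['email' => 'jane@example.com'], $matchers); // 7, via the email matcher
```

A `null` result is what feeds the match-or-create decision from the previous section: only rows that fall all the way through the matcher list are candidates for creation.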

[\#](#multi-tenant-complications "Permalink")Multi-tenant complications
-----------------------------------------------------------------------

Everything above assumes a single-tenant application. In a multi-tenant system, every lookup must be scoped.

When you search for a company named "Acme Corp", you are not searching the entire `companies` table. You are searching for "Acme Corp" within the current tenant's companies. A different tenant might have their own "Acme Corp" that is a completely different organization.

This scoping must be airtight across the entire import pipeline:

- **During mapping**, when the system previews match results for relationship columns
- **During validation**, when it checks whether referenced records exist
- **During execution**, when it performs `firstOrCreate` or links to existing records
- **In queue jobs**, where the original HTTP request's tenant context no longer exists

The queue job point is where most implementations break. The user uploads a file in a web request scoped to Tenant A. The import job runs minutes later in a queue worker that has no tenant context. If the job does not explicitly restore the tenant scope before running relationship lookups, it queries across all tenants. At best, you get wrong matches. At worst, you leak data between tenants.

Solving this requires the tenant context to be serialized when the import is created and restored when each job runs. The import record itself carries the tenant identifier, and every job reads it before executing any queries.
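The invariant itself is simple to state in code. A framework-free sketch, with an array standing in for the `companies` table, showing that the same name resolves to different records under different tenants:

```php
<?php

// Every lookup carries the tenant ID from the import record. Without
// the tenant_id check, tenant 1 could match tenant 2's "Acme Corp".
function findCompany(array $companies, int $tenantId, string $name): ?int
{
    foreach ($companies as $company) {
        if ($company['tenant_id'] === $tenantId
            && strtolower($company['name']) === strtolower(trim($name))) {
            return $company['id'];
        }
    }

    return null;
}

$companies = [
    ['id' => 1, 'tenant_id' => 1, 'name' => 'Acme Corp'],
    ['id' => 2, 'tenant_id' => 2, 'name' => 'Acme Corp'], // same name, different org
];

findCompany($companies, 1, 'Acme Corp'); // 1 -- never tenant 2's record
findCompany($companies, 2, 'Acme Corp'); // 2
findCompany($companies, 3, 'Acme Corp'); // null -- unknown tenant matches nothing
```

In a real Laravel app the `$tenantId` parameter is the value the queue job reads off the import record before issuing any Eloquent queries; global scopes help, but only if the job re-establishes the context they depend on.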

[\#](#why-this-is-the-hardest-part-of-csv-imports "Permalink")Why this is the hardest part of CSV imports
---------------------------------------------------------------------------------------------------------

Flat field imports are stateless. Each row is independent. You validate it, insert it, move on. The row before and the row after do not matter.

Relationship imports are stateful. Row 1 creates "Acme Corp" as a new company. Rows 2 through 50 need to reference that same company, not create 49 duplicates. The deduplication cache from earlier in this post is one piece of that state. Match resolution results are another. The user's create-or-skip decisions per relationship type are yet another.

This state has to survive across:

- **Rows within a chunk** -- the deduplication cache must persist within a processing batch
- **Chunks within a job** -- if you process 500 rows per chunk, the cache from chunk 1 must be available in chunk 2
- **Queue job restarts** -- if a job fails and retries, previously created related records must not be duplicated
- **Database transactions** -- creating a company and linking a contact to it should be atomic; if the contact insert fails, the orphaned company is a problem

Then add validation. Before the import runs, you want to show the user which company names matched and which did not. That requires a separate resolution pass that queries the database for every unique relationship value in the file, batched and cached, and presents the results in a review step. This is fundamentally different from validating flat fields, which is just running Laravel's `Validator` on individual values.

And then add multi-tenancy. And MorphToMany. And the fact that a single row might have both a BelongsTo company and MorphToMany tags, each with their own match behavior and deduplication cache. The combinatorial complexity is why relationship handling accounts for more code than every other part of a CSV import system combined.

This is also the problem that motivated building Tapix -- see [Why we're building Tapix](/blog/why-we-are-building-tapix) for the full context.

[\#](#the-declarative-alternative "Permalink")The declarative alternative
-------------------------------------------------------------------------

The manual approaches above work. They also require hundreds of lines of carefully coordinated code per importer, and every new relationship type means writing more of it.

The alternative is to declare the relationship configuration on the field definition and let the import system handle resolution, deduplication, and user-facing match behavior automatically:

```
ImportField::make('company')
    ->label('Company')
    ->guess(['company', 'company name', 'organization', 'org'])
    ->relationship(
        name: 'company',
        model: Company::class,
        matchBy: ['name'],
        behavior: MatchBehavior::MatchOrCreate,
    ),

```

Four lines. The field declares that it maps to a `company` relationship on the `Company` model, matches by the `name` column, and uses `MatchOrCreate` behavior. The import system takes it from there: it collects unique company names from the file, queries the database for matches, presents unmatched values to the user, creates new records if configured to, caches resolved IDs for deduplication across rows, and links everything during execution.

For BelongsTo relationships, the resolved ID is stored as a foreign key on the imported record. For MorphToMany relationships, records are attached via the pivot table after the main record is saved. The storage strategy is determined by the relationship type -- you declare it once and do not think about it again.

Combined with `matchableFields()` for controlling match priority:

```
public function matchableFields(): array
{
    return [
        MatchableField::id(),
        MatchableField::email('email', MatchBehavior::MatchOrCreate),
    ];
}

```

This gives the import system everything it needs to resolve relationships correctly, across rows, across chunks, across queue jobs, and across tenants.

---

Relationship resolution is the part of CSV importing that separates toy implementations from production-ready systems. If your imports involve relationships, multi-tenancy, and the match-or-create decision, and you would rather not build the infrastructure yourself, take a look at [tapix.dev](/) to see how it handles these problems. For inline validation before any data touches the database, see [Handling CSV validation errors before they hit your database](/blog/handling-csv-validation-errors).

