Cleaning up Laravel app database duplicates - part 1

June 3, 2024

  • Backend

  • Laravel

  • Database

  • Cleanup

In this article I'm going to handle the case when a database has overgrown because of duplicates and dump entries. In this article I'm going to create a handy migration script to find unwanted rows, remove them and replace foreign keys to point to correct places.

đź’ˇ Remember to do a database backup first

The case

Let's imagine that in our application users can create configurations. Unfortunately, it turned out that they keep repeating the job over and over… whoops, our bad!

Search and hash

Let's start with creating proper query in a migration script. The relations are Users → Configurations → Themes and Users → Themes, so users can create themes and assign them to active configurations.

Now we will search for duplicate theme values and organize them in easy-to-manage collections.

  1. Fetch and iterate users, because we want to compare Themes in scope of given User.
return new class extends Migration
{
    public function up(): void
    {
        User::with('themes')->chunk(100, $this->processUsers($users));
    }
}
  1. Generate hash strings for theme fields to easily compare them. By using Laravel model's only method, we want to have only specified fields, because there may be fields like images which cannot be treated as a text value and we don't want to process them. We'll learn about images in next article - 🔗 Cleaning up Laravel application database - part 2 - images.
private function processUsers(Collection $users): void
{
    foreach ($users as $user) {
        $toCompareByText = collect();
        foreach ($user->themes as $theme) {
            $theme->textHash = md5(serialize(
                $theme->only(['heading', 'paragraph', 'primary_color', 'secondary_color', 'font_family', 'base_font_size'])
            ));
            $toCompareByText->push($theme);
        }
    }
}

Grouping is the key

Grouping by generated hash

Let's put themes into groups - collections of themes having all values in common. One check if a group contains duplicates and we can do further processing in cleanUpCollection function which will do the changes in the database.

$toCompareByText->groupBy('textHash')
    ->each(function ($sameTextGroup) {
        if ($sameTextGroup->count() < 2) {
            return; // No duplicates
        }
        $this->cleanUpCollection($sameTextGroup);
    });

Clean up the mess

Finally we will reassign themes which are going to stay in the database and remove unwanted rows. The shift method called on the collection will simply extract a survivor from the set. Survivor is going to stay, others are free to get deleted, because they're duplicates of survivor.

private function cleanUpCollection(Collection $sameTextGroup): void
{
    $survivor = $sameTextGroup->shift();
    $sameTextGroup->each(function ($theme) use ($survivor) {
        DB::transaction(function () use ($theme, $survivor) {
            Configuration::where('theme_id', $theme->id)->update(['theme_id', $survivor->id]);
            $theme->delete();
        });
    });
}

đź’ˇ Always take care of data consistency and use transactions for actions which can't live without each other

Prevent future littering

I hope this handy script did the job for you, but if there were duplicates in your application's database, it's good to know the reason of this state. It's time to rethink the application's logic, structure and user interface to make it more intuitive and promote clean patterns.

See also 🔗 Cleaning up Laravel application database - part 2 - images

Adam Kaczmar

About the author

Adam Kaczmar, Web Developer

I've been a professional full-stack web developer since 2015. My experience comes mainly from e-commerce and consists of:

  • developing highly customized e-commerce software,
  • automating catalog and order integrations with external warehouse services,
  • creating tailor-made user-friendly administration tools for client teams,
  • creating front-end React / Next.js applications along with headless Magento, Laravel and Sanity back-ends.

Besides my programming job, I'm a husband, a father of two lovely daughters and I train boxing every Monday afternoon. Movie genere of my choice is western.

Want to talk? 🙂 Reach me on LinkedIn

...or explore all blog posts ➡️