Introduction

This article is based on Steve Moore's experience with making a copy of the TCIA production database to use for testing. My goal was to have a duplicate of the production data on a test MySQL server to protect the production data during development (standard stuff). The database dump was 6 GB. My first attempt at a restore failed, so I performed some additional research. This article is based on Google searches plus trial and error; your mileage may vary.

Standard Backup Restore

The standard backup and restore procedure for MySQL would use these MySQL commands from the command line (Linux):

...

    mysql -u root -p[root_password] < dumpfile.sql
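The dump command itself is elided above; for completeness, a typical dump that produces such a file looks roughly like this (the database name is a placeholder, not taken from the original):

    # Sketch only: dump a single database to a file that can later be restored as above
    mysqldump -u root -p[root_password] [database_name] > dumpfile.sql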

Restore Issue on Large Database

The issue I ran into was that the restore operation took a very long time (seemingly more than 24 hours) and then simply stopped. There were no error messages that I could find in the output from the MySQL interpreter; it was as if the import just stopped working.

Two Part Solution

Split the Dump File

I was concerned that the MySQL interpreter was not able to handle the entire dump file in one operation. I have no direct evidence of this, but I thought I would break up the dump into separate commands and restore the database in pieces. The split command is a standard Linux command that is not related to MySQL:

    split --suffix-length=3 --numeric-suffixes --verbose --lines=250 dumpfile.mysql prefix.

...

  • The file names begin with "prefix."; you can choose any prefix. I used "DB." to remind me this is a database dump. The period is not required; it helped me distinguish between the prefix and generated suffix.
  • The suffix length is three characters; you can choose a different length.
  • I chose numeric suffixes. When you combine this with the suffix length of three, the files are named:
    • DB.000
    • DB.001
    • DB.002
    • ...
  • Each output file contains 250 lines of text. The last file will likely contain fewer than 250 lines.
    • This is not a magic number. I came to it by trial and error to get the individual files to be of reasonable length.
    • The output files are 250 lines each, and many of the lines are very long. The MySQL dump produces long insert statements; the split command reproduces those lines and does not add extra formatting. A quick sanity check of the output files is shown after this list.
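If it helps, the split output can be sanity-checked with standard shell tools (the DB. prefix below matches the example above):

    # List the generated pieces and confirm each holds at most 250 lines
    ls DB.* | head
    wc -l DB.000 DB.001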

The split command does not know anything about SQL statements. This procedure can split statements across files. Because we are going to process each file individually with the MySQL interpreter, this is a minor issue. The last SQL statement in one of the DB.xxx files may be incomplete and need text from the next file. I was lucky in that this only happened a few times in the files that contained commands to create tables. The MySQL dump created a single CREATE TABLE command that extended across several lines. The long insert statements were always complete on a single line and were not disturbed by the split operation.
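Because split cuts purely on line boundaries, one way to spot pieces whose final statement spills into the next file is to look for a last line that does not end with a semicolon. This check is my own addition, not part of the original procedure:

    # Flag split files whose last line does not end in ";" (the statement may continue in the next file)
    for f in DB.*; do
        tail -n 1 "$f" | grep -q ';[[:space:]]*$' || echo "check $f"
    done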

...

I now have a collection of individual SQL files that can be interpreted by the MySQL interpreter. I could have fed them into MySQL one by one, but I found another trick through Google. I wrote this SQL script and ran it through the MySQL interpreter:

    drop database nciaa;

...
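The rest of the script is elided above; a reconstruction of what such a driver script can look like is shown below. The database name nciaa comes from the drop statement above, but the create/use statements and the list of source lines are my own sketch, not the original file:

    -- Sketch of a driver script: recreate the database, then source each split file in order
    drop database nciaa;
    create database nciaa;
    use nciaa;
    source DB.000
    source DB.001
    source DB.002
    -- ... one "source" line per DB.xxx file, in numeric order

Such a script can be fed to the interpreter in the usual way (the script file name here is a placeholder):

    mysql -u root -p[root_password] < restore_script.sql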

When I ran this through the MySQL interpreter, it took approximately 100 minutes to process the entire script.

...

  • How long it takes to run the full dump through the MySQL interpreter using the standard procedure (I gave up after a failed attempt).
  • How long it takes to run the individual DB.xxx files through the MySQL interpreter using the standard "mysql .... < DB.000" procedure.

It would also be interesting to do some more reading in the MySQL documentation on the "source" statement to understand why this seemed to perform better. As mentioned above, this is all trial and error.