Data migration, the process of transferring data from its original source to a new destination (whether a new storage location or a different file format), may sound easy in concept but is often difficult in practice. The more data, file types, and production applications involved, the more complex the process becomes.
The “simplest” database migration scenarios, so to speak, are (1) between different versions of the same database with the same schema and (2) between different versions of the same database (or from one database to another) with a different schema. “Simple” though these may be, certain steps must still be followed so that the migration is a benefit rather than an inconvenience.
Migrating a single database usually follows four basic steps: schema build, initial data load, change data capture, and data authentication and reporting. These are in addition to the necessary preparations and activities before, during, and right after the procedure, such as data quality assessment, shutdown plans, final testing, and post-migration reports.
Step 1: Schema Build
The first step of any migration project is a careful review of the source systems in order to identify the data that will populate the target model, remove any duplicates and inapplicable files, and ensure that there will be no data gaps. Apart from the data itself, you also need to identify all the code necessary for the transfer.
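As a rough illustration of that review, the sketch below flags duplicate keys and records with missing required values before anything is moved. It assumes the source records have already been read into Python dictionaries; the function name `profile_rows` and the field names in the usage note are hypothetical and not part of any migration tool.

```python
from collections import Counter


def profile_rows(rows, key_field, required_fields):
    """Return (duplicate_keys, incomplete_rows) for a list of dict records.

    Hypothetical helper for illustration only.
    """
    # Count how often each key value appears; more than once means a duplicate.
    key_counts = Counter(row[key_field] for row in rows)
    duplicate_keys = sorted(key for key, n in key_counts.items() if n > 1)

    # A record has a "gap" if any required field is absent, None, or empty.
    incomplete_rows = [
        row
        for row in rows
        if any(row.get(field) in (None, "") for field in required_fields)
    ]
    return duplicate_keys, incomplete_rows
```

Calling `profile_rows(records, "id", ["email"])` on a list of customer records, for example, would return the ids that occur more than once and the records whose email is missing, which can then be cleaned up before the schema is built.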
If you’re using a local or proprietary application, you may have to migrate the schema manually. In addition, all code required for the correct operation of the applications must be reviewed, especially when changing database products (e.g. from SQL Server to Oracle).
On the other hand, if you’re using a third-party application, then the supplier should be able to provide all the necessary tools for the migration, including the schema definition and optimized code.
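For the manual case, here is a minimal sketch of what a schema build can look like. It assumes both the source and the target are SQLite databases, purely so that the example runs with Python's standard library; the function name `copy_schema` is hypothetical. When the database product changes, the table definitions would usually need to be translated rather than replayed as-is.

```python
import sqlite3


def copy_schema(source: sqlite3.Connection, target: sqlite3.Connection) -> list[str]:
    """Recreate the source's tables in the target and return their names.

    Assumes SQLite on both sides (an assumption made for this sketch).
    """
    # sqlite_master stores the original CREATE statement of every table.
    rows = source.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()

    created = []
    for name, ddl in rows:
        # Skip SQLite's internal bookkeeping tables such as sqlite_sequence.
        if name.startswith("sqlite_"):
            continue
        target.execute(ddl)
        created.append(name)
    target.commit()
    return created
```

Only table definitions are copied here. Indexes, views, and triggers are stored in the same catalog and would need the same treatment in a real migration.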
Step 2: Initial Data Load
Initial data load simply means bringing your data into the new data warehouse for the first time. This is a time-consuming and memory-intensive task, so you should take steps to manage how much data is loaded at once. You can do this by setting properties such as filters and load intervals, which improve the performance of the ETL (extract, transform, load) process.
Note that a staging database, where you modify and “cleanse” the data, is recommended for multischema environments before pushing the data into production.
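To make the idea of filters and load intervals concrete, the sketch below copies a table in fixed-size batches and applies an optional row filter. It again assumes SQLite on both sides and that the target table already exists with the same columns; `initial_load`, `batch_size`, and `row_filter` are hypothetical names chosen for this example and do not correspond to settings in any particular ETL product.

```python
import sqlite3


def initial_load(source, target, table, batch_size=1000, row_filter="1 = 1"):
    """Copy rows from source to target in batches; return the number copied.

    `table` and `row_filter` are inserted into the SQL text directly,
    so they must come from trusted configuration, never from user input.
    """
    cursor = source.execute(f"SELECT * FROM {table} WHERE {row_filter}")
    # Build one "?" placeholder per column in the result set.
    placeholders = ", ".join("?" * len(cursor.description))
    insert_sql = f"INSERT INTO {table} VALUES ({placeholders})"

    copied = 0
    while True:
        # Only batch_size rows are fetched at a time to limit memory use.
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        target.executemany(insert_sql, batch)
        target.commit()  # commit per batch so progress is kept
        copied += len(batch)
    return copied
```

The same function could point at a staging database first, with the cleansing done there before a second load into production.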
Step 3: Change Data Capture
Change data capture is necessary when you are migrating a vast amount of data and the system cannot afford to be offline for long periods (an initial data load may take days to complete). With the data transferred in increments, the most important changes and updated information are made available sooner, and the business’s usual operations are not disrupted.
Change data capture is also an important tool for reducing instability during application migration, because initial tests can be performed immediately on the updated data loaded into the target model to check that it works as intended. After the migration is completed, the source database can also be kept running and updated during an acclimatization period for the new database.
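One simple way to move data in increments is to poll the source for rows changed since the last run. The sketch below does this with a timestamp column and is only one of several approaches (many tools read the database's transaction log instead). It assumes a hypothetical `customers` table with `id`, `name`, and `updated_at` columns on both sides, and it does not detect deleted rows.

```python
import sqlite3


def sync_changes(source, target, last_sync):
    """Copy rows changed after `last_sync`; return the new high-water mark.

    Assumes a hypothetical customers(id, name, updated_at) table on both
    sides, with `id` as the primary key. Deletes are not captured.
    """
    rows = source.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_sync,),
    ).fetchall()

    # INSERT OR REPLACE overwrites the target row that has the same id.
    target.executemany(
        "INSERT OR REPLACE INTO customers (id, name, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()

    # The latest timestamp seen becomes the starting point of the next run.
    return rows[-1][2] if rows else last_sync
```

Running this on a schedule, and storing the returned value between runs, keeps the target close to the source while the source stays online.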
Step 4: Data Authentication and Reporting
Authenticating the migrated data may be as important as, if not more important than, the migration process itself. These authentication tests may be conducted internally through predetermined checklists, or through end-user-driven trials that confirm the ecosystem works properly and is easy to use. The key is to make sure that both data owners and data users can evaluate the output for accuracy.
Once the entire process is done, create a report to serve as a reference for future data migrations; this helps keep the process consistent. Reports are also valuable for spotting errors in the process and identifying opportunities for improvement in the next migration.
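An automated item on such a checklist might compare row counts and a checksum of each table on both sides. The sketch below is one possible way to do that, again assuming SQLite and a table that stores values identically in source and target; `table_fingerprint` and `tables_match` are hypothetical helper names.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key):
    """Return (row_count, sha256 hex digest) for a table, ordered by `key`.

    `table` and `key` are placed into the SQL text directly, so they
    must come from trusted configuration.
    """
    digest = hashlib.sha256()
    count = 0
    # A fixed ordering makes the digest comparable between databases.
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key}"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def tables_match(source, target, table, key):
    """True when both sides have the same row count and the same content."""
    return table_fingerprint(source, table, key) == table_fingerprint(
        target, table, key
    )
```

A mismatch only says that something differs, not what. A row-by-row comparison or the end-user trials described above would still be needed to locate the problem.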
It may be a challenging task, but data migration can be made simple and easily comprehensible by breaking it down into basic steps. Through these, businesses can also shift the way they view data migration: it is no longer solely the IT department’s responsibility, but rather a way to understand the business better and more holistically through data delivery.