Home

Partitioning MySQL and migrating data

Important

Applicable only to Unravel v4.3.x or before.

MySQL isn't bundled with Unravel and you must manually migrate Unravel's tables for its use. The time for the data migration varies by machine and MySQL configuration.

Unravel uses the following partitioned tables:

  • BLACKBOARDS

  • EVENT_INSTANCES

  • HIVE_QUERIES

  • IMPALA_QUERIES

  • JOBS

  • OOZIE_WORKFLOW_JOBS

  1. Upgrade Unravel Server.

  2. Stop all daemons.

    /etc/init.d/unravel_all.sh stop
  3. Make sure MySQL is running.

  4. Perform the data migration.

    Note

    For external MySQL servers you might need to change the chunk size to 100 or 1000.

    /usr/local/unravel/dbin/db_schema_upgrade.sh migration:migration_partitioning -verbose

    During migration process the status of the tables is written to stdout. The output is continually updated until the migration for the table is complete.

    Start migration: migration_partitioning
    Start table rows calculation...
    Table blackboards_old has 0 rows total
    Table impala_queries_old has 0 rows total
    Table jobs_old has 205437 rows total
    Table event_instances_old has 0 rows total
    Table oozie_workflow_jobs_old has 0 rows total
    Table hive_queries_old has 0 rows total
    Table rows calculation finished.
    Migrating records for table: blackboards, chunk size: 10000, chunk count: 10...
    Migrated records for table: blackboards, rows total: 0, chunk size: 10000, chunk count: 10, finished: true, time: 0 seconds, progress 100,00 %
    Migrating records for table: impala_queries, chunk size: 1500, chunk count: 10...
    Migrated records for table: impala_queries, rows total: 0, chunk size: 1500, chunk count: 10, finished: true, time: 0 seconds, progress 100,00 %
    Migrating records for table: jobs, chunk size: 500, chunk count: 10...
    Migrated records for table: jobs, rows total: 5000, chunk size: 500, chunk count: 10, finished: false, time: 32 seconds, progress 2,43 %
    Migrating records for table: event_instances, chunk size: 50000, chunk count: 10...
    Migrated records for table: event_instances, rows total: 0, chunk size: 50000, chunk count: 10, finished: true, time: 0 seconds, progress 100,00 %
    Migrating records for table: oozie_workflow_jobs, chunk size: 1500, chunk count: 10...
    Migrated records for table: oozie_workflow_jobs, rows total: 0, chunk size: 1500, chunk count: 10, finished: true, time: 0 seconds, progress 100,00 %
    Migrating records for table: hive_queries, chunk size: 1500, chunk count: 10...
    Migrated records for table: hive_queries, rows total: 0, chunk size: 1500, chunk count: 10, finished: true, time: 0 seconds, progress 100,00 %
    Migrating records for table: jobs, chunk size: 500, chunk count: 10...
    Migrated records for table: jobs, rows total: 10000, chunk size: 500, chunk count: 10, finished: false, time: 33 seconds, progress 4,87 %
    Migrating records for table: jobs, chunk size: 500, chunk count: 10...

    When all the tables have been successfully migrated you see:

    Migration: migration_partitioning is finished

    During the process a temporary migration_partitioning DB table keeps track of the tables' status. You can view this table using the following MySQL query.

    Note

    The chunks field increases until the migration for the table is complete. Upon completion, the table's finished column is set to 1.

    /usr/local/unravel/install_bin/db_access.sh
    
    mysql>  select * from migration_partitioning;
    +---------------------+--------------------+--------+----------+
    | migr_table_name     | lowest_id_migrated | chunks | finished |
    +---------------------+--------------------+--------+----------+
    | blackboards         |           11018258 |    119 |        0 |
    | event_instances     |                  0 |      1 |        1 |
    | hive_queries        |                  0 |      1 |        1 |
    | impala_queries      |                  0 |      1 |        1 |
    | jobs                |             240884 |    100 |        0 |
    | oozie_workflow_jobs |                  0 |      1 |        1 |
    +---------------------+--------------------+--------+----------+
    
    

    Note

    To see the oldest migrated records per each table you can use the following MySQL query.

    /usr/local/unravel/install_bin/db_access.sh
    
    mysql>
    select "blackboards" as "table_name", min(created_at) as "oldest_migrated_record" from blackboards
    union select "event_instances", min(created_at) from event_instances
    union select "hive_queries", min(created_at) from hive_queries
    union select "impala_queries", min(created_at) from impala_queries
    union select "jobs", min(created_at) from jobs
    union select "oozie_workflow_jobs", min(created_at) from oozie_workflow_jobs;
    +---------------------+------------------------+
    | table_name          | oldest_migrated_record |
    +---------------------+------------------------+
    | blackboards         | 2018-08-30 00:57:45    |
    | event_instances     | 2018-08-30 00:57:55    |
    | hive_queries        | 2018-08-30 00:59:10    |
    | impala_queries      | NULL                   |
    | jobs                | 2018-08-30 00:57:50    |
    | oozie_workflow_jobs | 2018-09-07 10:25:22    |
    +---------------------+------------------------+
  5. If the migration process is interrupted or killed, you can run shell script again. However, if the process has failed you must truncate and reset ID's before running the script again.

  6. During migration the original files were renamed to TableName_old. Upon completion, delete these and the migration_partitioning table.

    /usr/local/unravel/install_bin/db_access.sh
    
    DROP TABLE blackboards_old;
    DROP TABLE event_instances_old;
    DROP TABLE hive_queries_old;
    DROP TABLE impala_queries_old;
    DROP TABLE jobs_old;
    DROP TABLE oozie_workflow_jobs_old;
    DROP TABLE migration_partitioning;
    quit
  7. Restart daemons.

    /etc/init.d/unravel_all.sh start
Restarting a failed migration
  1. Truncate partitioned tables, reset autoincrement ID values and truncate migration_partitioning table.

    /etc/init.d/unravel_all.sh stop
    
    /usr/local/unravel/install_bin/db_access.sh
    
    truncate blackboards;
    truncate event_instances;
    truncate hive_queries;
    truncate impala_queries;
    truncate jobs;
    truncate oozie_workflow_jobs;
    
    truncate migration_partitioning;
    
    call resetAutoIncrementId('blackboards_old', 'blackboards');
    call resetAutoIncrementId('event_instances_old', 'event_instances');
    call resetAutoIncrementId('hive_queries_old', 'hive_queries');
    call resetAutoIncrementId('impala_queries_old', 'impala_queries');
    call resetAutoIncrementId('jobs_old', 'jobs');
    call resetAutoIncrementId('oozie_workflow_jobs_old', 'oozie_workflow_jobs');
  2. Return to Step 3 and complete the remaining steps.