Migration Oracle → PostgreSQL
Create a new default schema in PostgreSQL, customize it and pool the data directly from Oracle without intermediate export files by using oracle_fdw (foreign data wrapper).
Current state of this solution
Works as expected, but very slow because one of the biggest tables asset_feature have a CLOB column and OCI library doesn't allow PostgreSQL to prefetch the rows. For example, a pull of a 40 Million rows table takes 3 hours. There are different workarounds that are described in our internal PostgreSQL mailing list.
Install PostgreSQL
For more information, see PostgreSQL installation.
Setup and test Oracle Instant Client
oracle_fdw is a PostgreSQL extention that need Oracle libraries. There are two optoins for you to take these libraries, download them form Oracle Website or get them from our FileServer. There is a third option to use the libraries from the Oracle DB itself ($ORACLE_HOME), but we shall not describe it here.
Option 1: Download directly from Oracle
Select your OS version here. Then download and unzip the following files in ~postgres/ (the files shall create it's own subdirectory). Don't unzip them on Mac, because you shall get different directories. All files from every zip should be in the same directory.
instantclient-basic-linux.x64-12.1.0.2.0.zip - Instant Client Package - Basic
instantclient-sqlplus-linux.x64-12.1.0.2.0.zip - Instant Client Package - SQL*Plus
instantclient-sdk-linux.x64-12.1.0.2.0.zip - Instant Client Package - SDK
Option 2: Copy the package from our internal FileServer
Execute the following command:
rsync -avx /Volumes/Files/censhare/censhare-3rdParty/sqlplus/instantclient_12_1_linux/ postgres@PG-host:instantclient_12_1/
Setup and test connectivity to Oracle
Modify and add the following variables in ~postgres/.pgsql_profile and ~postgres/.bash_profile:
export ORACLE_SID=corpus
export ORACLE_HOME=/var/lib/pgsql/instantclient_12_1
export PATH=/usr/pgsql-9.6/bin:$ORACLE_HOME:$PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME
Test connectivity to Oracle:
postgres@pg-host $ sqlplus system/password@oracle-host:1521/corpus
# if get an error mesage for a missing library, install it like this:
# yum install libaio.x86_64
Modify environment inside the PostgreSQL startup script:
# as root, search for PGDATA in that file and add the Oracle variables under the [Service] section of the file.
vi /usr/lib/systemd/system/postgresql-9.6.service
Environment=ORACLE_SID=corpus
Environment=ORACLE_HOME=/var/lib/pgsql/instantclient_12_1
Environment=LD_LIBRARY_PATH=/var/lib/pgsql/instantclient_12_1
systemctl daemon-reload
systemctl restart postgresql-9.6
Stop censhare process and prepare the Oracle database
sqlplus corpus/passwrod
-- truncate table messages;
-- consider this optional step
create table seqtmp(seqname varchar(100), lastnum integer, incr integer);
insert into seqtmp select sequence_name,last_number,increment_by from user_sequences; commit;
-- optional: truncate table corpus.message;
Setup oracle_fdw and connect PostgreSQL to Oracle
Follow the basic installation instructions.
# as root, install needed libraries
yum install postgresql96-devel.x86_64
chown -R postgres:postgres /usr/pgsql*
# as postgres make soft link first
cd /var/lib/pgsql/instantclient_12_1
ln -s libclntsh.so.12.1 libclntsh.so
wget https://github.com/laurenz/oracle_fdw/archive/ORACLE_FDW_2_1_0.tar.gz
tar -xvzf ORACLE_FDW_2_1_0.tar.gz && rm ORACLE_FDW_2_0_0.tar.gz
cd oracle_fdw-ORACLE_FDW_2_1_0
make
make install
Sometimes, the follwing error occurs during the make install command:
-bash-4.2$ make install
/usr/bin/mkdir -p '/usr/pgsql-9.6/lib'
/usr/bin/mkdir -p '/usr/pgsql-9.6/share/extension'
/usr/bin/mkdir -p '/usr/pgsql-9.6/share/extension'
/usr/bin/mkdir -p '/usr/pgsql-9.6/doc/extension'
/usr/bin/install -c -m 755 oracle_fdw.so '/usr/pgsql9.6/lib/oracle_fdw.so'
/usr/bin/install: cannot create regular file ‘/usr/pgsql-9.6/lib/oracle_fdw.so’: Permission deniedmake: *** [install-lib-shared] Error 1
To resolve the error, do the following:
-bash-4.2$ pg_config --pgxs /usr/pgsql-9.6/lib/pgxs/src/makefiles/pgxs.mk
Make sure that your pg_config path is in $PATH. To do this, use $PATH with the make install command, or change the permission of the pgsql-9.6 folder from "root" user to "postgres:postgres":
[root@censhare-db usr]# chown -R postgres:postgres /usr/pgsql*
Now the make install command should be run smoothly and show the following result:
Switch to "postgres user"
-bash-4.2$ cd oracle_fdw-ORACLE_FDW_2_0_0/
-bash-4.2$ make install /usr/bin/mkdir -p '/usr/pgsql-10/lib'
/usr/bin/mkdir -p '/usr/pgsql-10/share/extension'
/usr/bin/mkdir -p '/usr/pgsql-10/share/extension'
/usr/bin/mkdir -p '/usr/pgsql-10/doc/extension'
/usr/bin/install -c -m 755 oracle_fdw.so '/usr/pgsql-10/lib/oracle_fdw.so'
/usr/bin/install -c -m 644 .//oracle_fdw.control '/usr/pgsql-10/share/extension/'
/usr/bin/install -c -m 644 .//oracle_fdw--1.1.sql .//oracle_fdw--1.0--1.1.sql '/usr/pgsql-10/share/extension/'
/usr/bin/install -c -m 644 .//README.oracle_fdw '/usr/pgsql-10/doc/extension/'
-bash-4.2$
The following command connects the PostgreSQL daatbase to the Oracle database, and imports the foreign schema definitions:
psql -d postgres
\i postgresql-create-user.sql
-- this script can be found under css/database/ on the app server and usualy have these two lines + few type cast definitinos
-- CREATE USER corpus PASSWORD 'corpus';
-- CREATE DATABASE corpus OWNER corpus TEMPLATE template0 ENCODING 'UTF8';
\c corpus postgres
CREATE EXTENSION oracle_fdw;
CREATE SERVER oradb FOREIGN DATA WRAPPER oracle_fdw OPTIONS (dbserver '//IP.OF.ORACLE.SERVER:1521/corpus.censhare');
GRANT USAGE ON FOREIGN SERVER oradb TO corpus;
\c corpus corpus
CREATE USER MAPPING FOR corpus SERVER oradb OPTIONS (user 'corpus', password 'password');
CREATE SCHEMA corpus;
IMPORT FOREIGN SCHEMA "CORPUS" FROM SERVER oradb INTO CORPUS;
-- in case that you would like to DROP and recreate it:
-- psql -d postgres
-- drop database corpus;
-- drop user corpus;
Create the new default PostgreSQL schema
In our software release, PostgreSQL and Oracle have different schemas (data types, constraints, index options, stored procedures etc.). After the data migration, PostgreSQL should have the same schema as our release plus customer specific columns (language, db-schema.patch etc.). For that reason, we create here an empty default PostgreSQL schema taken from the release for that specific censhare server version.
# First copy the schema definition file from this specific censhare server version.
you@yourMac$ scp corpus@cs-srver:css/database/postgresql-create-db.sql /tmp/
you@yourMac$ scp /tmp/postgresql-create-db.sql postgres@pg-host:
## Then separate the TABLES structure from everything else
perl -ne 'print if /^CREATE TABLE/../^\);/' postgresql-create-db.sql > postgresql-create-db-TABLES.sql
perl -ne 'print if not /^CREATE TABLE/../^\);/' postgresql-create-db.sql > postgresql-create-db-REST.sql
## Finally import only the TABLES
postgres@pg-host $ psql -d corpus -U corpus
set search_path='public';
\i postgresql-create-db-TABLES.sql
Check Oracle for custom TABLES and create them in PostgreSQL
SQL> select view_name from user_views;
Now check for tables in Oracle that are not defined in db-schema.patch. You can create them like this:
psql -d corpus -U corpus
set search_path='public';
select 'create table '|| table_name ||' (like corpus.'||table_name||');' from (
SELECT table_name FROM information_schema.tables WHERE table_schema='corpus' and table_name not in ('plan_table', 'plan_view','timetrackerinterface')
EXCEPT
SELECT table_name FROM information_schema.tables WHERE table_schema='public'
) as t1;
... copy and paste the output with 10-20 lines at a time.
Create custom language COLUMNS in PostgreSQL
This step checks for columns in Oracle that don't exist in PostgreSQL. We usually see here the custom language columns, but it could be everything, so executing postgresql-add-missing-languages.sql is not a solution:
psql -d corpus -U corpus
set search_path='public';
create temporary table columnstmp as (
SELECT table_name, column_name FROM information_schema.columns WHERE table_schema='corpus' and table_name not in ('plan_table', 'plan_view', 'timetrackerinterface')
EXCEPT
SELECT table_name, column_name FROM information_schema.columns WHERE table_schema='public' order by 1,2
);
DO $$
DECLARE
c record;
crow record;
ctype varchar(50);
BEGIN
FOR c IN SELECT * from columnstmp
LOOP
SELECT *
INTO crow
FROM information_schema.columns WHERE table_schema='corpus' and table_name=c.table_name and column_name=c.column_name;
IF crow.data_type = 'character varying' THEN ctype := crow.data_type || '(' || crow.character_maximum_length || ') ';
ELSE ctype := crow.data_type;
END IF;
-- RAISE NOTICE '% % %', c.table_name, c.column_name, ctype ;
EXECUTE 'ALTER TABLE ' || c.table_name || ' ADD COLUMN "' || c.column_name || '" ' || ctype;
END LOOP;
END$$;
Here is a small exception for versions <5.7 because current_timestamp is a reserved word in postgresql:
# only for <= v5.7
set search_path='public';
alter table stored_transformation drop column current_timestamp;
Final check to see if there is any difference in the schemas:
SELECT table_name, column_name FROM information_schema.columns WHERE table_schema='public' and column_name not in ('rowid')
EXCEPT
SELECT table_name, column_name FROM information_schema.columns WHERE table_schema='corpus' order by 1,2;
SELECT table_name, column_name FROM information_schema.columns WHERE table_schema='corpus' and table_name not in ('plan_table','plan_view', 'timetrackerinterface')
EXCEPT
SELECT table_name, column_name FROM information_schema.columns WHERE table_schema='public' order by 1,2;
Let PostgreSQL pool the data from Oracle
For big databasese, you should extend the undo retention in Oracle, otherwise your import shall fail with error message ORA-01555:
oracle@oracle-host $ sqlplus / as sysdba
SQL> alter system set undo_retention=30000;
Changing the prefetch size from default 200 to the maximum of 10240 will reduce the import time. Please note that some of the biggest tables, like asset_feature, have CLOB columns for which OCI doesn't allow prefetching at all.
DO $$
DECLARE
c record;
BEGIN
FOR c IN SELECT table_name FROM information_schema.tables WHERE table_schema='public'
LOOP
EXECUTE 'alter foreign table corpus.' || c.table_name || ' OPTIONS (prefetch ''10240'');';
END LOOP;
END$$;
Shutdown censhare before executing that step and start the process in a screen session, because it's very very slow:
psql -d corpus -U corpus
CREATE OR REPLACE FUNCTION get_columns(varchar) RETURNS varchar AS $$
select string_agg('"'||column_name||'"',',') FROM information_schema.columns
WHERE table_name = $1 and table_schema='public' AND column_name NOT IN('rowid');
$$ LANGUAGE SQL;
set search_path='public','corpus';
create temp table columns_list as
select table_name, get_columns(table_name) FROM information_schema.tables WHERE table_schema='public';
DO $$
DECLARE
c record;
BEGIN
FOR c IN SELECT * FROM columns_list order by table_name desc
LOOP
RAISE NOTICE 'BEGIN processing table %', c.table_name;
EXECUTE 'insert into '|| c.table_name ||' (' || c.get_columns || ') select '|| c.get_columns || ' FROM corpus.'|| c.table_name;
EXECUTE 'analyze '|| c.table_name; RAISE NOTICE 'END processing table %', c.table_name;
END LOOP;
END$$;
Here is a workaround for the case when Oracle have null characters (U+0000) in some of it's text data rows. You execute it in a separate terminal session and then repeat the DO block above.
# If PostgreSQL returns this error for some column ERROR:
# invalid byte sequence for encoding "UTF8": 0x00 try this with sqlplus:
SQL> SELECT string2 FROM asset_info WHERE instr(string2, unistr('\0000')) > 0;
SQL> UPDATE asset_info SET string2 = replace(string2, unistr('\0000')) WHERE instr(string2, unistr('\0000')) > 0;
SQL> SELECT title FROM message WHERE instr(title, unistr('\0000')) > 0;
SQL> UPDATE message SET title = replace(title, unistr('\0000')) WHERE instr(title, unistr('\0000')) > 0;
SQL> COMMIT;
Create default constraints, indexes etc. in PostgreSQL
Few steps above, we created only the default tables needed to import the data. Now we have to create all other objects in PostgreSQL:
postgres@pg-host $ psql -d corpus -U corpus
set search_path='public';
\i postgresql-create-db-REST.sql
Set the sequences values as they are in Oracle:
set search_path='public';
DO $$
DECLARE
c record;
BEGIN
-- create custom sequences
FOR c IN SELECT * FROM corpus.seqtmp where seqname not in (SELECT upper(relname) FROM pg_class
WHERE relkind = 'S')
LOOP
EXECUTE 'CREATE SEQUENCE ' || c.seqname || ' START ' || c.lastnum || ' INCREMENT ' || c.incr;
END LOOP;
-- set the values
FOR c IN SELECT * from corpus.seqtmp
LOOP
EXECUTE 'SELECT setval('''|| c.seqname ||''',' || c.lastnum || ')';
END LOOP;
END$$;
Revert back important changes that you made
Revert back the undo_retention in Oracle
oracle@oracle-host $ sqlplus / as sysdba
SQL> alter system set undo_retention=3600;
Drop temporary table "seqtmp" which we have created in Oracle. Connect to sqlplus & drop it.
drop table corpus.seqtmp;
Rename back this column in PostgreSQL and drop the oracle mapping.
# only for versions <=5.7
postgres@postgresql-host $ psql -d corpus -U corpus
alter table stored_transformation add column curr_timestamp TIMESTAMP(3);
drop function get_columns(varchar);
# Additional command, if we have to rename the column "current_timestamp"
alter table stored_transformation rename column "current_timestamp" to curr_timestamp;
psql -d corpus -U corpus
DROP USER MAPPING FOR corpus SERVER oradb;
\c corpus postgres
DROP SERVER oradb cascade;
DROP schema corpus;