GoldenGate

GoldenGate Technology Introduction

Oracle GoldenGate It is a comprehensive software package for real-time data integration and replication between heterogeneous IT environments. The product set supports high availability solutions, real-time data integration, transactional change data capture, data replication, transformation, and validation between operational and analytical enterprise systems. Oracle GoldenGate 12c achieves extreme performance through simplified configuration and management, tighter integration with Oracle Database, support for cloud environments, extended heterogeneity, and enhanced security.

GoldenGate software provides a single platform that can achieve disaster backup in seconds for any enterprise environment. GoldenGate is a log-based structured data replication method that obtains data additions, deletions, and changes by parsing the source database online log or archive log (the amount of data is only about one-fourth of the log), and then applies these changes to the target database to achieve synchronization and dual-active between the source and target databases.

Product Positioning

Zero downtime database upgrades and migrations.

Meet users' needs for sub-second real-time data.

Sustainable data high availability and real-time business intelligence.

Real-time data synchronization across heterogeneous platforms and operating systems.

Non-intrusive to source and target systems.

Technical Services

Oracle GoldenGate is a comprehensive software package for real-time data integration and replication of data across heterogeneous IT environments. The product set supports high availability solutions, real-time data integration, transactional change data capture, data replication between operational and analytical enterprise systems, transformation, and validation. Oracle GoldenGate 12c delivers extreme performance through simplified configuration and management, tighter integration with Oracle Database, support for cloud environments, extended heterogeneity, and enhanced security.

The Extract Process is used to read the Online Redo Log or Archive Log on the source system side, and then parse it to extract only the changes in the data, such as addition, deletion, and modification operations, and convert the relevant information into a customized intermediate format of GoldenGate TDM and store it in a trail file. The transmission process is then used to transmit the trail file to the target system via TCP/IP. After each reading of the data changes in the log and after the data is transmitted to the target system, the Extract Process will write a checkpoint to record the current position of the captured log. The existence of the checkpoint allows the Extract Process to continue copying from the checkpoint position after aborting and resuming.

The target system accepts data changes and caches them in the GoldenGate TDM queue, which is a series of files that temporarily store data changes and wait for the delivery process to read the data.

The GoldenGate TDM delivery process (replicat process) reads data changes from the queue and creates corresponding SQL statements, which are executed through the local interface of the database. After being successfully submitted to the database, it updates its own checkpoint and records the location where the replication has been completed. The data replication process is finally completed.

It can be seen that GoldenGate TDM is a software-based data replication method that analyzes data changes from the database log (the amount of data is only about a quarter of the log). GoldenGate TDM converts data changes into its own format and transmits them directly through the TCP/IP network without relying on the database's own transmission method. It can also compress data with a compression ratio of up to 10:1, which can greatly reduce bandwidth requirements. On the target side, GoldenGate TDM can greatly speed up the speed and efficiency of data delivery through technical means such as transaction reorganization and batch loading, reduce the resource usage of the target system, and can achieve the replication of large amounts of data at the sub-second level, and the target database is active.

Maintenance experience

Daily maintenance

1. Configure scheduled deletion of expired queues

Used to automatically delete expired queues to save hard disk space. It is recommended to configure it in the Mgr process so that all queues can be managed centrally. Add the following line to the mgr parameters: purgeoldextracts //dirdat/*, usecheckpoint, minkeepdays 7. Among them, the first parameter is the queue location, and * can match all queue files in the backup center; the second parameter means that the checkpoint requirements must be met first, and unprocessed queues cannot be deleted; the third parameter indicates the minimum number of days to be retained, and the following number is the number of days. For example, if you want to retain the queue /ggs/dirdat/xm file for only 3 days, you can configure it as follows: purgeoldextracts /ggs/dirdat/xm, usecheckpoint, minkeepdays 3. Note: The Mgr process parameters need to be restarted before the Mgr process takes effect. Temporarily stopping the mgr process does not affect data replication.

2. Configure automatic scheduled restart process

It is used to automatically recover process terminations caused by temporary network interruptions, database or system maintenance, etc., to reduce manual workload. It is recommended to add the following line to the mgr parameter file in the Mgr process configuration: AUTORESTART ER *, RETRIES 3, WAITMINUTES 5, RESETMINUTES 60. The above parameters mean that all processes will be tried to be restarted every 5 minutes, for a total of three attempts. After that, it will be reset every 60 minutes, and then tried once every 5 minutes for a total of 3 attempts. Note: The Mgr process needs to be restarted for the parameters to take effect. You can query the ggserr.log file to view the restart attempt information.

3. Configure long transaction alarm

You can configure long transaction alarms in the extract process. The parameters are as follows: warnlongtrans 12h, checkintervals 10m. The above means that GoldenGate will check long transactions every 10 minutes. If there is a long transaction of more than 12 hours, GoldenGate will add an alarm message to ggserr.log in the root directory. You can view these alarm messages by checking ggserr.log or executing the view ggsevt command in ggsci. You can configure Director or custom scripts to send alarm emails.

4. Add steps to copy table

Stop the Extract/Data Pump/Replicat process. Note that when stopping Extract, check the long transaction and archive logs. Create a replication table on the source and target. Add additional logs to the table on the source side. Modify the replication range in the Extract/Data Pump/Replicat parameters to include the table. Restart the Extract/Data Pump/Replicat process and start operating the newly added table. Note that the above operations are limited to DML replication. If DDL replication is configured, additional logs can be automatically generated and the table structure can be created on the target side.

Common troubleshooting

General steps for troubleshooting: First, determine which type of GoldenGate process is faulty (is it the extraction, delivery, or replication process that has a problem). The general ideas for troubleshooting are as follows. (1) Use the GGSCI>view report command to find the ERROR word, determine the cause of the error, and eliminate it based on its information. (2) Use GGSCI>view ggsevt to view the alarm log information. (3) Check whether the databases at both ends are running normally and whether the network is connected. (4) Use the logdump tool to analyze the queue file.

1. The user does not exist

Problem description:2010-05-02 10:45:20 GGS ERROR 2001 Oracle GoldenGate Delivery for Oracle, rcrmheal.prm: Fatal error executing DDL replication: error [Error code [1918], ORA-01918: user 'KINGSTAR' does not exist, SQL /* GOLDENGATE_DDL_REPLICATION */ alter user kingstar account unlock ], no error handler present.

Problem analysis: According to the analysis log, it can be determined that the failure is caused by the non-existence of the user on the target end.

Problem solving: Method 1: If you do not need to synchronize the user, you can remove the mapping of the user on the target side and restart the process. For example, remove: MAP KINGSTAR.*, TARGET CRMKINGSTAR.*; Method 2: Manually create the user on the target side and restart the process.

2. The table does not exist

Problem description:2010-05-10 15:02:12 GGS ERROR 101 Oracle GoldenGate Delivery for Oracle, rcrmheal.prm: Table CRMOLAP.TB_FT_OFSTK_CLIENT_BY_DAY does not exist in target database.

Problem analysis: According to the analysis log, it can be determined that the failure is caused by the table not existing on the target end.

Problem handling: Method 1: If you do not need to synchronize the table, you can exclude the table on the target side and restart the process. For example, add: MAPEXCLUDE OLAP.TB_FT_OFSTK_CLIENT_BY_DAY Method 2: Manually create the table on the target side. For heterogeneous databases, you also need to regenerate the table structure definition file and restart the process.

3. Insufficient table space

Problem description:2010-02-01 17:19:18 GGS ERROR 103 Discard file (./dirrpt/rep1.dsc) exceeded max bytes (10000000).

Problem Analysis: According to the error, we can see that the direct reason for the GoldenGate process to stop is that the discard file is full. What caused the discard file to be full? From the discard file, we can see that the ORA-01653: unable to extend error occurred. Seeing this, I believe everyone knows how to deal with it. We just need to expand the size of the tablespace where the aaa.TB_LVY_TEMPINVOIC object is located.

Problem solving: 1. Find the tablespace where the relevant objects are stored; for example:select owner,table_name,tablespace_name from dba_tables 2. Execute tablespace expansion. For example:ALTER TABLESPACE tbs_03 ADD DATAFILE 'tbs_f04.dbf' SIZE 100K AUTOEXTEND ON NEXT 10K MAXSIZE 100K。

4. Insufficient disk space

Problem description:2010-05-07 04:05:31 GGS ERROR 103 Oracle GoldenGate Collector: Unable to write to file "./dirdat/crm/fl003629" (error 28, No space left on device). 2010-05-07 04:05:31 GGS ERROR 190 PROCESS ABENDING.

Problem analysis: Based on the analysis log, it can be determined that the failure is caused by insufficient disk space.

Troubleshooting: Allocate enough disk space and restart the process.