Building Change Data Capture Solutions with PL/SQL

 Introduction

Modern data-driven systems rarely operate on static datasets. Applications need to react to changes—whether for analytics, integrations, auditing, caching, or real-time processing. This requirement is addressed through Change Data Capture (CDC), a pattern that identifies and propagates changes in a database in an efficient and reliable way.

Oracle Database provides several built-in CDC capabilities such as materialized view logs, Oracle GoldenGate, and Flashback technologies. However, in many real-world scenarios, developers still build custom CDC solutions using PL/SQL to achieve fine-grained control, simplicity, or cost-effective implementation.

This article explores how to design and implement a robust CDC framework using PL/SQL, including architecture patterns, implementation techniques, performance considerations, and best practices.


What is Change Data Capture?

Change Data Capture is the process of identifying and capturing:

  • INSERT operations (new rows)

  • UPDATE operations (modified rows)

  • DELETE operations (removed rows)

These changes are then made available to downstream systems such as:

  • Data warehouses

  • ETL pipelines

  • Reporting systems

  • Microservices

  • Audit systems

Instead of repeatedly scanning entire tables, CDC ensures only incremental changes are processed.


Why Build CDC with PL/SQL?

Although Oracle provides advanced CDC tools, PL/SQL-based CDC is still widely used because:

  • It is simple and fully database-native

  • No additional licensing may be required

  • It provides full control over capture logic

  • It works in any Oracle version with triggers

  • It can be customized per business requirement

However, it also requires careful design to avoid performance overhead.


CDC Architecture in PL/SQL

A typical PL/SQL CDC solution includes:

Source Table
     │
     ▼
DML Triggers (INSERT/UPDATE/DELETE)
     │
     ▼
Change Log Table
     │
     ▼
CDC Processing Layer (PL/SQL / ETL / APIs)
     │
     ▼
Downstream Systems

The key idea is to capture row-level changes at the time they occur and store them in a centralized change log.


Designing a Change Log Table

The change log is the backbone of a CDC system.

CREATE TABLE cdc_change_log (
   change_id      NUMBER GENERATED ALWAYS AS IDENTITY,
   table_name     VARCHAR2(128),
   operation      VARCHAR2(10),
   primary_key    VARCHAR2(4000),
   column_name    VARCHAR2(128),
   old_value      VARCHAR2(4000),
   new_value      VARCHAR2(4000),
   changed_by     VARCHAR2(100),
   change_time    DATE,
   tx_id          VARCHAR2(50)
);

Key Design Considerations

  • Store both old and new values for full traceability

  • Capture transaction identifiers for grouping changes

  • Keep data generic for reuse across tables

  • Index frequently queried columns like change_time


Building the CDC Package

A reusable PL/SQL package centralizes CDC logic.

CREATE OR REPLACE PACKAGE cdc_pkg AS

   PROCEDURE log_change (
      p_table_name  VARCHAR2,
      p_operation   VARCHAR2,
      p_pk          VARCHAR2,
      p_column      VARCHAR2,
      p_old_value   VARCHAR2,
      p_new_value   VARCHAR2
   );

END cdc_pkg;
/

Package Body

CREATE OR REPLACE PACKAGE BODY cdc_pkg AS

   PROCEDURE log_change (
      p_table_name  VARCHAR2,
      p_operation   VARCHAR2,
      p_pk          VARCHAR2,
      p_column      VARCHAR2,
      p_old_value   VARCHAR2,
      p_new_value   VARCHAR2
   ) IS
      PRAGMA AUTONOMOUS_TRANSACTION;
   BEGIN

      INSERT INTO cdc_change_log (
         table_name,
         operation,
         primary_key,
         column_name,
         old_value,
         new_value,
         changed_by,
         change_time,
         tx_id
      )
      VALUES (
         p_table_name,
         p_operation,
         p_pk,
         p_column,
         p_old_value,
         p_new_value,
         SYS_CONTEXT('USERENV','SESSION_USER'),
         SYSDATE,
         DBMS_TRANSACTION.LOCAL_TRANSACTION_ID
      );

      COMMIT;

   END;

END cdc_pkg;
/

Capturing INSERT Operations

CREATE OR REPLACE TRIGGER trg_emp_cdc_ins
AFTER INSERT ON employees
FOR EACH ROW
BEGIN

   cdc_pkg.log_change(
      p_table_name => 'EMPLOYEES',
      p_operation  => 'INSERT',
      p_pk         => :NEW.employee_id,
      p_column     => 'ROW',
      p_old_value  => NULL,
      p_new_value  => 'NEW RECORD'
   );

END;
/

This records newly created rows.


Capturing UPDATE Operations

A more detailed CDC implementation tracks column-level changes.

IF NVL(:OLD.salary,0) <> NVL(:NEW.salary,0) THEN
   cdc_pkg.log_change(
      'EMPLOYEES',
      'UPDATE',
      :NEW.employee_id,
      'SALARY',
      :OLD.salary,
      :NEW.salary
   );
END IF;

This approach ensures only actual changes are recorded.


Capturing DELETE Operations

CREATE OR REPLACE TRIGGER trg_emp_cdc_del
AFTER DELETE ON employees
FOR EACH ROW
BEGIN

   cdc_pkg.log_change(
      'EMPLOYEES',
      'DELETE',
      :OLD.employee_id,
      'ROW',
      'EXISTS',
      'DELETED'
   );

END;
/

Deletes are critical for downstream synchronization systems.


Transaction-Aware CDC

One challenge in CDC design is grouping changes by transaction.

Using:

DBMS_TRANSACTION.LOCAL_TRANSACTION_ID

allows multiple row changes to be associated with a single business transaction.

This is useful for:

  • Rebuilding consistent state

  • Event-driven pipelines

  • Data replication


CDC Processing Layer

Once changes are captured, downstream processing can occur via:

Batch Processing

SELECT *
FROM cdc_change_log
WHERE change_time > SYSDATE - 1;

Incremental ETL

Extract only new changes since last checkpoint.

Real-Time Consumers

APIs or jobs can poll or subscribe to changes.


Performance Considerations

CDC introduces overhead because every DML operation triggers additional logic.

Minimize Trigger Workload

Avoid heavy computation inside triggers.

Use Bulk Inserts Carefully

High-volume systems should batch CDC writes.

Index Strategically

CREATE INDEX idx_cdc_time ON cdc_change_log(change_time);

Partition Change Tables

For large systems, partition by date or table name.

Consider Asynchronous Processing

Use queues instead of direct inserts for high-throughput systems.


Common Pitfalls

Over-Auditing

Capturing every column change can generate excessive data.

Poor Indexing Strategy

CDC tables can grow quickly without optimization.

Transaction Contention

Autonomous transactions may introduce overhead if misused.

Missing Deletes

Many implementations forget to capture DELETE operations.

Trigger Complexity

Complex triggers can degrade DML performance significantly.


Enhancements for Enterprise CDC

1. JSON-Based Change Payloads

Instead of row-per-column logs:

{
  "employee_id": 1001,
  "salary": {
    "old": 5000,
    "new": 6000
  }
}

2. Unified Row Snapshot Model

Store full row images instead of column deltas.


3. Event Streaming Integration

Push CDC events to:

  • Kafka

  • REST APIs

  • Message queues


4. Soft Delete Strategy

Instead of DELETE tracking only:

is_deleted = 'Y'

5. CDC Filtering Rules

Capture only relevant tables or columns.


When PL/SQL CDC Is a Good Fit

PL/SQL-based CDC works well when:

  • Volume is moderate

  • Real-time streaming is not required

  • Licensing constraints exist

  • Simple integration is needed

  • Full control is required

It may not be ideal for:

  • Very high-throughput systems

  • Multi-database replication

  • Cloud-native event streaming architectures


Best Practices

  • Keep triggers lightweight and deterministic

  • Centralize CDC logic in packages

  • Capture transaction context whenever possible

  • Partition and archive CDC data regularly

  • Monitor performance impact continuously

  • Avoid business logic inside CDC layer

  • Test under realistic production load


Conclusion

Building Change Data Capture solutions with PL/SQL provides a flexible and powerful way to track database changes without external dependencies. While not as feature-rich as enterprise replication tools, a well-designed PL/SQL CDC framework can effectively support auditing, integration, and incremental processing requirements.

By combining triggers, reusable packages, and a well-structured change log model, organizations can build a scalable CDC system that provides visibility into data changes while remaining fully within the Oracle Database ecosystem.

The key to success is balancing completeness of change tracking with performance and maintainability—ensuring that the CDC layer remains lightweight, reliable, and production-ready.

Comments

Popular posts from this blog

Oracle APEX Development Tips I Wish I Knew Earlier

The Evolution of Software Engineering: From Writing Code to Solving Business Problems

Advanced PL/SQL Performance Tuning: Processing Millions of Records Efficiently