Change Data Capture (CDC)
Change data capture utilizes the SQL Server Agent to log insertions, updates, and deletions occurring in a table. So, it makes these data changes accessible to be easily consumed using a relational format. The column data and essential metadata need to apply these change data to a target environment are captured for the modified rows and stored in change tables that mirror the column structure of the tracked source tables. Furthermore, table-valued functions are available for systematic access to this change data by consumers.
Change data capture is used by application to track changes which happen in database. Some ETL apps like Kafka need this feature to be enabled when we want to stream data from one database to another destination. CDC is a feature at table level, so you need to enable this per each table if there are many tables in a database that need to be tracked. Here is the steps to setup CDC for tables in database.
1. Change the Database Owner to 'sa' or 'dbo' (optional)
The first thing to do is to change the database owner to 'sa' or 'dbo' user which have sysadmin role. This is optional as to implement CDC it's not really necessary but sometimes it prevents error of permission when doing alter db or configuration process.
you can use the query:
EXEC sp_changedbowner 'sa'
2. Enable CDC in database
Now you need to enable CDC in the database by running the following query:
EXEC sys.sp_cdc_enable_db
Notes:
CDC cannot be enabled when Transactional Replication is on, must turn off, enable CDC then reapply Transactional Replication
CDC source is the SQL Server Transaction Log
3. Enable Change Table and Create Capture Instance
So in my TestDB, I have this table which host some values for people, their names, ages, and email.
Now to enable CDC for this table, we can just create a capture instance for this table which will record all changes made to the table. Each table can only have up to 2 capture instance. To Create capture instance run the query to the targeted database:
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo'
, @source_name = N'SimpleTable'
, @role_name = NULL
, @capture_instance = NULL
--, @supports_net_changes = 1
GO
Notes: source_schema is the schema of the table (default is dbo), source_name is the name of the table, role_name can be null if not needed, capture_instance name will be concatenated name of the schema name and table name if defined Null plus "_CT" at the end of the name
Example: here the capture instance will be "dbo_SimpleTable_CT"
enable supports_net_changes if the table have primary key or unique key
4. Check the Change Table which has been created and also CDC Jobs for the Change Table
Now after you created the Capture Instance for the table, you can check for all the capture instances which has been created in a database by using this query:
EXEC sys.sp_cdc_help_change_data_capture
Notes: Each table can only have 2 capture instances at most
You will also have to check that there are 2 jobs generated for CDC, first is Capture job (yes, CDC is not instaneously capture, but using a job which is scheduled to run) and the second is Cleanup job which will clean old records of CDC after some times. Note that CDC uses SQL Agent Jobs to move log entries to the CDC tables, so there is a latency, and also there is a moving window of data kept, I believe the default is 3 days before it's cleaned up.
5. Tracking Changes Recorded in Change Table
So in here I'm inserting some data to the table to be recorded by CDC.
Now to see all the recorded change you can query like this:
SELECT * FROM dbo.<Capture_Instance_Name>_CT
so in this case, I have Change Table in system table which captured all my changes named cdc.dbo_SimpleTable_CT and I can query this table to see all the changes in data records.
The thing that you need to know is The column __$operation records the operation that is associated with the change: 1 = delete, 2 = insert, 3 = update (before), and 4 = update (after)
The column _$seqval is the order changes within a transaction
Now that you have done all the setup for CDC and making sure it's working, your application can now connect to database and use CDC.
To Disable Change Table and Capture Instance:
Use this query to disable CDC for a table:
EXEC sys.sp_cdc_disable_table
@source_schema = N'dbo'
, @source_name = N'SimpleTable'
, @capture_instance = N'dbo_SimpleTable'
To disable CDC in the database:
Use this query to disable CDC in the database:
EXEC sys.sp_cdc_disable_db
Comments
Post a Comment