Deploy Kafka Instance Data Replication

Application Scenario

Huawei Cloud Distributed Message Service (DMS) for Kafka supports data replication between Kafka instances through Smart Connect, which suits scenarios such as cross-region data synchronization, disaster recovery, and data migration. By configuring Smart Connect tasks, you can establish a data replication channel between source and target instances to synchronize topic data automatically. This best practice introduces how to use Terraform to deploy Kafka instance data replication automatically, including creating multiple Kafka instances, enabling Smart Connect, and creating the Smart Connect task.

This best practice involves the following main resources and data sources:

Data Sources

  • huaweicloud_availability_zones: Queries the availability zones in the current region

  • huaweicloud_dms_kafka_flavors: Queries the available Kafka instance flavors

Resources

  • huaweicloud_vpc: Creates the VPC

  • huaweicloud_vpc_subnet: Creates the subnet

  • huaweicloud_networking_secgroup: Creates the security group

  • huaweicloud_dms_kafka_instance: Creates the Kafka instances

  • huaweicloud_dms_kafka_topic: Creates the Kafka topic (optional)

  • huaweicloud_dms_kafka_smart_connect: Enables Smart Connect

  • huaweicloud_dms_kafkav2_smart_connect_task: Creates the Smart Connect task

Resource/Data Source Dependencies

Operation Steps

1. Script Preparation

Prepare a TF file (e.g., main.tf) in the target workspace for the scripts in this best practice, and ensure that it (or another TF file in the same directory) contains the provider version declaration and Huawei Cloud authentication information required to deploy the resources. Refer to the "Preparation Before Deploying Huawei Cloud Resources" document for an introduction to this configuration.
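As a reference, a minimal provider declaration might look like the following sketch; the version constraint and the variable names (region_name, access_key, secret_key) are assumptions to be adapted to your environment:

```hcl
terraform {
  required_providers {
    huaweicloud = {
      source  = "huaweicloud/huaweicloud"
      version = ">= 1.57.0" # assumed minimum; use a version that supports Smart Connect tasks
    }
  }
}

provider "huaweicloud" {
  region     = var.region_name # e.g. "cn-north-4"
  access_key = var.access_key
  secret_key = var.secret_key
}
```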

2. Query Data Sources

Add the following script to the TF file (e.g., main.tf) to query availability zone and Kafka flavor information:
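The queries can be sketched as follows. The shape of the instance_configurations input variable and the derived local instance_configurations_without_flavor_id are assumptions reconstructed from the parameter descriptions below:

```hcl
variable "availability_zones" {
  description = "The availability zones used by the Kafka instances (queried automatically if empty)"
  type        = list(string)
  default     = []
}

# Instance configurations without an explicit flavor ID need the flavor query
locals {
  instance_configurations_without_flavor_id = [
    for config in var.instance_configurations : config if config.flavor_id == null
  ]
}

# Query availability zones only when none are provided via input variables
data "huaweicloud_availability_zones" "test" {
  count = length(var.availability_zones) == 0 ? 1 : 0
}

# Query Kafka flavors for each instance configuration without a flavor ID
data "huaweicloud_dms_kafka_flavors" "test" {
  count = length(local.instance_configurations_without_flavor_id)

  type               = local.instance_configurations_without_flavor_id[count.index].flavor_type # default "cluster"
  availability_zones = length(var.availability_zones) == 0 ? try(data.huaweicloud_availability_zones.test[0].names, null) : var.availability_zones
  storage_spec_code  = local.instance_configurations_without_flavor_id[count.index].storage_spec_code # default "dms.physical.storage.ultra.v2"
}
```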

Parameter Description:

  • type: Flavor type, assigned by referencing the local variable instance_configurations_without_flavor_id, default value is "cluster" (cluster mode)

  • availability_zones: Availability zone list, assigned by referencing input variables or availability zones data source

  • storage_spec_code: Storage specification code, assigned by referencing the local variable, default value is "dms.physical.storage.ultra.v2"

3. Create Basic Network Resources

Add the following script to the TF file (e.g., main.tf) to create VPC, subnet and security group:
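A sketch of the network resources is shown below; the variable names and the CIDR derivation via cidrsubnet/cidrhost are illustrative assumptions:

```hcl
resource "huaweicloud_vpc" "test" {
  name = var.vpc_name
  cidr = var.vpc_cidr
}

resource "huaweicloud_vpc_subnet" "test" {
  vpc_id     = huaweicloud_vpc.test.id
  name       = var.subnet_name
  cidr       = cidrsubnet(huaweicloud_vpc.test.cidr, 4, 1)              # e.g. 192.168.16.0/20 from a /16 VPC
  gateway_ip = cidrhost(cidrsubnet(huaweicloud_vpc.test.cidr, 4, 1), 1) # first host address of the subnet
}

resource "huaweicloud_networking_secgroup" "test" {
  name                 = var.security_group_name
  delete_default_rules = true
}
```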

4. Create Kafka Instance Resources

Add the following script to the TF file (e.g., main.tf) to instruct Terraform to create multiple Kafka instance resources (at least 2 instances are required):
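The instance resources can be sketched as follows. The object shape of instance_configurations and the inner attribute of the port_protocol block are assumptions; consult the provider documentation for the exact schema of your provider version:

```hcl
resource "huaweicloud_dms_kafka_instance" "test" {
  count = length(var.instance_configurations) # at least 2: [0] target, [1] source

  name               = var.instance_configurations[count.index].name
  engine_version     = var.instance_configurations[count.index].engine_version # default "3.x"
  availability_zones = length(var.availability_zones) == 0 ? try(data.huaweicloud_availability_zones.test[0].names, null) : var.availability_zones
  flavor_id          = var.instance_configurations[count.index].flavor_id != null ? var.instance_configurations[count.index].flavor_id : try(data.huaweicloud_dms_kafka_flavors.test[0].flavors[0].id, null)
  storage_spec_code  = var.instance_configurations[count.index].storage_spec_code # default "dms.physical.storage.ultra.v2"
  storage_space      = var.instance_configurations[count.index].storage_space     # default 600 (GB)
  broker_num         = var.instance_configurations[count.index].broker_num        # default 3

  vpc_id            = huaweicloud_vpc.test.id
  network_id        = huaweicloud_vpc_subnet.test.id
  security_group_id = huaweicloud_networking_secgroup.test.id

  access_user        = var.instance_configurations[count.index].access_user        # optional
  password           = var.instance_configurations[count.index].password           # optional
  enabled_mechanisms = var.instance_configurations[count.index].enabled_mechanisms # optional, e.g. ["SCRAM-SHA-512"]

  # Optional port protocol configuration via a dynamic block
  dynamic "port_protocol" {
    for_each = var.instance_configurations[count.index].port_protocol != null ? [var.instance_configurations[count.index].port_protocol] : []

    content {
      private_sasl_ssl_enable = port_protocol.value.private_sasl_ssl_enable # illustrative attribute; check the provider docs
    }
  }

  # Availability zones and flavor ID cannot be changed after creation
  lifecycle {
    ignore_changes = [availability_zones, flavor_id]
  }
}
```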

Parameter Description:

  • count: Creation count, assigned by referencing the length of input variable instance_configurations, at least 2 instances are required

  • name: Kafka instance name, assigned by referencing input variable instance_configurations

  • availability_zones: Availability zone list, assigned by referencing input variables or availability zones data source

  • engine_version: Engine version, assigned by referencing input variables, default value is "3.x"

  • flavor_id: Flavor ID, assigned by referencing input variables or Kafka flavors data source

  • storage_spec_code: Storage specification code, assigned by referencing input variables, default value is "dms.physical.storage.ultra.v2"

  • storage_space: Storage space, assigned by referencing input variables, default value is 600 (GB)

  • broker_num: Number of brokers, assigned by referencing input variables, default value is 3

  • vpc_id: VPC ID, assigned by referencing the VPC resource

  • network_id: Network subnet ID, assigned by referencing the subnet resource

  • security_group_id: Security group ID, assigned by referencing the security group resource

  • access_user: Access user name, assigned by referencing input variables, optional parameter

  • password: Access password, assigned by referencing input variables, optional parameter

  • enabled_mechanisms: Enabled authentication mechanisms, assigned by referencing input variables, optional parameter, supports "SCRAM-SHA-512", etc.

  • port_protocol: Port protocol configuration, configured through dynamic blocks, optional parameter

5. Create Kafka Topic Resource (Optional)

Add the following script to the TF file (e.g., main.tf) to create a Kafka topic (a topic is created only when task_topics is not specified):
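A sketch of the topic resource follows; the topic_configs variable (a list of name/value objects) is an assumption used to feed the dynamic block:

```hcl
resource "huaweicloud_dms_kafka_topic" "test" {
  count = length(var.task_topics) == 0 ? 1 : 0 # only create a topic when no task topics are specified

  instance_id      = huaweicloud_dms_kafka_instance.test[0].id
  name             = var.topic_name
  partitions       = var.topic_partitions       # default 10
  replicas         = var.topic_replicas         # default 3
  aging_time       = var.topic_aging_time       # default 72 (hours)
  sync_replication = var.topic_sync_replication # default false
  sync_flushing    = var.topic_sync_flushing    # default false
  description      = var.topic_description      # optional

  # Optional topic configurations via a dynamic block
  dynamic "configs" {
    for_each = var.topic_configs

    content {
      name  = configs.value.name
      value = configs.value.value
    }
  }
}
```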

Parameter Description:

  • count: Creation count; 1 topic is created when task_topics is empty, otherwise none

  • instance_id: Kafka instance ID, assigned by referencing the first Kafka instance resource

  • name: Topic name, assigned by referencing input variable topic_name

  • partitions: Number of partitions, assigned by referencing input variable topic_partitions, default value is 10

  • replicas: Number of replicas, assigned by referencing input variable topic_replicas, default value is 3

  • aging_time: Aging time, assigned by referencing input variable topic_aging_time, default value is 72 (hours)

  • sync_replication: Sync replication, assigned by referencing input variable topic_sync_replication, default value is false

  • sync_flushing: Sync flushing, assigned by referencing input variable topic_sync_flushing, default value is false

  • description: Topic description, assigned by referencing input variable topic_description, optional parameter

  • configs: Topic configurations, configured through dynamic blocks, optional parameter

6. Create Smart Connect Resource

Add the following script to the TF file (e.g., main.tf) to create Smart Connect:
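Enabling Smart Connect on the first (target) instance can be sketched as:

```hcl
resource "huaweicloud_dms_kafka_smart_connect" "test" {
  instance_id       = huaweicloud_dms_kafka_instance.test[0].id
  storage_spec_code = var.smart_connect_storage_spec_code # optional
  bandwidth         = var.smart_connect_bandwidth         # optional
  node_count        = var.smart_connect_node_count        # default 2
}
```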

Parameter Description:

  • instance_id: Kafka instance ID, assigned by referencing the first Kafka instance resource

  • storage_spec_code: Storage specification code, assigned by referencing input variable smart_connect_storage_spec_code, optional parameter

  • bandwidth: Bandwidth, assigned by referencing input variable smart_connect_bandwidth, optional parameter

  • node_count: Number of nodes, assigned by referencing input variable smart_connect_node_count, default value is 2

7. Create Smart Connect Task Resource

Add the following script to the TF file (e.g., main.tf) to create a Smart Connect task to achieve data replication between instances:
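The task resource can be sketched as follows. The resource type name (huaweicloud_dms_kafkav2_smart_connect_task) and the simplified derivation of security_protocol and sasl_mechanism from the peer instance's configuration are assumptions; verify them against your provider version:

```hcl
resource "huaweicloud_dms_kafkav2_smart_connect_task" "test" {
  depends_on = [huaweicloud_dms_kafka_smart_connect.test] # Smart Connect must be enabled first

  instance_id = huaweicloud_dms_kafka_instance.test[0].id
  task_name   = var.task_name
  source_type = "KAFKA_REPLICATOR_SOURCE"
  start_later = var.task_start_later # default false
  topics      = length(var.task_topics) == 0 ? [huaweicloud_dms_kafka_topic.test[0].name] : var.task_topics

  source_task {
    peer_instance_id              = huaweicloud_dms_kafka_instance.test[1].id
    direction                     = var.task_direction                     # default "two-way"
    replication_factor            = var.task_replication_factor            # default 3
    task_num                      = var.task_task_num                      # default 2
    provenance_header_enabled     = var.task_provenance_header_enabled     # default false
    sync_consumer_offsets_enabled = var.task_sync_consumer_offsets_enabled # default false
    rename_topic_enabled          = var.task_rename_topic_enabled          # default true
    consumer_strategy             = var.task_consumer_strategy             # default "latest"
    compression_type              = var.task_compression_type              # default "none"
    topics_mapping                = var.task_topics_mapping                # optional

    # Simplified assumption: use SASL_SSL when the peer instance defines an access user
    security_protocol = var.instance_configurations[1].access_user != null ? "SASL_SSL" : "PLAINTEXT"
    sasl_mechanism    = try(var.instance_configurations[1].enabled_mechanisms[0], null)
    user_name         = var.instance_configurations[1].access_user
    password          = var.instance_configurations[1].password
  }
}
```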

Parameter Description:

  • instance_id: Kafka instance ID, assigned by referencing the first Kafka instance resource

  • task_name: Task name, assigned by referencing input variable task_name

  • source_type: Source type, set to "KAFKA_REPLICATOR_SOURCE" (Kafka replicator source)

  • start_later: Whether to start later, assigned by referencing input variable task_start_later, default value is false

  • topics: Topic list, assigned by referencing input variable task_topics or topic resources

  • source_task.peer_instance_id: Peer instance ID, assigned by referencing the second Kafka instance resource

  • source_task.direction: Replication direction, assigned by referencing input variable task_direction, default value is "two-way"

  • source_task.replication_factor: Replication factor, assigned by referencing input variable task_replication_factor, default value is 3

  • source_task.task_num: Number of tasks, assigned by referencing input variable task_task_num, default value is 2

  • source_task.provenance_header_enabled: Whether to enable provenance header, assigned by referencing input variable task_provenance_header_enabled, default value is false

  • source_task.sync_consumer_offsets_enabled: Whether to sync consumer offsets, assigned by referencing input variable task_sync_consumer_offsets_enabled, default value is false

  • source_task.rename_topic_enabled: Whether to rename topic, assigned by referencing input variable task_rename_topic_enabled, default value is true

  • source_task.consumer_strategy: Consumer strategy, assigned by referencing input variable task_consumer_strategy, default value is "latest"

  • source_task.compression_type: Compression type, assigned by referencing input variable task_compression_type, default value is "none"

  • source_task.topics_mapping: Topic mapping, assigned by referencing input variable task_topics_mapping, optional parameter

  • source_task.security_protocol: Security protocol, automatically determined based on peer instance port protocol configuration

  • source_task.sasl_mechanism: SASL mechanism, automatically determined based on peer instance authentication mechanism

  • source_task.user_name: User name, obtained from peer instance

  • source_task.password: Password, obtained from peer instance

8. Preset Input Parameters Required for Resource Deployment (Optional)

In this practice, some resources take their configuration from input variables, whose values would otherwise have to be entered manually on each deployment. Terraform also supports presetting these values in tfvars files, which avoids repeated input on every execution.

Create a terraform.tfvars file in the working directory with the following example content:
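An example of such a file is sketched below, assuming the variable definitions from the previous steps; all names and values are placeholders to be adjusted:

```hcl
# Network
vpc_name            = "tf_test_vpc"
vpc_cidr            = "192.168.0.0/16"
subnet_name         = "tf_test_subnet"
security_group_name = "tf_test_secgroup"

# Kafka instances: at least 2; the first is the target instance, the second is the source instance
instance_configurations = [
  {
    name = "tf-test-kafka-target"
  },
  {
    name               = "tf-test-kafka-source"
    access_user        = "tf_test_user"
    password           = "YourP@ssw0rd123" # replace with a password meeting complexity requirements
    enabled_mechanisms = ["SCRAM-SHA-512"]
  },
]

# Topic (created only when task_topics is empty)
topic_name = "tf-test-topic"

# Smart Connect task
task_name = "tf-test-replication-task"
```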

Usage:

  1. Save the above content as terraform.tfvars in the working directory (Terraform automatically loads a file with this exact name when executing commands; a file with any other name is loaded automatically only if it ends in .auto.tfvars, such as variables.auto.tfvars)

  2. Modify parameter values according to actual needs, especially:

    • instance_configurations needs to configure at least 2 instances, the first instance as the target instance, the second instance as the source instance

    • If the source instance enables SASL authentication, you need to configure access_user, password and enabled_mechanisms

    • password needs to be set to a password that meets password complexity requirements

  3. When executing terraform plan or terraform apply, Terraform will automatically read the variable values in this file

In addition to using the terraform.tfvars file, you can also set variable values in the following ways:

  1. Command line parameters: terraform apply -var="task_name=my_task" -var="vpc_name=my_vpc"

  2. Environment variables: export TF_VAR_task_name=my_task and export TF_VAR_vpc_name=my_vpc

  3. Custom named variable file: terraform apply -var-file="custom.tfvars"

Note: If the same variable is set through multiple methods, Terraform applies them in the following order of precedence: command-line parameters > variable files > environment variables > default values. Since password contains sensitive information, it is recommended to set it through environment variables or an encrypted variable file. Also ensure that the source and target instances have network connectivity and that the source instance's authentication configuration is correct.

9. Initialize and Apply Terraform Configuration

After completing the above script configuration, execute the following steps to create Kafka instance data replication:

  1. Run terraform init to initialize the environment

  2. Run terraform plan to view the resource creation plan

  3. After confirming that the resource plan is correct, run terraform apply to start creating Kafka instances, Smart Connect, and Smart Connect tasks

  4. Run terraform show to view the details of the created Smart Connect task

Note: After the Smart Connect task is created, data replication starts automatically according to the configuration. If start_later=true is set, the task does not start immediately after creation and must be started manually. An instance's availability zones and flavor ID cannot be modified after creation, so they must be configured correctly up front; using lifecycle.ignore_changes prevents Terraform from attempting to modify these immutable parameters in subsequent updates.

Reference Information
