Deploy Kafka Instance Data Replication
Application Scenario
Huawei Cloud Distributed Message Service (DMS) for Kafka supports data replication between Kafka instances through Smart Connect, making it suitable for scenarios such as cross-region data synchronization, disaster recovery, and data migration. By configuring Smart Connect tasks, you can establish data replication channels between source and target instances to achieve automatic topic data synchronization. This best practice describes how to use Terraform to automatically deploy Kafka instance data replication, including creating multiple Kafka instances, Smart Connect, and a Smart Connect task.
Related Resources/Data Sources
This best practice involves the following main resources and data sources:
Data Sources
huaweicloud_availability_zones: Queries the availability zones in the current region
huaweicloud_dms_kafka_flavors: Queries the available Kafka instance flavors
Resources
huaweicloud_vpc: Creates the VPC
huaweicloud_vpc_subnet: Creates the subnet
huaweicloud_networking_secgroup: Creates the security group
huaweicloud_dms_kafka_instance: Creates the Kafka instances
huaweicloud_dms_kafka_topic: Creates the Kafka topic
huaweicloud_dms_kafka_smart_connect: Creates Smart Connect
huaweicloud_dms_kafka_smart_connect_task: Creates the Smart Connect task
Resource/Data Source Dependencies
Operation Steps
1. Script Preparation
Prepare the TF file (e.g., main.tf) in the specified workspace for writing the current best practice script, ensuring that it (or other TF files in the same directory) contains the provider version declaration and Huawei Cloud authentication information required for deploying resources. Refer to the "Preparation Before Deploying Huawei Cloud Resources" document for configuration introduction.
2. Query Data Sources
Add the following script to the TF file (e.g., main.tf) to query availability zone and Kafka flavor information:
Parameter Description:
type: Flavor type, assigned by referencing the local variable instance_configurations_without_flavor_id, default value is "cluster" (cluster mode)
availability_zones: Availability zone list, assigned by referencing input variables or availability zones data source
storage_spec_code: Storage specification code, assigned by referencing the local variable, default value is "dms.physical.storage.ultra.v2"
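The query script might look like the following sketch. The local value `instance_configurations_without_flavor_id` (an assumed helper name taken from the parameter description above) filters out instance configurations that already specify a flavor ID, so that flavors are only queried where needed:

```hcl
locals {
  # Instance configurations that do not specify a flavor ID and therefore
  # need a flavor query
  instance_configurations_without_flavor_id = [
    for config in var.instance_configurations : config if try(config.flavor_id, null) == null
  ]
}

# Query availability zones only when none are given through input variables
data "huaweicloud_availability_zones" "test" {
  count = length(var.availability_zones) == 0 ? 1 : 0
}

# Query a matching flavor for each instance configuration without a flavor ID
data "huaweicloud_dms_kafka_flavors" "test" {
  count = length(local.instance_configurations_without_flavor_id)

  type               = try(local.instance_configurations_without_flavor_id[count.index].type, "cluster")
  availability_zones = length(var.availability_zones) != 0 ? var.availability_zones : try(data.huaweicloud_availability_zones.test[0].names, [])
  storage_spec_code  = try(local.instance_configurations_without_flavor_id[count.index].storage_spec_code, "dms.physical.storage.ultra.v2")
}
```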
3. Create Basic Network Resources
Add the following script to the TF file (e.g., main.tf) to create VPC, subnet and security group:
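A minimal sketch of the network resources, assuming input variables named `vpc_name`, `vpc_cidr`, `subnet_name`, and `security_group_name`:

```hcl
resource "huaweicloud_vpc" "test" {
  name = var.vpc_name
  cidr = var.vpc_cidr
}

resource "huaweicloud_vpc_subnet" "test" {
  vpc_id = huaweicloud_vpc.test.id
  name   = var.subnet_name

  # Derive the subnet CIDR and gateway IP from the VPC CIDR
  cidr       = cidrsubnet(huaweicloud_vpc.test.cidr, 4, 1)
  gateway_ip = cidrhost(cidrsubnet(huaweicloud_vpc.test.cidr, 4, 1), 1)
}

resource "huaweicloud_networking_secgroup" "test" {
  name                 = var.security_group_name
  delete_default_rules = true
}
```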
4. Create Kafka Instance Resources
Add the following script to the TF file (e.g., main.tf) to instruct Terraform to create multiple Kafka instance resources (at least 2 instances are required):
Parameter Description:
count: Creation count, assigned by referencing the length of input variable instance_configurations, at least 2 instances are required
name: Kafka instance name, assigned by referencing input variable instance_configurations
availability_zones: Availability zone list, assigned by referencing input variables or availability zones data source
engine_version: Engine version, assigned by referencing input variables, default value is "3.x"
flavor_id: Flavor ID, assigned by referencing input variables or Kafka flavors data source
storage_spec_code: Storage specification code, assigned by referencing input variables, default value is "dms.physical.storage.ultra.v2"
storage_space: Storage space, assigned by referencing input variables, default value is 600 (GB)
broker_num: Number of brokers, assigned by referencing input variables, default value is 3
vpc_id: VPC ID, assigned by referencing the VPC resource
network_id: Network subnet ID, assigned by referencing the subnet resource
security_group_id: Security group ID, assigned by referencing the security group resource
access_user: Access user name, assigned by referencing input variables, optional parameter
password: Access password, assigned by referencing input variables, optional parameter
enabled_mechanisms: Enabled authentication mechanisms, assigned by referencing input variables, optional parameter, supports "SCRAM-SHA-512", etc.
port_protocol: Port protocol configuration, configured through dynamic blocks, optional parameter
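Putting the parameters above together, the instance resource might be sketched as follows. The flavor lookup is simplified (it assumes the flavor data source index lines up with the instance index), the optional `port_protocol` dynamic block is omitted because its attributes depend on the provider schema, and the `lifecycle` block matches the note at the end of this practice about immutable parameters:

```hcl
resource "huaweicloud_dms_kafka_instance" "test" {
  count = length(var.instance_configurations)

  name               = var.instance_configurations[count.index].name
  availability_zones = length(var.availability_zones) != 0 ? var.availability_zones : try(data.huaweicloud_availability_zones.test[0].names, [])
  engine_version     = try(var.instance_configurations[count.index].engine_version, "3.x")

  # Use the configured flavor ID if present, otherwise fall back to the
  # flavor query (simplified: assumes matching indexes)
  flavor_id = try(var.instance_configurations[count.index].flavor_id,
    data.huaweicloud_dms_kafka_flavors.test[count.index].flavors[0].id, null)

  storage_spec_code = try(var.instance_configurations[count.index].storage_spec_code, "dms.physical.storage.ultra.v2")
  storage_space     = try(var.instance_configurations[count.index].storage_space, 600)
  broker_num        = try(var.instance_configurations[count.index].broker_num, 3)

  vpc_id            = huaweicloud_vpc.test.id
  network_id        = huaweicloud_vpc_subnet.test.id
  security_group_id = huaweicloud_networking_secgroup.test.id

  # Optional SASL authentication settings
  access_user        = try(var.instance_configurations[count.index].access_user, null)
  password           = try(var.instance_configurations[count.index].password, null)
  enabled_mechanisms = try(var.instance_configurations[count.index].enabled_mechanisms, null)

  lifecycle {
    # Availability zones and flavor ID cannot be modified after creation
    ignore_changes = [
      availability_zones,
      flavor_id,
    ]
  }
}
```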
5. Create Kafka Topic Resource (Optional)
Add the following script to the TF file (e.g., main.tf) to create a Kafka topic (a topic is created only when task_topics is not specified):
Parameter Description:
count: Creation count; 1 topic is created when task_topics is empty, otherwise no topic is created
instance_id: Kafka instance ID, assigned by referencing the first Kafka instance resource
name: Topic name, assigned by referencing input variable topic_name
partitions: Number of partitions, assigned by referencing input variable topic_partitions, default value is 10
replicas: Number of replicas, assigned by referencing input variable topic_replicas, default value is 3
aging_time: Aging time, assigned by referencing input variable topic_aging_time, default value is 72 (hours)
sync_replication: Sync replication, assigned by referencing input variable topic_sync_replication, default value is false
sync_flushing: Sync flushing, assigned by referencing input variable topic_sync_flushing, default value is false
description: Topic description, assigned by referencing input variable topic_description, optional parameter
configs: Topic configurations, configured through dynamic blocks, optional parameter
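The topic resource might be sketched as follows, assuming an input variable `topic_configs` (a list of `{name, value}` objects) for the optional configurations:

```hcl
resource "huaweicloud_dms_kafka_topic" "test" {
  # Create one topic only when no task topics are specified
  count = length(var.task_topics) == 0 ? 1 : 0

  instance_id      = huaweicloud_dms_kafka_instance.test[0].id
  name             = var.topic_name
  partitions       = var.topic_partitions
  replicas         = var.topic_replicas
  aging_time       = var.topic_aging_time
  sync_replication = var.topic_sync_replication
  sync_flushing    = var.topic_sync_flushing
  description      = var.topic_description

  # Optional topic-level configurations through a dynamic block
  dynamic "configs" {
    for_each = var.topic_configs

    content {
      name  = configs.value.name
      value = configs.value.value
    }
  }
}
```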
6. Create Smart Connect Resource
Add the following script to the TF file (e.g., main.tf) to create Smart Connect:
Parameter Description:
instance_id: Kafka instance ID, assigned by referencing the first Kafka instance resource
storage_spec_code: Storage specification code, assigned by referencing input variable smart_connect_storage_spec_code, optional parameter
bandwidth: Bandwidth, assigned by referencing input variable smart_connect_bandwidth, optional parameter
node_count: Number of nodes, assigned by referencing input variable smart_connect_node_count, default value is 2
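A minimal sketch of the Smart Connect resource, created on the first (target) instance:

```hcl
resource "huaweicloud_dms_kafka_smart_connect" "test" {
  instance_id       = huaweicloud_dms_kafka_instance.test[0].id
  storage_spec_code = var.smart_connect_storage_spec_code
  bandwidth         = var.smart_connect_bandwidth
  node_count        = var.smart_connect_node_count
}
```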
7. Create Smart Connect Task Resource
Add the following script to the TF file (e.g., main.tf) to create a Smart Connect task to achieve data replication between instances:
Parameter Description:
instance_id: Kafka instance ID, assigned by referencing the first Kafka instance resource
task_name: Task name, assigned by referencing input variable task_name
source_type: Source type, set to "KAFKA_REPLICATOR_SOURCE" (Kafka replicator source)
start_later: Whether to start later, assigned by referencing input variable task_start_later, default value is false
topics: Topic list, assigned by referencing input variable task_topics or topic resources
source_task.peer_instance_id: Peer instance ID, assigned by referencing the second Kafka instance resource
source_task.direction: Replication direction, assigned by referencing input variable task_direction, default value is "two-way"
source_task.replication_factor: Replication factor, assigned by referencing input variable task_replication_factor, default value is 3
source_task.task_num: Number of tasks, assigned by referencing input variable task_task_num, default value is 2
source_task.provenance_header_enabled: Whether to enable provenance header, assigned by referencing input variable task_provenance_header_enabled, default value is false
source_task.sync_consumer_offsets_enabled: Whether to sync consumer offsets, assigned by referencing input variable task_sync_consumer_offsets_enabled, default value is false
source_task.rename_topic_enabled: Whether to rename topic, assigned by referencing input variable task_rename_topic_enabled, default value is true
source_task.consumer_strategy: Consumer strategy, assigned by referencing input variable task_consumer_strategy, default value is "latest"
source_task.compression_type: Compression type, assigned by referencing input variable task_compression_type, default value is "none"
source_task.topics_mapping: Topic mapping, assigned by referencing input variable task_topics_mapping, optional parameter
source_task.security_protocol: Security protocol, automatically determined based on peer instance port protocol configuration
source_task.sasl_mechanism: SASL mechanism, automatically determined based on peer instance authentication mechanism
source_task.user_name: User name, obtained from peer instance
source_task.password: Password, obtained from peer instance
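Combining the parameters above, the task resource might be sketched as follows. The `source_task` attribute names follow the parameter list above and should be verified against the provider documentation; the security protocol derivation is a simplified assumption (SASL when the peer instance enables an authentication mechanism, plaintext otherwise):

```hcl
resource "huaweicloud_dms_kafka_smart_connect_task" "test" {
  # Smart Connect must exist before tasks can be created on the instance
  depends_on = [huaweicloud_dms_kafka_smart_connect.test]

  instance_id = huaweicloud_dms_kafka_instance.test[0].id
  task_name   = var.task_name
  source_type = "KAFKA_REPLICATOR_SOURCE"
  start_later = var.task_start_later

  # Use the specified topics, or fall back to the topic created in step 5
  topics = length(var.task_topics) != 0 ? var.task_topics : huaweicloud_dms_kafka_topic.test[*].name

  source_task {
    peer_instance_id              = huaweicloud_dms_kafka_instance.test[1].id
    direction                     = var.task_direction
    replication_factor            = var.task_replication_factor
    task_num                      = var.task_task_num
    provenance_header_enabled     = var.task_provenance_header_enabled
    sync_consumer_offsets_enabled = var.task_sync_consumer_offsets_enabled
    rename_topic_enabled          = var.task_rename_topic_enabled
    consumer_strategy             = var.task_consumer_strategy
    compression_type              = var.task_compression_type
    topics_mapping                = var.task_topics_mapping

    # Derived from the peer (source) instance: simplified assumption that an
    # enabled authentication mechanism implies SASL over plaintext
    security_protocol = try(length(huaweicloud_dms_kafka_instance.test[1].enabled_mechanisms) > 0, false) ? "SASL_PLAINTEXT" : "PLAINTEXT"
    sasl_mechanism    = try(huaweicloud_dms_kafka_instance.test[1].enabled_mechanisms[0], null)
    user_name         = huaweicloud_dms_kafka_instance.test[1].access_user
    password          = huaweicloud_dms_kafka_instance.test[1].password
  }
}
```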
8. Preset Input Parameters Required for Resource Deployment (Optional)
In this practice, some resources use input variables to assign configuration content. These input parameters need to be manually entered during subsequent deployment. At the same time, Terraform provides a method to preset these configurations through tfvars files, which can avoid repeated input during each execution.
Create a terraform.tfvars file in the working directory with the following example content:
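For example (all names and the password below are illustrative placeholders; replace them with your own values):

```hcl
vpc_name            = "tf_test_vpc"
vpc_cidr            = "192.168.0.0/16"
subnet_name         = "tf_test_subnet"
security_group_name = "tf_test_secgroup"

# At least 2 instances: the first is the target, the second is the source
instance_configurations = [
  {
    name = "tf-kafka-target"
  },
  {
    name               = "tf-kafka-source"
    access_user        = "kafka_user"
    password           = "YourPassword@123" # must meet complexity requirements
    enabled_mechanisms = ["SCRAM-SHA-512"]
  }
]

topic_name = "tf-test-topic"
task_name  = "tf-test-replication-task"
```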
Usage:
1. Save the above content as a `terraform.tfvars` file in the working directory (Terraform automatically loads a file with this exact name when executing commands; files with other names must use the `.auto.tfvars` suffix, such as `variables.auto.tfvars`, to be loaded automatically).
2. Modify the parameter values according to actual needs. In particular, `instance_configurations` must define at least 2 instances, with the first as the target instance and the second as the source instance. If the source instance enables SASL authentication, you also need to configure `access_user`, `password`, and `enabled_mechanisms`, and `password` must meet the password complexity requirements.
3. When executing `terraform plan` or `terraform apply`, Terraform will automatically read the variable values in this file.
In addition to using the terraform.tfvars file, you can also set variable values in the following ways:
Command line parameters: `terraform apply -var="task_name=my_task" -var="vpc_name=my_vpc"`
Environment variables: `export TF_VAR_task_name=my_task` and `export TF_VAR_vpc_name=my_vpc`
Custom named variable file: `terraform apply -var-file="custom.tfvars"`
Note: If the same variable is set through multiple methods, Terraform resolves the value according to the following precedence: command line parameters > variable files > environment variables > default values. Since `password` contains sensitive information, it is recommended to set it through environment variables or encrypted variable files. In addition, ensure network connectivity between the source and target instances and that the authentication configuration of the source instance is correct.
9. Initialize and Apply Terraform Configuration
After completing the above script configuration, execute the following steps to create Kafka instance data replication:
1. Run `terraform init` to initialize the environment.
2. Run `terraform plan` to view the resource creation plan.
3. After confirming that the resource plan is correct, run `terraform apply` to start creating the Kafka instances, Smart Connect, and the Smart Connect task.
4. Run `terraform show` to view the details of the created Smart Connect task.
Note: After the Smart Connect task is created, data replication starts automatically according to the configuration. If `start_later = true` is set, the task will not start immediately after creation and must be started manually. An instance's availability zones and flavor ID cannot be modified after creation, so they must be configured correctly at creation time; using `lifecycle.ignore_changes` prevents Terraform from attempting to modify these immutable parameters in subsequent updates.
Reference Information