Home >Java >javaTutorial >Java distributed Kafka message queue instance analysis

Java distributed Kafka message queue instance analysis

WBOY
WBOYforward
2023-04-19 16:10:151242browse

Introduction

Apache Kafka is a distributed publish-subscribe messaging system. The definition of kafka on the kafka official website is: a distributed publish-subscribe messaging system. It was originally developed by LinkedIn, which was contributed to the Apache Foundation in 2010 and became a top open source project. Kafka is a fast, scalable, and inherently distributed, partitioned, and replicable commit log service.

Note: Kafka does not follow the JMS specification (), it only provides publish and subscribe communication methods.

Kafka core related names

  1. Broker: Kafka node, a Kafka node is a broker, multiple brokers can form a Kafka cluster

  2. Topic: A type of message. The directory where the message is stored is the topic. For example, page view logs, click logs, etc. can exist in the form of topics. The Kafka cluster can be responsible for the distribution of multiple topics at the same time.

  3. massage: The most basic delivery object in Kafka.

  4. Partition: The physical grouping of topics. A topic can be divided into multiple partitions, and each partition is an ordered queue. Partitioning is implemented in Kafka, and a broker represents a region.

  5. Segment: Partition is physically composed of multiple segments. Each segment stores message information.

  6. Producer: Producer, produces messages and sends them. Go to topic

  7. ##Consumer: consumer, subscribe to topic and consume messages, consumer consumes as a thread

  8. Consumer Group: consumer group, A Consumer Group contains multiple consumers

  9. Offset: offset, understood as the index position of the message in the message partition

Topic and queue Difference:

The queue is a data structure that follows the first-in-first-out principle

kafka cluster installation

  • Install jdk1.8 environment on each server

  • Install Zookeeper cluster environment

  • Install kafka cluster environment

  • Run environment test

Java distributed Kafka message queue instance analysis

#The installation of jdk environment and zookeeper will not be described in detail here.

Why does kafka rely on zookeeper: Kafka will store mq information on zookeeper. In order to make the entire cluster easy to expand, zookeeper's event notification is used to sense each other.

kafka cluster installation steps:

1. Download the kafka compressed package

2. Unzip the installation package

tar -zxvf kafka_2.11 -1.0.0.tgz

3. Modify kafka’s configuration file config/server.properties

Configuration file modification content:

  • zookeeper connection address:

    zookeeper.connect=192.168.1.19:2181

  • The listening ip is changed to the local ip

    listeners=PLAINTEXT:// 192.168.1.19:9092

  • #kafka’s brokerid, each broker’s id is different

    broker.id=0

4. Start kafka in sequence

./kafka-server-start.sh -daemon config/server.properties

kafka usage

kafka file storage

topic is a logical concept, and partition is a physical concept. Each partition corresponds to a log file, and the log file stores the data generated by the Producer. The data generated by the Producer will be continuously appended to the end of the log file. In order to prevent the log file from being too large and causing inefficiency in data positioning, Kafka adopts a sharding and indexing mechanism to divide each partition into multiple segments. Each segment includes: ".index" files, ".log" files, and .timeindex files. These files are located in a folder, and the naming rules of the folder are: topic name partition serial number.

For example: execute the command to create a new topic, which is divided into three areas and stored in three brokers:

./kafka-topics.sh --create --zookeeper localhost: 2181 --replication-factor 1 --partitions 3 --topic kaico

Java distributed Kafka message queue instance analysis

Java distributed Kafka message queue instance analysis

    ##a partition Divided into multiple segments
  • .log log file
  • .index offset index file
  • .timeindex timestamp index file
  • Other files (partition.metadata, leader-epoch-checkpoint)
  • Springboot integration kafka

maven dependency

 <dependencies>
        <!-- springBoot集成kafka -->
        <dependency>
            <groupId>org.springframework.kafka</groupId>
            <artifactId>spring-kafka</artifactId>
        </dependency>
        <!-- SpringBoot整合Web组件 -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
    </dependencies>

yml configuration

# kafka
spring:
  kafka:
    # kafka服务器地址(可以多个)
#    bootstrap-servers: 192.168.212.164:9092,192.168.212.167:9092,192.168.212.168:9092
    bootstrap-servers: www.kaicostudy.com:9092,www.kaicostudy.com:9093,www.kaicostudy.com:9094
    consumer:
      # 指定一个默认的组名
      group-id: kafkaGroup1
      # earliest:当各分区下有已提交的offset时,从提交的offset开始消费;无提交的offset时,从头开始消费
      # latest:当各分区下有已提交的offset时,从提交的offset开始消费;无提交的offset时,消费新产生的该分区下的数据
      # none:topic各分区都存在已提交的offset时,从offset后开始消费;只要有一个分区不存在已提交的offset,则抛出异常
      auto-offset-reset: earliest
      # key/value的反序列化
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
    producer:
      # key/value的序列化
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
      # 批量抓取
      batch-size: 65536
      # 缓存容量
      buffer-memory: 524288
      # 服务器地址
      bootstrap-servers: www.kaicostudy.com:9092,www.kaicostudy.com:9093,www.kaicostudy.com:9094

producer

@RestController
public class KafkaController {
	/**
	 * 注入kafkaTemplate
	 */
	@Autowired
	private KafkaTemplate<String, String> kafkaTemplate;
	/**
	 * 发送消息的方法
	 *
	 * @param key
	 *            推送数据的key
	 * @param data
	 *            推送数据的data
	 */
	private void send(String key, String data) {
		// topic 名称 key   data 消息数据
		kafkaTemplate.send("kaico", key, data);
	}
	// test 主题 1 my_test 3
	@RequestMapping("/kafka")
	public String testKafka() {
		int iMax = 6;
		for (int i = 1; i < iMax; i++) {
			send("key" + i, "data" + i);
		}
		return "success";
	}
}

consumer

@Component
public class TopicKaicoConsumer {
    /**
     * 消费者使用日志打印消息
     */
    @KafkaListener(topics = "kaico") //监听的主题
    public void receive(ConsumerRecord<?, ?> consumer) {
        System.out.println("topic名称:" + consumer.topic() + ",key:" +
                consumer.key() + "," +
                "分区位置:" + consumer.partition()
                + ", 下标" + consumer.offset());
        //输出key对应的value的值
        System.out.println(consumer.value());
    }
}

The above is the detailed content of Java distributed Kafka message queue instance analysis. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:yisu.com. If there is any infringement, please contact admin@php.cn delete