Posts

listener Vs advertised.listener | SSL | plainText

listeners=PLAINTEXT://myhost:9092,OUTSIDE://:9094 PLAINTEST : is just name of lister myhost: is the host and port where broker bind itself, when it starts, that mean, it will listen to this IP/ethe0 interface and listen on this this 09092, if somebidy is sending any traffic, then it will process it, else it will not. listener:  For internal Broker to Broker or zookeeper communication advertised.listeners:  For external point of contact, like in cloud env, this will be external exposed IP and port, like NodePort IP:Port

records | logs| offset | consumer groups

Zookeeper and kafka relationship

  Kafka uses Zookeeper for the following:   Source : Quora Electing a controller. The controller is one of the brokers and is responsible for maintaining the leader/follower relationship for all the partitions. When a node shuts down, it is the controller that tells other replicas to become partition leaders to replace the partition leaders on the node that is going away. Zookeeper is used to elect a controller, make sure there is only one and elect a new one it if it crashes. Cluster membership - which brokers are alive and part of the cluster? this is also managed through ZooKeeper. Topic configuration - which topics exist, how many partitions each has, where are the replicas, who is the preferred leader, what configuration overrides are set for each topic (0.9.0) - Quotas - how much data is each client allowed to read and write (0.9.0) - ACLs - who is allowed to read and write to which topic (old high level consumer) - Which consumer groups exist, who are their membe

HPSA (HP Service activator) framework

Kafka components one by one

Image
Courtesy : Stéphane Maarek  and  Learning Journal Installation mkdir ~/Downloads curl " https://www.apache.org/dist/kafka/2.1.1/kafka_2.11-2.1.1.tgz " -o ~/Downloads/kafka.tgz mkdir ~/kafka && cd ~/kafka tar -xvzf  ~/Downloads/kafka.tgz --strip 1 Partition: A Topic can be broken in to multiple partition, so that each partition can reside in a separate Broker instance, thus achieving parallelism for that specific topic (KPI). New message written to a topic: Kafka:  Decides which partition will the message go, kind of load balancing, Que:  How does it knows, which partition is free ? or it's offset is not full(12) ? Even of you parallel produce message in 2 partition, the bottleneck can still be in consumer side? 4. Topic Partition replication and leader 5. Load distribution and acknowledgment Producer just send data to " BrokerHost:TopicName" Kafka:  Will decide, which broker instance and which partition of that T