KAFKA

Install and run Kafka on AWS

Naranjito 2020. 12. 18. 16:26

To practice with Kafka, I created an AWS instance and was issued a pem key file.

 

 

  • chmod 400 kafka.pem 
Joses-MacBook-Pro:Downloads joohyunyoon$ chmod 400 kafka.pem 

chmod 400 : Change Mode. Gives the owner read permission and removes all other permissions. The permissions are written in octal: the first digit is for the user (owner), the second for the group, and the third for others. Within each digit, the high bit (4) is read access, the middle bit (2) is write access, and the low bit (1) is execute access.
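To see what 400 looks like in practice, here is a small sketch that applies it to a throwaway file instead of the real key. The variable names demo_key and perms are mine, just for illustration.

```shell
# Work on a throwaway file so the real key is not touched.
demo_key="$(mktemp)"

# 4 = read for the owner, 0 = nothing for the group, 0 = nothing for others.
chmod 400 "$demo_key"

# Keep only the permission column of the ls output.
perms="$(ls -l "$demo_key" | cut -c1-10)"
echo "$perms"   # prints -r--------

rm -f "$demo_key"
```

ssh refuses to use a private key that other users can read, which is why this step comes before connecting.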

 

 

  • ssh -i kafka.pem ec2-user@my AWS Public IP

 

Hey, let me connect to my AWS Public IP. 

 

Joses-MacBook-Pro:Downloads joohyunyoon$ ssh -i kafka.pem ec2-user@my AWS Public IP
The authenticity of host 'my AWS Public IP (my AWS Public IP)' can't be established.
ECDSA key fingerprint is SHA256:U991ew4uc8UMRNJn41fF93sAErb5QgeNDkVUOyyVdd0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'my AWS Public IP' (ECDSA) to the list of known hosts.

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
7 package(s) needed for security, out of 19 available
Run "sudo yum update" to apply all updates.

...

[ec2-user@ip-172-31-44-61 ~]$

ssh : Secure Shell, a network protocol that lets two computers communicate and share data over an encrypted connection.

 

-i : identity_file, the private key file used to authenticate

In my case, identity_file is kafka.pem, the user is ec2-user (the default user name of the AMI OS), and the host is my AWS Public IPv4 address.
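The pieces of the command can be laid out like this. It is only a sketch that builds and prints the command line without connecting; the variable names are mine, and 203.0.113.10 is a documentation placeholder, not my real address.

```shell
KEY_FILE="kafka.pem"        # identity_file passed to -i
EC2_USER="ec2-user"         # user name used in the AMI OS
EC2_HOST="203.0.113.10"     # placeholder for the AWS Public IP (assumption)

# Assemble the command line: ssh -i <identity_file> <user>@<host>
ssh_cmd="ssh -i $KEY_FILE $EC2_USER@$EC2_HOST"
echo "$ssh_cmd"             # prints ssh -i kafka.pem ec2-user@203.0.113.10
```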

 

 

  • sudo yum install -y java-1.8.0-openjdk-devel.x86_64

Kafka runs on the JVM, so it needs a JDK.

[ec2-user@ip-172-31-44-61 ~]$ sudo yum install -y java-1.8.0-openjdk-devel.x86_64
Failed to set locale, defaulting to C
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
amzn2-core                                                    | 3.7 kB  00:00:00 

...

  xorg-x11-fonts-Type1.noarch 0:7.5-9.amzn2                                          

Complete!

yum : Yellowdog Updater, Modified. An open source command-line package manager for Linux distributions that use RPM.

 

Java alone is not enough. To run Kafka, I also need the Kafka binary file, so let's download it.

[ec2-user@ip-172-31-44-61 ~]$ wget http://mirror.navercorp.com/apache/kafka/2.5.0/kafka_2.12-2.5.0.tgz
--2020-12-18 05:24:20--  http://mirror.navercorp.com/apache/kafka/2.5.0/kafka_2.12-2.5.0.tgz
Resolving mirror.navercorp.com (mirror.navercorp.com)... 125.209.216.167
Connecting to mirror.navercorp.com (mirror.navercorp.com)|125.209.216.167|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 61604633 (59M) [application/octet-stream]
Saving to: 'kafka_2.12-2.5.0.tgz'

100%[===========================================>] 61,604,633   102MB/s   in 0.6s   

2020-12-18 05:24:20 (102 MB/s) - 'kafka_2.12-2.5.0.tgz' saved [61604633/61604633]

 

 

  • tar xvf kafka_2.12-2.5.0.tgz

I am going to deconstruct you!

[ec2-user@ip-172-31-44-61 ~]$ tar xvf kafka_2.12-2.5.0.tgz 
kafka_2.12-2.5.0/
kafka_2.12-2.5.0/LICENSE
kafka_2.12-2.5.0/NOTICE
...

tar : Tape Archive, used to bundle a collection of files and directories into a single archive file. tar itself does not compress; a .tgz file is a tar archive compressed with gzip.

 

x : Extract

v : Verbose

f : File, the next argument is the archive file to read
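To try the flags without touching the Kafka download, here is a small round trip on a throwaway directory. Everything in it (the work directory, demo, file.txt) is made up for the example; c (create), z (gzip) and t (list) are extra flags that the step above does not use.

```shell
work="$(mktemp -d)"
mkdir -p "$work/demo"
echo "hello" > "$work/demo/file.txt"

# c = create, z = compress with gzip, f = write to this archive file
tar czf "$work/demo.tgz" -C "$work" demo
rm -r "$work/demo"

# t = list the contents without extracting
listing="$(tar tzf "$work/demo.tgz")"
echo "$listing"

# x = extract, v = verbose, f = read from this archive file
tar xvf "$work/demo.tgz" -C "$work"
content="$(cat "$work/demo/file.txt")"
echo "$content"   # prints hello

rm -r "$work"
```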

 

 

  • export KAFKA_HEAP_OPTS="-Xmx400m -Xms400m"

Adjust this guy's heap size. The start script gives Kafka a 1GB heap by default, but my t2.micro instance has only 1GB of memory in total, so the default does not fit. Therefore the heap size needs to be set through an environment variable.

Let's change 1GB to 400MB.

[ec2-user@ip-172-31-44-61 kafka_2.12-2.5.0]$ export KAFKA_HEAP_OPTS="-Xmx400m -Xms400m"

[ec2-user@ip-172-31-44-61 kafka_2.12-2.5.0]$ echo export KAFKA_HEAP_OPTS="-Xmx400m -Xms400m"
export KAFKA_HEAP_OPTS=-Xmx400m -Xms400m

echo : displays the text passed to it as arguments. Here it only prints the command text back; echo $KAFKA_HEAP_OPTS would show the value actually stored in the variable.
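A quick way to confirm the setting is to read the variable back, both in the current shell and from a child process, since the Kafka start script runs as a child. This is just a sketch; child_view is a name I made up.

```shell
export KAFKA_HEAP_OPTS="-Xmx400m -Xms400m"

# Value as seen by the current shell.
echo "$KAFKA_HEAP_OPTS"   # prints -Xmx400m -Xms400m

# Value as seen by a child process. This works only because of export.
child_view="$(sh -c 'echo "$KAFKA_HEAP_OPTS"')"
echo "$child_view"        # prints -Xmx400m -Xms400m
```

The variable disappears when the SSH session ends, so it has to be exported again after reconnecting, or added to a shell startup file such as ~/.bashrc.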

 

 

  • cd config

In order to accept connections from outside, this guy still requires some configuration.

[ec2-user@ip-172-31-44-61 kafka_2.12-2.5.0]$ cd config
[ec2-user@ip-172-31-44-61 config]$ ls
connect-console-sink.properties    consumer.properties
connect-console-source.properties  log4j.properties
connect-distributed.properties     producer.properties
connect-file-sink.properties       server.properties
connect-file-source.properties     tools-log4j.properties
connect-log4j.properties           trogdor.conf
connect-mirror-maker.properties    zookeeper.properties
connect-standalone.properties
[ec2-user@ip-172-31-44-61 config]$ vi server.properties 

config : Configuration, the directory that holds Kafka's .properties files. server.properties is the broker configuration.

 

vi : Visual editor

 

Through vi, open port 9092 in listeners and set the host name in advertised.listeners to my AWS Public IP so that clients can reach the broker.

#   EXAMPLE:
#     listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://:9092

# Hostname and port the broker will advertise to producers and consumers. If not set, 
# it uses the value for "listeners" if configured.  Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
advertised.listeners=PLAINTEXT://my AWS Public IP:9092

# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details

a : Append

ESC + ':' + wq : Write, Quit
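The same change can also be sketched without vi. The example below works on a throwaway two-line file, not the real server.properties. I am assuming the two settings start out commented, as in the stock file, and 203.0.113.10 is a placeholder for the AWS Public IP.

```shell
PUBLIC_IP="203.0.113.10"   # placeholder for the AWS Public IP (assumption)

# Throwaway file imitating the two relevant lines (assumed to be commented out).
props="$(mktemp)"
printf '%s\n' \
  '#listeners=PLAINTEXT://:9092' \
  '#advertised.listeners=PLAINTEXT://your.host.name:9092' > "$props"

# Uncomment listeners and point advertised.listeners at the public address.
sed -e 's|^#listeners=.*|listeners=PLAINTEXT://:9092|' \
    -e "s|^#advertised.listeners=.*|advertised.listeners=PLAINTEXT://$PUBLIC_IP:9092|" \
    "$props" > "$props.new"

# Show only the active (uncommented) settings.
result="$(grep -v '^#' "$props.new")"
echo "$result"

rm -f "$props" "$props.new"
```

listeners is the address the broker binds to, while advertised.listeners is the address handed to producers and consumers, which is why the public IP goes in the second one.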

 

 

  • bin/zookeeper-server-start.sh -daemon config/zookeeper.properties

Then, let's run ZooKeeper, which is Kafka's chum, and check whether it is there or not.

There are 850 QuorumPeerMain and 875 Jps. QuorumPeerMain is ZooKeeper, so yes, you are out there.

[ec2-user@ip-172-31-44-61 kafka_2.12-2.5.0]$ bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
[ec2-user@ip-172-31-44-61 kafka_2.12-2.5.0]$ jps
850 QuorumPeerMain
875 Jps

jps : Java Virtual Machine Process Status Tool, lists the Java processes running on the machine
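Instead of reading the jps list by eye, the check can be scripted. This is a sketch: has_jvm_process is a helper I made up, and it is fed the sample output from above rather than a live jps call.

```shell
# Succeeds if the jps-style text on stdin contains the given main class name.
has_jvm_process() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Sample output copied from the session above.
sample_output='850 QuorumPeerMain
875 Jps'

if printf '%s\n' "$sample_output" | has_jvm_process QuorumPeerMain; then
  status="running"
else
  status="not running"
fi
echo "ZooKeeper: $status"   # prints ZooKeeper: running
```

On the instance itself, the live version would be jps | has_jvm_process QuorumPeerMain.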

 

 

  • bin/kafka-server-start.sh -daemon config/server.properties

Then start Kafka. Kafka must always be started after ZooKeeper.

Yes, this guy is there, under the name Kafka.

[ec2-user@ip-172-31-44-61 kafka_2.12-2.5.0]$ bin/kafka-server-start.sh -daemon config/server.properties
[ec2-user@ip-172-31-44-61 kafka_2.12-2.5.0]$ jps
850 QuorumPeerMain
1235 Kafka
1302 Jps
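The start order can be captured in a tiny script. This is only a sketch: KAFKA_HOME, DRY_RUN, run and start_stack are names I made up, the install path is an assumption, and the 5 second pause is an arbitrary guess at how long ZooKeeper needs. In dry-run mode it just prints the commands in order.

```shell
KAFKA_HOME="$HOME/kafka_2.12-2.5.0"   # assumed install location
DRY_RUN=1                             # set to 0 to really execute the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

start_stack() {
  # ZooKeeper first, because the broker registers itself there on startup.
  run "$KAFKA_HOME/bin/zookeeper-server-start.sh" -daemon "$KAFKA_HOME/config/zookeeper.properties"
  # Arbitrary pause so ZooKeeper is up before the broker starts.
  [ "$DRY_RUN" = "1" ] || sleep 5
  run "$KAFKA_HOME/bin/kafka-server-start.sh" -daemon "$KAFKA_HOME/config/server.properties"
}

plan="$(start_stack)"
echo "$plan"
```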

 

 

  • tail -f logs/*

 

[ec2-user@ip-172-31-44-61 kafka_2.12-2.5.0]$ tail -f logs/*
==> logs/controller.log <==
[2020-12-18 06:02:22,799] DEBUG [PartitionStateMachine controllerId=0] Started partition state machine with initial state -> Map() (kafka.controller.ZkPartitionStateMachine)
...

head, tail : commands for viewing just the beginning or the end of a file. With -f, tail keeps running and prints new lines as they are appended, which is handy for following logs.
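A small sketch of both commands on a throwaway ten-line file (the file and variable names are made up). The -f option is left out here because it never returns on its own.

```shell
log="$(mktemp)"

# Write "line 1" through "line 10" into the throwaway file.
i=1
while [ "$i" -le 10 ]; do
  echo "line $i"
  i=$((i + 1))
done > "$log"

first_line="$(head -n 1 "$log")"   # beginning of the file
last_line="$(tail -n 1 "$log")"    # end of the file
echo "$first_line"                 # prints line 1
echo "$last_line"                  # prints line 10

rm -f "$log"
```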