
Structure of Elasticsearch, Node, Cluster, Document, Shard, Sharding, Primary shard, Replica shard

Naranjito 2021. 4. 14. 13:23

Customer  →  Server(1)

 ⇢ A customer searches for a product on my app. The request is sent to the server.

My shopping mall website(app) 

↓↑  It is connected to a database that contains all my product data, along with other data we're collecting from the app.

Database  →  Server(2)

 ⇡ The server looks up the product in the database.

  The product info is sent back to the server.

Server(3)

 ⇢ The product info can be rendered in the user's browser.

Customer

 

Customer 

↓             

My shopping mall website(app) 

  The request(search query) is sent to the server when a customer searches for a product on my website

Server

 ⇢ It sends the search query to Elasticsearch.

Elasticsearch

  It quickly finds the relevant results and returns them to the server.

Server

 ⇢ The results are sent back to the browser.

My shopping mall website(app) 

 

Node : An instance of Elasticsearch. Once Elasticsearch is up and running, you have a node. Each node has a unique ID and a name, and it belongs to a single cluster. Nodes are the members of a cluster and share a common goal. Each node is assigned one or more roles, and one of the roles it can be assigned is to hold data.

 

Cluster : It is formed automatically when you start up a node, and a cluster can have one or many nodes. These nodes can be distributed across separate machines. Even then, all of the nodes and machines are still part of the same cluster and work together to accomplish a task.

 

Document : Data is stored as documents. A document is a JSON object that contains whatever data you want to store in Elasticsearch. Documents that share similar traits are grouped into an index; for example, documents about produce would be grouped under a produce index. Indices are used to group documents that are related to each other, so we know where to find certain information.

 

Index : It is a virtual (logical) thing that keeps track of where documents are stored. You cannot find the index itself on disk; what exists on disk is its shards.

 

Shard : It is where data (documents) is stored and where search queries are run. Shards are what actually exist on disk. When you create an index, one shard comes with it by default (and it is assigned to a node). You can also create an index with multiple shards that are distributed across nodes (sharding). The number of documents a shard can hold depends on the capacity of the node.

 

Sharding : Creating an index with multiple shards that are distributed across nodes. If our produce data is only going to grow, you can add more shards and nodes to scale horizontally and adapt to the increasing data. A search can also run on all shards at the same time, in parallel.
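To get a feel for how documents end up spread across shards, here is a minimal sketch that routes a document id to a shard with "hash of the id, modulo the number of shards". It is only a simplified model: Elasticsearch uses its own hash function and routing details, so zlib.crc32 and the names pick_shard and distribute are stand-ins I made up for illustration.

```python
import zlib
from collections import Counter


def pick_shard(doc_id: str, number_of_shards: int) -> int:
    """Map a document id to a shard number (toy model, not the real hash)."""
    # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


def distribute(doc_ids, number_of_shards):
    """Count how many documents land on each shard."""
    return Counter(pick_shard(doc_id, number_of_shards) for doc_id in doc_ids)


if __name__ == "__main__":
    # Hypothetical ids, just to see the spread over 3 shards.
    ids = [f"product-{n}" for n in range(1000)]
    for shard, count in sorted(distribute(ids, 3).items()):
        print(f"shard {shard}: {count} documents")
```

Because the same id always maps to the same shard, a lookup by id knows which shard to ask, while a search can fan out to every shard in parallel.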

 

Primary shard : The original shards, for example p0 and p1.

 

Replica shard : An identical copy of a primary shard. The copies of p0 and p1 are stored as r0 and r1. If p0 were to go down, everything is still okay because we have its replica r0.
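The point of a replica is that it lives on a different node than its primary. The sketch below is a toy round-robin placement under that one rule, not Elasticsearch's real allocation logic; the function names and the node names in the usage are hypothetical.

```python
def allocate(primaries, nodes):
    """Place each primary and its replica on different nodes (toy round-robin).

    Returns (assignments, unassigned), where assignments maps a shard copy
    such as "p0" or "r0" to a node name.
    """
    assignments, unassigned = {}, []
    for i in range(primaries):
        assignments[f"p{i}"] = nodes[i % len(nodes)]
        if len(nodes) > 1:
            # The replica goes to the next node, never to the primary's node.
            assignments[f"r{i}"] = nodes[(i + 1) % len(nodes)]
        else:
            # With a single node there is nowhere safe to put the copy.
            unassigned.append(f"r{i}")
    return assignments, unassigned


def surviving_copies(assignments, failed_node):
    """Return the shard copies that are still available after one node fails."""
    return {shard: node for shard, node in assignments.items() if node != failed_node}


if __name__ == "__main__":
    placed, _ = allocate(2, ["node-1", "node-2"])
    print(placed)
    print(surviving_copies(placed, "node-1"))
```

With two nodes, losing one node still leaves a copy of shard 0 and shard 1. With only one node, both replicas stay unassigned.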

 

GET _cluster/health # from cluster api/health information

>>>
{
  "cluster_name" : "joohyun",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 2,
  "active_shards" : 2,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 2,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 50.0
}

GET _API/parameter : retrieves the information we're looking for from the given API (here, the health information from the cluster API).
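The "yellow" status and the 50.0 in the health output can be worked out from the shard counts. Here is a simplified sketch of that reasoning; it ignores initializing and relocating shards (both 0 in the output), and the function names are my own, not part of any Elasticsearch client.

```python
def cluster_status(active_primaries, total_primaries, active_shards, total_shards):
    """Derive a health colour from shard counts (simplified model)."""
    if active_primaries < total_primaries:
        return "red"      # at least one primary shard is not allocated
    if active_shards < total_shards:
        return "yellow"   # all primaries are up, but some replicas are not
    return "green"        # every primary and replica is allocated


def active_percent(active_shards, unassigned_shards):
    """Share of shard copies that are active, as a percentage."""
    total = active_shards + unassigned_shards
    return 100.0 * active_shards / total if total else 100.0


if __name__ == "__main__":
    # Numbers taken from the _cluster/health output:
    # 2 active primaries, 2 active shards, 2 unassigned shards.
    print(cluster_status(2, 2, 2, 2 + 2))
    print(active_percent(2, 2))
```

With a single node the two replicas have no second node to live on, so they stay unassigned and the cluster reports yellow instead of green.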

 

GET _nodes/stats

>>>
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "joohyun",
  "nodes" : {
    "embCwbMlQZeL_G-FyR1M6Q" : {
      "timestamp" : 1618372104734,
      "name" : "joohyun_node",
      "transport_address" : "172.19.0.2:9300",
      "host" : "172.19.0.2",
      "ip" : "172.19.0.2:9300",
      
...

GET _nodes/stats : gets information about the nodes. It is helpful when you are debugging a node.

 

PUT favorite_candy

PUT name_of_the_index : creates an index.

 

PUT favorite_candy/_doc/123
{
  "first_name":"joooohyun",
  "candy":"Starburst"
}

>>>
{
  "_index" : "favorite_candy",
  "_type" : "_doc",
  "_id" : "123",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 11,
  "_primary_term" : 1
}

PUT name_of_the_index/_doc/id_you_assign : stores the document that follows in Elasticsearch under the id you assign. The index is created if it does not exist yet.

 

POST favorite_candy/_doc
{
  "first_name":"joohyun",
  "candy":"sour skittles"
}

>>>
{
  "_index" : "favorite_candy",
  "_type" : "_doc",
  "_id" : "JPSHzngBNlJgER1wbozU",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 10,
  "_primary_term" : 1
}

POST name_of_the_index/_doc : stores the document that follows in Elasticsearch and auto-generates an id for it. The index is created if it does not exist yet.

 

201-Created : The document has been created in the index, here with an auto-generated id.
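Outside of the Kibana console these are plain HTTP requests. The sketch below builds (but does not send) the PUT and POST requests with Python's standard library. BASE_URL is an assumption (a local node without security enabled), and build_index_request is a helper name I made up.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured node


def build_index_request(index: str, doc: dict, doc_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the HTTP request that indexes one document."""
    if doc_id is None:
        # No id given: POST lets Elasticsearch generate one.
        url, method = f"{BASE_URL}/{index}/_doc", "POST"
    else:
        # Id given: PUT stores the document under that id.
        url, method = f"{BASE_URL}/{index}/_doc/{doc_id}", "PUT"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


if __name__ == "__main__":
    req = build_index_request("favorite_candy", {"first_name": "joohyun", "candy": "sour skittles"})
    print(req.get_method(), req.full_url)
    # To actually send it against a running node: urllib.request.urlopen(req)
```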

 

GET favorite_candy/_doc/123

>>>
{
  "_index" : "favorite_candy",
  "_type" : "_doc",
  "_id" : "123",
  "_version" : 2,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "first_name" : "joooohyun",
    "candy" : "Starburst"
  }
}

GET name_of_the_index/_doc/id : sends a request to see the content of the document that has been stored under that id.

 

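200-OK : When you index a document with an id that already exists, you get 200-OK instead of 201-Created; the existing document has been updated (overwritten).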
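The difference between 201 and 200, and how "_version" counts up, can be mimicked with a tiny in-memory model. This is only a toy (the class ToyIndex is hypothetical and says nothing about how Elasticsearch stores data); it just reproduces the created/updated/deleted bookkeeping seen in the responses.

```python
class ToyIndex:
    """In-memory stand-in for one index; not how Elasticsearch stores data."""

    def __init__(self):
        self._docs = {}      # id -> current _source
        self._versions = {}  # id -> last _version, kept even after a delete

    def index(self, doc_id, source):
        """Mimic PUT index/_doc/id: returns (http_status, result, version)."""
        existed = doc_id in self._docs
        version = self._versions.get(doc_id, 0) + 1
        self._versions[doc_id] = version
        self._docs[doc_id] = dict(source)
        if existed:
            return 200, "updated", version
        return 201, "created", version

    def delete(self, doc_id):
        """Mimic DELETE index/_doc/id for a document that exists."""
        del self._docs[doc_id]  # raises KeyError if the id is unknown
        self._versions[doc_id] += 1
        return 200, "deleted", self._versions[doc_id]


if __name__ == "__main__":
    idx = ToyIndex()
    print(idx.index("123", {"candy": "Starburst"}))  # first write creates
    print(idx.index("123", {"candy": "Starburst"}))  # same id again overwrites
    print(idx.delete("123"))
```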

 

 

"_version" : 2 : How many times your document has been created, updated, deleted

 

POST favorite_candy/_update/123
{
  "doc": {
    "candy":"M&M"
  }
}

>>>
{
  "_index" : "favorite_candy",
  "_type" : "_doc",
  "_id" : "123",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 12,
  "_primary_term" : 1
}

POST name_of_the_index/_update/id : updates the document with that id. Only the fields given under "doc" are changed.
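A partial update merges the fields under "doc" into the stored document instead of replacing the whole thing. A minimal sketch of that merge, assuming plain dicts stand in for the stored _source (apply_partial_update is a made-up helper, not a client function):

```python
def apply_partial_update(source: dict, partial: dict) -> dict:
    """Return a new document with the fields of `partial` merged into `source`."""
    merged = dict(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field.
            merged[key] = apply_partial_update(merged[key], value)
        else:
            # Plain values (and lists) are added or overwritten.
            merged[key] = value
    return merged


if __name__ == "__main__":
    stored = {"first_name": "joooohyun", "candy": "Starburst"}
    print(apply_partial_update(stored, {"candy": "M&M"}))
```

Here first_name is left alone and only candy changes, which is what the update request above does to document 123.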

 

DELETE favorite_candy/_doc/123

>>>
{
  "_index" : "favorite_candy",
  "_type" : "_doc",
  "_id" : "123",
  "_version" : 4,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 13,
  "_primary_term" : 1
}

DELETE name_of_the_index/_doc/id : deletes the document with that id.

 

Reference : www.youtube.com/watch?v=gS_nHTWZEJ8&t=2954s