Discussion:
How to estimate hardware needed for a Hadoop Cluster
Amine Tengilimoglu
2018-10-21 08:25:51 UTC
Permalink
Hi all;

I want to learn how I can estimate the hardware needed for a Hadoop
cluster. Is there any standard or other guideline?

For example, I have 10 TB of data and I will analyze it. My replication
factor will be 2.

How much RAM do I need for one node? How can I estimate it?
How much disk do I need for one node? How can I estimate it?
How many CPU cores do I need for one node?


Thanks in advance.
Antonio Rendina
2018-10-24 10:06:22 UTC
Permalink
Hi,
there are some docs on the HDP documentation website; I think Cloudera and
other vendors have something similar. The HDP document is
called "Cluster Planning".
There are some rules of thumb, but if you want to be precise, you need
to know which services you will run, how many resources they will
need, and the expected performance.
To do a good estimation you need a person who has a good understanding
of the services that will run in the cluster, and the right input
about what you are going to do with it.
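The rules of thumb mentioned above can be turned into a quick back-of-the-envelope calculation. This is only a sketch; every default below (scratch-space overhead for shuffle data, disk per node, container counts and sizes, OS headroom) is an illustrative assumption, not a standard, and should be replaced with numbers from your own workload and vendor docs:

```python
import math

def estimate_cluster(data_tb, replication=2, temp_overhead=0.25,
                     disk_per_node_tb=12.0, containers_per_node=16,
                     ram_per_container_gb=4, cores_per_container=1):
    """Rough Hadoop cluster sizing from common rules of thumb.

    All parameter defaults are illustrative assumptions:
    - temp_overhead: extra space for shuffle/intermediate/temp data
    - containers_per_node etc.: a simple YARN container model
    """
    # Raw HDFS storage: data x replication factor, plus scratch space.
    raw_storage_tb = data_tb * replication * (1 + temp_overhead)
    # Minimum number of data nodes to hold that storage.
    nodes = math.ceil(raw_storage_tb / disk_per_node_tb)
    per_node = {
        "disk_tb": disk_per_node_tb,
        # Reserve some RAM and cores for the OS and Hadoop daemons.
        "ram_gb": containers_per_node * ram_per_container_gb + 8,
        "cores": containers_per_node * cores_per_container + 2,
    }
    return nodes, per_node

nodes, per_node = estimate_cluster(10, replication=2)
print(f"data nodes: {nodes}, per-node spec: {per_node}")
```

For the 10 TB, replication-factor-2 example in the question, this gives 25 TB of raw HDFS storage and, assuming 12 TB of usable disk per node, a minimum of 3 data nodes. The per-node RAM and core numbers only make sense once you know which services (HDFS, YARN, Hive, HBase, ...) will actually run there.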
Amine Tengilimoglu
2018-10-24 11:33:23 UTC
Permalink
Thank you Antonio, I read Cloudera's docs and got a bit of an idea about
it. You have a point; I have to look at these services. This will
take some time...

And thank you again, Or Raz and lqjacklee.