Questions tagged [hadoop]
Hadoop is an open-source solution that provides a distributed, replicated file system and a production-grade MapReduce system, along with a series of complementary additions like Hive, Pig, and HBase to get more out of a Hadoop-powered cluster.
265 questions
-1
votes
0
answers
35
views
Kerberos and Hadoop UI
I have a small number of servers among 100s which will not open the local hadoop (datanode) UI (port 1006). I use the NAMENODE UI to access datanodes and can see data on most but, for these few, I ...
0
votes
0
answers
44
views
How to get audit result from Apache Ranger
I have declared Elasticsearch in the ranger-admin-site.xml config and, from the UNIX command line on the system that has Ranger installed, I can query the Elasticsearch cluster with curl.
When I am in the ...
0
votes
0
answers
57
views
Deploy a Hadoop cluster in Kubernetes
Hello everyone, I have a question: can anyone tell me whether the architecture I want to build is possible, and how I can start?
Create a Hadoop cluster composed of 3 nodes (2 DataNodes + 1 NameNode) ...
2
votes
1
answer
206
views
HDFS + Using very large disks with HDFS
From my understanding, using 20-30TB disks with HDFS can present some challenges, but these can be managed effectively with proper configuration.
Using 20-30TB disks with HDFS is possible, but it requires ...
2
votes
0
answers
155
views
Hive Metastore + HiveServer2 + what is the preferred garbage collector – Serial vs. Parallel vs. CMS vs. G1
From time to time we see JVM pause detected messages in the HiveMetastore & HiveServer2 logs,
even though we increased the heap size to the right size (according to the number of connections to Hive ...
1
vote
0
answers
174
views
Hadoop + slow block-receive warnings from data-node machines
We have a Hadoop cluster with 487 data-node machines (each data-node machine also runs the node-manager service). All machines are physical machines (Dell), and the OS is RHEL 7.9.
Each ...
0
votes
1
answer
150
views
Can a VM replace a physical machine?
We have 254 physical servers, all of them Dell R740 servers.
The servers are part of a Hadoop cluster. Most of them hold the HDFS filesystem and run the data node & node manager services, part of ...
0
votes
1
answer
886
views
Clear RAM cache and buffers on a production Hadoop cluster with HDFS filesystem
We have a Hadoop cluster with 265 Linux RHEL machines.
Of the total 265 machines, 230 are data-node machines with the HDFS filesystem.
Total memory on each data node is 128G and we run many Spark ...
1
vote
0
answers
282
views
HDP cluster + journal nodes get out of sync
We have an HDP cluster, version 2.6.5.
When we look at the name-node logs we can see the following warning:
2023-02-20 15:56:37,731 INFO namenode.FileJournalManager (FileJournalManager.java:finalizeLogSegment(...
1
vote
0
answers
51
views
YARN + how to debug wget
We are using wget via port 8088 to test the connection from ResourceManager02 to ResourceManager01.
Both Resource Managers are part of the YARN service, and each Resource Manager service is installed on RHEL ...
0
votes
1
answer
2k
views
Hadoop datanodes using "{Hostname}/{IP address}:9000" to try to connect to NameNode
I have a cluster of Pis that I'm using to experiment with Hadoop. masternode is set to .190, p1 to 191 ... p4 to 194. All nodes are up and running. start-dfs.sh, stop-all.sh, etc from the master ...
-2
votes
1
answer
226
views
How does placing data in various racks help to exploit the fact that intra-rack aggregated bandwidth >= inter-rack bandwidth?
GFS research paper snapshot
It says that (my interpretation after reading the research paper and its reviews) "inter-rack bandwidth is lower than aggregated intra-rack bandwidth (not sure what it means ...
-1
votes
2
answers
559
views
RHEL + can we improve disk performance by tuning kernel parameters?
We have a Hadoop cluster and we are collecting metrics data in order to investigate slowness in Spark applications.
After a long investigation of our Hadoop cluster,
we noticed from ...
0
votes
1
answer
278
views
How to add multiple hostnames in private DNS zone in Azure to resolve hostnames for VNET?
I have an AKS (Azure Kubernetes cluster) that is on a VNET (Azure Virtual Network) that needs to connect to multiple on-prem Hadoop machines to read/write data. I have a private DNS zone connected to ...
1
vote
0
answers
21
views
How to debug policy enforcement on YARN queues?
I have an HDP 3.1 cluster and it seems that the fair policy isn't behaving as expected or YARN is misconfigured, since some users/applications/jobs are consuming more resources than we supposed them to ...