Cloudera CCA-470 Dumps
Exam: Cloudera Certified Administrator for Apache Hadoop CDH4 Upgrade (CCAH)
Cloudera CCA-470 Exam Tutorial
Question No : 1
For each job, the Hadoop framework generates task log files. Where are Hadoop's task log files stored?
A. Cached on the local disk of the slave node running the task, then purged immediately upon task completion.
B. Cached on the local disk of the slave node running the task, then copied into HDFS.
C. In HDFS, in the directory of the user who generates the job.
D. On the local disk of the slave node running the task.
Question No : 2
How must you format the underlying filesystem of your Hadoop cluster's slave nodes
running on Linux?
A. They may be formatted in any Linux filesystem
B. They must be formatted as HDFS
C. They must be formatted as either ext3 or ext4
D. They must not be formatted; HDFS will format the filesystem automatically
Question No : 3
Hadoop's provided web interfaces can be used for all of the following EXCEPT: (choose 1)
A. Keeping track of the number of files and directories stored in HDFS.
B. Keeping track of jobs running on the cluster.
C. Browsing files in HDFS.
D. Keeping track of tasks running on each individual slave node.
E. Keeping track of processor and memory utilization on each individual slave node.
Question No : 4
Your existing Hadoop cluster has 30 slave nodes, each of which has 4 x 2TB hard drives.
You plan to add another 10 nodes. How much disk space can your new nodes contain?
A. The new nodes must all contain 8TB of disk space, but it does not matter how the disks are configured
B. The new nodes cannot contain more than 8TB of disk space
C. The new nodes can contain any amount of disk space
D. The new nodes must all contain 4 x 2TB hard drives
Question No : 5
Your cluster implements HDFS High Availability (HA). Your two NameNodes are named
nn01 and nn02. What occurs when you execute the command:
hdfs haadmin -failover nn01 nn02
A. nn02 becomes the standby NameNode and nn01 becomes the active NameNode
B. nn01 is fenced, and nn01 becomes the active NameNode
C. nn01 is fenced, and nn02 becomes the active NameNode
D. nn01 becomes the standby NameNode and nn02 becomes the active NameNode
Question No : 6
You have a cluster running with the Fair Scheduler enabled. There are currently no jobs
running on the cluster, and you submit a job A, so that only job A is running on the cluster.
A while later, you submit job B. Now job A and job B are running on the cluster at the same time.
Which of the following describes how the Fair Scheduler operates? (Choose 2)
A. When job B gets submitted, it will get assigned tasks, while job A continues to run with fewer tasks.
B. When job A gets submitted, it doesn't consume all the task slots.
C. When job A gets submitted, it consumes all the task slots.
D. When job B gets submitted, job A has to finish first, before job B can get scheduled.
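As a mental model for the Fair Scheduler questions: the scheduler aims to give every running job a roughly equal share of the cluster's task slots, so a lone job may occupy all slots, and a newly submitted job picks up slots as they free, while the earlier job continues with fewer tasks. A minimal sketch of that even split (illustrative only; the real scheduler also weighs pools, minimum shares, and preemption):

```python
# Simplified model of Fair Scheduler slot allocation (illustrative only;
# the real MRv1 Fair Scheduler also considers pools, weights, and min shares).
def fair_share(total_slots, jobs):
    """Split task slots evenly among running jobs; any remainder goes to earlier jobs."""
    share, remainder = divmod(total_slots, len(jobs))
    return {job: share + (1 if i < remainder else 0) for i, job in enumerate(jobs)}

# Only job A running: it may use every slot.
print(fair_share(12, ["A"]))        # {'A': 12}
# Job B submitted: slots are rebalanced, and A continues with fewer tasks.
print(fair_share(12, ["A", "B"]))   # {'A': 6, 'B': 6}
```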
Question No : 7
You have a cluster running with the Fair Scheduler enabled. There are currently no jobs
running on the cluster, and you submit a job A, so that only job A is running on the cluster. A
while later, you submit job B. Now job A and job B are running on the cluster at the same
time. How will the Fair Scheduler handle these two jobs?
A. When job A gets submitted, it consumes all the task slots
B. When job A gets submitted, it doesn't consume all the task slots
C. When job B gets submitted, job A has to finish first, before job B can get scheduled.
D. When job B gets submitted, it will get assigned tasks, while job A continues to run with fewer tasks.
Question No : 8
You have a cluster running with the FIFO scheduler enabled. You submit a large job A to
the cluster which you expect to run for one hour. Then, you submit job B to the cluster,
which you expect to run a couple of minutes only. Let's assume both jobs are running at
the same priority.
How does the FIFO scheduler execute the jobs? (Choose 3)
A. The order of execution of tasks within a job may vary.
B. When a job is submitted, all tasks belonging to that job are scheduled.
C. Given jobs A and B submitted in that order, all tasks from job A will be scheduled before all tasks from job B.
D. Since job B needs only a few tasks, it might finish before job A completes.
Question No : 9
Assuming a large, properly configured multi-rack Hadoop cluster, which scenario should not
result in loss of HDFS data, assuming the default replication factor settings?
A. Ten percent of DataNodes simultaneously fail.
B. All DataNodes simultaneously fail.
C. An entire rack fails.
D. Multiple racks simultaneously fail.
E. Seventy percent of DataNodes simultaneously fail.
Question No : 10
Your developers request that you enable them to use Hive on your Hadoop cluster. What
do you install and/or configure?
A. Install the Hive interpreter on the client machines only, and configure a shared remote Hive Metastore.
B. Install the Hive Interpreter on the client machines and all the slave nodes, and configure a shared remote Hive Metastore.
C. Install the Hive interpreter on the master node running the JobTracker, and configure a shared remote Hive Metastore.
D. Install the Hive interpreter on the client machines and all nodes on the cluster
Question No : 11
On a cluster running MapReduce v1 (MRv1), the value of the
mapred.tasktracker.map.tasks.maximum configuration parameter in the mapred-site.xml
file should be set to:
A. Half the number of the maximum number of Reduce tasks which can run simultaneously on an individual node.
B. The maximum number of Map tasks which can run simultaneously on an individual node.
C. The same value on each slave node.
D. The maximum number of Map tasks which can run on the cluster as a whole.
E. Half the number of the maximum number of Reduce tasks which can run on the cluster as a whole.
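For reference, this parameter is set per TaskTracker in mapred-site.xml; a sketch of the relevant property (the value 4 here is only an illustrative figure, typically sized to the node's cores and memory):

```xml
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
  <description>Maximum number of Map tasks run simultaneously by this TaskTracker.</description>
</property>
```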
Question No : 12
You are running a Hadoop cluster with NameNode on host mynamenode, a secondary
NameNode on host mysecondary and DataNodes.
Which best describes how you determine when the last checkpoint happened?
A. Execute hdfs dfsadmin -report on the command line and look at the Last Checkpoint information.
B. Execute hdfs dfsadmin -saveNamespace on the command line, which returns the last checkpoint value in the fstime file.
C. Connect to the web UI of the Secondary NameNode (http://mysecondarynamenode:50090) and look at the Last Checkpoint information
D. Connect to the web UI of the NameNode (http://mynamenode:50070/) and look at the Last Checkpoint information
Question No : 13
How does the NameNode know DataNodes are available on a cluster running MapReduce v1 (MRv1)?
A. The NameNode uses the DataNodes listed in the dfs.hosts file as the definitive list of available DataNodes.
B. DataNodes heartbeat in to the master on a regular basis.
C. The NameNode broadcasts a heartbeat on the network on a regular basis, and DataNodes respond.
D. The NameNode sends a broadcast across the network when it first starts, and DataNodes respond.
Question No : 14
Your cluster is running MapReduce v1 (MRv1), with default replication set to 3, and a cluster
block size of 64MB. Identify which best describes the file read process when a client application
connects into the cluster and requests a 50MB file.
A. The client queries the NameNode for the locations of the block, and reads all three copies. The first copy to complete transfer to the client is the one the client reads as part of Hadoop's execution framework.
B. The client queries the NameNode for the locations of the block, and reads from the first location in the list it receives.
C. The client queries the NameNode for the locations of the block, and reads from a random location in the list it receives to eliminate network I/O loads by balancing which nodes it retrieves data from at any given time.
D. The client queries the NameNode and then retrieves the block from the nearest DataNode to the client and then passes that block back to the client.
Question No : 15
You configure your cluster with HDFS High Availability (HA) using Quorum-based storage.
You do not implement HDFS Federation.
What is the maximum number of NameNode daemons you should run on your cluster in
order to avoid a split-brain scenario with your NameNodes?
A. Unlimited. HDFS High Availability (HA) is designed to overcome limitations on the number of NameNodes you can deploy.
B. Two active NameNodes and one Standby NameNode
C. One active NameNode and one Standby NameNode
D. Two active NameNodes and two Standby NameNodes
Question No : 16
You install Cloudera Manager on a cluster where each host has 1 GB of RAM. All of the
services show their status as concerning. However, all jobs submitted complete without an
error. Why is Cloudera Manager showing the concerning status for the services?
A. A slave node's disk ran out of space
B. The slave nodes haven't sent a heartbeat in 60 minutes
C. The slave nodes are swapping.
D. A DataNode service instance has crashed.
Question No : 17
Identify four characteristics of a 300MB file that has been written to HDFS with block size of
128MB and all other Hadoop defaults unchanged?
A. The file will consume 1152MB of space in the cluster
B. The third block will be 64MB
C. The third initial block will be 44MB
D. Two of the initial blocks will be 128MB
E. Each block will be replicated three times
F. The file will be split into three blocks when initially written into the cluster
G. Each block will be replicated nine times
H. All three blocks will be 128MB
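The block arithmetic behind this question can be checked directly: a file is split into full-size blocks plus one final block holding whatever data remains, and the default replication factor of 3 means the cluster stores three copies of each block. A quick sketch:

```python
# Block layout for a 300MB file with a 128MB block size and the default
# replication factor of 3. Sizes are in MB for readability.
FILE_MB, BLOCK_MB, REPLICATION = 300, 128, 3

blocks = []
remaining = FILE_MB
while remaining > 0:
    # The final block is only as large as the leftover data, not a full 128MB.
    blocks.append(min(BLOCK_MB, remaining))
    remaining -= BLOCK_MB

total_stored = sum(blocks) * REPLICATION

print(blocks)        # [128, 128, 44]
print(total_stored)  # 900 -- MB actually consumed across the cluster
```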
Question No : 18
What determines the number of Reducers that run for a given MapReduce job on a cluster
running MapReduce v1 (MRv1)?
A. It is set by the Hadoop framework and is based on the number of InputSplits of the job.
B. It is set by the developer.
C. It is set by the JobTracker based on the amount of intermediate data.
D. It is set and fixed by the cluster administrator in mapred-site.xml. The number set always runs for any submitted job.
Question No : 19
Your Hadoop cluster contains nodes in three racks. Choose which scenario results if you
leave the dfs.hosts property in the NameNode's configuration file empty (blank)?
A. The NameNode will update the dfs.hosts property to include machines running the DataNode daemon on the next NameNode reboot or with a dfsadmin -refreshNodes.
B. Any machine running the DataNode daemon can immediately join the cluster.
C. Presented with a blank dfs.hosts property, the NameNode will permit DataNodes specified in mapred.hosts to join the cluster.
D. No new nodes can be added to the cluster until you specify them in the dfs.hosts file.
Question No : 20
Your cluster block size is set to 128MB. A client application (client application A) is writing
a 500MB file to HDFS. After client application A has written 300MB of data, another client
(client application B) attempts to read the file. What is the effect of a second client
requesting a file during a write?
A. Application B can read 256MB of the file
B. Client application B returns an error
C. Client application B can read the 300MB that has been written so far.
D. Client application B must wait until the entire file has been written, and will then read its entire contents.