Cloudera Certified Administrator for Apache Hadoop (CCAH) exam topics and syllabus

Updated: 6 April 2013

Cloudera Certified Administrator for Apache Hadoop (CCAH)

To earn the CCAH certification, candidates must pass an exam designed to test fluency with the concepts and skills required in the areas listed below.

If you are interested in the Developer exam instead, see this post:

http://jugnu-life.blogspot.in/2012/03/cloudera-certified-developer-for-apache.html

Details for the Admin exam, along with where to prepare for each section, are below.

Test Name: Cloudera Certified Administrator for Apache Hadoop CDH4 (CCA-410)
Number of Questions: 60
Time Limit: 90 minutes
Passing Score: 70%
Languages: English, Japanese
English Release Date: November 1, 2012
Japanese Release Date: December 1, 2012
Price: USD 295, AUD 285, EUR 225, GBP 185, JPY 25,500

1. HDFS (38%)

Objectives
  • Describe the function of all Hadoop Daemons
  • Describe the normal operation of an Apache Hadoop cluster, both in data storage and in data processing.
  • Identify current features of computing systems that motivate a system like Apache Hadoop.
  • Classify major goals of HDFS Design
  • Given a scenario, identify an appropriate use case for HDFS Federation
  • Identify the components and daemons of an HDFS HA-Quorum cluster
  • Analyze the role of HDFS security (Kerberos)
  • Describe file read and write paths
Section Study Resources
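
The HA-Quorum objective above maps to a handful of hdfs-site.xml properties. A minimal sketch, assuming a nameservice named mycluster, two NameNodes, and a three-node JournalNode quorum (all hostnames here are hypothetical placeholders):

```xml
<!-- hdfs-site.xml: skeleton of a Quorum-based HA configuration -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <!-- Shared edits directory provided by the JournalNode quorum -->
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster</value>
</property>
```

Being able to name these pieces (active/standby NameNodes, JournalNodes, and the failover controller) is the kind of thing the exam objective is after.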

2. MapReduce (10%)

Objectives
  • Understand how to deploy MapReduce v1 (MRv1)
  • Understand how to deploy MapReduce v2 (MRv2 / YARN)
  • Understand basic design strategy for MapReduce v2 (MRv2)
Section Study Resources
  • Apache YARN docs (note: we don't control apache.org links and as of 11 February 2013, they have been experiencing downtime. You may get a 404 error.)
  • CDH4 YARN deployment docs
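
As a quick orientation for the MRv1-versus-MRv2 deployment objectives: the property that switches a cluster's clients from the MRv1 JobTracker to YARN is mapreduce.framework.name. A sketch (the surrounding YARN daemon configuration is omitted):

```xml
<!-- mapred-site.xml: submit jobs to YARN (MRv2) rather than the MRv1 JobTracker -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```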

3. Hadoop Cluster Planning (12%)

Objectives
  • Identify the principal points to consider when choosing the hardware and operating systems to host an Apache Hadoop cluster
  • Analyze the choices in selecting an OS
  • Understand kernel tuning and disk swapping
  • Given a scenario and workload pattern, identify a hardware configuration appropriate to the scenario
  • Cluster sizing: given a scenario and frequency of execution, identify the specifics for the workload, including CPU, memory, storage, and disk I/O
  • Disk sizing and configuration, including JBOD versus RAID, SANs, virtualization, and disk sizing requirements in a cluster
  • Network topologies: understand network usage in Hadoop (for both HDFS and MapReduce) and propose or identify key network design components for a given scenario
Section Study Resources
  • Hadoop Operations: Chapter 4
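
On the kernel tuning and swapping objective: the usual advice is to keep Hadoop worker nodes from swapping JVM heap pages out to disk. A sketch of the relevant /etc/sysctl.conf entry (the exact value is a judgment call; guidance at the time was a low value such as 0):

```
# /etc/sysctl.conf: discourage the kernel from swapping out JVM heap pages
vm.swappiness = 0
```

After editing the file, the setting is typically applied with `sysctl -p` or takes effect on reboot.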

4. Hadoop Cluster Installation and Administration (17%)

Objectives
  • Given a scenario, identify how the cluster will handle disk and machine failures
  • Analyze a logging configuration and logging configuration file format
  • Understand the basics of Hadoop metrics and cluster health monitoring
  • Identify the function and purpose of available tools for cluster monitoring
  • Identify the function and purpose of available tools for managing the Apache Hadoop file system
Section Study Resources
  • Hadoop Operations: Chapter 5

5. Resource Management (6%)

Objectives
  • Understand the overall design goals of each of Hadoop's schedulers
  • Understand the role of HDFS quotas
  • Given a scenario, determine how the FIFO Scheduler allocates cluster resources
  • Given a scenario, determine how the Fair Scheduler allocates cluster resources
  • Given a scenario, determine how the Capacity Scheduler allocates cluster resources
Section Study Resources
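
For the Fair Scheduler objective, it helps to have seen an allocation file. A sketch of an MRv1 fair-scheduler.xml with two pools (the pool names and numbers are hypothetical):

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml: MRv1 Fair Scheduler allocation file -->
<allocations>
  <pool name="production">
    <!-- Guaranteed minimum task slots for this pool -->
    <minMaps>20</minMaps>
    <minReduces>10</minReduces>
    <!-- Twice the fair share of an equally-demanding weight-1.0 pool -->
    <weight>2.0</weight>
  </pool>
  <pool name="adhoc">
    <maxRunningJobs>5</maxRunningJobs>
  </pool>
</allocations>
```

Given a scenario, you should be able to reason about which pool's jobs get slots first when the cluster is fully loaded.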

6. Monitoring and Logging (12%)

Objectives
  • Understand the functions and features of Hadoop's metric collection abilities
  • Analyze the NameNode and JobTracker Web UIs
  • Interpret a log4j configuration
  • Understand how to monitor the Hadoop daemons
  • Identify and monitor CPU usage on master nodes
  • Describe how to monitor swap and memory allocation on all nodes
  • Identify how to view and manage Hadoop's log files
  • Interpret a log file
Section Study Resources
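
For the "interpret a log4j configuration" objective, the stanza worth studying is the rolling-file appender in Hadoop's conf/log4j.properties. A sketch of the typical shape:

```properties
# Route the root logger to a size-based rolling file appender (RFA)
hadoop.root.logger=INFO,RFA
log4j.rootLogger=${hadoop.root.logger}

log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Roll when the file reaches 256 MB; keep at most 10 rolled files
log4j.appender.RFA.MaxFileSize=256MB
log4j.appender.RFA.MaxBackupIndex=10
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```

Being able to read off the log level, destination, rotation policy, and message format from a file like this is exactly what the objective asks for.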

7. The Hadoop Ecosystem (5%)

Objectives
  • Understand ecosystem projects and what you need to do to deploy them on a cluster
Section Study Resources
Please share your views and comments below.

Thank you.