Talend Big Data Real Time v6 Certified Developer Exam

Talend certification exams measure candidates’ knowledge of product usage and the underlying methods required to implement quality projects successfully. Preparation is critical to passing.

Certification Exam Details

Exam content is updated over time. The number and difficulty of questions may change. The passing score is adjusted to maintain a consistent standard – for example, a new exam version with more difficult questions may have a lower passing score.

These details are provided as a guideline only:

  • Approximate number of questions: 60
  • Estimated exam duration: 1 hour

Types of questions:

  • Multiple choice
  • Multiple response
  • True/false formats
  • Matching concepts with definitions
  • Sortable Drag & Drop questions

Recommended Experience

General knowledge of Hadoop: HDFS; Map Reduce v1 and v2; Hive; Pig; HBase; Hue; Zookeeper; and Sqoop.

General knowledge of Spark and Kafka.

Experience with Talend Big Data Real Time 6.x solutions and Talend Studio, including metadata creation, configuration, and troubleshooting.

Preparation

To prepare for the certification exam, we recommend the following:

  • Attend the Big Data Basics and Advanced training courses
  • Study the training material in detail
  • Acquire experience using the product for at least 6 months
  • Read the product documentation

Certification Exam Topics

Big Data – General concepts

  • The different YARN daemons
  • How HDFS works
  • The Hadoop ecosystem: Pig, Hive, Hue, HBase, and Sqoop
  • The process to create cluster metadata in Talend Studio
  • Different ways to test connectivity to a Hadoop cluster
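As background for the MapReduce topics above, the programming model behind Big Data Batch Jobs can be sketched in a few lines of plain Python. This is an illustration of the map, shuffle, and reduce phases only, not Talend or Hadoop code; the input data is made up:

```python
from collections import defaultdict

# Toy sketch of the MapReduce model: map emits (key, value) pairs,
# the shuffle groups them by key, and reduce aggregates each group.

def map_phase(lines):
    # Mapper: emit (word, 1) for every word in every input line.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle_phase(pairs):
    # Shuffle/sort: group all values by key, as Hadoop does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big cluster", "data node"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
# counts -> {"big": 2, "data": 2, "cluster": 1, "node": 1}
```

On a real cluster, mappers and reducers run as YARN (or Job Tracker/Task Tracker) tasks across many nodes; the three phases themselves are the same.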

HDFS

  • What is HDFS?
  • Talend components dedicated to HDFS: Names, how they work, how to configure them
  • Mandatory configuration to connect to HDFS
  • Troubleshoot common issues
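For intuition on how HDFS stores data, the following toy sketch splits a file into fixed-size blocks and replicates each block across several DataNodes. It is not the HDFS API: the tiny block size, node names, and round-robin placement are illustrative (real HDFS defaults to 128 MB blocks, a replication factor of 3, and rack-aware placement):

```python
BLOCK_SIZE = 8          # bytes; deliberately tiny for illustration
REPLICATION = 3         # each block stored on this many DataNodes
DATANODES = ["dn1", "dn2", "dn3", "dn4"]   # hypothetical node names

def split_into_blocks(data, block_size=BLOCK_SIZE):
    # HDFS splits a file into fixed-size blocks; the last one may be short.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, nodes=DATANODES, replication=REPLICATION):
    # Toy round-robin placement: each block is replicated on `replication`
    # distinct nodes. The NameNode keeps this block-to-node mapping.
    placement = {}
    for idx, _ in enumerate(blocks):
        placement[idx] = [nodes[(idx + r) % len(nodes)] for r in range(replication)]
    return placement

blocks = split_into_blocks(b"0123456789abcdefXYZ")   # -> 3 blocks
placement = place_blocks(blocks)
```

The key points the exam targets are visible here: files become blocks, blocks are replicated, and only the metadata (the placement map) lives on the NameNode while the DataNodes hold the bytes.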

Hive

  • What is Hive?
  • Talend components dedicated to Hive: Names, how they work, how to configure them
  • How to create, profile and preview Hive tables
  • Troubleshoot common issues
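Hive exposes SQL-like (HiveQL) queries over data stored in HDFS. The query style can be tried locally with Python's stdlib sqlite3; this is an illustration of the SQL shape only, not Hive (Hive compiles such queries into cluster jobs), and the table and column names are made up:

```python
import sqlite3

# Illustration of the SQL-like style of HiveQL using stdlib sqlite3.
# A tHiveInput/tHiveRow component would issue a similar query text
# against a real Hive table; here it runs against an in-memory database.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, state TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "CA"), (2, "NY"), (3, "CA")])

# A grouped aggregate, the kind of analysis the Profiling perspective
# can also run on Hive table content.
rows = conn.execute(
    "SELECT state, COUNT(*) FROM customers GROUP BY state ORDER BY state"
).fetchall()
# rows -> [("CA", 2), ("NY", 1)]
```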

Pig

  • What is Pig?
  • Talend components dedicated to Pig: Names, how they work, how to configure them
  • Troubleshoot common issues
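Pig Latin describes a dataflow: LOAD a relation, FILTER it, GROUP it, then GENERATE aggregates. The same pipeline shape can be mimicked in plain Python; this is a toy sketch of the dataflow style only, not Pig, and the field names are illustrative:

```python
from itertools import groupby

# Toy sketch of a Pig Latin dataflow:
#   raw     = LOAD 'clicks' AS (user, clicks);
#   nonzero = FILTER raw BY clicks > 0;
#   grouped = GROUP nonzero BY user;
#   totals  = FOREACH grouped GENERATE group, SUM(nonzero.clicks);

records = [("alice", 3), ("bob", 0), ("alice", 2), ("carol", 5)]   # LOAD

filtered = [r for r in records if r[1] > 0]        # FILTER clicks > 0
filtered.sort(key=lambda r: r[0])                  # groupby needs sorted input
grouped = groupby(filtered, key=lambda r: r[0])    # GROUP BY user
totals = {user: sum(c for _, c in rows) for user, rows in grouped}
# totals -> {"alice": 5, "carol": 5}
```

In Talend, a tPigLoad/tPigFilterRow/tPigAggregate chain builds an equivalent dataflow that Pig then runs on the cluster.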

HBase

  • What is HBase?
  • Talend components dedicated to HBase: Names, how they work, how to configure them
  • Troubleshoot common issues
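HBase is a column-oriented key/value store: each cell is addressed by a row key and a "family:qualifier" column, and cells keep timestamped versions. The layout can be modeled with a dictionary; this is a toy model, not the HBase client API, and the row and column names are made up:

```python
from collections import defaultdict

# Toy model of HBase's data layout: a sparse map from
# (row key, "family:qualifier") to a list of (timestamp, value)
# versions kept newest-first.

table = defaultdict(list)

def put(row, column, value, ts):
    table[(row, column)].append((ts, value))
    table[(row, column)].sort(reverse=True)     # newest version first

def get(row, column):
    # Reads return the latest version by default, as HBase does.
    versions = table.get((row, column))
    return versions[0][1] if versions else None

put("user1", "info:city", "Paris", ts=1)
put("user1", "info:city", "Lyon", ts=2)         # newer version shadows older
latest = get("user1", "info:city")              # -> "Lyon"
```

A real deployment adds what the toy omits: rows are sorted and sharded into regions, and ZooKeeper coordinates clients and region servers, which is why a ZooKeeper quorum is part of the Talend HBase connection settings.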

Sqoop

  • What is Sqoop?
  • Talend components dedicated to Sqoop: Names, how they work, how to configure them
  • Troubleshoot common issues
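Sqoop imports relational tables into HDFS, by default as delimited text files. The shape of that output can be sketched with stdlib modules; this is not Sqoop itself, and the table and its contents are invented:

```python
import csv
import io
import sqlite3

# Sketch of what a Sqoop import produces: rows from a relational table
# rendered as delimited text, the default format Sqoop writes to HDFS.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

buf = io.StringIO()                 # stands in for an HDFS part file
writer = csv.writer(buf)
for row in conn.execute("SELECT id, amount FROM orders ORDER BY id"):
    writer.writerow(row)

part_file = buf.getvalue()          # two delimited lines: "1,9.5", "2,20.0"
```

Real Sqoop parallelizes the extract as map-only Hadoop tasks, producing one part file per mapper; a tSqoopImport component configures and launches that job rather than moving the rows itself.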

Spark

  • What is Spark?
  • Configuration to use Spark and Spark streaming frameworks: execution modes, mandatory parameters, resource limitations, tuning memory
  • Troubleshoot common issues
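One Spark idea the exam assumes is lazy evaluation: transformations only build an execution plan, and nothing runs until an action is called. The following toy class illustrates that behavior in plain Python; it is not the Spark API:

```python
# Toy illustration of Spark's lazy evaluation: map/filter record a plan,
# and the plan executes only when the collect() action is invoked.

class ToyRDD:
    def __init__(self, data):
        self._data = data
        self._plan = []                     # deferred transformations

    def map(self, fn):                      # transformation: nothing runs yet
        self._plan.append(("map", fn))
        return self

    def filter(self, fn):                   # transformation: nothing runs yet
        self._plan.append(("filter", fn))
        return self

    def collect(self):                      # action: the plan executes now
        out = iter(self._data)
        for kind, fn in self._plan:
            out = (map if kind == "map" else filter)(fn, out)
        return list(out)

result = (ToyRDD(range(6))
          .map(lambda x: x * x)
          .filter(lambda x: x % 2 == 0)
          .collect())
# result -> [0, 4, 16]
```

This is why Spark configuration matters at submission time: the execution mode (local, standalone, YARN client/cluster), executor memory, and core limits all shape how the planned stages actually run when an action fires.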

Kafka

  • What is Kafka?
  • Talend components dedicated to Kafka: Names, how they work, how to configure them
  • Troubleshoot common issues
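Kafka organizes messages as partitioned, append-only logs: producers append byte-array payloads to a partition, and each consumer group tracks its own read offset per partition. The toy model below illustrates that structure; it is not the Kafka client API, and the simple byte-sum partitioner and all names are invented:

```python
# Toy model of a Kafka topic: one append-only log per partition,
# byte-array messages, and per-consumer-group read offsets.

class ToyTopic:
    def __init__(self, partitions=2):
        self.logs = [[] for _ in range(partitions)]
        self.offsets = {}               # (group, partition) -> next offset

    def produce(self, key, value):
        # Kafka payloads are byte arrays; a tKafkaOutput component likewise
        # expects serialized byte arrays as its incoming data.
        assert isinstance(value, bytes)
        part = sum(key.encode()) % len(self.logs)   # toy deterministic partitioner
        self.logs[part].append(value)
        return part

    def consume(self, group, part):
        # Each consumer group advances its own offset; the log is never mutated.
        offset = self.offsets.get((group, part), 0)
        if offset >= len(self.logs[part]):
            return None                 # nothing new past the committed offset
        self.offsets[(group, part)] = offset + 1
        return self.logs[part][offset]

topic = ToyTopic()
p = topic.produce("sensor-1", "temp=21".encode())
first = topic.consume("group-a", p)     # -> b"temp=21"
again = topic.consume("group-a", p)     # offset already advanced -> None
```

The offset bookkeeping is the point: messages stay in the log after being read, so different groups can re-consume the same stream independently.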

Sample Questions

  1. You designed a Big Data Batch Job using the Map Reduce framework. You plan to execute it on a cluster using Map Reduce v1. Which mandatory configuration parameters must be specified in the Hadoop Configuration tab of the Run view? Choose all that apply.

    a. Name Node

    b. Data Node

    c. Resource Manager

    d. Job Tracker

  2. What is HDFS?

    a. A data warehouse infrastructure tool to process structured data in Hadoop

    b. A tool to import/export tables from/to the Hadoop File System

    c. A column-oriented key/value data store built to run on top of the Hadoop File System

    d. The primary storage system used by Hadoop applications

  3. In which perspective of the Studio can you run analyses on Hive table content?

    a. Profiling

    b. Integration

    c. Big Data

    d. Mediation

  4. HDFS components can only be used in Big Data Batch or Big Data Streaming Jobs:

    a. True

    b. False

  5. The ZooKeeper service is mandatory to coordinate transactions between Talend Studio and HBase:

    a. True

    b. False

  6. Select the outgoing links allowed for a tSqoopImport component:

    a. Main

    b. Iterate

    c. OnSubjobOk

    d. SqoopCombine

  7. What should be the type of the incoming data of a tKafkaOutput component?

    a. Serialized byte arrays

    b. Integers

    c. Bytes

    d. String

  8. In the Studio, the components in the Palette are the same for Spark Jobs and for Map Reduce Jobs:

    a. True

    b. False

Answers:

  1. a&d
  2. d
  3. a
  4. b
  5. a
  6. b&c
  7. a
  8. b