Kettle



Introduction

Kettle, also known as Pentaho Data Integration (PDI), is a popular open source ETL (Extract, Transform, Load) tool initiated by Matt Casters (http://www.ibridge.be). It is widely regarded as one of the best ETL tools in the BI marketplace.

With its carefully designed architecture and broad Java JDBC support, Kettle is our favorite choice for any data integration solution.


Kettle is part of the Pentaho BI application suite. It was an independent project initiated by Matt Casters until Pentaho acquired it in 2006. Since then, Kettle has also been known as Pentaho Data Integration (PDI). Matt still leads PDI project development at Pentaho.

Kettle consists of four applications:
  • Spoon, a graphical designer for building jobs and transformations. It is based on SWT (the Standard Widget Toolkit), not Swing.
  • Pan, a script that executes a transformation, either from a .ktr XML file or from a repository.
  • Kitchen, a script that executes a job, either from a .kjb XML file or from a repository.
  • Carte, a simple web server used to execute jobs and transformations remotely, in a cluster or in parallel.
Each application is launched from its own batch (Windows) or shell (Linux) script.
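As a rough illustration of how the Pan and Kitchen scripts are invoked, the sketch below builds their command lines in Python. The install directory, the example file paths, and the helper names `build_command` and `run` are assumptions made for illustration. The `-file` and `-level` options follow the commonly documented PDI command-line syntax on Linux, and should be checked against the version you have installed.

```python
import os
import subprocess

# Hypothetical install directory; adjust to wherever Kettle was unpacked.
KETTLE_HOME = "/opt/pentaho/data-integration"


def build_command(path, level="Basic", kettle_home=KETTLE_HOME):
    """Return the argument list for running a .ktr with Pan or a .kjb with Kitchen."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".ktr":
        script = "pan.sh"        # transformations are executed by Pan
    elif ext == ".kjb":
        script = "kitchen.sh"    # jobs are executed by Kitchen
    else:
        raise ValueError("expected a .ktr or .kjb file: " + path)
    return [kettle_home + "/" + script, "-file=" + path, "-level=" + level]


def run(path, **kwargs):
    """Run the file and return the script's exit code (0 is assumed to mean success)."""
    return subprocess.call(build_command(path, **kwargs))
```

For example, `build_command("/tmp/load_sales.ktr")` selects `pan.sh`, while a path ending in `.kjb` selects `kitchen.sh`. On Windows the equivalent batch files would be used instead, with their own option syntax.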


Table of Contents

  1. Windows Installation
  2. Linux Fedora 9 Installation
  3. Spoon Introduction
  4. Job/Transformation Command Line Utilities
    • Pan
    • Kitchen
  5. Transformation
  6. Job
  7. Variable
  8. Screencast