Tuesday, September 15, 2020

Talend Components

Talend :-

Talend offers an open, scalable architecture that helps teams respond faster to business requests. It provides a unified set of tools for developing and deploying data integration jobs.

In this post, "Talend" refers to Talend Open Studio.

Talend Open Studio is the open source data integration product produced by Talend. It is designed to convert, combine and update data in various areas across a business.

Talend is an ETL (Extract, Transform, Load) tool.

Uses of Talend:-

·         Increases productivity

·         Built-in data quality

·         Extensive connectivity

·         Scalability and management functions

·         Improves the efficiency of big data job design through a graphical interface

 

Other data integration tools:-

·         Informatica Cloud Data Integration

·         Oracle Integration Cloud Service

·         Salesforce Platform: Salesforce Connect

·         SnapLogic

·         Talend Cloud Integration

·         Microsoft SQL Server

 

Why Talend:-

·         It allows faster response to business requests

·         Open source

·         Future-proof

·         Deploys data integration jobs faster than hand coding

·         Cloud capabilities to simplify adoption of the latest innovations

List of components used in Talend:-

·         tDBInput

·         tMap

·         tLogRow

·         tSplitRow

·         tExtractDelimitedFields

·         tFileInputJSON

·         tExtractJSONFields

·         tFileInputRaw

·         tJavaRow

·         tFlowToIterate

·         tSendMail

·         tFileArchive

·         tAggregateRow

·         tFileOutputDelimited

·         tFileInputDelimited

tDBInput:-

·         tDBInput reads a database and extracts fields based on a query

 

Purpose:- tDBInput executes a DB query whose column order must strictly correspond to the schema definition. It then passes the field list on to the next component via a Main row link.

Component used in job:-

tMap:- tMap is one of the core components and is primarily used for mapping input data to output data, that is, mapping one schema to another.

Purpose:- tMap is used to join one schema to another and to transfer data from a single source or multiple sources
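As a rough illustration of what such a mapping does, the plain-Java sketch below joins an order flow to a customer lookup by key and maps both to one output schema. It is not the code Talend generates; the class name, the column layout and the "UNKNOWN" default are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapJoinSketch {
    // orders: {orderId, customerId}; customersById: customerId -> name (hypothetical schemas).
    static List<String[]> join(List<String[]> orders, Map<String, String> customersById) {
        List<String[]> output = new ArrayList<>();
        for (String[] order : orders) {
            String name = customersById.get(order[1]); // null when the lookup has no match
            // Keep unmatched orders (left-outer style) and fill in a default name.
            output.add(new String[] { order[0], name == null ? "UNKNOWN" : name });
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, String> customers = new HashMap<>();
        customers.put("C1", "Alice");
        List<String[]> orders = new ArrayList<>();
        orders.add(new String[] { "O-100", "C1" });
        orders.add(new String[] { "O-101", "C9" });
        for (String[] row : join(orders, customers)) {
            System.out.println(row[0] + "|" + row[1]);
        }
    }
}
```

In the tMap editor the same idea is expressed graphically: the lookup key is dragged onto the main flow and the output columns are filled with expressions.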

Component used in job:-

tLogRow:- Displays data or results in the Run console

Purpose:- tLogRow is used to monitor the results

Component used in job:-

 

tSplitRow:- tSplitRow is used to split one row into several rows

Purpose:- This component is used for splitting one input row into several output rows
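A minimal plain-Java sketch of the idea, assuming a hypothetical input row {id, homePhone, workPhone} that should become two output rows {id, phoneType, phone}. The class and column names are invented for illustration and are not part of Talend.

```java
import java.util.ArrayList;
import java.util.List;

class SplitRowSketch {
    // One input row {id, homePhone, workPhone} becomes two output rows {id, phoneType, phone}.
    static List<String[]> split(String[] inputRow) {
        List<String[]> outputRows = new ArrayList<>();
        outputRows.add(new String[] { inputRow[0], "home", inputRow[1] });
        outputRows.add(new String[] { inputRow[0], "work", inputRow[2] });
        return outputRows;
    }

    public static void main(String[] args) {
        for (String[] row : split(new String[] { "42", "555-0100", "555-0199" })) {
            System.out.println(String.join("|", row));
        }
    }
}
```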

 

 

Component used in job:-

tExtractDelimitedFields:- tExtractDelimitedFields generates multiple columns from a delimited string column

Purpose:- tExtractDelimitedFields helps extract fields from within a string, for example to write them elsewhere
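The effect can be sketched in plain Java with String.split. This is only an approximation of the component's behaviour under the assumption of a simple literal separator; the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class ExtractDelimitedSketch {
    static String[] extract(String delimitedColumn, String separator) {
        // Pattern.quote treats the separator literally; limit -1 keeps trailing empty fields.
        return delimitedColumn.split(Pattern.quote(separator), -1);
    }

    public static void main(String[] args) {
        // Hypothetical column value holding city, country code and postal code.
        System.out.println(Arrays.toString(extract("Paris;FR;75001", ";")));
    }
}
```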

Component used in job:-

tExtractJSONFields:- tExtractJSONFields extracts the desired data from incoming JSON fields based on the XPath or JSONPath query

                                               

Purpose:- tExtractJSONFields extracts the data from JSON fields stored in a file, a database table, etc., based on the XPath or JSONPath query

Component used in job:-

tFileInputRaw:- tFileInputRaw reads all the content of a file and sends it to a single output column

                                                           

 

Purpose:- This component is used to gather together data and send it to a single output column for subsequent processing by another component
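In plain Java, reading a whole file into one value looks roughly like the sketch below. The class name and the choice of UTF-8 are assumptions for the example, not settings taken from the component.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileInputRawSketch {
    // Returns the entire file content as one string, the way a single output column would hold it.
    static String readWhole(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("raw-sketch", ".txt");
        Files.write(file, "line 1\nline 2\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readWhole(file));
        Files.delete(file);
    }
}
```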

 

Component used in job:-

 

 

tJavaRow:-

tJavaRow allows you to enter customized code which you can integrate in a Talend program. With tJavaRow, you can enter the Java code to be applied to each row of the flow

                                                   

Purpose:-

    tJavaRow allows you to broaden the functionality of Talend Jobs, using the Java language
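Inside the component, the code reads the current row through input_row and writes the result through output_row. The standalone sketch below mimics that with two hand-written row classes so it can run outside Talend Studio; the column names (name, amount, amountDoubled) are hypothetical, and in a real Job the row classes come from the component schema.

```java
class JavaRowSketch {
    // Hypothetical input and output schemas.
    static class InputRow {
        String name;
        int amount;
        InputRow(String name, int amount) { this.name = name; this.amount = amount; }
    }

    static class OutputRow {
        String name;
        int amountDoubled;
    }

    // The two assignments below stand in for the code typed into tJavaRow.
    static OutputRow process(InputRow input_row) {
        OutputRow output_row = new OutputRow();
        output_row.name = input_row.name == null ? null : input_row.name.trim().toUpperCase();
        output_row.amountDoubled = input_row.amount * 2;
        return output_row;
    }

    public static void main(String[] args) {
        OutputRow row = process(new InputRow(" alice ", 21));
        System.out.println(row.name + "|" + row.amountDoubled);
    }
}
```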

 

Component used in job:-

tFlowToIterate:- tFlowToIterate iterates on the input data and generates global variables

                                                           

Purpose :- This component is used to read data line by line from the input flow and store the data entries in iterative global variables
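The pattern can be sketched in plain Java as a loop that overwrites an entry in a shared map before each iteration runs. The map below only imitates Talend's globalMap, and the key "row1.fileName" assumes a row named row1 with a column named fileName, both invented for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class FlowToIterateSketch {
    // For each input value, store it under a global key, then run one iteration of the downstream work.
    static void iterate(List<String> fileNames, Consumer<Map<String, Object>> iteration) {
        Map<String, Object> globalMap = new HashMap<>();
        for (String fileName : fileNames) {
            globalMap.put("row1.fileName", fileName); // hypothetical key: <row name>.<column name>
            iteration.accept(globalMap);
        }
    }

    public static void main(String[] args) {
        iterate(Arrays.asList("a.csv", "b.csv"),
                globals -> System.out.println("processing " + globals.get("row1.fileName")));
    }
}
```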

Component used in job:-


tSendMail:- tSendMail sends emails and attachments to defined recipients

Purpose :- The purpose of tSendMail is to notify recipients about a particular state of a Job or possible errors

Component used in job:-

         

tFileArchive:- This component creates a new zip, gzip, or tar.gz archive file from one or more specified files or folders, and the archive file can be compressed using different compression methods

                                                    

Purpose :- This component allows you to create a new archive file from one or more files or folders
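For comparison, creating a zip archive by hand with the standard java.util.zip classes looks roughly like this. The sketch covers only the zip case and flat file names; the class name and method signature are assumptions for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class FileArchiveSketch {
    // Writes each file into the archive under its own file name (no folder structure).
    static void zip(List<Path> files, Path archive) throws IOException {
        try (OutputStream out = Files.newOutputStream(archive);
             ZipOutputStream zipOut = new ZipOutputStream(out)) {
            for (Path file : files) {
                zipOut.putNextEntry(new ZipEntry(file.getFileName().toString()));
                Files.copy(file, zipOut);
                zipOut.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("archive-sketch", ".txt");
        Files.write(source, "hello".getBytes(StandardCharsets.UTF_8));
        Path archive = Files.createTempFile("archive-sketch", ".zip");
        zip(Collections.singletonList(source), archive);
        System.out.println("archive size in bytes: " + Files.size(archive));
    }
}
```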

Component used in job:-

tAggregateRow:- tAggregateRow receives a flow and aggregates it based on one or more columns. For each output line, it provides the aggregation key and the relevant result of set operations (min, max, sum)

                                                 

Purpose :- Helps to provide a set of metrics based on values or calculations
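A small plain-Java sketch of grouping by a key and computing min, max and sum per group. The {region, amount} row layout and the class name are assumptions for this example; the component itself is configured through its Group by and Operations tables.

```java
import java.util.IntSummaryStatistics;
import java.util.Map;
import java.util.TreeMap;

class AggregateRowSketch {
    // rows: {region, amount}. Returns one statistics object per region (the aggregation key).
    static Map<String, IntSummaryStatistics> aggregate(String[][] rows) {
        Map<String, IntSummaryStatistics> byKey = new TreeMap<>();
        for (String[] row : rows) {
            byKey.computeIfAbsent(row[0], key -> new IntSummaryStatistics())
                 .accept(Integer.parseInt(row[1]));
        }
        return byKey;
    }

    public static void main(String[] args) {
        String[][] rows = { { "north", "10" }, { "south", "5" }, { "north", "30" } };
        aggregate(rows).forEach((region, stats) -> System.out.println(
                region + " min=" + stats.getMin() + " max=" + stats.getMax() + " sum=" + stats.getSum()));
    }
}
```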

Component used in job:-

 

tFileOutputDelimited:- tFileOutputDelimited writes input data to a delimited text file, for example in txt or csv format

                                                               

Purpose :- This component writes a delimited file that holds data organized according to the defined schema
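Writing rows to a delimited file can be sketched in plain Java as follows. This is a simplified stand-in that does no quoting or escaping of the separator; the class name and the {id, name} rows are made up for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class FileOutputDelimitedSketch {
    // Joins the fields of each row with the separator and writes one line per row.
    static void write(Path target, List<String[]> rows, String separator) throws IOException {
        List<String> lines = new ArrayList<>();
        for (String[] row : rows) {
            lines.add(String.join(separator, row));
        }
        Files.write(target, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] { "1", "Alice" });
        rows.add(new String[] { "2", "Bob" });
        Path target = Files.createTempFile("output-sketch", ".csv");
        write(target, rows, ";");
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```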

Component used in job:-

 

tFileInputDelimited:- tFileInputDelimited reads a given file row by row with simple separated fields

                                               

Purpose :- Opens a file and reads it row by row, splits each row into fields, then sends the fields as defined in the schema to the next Job component via a Row link
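Reading a delimited file row by row can be approximated in plain Java like this. The sketch assumes a literal separator and no quoted fields; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class FileInputDelimitedSketch {
    // Reads the file line by line and splits each line into fields.
    static List<String[]> read(Path source, String separator) throws IOException {
        List<String[]> rows = new ArrayList<>();
        for (String line : Files.readAllLines(source, StandardCharsets.UTF_8)) {
            rows.add(line.split(Pattern.quote(separator), -1)); // -1 keeps empty trailing fields
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("input-sketch", ".csv");
        Files.write(source, Arrays.asList("1;Alice;", "2;Bob;NY"), StandardCharsets.UTF_8);
        for (String[] row : read(source, ";")) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```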

 

 

 

Component used in job:-

tFileInputJSON:- tFileInputJSON extracts JSON data from a file

Purpose :- tFileInputJSON extracts JSON data from a file, then transfers the data to a file or a database table

Component used in job:-

tPOP:- tPOP is used to extract emails from an inbox to a local directory

Purpose :- It fetches one or more email messages and writes the recovered information to specified files. The Advanced settings view allows you to apply filters to your selection.

Component used in job:-

 

How to automate sending the job status by mail, for either success or failure, with detailed errors?

tSendMail started:

tSendMail completed:

Tsendlog error:-

 

 

 

tLogCatcher:- It fetches messages from Java exceptions, tDie errors and tWarn warnings, so that the error messages of the current job can be sent by mail

Purpose :- tLogCatcher operates as a log function triggered by one of three events: a Java exception, tDie or tWarn, to collect and transfer log data
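The success-or-failure notification idea can be sketched in plain Java as below. No mail is actually sent here: the method only builds the subject line that a mail step could use, and the class name, job name and message format are all assumptions made for illustration.

```java
class JobStatusMailSketch {
    // Runs the job and returns a status line: success, or failure with the exception details.
    static String runAndDescribe(String jobName, Runnable job) {
        try {
            job.run();
            return "[SUCCESS] " + jobName;
        } catch (RuntimeException e) {
            // Comparable to catching a Java exception in the Job and forwarding its message.
            return "[FAILURE] " + jobName + ": " + e.getClass().getSimpleName() + " - " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndDescribe("load_orders", () -> { }));
        System.out.println(runAndDescribe("load_orders", () -> {
            throw new IllegalStateException("input file missing");
        }));
    }
}
```

In a Talend Job the equivalent wiring would typically be a tLogCatcher connected to a tSendMail, so that the caught message ends up in the mail body.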

Component used in job:-
