Friday 20 December 2013

context variable in Talend ETL, Executing Talend job from command prompt - Begineers Example

Hi Guys,

This post will tech you about
1) Understanding and usage of context variables
2) Executing Talend job from command line by passing context variables.

References :
Vikram's post :
http://www.vikramtakkar.com/2013/01/understand-context-variables-with.html

Diethard Steiner post:
http://diethardsteiner.blogspot.in/2012/02/talend-open-studio-how-to-set-up.html

Talend Tutorials:
http://www.talendforge.org/tutorials/tutorial.php?idTuto=34

Environment : 
1) TOS 5.1.3
2) Windows 7 OS - Command prompt.

Example Scenario:
Dynamically printing any name that you pass from context variable on console.
This will give you the first basic idea of using context variables in Talend ETL thus you can implement the same concept for other requirements in your project

What is context variable ? Context variables are user defined variable provided by Talend Open studio whose value can be changed at run time.

We can provide the values of the context variables at run time which allows jobs to be executed in different ways with different parameters.

Example:

Find the steps below... For components detailed description, follow the Talend online Tutorials here at https://help.talend.com/display/HOME/Welcome

PART - I
1) Dragging and dropping components  to be use in this job
2) creating context variables
3) Make use of created context variable in "tFixedFlowInput" component.
4) Components usage for this example
5) Executing the job
6) Exporting job and executing from command prompt. 
PART - II
1) Exporting Talend JOB
2) Executing the job from command prompt by passing different parameters(i.e., different context variables).


PART - I
Step 1 : Components in JOB

Drag and drop tFixedFlowInput, tLogRow and tBufferOutput components on to the design area from palette. 

Make "Main" connections in the sequence of components as shown in below figure.
 Way of giving connections 
Right click on 1st component -> Click Row -> Click Main and then connect to 2nd component
In the same way make a Main connection from 2nd component to 3rd component as shown in above figure. 

Step 2:Creating context variable:

There are multiple ways of creating context variables.(As per my knowledge up to this blog written date).  

One of the ways which I am describing here is to create context variable for particular job.. 
Find how image below which are yellow right marked..

The "X" right mark is another way of creating context variables  from repository - This post is not discussing this point rather discussing about above bold text. 


Every job has it's own context variables to create.. 

Context(Job_name) options area :
It'll have 3 sections 
1) Variables 2) Values as tree and 3) Values as tables
also various options to add, delete, move up , move down the context variables as shown in above figure 

Variables :
1) Click on + Symbol and give the name for your context variable ( In this example context variable name is "name"
2) Set what type of context variable in your case ( here it is "String"Type)

Note that when you give name, the Script code generates automatically. 

Values as tree:
Give default values that you want to pass from context variable .. In this example I given it as "Sadakar".. at the time of execution from command prompt you can pass any other name lets say "TalendLearner"..
and the job will take this parameter value and give the same name back on console.


Context variables are user defined variable provided by Talend Open studio whose value can be changed at run time. We can provide the values of the context variables at runtime which allows jobs to be executed in different ways with different parameters. For Example, we can define different context variables for Production & Test and based on the environment the job is running it will set the variable values. - See more at: http://www.vikramtakkar.com/2013/01/understand-context-variables-with.html#sthash.rtPUSPQE.dpuf

Context variables are user defined variable provided by Talend Open studio whose value can be changed at run time. We can provide the values of the context variables at runtime which allows jobs to be executed in different ways with different parameters. For Example, we can define different context variables for Production & Test and based on the environment the job is running it will set the variable values. - See more at: http://www.vikramtakkar.com/2013/01/understand-context-variables-with.html#sthash.rtPUSPQE.dpuf
Context variables are user defined variable provided by Talend Open studio whose value can be changed at run time. We can provide the values of the context variables at runtime which allows jobs to be executed in different ways with different parameters. For Example, we can define different context variables for Production & Test and based on the environment the job is running it will set the variable values. - See more at: http://www.vikramtakkar.com/2013/01/understand-context-variables-with.html#sthash.rtPUSPQE.dpufContext variables are user defined variable provided by Talend Open studio whose value can be changed at run time. We can provide the values of the context variables at runtime which allows jobs to be executed in different ways with different parameters. For Example, we can define different context variables for Production & Test and based on the environment the job is running it will set the variable values. - See more at: http://www.vikramtakkar.com/2013/01/understand-context-variables-with.html#sthash.rtPUSPQE.dpufContext variables are user defined variable provided by Talend Open studio whose value can be changed at run time. We can provide the values of the context variables at runtime which allows jobs to be executed in different ways with different parameters. For Example, we can define different context variables for Production & Test and based on the environment the job is running it will set the variable values. - See more at: http://www.vikramtakkar.com/2013/01/understand-context-variables-with.html#sthash.rtPUSPQE.dpufContext variables are user defined variable provided by Talend Open studio whose value can be changed at run time. We can provide the values of the context variables at runtime which allows jobs to be executed in different ways with different parameters. For Example, we can define different context variables for Production & Test and based on the environment the job is running it will set the variable values. - See more at: http://www.vikramtakkar.com/2013/01/understand-context-variables-with.html#sthash.rtPUSPQE.dpuf
Values as Table:
This will display the values in the form of table.. just for showing all the variables in a neat format.


You have done with creating context variable.. now you need to make use of this context variable

Step 3: 
Making use of created context variable in "fFixedFlowInput" component. 


Now, I am going to describe how you can make use of context variable from tFixedFlowInput 
(This a scenario ...)

IDEA is :
Pass any name(a group of characters - in simple lets say a string)  from "tFixedFlowInput"  with the help of created context variable and display back the same name on the console when execute the job from command prompt. 


1) Click on "tFixedFlowInput" component
2) Click on Component to get it's properties
3) Click on Edit Schema
4) Click on + sign and give Column name( Any name.. it is not mandatory to give the same name as context variable) and close it.
5) Give value for the name ( you can type or press Ctrl+Space and then select the context variable).

Find the image below for flow of steps. 

You are done with the make use of context variable which you created in step 3 in "tFieFixedFlow" component. 
i.e., dynamically you are passing name from "tFileFixedFlow" component to get the same value on the console. 

Step 4: other components to view the output
tLogrow: 
Displays data or results in the Run console.

Detailed description of this component is 

tBufferOutput:
This component collects data in a buffer in order to access it later via webservice for example.

Detailed description of this component is 


Step 5: Save your job and run it
Sample output


PART - II
Step 1: Exporting Talend JOB
1) Right click on your job
2) Click on Export job
3) Give the destination for exporting by giving zip file name.
4) Click on finish. 


Step 2: Executing Talend JOB from command prompt by passing the parameters.
1) Open command prompt if you use window (This example is based on windows 7 OS).
2)  Unzip the exported file and open the open the project using command prompt

Find the image below


Change your directory from the command prompt till you find .bat file of project.
i.e., 
For this example it would be 
D:\>Cd C:\Users\helical002\Desktop\test_param_0.1\test_param_0.1\test_param>

Hold on here.....

Open the .bat file in text editor(Notepad++ is preferrable)... You can find a line(s) code starting with "java"......
for example
java -Xms256M -Xmx1024M -cp classpath.jar; sadakar_project.test_param_0_1.test_param --context=Default %* 

Execute this command from the opened directory.
i.e., 

Complete command execution example

C:\Users\helical002\Desktop\test_param_0.1\test_param_0.1\test_param>java -Xms25
6M -Xmx1024M -cp classpath.jar; bitach.test_param_0_1.test_param --context=Defau
lt %*


Sample output once you execute the above command for this example
.-------.
|tLogRow_1|
|=-----=|
|name   |
|=-----=|
|Sadakar|
'-------'


Now, Give different names as input from the above command.

Syntax for passing parameters :

--context_param param_1_name=param_1_value param_2_name=param_2_value


In this example lets pass a(differs from default) parameter .. lets say "context_variable_quick_understanding"



C:\Users\helical002\Desktop\test_param_0.1\test_param_0.1\test_param>java -Xms25
6M -Xmx1024M -cp classpath.jar; sadakar_project.test_param_0_1.test_param --context_param
 name=context_variable_quick_understanding
.------------------------------------.
|             tLogRow_1              |
|=----------------------------------=|
|name                                |
|=----------------------------------=|
|context_variable_quick_understanding|
'------------------------------------'


That's it... you have done with learning context variables - passing them from command prompt..

NOTE :
When you export your job it will create Default.properties file in your file/folder system.
For example naviage in the folder structure where you can find properties file ( usually it is under contexts folder
C:\Users\sadakar002\Desktop\test_param_0.1\test_param_0.1\test_param\bitach\test_param_0_1\contexts




( Thanks for writing up of "Vikram" , "Dither" on the same for making me to write up this as a quick learning)....


Sadakar
BI developer


I believe in "Learning never exhausts the mind"











Tuesday 10 December 2013

Converting string to date in Talend

Date Expression in Talend

Converting string to date:
 

Relational.ISNULL(row1.Checkdate)==true?
null:new SimpleDateFormat("MM-dd-yyyy hh:mm:ss").parse(row1.Checkdate)



Source :
See the difference b/w small and capital

http://javatechniques.com/blog/dateformat-and-simpledateformat-examples/

Wednesday 31 July 2013

Create a table in PostgreSQL from Talend ETL using excel sheet



Create a table in PostgreSQL from Talend ETL

Environment:
Talend Version : 5.1.3
PostgreSQL : 9.2
Excel sheet with few columns records are having null values.
Step 1:
Right Click on Job Designs -> Create job >
Name: dummytable
Give purpose & Discription of the job




 Step 2:
Left side from the Repository, go to Metadata
File Excel -> Right click -> Create new Excel
There will be 4 steps you need to complete.
i)                    Give Name, purpose, Description etc.
ii)                   Give the location of the excel file (Click on Browser). In File Viewer and Sheets setting, Set sheet parameters. I,e. Select the number of sheets.
iii)                 Encodinig UTF-8, if your excel sheet having header then Tick Header & fill it as 1. Tick Set heading row as column names and Click on Refreshing.  Click onfinish
iv)                 Click on finish.
Step 3:
From the databases(right side from the palette) click on postgreSQL-> Then click on postgresqlOutput component and then drag on to the panel.
Click on component.
Give all the details as shown in figure.


Host: localhost
Port: 6062( default port number of postgreSQL is 5432)
Database: iReportDummy ( You need to create database before you start designing the job)
Schema : public
Username: postgres
Password: postgres
Table : issuedummytable
Action on table : Create table if not exist (See all the options in dropdown).
Action on data : insert (see all the options in dropdown)
Step 4:
·         From the palette of components click on Processing -> select tMap component and drag it to job design. (place in between  the excel sheet component and postgresqlOutput component as shown in  figure 1.
·         Double click on “tMap” component.

 

Click on “auto Map”.  You can find the mapping lines as shown in figure.
NOTE:
There are few columns which are having null values in Excel sheet.
You need to use the below expression to get all the data from excel sheet.
From the right side, click on the column(click  small react angled one), then the expression editor will appear.
There you need to write the condition.
·         Write the similar conditions for all the output columns for which in excel sheet is having null values.
·         row1.placement_count.equals("NULL") == true ? null: new BigDecimal(row1.placement_count)  


 

Step 5:
Save the job and run it.
Check whether the table created correctly or not in postgreSQL.

Thank you :) 

Saturday 27 July 2013

Tips in Talend Open Studio


1) Excel sheet contains NULL values. When we are creating a table using the excel sheet then          we don't 'get exact out put having null values.
     In this case we use the following expression.

 row1.placement_count.equals("NULL") == true ? null: new                                      BigDecimal(row1.placement_count)  


2)