DataStage_Info
Sunday, 1 November 2015
Partitioning Techniques
In data partitioning, the data is split into partitions that are distributed across the processors.
The data partitioning techniques are:
a) Auto
b) Hash
c) Modulus
d) Random
e) Range
f) Round Robin
g) Same
The default partition technique is Auto.
Round Robin:- The first record goes to the first processing node, the second record goes to the second processing node, and so on. This method is useful for creating partitions of equal size.
Hash:- Records with the same value in the hash-key field are sent to the same processing node.
Modulus:- The partition is chosen by taking the modulus of a numeric key column by the number of partitions. This method is similar to hash partitioning.
Random:- The records are randomly distributed across all processing nodes.
Range:- Related records, whose key values fall in the same range, are sent to the same node. The ranges are specified on a key column.
Auto:- This is the most common method. DataStage determines the best partitioning method to use depending on the type of stage.
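The row-to-node mapping behind these methods can be sketched in a few lines of Python. This is only an illustration and not how DataStage implements partitioning internally: the function names, the sample records, and the use of `zlib.crc32` as a stand-in hash are all assumptions made for the example.

```python
import random
import zlib
from bisect import bisect_right


def round_robin(records, nodes):
    """Record i goes to node i % nodes, so partition sizes stay near-equal."""
    parts = [[] for _ in range(nodes)]
    for i, rec in enumerate(records):
        parts[i % nodes].append(rec)
    return parts


def hash_partition(records, key, nodes):
    """Records with the same key value always land on the same node."""
    parts = [[] for _ in range(nodes)]
    for rec in records:
        # crc32 is a stand-in hash, chosen only because it is deterministic.
        h = zlib.crc32(str(rec[key]).encode("utf-8"))
        parts[h % nodes].append(rec)
    return parts


def modulus_partition(records, key, nodes):
    """Node number = integer key value modulo the number of nodes."""
    parts = [[] for _ in range(nodes)]
    for rec in records:
        parts[rec[key] % nodes].append(rec)
    return parts


def random_partition(records, nodes, seed=None):
    """Each record goes to a randomly chosen node."""
    rng = random.Random(seed)
    parts = [[] for _ in range(nodes)]
    for rec in records:
        parts[rng.randrange(nodes)].append(rec)
    return parts


def range_partition(records, key, boundaries):
    """boundaries are sorted split points; a key goes to the partition
    whose index equals the number of split points less than or equal to it."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for rec in records:
        parts[bisect_right(boundaries, rec[key])].append(rec)
    return parts


# Hypothetical sample data: six records with a numeric id and a city name.
cities = ["Pune", "Delhi", "Pune", "Agra", "Delhi", "Pune"]
records = [{"id": i, "city": c} for i, c in enumerate(cities)]

for name, parts in [
    ("round robin", round_robin(records, 3)),
    ("hash on city", hash_partition(records, "city", 3)),
    ("modulus on id", modulus_partition(records, "id", 3)),
    ("range on id", range_partition(records, "id", [2, 4])),
]:
    print(name, [[r["id"] for r in p] for p in parts])
```

Round robin and modulus give balanced partitions here because the ids are consecutive. Hash keeps all records for one city together, which may leave partitions uneven when some key values are much more frequent than others.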
The configuration file is a normal text file. It holds information about the processing and storage resources that are available for use during parallel job execution.
The default configuration file contains entries such as:
a) Node: a logical processing unit that performs all ETL operations.
b) Pools: a collection of nodes.
c) Fast Name: the server name. ETL jobs are executed using this name.
d) Resource disk: the permanent storage area, where persistent data such as data set files is stored.
e) Resource scratch disk: the temporary storage area where staging operations are performed.
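As a rough illustration of how these entries fit together, a single-node configuration file might look like the sketch below. The node name, server name, and directory paths are hypothetical placeholders, not values from a real installation.

```
{
	node "node1"
	{
		fastname "etlserver"
		pools ""
		resource disk "/data/datasets" {pools ""}
		resource scratchdisk "/data/scratch" {pools ""}
	}
}
```

Adding more `node` blocks with the same layout gives the parallel engine more logical nodes to partition data across.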
Difference between server jobs and parallel jobs
Server jobs:-
a) They handle small volumes of data with good performance.
b) They have fewer components.
c) Data processing is slower.
d) They work purely on SMP (Symmetric Multi Processing).
e) They make heavy use of the Transformer stage.
Parallel jobs:-
a) They handle high volumes of data.
b) They work on parallel processing concepts.
c) They apply parallelism techniques.
d) They follow MPP (Massively Parallel Processing).
e) They have more components than server jobs.
f) They run on the Orchestrate framework.
Tuesday, 22 September 2015
1. DataStage Designer
2. DataStage Administrator
3. DataStage Director
1. DataStage Designer:
=>It is used to design jobs.
=>All DataStage development activities are done in this tool.
=>It is used to import and export projects and to view and edit the contents of the Repository.
=>A DataStage developer should know this component very well.
2. DataStage Administrator:
=>It is used to create projects.
=>It is used to delete projects.
=>It is used to set environment variables and to add user-defined environment variables.
=>This is handled by the DataStage administrator.
3. DataStage Director:
=>It is used to run a job.
=>It is used to schedule jobs.
=>This is handled by the DataStage developer or operator.
=>It is a GUI tool.
=>DataStage was introduced in 1997 under the name Data Integrator by VMark, a company in the UK.
=>VMark introduced the product as a normal ETL tool with strong features. They later changed the product name to DataStage and the company name to Ascential, so the product was called Ascential DataStage.
=>Later they developed the product further by integrating it with the Orchestrate tools and the MKS Toolkit. In 2005 IBM took over DataStage, renamed it IBM DataStage, and fixed many bugs.
=>IBM is a brand name well known to many people around the world, so the product is marketed widely. The Informatica ETL tool is a competitor of DataStage.
=>It is now called IBM InfoSphere DataStage.
=>It started from version 5.0. Newer versions were launched later, and IBM has now released the latest version, 11.5.

