
Thursday, June 21, 2012

New Features of Informatica-9

1. Informatica 9 supports data integration for the cloud as well as on-premises: you can integrate data in cloud applications and run Informatica 9 on cloud infrastructure.

2. Informatica Analyst is a new tool available in Informatica 9.

3. There are architectural differences in Informatica 9 compared to previous versions.

4. A browser-based tool for business analysts is a new feature.

5. Data steward is a new feature.

6. It allows unified administration with a new admin console that enables you to manage PowerCenter and PowerExchange from the same console.

7. There are powerful new capabilities for data quality.

8. A single admin console covers Data Quality, PowerCenter, PowerExchange, and Data Services.

9. In Informatica 9, Informatica Data Quality (IDQ) has been further integrated with the Informatica platform, and performance, manageability, and reusability have all been significantly enhanced.

10. Mapping rules are shared between the browser-based tool for analysts and the Eclipse-based developer tool, leveraging unified metadata underneath.

11. The data services capabilities in Informatica 9, exposed over both SQL and web services, can be used for real-time dashboarding.

12. Informatica Data Quality provides worldwide address validation support with integrated geocoding.

13. The ability to define rules and to view and run profiles is available in both Informatica Developer (a thick client) and Informatica Analyst (a browser-based thin client). These tools sit on a unified metadata infrastructure, and both incorporate security features such as authentication and authorization.

14. The Developer tool is now Eclipse based and supports both data integration and data quality for enhanced productivity. The Analyst tool is browser based and supports the types of tasks analysts engage in, such as profiling data, specifying and validating rules, and monitoring data quality.

15. The Velocity methodology is expected to be introduced for Informatica 9 soon.

16. Informatica has the capability to pull data from IMS, DB2 on z Series and i Series, and several other legacy (mainframe) environments such as VSAM, Datacom, and IDMS.

17. There are separate tools available for different roles. The Mapping Architect for Visio tool is designed for architects and developers to create templates for common data integration patterns, saving developers a tremendous amount of time.

18. Informatica 9 does not include ESB infrastructure.

19. Informatica supports open interfaces such as web services and can integrate with other tools that support them, including BPM tools.

20. Informatica 9 complements existing BI architectures by providing immediate access to data through data virtualization, which can supplement the data in an existing data warehouse or operational data store.

21. Informatica 9 supports profiling of mainframe data, leveraging the Informatica platform's connectivity to mainframe sources.

22. Informatica 9 continues to support running the same workflow simultaneously.

23. The Eclipse-based environment is built for developers.

24. The browser-based tool is a fully functional interface for business analysts.

25. Dashboards are designed for business executives.

26. There are three interfaces through which these capabilities can be accessed: the Analyst tool is a browser-based tool for analysts and stewards, developers can use the Eclipse-based Developer tool, and line-of-business managers can view data quality scorecards.

Wednesday, June 6, 2012

Creating dummy data

Just going through Oracle blogs, I came across the site below: http://www.oracle-developer.net/display.php?id=515

Most of the time when you are practicing Oracle or Informatica code, you need dummy data in a table. Particularly when you are working on performance testing of Oracle and Informatica, large volumes of data are required. How will you create them? One way is to insert some records, then union the result with itself, and so on. But this creates repetition of data in your table.

The best way is shown below:



CREATE TABLE test AS
SELECT ROWNUM                      AS id
,      MOD(ROWNUM, 2000)           AS grp
,      DBMS_RANDOM.STRING('u', 5)  AS val
,      DBMS_RANDOM.STRING('u', 30) AS pad
FROM   dual
CONNECT BY ROWNUM <= 1000000;
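For quick experiments outside the database, the same pattern can be sketched in Python (the function name and parameters below are my own, not from the post):

```python
import random
import string

def make_dummy_rows(n, grp_mod=2000, val_len=5, pad_len=30, seed=42):
    """Generate n rows mirroring the Oracle query above:
    a unique id (ROWNUM), a group key (MOD(ROWNUM, 2000)),
    and two random uppercase strings (DBMS_RANDOM.STRING('u', ...))."""
    rng = random.Random(seed)
    rows = []
    for rownum in range(1, n + 1):
        rows.append((
            rownum,
            rownum % grp_mod,
            "".join(rng.choice(string.ascii_uppercase) for _ in range(val_len)),
            "".join(rng.choice(string.ascii_uppercase) for _ in range(pad_len)),
        ))
    return rows

print(make_dummy_rows(3)[0][:2])  # (1, 1)
```

Because every id comes from the row counter, there is no repetition of data, exactly as with ROWNUM in the CONNECT BY trick.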



How to log errors while performing DML operations through SQL?


1. First create source and target table
create table insertTestSourec
(
    id varchar(200)   
)


create table insertTestTarget
(id number
)

2. Create the error log table using the procedure DBMS_ERRLOG.CREATE_ERROR_LOG
BEGIN
  DBMS_ERRLOG.create_error_log (dml_table_name => 'insertTestTarget');
END;

insert into insertTestSourec values(1)
insert into insertTestSourec values(2)
insert into insertTestSourec values('t')


insert into insertTestTarget
select * from insertTestSourec

Here, the statement will be rolled back and not a single record will get inserted into the target table.

Check that your error log table exists using the SQL below:

select *
from all_tables
where table_name like upper('%insertTestTarget%')

3. Re-run the insert with the LOG ERRORS clause
insert into insertTestTarget
select * from insertTestSourec
LOG ERRORS INTO ERR$_INSERTTESTTARGET ('INSERT') REJECT LIMIT UNLIMITED;

4. Here, after execution of the above statement, two records (1 and 2) will get inserted into the target table, and the incorrect record will get inserted into ERR$_INSERTTESTTARGET.

select *
from ERR$_INSERTTESTTARGET
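DBMS_ERRLOG is Oracle-specific, but the idea generalizes. Below is a rough Python/SQLite sketch of the same pattern (table and column names are made up for illustration): good rows land in the target, rejected rows land in a hand-rolled error-log table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE insert_test_source (id TEXT)")
conn.execute("CREATE TABLE insert_test_target (id INTEGER NOT NULL)")
# Hand-rolled error-log table, playing the role of Oracle's ERR$_ table
conn.execute("CREATE TABLE err_insert_test_target (id TEXT, err_msg TEXT)")

conn.executemany("INSERT INTO insert_test_source VALUES (?)",
                 [("1",), ("2",), ("t",)])

# Row-by-row load: good rows reach the target, bad rows go to the error log
for (raw,) in conn.execute("SELECT id FROM insert_test_source").fetchall():
    try:
        conn.execute("INSERT INTO insert_test_target VALUES (?)", (int(raw),))
    except ValueError as exc:
        conn.execute("INSERT INTO err_insert_test_target VALUES (?, ?)",
                     (raw, str(exc)))

print(conn.execute("SELECT id FROM insert_test_target").fetchall())      # [(1,), (2,)]
print(conn.execute("SELECT id FROM err_insert_test_target").fetchall())  # [('t',)]
```

Unlike LOG ERRORS, which Oracle applies inside one set-based statement, this sketch is row-by-row; it only illustrates the bookkeeping.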

String Aggregation Techniques

Query at ReportExtract

You have an employee table, and the user will give a department number, for which you need to fetch the respective employees.

select *
from employees
where department_id in (
select distinct department_id
from departments
where department_id = '&DepID'
)

But the twist is: if the departments table does not have any records, the query should return all employee records.

How will you do it? Below is the solution:

create table departmentsBKP
as
select * from departments

select *
from employees
where department_id in (
select distinct department_id
from departmentsBKP
where department_id = '&DepID'
)
union
select *
from employees
where 0 = (select count(*) from departmentsBKP)

delete from departmentsBKP

Now, if you fire the select statement again, you will get all records of the employees table.
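The fallback logic ("filter by department, unless the lookup table is empty, in which case return everything") can be sketched in plain Python; the function and field names here are hypothetical:

```python
def employees_for_department(employees, departments, dep_id):
    """Return employees of dep_id; if the departments lookup is empty,
    return ALL employees (the UNION trick above in one function)."""
    if not departments:
        return list(employees)
    return [e for e in employees if e["department_id"] == dep_id]

emps = [{"name": "A", "department_id": 10},
        {"name": "B", "department_id": 20}]
deps = [{"department_id": 10}, {"department_id": 20}]

print(employees_for_department(emps, deps, 10))  # just department 10
print(employees_for_department(emps, [], 10))    # departments empty -> everyone
```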

Changing data types of columns...



You have one table, say ABC, in which 4 columns have the data type VARCHAR2(200).
The requirement is to convert the data types of these columns to NUMBER. The table has data.
How will you change the data types of these columns?
Note: you cannot create a backup table in production...
----------------
CREATE TABLE ABC(COL1 VARCHAR2(200), COL2 VARCHAR2(200), COL3 VARCHAR2(200), COL4 VARCHAR2(200));

SET DEFINE OFF;
Insert into HR.ABC
   (COL1, COL2, COL3, COL4)
 Values
   ('1', '1', '1', '1');
Insert into HR.ABC
   (COL1, COL2, COL3, COL4)
 Values
   ('2', '2', '2', '2');
Insert into HR.ABC
   (COL1, COL2, COL3, COL4)
 Values
   ('3', '3', '3', '3');
COMMIT;


---------------

ALTER TABLE ABC ADD (COL11 NUMBER, COL22 NUMBER, COL33 NUMBER, COL44 NUMBER)
---------------

UPDATE ABC SET COL44 = TO_NUMBER(COL4)

UPDATE ABC SET COL33 = TO_NUMBER(COL3)

UPDATE ABC SET COL22 = TO_NUMBER(COL2)

UPDATE ABC SET COL11 = TO_NUMBER(COL1)
----------------

ALTER  TABLE ABC DROP(COL1, COL2, COL3, COL4)
----------------

ALTER TABLE ABC RENAME column COL44 TO COL4 

ALTER TABLE ABC RENAME column COL33 TO COL3 

ALTER TABLE ABC RENAME column COL22 TO COL2 

ALTER TABLE ABC RENAME column COL11 TO COL1


---------------

1. Here, you need to create dummy columns with the NUMBER data type on ABC, because you cannot create a column with the same name again.
2. You convert the data using TO_NUMBER from VARCHAR2 to NUMBER and update the newly added columns.
3. You drop the old columns.
4. You rename the newly added columns.
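The same four-step recipe can be sketched with Python and SQLite; table and column names mirror the example above, and steps 3-4 are shown as comments because they need a newer SQLite (3.35+ for DROP COLUMN, 3.25+ for RENAME COLUMN):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE abc (col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO abc VALUES (?, ?)", [("1", "1"), ("2", "2")])

# Step 1: add new numeric columns alongside the old text ones
conn.execute("ALTER TABLE abc ADD COLUMN col11 INTEGER")
conn.execute("ALTER TABLE abc ADD COLUMN col22 INTEGER")

# Step 2: convert and copy the data (CAST plays the role of TO_NUMBER)
conn.execute("UPDATE abc SET col11 = CAST(col1 AS INTEGER),"
             "               col22 = CAST(col2 AS INTEGER)")

# Steps 3 and 4 on a recent SQLite would be:
#   ALTER TABLE abc DROP COLUMN col1;
#   ALTER TABLE abc RENAME COLUMN col11 TO col1;

print(conn.execute("SELECT col11, col22 FROM abc ORDER BY col11").fetchall())  # [(1, 1), (2, 2)]
```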

Things to verify...


Lookup override: AS clause and port mapping
Unconnected lookup input
Session variables usage
Indirect file read
What if the SELECT has more columns than the ports available at the Lookup and Source Qualifier?

XML of mapping, session, workflow
Mapping, session, and workflow parameters and variables: how they work
What all things can we parameterize?
Interview question on record counts: use of Aggregator and Joiner; second answer to the earlier question.
Normalizer: want to transfer 2 columns as 2 rows, but all the other columns stay the same.
Performance issue related to birwarkar
In what order do we need to put ORDER BY in a lookup override?
Auto option of memory...
Full outer join of the Joiner

Unix:
 grep "NK2" RDBJANERMM_BRK2_20110916_L1_20110917013839_risk.csv |  grep "USD" | cut -d"," -f3 | uniq

1) Command for listing only the files with some data =>
 find . -size +0c -ls | grep BFE2

2) Command for listing all the files but sorted according to their Size =>
 ls -lrS | grep BFE2
3) To convert "," to ";" throughout a file
First open the file using => view temp.csv
 %s/,/;/g
4) To display the 12th column from a .csv file, for the lines of "risk.csv" that contain "EQUIP:ATFV"
 grep "EQUIP:ATFV" risk.csv | cut -d"," -f12
5) To view only directories
  ls -lrt | grep ^d
6) To delete rows 3 to 17 in the vi editor
 :3,17d
7) To delete a row in the vi editor
 press d twice (dd) with the cursor on the row to be deleted.
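Command 4 above can be approximated with Python's csv module, which is handy when fields themselves contain commas (the sample data below is made up):

```python
import csv
import io

# Hypothetical stand-in for risk.csv (12 comma-separated columns per line)
sample = io.StringIO(
    "a,b,EQUIP:ATFV,x,x,x,x,x,x,x,x,VALUE1\n"
    "a,b,OTHER,x,x,x,x,x,x,x,x,VALUE2\n"
)

# Equivalent of: grep "EQUIP:ATFV" risk.csv | cut -d"," -f12
matches = [row[11] for row in csv.reader(sample)
           if any("EQUIP:ATFV" in field for field in row)]
print(matches)  # ['VALUE1']
```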

How will you rename a table



I have created a new table, say testTable:

create table testTable
(
    id number
);

Renamed it to RenamedTestTable:

rename testTable to RenamedTestTable

When you try to select data from the original table, it gives an error saying the table does not exist.

select *
from TestTable

When you select data from the renamed table, the command succeeds.

select *
from RenamedTestTable

Renaming works even if the table has data.

drop table RenamedTestTable

create table testTable
(
    id number
);


insert into testTable values(1);

rename testTable to RenamedTestTable


select *
from RenamedTestTable
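The same behaviour can be reproduced with Python's sqlite3 module; SQLite spells the operation ALTER TABLE ... RENAME TO rather than RENAME. Selecting from the old name fails, while the data survives under the new name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (id INTEGER)")
conn.execute("INSERT INTO test_table VALUES (1)")

# SQLite's spelling of: rename testTable to RenamedTestTable
conn.execute("ALTER TABLE test_table RENAME TO renamed_test_table")

# Selecting from the old name now fails...
try:
    conn.execute("SELECT * FROM test_table")
except sqlite3.OperationalError as exc:
    print(exc)  # no such table: test_table

# ...but the data is intact under the new name
print(conn.execute("SELECT * FROM renamed_test_table").fetchall())  # [(1,)]
```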

Now the question is, what happens if the table has:

1. Index defined on it
2. Table has synonym defined in another schema
3. Table is getting used in Function or Procedure
4. Table is defined in view query.

We will test these scenarios one by one.

1. If table has an index defined on it.....

drop table RenamedTestTable;
create table testTable
(
    id number
);
create unique index IdxTestTable
on testTable(id) ;
select *
from user_ind_columns
where index_name = upper('IdxTestTable');
insert into testTable values(1);
rename testTable to RenamedTestTable;
select *
from user_ind_columns
where index_name = upper('IdxTestTable');
select *
from RenamedTestTable;

So, if an index is defined on a table and you rename that table, nothing happens to the index. In the metadata view of the index, the table name entry simply gets changed.


2. If table has a synonym defined ...

I logged in as the SYS user/schema and created a synonym as below:

select *
from hr.testTable

create synonym testTable for hr.testTable

select *
from testTable

select *
from all_synonyms
where synonym_name = upper('testTable')

When I renamed the table in the HR schema and fired the select statement as SYS, I got an error: "ORA-00980: synonym translation is no longer valid".

As the synonym is created in the SYS schema, there is no change in the synonym entry.


3. If table is getting used in a procedure/function ....

I created table and procedure as below

Table Creation :
drop table RenamedTestTable

create table testTable
(
    id number
);

create unique index IdxTestTable
on testTable(id) ;

Procedure Creation :
create procedure TableTest
as

    vCount number;

begin

    select count(*) into vCount
    from testTable;
   
    dbms_output.put_line('Record Count Is ==> ' || vCount);

end;

Note: if you are using Toad, just select the whole code and press F5; the procedure will get compiled.
Also note that the procedure and table names here are the same. This is allowed because they are different types of DB objects. I would not be able to create two tables, or two procedures, with the same name.

Running a procedure:
begin

TableTest;

end;

Make sure DBMS_OUTPUT is switched on in your tool. If you are using SQL*Plus, type "set serveroutput on".

 You will get 0 as a result.

select *
from all_procedures
where  owner = 'HR'

If you check that proc in ALL_PROCEDURES, the PROCEDURE_NAME column is null and the proc name is shown in OBJECT_NAME. This is expected: for standalone procedures the name appears in OBJECT_NAME, and PROCEDURE_NAME is populated only for subprograms inside packages or types.

select *
from all_objects
where owner = 'HR'
and object_name like 'T%'
If you check in ALL_OBJECTS, it shows them correctly: one row with object type TABLE and a different row with object type PROCEDURE, and the status of both the table and the procedure is VALID.

Now I will rename the table.
rename testTable to RenamedTestTable

fired the below query

select *
from all_objects
where owner = 'HR'
and object_name like 'T%'

The table vanished from the result set (its new name, RenamedTestTable, no longer matches 'T%'), and the procedure's status became INVALID.

Just try to run the proc...
begin

TableTest;

end;

Note: to run the proc in Toad, just select the above code and press Ctrl+Enter.

Got an error as below
ORA-06550: line 3, column 1:
PLS-00905: object HR.TABLETEST is invalid
ORA-06550: line 3, column 1:
PL/SQL: Statement ignored

Here, the object TABLETEST is resolved as a procedure and not a table, because the TABLETEST table object no longer exists.


4. If table is getting used in a view ...

drop table RenamedTestTable

create table testTable
(
    id number
);


create unique index IdxTestTable
on testTable(id)

create view testTableView
as
select * from testTable

select *
from all_views
where view_name = upper('testTableView')


select *
from all_objects
where object_name = upper('testTableView')

The view, as an object, has VALID status.

Now, I will rename the table

rename testTable to RenamedTestTable

Now, the view has become invalid:
select *
from all_objects
where object_name = upper('testTableView')

I will try to select from view now
select *
from testTableView;
It gave an error: [ORA-04063: view "HR.TESTTABLEVIEW" has errors]

Issues

Can we override the SQL override of a reusable Source Qualifier at the session level?

Ans: You cannot create a reusable Source Qualifier transformation. If you have an SQ transformation in a mapping, you can override its SQL query at the session level.

Suppose you have an unconnected lookup, with a lookup override as below:

SELECT DEPARTMENTSBKP.DEPARTMENT_NAME as DEPARTMENT_NAME, DEPARTMENTSBKP.DEPARTMENT_ID as DEPARTMENT_ID FROM DEPARTMENTSBKP
where DEPARTMENTSBKP.DEPARTMENT_ID = 20

You used this unconnected lookup in a mapping, and there you changed the query to:

SELECT DEPARTMENTSBKP.DEPARTMENT_NAME as DEPARTMENT_NAME, DEPARTMENTSBKP.DEPARTMENT_ID as DEPARTMENT_ID FROM DEPARTMENTSBKP
where DEPARTMENTSBKP.DEPARTMENT_ID = 10

When you run the session, the overridden query (the one with dep = 10) will always be executed. You can change the conditions and some properties of a reusable lookup at the mapping level, but you cannot change its existing ports there; if you want to change the ports, you need to go back to the Transformation Developer.

In the case of an Aggregator, if we do not select any port as a group-by, what will be the output? In my case only one row got through...

What will happen if we do not use any aggregate function and only 2-3 group-by ports? What is the output, and how does it work exactly?

Please refer to the comments below for the answers about aggregation in Informatica.
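As a rough model of the behaviour asked about in the last two questions: with no group-by ports the Aggregator emits a single row, computing aggregate functions over all input while non-aggregated ports take their values from the last input row; with group-by ports but no aggregate function you get one row per group, again carrying last-row values. A toy Python sketch (not Informatica code):

```python
def aggregate(rows, group_by=(), agg=None):
    """Toy model of an Aggregator transformation.
    No group-by ports: one output row, aggregates over all input,
    non-aggregated ports carry the LAST input row's values.
    Group-by ports, no aggregate function: one (last) row per group."""
    agg = agg or {}
    groups = {}
    for row in rows:
        key = tuple(row[p] for p in group_by)
        groups[key] = row                      # last row per group wins
    out = []
    for key, last in groups.items():
        result = dict(last)
        for port, fn in agg.items():
            members = [r for r in rows if tuple(r[p] for p in group_by) == key]
            result[port] = fn(r[port] for r in members)
        out.append(result)
    return out

rows = [{"dept": 10, "sal": 100},
        {"dept": 20, "sal": 200},
        {"dept": 10, "sal": 300}]

# No group-by port: single row, SUM over all rows, 'dept' from the last row
print(aggregate(rows, agg={"sal": sum}))       # [{'dept': 10, 'sal': 600}]
# Group-by without an aggregate function: last row of each group
print(aggregate(rows, group_by=("dept",)))
```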

TeraData Basics

Overview of Space

Permanent : Perm space allocated to a user or database is a uniquely defined, logical repository for
1. Data Tables
2. Table Headers (2 header rows per table)
3. Secondary Indexes (SI) [Primary Indexes are part of the table]
4. Join Indexes (JI)
5. Hash Indexes (HI)
6. Permanent Journals
7. Stored Procedures
8. Triggers
9. User Defined Functions (UDF)
10. User Defined Methods (UDM)

When an object is created or data rows are inserted, the space is allocated as needed from the perm space of the immediate owner. The space is returned automatically when no longer needed.

Note: Space allocated to a user or database that is unassigned to an object is available for spool or temp space.

A database or user with no perm space can still own views, macros, and triggers, but cannot have objects that require space, such as tables, UDFs, stored procedures, HIs, JIs, or journals.

Spool: Spool space holds intermediate query results or formatted answer sets to queries, and volatile tables. The system can use unassigned perm space for spool space.

When a user creates new users or databases, the amount of spool space for those new objects must not exceed the spool space limit of their immediate owner. If you do not specify a spool space limit, a new user automatically inherits the same amount of spool as its parent.

To more easily manage spool space, define the value for spool in a profile. You can then assign a profile to users and all of the users will inherit the spool space definition.

Temporary (Temp): Temp space defines the number of bytes the system will use to store data for global temporary tables. The value you specify must not exceed the value of the immediate parent at the time of creation. If you do not specify a value, the maximum value defaults to that of the parent.

To more easily manage temp space, define the value for it in a profile. You can then assign a group of users to this profile and all of the users will inherit the temp space definition.



Permanent Space Availability
Initially, all available permanent space in the system is allocated to user DBC. From user DBC, you create other users who use up some of the space owned by DBC. As you create additional new databases or users under those users, permanent space limits are deducted from the available (unused) space of the immediate owner of that database or user. If those users use up all the space allocated to them by their owner, they cannot create new objects until they obtain more space.

Note : Unused space is allocated dynamically for temporary or spool space, which can reduce the actual amount of perm space available at a given point in time. Thus, specifying a PERM limit does not guarantee that a session can get all of that space upon demand.

To determine the amount of perm space available on a per-AMP basis, query the DBC.DiskSpace view. To obtain information for all AMPs on your system, use the SUM aggregate. For more information, see DiskSpace[V][X] in the Data Dictionary.

Permanent Space Allocation

Permanent space is dynamically acquired by data blocks and cylinders when the system inserts rows. The data block is a disk-resident structure that contains one or more rows from the same table and is the physical I/O unit for the file system. Data blocks are stored in physical disk sectors or segments, which are grouped into cylinders.

The total space of your entire Teradata Database configuration is derived by summing all database and user PERM limits. Total unallocated space is equal to the sum of all permanent limits minus the sum of all permanent space currently in use.

You can use the DBC.DiskSpace view to determine how much permanent space is being allocated to which database or user. The following fields track perm space values.

CURRENTPERM: The total number of bytes (including table headers) currently allocated to existing data tables, index tables and subtables, stored procedures, triggers, and permanent journals residing in a particular database/user.

This value is maintained on each AMP.

MAXPERM: The maximum number of bytes available for storage of all (current and future) data tables, index tables and subtables, stored procedures, triggers, and permanent journals owned by a particular database/user.

Note: For each database or user reported, the number of permanent bytes is divided by the number of AMPs in the configuration. The result is recorded on each AMP and may not be exceeded on that AMP. That is, a user may own several objects, as long as the combined storage requirements are within the MaxPerm limit set for that user on each AMP.

PEAKPERM: The largest number of bytes ever used to store data in a user or database since the last reset of this value to zero.

This value is maintained on each AMP.

To reset PeakPerm value to Zero, use the DBC.ClearPeakDisk macro.

Specifying Spool Space Limits:

Spool space is permanent space not assigned to any database or user and remains spool until permanent space is required.

The system needs spool space for transient journals (TJs) or to create spool files for processing queries when returning an answer set. Spool space is especially important for queries that involve full table scans or use NUSIs.

The system uses spool space for the following :

. The response rows of every query run by that user during a session. Thus, each user needs a high enough spool allocation to contain the biggest anticipated answer set.

. Very large tables. Large tables usually require more available spool space than smaller tables, because intermediate rows are held in spool space during query execution.

. Large Volatile tables. These tables require more available spool space.

Spool Space Allocation

Teradata database allocates spool space dynamically only from disk cylinders that are not being used for permanent or temporary data.

Note: Permanent, temporary and spool data blocks can not coexist on the same cylinder.

Spool space is not reserved. All unused perm space in the Teradata Database is considered available spool space. When spool is released, the file system returns the cylinders it was using to the free cylinder list.

Limiting spool space for a user helps reduce the impact of possibly bad queries. You set spool space limits for a database, a user, or a profile; you cannot limit spool space at the table level.

Defining Temporary Space Limits

Temporary space is used to hold rows of materialized global temporary tables. It is allocated at the database or user level, but not the table level.

You define a temporary space limit with the TEMPORARY parameter of a CREATE/MODIFY PROFILE or CREATE/MODIFY USER/DATABASE statement.

Note: A profile definition overrides any user definition; it does not append settings to the definition.

When using the CREATE USER, CREATE DATABASE, or CREATE PROFILE statement to assign temporary space limits, keep your space limits in mind. Query the DBC.DiskSpaceX view to find the system levels for temporary space.

The following table describes the different types of temporary space.

CURRENTTEMP: This is the amount of space currently in use by global temporary tables.

PEAKTEMP : This is the maximum temporary space used since the last session.
Temporary space is released when the session terminates.

MAXTEMP : MaxTemp specifies the limit of space available for global temporary tables.

The value may not exceed the limit of:
1. The creator or modifier of the profile, when setting TEMPORARY in a profile.
2. The immediate owner of the user being created or modified, if a profile does not apply.

If you do not specify a value and the user is associated with a profile, MaxTemp defaults to the value of the profile, if defined. If the profile TEMPORARY setting is NULL or NONE, or the user is not associated with a profile, MaxTemp defaults to the value of the parent of the user.


Increasing Space By Giving OwnerShip

This section describes how to increase the permanent space limits of a database or user by transferring a database or user to another database or user.

Transferring OwnerShip

You can transfer both databases and users, but to simplify the discussion, this section describes databases. Note that you can easily replace any instance of the word "database" in the following section with the word "user". Also note that databases can own users and users can own databases.

The GIVE statement transfers ownership of a database, including all the databases owned by that database. In addition, you transfer the permanent space limits defined for that database.

When you give a database to another database in the hierarchy, only the permanent space limit for that database is transferred. The spool and temporary space limits remain the same. All descendents of the given database remain descendents of that given database.

When you drop a database, its permanent space limit is credited to its immediate owner. However, its spool space and temporary space are not credited to its immediate owner (that is, the spool and temporary space allocation of the owner remains unchanged).

GIVE does not transfer or change explicit privileges.

Transferring Permanent Space

With CREATE, GIVE, and DROP, you can transfer the permanent space of one database or user to another. This is particularly useful if you wish to transfer permanent space from a child of a child back to user DBC when user DBC is not the immediate owner.

For example, assume the following hierarchy (diagram omitted):
Also assume that:
. F has a MAXPERM of 10, a MAXSPOOL of 50, and a MAXTEMP of 25
. E has a MAXPERM of 10, a MAXSPOOL of 20 and a MAXTEMP of 15

To increase permanent space for E:

1. From space owned by F, create temporary database X with MaxPerm of 5:
Create Database X from F as PERM = 5;
The default is to allocate to a new database the same spool and temp space as its owning database, so MAXSPOOL for X defaults to 50 and MAXTEMP for X defaults to 25. The PERM allocation for X is taken from the space of its owning database; thus, the MAXPERM of F is reduced to 5.

2. Give X to E by using the GIVE statement to transfer ownership.
GIVE X To E;

3. Drop X with the following statement :

Drop Database X;

This increases the MAXPERM of E to 15.

The MAXSPOOL and MAXTEMP of E are unchanged at 20 and 15, respectively.
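The space accounting in this example can be modelled in a few lines of Python (a toy model of the arithmetic, not a Teradata API):

```python
# Toy model of the CREATE ... FROM / GIVE / DROP space accounting above.
# F and E start with MAXPERM 10 each, as in the example.

def create_from(owner, perm, maxperm):
    """CREATE DATABASE X FROM owner AS PERM = n: the perm limit
    comes out of the owning database's unused perm space."""
    assert maxperm[owner] >= perm, "owner lacks enough perm space"
    maxperm[owner] -= perm
    return perm

def drop_into(owner, perm, maxperm):
    """DROP DATABASE X: X's perm limit is credited to its immediate owner."""
    maxperm[owner] += perm

maxperm = {"F": 10, "E": 10}
x_perm = create_from("F", 5, maxperm)  # F's MAXPERM: 10 -> 5
# GIVE X TO E: X (and its 5 units of perm) now hangs under E
drop_into("E", x_perm, maxperm)        # dropping X credits E: 10 -> 15
print(maxperm)  # {'F': 5, 'E': 15}
```

Spool and temp limits are deliberately left out of the model, since GIVE and DROP leave them unchanged.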