I have gathered this list over time while working at different client sites. It is not necessarily my list. But it definitely is a good list. If you don’t agree with anything below, don’t follow it. Anyways, here it goes….
INFORMATICA BASIC TUNING GUIDELINES
The following points are high-level issues on where to go to perform “tuning” in Informatica’s products. These are, in no way, permanent problem solvers, nor are they the end-all solution. Just some items (which if tuned first) might make a difference. The level of skill available for certain items will cause the results to vary.
To ‘test’ performance throughput it is generally recommended that the source set of data produce about 200,000 rows to process. Beyond this – the performance problems / issues may lie in the database – partitioning tables, dropping / re-creating indexes, striping raid arrays, etc…  Without such a large set of results to deal with, your average timings will be skewed by other users on the database, processes on the server, or network traffic.  This seems to be an ideal test size set for producing mostly accurate averages.
Try tuning your maps with these steps first.  Then move to tuning the session, iterate this sequence until you are happy, or cannot achieve better performance by continued efforts.  If the performance is still not acceptable, then the architecture must be tuned which can mean changes to what maps are created).  In this case, you can contact us – we tune the architecture and the whole system from top to bottom.
KEEP THIS IN MIND: In order to achieve optimal performance, it’s always a good idea to strike a balance between the tools, the database, and the hardware resources.  Allow each to do what they do best.  Varying the architecture can make a huge difference in speed and optimization possibilities.
Utilize a database for significant data handling operations (such as sorts, groups, aggregates).  In other words, staging tables can be a huge benefit to parallelism of operations.  In parallel design – simply defined by mathematics, nearly always cuts your execution time.
 
Localize. Localize all target tables on to the SAME instance of Oracle (same SID), or same instance of Sybase. Try not to use Synonyms (remote database links) for anything (including: lookups, stored procedures, target tables, sources, functions, privileges, etc…). Utilizing remote links will most certainly slow things down.  For Sybase users, remote mounting of databases can definitely be a hindrance to performance.
If you can – localize all target tables, stored procedures, functions, views, and sequences in the SOURCE database. Again, try not to connect across synonyms. Synonyms (or remote database tables) could potentially affect performance by as much as a factor of 3 times or more.
Remember that Informatica suggests that each session takes roughly 1 to 1 1/2 CPU’s. In keeping with this – Informatica play’s well with RDBMS engines on the same machine, but does NOT get along (performance wise) with ANY other engine (reporting engine, java engine, OLAP engine, java virtual machine, etc…)
Remove any database based sequence generators. This requires a wrapper function / stored procedure call. Utilizing these stored procedures can cause performance to drop by a factor of 3 times. This slowness is not easily debugged – it can only be spotted in the Write Throughput column. Copy the map, replace the stored proc call with an internal sequence generator for a test run – this is how fast you COULD run your map. If you must use a database generated sequence number, use them as part of the source qualifier query. If you’re dealing with GIG’s or Terabytes of information – this should save you lot’s of hours tuning.
TURN OFF VERBOSE LOGGING. The session log has a tremendous impact on the overall performance of the map. Force over-ride in the session, setting it to NORMAL logging mode.  Unfortunately the logging mechanism is not “parallel” in the internal core, it is embedded directly in to the operations.
Turn off ‘collect performance statistics’. This also has an impact – although minimal at times – it writes a series of performance data to the performance log. Removing this operation reduces reliance on the flat file operations.  However, it may be necessary to have this turned on DURING your tuning exercise.  It can reveal a lot about the speed of the reader, and writer threads.
Try to eliminate the use of non-cached lookups. By issuing a non-cached lookup, your performance will be impacted significantly. Particularly if the lookup table is also a “growing” or “updated” target table – this generally means the indexes are changing during operation, and the optimizer looses track of the index statistics. Again – utilize staging tables if possible.  In utilizing staging tables, views in the database can be built which join the data together; or Informatica’s joiner object can be used to join data together – either one will help dramatically increase speed. The thing with Informatica’s Joiner Transformation is that it needs caching of the data to scan through it to find matches. In such cases, make sure you tweak the cache sizes. Unless your source tables are from different RDBMS, let the RDBMS join them.
Separate complex maps – try to break the maps out in to logical threaded sections of processing. Re-arrange the architecture if necessary to allow for parallel processing. There may be more smaller components doing individual tasks, however the throughput will be proportionate to the degree of parallelism that is applied.
Balance between Informatica and the power of SQL and the database. Try to utilize the DBMS for what it was built for reading / writing / sorting / grouping / filtering data en-masse. Use Informatica for the more complex logic, outside joins, data integration, multiple source feeds, etc…  The balancing act is difficult without DBA knowledge. In order to achieve a balance, you must be able to recognize what operations are best in the database, and which ones are best in Informatica.  This does not degrade from the use of the ETL tool, rather it enhances it – it’s a MUST if you are performance tuning for high-volume throughput.
TUNE the DATABASE. Don’t be afraid to estimate: small, medium, large, and extra large source data set sizes (in terms of: numbers of rows, average number of bytes per row), expected throughput for each, turnaround time for load, is it a trickle feed? Give this information to your DBAs and ask them to tune the database for “worst case”. Help them assess which tables are expected to be high read/high write, which operations will sort (order by) etc. Moving disks, assigning the right table to the right disk space could make all the difference.  Utilize a PERL script to generate “fake” data for small, medium, large, and extra large data sets.  Run each of these through your mappings – in this manner, the DBA can watch or monitor throughput as a real load size occurs.
Be sure there is enough SWAP, and TEMP space on your PMSERVER machine. Not having enough disk space could potentially slow down your entire server during processing (in an exponential fashion).  Sometimes this means watching the disk space while your session runs.  Otherwise you may not get a good picture of the space available during operation. Particularly if your maps contain aggregates, or lookups that flow to disk Cache directory or if you have a JOINER object with heterogeneous sources.
Place some good server load monitoring tools on your PMServer in development – watch it closely to understand how the resources are being utilized, and where the hot spots are. Try to follow the recommendations – it may mean upgrading the hardware to achieve throughput.
 
TWEAK SESSION SETTINGS. In the session, there is only so much tuning you can do.  Balancing the throughput is important – by turning on “Collect Performance Statistics” you can get a good feel for what needs to be set in the session – or what needs to be changed in the database. Read the performance section carefully in the Informatica manuals. Basically what you should try to achieve is: OPTIMAL READ, OPTIMIAL THROUGHPUT, and OPTIMAL WRITE.  Over-tuning one of these three pieces can result in ultimately slowing down your session.
Try to merge expression objects, set your lookups to unconnected (for re-use if possible), check your Index and Data cache settings if you have aggregation, or lookups being performed.  Etc…  If you have a slow writer, change the map to a single target table at a time – see which target is causing the “slowness” and tune it.  Make copies of the original map, and break down the copies.  Once the “slower” of the N targets is discovered, talk to your DBA about partitioning the table, updating statistics, removing indexes during load, etc…  There are many database things you can do here.
Remove all other “applications” on the PMServer.  Except for the database / staging database or Data Warehouse itself.  PMServer plays well with RDBMS (relational database management system) – but doesn’t play well with application servers, particularly JAVA Virtual Machines, Web Servers, Security Servers, application, and Report servers.  All of these items should be broken out to other machines. This is critical to improving performance on the PMServer machine.
Hope this helps.
Kirtan Desai
Updates
Dan Linstedt from Myers-Holum Inc made a few comments a few days ago on this post. I have added his comments below.
If you use a joiner, always use a sorted joiner – the cost of sorting data coming from the RDBMS staging area is frequently less than the cost of building the caches in place. Furthermore, if you are NOT on 64 bit Informatica, or you don’t have unlimited (seemingly unlimited) RAM, you have an upper limit to the caching mechanisms in all the cached objects – including Joiner.
If the RDBMS is NOT tuned properly, putting more work into the RDBMS will actually slow things down.
Too many Instances of an RDBMS on the SAME MACHINE will actually kill performance rather than help it, and it really is not necessary in order to handle fail-over as many set it up to do.
Replacing a Lookup with a SORTED JOINER can improve performance dramatically.
The manuals have the formula wrong for the Data / Index Cache settings, even though it’s counter-intuitive, you want 100% of the INDEX cached if you can get it, giving up Data Cache for disk. Why? Because if you can’t access RAM to check for the existence of data you are actually increasing I/O – when you increase I/O you slow performance dramatically. Set your Index caches to twice the size of your data caches to be safe.
There are many variables which can change the course of one’s actions. Feel free to ask questions on this blog.