Powered By Blogger

Saturday, March 5, 2011

Faqs on DW - Page8

What are modeling tools available in the Market?

Here are a number of data modeling tools

Tool Name Company Name
Erwin Computer Associates
Embarcadero Technologies
Rational Rose IBM Corporation
Power Designer Sybase Corporation
Oracle Designer Oracle Corporation

Name some of modeling tools available in the Market?

These tools are used for Data/dimension modeling

1. Oracle Designer
2. Erwin (Entity Relationship for windows)
3. Informatica (Cubes/Dimensions)
4. Embarcadero
5. Power Designer Sybase

How do you load the time dimension?

Time dimensions are usually loaded by a program that loops through all possible dates that may appear in the data. It is not unusual for 100 years to be represented in a time dimension, with one row per day.

Explain the advantages of RAID 1, 1/0, and 5? What type of RAID setup would you put your TX logs?

Transaction logs write sequentially and don’t need to be read at all. The ideal is to have each on RAID 1/0 because it has much better write performance than RAID 5.

RAID 1 is also better for TX logs and costs less than 1/0 to implement. It has a tad less reliability and performance is a little worse generally speaking.

RAID 5 is best for data generally because of cost and the fact it provides great read capability.

What Snow Flake Schema?

Snowflake Schema, each dimension has a primary dimension table, to which one or more additional dimensions can join. The primary dimension table is the only table that can join to the fact table.

What is real time data-warehousing?

Real-time data warehousing is a combination of two things: 1) real-time activity and 2) data warehousing. Real-time activity is activity that is happening right now. The activity could be anything such as the sale of widgets. Once the activity is complete, there is data about it.

Data warehousing captures business activity data. Real-time data warehousing captures business activity data as it occurs. As soon as the business activity is complete and there is data about it, the completed activity data flows into the data warehouse and becomes available instantly. In other words, real-time data warehousing is a framework for deriving information from data as the data becomes available.

What are slowly changing dimensions?

SCD stands for slowly changing dimensions. Slowly changing dimensions are of three types

SCD1: only maintained updated values. Ex: a customer address modified we update existing record with new address.

SCD2: maintaining historical information and current information by using
A) Effective Date
B) Versions
C) Flags
or combination of these
scd3: by adding new columns to target table we maintain historical information and current information

What are Semi-additive and factless facts and in which scenario will you use such kinds of fact tables?

Snapshot facts are semi-additive, while we maintain aggregated facts we go for semi-additive.

EX: Average daily balance

A fact table without numeric fact columns is called factless fact table.

Ex: Promotion Facts

While maintain the promotion values of the transaction (ex: product samples) because this table doesn’t contain any measures

No comments: