Posts

Showing posts from December, 2016

Difference between ETL vs. ELT

  Process
    ETL: Data is transferred to the ETL server and moved back to the DB. High network bandwidth is required.
    ELT: Data remains in the DB except for cross-database loads (e.g. source to object).

  Transformation
    ETL: Transformations are performed in the ETL server.
    ELT: Transformations are performed (in the source or) in the target.

  Code Usage
    ETL: Typically used for source-to-target transfer, compute-intensive transformations, and small amounts of data.
    ELT: Typically used for high amounts of data.

  Time-Maintenance
    ETL: Needs high maintenance, as you need to select the data to load and transform.
    ELT: Low maintenance, as data is always available.

  Calculations
    ETL: Overwrites the existing column, or the dataset needs to be appended and pushed to the target platform.
    ELT: Easily adds the calculated column to the existing table.

  Analysis
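The "Transformation" row above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in target database; the table names, the sample rows, and the cents-to-dollars conversion are all assumptions made up for illustration, not part of the original comparison.

```python
import sqlite3

# Hypothetical source rows: (order_id, amount_in_cents).
SOURCE_ROWS = [(1, 1250), (2, 400), (3, 9999)]


def run_etl(rows):
    """ETL: transform outside the database, then load only the result."""
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE orders (order_id INTEGER, amount_usd REAL)")
    # The transform step runs in the application (playing the "ETL server").
    transformed = [(order_id, cents / 100.0) for order_id, cents in rows]
    target.executemany("INSERT INTO orders VALUES (?, ?)", transformed)
    return target.execute(
        "SELECT order_id, amount_usd FROM orders ORDER BY order_id"
    ).fetchall()


def run_elt(rows):
    """ELT: load the raw data first, then transform inside the target."""
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, amount_cents INTEGER)"
    )
    target.executemany("INSERT INTO raw_orders VALUES (?, ?)", rows)
    # The transform step is SQL run by the target engine; raw rows stay available.
    target.execute(
        "CREATE TABLE orders AS "
        "SELECT order_id, amount_cents / 100.0 AS amount_usd FROM raw_orders"
    )
    return target.execute(
        "SELECT order_id, amount_usd FROM orders ORDER BY order_id"
    ).fetchall()


etl_result = run_etl(SOURCE_ROWS)
elt_result = run_elt(SOURCE_ROWS)
```

Both functions end with the same `orders` table. The difference is where the conversion runs and whether the raw data is still in the target afterwards.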

What is ETL

  ETL stands for Extract, Transform and Load. In day-to-day work at different organizations we come across scenarios where we have to extract data from different sources such as Excel, text files, XML, database systems, etc., then clean or transform that data into the required format and save it to destinations such as a database system or different file types. Several tools are available on the market to perform ETL tasks. Some of them are listed below: SQL Server Integration Services, IBM InfoSphere Information Server, Informatica PowerCenter, Talend Studio for Data Integration, Oracle Data Integrator (ODI), Clover ETL, Centerprise, Pentaho Data Integration. SQL Server Integration Services (SSIS) is an ETL tool that can be used to extract data from different types of sources by using the sources (OLE DB Source, Flat File Source, Excel Source, etc.) that are built into the tool. To transform data, SSIS provides different ty...

Relational Data Modeling vs Dimensional Data Modeling

  Dimensional Modeling (DM) is a data structure technique optimized for data storage in a data warehouse. The purpose of dimensional modeling is to optimize the database for faster retrieval of data. The concept of dimensional modeling was developed by Ralph Kimball and consists of "fact" and "dimension" tables. A dimensional model in a data warehouse is designed to read, summarize, and analyze numeric information such as values, balances, counts, weights, etc. In contrast, relational models are optimized for the addition, updating, and deletion of data in a real-time Online Transaction System. Dimensional and relational models each store data in their own way, with specific advantages. For instance, in the relational model, normalization and ER models reduce redundancy in data. By contrast, a dimensional model in a data warehouse arranges data so that it is easier to retrieve information and generate reports. Hence, Dimensio...
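As a rough illustration of the "fact" and "dimension" tables described above, the sketch below builds a tiny star schema with Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample rows are hypothetical and chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold descriptive attributes.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);

-- The fact table holds numeric measures plus keys into each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    quantity INTEGER,
    amount REAL
);

INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Toys');
INSERT INTO dim_date VALUES (10, 2015), (11, 2016);
INSERT INTO fact_sales VALUES
    (1, 10, 2, 20.0), (1, 11, 1, 10.0), (2, 11, 3, 45.0);
""")

# A typical read-oriented query: summarize measures by a dimension attribute.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.quantity), SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

The reporting query needs only one join per dimension, which is the retrieval-friendly layout the excerpt attributes to dimensional models.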

Explain OLTP vs OLAP

  Difference between OLTP and OLAP

  OLTP System
  An OLTP system handles operational data, that is, the data involved in the day-to-day operation of a particular system. Examples include ATM transactions and bank transactions.

  OLAP System
  An OLAP system handles historical or archival data, that is, data archived over a long period. For example, if we collect the last 10 years of flight reservation information, the data can reveal meaningful patterns such as trends in reservations. This may provide useful information such as peak travel times and what kind of people travel in the various classes (Economy/Business).

  The major difference between an OLTP and an OLAP system is the amount of data analyzed in a single transaction. An OLTP system manages many concurrent customers and queries, each touching only an individual record or limited groups of files at a time. An OLAP system must have the capability to operate on millions of files to answer a single q...
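The contrast between the two workloads can be sketched with Python's built-in `sqlite3` module, reusing the flight reservation example from the text. The `reservations` table, its columns, and the four sample rows are assumptions for illustration; a real OLAP system would scan far more data, usually in a separate store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reservations ("
    "id INTEGER PRIMARY KEY, year INTEGER, travel_class TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO reservations (id, year, travel_class, status) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 2015, "Economy", "booked"),
        (2, 2015, "Business", "booked"),
        (3, 2016, "Economy", "booked"),
        (4, 2016, "Economy", "booked"),
    ],
)

# OLTP-style operation: a short transaction that touches a single record.
with conn:
    conn.execute(
        "UPDATE reservations SET status = 'cancelled' WHERE id = ?", (3,)
    )
oltp_row = conn.execute(
    "SELECT status FROM reservations WHERE id = 3"
).fetchone()

# OLAP-style query: scan every row to summarize reservations per year and class.
trend = conn.execute(
    "SELECT year, travel_class, COUNT(*) FROM reservations "
    "GROUP BY year, travel_class ORDER BY year, travel_class"
).fetchall()
```

The update reads and writes one row by primary key, while the trend query has to visit the whole table to answer a single question.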

What is Datamart?

A Data Mart is a subset of an organizational data store, generally oriented to a specific purpose or primary data subject, which may be distributed to meet business needs. Data marts are analytical data stores designed to focus on particular business functions for a specific community within an organization. Data marts are derived from subsets of data in a data warehouse, though in the bottom-up data warehouse design methodology, the data warehouse is created from the union of organizational data marts. The fundamental use of a data mart is in Business Intelligence (BI) applications. BI is used to gather, store, access, and analyze data. A data mart can be used by smaller businesses to make use of the data they have accumulated, since it is less expensive than implementing a data warehouse. Reasons for creating a data mart: creates collective data by a group of users; easy access to frequently needed data; ease of creation; improves end-user response time; Lowe...
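The idea of a data mart as a subject-specific subset of a warehouse can be sketched as follows, using Python's built-in `sqlite3` module. The `warehouse_sales` table, the `marketing_mart` table, and the sample rows are hypothetical. In practice a mart is often a separate database, so treat this as a simplified, single-database illustration.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE warehouse_sales (region TEXT, department TEXT, amount REAL)"
)
warehouse.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("EU", "Marketing", 100.0),
        ("US", "Marketing", 250.0),
        ("EU", "Finance", 75.0),
    ],
)

# Derive the mart: keep only the rows and columns one business function needs.
warehouse.execute(
    "CREATE TABLE marketing_mart AS "
    "SELECT region, amount FROM warehouse_sales "
    "WHERE department = 'Marketing'"
)

mart_rows = warehouse.execute(
    "SELECT region, amount FROM marketing_mart ORDER BY region"
).fetchall()
```

Users of the mart query a smaller table that already matches their subject area, which is one way the improved end-user response time mentioned above can come about.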

What is Data Warehouse?

  A Data Warehouse (DW) is a relational database that is designed for query and analysis rather than transaction processing. It includes historical data derived from transaction data from single and multiple sources. A Data Warehouse provides integrated, enterprise-wide, historical data and focuses on providing support for decision-makers for data modeling and analysis. It is not used for daily operations and transaction processing but for making decisions. A Data Warehouse can be viewed as a data system with the following attributes: it is a database designed for investigative tasks, using data from various applications; it supports a relatively small number of clients with relatively long interactions; it includes current and historical data to provide a historical perspective of information; its usage is read-intensive; it contains a few large tables. "Data Warehouse is a subject-oriented, integrated, time-variant and non-volatile store of informa...