DATA WAREHOUSE OPERATIONS



Data Warehouse Operations : Introduction

  • Data arriving in a data warehouse is often unstructured and contains a lot of unwanted, dirty data. Dirty data is data that is incomplete or noisy, that is, data containing errors.
  • To make this data structured and noise-free, the dirty data must be removed or corrected. This turns the raw data into useful information and is achieved through a set of data warehouse operations: the ETL (Extraction, Transformation, Loading) operations together with data cleaning and data refresh.

Figure: Data Warehouse Operations : Types. The operations are categorized by their functionality.
 

1. Data Cleaning

  • In data cleaning, inconsistencies are removed and noisy data containing errors is rectified.
  • For example: removing redundant (duplicate) data.
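The cleaning step described above can be illustrated with a minimal Python sketch. The function name `clean_records`, the sample rows, and the specific rules (trim whitespace, drop rows with a missing field, drop exact duplicates) are assumptions made for illustration; real cleaning rules depend on the data.

```python
def clean_records(records):
    """Return records with noise trimmed, incomplete rows and duplicates removed."""
    seen = set()
    cleaned = []
    for rec in records:
        # Noisy data: trim stray whitespace from string fields.
        rec = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        # Incomplete data: skip rows that have a missing field.
        if any(v is None or v == "" for v in rec.values()):
            continue
        # Redundant data: skip rows that were already seen.
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Hypothetical sample rows: one duplicate (after trimming) and one incomplete row.
rows = [
    {"id": 1, "city": "Pune "},
    {"id": 1, "city": "Pune"},
    {"id": 2, "city": None},
    {"id": 3, "city": "Delhi"},
]
print(clean_records(rows))
```

Only the first occurrence of each row is kept, so the order of the input is preserved.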

2. Data Refresh

  • In the data refresh operation, the data in the data warehouse is refreshed by propagating changes from multiple sources and updating the warehouse on a regular schedule. This is needed because the data in the source databases changes frequently, and refreshing is how those changes reach the data warehouse.

Figure: Data Refresh : Data Warehouse Operations. The refresh operation reloads the data warehouse with updated data.
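One common way to implement a refresh is to copy only the rows that changed since the previous run. The sketch below assumes each source row carries an `updated_at` value and a unique `id`; the function `refresh`, the sources, and the field names are hypothetical, and a real warehouse may use a different change-detection method.

```python
def refresh(warehouse, sources, last_refresh):
    """Copy rows changed since last_refresh from every source into the warehouse.

    Returns the newest updated_at value seen, to be used as the next watermark.
    """
    newest = last_refresh
    for source in sources:
        for row in source:
            if row["updated_at"] > last_refresh:
                warehouse[row["id"]] = row  # insert a new row or overwrite a stale one
                newest = max(newest, row["updated_at"])
    return newest


# Hypothetical sources; updated_at is a plain integer timestamp for simplicity.
sales_db = [
    {"id": "s1", "amount": 10, "updated_at": 5},
    {"id": "s2", "amount": 20, "updated_at": 12},
]
crm_db = [{"id": "c1", "name": "Asha", "updated_at": 15}]

warehouse = {"s1": {"id": "s1", "amount": 10, "updated_at": 5}}
watermark = refresh(warehouse, [sales_db, crm_db], last_refresh=10)
print(watermark, sorted(warehouse))
```

Running the refresh on a schedule and passing the returned watermark into the next run keeps the warehouse in step with its sources without reloading unchanged rows.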
 

3. Extraction of Data

  • Data obtained after cleaning and refresh is still unstructured and unorganized. The data extraction process organizes it and lets users retrieve the relevant data, which is helpful when a user wants to mine the data.
  • Data extraction can be classified on the basis of its logical and physical implementation:
Figure: Data Extraction : Data Warehouse Operations. Data extraction in a data warehouse, categorized by logical and physical implementation.
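As a rough illustration of retrieving only the relevant data, the sketch below queries an in-memory SQLite database with Python's standard `sqlite3` module. The `orders` table, its columns, and the `extract` function are assumptions made for this example and are not part of the tutorial.

```python
import sqlite3

# Hypothetical source table with a few sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "north", 120.0), (2, "south", 80.0), (3, "north", 45.5)],
)


def extract(conn, region):
    """Return (id, amount) pairs for one region, ordered by id."""
    cur = conn.execute(
        "SELECT id, amount FROM orders WHERE region = ? ORDER BY id",
        (region,),
    )
    return cur.fetchall()


north = extract(conn, "north")
print(north)
```

The query parameter (`?`) keeps the filter value separate from the SQL text, which avoids quoting problems when the value comes from a user.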
 

4. Transformation of Data

  • Data obtained from heterogeneous databases keeps the native structure of its source database, which may differ from the structure of the data warehouse. Transformation therefore reorganizes the data from these heterogeneous databases into the structure used by the data warehouse.
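A small sketch of this idea: two sources with different native layouts are mapped onto one common warehouse layout. The source field names (`CustID`, `FullName`, `Joined`, and so on), the target fields, and both functions are hypothetical and chosen only to show the mapping.

```python
from datetime import datetime


def from_crm(row):
    """Map a row from a hypothetical CRM source to the warehouse layout."""
    return {
        "customer_id": int(row["CustID"]),
        "name": row["FullName"].title(),
        # The CRM stores dates as DD/MM/YYYY; the warehouse uses ISO dates.
        "signup_date": datetime.strptime(row["Joined"], "%d/%m/%Y").date().isoformat(),
    }


def from_billing(row):
    """Map a row from a hypothetical billing source to the warehouse layout."""
    return {
        "customer_id": row["customer"],
        "name": f'{row["first"]} {row["last"]}'.title(),
        # Billing stores an ISO timestamp; keep only the date part.
        "signup_date": row["created"][:10],
    }


crm_row = {"CustID": "7", "FullName": "asha rao", "Joined": "03/02/2021"}
billing_row = {"customer": 8, "first": "ravi", "last": "iyer", "created": "2020-11-30T09:15:00"}

unified = [from_crm(crm_row), from_billing(billing_row)]
print(unified)
```

After transformation every row has the same keys and the same date format, regardless of which source it came from.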

5. Data Loading

  • Data loading writes the data to its target data repository, which may be a database, a data mart, a data warehouse, etc.

Figure: Data Loading : Data Warehouse Operations. The data loading operation used in a data warehouse.
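Loading can be sketched with Python's standard `sqlite3` module standing in for the target repository. The `customers` table, its columns, and the `load` function are assumptions for illustration; an actual warehouse would typically use its own bulk-loading tools.

```python
import sqlite3


def load(conn, rows):
    """Write rows into the target table and return the total row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "customer_id INTEGER PRIMARY KEY, name TEXT, signup_date TEXT)"
    )
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO customers "
            "VALUES (:customer_id, :name, :signup_date)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]


# Hypothetical rows that are already cleaned and transformed.
rows = [
    {"customer_id": 7, "name": "Asha Rao", "signup_date": "2021-02-03"},
    {"customer_id": 8, "name": "Ravi Iyer", "signup_date": "2020-11-30"},
]

target = sqlite3.connect(":memory:")
count = load(target, rows)
print(count)
```

Because the sketch uses `INSERT OR REPLACE` on the primary key, loading the same rows again updates them in place instead of creating duplicates.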