When an IT company wants to build a new data warehouse, data profiling is required to understand the quality of the source data. Explain data profiling in the data warehouse.

–>Data profiling is the process of examining the existing information in a data source and collecting information about it. The task of data profiling is to find the problems present in the data.
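As a rough illustration of "collecting information about" a source, the sketch below computes a few basic statistics for one column. The function name `profile_column` and the sample `customers` rows are hypothetical, chosen only for this example; a real profiling tool would report far more.

```python
from collections import Counter

def profile_column(rows, column):
    """Return simple statistics about one column of a list-of-dicts table."""
    values = [row.get(column) for row in rows]
    # Treat None and the empty string as missing (an assumption of this sketch).
    non_null = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "most_common": Counter(non_null).most_common(1),
    }

# Hypothetical sample data standing in for a source table.
customers = [
    {"id": 1, "country": "IN"},
    {"id": 2, "country": "US"},
    {"id": 3, "country": ""},
    {"id": 4, "country": "IN"},
]

print(profile_column(customers, "country"))
```

Even this small summary exposes a problem in the sample: one of the four rows has no country.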

–>Data is said to be of high quality when:

  •  It is correct and consistent.
  •  It has only one meaning, with no ambiguity.
  •  There is only one way to convey its meaning.
  •  It has no null values.
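Two of the criteria above, "only one way to convey its meaning" and "no null values", can be checked mechanically. The following is a minimal sketch, assuming values are strings or `None`; the helper names and the sample `genders` list are made up for illustration. It only catches variants that differ in case or surrounding whitespace, so codings such as "M" versus "Male" would need an explicit mapping.

```python
from collections import defaultdict

def find_multiple_codings(values):
    """Group values that differ only in case or surrounding whitespace."""
    groups = defaultdict(set)
    for value in values:
        if value is None:
            continue
        groups[value.strip().lower()].add(value)
    # Keep only the normalized keys that appear in more than one raw form.
    return {key: sorted(raw) for key, raw in groups.items() if len(raw) > 1}

def count_nulls(values):
    """Count entries that are None or empty after stripping whitespace."""
    return sum(1 for v in values if v is None or v.strip() == "")

# Hypothetical column with inconsistent codings and missing entries.
genders = ["Male", "male", "MALE ", "Female", None, ""]

print(find_multiple_codings(genders))
print(count_nulls(genders))
```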


=> A data warehouse is the electronic storage of a large amount of information.

=> A data warehouse stores data, and data profiling checks the quality of that data, so the two are interlinked.

Data profiling in data warehouse includes the following steps:

1) Create the project initiation document, which states the expectations and the requirements, so that everyone involved is aware of them.

2) Choose either an analytical or a statistical tool to assess the quality and structure of the data.

3) Analyze the sources of data.

4) Determine the scope of your data.

5) Examine the different data patterns and formats.

6) Look for null values, missing values, multiple coding, etc. in the source.

7) Check the relationships between foreign keys and primary keys for data extraction.

8) Analyze the business rules.
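Steps 5 and 7 above can be sketched in a few lines. This is only an illustration under simple assumptions: tables are lists of dicts, and the names `value_pattern`, `pattern_counts`, `orphaned_keys`, and the sample `phones`, `orders`, and `parents` data are all hypothetical. The pattern check maps every digit to `9` and every ASCII letter to `A` so that differently formatted values stand out; the key check lists foreign-key values that have no matching primary key.

```python
import re
from collections import Counter

def value_pattern(value):
    """Map digits to 9 and ASCII letters to A, keeping other characters as-is."""
    pattern = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", pattern)

def pattern_counts(values):
    """Count how often each format pattern occurs in a column."""
    return Counter(value_pattern(v) for v in values)

def orphaned_keys(child_rows, fk_column, parent_rows, pk_column):
    """Return sorted foreign-key values with no matching primary key."""
    parent_keys = {row[pk_column] for row in parent_rows}
    return sorted(
        {row[fk_column] for row in child_rows if row[fk_column] not in parent_keys}
    )

# Hypothetical source data.
phones = ["555-1234", "5551234", "555-9876"]
parents = [{"id": 1}, {"id": 2}, {"id": 3}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},
    {"order_id": 12, "customer_id": 9},
]

print(pattern_counts(phones))
print(orphaned_keys(orders, "customer_id", parents, "id"))
```

In the sample, two phone formats are in use, and order 12 refers to a customer that does not exist in the parent table.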

