Posted on - 06/28/2011
Hi guys, M.Smith here. I am preparing my MCS notes and have run into a few problems during my studies that I want to share with all of you; maybe someone can help me. What is data cleaning? How can we do it? My friend sent me an answer, but I am not satisfied with it, which is why I am sharing my problem with all of you. Please help me solve it as soon as possible. Thanks in advance.
How to properly perform data cleaning
Hi, my name is John Sena.
Data cleaning, also known as data scrubbing, is the process of making sure a set of data is correct, accurate, and consistent. It can be applied to a single set of records or to multiple data sets that need to be merged, in which case data integration is checked as well.

Data cleaning is performed by reading all the records in a set and verifying their accuracy:
- Typing and spelling errors are corrected.
- Mislabeled data, if there is any, is relabeled and filed correctly.
- Incomplete or missing entries are completed.
- Unrecoverable records are purged so that they do not take up space or slow down operations.

Methods:
- Parsing: used to detect syntax errors.
- Data transformation: ensures that the input data matches the expected format.
- Duplicate elimination: gets rid of duplicate entries.
- Statistical methods: values such as the mean, standard deviation, and range, or clustering algorithms, are used to find erroneous data.
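To make the four methods more concrete, here is a minimal Python sketch that uses only the standard library. Everything specific in it is an assumption made for illustration: the record layout ("name, date, amount"), the function names parse_record and clean, and the z_limit threshold are hypothetical, and a z-score test is only one of several possible statistical checks.

```python
import statistics
from datetime import datetime


def parse_record(raw):
    """Parsing: return (name, date, amount), or None on a syntax error.

    Assumes a hypothetical 'name, YYYY-MM-DD, amount' record layout.
    """
    parts = [p.strip() for p in raw.split(",")]
    if len(parts) != 3:
        return None
    name, date_text, amount_text = parts
    try:
        date = datetime.strptime(date_text, "%Y-%m-%d").date()
        amount = float(amount_text)
    except ValueError:
        return None
    # Data transformation: bring the name into one expected format
    # (single spaces, title case).
    return (" ".join(name.split()).title(), date, amount)


def clean(raw_records, z_limit=1.5):
    """Return (kept, suspects) after parsing, de-duplicating and a z-score check."""
    parsed = [parse_record(r) for r in raw_records]
    valid = [p for p in parsed if p is not None]

    # Duplicate elimination: dict keys keep the first occurrence, in order.
    unique = list(dict.fromkeys(valid))

    # Statistical method: flag amounts far from the mean (hypothetical threshold).
    amounts = [r[2] for r in unique]
    if len(amounts) < 2:
        return unique, []
    mean = statistics.mean(amounts)
    stdev = statistics.stdev(amounts)
    if stdev == 0:
        return unique, []
    suspects = [r for r in unique if abs(r[2] - mean) / stdev > z_limit]
    kept = [r for r in unique if r not in suspects]
    return kept, suspects


# Made-up sample data, for illustration only.
records = [
    "alice  smith, 2011-06-01, 100.0",
    "Alice Smith, 2011-06-01, 100.0",   # duplicate once the name is normalised
    "bob jones, 2011-06-02, 105",
    "carol white, 2011-06-03, 98",
    "dave black, 2011/06/04, 101",      # wrong date format: fails parsing
    "erin green, 2011-06-05, 102",
    "frank gray, 2011-06-06, 99",
    "gina blue, 2011-06-07, 5000",      # far from the other amounts
    "bad line without fields",          # fails parsing
]

kept, suspects = clean(records)
print("kept:", kept)
print("suspects:", suspects)
```

In this sketch, flagged records are returned separately instead of being deleted, so a person can decide whether each one is really an error or just an unusual but valid value.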