Dirty data is data that contains errors. It has many causes: misspelled words typed into a report, a field linked to the wrong record, outdated data, inaccurately recorded data, and so on. Dirty data can cause minor problems, but it can also cause serious problems that cost money, especially for large companies. If a company's system is not working properly, the company cannot be effective. If the computer shows 500 units of an expensive product and a physical count shows 420, the company has dirty inventory data, which could create losses for the company. Even worse, passing incorrect information to trading partners leaves a bad impression. Dirty data needs to be found and stopped before it causes further damage, and the program or process that introduces the error needs to be identified.
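The inventory example can be illustrated with a short Python sketch that compares the quantities recorded in the system with a physical count and reports every mismatch. The product codes and the function name are hypothetical; only the 500 and 420 figures come from the example above.

```python
def find_inventory_discrepancies(recorded, counted):
    """Return {sku: (recorded_qty, counted_qty, difference)} for every mismatch."""
    discrepancies = {}
    # Look at every product that appears in either source.
    for sku in sorted(set(recorded) | set(counted)):
        system_qty = recorded.get(sku, 0)
        physical_qty = counted.get(sku, 0)
        if system_qty != physical_qty:
            discrepancies[sku] = (system_qty, physical_qty, physical_qty - system_qty)
    return discrepancies


# Hypothetical product codes; the quantities for WIDGET-A mirror the example.
recorded = {"WIDGET-A": 500, "WIDGET-B": 75}
counted = {"WIDGET-A": 420, "WIDGET-B": 75}

report = find_inventory_discrepancies(recorded, counted)
print(report)  # {'WIDGET-A': (500, 420, -80)}
```

A report like this only shows that the data is dirty. Finding out which program or process introduced the error still requires tracing where the recorded figure came from.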
Cleaning Dirty Data
Cleaning dirty data is an involved process. According to Ralph Kimball, the first step is to order the data, or "elementize" it, breaking it into its component parts. After elementizing, the next task for the "data cleaners" is to standardize the elements. Next, the data is verified to ensure that it is accurate. The data can then be matched and "householded", meaning that records are matched against other information and those that share the same household are grouped together. Finally, the results of the above steps are documented. All information for this topic was retrieved from http://www.dbmsmag.com/9609d14.html on March 20, 2007.
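The steps above can be sketched in Python. This is a minimal illustration, not Kimball's own method: the semicolon-separated input format, the abbreviation table, the five-digit postal code rule, and all function names are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table used for standardization.
STREET_ABBREVIATIONS = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}


def elementize(raw):
    """Split 'First Last; Street; City; ZIP' into named elements."""
    name, street, city, zip_code = [part.strip() for part in raw.split(";")]
    first, last = name.split(maxsplit=1)
    return {"first": first, "last": last, "street": street,
            "city": city, "zip": zip_code}


def standardize(record):
    """Upper-case, collapse whitespace, and abbreviate street words."""
    cleaned = {key: " ".join(value.upper().split()) for key, value in record.items()}
    words = [word.rstrip(".") for word in cleaned["street"].split()]
    cleaned["street"] = " ".join(STREET_ABBREVIATIONS.get(w, w) for w in words)
    return cleaned


def verify(record):
    """Accept a record only if every field is filled and the ZIP has five digits."""
    return all(record.values()) and re.fullmatch(r"\d{5}", record["zip"]) is not None


def household(records):
    """Group first names that share a surname, street, and ZIP."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["last"], record["street"], record["zip"])].append(record["first"])
    return dict(groups)


# Made-up sample rows for illustration only.
raw_rows = [
    "john smith; 12 Main Street; Springfield; 12345",
    "Jane  Smith; 12 main st.; Springfield; 12345",
    "Bob Jones; 9 Oak Avenue; Shelbyville; 1234",
]

standardized = [standardize(elementize(row)) for row in raw_rows]
verified = [record for record in standardized if verify(record)]
rejected = [record for record in standardized if not verify(record)]
households = household(verified)

# Document the outcome of each step.
log = {"input": len(raw_rows), "verified": len(verified),
       "rejected": len(rejected), "households": len(households)}
print(households)
print(log)
```

In this sketch the two Smith rows only land in the same household because standardization turns "12 Main Street" and "12 main st." into the same string, which is why standardizing comes before matching.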
Impact of Dirty Data on Business
Dirty data can cause both small and very large problems for businesses. It can mean the difference between keeping and losing business relationships. It can also mean taking a financial hit when inventory problems arise and have to be rectified. It gives customers the overall impression that the business is not up to par and not sufficiently in control to provide them with the best possible service.
Data integrity is the validity of data. It can be compromised in many ways:
- Human errors when data is entered
- Errors that occur when data is transmitted from one computer to another
- Software bugs or viruses
- Hardware malfunctions, such as disk crashes
- Natural disasters, such as fires and floods
There are many ways to minimize these threats to data integrity. These include:
- Backing up data regularly
- Controlling access to data via security mechanisms
- Designing user interfaces that prevent the input of invalid data
- Using error detection and correction software when transmitting data
Source: http://www.webopedia.com/TERM/D/data_integrity.html, retrieved March 20, 2007.
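Two of the safeguards listed above, rejecting invalid input and detecting errors in transmitted data, can be sketched in Python. This is a minimal illustration under stated assumptions: the function names and the message layout (payload followed by a 4-byte CRC-32) are hypothetical, and a CRC only detects corruption; it does not correct it.

```python
import zlib


def validate_quantity(text):
    """Reject input that is not a whole, non-negative number."""
    stripped = text.strip()
    if not (stripped.isascii() and stripped.isdigit()):
        raise ValueError(f"invalid quantity: {text!r}")
    return int(stripped)


def add_checksum(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload before sending it."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_and_strip(message: bytes) -> bytes:
    """Return the payload if its CRC-32 matches, otherwise raise ValueError."""
    payload, received = message[:-4], message[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != received:
        raise ValueError("checksum mismatch: data was corrupted in transit")
    return payload


# Valid input is converted; invalid input never reaches the database.
quantity = validate_quantity(" 420 ")

# The receiver recomputes the checksum to confirm the message arrived intact.
message = add_checksum(b"WIDGET-A,420")
payload = check_and_strip(message)
```

Input validation stops human entry errors at the source, while the checksum catches errors introduced between computers. Neither replaces regular backups or access control.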