Monday, October 22, 2012

Working with disparate datasets

Probably the most time-consuming, and sometimes frustrating, part of working with data from different databases is figuring out how to match records across them so you end up with one consistent data set. It is at this point you want to strangle end users who don't take the time to enter data correctly, because crap data in means crap data out. We usually shoot to match around 95% of the corresponding records, and then either have end users correct the input or create exception rules to handle repeatable patterns. Most clients believe we spend most of our time figuring out their database design, when in actuality that is a much simpler process than sorting good data from crap.
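To make the idea concrete, here is a rough Python sketch of that workflow: normalize the values from both sources, apply a few exception rules for repeatable entry patterns, and see what match rate comes out. The rules, the address-style sample values, and the function names (`normalize`, `match_records`) are all hypothetical and only for illustration; real rules depend entirely on the data at hand.

```python
import re

# Hypothetical exception rules: repeatable entry patterns mapped to one
# canonical spelling. These are illustrative assumptions, not real client rules.
EXCEPTION_RULES = [
    (re.compile(r"\bst\b\.?"), "street"),
    (re.compile(r"\bave\b\.?"), "avenue"),
]


def normalize(value: str) -> str:
    """Lowercase, trim, collapse whitespace, then apply the exception rules."""
    cleaned = re.sub(r"\s+", " ", value.strip().lower())
    for pattern, replacement in EXCEPTION_RULES:
        cleaned = pattern.sub(replacement, cleaned)
    return cleaned


def match_records(left, right):
    """Match values from two sources on their normalized form.

    Returns (matched pairs, unmatched left values, match rate).
    """
    right_index = {normalize(r): r for r in right}
    matched, unmatched = [], []
    for value in left:
        key = normalize(value)
        if key in right_index:
            matched.append((value, right_index[key]))
        else:
            unmatched.append(value)
    rate = len(matched) / len(left) if left else 0.0
    return matched, unmatched, rate


if __name__ == "__main__":
    source_a = ["12 Main St.", "  45 Oak AVE", "9 Elm Rd"]
    source_b = ["12 main street", "45 oak avenue", "9 elm road"]
    matched, unmatched, rate = match_records(source_a, source_b)
    print(f"match rate: {rate:.0%}")
    print("needs a fix or a new rule:", unmatched)
```

Whatever lands in the unmatched list is what either goes back to the end users for correction or, if the same pattern keeps showing up, earns a new exception rule.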
