Tip #2: Offline -- Archived historic data extracts

Extracting active data from operational sources (legacy systems, ERP, flat files, non-relational stores, etc.) can be a major chore. The extract process typically faces a range of challenges: locating appropriate data sources, analyzing and understanding their data structures, and then sourcing, cleansing and transforming the data to meet the target system's needs. This entire extract process usually runs on a regular and frequent basis.

But there is another extract process, run only once, that is essential for loading a data warehouse with all the off-line historical data. Historical data in this context refers to non-active operational data, i.e. data representing previous months, quarters and years. Most operational systems hold no more than a few weeks or months of data, i.e. the active data. So where does this historical data originate? Usually from an archived library of back-up tapes.

Can the extract process designed for the current release of the operational sources be used to extract this (initial load) historic data? Probably, but usually with some changes. These changes are essential because the back-up tapes may represent an earlier version and release of the source system. This earlier version may have a completely different normalized data model, with different entities, attributes and relationships. There may be missing attributes, additional attributes, unexpected data values and new codes. These are just a few examples of the changes.
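
As a rough illustration of how such differences might be surfaced, the sketch below compares the column layout of the current release with that of an archived release. It assumes each layout has already been captured as a simple mapping of column name to data type; the function name `diff_schemas` and the sample columns are hypothetical and not taken from any particular source system.

```python
def diff_schemas(current, archived):
    """Compare two {column_name: data_type} mappings.

    Returns the columns the archive lacks, the columns only the archive
    has, and the shared columns whose declared type differs.
    """
    current_cols = set(current)
    archived_cols = set(archived)
    shared = current_cols & archived_cols
    return {
        "missing_in_archive": sorted(current_cols - archived_cols),
        "extra_in_archive": sorted(archived_cols - current_cols),
        "type_changed": sorted(c for c in shared if current[c] != archived[c]),
    }


# Hypothetical layouts, for illustration only.
current_layout = {"customer_id": "int", "region_code": "char(2)", "email": "varchar"}
archived_layout = {"customer_id": "char(10)", "region_code": "char(2)", "fax": "varchar"}

report = diff_schemas(current_layout, archived_layout)
for kind, columns in report.items():
    print(kind, columns)
```

A report like this only covers structure. Unexpected data values and codes would still need a separate check against the actual tape contents.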

If this issue is discovered unexpectedly while loading the back-up tapes just prior to the production date, it can lead to an embarrassing delay in that date or to the absence of promised data at launch. Addressing it at that point means writing and testing unplanned extract code.

To avoid this situation, a back-up tape data audit/analysis should be planned and executed at the same time as the source system(s) analysis. Document the findings and highlight the differences between the current operational version and the back-up tapes. In addition, make sure that all back-up tapes are available, so that they provide a continuous data history. With this information as a basis, the need for separate code and routines to extract data from the archived back-up tapes can be identified early and the potential production impact avoided.
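
The continuity check could be sketched along the following lines. This is a minimal example that assumes the audit has produced an inventory listing the first and last date each tape covers; the function name `find_coverage_gaps` and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def find_coverage_gaps(tapes, history_start, history_end):
    """Return inclusive (gap_start, gap_end) date pairs not covered by any tape.

    `tapes` is an iterable of (first_day, last_day) pairs, one per tape.
    """
    one_day = timedelta(days=1)
    gaps = []
    next_needed = history_start  # earliest day not yet known to be covered
    for first_day, last_day in sorted(tapes):
        if next_needed > history_end:
            break
        if first_day > next_needed:
            gaps.append((next_needed, min(first_day - one_day, history_end)))
        next_needed = max(next_needed, last_day + one_day)
    if next_needed <= history_end:
        gaps.append((next_needed, history_end))
    return gaps


# Hypothetical tape inventory: the tape for April 2001 is missing.
inventory = [
    (date(2001, 1, 1), date(2001, 3, 31)),
    (date(2001, 5, 1), date(2001, 6, 30)),
]
gaps = find_coverage_gaps(inventory, date(2001, 1, 1), date(2001, 6, 30))
for gap_start, gap_end in gaps:
    print("no tape covers", gap_start, "to", gap_end)
```

Running a check like this during the source system analysis, rather than at load time, leaves room to locate a missing tape or to adjust the promised history range.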