The shell script

Here's the shell script for Tuesday's tip, "Coding data cleansing in AWK."



#!/bin/ksh
##########################################################################
# Data File : data.dat ** Data Scrubber : chg.awk ** Parameter File : param.dat
##########################################################################
echo "Starting....`date '+%T'`..."
echo
# Each line of param.dat supplies: column number, value to find, replacement
while read -r col from to
do
    echo "doing for $col $from $to ....`date '+%T'`"
    echo
    # Split data.dat into lines that contain the value and lines that do not
    grep "$from" data.dat >| temp_with
    grep -v "$from" data.dat >| temp_without
    # Scrub only the matching lines, then rebuild data.dat
    awk -f chg.awk COL="$col" FROM="$from" TO="$to" temp_with >| temp_with_clear
    cat temp_with_clear temp_without >| data.dat
    echo "Ending $col $from $to ....`date '+%T'`"
    echo
done < param.dat
echo "ending....`date '+%T'`..."
/usr/bin/rm temp_with temp_without temp_with_clear

The AWK Script [chg.awk]:

BEGIN { FS="\t" }
{
    if ( $COL == FROM ) {
        # Rewrite the record field by field, substituting TO at column COL
        for (i=1; i<=NF; i++) {
            if ( i==COL ) {
                printf("%s",TO)
                if ( COL != NF ) { printf("\t") } else { printf("\n") }
            } else {
                printf("%s",$i);
                if ( i!=NF ) { printf("\t") } else { printf("\n") }
            }
        }
    } else {
        print $0
    }
}
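
The shell loop above hands "$from" to grep as a pattern, so grep reads it as a regular expression and a dot in a value such as i.b.m. matches any character. The extra lines this selects are harmless, because chg.awk still compares the column exactly, but the split is broader than it looks. The sketch below uses hypothetical sample lines (not the author's data) to compare the regex match with a literal match using grep -F. Using -F is an assumption about what is wanted, not part of the original script.

```shell
#!/bin/sh
# Hypothetical tab-delimited sample lines, used only for illustration.
data=$(printf 'a\tb\ti.b.m.\nc\td\tiXbYmZ\ne\tf\tacme\n')

# As in the original script: the pattern is a regular expression,
# so each "." matches any single character.
with_regex=$(printf '%s\n' "$data" | grep 'i.b.m.')

# Assumed alternative: -F treats the pattern as a fixed (literal) string.
with_fixed=$(printf '%s\n' "$data" | grep -F 'i.b.m.')
without_fixed=$(printf '%s\n' "$data" | grep -F -v 'i.b.m.')

printf 'regex match:\n%s\n' "$with_regex"
printf 'fixed match:\n%s\n' "$with_fixed"
printf 'remainder:\n%s\n' "$without_fixed"
```

With this sample the regex form selects two lines, while the fixed-string form selects only the line that really contains i.b.m.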

The parameter file:

3 i.b.m. IBM
2 profs. Prof.
1 professor Prof.
4 ibm IBM

Each line describes one transformation: the column number, the value to look for, and its replacement. For example, the first line says that wherever column 3 of the data file contains exactly 'i.b.m.', it should be changed to 'IBM'.
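
To make the rule format concrete, here is a minimal sketch that applies the first rule (3 i.b.m. IBM) to three hypothetical tab-delimited records. The sample records are invented for illustration. The sketch also swaps the printf loop in chg.awk for a field assignment with OFS set to a tab, which should give the same result for this case.

```shell
#!/bin/sh
# Hypothetical sample records; only the first has "i.b.m." in column 3.
# The third record has "i.b.m." in column 1, which the rule must leave alone.
result=$(
    printf 'professor\tprofs.\ti.b.m.\tibm\nSmith\tJones\tacme\tibm\ni.b.m.\tLee\tother\tx\n' |
    awk -v COL=3 -v FROM='i.b.m.' -v TO='IBM' '
        BEGIN { FS = OFS = "\t" }     # read and write tab-delimited fields
        $COL == FROM { $COL = TO }    # assigning a field rebuilds $0 using OFS
        { print }
    '
)

printf '%s\n' "$result"
```

Only the first record changes: its third field becomes IBM, and the other two records pass through untouched.
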
This was first published in April 2002
