The shell script

Here's the shell script for Tuesday's tip, "Coding data cleansing in AWK." See that tip for context.



#!/bin/ksh
##########################################################################
# ** Data File      : data.dat
# ** Data Scrubber  : chg.awk
# ** Parameter File : param.dat
##########################################################################
echo "Starting....$(date '+%T')..."
echo
# Each line of param.dat holds: column number, value to find, replacement.
while read -r col from to
do
    echo "doing for $col $from $to ....$(date '+%T')"
    echo
    # Split the data into lines that mention the value and lines that do not,
    # so that awk only has to process the candidate lines.
    grep "$from" data.dat >| temp_with
    grep -v "$from" data.dat >| temp_without
    awk -f chg.awk COL="$col" FROM="$from" TO="$to" temp_with >| temp_with_clear
    # Rebuild the data file from the scrubbed lines and the untouched lines.
    cat temp_with_clear temp_without >| data.dat
    echo "Ending $col $from $to ....$(date '+%T')"
    echo
done < param.dat
echo "ending....$(date '+%T')..."
/usr/bin/rm temp_with temp_without temp_with_clear

The AWK Script [chg.awk]:

BEGIN { FS="\t" }
{
    # Rewrite the record only when the target column holds exactly FROM.
    if ( $COL == FROM ) {
        for (i=1; i<=NF; i++) {
            if ( i==COL ) {
                printf("%s",TO)
                if ( COL != NF ) { printf("\t") } else { printf("\n") }
            } else {
                printf("%s",$i);
                if ( i!=NF ) { printf("\t") } else { printf("\n") }
            }
        }
    } else {
        print $0
    }
}

The parameter file [param.dat]:

3 i.b.m. IBM
2 profs. Prof.
1 professor Prof.
4 ibm IBM

[Each line describes one transformation: the column number, the value to find, and the replacement. For example, the first line says that wherever the value in column 3 of the data file is exactly 'i.b.m.', it should be changed to 'IBM'.]
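
To make the effect of a single parameter line concrete, here is a minimal sketch that applies the first rule (column 3, i.b.m. to IBM) to two made-up records. The sample data, the apply_rule function name, and the condensed one-line awk program are assumptions for illustration; the awk program is a shorter stand-in for chg.awk, not the script from the tip, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Condensed, hypothetical stand-in for chg.awk: assigning to $COL makes awk
# rebuild the record with OFS, so no explicit loop over the fields is needed.
apply_rule() {
    # $1 = column number, $2 = value to match, $3 = replacement, $4 = data file
    awk 'BEGIN { FS = OFS = "\t" } $COL == FROM { $COL = TO } { print }' \
        COL="$1" FROM="$2" TO="$3" "$4"
}

# Made-up sample data: two tab-separated records of four columns each.
sample=$(mktemp)
printf 'professor\tprofs.\ti.b.m.\tibm\n'  > "$sample"
printf 'Dr.\tSmith\tACME\ti.b.m.\n'       >> "$sample"

# First parameter-file rule: column 3, i.b.m. -> IBM.
result=$(apply_rule 3 i.b.m. IBM "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

In the first record the third column becomes IBM. The second record is printed unchanged, because its i.b.m. sits in column 4 and the rule only targets column 3.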
This was first published in April 2002
