Create key
In step 1, I created a key for each record according to the rules mentioned
above. To show the result, I added an extra column at the end of the table
that holds the new key for each record.
| Application ID | Applicant Name | Father name | Qualification | Address | key |
|---|---|---|---|---|---|
| Cs2025 | Umair | ikram | PHD | Islamabad | CS2UMAIK |
| MG2026 | Waseem | maqsood | MBA | Faisalabad | MG2WASMA |
| MH2027 | Faizan | jawad | MSC | Lahore | MH2FAIJA |
| MG2026 | Wasim | maqsood | MBA | Faisalabad | MG2WASMA |
| EN2028 | atif | sohail | BS | Multan | EN2ATISO |
| MH2027 | faizaan | jawad | MSC | Lahore | MH2FAIJA |
| CS2029 | tariq | ali | MSC | Karachi | CS2TARAL |
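
The key-building step can be sketched in a few lines of Python. The exact rules are given earlier in the post, so the rule used here is an assumption inferred from the key column in the table: the first three characters of the Application ID, the first three of the applicant name, and the first two of the father's name, all upper-cased. The function name `make_key` is made up for this illustration.

```python
def make_key(app_id: str, name: str, father: str) -> str:
    """Build a sorting key for one record.

    Assumed rule (inferred from the table, not confirmed by the post):
    3 chars of Application ID + 3 chars of name + 2 chars of father name.
    """
    return (app_id[:3] + name[:3] + father[:2]).upper()


# A few records from the table: (Application ID, Applicant Name, Father name)
records = [
    ("Cs2025", "Umair", "ikram"),
    ("MG2026", "Waseem", "maqsood"),
    ("MG2026", "Wasim", "maqsood"),
    ("MH2027", "faizaan", "jawad"),
]

keys = [make_key(*record) for record in records]
print(keys)
```

Because only the first three letters of the name are used, "Waseem" and "Wasim" end up with the same key, which is what lets the later steps bring them together.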
Now, in step 2
Sort the data
In step 2, I sorted the records on the newly created key from step 1.
| Application ID | Applicant Name | Father name | Qualification | Address | key |
|---|---|---|---|---|---|
| CS2029 | tariq | ali | MSC | Karachi | CS2TARAL |
| Cs2025 | Umair | ikram | PHD | Islamabad | CS2UMAIK |
| EN2028 | atif | sohail | BS | Multan | EN2ATISO |
| MG2026 | Wasim | maqsood | MBA | Faisalabad | MG2WASMA |
| MG2026 | Waseem | maqsood | MBA | Faisalabad | MG2WASMA |
| MH2027 | Faizan | jawad | MSC | Lahore | MH2FAIJA |
| MH2027 | faizaan | jawad | MSC | Lahore | MH2FAIJA |
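
As a rough illustration of the sorting step, the records can be ordered by their key with Python's built-in `sorted`. The dictionary field names (`id`, `name`, `key`) are chosen for this sketch only and are not from the post.

```python
# Records in their original order, each with the key from step 1.
records = [
    {"id": "Cs2025", "name": "Umair", "key": "CS2UMAIK"},
    {"id": "MG2026", "name": "Waseem", "key": "MG2WASMA"},
    {"id": "MH2027", "name": "Faizan", "key": "MH2FAIJA"},
    {"id": "MG2026", "name": "Wasim", "key": "MG2WASMA"},
    {"id": "EN2028", "name": "atif", "key": "EN2ATISO"},
    {"id": "MH2027", "name": "faizaan", "key": "MH2FAIJA"},
    {"id": "CS2029", "name": "tariq", "key": "CS2TARAL"},
]

# Sort on the key only; records with equal keys keep their input order.
sorted_records = sorted(records, key=lambda record: record["key"])

for record in sorted_records:
    print(record["key"], record["id"], record["name"])
```

Records with the same key may come out in either order depending on the sort used. That does not matter for the method, since they end up next to each other either way.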
Now, in step 3
Merge
In step 3, take the window size (w) to be two (2). I need to identify
similar records on the basis of the sorted key.
BSN method sliding window
| Applicant id | Applicant Name | Father-Name | Qualification | Address | Keys |
|---|---|---|---|---|---|
| MG2026 | waseem | maqsood | MBA | Faisalabad | MG2WASMA |
| MG2026 | wasim | maqsood | MBA | Faisalabad | MG2WASMA |

| Applicant id | Applicant Name | Father-Name | Qualification | Address | Keys |
|---|---|---|---|---|---|
| MH2027 | faizan | jawad | MSC | Lahore | MH2FAIJA |
| MH2027 | faizaan | jawad | MSC | Lahore | MH2FAIJA |

In each of the tables above, the two keys are identical. Two records whose
names are spelled nearly, but not exactly, the same also share the exact same
address, so we infer that they refer to the same person, as with faizan jawad
and faizaan jawad.
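
The merge step can be sketched as a window of size w sliding over the sorted list, comparing only records that fall inside the same window. The matching rule below (identical key and identical address) is an assumption taken from the reasoning in the paragraph above, and the names `window_pairs` and `is_match` are made up for this illustration.

```python
def window_pairs(sorted_records, w=2):
    """Yield each pair of records that share a window of size w."""
    for i, current in enumerate(sorted_records):
        # With w = 2 this slice holds just the next record.
        for other in sorted_records[i + 1 : i + w]:
            yield current, other


def is_match(a, b):
    """Assumed rule: same key and same address mean the same person."""
    return a["key"] == b["key"] and a["address"].lower() == b["address"].lower()


# Records already sorted by key (step 2).
sorted_records = [
    {"name": "tariq", "address": "Karachi", "key": "CS2TARAL"},
    {"name": "Umair", "address": "Islamabad", "key": "CS2UMAIK"},
    {"name": "atif", "address": "Multan", "key": "EN2ATISO"},
    {"name": "Wasim", "address": "Faisalabad", "key": "MG2WASMA"},
    {"name": "Waseem", "address": "Faisalabad", "key": "MG2WASMA"},
    {"name": "Faizan", "address": "Lahore", "key": "MH2FAIJA"},
    {"name": "faizaan", "address": "Lahore", "key": "MH2FAIJA"},
]

duplicates = [
    (a["name"], b["name"])
    for a, b in window_pairs(sorted_records, w=2)
    if is_match(a, b)
]
print(duplicates)
```

With w = 2 only neighbouring records are compared, so seven records need six comparisons instead of the twenty-one that comparing every pair would take. A larger window catches duplicates that sort slightly further apart, at the cost of more comparisons.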






