VUSuperior Chat Room

Tuesday, 17 February 2015

CS614 Data warehousing Assignment No.3 Solution (GRADED) Due Date 16 Feb, 2015

Create key
In step 1, I created a key for each record according to the rules mentioned above. To show it, I added an extra column at the end of the table that holds the new key for each record.
Application ID | Applicant Name | Father name | Qualification | Address    | Key
---------------|----------------|-------------|---------------|------------|---------
Cs2025         | Umair          | ikram       | PHD           | Islamabad  | CS2UMAIK
MG2026         | Waseem         | maqsood     | MBA           | Faisalabad | MG2WASMA
MH2027         | Faizan         | jawad       | MSC           | Lahore     | MH2FAIJA
MG2026         | Wasim          | maqsood     | MBA           | Faisalabad | MG2WASMA
EN2028         | atif           | sohail      | BS            | Multan     | EN2ATISO
MH2027         | faizaan        | jawad       | MSC           | Lahore     | MH2FAIJA
CS2029         | tariq          | ali         | MSC           | Karachi    | CS2TARAL
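
The key rules themselves are not repeated in this post, but the keys in the table are consistent with one simple pattern: the first three characters of the Application ID, the first three of the applicant name and the first two of the father name, all in upper case. A minimal Python sketch under that assumption (the function name make_key is only illustrative):

```python
def make_key(app_id: str, name: str, father: str) -> str:
    """Build a sort key from the record fields.

    Assumed rule (inferred from the table, not stated in the post):
    3 chars of the ID + 3 chars of the name + 2 chars of the father name,
    converted to upper case.
    """
    return (app_id[:3] + name[:3] + father[:2]).upper()


# (Application ID, Applicant Name, Father name) taken from the table above
records = [
    ("Cs2025", "Umair", "ikram"),
    ("MG2026", "Waseem", "maqsood"),
    ("MH2027", "Faizan", "jawad"),
    ("MG2026", "Wasim", "maqsood"),
    ("EN2028", "atif", "sohail"),
    ("MH2027", "faizaan", "jawad"),
    ("CS2029", "tariq", "ali"),
]

keys = [make_key(*r) for r in records]
print(keys)
```

Because only short prefixes are used, small spelling differences such as Waseem and Wasim end up with the same key, which is what lets the later steps bring those records together.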
Now, in step 2
Sort the data
In step 2, I sorted the records on the basis of the key created in step 1.
Application ID | Applicant Name | Father name | Qualification | Address    | Key
---------------|----------------|-------------|---------------|------------|---------
CS2029         | tariq          | ali         | MSC           | Karachi    | CS2TARAL
Cs2025         | Umair          | ikram       | PHD           | Islamabad  | CS2UMAIK
EN2028         | atif           | sohail      | BS            | Multan     | EN2ATISO
MG2026         | Wasim          | maqsood     | MBA           | Faisalabad | MG2WASMA
MG2026         | Waseem         | maqsood     | MBA           | Faisalabad | MG2WASMA
MH2027         | Faizan         | jawad       | MSC           | Lahore     | MH2FAIJA
MH2027         | faizaan        | jawad       | MSC           | Lahore     | MH2FAIJA
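
The sorting step can be sketched in Python as below. The tuple layout (applicant name, key) is just a simplification chosen for this example, not something the assignment prescribes.

```python
# (Applicant Name, key) pairs in the original, unsorted order
records = [
    ("Umair", "CS2UMAIK"),
    ("Waseem", "MG2WASMA"),
    ("Faizan", "MH2FAIJA"),
    ("Wasim", "MG2WASMA"),
    ("atif", "EN2ATISO"),
    ("faizaan", "MH2FAIJA"),
    ("tariq", "CS2TARAL"),
]

# Sort on the key only; records with equal keys become neighbours.
sorted_records = sorted(records, key=lambda r: r[1])

for name, key in sorted_records:
    print(f"{key}  {name}")
```

Python's sorted() is stable, so records with equal keys keep their input order (Waseem before Wasim here). The order inside a group of equal keys does not matter for the next step, as long as the records sit next to each other.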
Now, in step 3
Merge
In step 3, the window size (w) is taken to be two (2). I am required to identify similar records on the basis of the sorted key, using the sliding window of the BSN (Basic Sorted Neighborhood) method.

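Identify keys

Application ID | Applicant Name | Father name | Qualification | Address    | Key
---------------|----------------|-------------|---------------|------------|---------
MG2026         | Waseem         | maqsood     | MBA           | Faisalabad | MG2WASMA
MG2026         | Wasim          | maqsood     | MBA           | Faisalabad | MG2WASMA

Application ID | Applicant Name | Father name | Qualification | Address    | Key
---------------|----------------|-------------|---------------|------------|---------
MH2027         | Faizan         | jawad       | MSC           | Lahore     | MH2FAIJA
MH2027         | faizaan        | jawad       | MSC           | Lahore     | MH2FAIJA
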
In the tables above, the keys within each pair are identical. The two applicants in a pair have names that are spelled nearly, but not exactly, the same, and they have exactly the same address. We therefore infer that they are the same person, as with Faizan jawad and faizaan jawad.
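
The window comparison can be sketched in Python as follows. This is only an illustration of the idea: the matching rule (same key, same address, similar name), the use of difflib.SequenceMatcher and the 0.7 threshold are my own assumptions, not part of the assignment.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.7) -> bool:
    """Case-insensitive name similarity; the threshold is an assumed value."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def find_duplicates(sorted_records, w=2):
    """Slide a window of size w over key-sorted records and collect likely duplicates."""
    pairs = []
    for i, rec in enumerate(sorted_records):
        # Compare the record with the next w-1 records inside the window.
        for other in sorted_records[i + 1 : i + w]:
            same_key = rec["key"] == other["key"]
            same_address = rec["address"] == other["address"]
            if same_key and same_address and similar(rec["name"], other["name"]):
                pairs.append((rec["name"], other["name"]))
    return pairs


# Records already sorted by key, as in step 2
sorted_records = [
    {"name": "tariq", "address": "Karachi", "key": "CS2TARAL"},
    {"name": "Umair", "address": "Islamabad", "key": "CS2UMAIK"},
    {"name": "atif", "address": "Multan", "key": "EN2ATISO"},
    {"name": "Wasim", "address": "Faisalabad", "key": "MG2WASMA"},
    {"name": "Waseem", "address": "Faisalabad", "key": "MG2WASMA"},
    {"name": "Faizan", "address": "Lahore", "key": "MH2FAIJA"},
    {"name": "faizaan", "address": "Lahore", "key": "MH2FAIJA"},
]

duplicates = find_duplicates(sorted_records, w=2)
print(duplicates)  # [('Wasim', 'Waseem'), ('Faizan', 'faizaan')]
```

With w = 2 each record is compared only with its immediate neighbour, so the seven records need six comparisons instead of the 21 needed to compare every pair.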
