
Identifying Patterns

ernzdl Posts: 11
edited March 5, 2019 11:05PM in Data Prep Q&A
How do I find out whether a dataset of telephone numbers contains entries with the same phone number but different area codes?
For example, "+1 415 123 12 12" vs "+90 415 123 12 12". 

Answers

  • Akshay Posts: 94 admin

    Hello Eren,

    The solution to your problem is as follows:

    Step 1: Start the project with the Phone number dataset

     Now there are two ways to go about this:

    To check whether a phone number has more than one country code associated with it:

    Step 2: Use the shape function in Paxata to perform a deduplicate and a group by operation.

    Step 2.1: Perform a deduplicate on Country code, Phone number to get rid of any repeated occurrence of a Country code, phone number pair.

    Step 2.2: Do a group by on Phone number and do a count on country code.


    Step 3: Use an if statement via the compute feature to check whether the count in this column is greater than 1. A count above 1 means the phone number appears with more than one country code.
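The logic of the steps above can be sketched outside Paxata in plain Python. This is only an illustration of the deduplicate, group by, and count sequence, not Paxata syntax. The function names are hypothetical, and the sketch assumes the country code is the first space-separated token of each number, as in the example from the question.

```python
from collections import defaultdict


def split_number(raw):
    """Split '+1 415 123 12 12' into ('+1', '4151231212').

    Assumption: the country code is the first space-separated token.
    """
    country_code, _, rest = raw.strip().partition(" ")
    return country_code, rest.replace(" ", "")


def numbers_with_multiple_country_codes(raw_numbers):
    # Step 2.1: deduplicate (country code, phone number) pairs.
    unique_pairs = {split_number(raw) for raw in raw_numbers}

    # Step 2.2: group by phone number and collect its country codes.
    codes_by_number = defaultdict(set)
    for country_code, number in unique_pairs:
        codes_by_number[number].add(country_code)

    # Step 3: keep only numbers whose country-code count is greater than 1.
    return {
        number: sorted(codes)
        for number, codes in codes_by_number.items()
        if len(codes) > 1
    }


sample = [
    "+1 415 123 12 12",
    "+90 415 123 12 12",
    "+1 415 123 12 12",  # repeated pair, dropped by the deduplicate step
    "+1 650 555 01 00",
]
flagged = numbers_with_multiple_country_codes(sample)
print(flagged)
```

With this sample, only the number shared by "+1" and "+90" is flagged. The repeated "+1" row does not inflate the count because the pairs are deduplicated first.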


    I hope this answers your question!

    Thanks,

    Akshay

     


  • Thank you so much Akshay!