

How can I append to a filtered datasource?

I have a dataset coming from a SharePoint list. I need to look up associate data in multiple other datasets, based on a date in that list.
For example:
  • For items created in September, look up against a September Hierarchy dataset
  • For items created in October, look up against an October Hierarchy dataset
I've been trying to filter the dataset during the append step (as can be done for calculations), but I cannot seem to get it right.
Answers

  • sayyar
    Hi @StrobelJones
    There are a couple of ways to do this in Paxata, depending on how the data comes in. Could you answer the questions below so that I can point you to the right solution?

    1. September Hierarchy Dataset and October Hierarchy Dataset - are they different versions of the same dataset OR are they two different datasets? What will happen for the month of November? 
    2. How is the source data coming in? Is it an incremental load or a full load every time this lookup needs to happen?
    If both sides are incremental loads, we could look up one incremental load against the other and then combine the outputs from each iteration to build a master list. If one side is incremental while the other is a full load, we will have to convert the incremental side to a master list by appending each new load to the previous master list. Once both sides have the same granularity (both incremental or both full lists), a lookup can be performed on the month and associate ID.

    Please let me know.
    Thank you,
    Shyam Ayyar
  • Right now, the Hierarchy data is loaded as separate datasets, but I could change that to versions (v1, v2) of a single dataset, or I could append them together into a single table and add a column identifying the month.

    The other source data is a full load every time as we are looking at other fields on the records that change over time.
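
The single-table option mentioned above (append the monthly Hierarchy datasets, add a month column, then look up on month plus associate ID) can be sketched outside Paxata in plain Python. This is only an illustration of the logic, not Paxata syntax: the field names (`item_id`, `associate_id`, `created`, `manager`), the sample rows, and the year are all hypothetical and do not come from the thread.

```python
from datetime import date

# Hypothetical full-load source rows (for example, from the SharePoint list).
items = [
    {"item_id": 1, "associate_id": "A1", "created": date(2019, 9, 12)},
    {"item_id": 2, "associate_id": "A1", "created": date(2019, 10, 3)},
    {"item_id": 3, "associate_id": "A2", "created": date(2019, 10, 20)},
]

# Hypothetical monthly Hierarchy datasets, loaded separately.
september_hierarchy = [
    {"associate_id": "A1", "manager": "Kim"},
    {"associate_id": "A2", "manager": "Lee"},
]
october_hierarchy = [
    {"associate_id": "A1", "manager": "Kim"},
    {"associate_id": "A2", "manager": "Park"},
]


def tag_month(rows, month):
    """Return copies of the rows with a 'month' column added."""
    return [{**row, "month": month} for row in rows]


# Append the monthly datasets into one table, each row tagged with its month.
hierarchy = (
    tag_month(september_hierarchy, "2019-09")
    + tag_month(october_hierarchy, "2019-10")
)

# Index the combined table by the two lookup keys: month and associate ID.
lookup = {(row["month"], row["associate_id"]): row["manager"] for row in hierarchy}


def add_manager(item):
    """Derive the item's month from its created date and look up the manager."""
    month = item["created"].strftime("%Y-%m")
    manager = lookup.get((month, item["associate_id"]))  # None if no match
    return {**item, "month": month, "manager": manager}


result = [add_manager(item) for item in items]
```

With this shape, adding November only means appending one more tagged dataset to `hierarchy`; the lookup step itself does not change.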