5. Combine Your Data - Append

mikeblankomikeblanko Posts: 10 mod
edited March 1, 2019 11:04PM in Getting Started

This is part 5 of the Getting Started tutorial series. Let's now look at adding more data to our project: both columns and rows from similar datasets.

Video: Combining Your Data - Append 


If you prefer to read instead of watch a video, the same steps from the above video are listed below.

Steps: Combining Your Data - Append

Append your data:

Step 1: Open the source project of the YTD Clean Contacts dataset that we created in the previous project.

Step 2: Use the append step

Step 2.1: Click on the green icon that says “dataset(click to select)” to bring in the October contacts data.


Step 2.2: To bring a dataset into the library, click the green + icon and choose the Amazon S3 bucket data source (this is where the data lives). Then open the tutorial bucket and import the October contacts data.


Note: When you bring in the dataset, a preview of it appears in the bottom right panel. You can remove, reorder, and rename columns before bringing this data into the Paxata library.

Step 3: Once the import finishes, click the Select button to bring this dataset into Paxata.

Step 4: Even when the two datasets have different column names, Paxata lets you interactively select which columns to append. Unmatched columns are populated with blanks.
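To make the append behavior in Step 4 concrete, here is a minimal plain-Python sketch of the same idea. It is not how Paxata implements the step. The function name `append_rows`, the column names, and the sample rows are all hypothetical and chosen only for illustration.

```python
def append_rows(base, extra, column_map):
    """Append `extra` rows to `base`.

    `column_map` maps a column name in `extra` to the matching column
    name in `base`. Columns missing on either side are filled with "".
    """
    # Rename the incoming columns so matched columns line up.
    renamed = [
        {column_map.get(name, name): value for name, value in row.items()}
        for row in extra
    ]
    all_rows = base + renamed

    # Collect every column name, keeping first-seen order.
    columns = []
    for row in all_rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    # Fill unmatched columns with blanks.
    return [{name: row.get(name, "") for name in columns} for row in all_rows]


# Hypothetical sample data: October has "score" but no "country".
ytd = [{"full_name": "Ann Lee", "country": "US"}]
october = [{"name": "Bo Chan", "score": 7}]

combined = append_rows(ytd, october, {"name": "full_name"})
for row in combined:
    print(row)
```

Here the October `name` column is matched to `full_name`, while `country` and `score` have no counterpart and are left blank on the side that lacks them.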

Step 5: Our October data has an additional column called score but no country column. This may be because the October data has only US contacts. It is clear that we need to apply all of the previous transformations to the October data as well. Using the Steps panel, you can move the append step by clicking it and dragging it to the desired position.
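The reason step order matters in Step 5 can be sketched in plain Python: a transformation only affects rows that already exist when it runs. This is an illustrative model, not Paxata's engine. The helper names (`run_steps`, `make_append`, `title_case_names`) and the sample rows are hypothetical, and the title-case transformation merely stands in for whatever steps the project already contains.

```python
def run_steps(rows, steps):
    """Apply each step function to the rows, in order."""
    for step in steps:
        rows = step(rows)
    return rows


def make_append(extra):
    """Build a step that appends copies of the `extra` rows."""
    return lambda rows: rows + [dict(row) for row in extra]


def title_case_names(rows):
    """Stand-in for an earlier cleaning transformation."""
    return [{**row, "full_name": row["full_name"].title()} for row in rows]


ytd = [{"full_name": "ann lee"}]
october = [{"full_name": "bo chan"}]

# Append placed last: the October rows skip the earlier transformation.
append_last = run_steps(ytd, [title_case_names, make_append(october)])

# Append moved to the front: every row passes through the transformation.
append_first = run_steps(ytd, [make_append(october), title_case_names])

print(append_last)
print(append_first)
```

Moving the append ahead of the cleaning steps is what lets the October rows pick up the same transformations as the original data.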

Step 6: Now we see that whitespace has persisted in the full name column. To fix this, we can trim leading and trailing whitespace by criteria (where data type is string).

Step 7: Once the data prep work on a project is done, we can publish this view using a New Lens. A new lens can be automated and exported by the user.
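The whitespace trim from Step 6 (applied only where the data type is string) can be approximated outside Paxata with a short Python sketch. The function name `trim_strings` and the sample row are hypothetical.

```python
def trim_strings(rows):
    """Strip leading and trailing whitespace from string values only."""
    return [
        {
            name: value.strip() if isinstance(value, str) else value
            for name, value in row.items()
        }
        for row in rows
    ]


# Hypothetical row: the name has stray spaces, the score is numeric.
rows = [{"full_name": "  Bo Chan ", "score": 7}]
trimmed = trim_strings(rows)
print(trimmed)
```

Non-string values such as the numeric score pass through unchanged, which mirrors restricting the trim to string columns.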



That completes the fifth of six tutorials in the Getting Started series.

Next:  Combine Your Data - Lookup/Join

  1. Create a New Data Prep Project
  2. Explore Your Data
  3. Clean and Transform Your Data
  4. Publish and Export Your Data  
  5. Combine Your Data - Append
  6. Combine Your Data - Lookup / Join

