3. Clean and Transform Your Data

mikeblanko Posts: 10 mod
edited February 12, 2019 12:27AM in Getting Started

This is part 3 of the Getting Started tutorial series. You just learned how to create a new project and how to use Paxata to explore the content of your data. Now, let's look at a few of the many transformation and cleaning features that make Paxata the best data preparation tool in town!

Video: Cleaning and Transforming Your Data

If you prefer reading to watching a video, the same steps from the video are listed below.

Steps: Cleaning and Transforming Your Data

Let's continue from where we left off in the previous tutorial:

Step 1: Let’s get rid of the extra white space in the Country column. Open the column operations menu (the drop-down on the column header bar) and select the trim leading and trailing option.

You will now notice that all the existing filtergram values also change based on our action, and data inconsistency in the Country column is no longer an issue.
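For readers who want to see what this trim does to the data, here is a minimal sketch in plain Python. It is not how Paxata implements the operation; the sample rows and the `trim_column` helper are hypothetical and exist only for illustration.

```python
# Hypothetical sample rows; only the column name "Country" comes from the tutorial.
rows = [
    {"Country": "  USA"},
    {"Country": "USA  "},
    {"Country": "Canada"},
]


def trim_column(rows, column):
    """Return new rows with leading and trailing whitespace removed from one column."""
    return [{**row, column: row[column].strip()} for row in rows]


trimmed = trim_column(rows, "Country")

# Distinct values before and after, similar to what a filtergram would list.
distinct_before = sorted({row["Country"] for row in rows})
distinct_after = sorted({row["Country"] for row in trimmed})
print(distinct_before)
print(distinct_after)
```

Before the trim, the two spellings of USA count as separate values; afterwards they collapse into one, which is why the filtergram values change.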

Step 2: Now, notice that there is a mixed-case issue in both the City and State or Province columns. To fix this, select the Change into Capital Case option on the column operations menu.

Step 2.1: Expand the column selector and select the columns to which you want to apply the Change into Capital Case operation.
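As a rough illustration of applying capital case to several columns at once, the sketch below uses Python's standard `string.capwords`. The sample rows and the `capital_case` helper are hypothetical, and Paxata's own handling of edge cases (punctuation, hyphens, and so on) may differ.

```python
import string

# Hypothetical sample rows with inconsistent casing.
rows = [
    {"City": "san francisco", "State or Province": "CALIFORNIA"},
    {"City": "TORONTO", "State or Province": "ontario"},
]


def capital_case(rows, columns):
    """Return new rows where each word in the chosen columns is capitalized."""
    return [
        {
            key: string.capwords(value) if key in columns else value
            for key, value in row.items()
        }
        for row in rows
    ]


result = capital_case(rows, ["City", "State or Province"])
for row in result:
    print(row)
```

Passing a list of column names mirrors the column selector in Step 2.1: the same operation is applied to every selected column, and all other columns are left untouched.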

Step 3: Now let’s solve the data sparsity issues in the Age and Gender columns.

Step 3.1: Pull up a filtergram on the Age column and select all the blank values.

Step 3.2: Click the remove rows tool in the tools bar on the left to remove the rows that contain these blank values.

Step 3.3: Now use the find and replace option from the column operations menu on the Gender column.

Step 3.4: Replace the blanks with the text value “unknown” so that the entire column is populated.
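The two ways of handling blanks in Step 3 (dropping rows versus filling in a placeholder) can be sketched in plain Python as follows. The sample rows and the `is_blank` helper are hypothetical, and treating whitespace-only values as blank is an assumption of this sketch, not a statement about Paxata's behavior.

```python
# Hypothetical sample rows; blank values are represented as empty strings.
rows = [
    {"Age": "34", "Gender": "F"},
    {"Age": "", "Gender": "M"},
    {"Age": "51", "Gender": ""},
]


def is_blank(value):
    """Treat None, empty strings, and whitespace-only strings as blank (an assumption)."""
    return value is None or str(value).strip() == ""


# Steps 3.1 and 3.2: drop every row whose Age is blank.
kept = [row for row in rows if not is_blank(row["Age"])]

# Steps 3.3 and 3.4: replace blank Gender values with the text "unknown".
filled = [
    {**row, "Gender": "unknown" if is_blank(row["Gender"]) else row["Gender"]}
    for row in kept
]

for row in filled:
    print(row)
```

Dropping rows loses the whole record, so it suits a column like Age where a blank makes the row unusable here; filling with "unknown" keeps the record while still leaving the column fully populated.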

Step 4: Now you can export this dataset to a Paxata-supported data source or to your local desktop!
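To give a sense of what a local export might contain, here is a small sketch that serializes cleaned rows as CSV text using Python's standard `csv` module. CSV is only an assumed format for this example; the sample rows and the `to_csv_text` helper are hypothetical, and the export formats Paxata actually offers are covered in the next tutorial.

```python
import csv
import io

# Hypothetical cleaned rows, as they might look after Steps 1 to 3.
rows = [
    {"Country": "USA", "Age": "34", "Gender": "F"},
    {"Country": "Canada", "Age": "51", "Gender": "unknown"},
]


def to_csv_text(rows):
    """Serialize a list of dict rows to CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(rows)
print(csv_text)
```

To write a file instead of a string, the same writer can be pointed at a file opened with `open(path, "w", newline="")`.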


That completes the third of six tutorials in the Getting Started series.

Next: Publishing and Exporting Your Data

  1. Create a New Data Prep Project
  2. Explore Your Data
  3. Clean and Transform Your Data
  4. Publish and Export Your Data
  5. Combine Your Data - Append
  6. Combine Your Data - Lookup/Join

