3. Clean and Transform Your Data
This is part 3 of the Getting Started tutorial series. You just learned how to create a new project and how to use Paxata to explore the content of your data. Now, let's take a look at only a few of the many transformation and cleaning features that make Paxata the best data preparation tool in town!
Video: Cleaning and Transforming Your Data
If you would rather read than watch, the steps from the video above are listed below.
Steps: Cleaning and Transforming Your Data
Let's continue from where we left off in the previous tutorial:
Step 1: Let’s get rid of the extra whitespace in the country column. Open the column operations menu (the drop-down on the column header bar) and select the trim leading and trailing option.
You will notice that the existing filtergram values also update to reflect this action, and the data inconsistency in the country column is no longer an issue.
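If it helps to see the logic spelled out, here is a minimal Python sketch of what the trim in Step 1 does conceptually. It is not Paxata code: the sample rows and the helper `trim_column` are hypothetical, and the set of distinct values only loosely stands in for what a filtergram lists.

```python
# Hypothetical sample rows; only the column name comes from the tutorial.
rows = [
    {"Country": "  USA"},
    {"Country": "USA  "},
    {"Country": "Canada"},
]

def trim_column(rows, column):
    """Return new rows with leading and trailing whitespace removed from one column."""
    return [{**row, column: row[column].strip()} for row in rows]

trimmed = trim_column(rows, "Country")

# Distinct values before and after the trim.
print(sorted({r["Country"] for r in rows}))
print(sorted({r["Country"] for r in trimmed}))
```

In this sample, the padded variants of "USA" collapse into a single value once trimmed, which is the same effect you see in the filtergram.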
Step 2: Next, notice that there is a mixed-case issue in both the City and State or Province columns. To fix this, select the Change into Capital Case option from the column operations menu.
Step 2.1: Expand the column selector and select the columns to which you want to apply the Change into Capital Case operation.
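As a rough illustration of Step 2, the sketch below applies a capital-case transform to several columns at once. It assumes that Capital Case means capitalizing the first letter of each word and lowercasing the rest; the tutorial does not define the exact rule, so treat that as an assumption. The helper `capital_case` and the sample rows are hypothetical.

```python
import string

# Hypothetical sample rows with inconsistent casing.
rows = [
    {"City": "san francisco", "State or Province": "CALIFORNIA"},
    {"City": "TORONTO", "State or Province": "ontario"},
]

def capital_case(rows, columns):
    """Return new rows with each listed column converted to capital case.

    Assumption: capital case = first letter of every word upper, rest lower.
    """
    result = []
    for row in rows:
        new_row = dict(row)
        for column in columns:
            new_row[column] = string.capwords(row[column])
        result.append(new_row)
    return result

fixed = capital_case(rows, ["City", "State or Province"])
for row in fixed:
    print(row)
```

Passing a list of column names mirrors the column selector in Step 2.1: one operation, applied to every column you pick.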
Step 3: Now let’s solve the data sparsity issues in the Age and Gender columns.
Step 3.1: Pull up a filtergram on the Age column and select all the blank values.
Step 3.2: Click the remove rows tool in the tools bar on the left to remove the rows that contain these blank values.
Step 3.3: Now use the find and replace option from the column operations menu on the Gender column.
Step 3.4: Replace the blanks with the text value “unknown” so that the entire column is populated.
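Steps 3.1 through 3.4 can be sketched in Python as follows. This is only an illustration of the logic, not how Paxata implements it. The helpers `is_blank`, `remove_blank_rows`, and `fill_blanks` are hypothetical, and treating whitespace-only values as blank is an assumption.

```python
# Hypothetical sample rows with gaps in Age and Gender.
rows = [
    {"Age": "34", "Gender": "F"},
    {"Age": "", "Gender": "M"},
    {"Age": "51", "Gender": ""},
    {"Age": "  ", "Gender": ""},
]

def is_blank(value):
    """Assumption: None, empty, and whitespace-only values all count as blank."""
    return value is None or str(value).strip() == ""

def remove_blank_rows(rows, column):
    """Drop every row whose value in `column` is blank (Steps 3.1 and 3.2)."""
    return [row for row in rows if not is_blank(row[column])]

def fill_blanks(rows, column, replacement):
    """Replace blank values in `column` with `replacement` (Steps 3.3 and 3.4)."""
    return [
        {**row, column: replacement if is_blank(row[column]) else row[column]}
        for row in rows
    ]

cleaned = fill_blanks(remove_blank_rows(rows, "Age"), "Gender", "unknown")
for row in cleaned:
    print(row)
```

Note the two different treatments: rows with a blank Age are removed entirely, while rows with a blank Gender are kept and filled in with "unknown".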
Step 4: Now you can export this dataset to a Paxata-supported data source or to your local desktop!
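For comparison, writing a cleaned dataset to a local file might look like the sketch below. The CSV format, the file name `cleaned_customers.csv`, the sample rows, and the helper `export_csv` are all assumptions made for illustration; the tutorial does not specify which export formats Paxata offers.

```python
import csv
import os
import tempfile

# Hypothetical cleaned rows, standing in for the prepared dataset.
cleaned = [
    {"Country": "USA", "City": "San Francisco", "Age": "34", "Gender": "F"},
    {"Country": "Canada", "City": "Toronto", "Age": "51", "Gender": "unknown"},
]

def export_csv(rows, path):
    """Write rows to a CSV file, using the first row's keys as the header."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path

# Written to the system temp directory so the sketch has no side effects elsewhere.
out_path = export_csv(cleaned, os.path.join(tempfile.gettempdir(), "cleaned_customers.csv"))
print("Wrote", out_path)
```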
That completes the third of six tutorials in the Getting Started series.
- Create a New Data Prep Project
- Explore Your Data
- Clean and Transform Your Data
- Publish and Export Your Data
- Combine Your Data - Append
- Combine Your Data - Lookup/Join