How do I profile a dataset?

jmayhew Posts: 28 admin
edited March 31, 2020 8:51PM in Paxata Fundamentals
Paxata can run 30 tests on your data to help you understand it, including:

  • Count of distinct values (to identify if you have a unique identifier in the data).
  • Minimum, average, and maximum data length (to identify any parsing errors or ensure the data adheres to the requirements).
  • Data type distribution (does the data conform to a specified data type, and is there a mixture of data types in the data?).
  • Completeness checks (are there missing values, does the data contain NA, NULL or other empty values)
  • Numeric distribution (any negative values or decimal values)
  • Any additional spaces, excess HTML code, or other anomalies in the data
  • Top 5 value preview (to quickly identify the data in the field)
Paxata performs these tests on each cell in your data and generates a physical report listing the outcomes on a per-field basis.
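
To make the checks above more concrete, here is a minimal Python sketch of what a per-field profile could compute. It is not Paxata's implementation and does not call any Paxata API; the names `profile_field`, `infer_type`, and `MISSING_TOKENS` are hypothetical, and the list of tokens treated as missing is an assumption made for illustration.

```python
from collections import Counter

# Tokens treated as "missing" in this sketch (an assumption, not Paxata's list).
MISSING_TOKENS = {"", "na", "n/a", "null", "none"}


def infer_type(text):
    """Classify a non-missing string as 'integer', 'decimal', or 'text'."""
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "decimal"
    except ValueError:
        return "text"


def profile_field(values):
    """Return a dict of simple profile metrics for one field (column)."""
    # Normalize every cell to a string; None becomes an empty cell.
    cells = ["" if v is None else str(v) for v in values]
    # Completeness: drop cells whose trimmed, lowercased text is a missing token.
    present = [c for c in cells if c.strip().lower() not in MISSING_TOKENS]
    lengths = [len(c) for c in present]
    types = Counter(infer_type(c.strip()) for c in present)
    numbers = [float(c) for c in present if infer_type(c.strip()) != "text"]
    return {
        "count": len(cells),
        "distinct": len(set(present)),
        "missing": len(cells) - len(present),
        "min_length": min(lengths) if lengths else 0,
        "avg_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "max_length": max(lengths) if lengths else 0,
        "type_distribution": dict(types),
        "negative_values": sum(1 for n in numbers if n < 0),
        # Cells with leading or trailing spaces.
        "extra_whitespace": sum(1 for c in present if c != c.strip()),
        "top_5": Counter(present).most_common(5),
    }
```

Calling `profile_field` once per column, for example on each column of rows read with `csv.DictReader`, gives a rough per-field report. A field whose `distinct` count equals `count` with no missing values is a candidate unique identifier.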

To run Data Profiling, please do the following:
  1. Navigate to the Library. (Click Projects in the upper left corner and a dropdown will appear. Choose Library from the dropdown.)
  2. Find the file you wish to profile in the Library. Once located, move your mouse over the file. The Actions Menu will appear.
  3. Move your mouse over More Actions.
  4. The Profile option will appear. Click it to open the profile.

If a profile has already been generated, the results will appear on the screen. Otherwise, click Generate profile.

Once profiling is complete, you can repeat these steps to view the results, look at the data preview of the profile results in the Library, or open the results in a Paxata Project.