How do I do sampling on a specific subset of my dataset?
To do this, first create a new dataset that contains only the subset you need from the full dataset (45 million rows in this example), and then sample from that subset.
Step 1: Use a Filtergram to select the values that define the subset you want to sample. For example, to keep only the rows where HQ STATE = CA, select CA in the Filtergram for the HQ STATE column.
Step 2: Create and publish a Lens to store this filtered view of the dataset so that it can be reused.
Step 3: Once this dataset has been created, bring it into a new Paxata Project and use the Sampling tool.
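
The same filter-then-sample idea can be sketched outside the product. The snippet below is only an illustration in plain Python, not Paxata's API: the function name `sample_subset`, the `seed` parameter, and the example rows are all hypothetical, and the Sampling tool may use a different sampling method than the simple random sample shown here.

```python
import random

def sample_subset(rows, column, value, sample_size, seed=0):
    """Keep rows where row[column] == value, then draw a random sample.

    Hypothetical helper for illustration only; it mirrors Step 1 (filter)
    and Step 3 (sample) but does not call any Paxata functionality.
    """
    # Filter first, so the sample is drawn only from the subset of interest.
    subset = [row for row in rows if row.get(column) == value]
    # A seeded generator makes the sample reproducible between runs.
    rng = random.Random(seed)
    # Never request more rows than the subset contains.
    k = min(sample_size, len(subset))
    return rng.sample(subset, k)

# Made-up example data standing in for the full dataset.
rows = [
    {"COMPANY": "Acme", "HQ STATE": "CA"},
    {"COMPANY": "Birch", "HQ STATE": "NY"},
    {"COMPANY": "Cedar", "HQ STATE": "CA"},
    {"COMPANY": "Dune", "HQ STATE": "TX"},
    {"COMPANY": "Elm", "HQ STATE": "CA"},
]

sample = sample_subset(rows, "HQ STATE", "CA", sample_size=2)
```

Filtering before sampling guarantees that every sampled row belongs to the subset. Sampling the full dataset first and filtering afterwards could leave very few matching rows, or none.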