Joining datasets for feature enrichment

Do you have several datasets that you need to combine so that all of their features end up in a single dataset? For example, suppose you are building a predictive model for whether a patient will be readmitted to the hospital within 30 days of discharge, and you have three datasets: one with admissions data, another with diagnostic codes, and a third with hospital codes that map to categorical values. To train the model, you need to join all three so that every feature is in one dataset.

When you want to join multiple datasets so that all of their features are stacked into one dataset for your ML models, you can use the Lookup tool. The Lookup tool provides a join-type operation that combines another dataset with your Base (driving) dataset. The tool also provides a "Detect Joins" option that visually shows the variable(s) your datasets share, along with a percentage score that guides you on how best to combine the datasets.
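
To make the idea concrete outside the product, here is a minimal Python sketch of the same pattern: find the columns two datasets share, estimate how well they match, and then left-join lookup datasets onto a base dataset. It is only an illustration and not how the Lookup tool is implemented. The column names, the sample rows, and the helper functions (shared_columns, match_percentage, left_lookup) are all hypothetical, and the percentage here is assumed to mean "share of base rows that find a match", which may differ from the score that Detect Joins reports.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]


def shared_columns(base: List[Row], other: List[Row]) -> List[str]:
    """Return column names present in both datasets (candidate join keys)."""
    if not base or not other:
        return []
    return sorted(set(base[0]) & set(other[0]))


def match_percentage(base: List[Row], other: List[Row], key: str) -> float:
    """Percentage of base rows whose key value also appears in the other dataset.

    Assumption: this is one plausible way to score a join, not Paxata's formula.
    """
    other_values = {row[key] for row in other}
    matched = sum(1 for row in base if row[key] in other_values)
    return 100.0 * matched / len(base)


def left_lookup(base: List[Row], other: List[Row], key: str) -> List[Row]:
    """Left join: keep every base row and add columns from the first matching row."""
    index: Dict[object, Row] = {}
    for row in other:
        index.setdefault(row[key], row)  # first match wins
    extra_cols = [col for col in other[0] if col != key]
    joined: List[Row] = []
    for row in base:
        match: Optional[Row] = index.get(row[key])
        merged = dict(row)
        for col in extra_cols:
            merged[col] = match[col] if match else None  # None when no match
        joined.append(merged)
    return joined


# Hypothetical sample data modeled on the readmission example above.
admissions = [
    {"patient_id": 1, "hospital_code": "H1", "readmitted": True},
    {"patient_id": 2, "hospital_code": "H2", "readmitted": False},
    {"patient_id": 3, "hospital_code": "H9", "readmitted": False},
]
diagnoses = [
    {"patient_id": 1, "diagnosis_code": "E11"},
    {"patient_id": 2, "diagnosis_code": "I10"},
    {"patient_id": 3, "diagnosis_code": "J45"},
]
hospitals = [
    {"hospital_code": "H1", "hospital_type": "teaching"},
    {"hospital_code": "H2", "hospital_type": "community"},
]

# Inspect candidate keys and match quality before joining.
diagnosis_keys = shared_columns(admissions, diagnoses)
hospital_match = match_percentage(admissions, hospitals, "hospital_code")

# Admissions is the base (driving) dataset; the other two are looked up onto it.
features = left_lookup(admissions, diagnoses, "patient_id")
features = left_lookup(features, hospitals, "hospital_code")
```

In this sketch the admissions data plays the role of the Base dataset, so every admission row is kept. The third row has a hospital code with no entry in the hospitals data, so its hospital_type is left empty, which is the kind of gap a low match percentage would warn you about before you join.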

For complete details on the Lookup tool, with plenty of examples, see the official documentation: Joining data with the Lookup tool