



Joining datasets for feature enrichment

Do you have several datasets that you need to combine so that all of their features end up in a single dataset? For example, suppose you have a predictive model for whether a patient will be readmitted to the hospital within 30 days of discharge, and you have three datasets: one with admissions data, another with diagnostic codes, and a third with hospital codes that map to categorical values. To feed the model, you need to join all three so that every feature sits in one dataset.

You can combine datasets like these with the Lookup tool. The Lookup tool provides a join-type operation that combines another dataset with your Base (driving) dataset. The tool also provides a "Detect Joins" option that visually shows the variable(s) your datasets share, along with a percentage score that guides you on how best to combine the datasets.
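To make the idea concrete outside the product, here is a minimal plain-Python sketch of the two concepts described above: a left join that enriches a base dataset, and a shared-key percentage score. This is not how the Lookup tool or "Detect Joins" is implemented, and the score formula is an assumption. All function names, column names, and sample values are hypothetical, and the join keeps only the first matching row per key.

```python
from typing import Dict, List

Row = Dict[str, object]


def key_overlap_pct(base: List[Row], base_key: str,
                    other: List[Row], other_key: str) -> float:
    """Percentage of distinct base key values that also occur in `other`.

    Assumed formula for illustration only; the product's score may differ.
    """
    base_values = {r[base_key] for r in base if r.get(base_key) is not None}
    other_values = {r[other_key] for r in other if r.get(other_key) is not None}
    if not base_values:
        return 0.0
    return 100.0 * len(base_values & other_values) / len(base_values)


def lookup_join(base: List[Row], base_key: str,
                other: List[Row], other_key: str) -> List[Row]:
    """Left join: keep every base row and add columns from the matching row."""
    index: Dict[object, Row] = {}
    for row in other:
        index.setdefault(row[other_key], row)  # assumption: first match wins
    # Columns to add, taken from the first row of `other` (key column excluded).
    other_columns = [c for c in (other[0] if other else {}) if c != other_key]
    joined: List[Row] = []
    for row in base:
        match = index.get(row[base_key])
        enriched = dict(row)  # copy so the base dataset is not mutated
        for col in other_columns:
            enriched[col] = match.get(col) if match is not None else None
        joined.append(enriched)
    return joined


# Hypothetical sample data mirroring the three datasets in the example.
admissions = [
    {"patient_id": 1, "hospital_code": "H1", "readmitted": True},
    {"patient_id": 2, "hospital_code": "H2", "readmitted": False},
    {"patient_id": 3, "hospital_code": "H9", "readmitted": False},
]
diagnoses = [
    {"patient_id": 1, "diagnosis_code": "I50"},
    {"patient_id": 2, "diagnosis_code": "E11"},
]
hospitals = [
    {"hospital_code": "H1", "hospital_type": "teaching"},
    {"hospital_code": "H2", "hospital_type": "community"},
]

# Check how well the keys line up before joining.
patient_score = key_overlap_pct(admissions, "patient_id", diagnoses, "patient_id")
hospital_score = key_overlap_pct(admissions, "hospital_code", hospitals, "hospital_code")

# Enrich the base (driving) dataset with the other two, one lookup at a time.
with_diagnoses = lookup_join(admissions, "patient_id", diagnoses, "patient_id")
features = lookup_join(with_diagnoses, "hospital_code", hospitals, "hospital_code")
```

Base rows without a match (patient 3 here) are kept, with the added columns left empty, which is the usual behavior when the goal is to enrich a driving dataset rather than filter it.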

For complete details on the Lookup tool, with plenty of examples, see the official documentation: Joining data with the Lookup tool