Shape (Group By)
This function lets you apply any of several aggregate functions (listed below) to any existing column in the dataset. When you select the Group By button, a pane appears above the dataset. In that pane you specify which columns to include in the Group By process, which column to perform an aggregate function on, which aggregate function to use, and the name of the new aggregate column being created. The dataset displays a preview of your selections, highlighted in blue, so that you can see how they will affect the data. It is important to remember that only the columns included in the "Columns (Aggregates)" field will remain in your data following the Group By function. The included columns are also the ones used to identify duplicate rows for grouping.
For a list of the aggregate functions available for use in Paxata, CLICK HERE.
These operations are called aggregate functions because they find matching rows in the dataset and then combine them into one row. A matching row is one that, excluding the reference column, shares the same values in a column-by-column examination. The reference column is excluded from the column-by-column examination because its values are submitted to the aggregate function to produce the reference column value in the single-row result.
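The matching-and-combining behavior described above can be illustrated with a minimal Python sketch. This is not Paxata's implementation; the `group_by` helper, its parameter names, and the sample columns (`region`, `product`, `rep`, `sales`) are all hypothetical and exist only to show the idea. Rows that share the same values in the included columns are combined into one row, the reference column's values are passed to the aggregate function, and columns that were not included are dropped.

```python
from collections import defaultdict


def group_by(rows, group_columns, reference_column, aggregate, new_name):
    """Combine rows that match on group_columns into a single row each.

    Hypothetical helper for illustration only; not part of any product API.
    """
    # Collect the reference column values for each distinct combination
    # of values in the included (grouping) columns.
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in group_columns)
        groups[key].append(row[reference_column])

    # Build one output row per group. Only the grouping columns and the
    # new aggregate column survive; every other column is dropped.
    result = []
    for key, values in groups.items():
        combined = dict(zip(group_columns, key))
        combined[new_name] = aggregate(values)
        result.append(combined)
    return result


# Made-up sample data: the first two rows match on region and product.
rows = [
    {"region": "East", "product": "Pen", "rep": "Ann", "sales": 10},
    {"region": "East", "product": "Pen", "rep": "Bob", "sales": 5},
    {"region": "West", "product": "Pen", "rep": "Cy", "sales": 7},
]

summary = group_by(rows, ["region", "product"], "sales", sum, "total_sales")
for row in summary:
    print(row)
```

In this sketch the two "East" rows collapse into one row with a `total_sales` of 15, the "West" row stays on its own with 7, and the `rep` column disappears because it was not among the included columns.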