Visit the official Paxata Documentation portal for all of your doc needs.

Using the Split operator

Splitting a column involves taking the values in a single column and distributing them into one or more new columns in the same row. The split is performed either by selecting a character string on which to separate the column or by giving a character count (number of characters) along which the split should occur. There is also a REGEX option for more advanced users as well.

Match

In the “Separator” field, enter one or more characters on which the split should occur. This will serve as the pattern against which the match will occur. Note that the character(s) specified will not become part of any of the resulting columns—they will be excluded from the distribution to the new column(s). Additional columns can be added or removed by clicking the + and - buttons in the Split Column window.

You also have the option to select "Right to left" for the split. By default, the split function parses left to right using the specified separator. The "Right to left" option allows you to parse the text in the column starting on the right. This is particularly useful in cases where you want to separate a string of text, for example a file name, from its directory path. In this case, specify the slash "/" as your Separator, and then select the "Right to left" option.

Length

The “Lengths” field requires one or more numbers, separated by commas. The “Lengths” field is used to determine the number of characters in each of the new columns. Thus, the values “2,3,2” placed in the field will generate one column with the first two characters, create a second column with the subsequent three characters, and create a third column with the next two characters within the original column.

Unlike “Split by Separator,” this type of split does not drop any characters from the middle of the column. However, instead of any unaccounted-for (left-over) characters being placed together in the last column, they are excluded from the new columns completely. To avoid inadvertently truncating a column, it is recommended that the last number specified be large enough to account for the remaining column length.

Note that doing this will not generate a “whitespace” buffer to account if this value is larger than the length of the available character values. This will only ensure that all characters in the column are accounted for in the last column. Extra spaces specified by the “Lengths” parameter that go beyond the number of characters in the column are simply ignored.

Sign In or Register to comment.