Separate

Function separates one table column (variable) into multiple.

Properties

Variable Selected variable (column) that you want to separate.
Into Property for new column names definition. If you type one name, this name will be used as name prefix followed by integer order number. If you type several (separated by comma), the exactly defined names will be used. If you type lower number of new names than the number of new columns, in resulted table will be keeped only columns with predefined names. Other will be ignored (Extra = drop) or merged in the first or last column (Extra = merge).
Sep Mode Selection of separation method.
  • string - separation based on simple string separator definition
  • position - separation of one column to two new columns based on the position in string from the start (positive integer number) or end (negative integer number) of the string.
  • regex - separation based on separator defined by regular expressions
Separator String, position number or regular expression that determines separation of new columns.Character, is interpreted as a regular expression. Integer number is interpreted as positions to split at. Positive values start at 1 at the far-left of the string; negative value start at -1 at the far-right of the string.
Extra If the Separator property is defined as string value, this property controls what happens when there are too many new columns (see Into property). There are two options:
  • drop - remove extra values
  • merge - only splits at most in relation to count of predefined columns (Into property)
Fill If the Separator property is defined as string value, this property controls what happens when there are not enough new columns predefined (Into property) and property Extra is is set to merge. There are two options:
  • right - fill with missing values on the right (last new column)
  • left - fill with missing values on the left (first new column)
This property also controls what happen when in a row we have less values than new columns. If the Fill property is set to right, values are in new columns “aligned” to the right and vice versa.
Remove Input Boolean. Property determines whether the input column will be removed from the result table.

Keywords

Separate Excel Columns, One Excel Column into Multiple, Split Excel Column, Split Cell, Divide Excel Column, Split Text to Columns, Excel Text to Columns Alternative

See Also

Separate_To_Rows, Gather, Transpose

Video-tutorial

How to Separate One Excel Column into Multiple

Separating values from one spreadsheet column into multiple is a relatively common task that needs to be done in Excel. For this purpose, we use the Text to Columns feature. Its enhanced alternative in Reshape.XL add-in is the Separate function. In addition, we can use the Separate_To_Rows function, but its character is different. An example of Separate function application is in the following figure.

In this tutorial, we will use a very simple table as the sample dataset that contains two variables - Group and Vals. Our task will be to divide the values in the Vals column into separate new columns. For this purpose, we will use a single adjustable and very useful Separate function. By adjusting its properties, you can get at least 14 different results from the same input variable. Their setup is very simple, intuitive and fast.

To add the described function to the processing, click on the Separate button in the Variables ribbon toolbar tab.

After adding a function to the Side Panel, we can display its adjustable properties. Overall, there are 7 properties.

First, set the first combo-box - Variable, which defines the variable (column) from the input table that you want to split.You can leave the Into property blank. Property Sep Mode defines the separation type. We leave this value in the “string” option. Then we define the Separator property - a text string that divide individual values in a string.This is a comma in our case.

Then press the Run button and the add-in separates the values into new columns. The result table is missing the input variable Vals (property Remove Input = TRUE). The program separated the values and put them into new columns. Their names were generated by default as the original (source) variable name “Vals”, followed by a numerical index of 1 to 6.

If you want to use a different prefix than the original column name, write it to the Into property. In the following example, we have defined the prefix of new column names as “Col”. The program then generated new column names in the form of “Col_1” to “Col_6”.

Property Into has a great importance. Its possibilities are described in the following two examples. If you want to exactly define the names of new columns, type them into the Into property and separate them with a comma. In this case, however, it is necessary to define the same number as the number of resulting new columns. In the following example, we defined these names as alphabet characters from A to F. The result is shown in the figure.

The following example also presents the importance of the Into property. In this case, we have defined only two new names - A and B. The program understands this setting to limit the number of newly created columns to two. Add-in separates the individual segments of the text string and only stores the first two values in the resulting table. Other values are ignored (property Extra = drop).

With the Extra property you can achieve a slightly different result. If you set it to merge, the remaining values that did not fit the predefined number of new columns are not excluded. These values are inserted to existing columns. Where they are embedded is defined through the Fill property.

If this value is set to “left”,the program will split the text string and inserts these values to the first (left) column.

Conversely, if the Fill property is set to “right”, the program inserts redundant values into the last new column. In this way you can divide the variable into precisely defined number of new variables, while values that exceed the predefined number of new variables (property Into) will not be lost.

Another type of output is obtained if the program generates the necessary number of new variables and you set the Fill parameter to “right”. In this case, the input column values are split into the necessary number of new columns. However, their images are different from the first example in that they are “right-aligned”. This means that if the selected row contains fewer values than the new columns, the values are inserted in a way the last columns are filled preferentially.

Another example shows the change in the Remove Input property. If we change it to FALSE, the resulting table also displays the original Vals column from which the new variables were generated. By default, this parameter is set to TRUE and the input variable is removed from the resulting table.

So far, we have described the possibility of separating variables from a text string using a predefined separator (string). This separator can be defined as one or more characters such as a comma or dot.

However, the program also includes two other methods of separation. This method can be selected using the Sep Mode property. In the previous examples, we used the string method. In addition, you can split the text string by position or by using regular expressions (regex).

In the next example, we will separate the variable into two new variables by defining the position of separation. We set property Sep Mode to position and we defined the split position as an integer value through the Separator property. We set the value to 3, which means that the variable is divided into two new variables, with the split position being defined after third character in the string.

Sometimes, it is necessary to divide the variable by position from the end of the text string. Function Separate also supports this option. To separate by position from the end of the string, define the Separator property value as negative. For example, if you set this value to -3, the text string will be divided into two new ones, and the split position will be defined before the third character of the text from the end.

The last possibility of separation is through regular expressions. To use them, set the Sep Mode property to regex. This type of separation is a bit more complex and requires a basic understanding of regular expressions. On the other hand, this option allows for very advanced types of separations.

A regular expression is a sequence of characters that define a search pattern. Add-in Reshape.XL uses the .NET implementation of regular expressions. More about it you can read here.

Use this separation if you want to separate the variable according to a more complex string that cannot be defined exactly, but in the form of its properties. For example, you can use this type to separate a variable before each occurence of an email in the text.

In the first example, we separated the variable when the comma was present in the cell. However, the same result can also be achieved by using the same Separator property and setting the Sep Mode to string.

The second example shows a more complex case. In this case, we separate the original variable in positions where a comma or dot occurs (property Separator = “[,]|[.]”).

The last example shows the case where the variable needs to be divided into new columns on any character in the range from “A” to “z” (property Separator = “[A-z]”). We did not define exactly the value but its character.

In this way, very advanced and complex types of separation can be defined. On the Internet, you will find many examples that you can copy and apply directly. For example, you can separate a variable at the second occurence of a phone number, or at each twitter name. All you need to do is look for the regular expression (search for “regular expression, detect xxxx”).

The Separate function delivers an extensive separation of one Excel column into multiple new ones and its definition is relatively simple. In addition to the Separate function, you can also use Separate_To_Rows for separation. However, this function has a slightly different character - the selected variable split and does not create new columns but new lines.