Function selects only unique (distinct) rows from an input table - removes duplicate data.

Select Variables |
Selected variables (table columns) that will be used when determining uniqueness. In the case of non-unique rows, only the first row will be preserved. If no variable is selected, program will use all table variables. |

Keep All |
Boolean. If TRUE, all variables (columns) will be preserved in result dataset. If FALSE, only selected variables remains in the result dataset. |

distinct, distinct alternative in Excel, unique excel rows, select unique, select distinct, excel, spreadsheet table, select unique, select distinct, remove duplicated, duplicated combinations, filter for unique values

A common task when working with data is to obtain unique values or unique combinations of values in selected variables - columns of the table. We often use this feature to analyze an input dataset that contains several categorical variables. An example of unique (distinct) values selection is shown in the following figure. Within the Reshape.XL add-in, we use the **Distinct** function to select unique values. In the following text, this function with all its options is described.

As an input dataset we use a relatively simple table that contains three categorical variables - *Group*, *Group_2* and *Parameter* and one numeric variable - *Value*.

To select unique column values from this dataset, click the **Distinct** button in the ribbon toolbar tab **Subset**.

The function is displayed in the Side panel. If you click on it, its adjustable parameters will appear at the bottom of the Side panel. Overall, the function contains two definable parameters - **Keep All** and **Select Variables**.

Use the **Select Variables** parameter to define variables in which you want to select unique values. In the following example, we looked for distinct values in the *Group* variable. As a result, a table containing only one *Group* variable was displayed, and this variable displays three unique values that occurred in the input dataset - *Spring*, *Well* and *River*.

If you want to search for unique combinations of several variables (columns), we set their check-boxes to **TRUE**. In the following example, we searched for unique combinations within two variables - *Group* and *Group_2*.

In previous examples, we have always obtained only values for the variables in which we searched unique values. To keep other columns in the resulting table, set the **Keep All** property to *TRUE*. In this case, the program looks for unique values in the selected variables and also selects the first values in the order for other (unused) variables.

An example is shown in the following figure. The program was looking for unique combinations for the *Group* and *Group_2* columns. If found, he added to the other columns the values from the first record in the order that contained the current combination of unique values (*Group* and *Group_2*).

If we select a combination of other variables, the character of result table (the number of rows) will change. An example is shown in the following image, where we get unique records by the *Group_2* and *Parameter* columns.

Finally, the last option is to search for **unique rows** - records where the combination of all values in the row is unique. This type of distinct records is obtained if you select all the variables in the **Select Variable** property, or vice versa if you do not select a single variable. The result will be the same for both settings.

Finding and selecting unique (distinct) rows in Excel tables is very easy and fast with the Reshape.XL add-in. Other functions from the Subset group that are interesting are **Filter**, **Sample** or **First_N**.