Format Variables in Reshape.XL

Published on 20 Mar, 2019 by Milos Gregor

Variable formatting is an important part of the data preparation prior to statistical processing. Automatic conversion and data type recognition in Excel sometimes causes great problems. These problems often resulting in errors at statistical processing. Within the Reshape.XL add-in, the approach to the definition of data types is different.

The data type of each dataset variable (column) is defined at the data import. Data type is the same and unchanging within the column. Since 1.1 version of the add-in, automatic data type detection of variable has been improved. The program automatically recognizes the appropriate data type that is used in the resulting dataset. However, it is sometimes useful to modify this type manually.

You can make this adjustment on import. If you want to change the data type later  - during data processing, you can do it using the  Format function. An example of its use is shown in the following illustration. In this case, we’ve reformatted variables from the String data type to Numeric and DateTime.

Changing the data type is very simple process. Described functionality is introduced in the following short video tutorial.

It is clear from the video that the variable formatting is easy-to-use operation. In some cases, however, complications may occur. These concern most of the cases where we have input data in non-standard format. For example, it is about  DateTime and  Numeric values which are stored in the table as  String values in a specific form.

It can be a numeric value where as a decimal separator is used comma instead of a dot or a date value in a special, culture-related format. However, these problems can be solved manually by setting the Culture property in the  Format function.

In addition, the program also includes two additional special features for DateTime values parsing  -  Date_Parse_Exact and  Date_Parse_Num. The  Date_Parse_Exact function is used to extract  DateTime values from  String using the exact date format specification (e.g. YYYY-MM-DD hh:mm:ss ). The  Date_Parse_Num function allows you to extract the date from variable that is stored in Excel as a numeric value (e.g. number  38503 is equal to 31 May 2005 ).