4 Interesting Functions for Missing Values Processing in Excel

Published on 21 Aug, 2019 by Milos Gregor

Missing data is very common in our datasets. Their origin may be diverse - measurement drop, data write errors or records removing during the data cleansing process. For some statistical analyzes we need tables without missing data. We need to either remove these incomplete rows, fill in the empty cells, replace them by constant value or select rows with missing data.

missing data in excel

For this purpose, the Reshape.XL add-in has four interesting functions. These are Fill_NA, Select_NA, Replace_NA and Drop_NA. Each of these functions does something else with the missing data. Drop_NA is used to delete records (rows) that contain one or more empty cells. The function allows you to select whether to delete rows with one or more cells containing missing data. It is also possible to specify selected columns of the table in which the missing data will be searched.

The second option is to fill the empty cells. You can use the Fill_NA function for this purpose. If you don’t want to delete records with missing data, you can fill them within each column using the last (or the next following) known value. To run the function, you only select the columns in which you want to fill in the missing data and set the direction parameter (“Down” = fill according to the last known value, “Up” = fill according to the next known value).

Another option to fill in missing cells is the Replace_NA function. This function makes it possible to complete all the empty cells in selected columns using a constant value. For example, you can use -999.99 for numeric variables and “Empty” or “NULL” for string variables.

The Select_NA function has a different character. Use this function to select rows that contain empty cells in one or more table columns. This function can be used for specific processing of records with missing data. For example, you can analyze the nature of missing data, such as its patterns - the reasons of occurrence.

A quick overview of the practical application on the sample dataset shows the following short video tutorial.

Using the Drop_NA, Fill_NA, Select_NA and Replace_NA functions is very easy and fast in the Reshape.XL add-in. In addition to the functions described above, other built-in functions are also useful for working with missing data - Pack, Filter, Mutate, Mutate_IF or Complete.

For example, the Complete function allows you to transform implicit missing data into explicit. We'll talk more about this function in the next post.