What are robust statistics and how do they differ from non-robust statistics?

09/29/2023 | By: FDS

Robust statistics are methods of data analysis that are resilient to outliers and other deviations in the data. In contrast, non-robust statistics are sensitive to outliers and can be heavily influenced by a small number of deviating values.

Outliers are values in a data set that differ significantly from the other data points. They can be caused by various factors, such as measurement errors, unusual conditions, or real but rare events.

Non-robust statistics often rely on assumptions about the distribution of the data, such as normality. If these assumptions are violated, for example by outliers, the results can become unreliable. The mean and standard deviation are typical cases: a single extreme value can shift both substantially.
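As a rough illustration of this sensitivity, the following sketch compares the mean and sample standard deviation of a small data set with and without one extreme value. The numbers are made up for the example and do not come from any real measurement; the variable names are likewise only illustrative.

```python
import statistics

# Hypothetical measurements (invented for illustration).
clean = [10, 11, 12, 13, 14]
# Same data, but the last value is assumed to be a recording error.
with_outlier = [10, 11, 12, 13, 140]

mean_clean = statistics.mean(clean)
mean_outlier = statistics.mean(with_outlier)

# statistics.stdev computes the sample standard deviation (n - 1 denominator).
sd_clean = statistics.stdev(clean)
sd_outlier = statistics.stdev(with_outlier)

print("mean:", mean_clean, "->", mean_outlier)
print("standard deviation:", round(sd_clean, 2), "->", round(sd_outlier, 2))
```

In this example a single changed value moves the mean from 12 to 37.2 and inflates the standard deviation from about 1.6 to more than 50, even though four of the five values are unchanged.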

Robust statistics, on the other hand, try to minimize the impact of outliers. They are based on methods that are less sensitive to deviating values. An example of a robust statistic is the median, which is the middle value in a sorted series of data. The median is less sensitive to outliers because it depends on the rank order of the values rather than on their exact magnitudes: making the largest value even larger does not change which value sits in the middle.
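A minimal sketch of this behavior, again using invented numbers: replacing the largest value with an extreme one leaves the median unchanged, because the middle position in the sorted data is still occupied by the same value.

```python
import statistics

# Hypothetical measurements (invented for illustration).
clean = [10, 11, 12, 13, 14]
# Same data with the largest value replaced by an assumed outlier.
with_outlier = [10, 11, 12, 13, 140]

median_clean = statistics.median(clean)
median_outlier = statistics.median(with_outlier)

print("median:", median_clean, "->", median_outlier)
```

Both medians are 12 here. This holds only as long as the outliers stay a minority of the data; if more than half of the values were contaminated, the median would be affected as well.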

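Another example of a robust statistic is the MAD (Median Absolute Deviation), which measures the dispersion of the data around the median. It is computed as the median of the absolute deviations of each value from the median of the data. Because it relies on medians rather than on the mean of squared deviations, the MAD provides a more robust estimate of spread than the standard deviation.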
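The MAD can be sketched in a few lines using only Python's standard library. The helper name `median_absolute_deviation` is chosen for this example and is not a function of the `statistics` module; the data are invented.

```python
import statistics

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (unscaled MAD)."""
    center = statistics.median(values)
    deviations = [abs(v - center) for v in values]
    return statistics.median(deviations)

# Hypothetical measurements (invented for illustration).
clean = [10, 11, 12, 13, 14]
with_outlier = [10, 11, 12, 13, 140]

mad_clean = median_absolute_deviation(clean)
mad_outlier = median_absolute_deviation(with_outlier)

print("MAD:", mad_clean, "->", mad_outlier)
```

For both data sets the absolute deviations have a median of 1, so the MAD does not react to the single extreme value, whereas the standard deviation of the same data would change drastically.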

In general, robust statistics have the advantage of providing more reliable results when there are outliers or other distortions in the data. They depend less on assumptions about the distribution of the data and can be a better choice in many situations, especially when the data is incomplete, inaccurate, or not normally distributed.
