How to remove duplicates in lists

TL;DR
  • convert to a set and back, list(set(my_list)), if you don't care about the list order
  • use list(dict.fromkeys(my_list)) if you do care about the list order

Context

When the initial list order doesn't matter

Python makes it easy to convert between different container types. Remember that a list can hold multiple items with the same value. A set, on the other hand, can only hold unique values.

>>> my_list = [1, 2, 4, 5, 6, 5, 2]
>>> list(set(my_list))
[1, 2, 4, 5, 6]

Converting the list to a set and back to a list removes all duplicates in one step. The main drawback is that a set is unordered, so the resulting list is not guaranteed to keep the order of the original list. The items must also be hashable, which rules out, for example, a list of lists.
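
Because the order is not guaranteed, it is safer to compare contents than positions when checking the result. The following is a minimal sketch of that idea; the helper name dedupe_unordered is made up for this illustration and is not part of any library.

```python
def dedupe_unordered(items):
    """Return the unique items of `items`; the result order is not guaranteed."""
    return list(set(items))


names = ["bob", "alice", "bob", "carol", "alice"]
unique_names = dedupe_unordered(names)

# Every original value is still present, exactly once ...
print(len(unique_names))                # 3
print(set(unique_names) == set(names))  # True

# ... but the position of each name may differ between runs,
# so sort (or compare as sets) instead of relying on the order.
print(sorted(unique_names))             # ['alice', 'bob', 'carol']
```

If you need a predictable order and don't mind losing the original one, wrapping the result in sorted() is usually enough.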

When the initial list order matters

If we do care about list order, we can make use of a dict instead. Remember that a dict holds key:value pairs where each key is unique across the dict, while values are allowed to repeat. For our problem, we rely on two facts: keys are unique, and keys keep the order in which they were added.

  • dict order: a dict preserves the order in which its keys were added. This is guaranteed by the language since Python 3.7; in CPython 3.6 it was only an implementation detail, and you shouldn't assume it for earlier versions

# TL;DR: we care about list order
>>> my_list = [1, 2, 4, 5, 6, 5, 2]
>>> print(list(dict.fromkeys(my_list)))
[1, 2, 4, 5, 6]
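
The effect is easier to see with an input whose first occurrences are not already sorted. Here is a small sketch, assuming Python 3.7 or newer; the helper name dedupe_keep_order is hypothetical and only used for this example.

```python
def dedupe_keep_order(items):
    """Return the unique items of `items`, keeping the first occurrence of each."""
    # dict.fromkeys() stores each item once as a key, in insertion order;
    # list() then reads those keys back in that same order.
    return list(dict.fromkeys(items))


print(dedupe_keep_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
print(dedupe_keep_order("banana"))         # ['b', 'a', 'n']
```

Each value stays at the position where it first appeared, and later repeats are dropped.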

The .fromkeys() method creates a dict from the passed iterable of keys. Every key gets the value None unless you pass in a value, which is then used for all keys.

>>> some_keys = [1, 2, 3]
>>> print(dict.fromkeys(some_keys))
{1: None, 2: None, 3: None}
>>> print(dict.fromkeys(some_keys, "hello"))
{1: 'hello', 2: 'hello', 3: 'hello'}

Remember Me

  • unique keys: remember that the keys of a dict are unique
  • dict order: key insertion order is guaranteed since Python 3.7 (in CPython 3.6 it holds as an implementation detail). On older versions this approach may not preserve the order of your list
  • list conversion: building the dict with dict.fromkeys(my_list) drops the duplicates, and list(some_dict) then returns the remaining keys in the initial order of your list
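
If the code has to run on an interpreter where plain dicts don't guarantee order, one possible fallback is collections.OrderedDict from the standard library, which also provides fromkeys(). This is a sketch of that alternative, not something the approach above requires; the helper name dedupe_keep_order_compat is made up for illustration.

```python
from collections import OrderedDict


def dedupe_keep_order_compat(items):
    """Order-preserving de-duplication that does not rely on plain dict ordering."""
    # OrderedDict remembers insertion order regardless of the Python version.
    return list(OrderedDict.fromkeys(items))


print(dedupe_keep_order_compat([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

On Python 3.7 and newer, the plain dict.fromkeys() version gives the same result with less ceremony.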