Priority Queue in Python

Priority queue is a data structure similar to a queue, but where each element has an associated priority. A queue is a first in, first out (FIFO) data structure, whereas in a priority queue the element with the highest priority is served before lower priority elements.

Priority queue in Python? Short answer: use queue.PriorityQueue

Python comes with a built-in PriorityQueue class, contained in the queue module.

In the simplest case, an entry in the priority queue will be a tuple (priority_number, data). Here's a dummy example of how to use it:

import random
from queue import PriorityQueue


def generate_items():
	items = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
	priorities = [2, 1, 3, 3, 0, 1, 1, 2]

	return items, priorities


if __name__ == '__main__':
	items, priorities = generate_items()

	print('Items/Priorities')
	for item, priority in zip(items, priorities):
		print((item, priority))
	
	q = PriorityQueue()
	
	# We add items to priority queue in the order they're listed
	# a -> b -> c -> ...
	for item, priority in zip(items, priorities):
		q.put((priority, item))

	# Let's get items from the priority queue one by one
	# They will be printed in the order of their priority
	print('Getting items from the priority queue')
	while not q.empty():
		next_item = q.get()
		print(next_item)



And here's the output:

Items/Priorities
('a', 2)
('b', 1)
('c', 3)
('d', 3)
('e', 0)
('f', 1)
('g', 1)
('h', 2)

Getting items from the priority queue
(0, 'e')
(1, 'b')
(1, 'f')
(1, 'g')
(2, 'a')
(2, 'h')
(3, 'c')
(3, 'd')

Note that by default, the lower the value of the priority number, the higher the priority of the entry. That is priority 0 is higher than priority 1.

Technically, PriorityQueue is implemented using a min-heap data structure. That means the smallest priority value always comes out first. If we want to reverse this, we can simply use the negative of the priority value when adding elements to the queue.

As a more robust solution, we can wrap our queue items in a custom class with overloaded comparison operators implemented to give us the desired ordering. We'll use priority queue this way in our next example.

Resolving equal priorities and order stability

Let's consider a more involved example. We'll use it to reveal two important problems with our tuple-based solution from the previous section.

Assume we have a customer support center, and we want to implement a queueing policy that prioritizes requests from older customers (those who signed up for our service earlier). To make it more fair, we also want the queue to be stable with respect to the insertion order - if two or more customers with the same priority are in the queue, we must serve them in the order of arrival.

To make things more realistic, each item we push to the queue be an instance of the Task class. It will contain a reference to a customer and some details about their request. For prioritization, we'll simply use the year of signup.

Let's start by defining the Customer class and generating some instances:

import random
from queue import PriorityQueue
from dataclasses import dataclass
from datetime import date
import functools


@dataclass
class Customer():
	signup_date: date
	first_name: str
	last_name: str

	def signup_year(self):
		return self.signup_date.year


def generate_customers(n):
	first_names = ['Jessica', 'Mary', 'Patricia', 'Jennifer', 'Elizabeth', 'James', 'John', 'Robert', 'Michael', 'Noah', 'Damian']
	last_names = ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Rodriguez', 'Murphy', 'Walsh', 'Wilson', 'Taylor']

	customers = []
	for i in range(n):
		c = Customer(
			first_name=random.choice(first_names),
			last_name=random.choice(last_names),
			signup_date = date(
				year=random.randint(2020, 2022),
				month=random.randint(1, 12),
				day=random.randint(1, 28)
			)
		)
		customers.append(c)

	return customers


if __name__ == '__main__':
	random.seed(3)
	customers = generate_customers(5)
	for customer in customers:
		print(customer)

Running this, we get the following output:

('2022-03-12', 'Jennifer Taylor')
('2022-10-03', 'Noah Walsh')
('2021-05-18', 'Noah Smith')
('2022-08-18', 'Jennifer Brown')
('2021-11-28', 'Michael Walsh')

Let's now create the Task class which will hold a customer and the description of why they are in the queue.

class Task():
	def __init__(self, customer, description):
		self.customer = customer
		self.description = description
		self.priority = customer.get_signup_year()
	
	def __repr__(self):
		return str((self.priority, self.description, self.customer))

For our customers, the list of tasks might look something like this:

Task((2022, 'Delivery delay', ('2022-03-12', 'Jennifer Taylor')))
Task((2022, 'Delivery delay', ('2022-10-03', 'Noah Walsh')))
Task((2021, 'Unavailable products', ('2021-05-18', 'Noah Smith')))
Task((2022, 'Delivery delay', ('2022-08-18', 'Jennifer Brown')))
Task((2021, 'Unavailable products', ('2021-11-28', 'Michael Walsh')))

If we try to add these tasks to a priority queue, it will not work. We get the following error:

TypeError: '<' not supported between instances of 'Task' and 'Task'

Why? It's because our PriorityQueue doesn't know how to compare (i.e. prioritize) the Tasks. Comparison operators are not implemented for the class Task.

Fair enough. Let's just employ our old trick - add a (priority, Task) tuple to the queue. But we get the same thing again:

TypeError: '<' not supported between instances of 'Task' and 'Task'

Why doesn't this work? In the simplest case, we'll put into the queue tuples of (priority, item). Behind the scenes, PriorityQueue will perform tuple comparisons to determine which queue item is the smallest, i.e. at the front of the queue. Remember that tuples are compared position by position, comparing corresponding elements. In our case, this means that if the priorities of some two entries are equal (tuple position 0), we proceed to compare the items (tuple position 1).

In many real-world applications, this can pose two problems:

  1. Comparison between items is not always possible, i.e. comparison operators are not defined for the item class
  2. For any two items of the same priority, how do we return them from the queue in the order in which they were added to the queue?

For problem #1 we have an obvious solution - wrap the items in a class where we can implement comparison operators. We can hold the priority as a field in this class and encapsulate prioritization within comparison operators. For us, this will be our Task class.

For problem #2, sort stability, there is also a simple, although less obvious solution: add an entry counter to priority comparison. Whenever we add an item to the queue, we increase the counter. We use the counter to resolve ties in comparison between equal priority items.

The intuition here is simple. Arrive at the queue earlier, and you get a smaller entry counter value. If somebody with the same priority arrives after you, you go first because of your lower entry counter value.

If we use tuple entries, we can simply add the entry counter right after priority in the tuple: (priority, entry_counter, item). The entry counter in the tuple solves our sorting stability problem - in case of equal priorities, we compare the second entry in the tuple, the entry counter. But it also solves our sorting operator problem - since entry_counter must be unique for every queue entry, we'll never need to compare items directly.

We'll leave this as an exercise for the reader and proceed with encapsulating priority and entry_counter within the Task class. This way, we'll only push Task instances to the queue instead of (priority, entry_counter, Task) tuples.

That said, let's fix our Task class:

@functools.total_ordering
class Task():
	def __init__(self, customer, description, entry_counter):
		self.customer = customer
		self.description = description
		self.priority = customer.get_signup_year()
		self.entry_counter = entry_counter
	
	def __repr__(self):
		return "Task(" + str((self.priority, self.entry_counter, self.description, self.customer)) + ")"
	
	def __lt__(self, other):
		return (self.priority, self.entry_counter) < (other.priority, other.entry_counter)

	def __eq__(self, other):
		return (self.priority, self.entry_counter) == (other.priority, other.entry_counter)

Two important things are going on here:

  • We added the entry_counter field, and we'll make sure to set it properly before adding a task to the queue.
  • We implemented comparison methods for the Task class. To make instances of a class comparable, we implement < and == operators (__lt__ and __eq__ methods, respectively). The @functools.total_ordering decorator completes the class with other comparison operators: <= , > and >= .

Now we can complete our example by adding our customers to the queue and serving them in order of priority while respecting arrival order in case of equal priorities.

import random
import functools
from datetime import date
from dataclasses import dataclass
from queue import PriorityQueue

@dataclass
class Customer():
	signup_date: date
	first_name: str
	last_name: str
	
	def get_signup_year(self):
		return self.signup_date.year

	def __repr__(self):
		return str((self.signup_date.strftime('%Y-%m-%d'), self.first_name + ' ' + self.last_name))


def generate_customers(n):
	first_names = ['Jessica', 'Mary', 'Patricia', 'Jennifer', 'Elizabeth', 'James', 'John', 'Robert', 'Michael', 'Noah', 'Damian']
	last_names = ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Rodriguez', 'Murphy', 'Walsh', 'Wilson', 'Taylor']

	customers = []
	for i in range(n):
		c = Customer(
			first_name=random.choice(first_names),
			last_name=random.choice(last_names),
			signup_date = date(
				year=random.randint(2020, 2022),
				month=random.randint(1, 12),
				day=random.randint(1, 28)
			)
		)
		customers.append(c)

	return customers


@functools.total_ordering
class Task():
	def __init__(self, customer, description, entry_counter):
		self.customer = customer
		self.description = description
		self.priority = customer.get_signup_year()
		self.entry_counter = entry_counter
	
	def __repr__(self):
		return "Task(" + str((self.priority, self.entry_counter, self.description, self.customer)) + ")"
	
	def __lt__(self, other):
		return (self.priority, self.entry_counter) < (other.priority, other.entry_counter)

	def __eq__(self, other):
		return (self.priority, self.entry_counter) == (other.priority, other.entry_counter)


if __name__ == '__main__':
	random.seed(3)
	customers = generate_customers(5)

	q = PriorityQueue()
	
	print("Customers arriving...")

	entry_counter = 0
	for customer in customers:
		# Create a task for the customer
		task = Task(
				customer=customer,
				description=random.choice(['Delivery delay', 'Poor product quality', 'Unavailable products']),
				entry_counter=entry_counter
			)
	
		# Add the task to queue
		q.put(task)
		
		# Increment entry counter
		entry_counter += 1

		print(task)

	# Pull the elements from the queue one by one
	print()
	print("Customers served...")
	while not q.empty():
		next_customer = q.get()
		print(next_customer)


Pay attention to lines #68 - #81. This is where we create Task instances and assign the entry_counter to each of them.

If you run the snippet above, you'll get the following output:

Customers arriving...
Task((2022, 0, 'Delivery delay', ('2022-03-12', 'Jennifer Taylor')))
Task((2022, 1, 'Delivery delay', ('2022-10-03', 'Noah Walsh')))
Task((2021, 2, 'Unavailable products', ('2021-05-18', 'Noah Smith')))
Task((2022, 3, 'Delivery delay', ('2022-08-18', 'Jennifer Brown')))
Task((2021, 4, 'Unavailable products', ('2021-11-28', 'Michael Walsh')))

Customers served...
Task((2021, 2, 'Unavailable products', ('2021-05-18', 'Noah Smith')))
Task((2021, 4, 'Unavailable products', ('2021-11-28', 'Michael Walsh')))
Task((2022, 0, 'Delivery delay', ('2022-03-12', 'Jennifer Taylor')))
Task((2022, 1, 'Delivery delay', ('2022-10-03', 'Noah Walsh')))
Task((2022, 3, 'Delivery delay', ('2022-08-18', 'Jennifer Brown')))

See how our customers arrived in a random order but were served (taken out of the queue) by the year of their signup (older customers first). Between customers from the same year, we kept the order in which they arrived in the queue.

As an exercise, try removing entry_counter from comparison operators and observe how this affects the serving order.

Is the Python priority queue thread safe?

Yes, all classes from the built-in queue module are thread-safe. According to the official documentation:

Internally, those three types of queues [Queue, LifoQueue, PriorityQueue] use locks to temporarily block competing threads; however, they are not designed to handle reentrancy within a thread.

Is priority queue the same thing as heap?

Yes and no. While the terms are sometimes used interchangeably, the difference is subtle.

A heap is a data structure (a tree) with an important property - in a max-heap the largest element is always at the root; in a min-heap it is the smallest element. Heaps can efficiently maintain this property (called heap property) while adding or removing elements to the data structure.

This makes heap an obvious choice for implementing priority queues. Priority queue is an abstract data structure with properties described in the opening section of this article:

a priority queue is an abstract data type similar to a regular queue or stack data structure in which each element additionally has a priority associated with it. In a priority queue, an element with high priority is served before an element with low priority.

Python's queue.PriorityQueue is, in fact, built on top of the heapq module. This module provides a lower lever interface to a heap-based priority queue.

Aside from the heap, there are other ways to implement a priority queue, although mostly less efficient.