In my posts Simple Immutable Queues and Enumeration of Immutable Queues I’ve alluded to a nasty flaw that’s waiting to bite. What is that flaw, and why do we need to worry?

The average case performance of .Enqueue() for our simple queue is O(1) - essentially, constant time. (See wikipedia for some background of Big-O notation if you’re not familiar with its use.) Unfortunately, this isn’t quite good enough.

The problem is one of worst case performance for .Dequeue(). Think about what happens if your application ends up queuing a million items on the queue before it tries to remove any of them at all.

The first item we .Enqueue() on the queue ends up as the Head of the queue:

The second item added to the queue ends up as the lone item on the _outbound stack:

The remaining 999,998 items end up on the _inbound stack:

When we want to .Dequeue() our first item, a new value for Head is taken from _outbound, leaving that stack empty. Then we want to .Reverse() the _inbound stack to put all of the items in the right order for the _outbound stack.

Unfortunately, what .Reverse() does is to create an entirely new stack, sharing nothing with the original. Building the new stack requires iterating through all 999,998 items, allocating a new stack item for each one - that’a lot of allocation.

When we’re calling .Dequeue(), the entire original stack (all 999,998 items) is immediately dumped for the garbage collector to handle. A similar thing happens when we iterate through the queue, though in this case we’re dropping items from the new stack for the garbage collector one at a time.

In addition to the high memory overhead, having to reverse a million item _inbound stack also has the effect that the very first Dequeue() operation will take a disproportionally long time, perhaps causing a delay significant enough to cause hiccup in the operation of the application.

The critical thing here is the magnitude of the variation - while some calls to .Dequeue() complete very quickly, other calls might literally take a million times longer. Which some variation in call duration is normal and expected, this degree of difference is cause for concern. If we hadn’t spotted the issue here, it might have lain in the code dormant until a large load happened in production and caused problems there. In the normal case, and almost certainly in all of the test cases, queue performance is quite good enough.

How do we fix it?

What we need to do is to prevent the _inbound stack from getting too large before we Reverse() it, to minimize the amount of debris we create, and to reduce the variation in execution time. Fortunately we can employ a cunning trick.

About this series

Learning about immutable types by implementing some sample data structdures.

Posts in this series

Stacks
Stack Diagrams
Enumerating Stacks
Stack Equality
Stacks Miscellany
Queues
Simple Queues
Reversing Stacks
Enumeration of Queues
The Problem with the Simple Queue
Complex Queues
Queue Concatenation
Factory Methods
Testing Immutable Types
Type Miscellany
Why Immutable Types?
Prior post in this series:
Enumeration of Queues
Next post in this series:
Complex Queues

Comments

blog comments powered by Disqus