The following reference counting implementation:

Code:
Object *Object::Retain()
{
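	// Relaxed is enough: the caller already holds a reference, only the counter itself has to be atomic
	_refCount.fetch_add(1, std::memory_order_relaxed);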
	return this;
}

void Object::Release()
{
	// Release: publish this thread's writes to the object before its reference is dropped
	if(_refCount.fetch_sub(1, std::memory_order_release) == 1)
	{
		std::atomic_thread_fence(std::memory_order_acquire); // Synchronize all accesses to this object before deleting it

		CleanUp();
		delete this;
	}
}



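For the longest time I did reference counting with strong memory ordering (std::memory_order_acq_rel), until I learned that relaxed ordering is enough for the increment. A read-modify-write operation is atomic regardless of its memory order, and a thread that retains an object already holds a valid reference to it, so nothing else has to be ordered around the increment.
Dropping a reference is a bit more complicated. Every decrement needs release semantics, so that whatever the thread did to the object is published before it gives up its reference. However, only the thread that relinquishes the final reference needs an acquire operation, to make sure all of those changes are observed before entering the destructor. That's why the acquire is a separate fence that only runs when the count hits zero.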
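To see the pattern outside of my Object class, here is a minimal self-contained sketch. Counted, RunDemo and the destroyed counter are made up for the demo and aren't part of the original code; the memory orders are the same as above.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for Object, only here to make the sketch compile on its own
class Counted
{
public:
	explicit Counted(std::atomic<int> *destroyed) : _destroyed(destroyed) {}

	Counted *Retain()
	{
		// Relaxed: the caller already owns a reference, nothing has to be ordered
		_refCount.fetch_add(1, std::memory_order_relaxed);
		return this;
	}

	void Release()
	{
		// Release: publish this thread's writes before the count drops
		if(_refCount.fetch_sub(1, std::memory_order_release) == 1)
		{
			// Acquire: only the last owner has to observe everyone else's writes
			std::atomic_thread_fence(std::memory_order_acquire);
			delete this;
		}
	}

private:
	// Private so the object can only die through Release()
	~Counted() { _destroyed->fetch_add(1, std::memory_order_relaxed); }

	std::atomic<int> _refCount{1}; // The creator starts out with one reference
	std::atomic<int> *_destroyed;  // Demo only: counts destructor calls
};

// Hands one reference to each worker thread, drops all of them,
// and returns how many times the destructor ran
int RunDemo(int threads)
{
	std::atomic<int> destroyed{0};
	Counted *object = new Counted(&destroyed);

	std::vector<std::thread> workers;
	for(int i = 0; i < threads; i++)
	{
		object->Retain(); // This reference now belongs to the worker
		workers.emplace_back([object] { object->Release(); });
	}

	object->Release(); // Drop the creator's reference

	for(auto &worker : workers)
		worker.join();

	return destroyed.load();
}
```

Whichever thread happens to call Release() last is the one that runs the fence and the delete, so the destructor runs exactly once no matter how the threads interleave.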

Clever as fuck. The basic idea comes from Herb Sutter's atomic<> Weapons talk, although he used std::memory_order_acq_rel for dropping a reference.
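For comparison, this is roughly what the acq_rel variant of the decrement looks like. It's my own sketch on a bare counter, not code from the talk, and DropReferenceAcqRel is a made-up name.

```cpp
#include <atomic>

// Hypothetical helper: returns true if the caller just dropped the last reference
bool DropReferenceAcqRel(std::atomic<int> &refCount)
{
	// acq_rel makes every decrement a release and an acquire at once, so there is
	// no separate fence, but non-final decrements pay for an acquire they don't need
	return refCount.fetch_sub(1, std::memory_order_acq_rel) == 1;
}
```

It's just as correct, the difference is only in how much ordering the non-final decrements ask for.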


Of course this doesn't change anything on strongly ordered architectures like x86, where an atomic read-modify-write is a full barrier anyway, but it produces much better code on weakly ordered CPUs like ARM.


Shitlord by trade and passion. Graphics programmer at Laminar Research.
I write blog posts at feresignum.com