var X = 0
if:
    set X = 42 # this gets rolled back because of the next line
    X > 100 # this fails, causing a rollback
X = 0 # X is back to 0 thanks to rollback
set operation in Verse. For example, we could have instead called a C++ function, which modified some mutable variable using C++ (i.e. storing to a pointer), and then we could have used other C++ functions to read the mutable variable:

SetSomething(0) # implemented in C++, sets a C++ variable
if:
    SetSomething(42) # this gets rolled back because of the next line
    GetSomething() > 100 # this fails, causing a rollback
GetSomething() = 0 # back to 0 thanks to rollback
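To make the rollback behavior in the two examples above concrete, here is a minimal sketch of one way to model it in plain C++: an undo log that records the old value before each store and replays those values on abort. The names ToyTransaction, ToyIf, and RunRollbackExample are hypothetical and exist only for this illustration. This is not AutoRTFM's actual implementation, and it assumes a single thread, int-sized writes, and no nesting.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical toy transaction (not AutoRTFM's data structure): remembers the
// old value of every logged write so that an abort can restore memory.
class ToyTransaction {
public:
    // Record the current value at Address, then perform the store.
    void Store(int* Address, int NewValue) {
        UndoLog.push_back(Entry{Address, *Address});
        *Address = NewValue;
    }

    // Restore logged values, newest first, so the oldest value wins.
    void Abort() {
        for (auto It = UndoLog.rbegin(); It != UndoLog.rend(); ++It) {
            *It->Address = It->OldValue;
        }
        UndoLog.clear();
    }

    // Keep the writes and simply forget how to undo them.
    void Commit() { UndoLog.clear(); }

private:
    struct Entry {
        int* Address;
        int OldValue;
    };
    std::vector<Entry> UndoLog;
};

// Models a failure context: commit if the condition succeeds, abort otherwise.
bool ToyIf(const std::function<bool(ToyTransaction&)>& Condition) {
    ToyTransaction Tx;
    const bool bSucceeded = Condition(Tx);
    if (bSucceeded) {
        Tx.Commit();
    } else {
        Tx.Abort();
    }
    return bSucceeded;
}

// Mirrors the Verse example: set X to 42, then fail the X > 100 check.
int RunRollbackExample() {
    int X = 0;
    ToyIf([&](ToyTransaction& Tx) {
        Tx.Store(&X, 42);
        return X > 100;
    });
    return X; // the store was undone by the abort
}
```

In this sketch the caller has to route every write through Store by hand. The point of a compiler-based approach is that this bookkeeping is inserted automatically.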
if statement, are nested transactions, which commit if the condition succeeds and abort otherwise. Conditional expressions are allowed to have effects. If the condition fails, it's as if the effects never happened.

<no_rollback> effect, which we will deprecate as soon as this issue is fixed.

<no_rollback> effect) and it is error-prone, leading to hard-to-debug crashes, like if an undo handler is registered for something that gets deleted.

void Foo()
{
    TArray<int> Array = {1, 2, 3};
    AutoRTFM::Transact([&] ()
    {
        Array.Push(42);
        // Array is {1, 2, 3, 42} here.
        AutoRTFM::Abort();
    });
    // Array is {1, 2, 3} here, due to the Abort.
}
TArray. This works even if TArray needs to reallocate its backing store inside Push. TArray is not even special in this regard; AutoRTFM can do this for std::vector/std::map as well as a plethora of other data structures and functions in Unreal Engine and the C++ standard library. The only changes required to make this work are AutoRTFM's integration with low-level memory allocators and the garbage collector (GC), and as we'll show in this section, even those changes are small (they are purely additive and only a handful of lines; we've added them to all of Unreal Engine's many mallocs, Unreal Engine's GC, and system malloc/new). On top of that, AutoRTFM achieves this with zero performance regression for those code paths that run outside of a transaction (so outside the dynamic scope of a Transact call). Crucially, even if a code path (like TArray and many others) is used from within transactions, those uses outside transactions incur no cost. When code is executing inside a transaction, the overhead is high (we've seen 4x on some internal benchmarks), but not high enough to hurt the Fortnite server's tick rate.

AutoRTFMPass's job is to instrument all code so that it can obey Verse's transactional semantics. But the same code in Fortnite may be used in some places where there is no Verse code on the stack and Verse's semantics are unnecessary, and in other places where Verse is on the stack and a rollback is possible.

bCurrentThreadIsInTransaction bit is encoded in the program counter. From the compiler's standpoint, this is simple: it just clones all of your code, gives the cloned functions mangled names, and then instruments only the clones. The original code stays uninstrumented and never does any logging or anything else special, so running that code incurs zero cost. The cloned code gets instrumentation and definitely knows that it is in a transaction, which even enables us to play VM-like tricks, such as playing with alternate calling conventions.

AutoRTFMPass runs at the same point in the LLVM pipeline as sanitizers, i.e., after all IR optimizations have happened and just before sending the code to the backend. This has important implications, like that even in the instrumented clones, the only loads and stores that get instrumentation are the ones that survived optimization. So, while assigning to a non-escaping local variable in C++ is a "store" in the abstract sense (and clang would represent it that way when creating the unoptimized LLVM IR for that program), the store will be totally gone by the time AutoRTFMPass does its thing. Most likely, it'll be replaced by SSA dataflow, which AutoRTFM has no need to muck with. We have also implemented a few small optimizations based on existing LLVM static analyses to avoid instrumenting stores when we can prove that we don't need to.

AutoRTFM::Transact. Calling this function with a lambda causes the runtime to set up a transaction object that contains the log data structures and other transaction state. Then, Transact calls the lambda's clone. This is exactly like the lambda, but has transactional instrumentation and obeys the AutoRTFM ABI.

Transact in response to Verse entering a failure context, such as an if condition. In the future, we will call Transact before entering any Verse execution.

malloc/free and related functions (like custom allocators).

WriteFile in a transaction is an error by default. That's desirable default behavior, since WriteFile cannot be reliably undone: even if we tried to undo the write later, some other thread (either in our process or another process) could have seen the write and done something else unrecoverable based on that. This makes it practical to take an existing C++ program, compile it with AutoRTFM, and try to do a Transact. Either it will work correctly, or you will get an error saying that you called a not-transaction-safe function, along with a message about how to find that function and its callsite.

malloc and related functions. AutoRTFM takes the approach of not transactionalizing the implementation of malloc, but forcing callers to wrap calls to malloc with AutoRTFM API. This is especially straightforward in programs like Unreal Engine, where there is already a facade API that wraps the actual malloc implementation. It's best to understand what we do by looking at the code. Before AutoRTFM, FMemory::Malloc was something like:

void* FMemory::Malloc(SIZE_T Count, uint32 Alignment)
{
    return GMalloc->Malloc(Count, Alignment);
}
void* FMemory::Malloc(SIZE_T Count, uint32 Alignment)
{
    void* Ptr;
    UE_AUTORTFM_OPEN(
    {
        Ptr = GMalloc->Malloc(Count, Alignment);
    });
    AutoRTFM::OnAbort([Ptr]
    {
        Free(Ptr);
    });
    return AutoRTFM::DidAllocate(Ptr, Count);
}
UE_AUTORTFM_OPEN to say that the enclosed block (which gets turned into a lambda) should run with no transaction instrumentation. This is simple for the runtime and compiler to implement: it just means that we emit a call to the original (uncloned) version of the lambda. This means that GMalloc->Malloc runs inside the transaction, but without any transactional instrumentation. We call this "running in the open" (as opposed to running closed, which means running the transactionalized clone). So, after this UE_AUTORTFM_OPEN returns, the malloc call would have run to completion immediately without any rollbacks registered. That maintains transactional semantics because malloc is already thread-safe, and so the fact that this thread did a malloc is not observable to other threads until the pointer to the malloc'd object leaks out of the transaction. The pointer won't leak out of the transaction until we commit, since the only places where Ptr can be stored are either other newly allocated memory or memory that AutoRTFM will transactionalize by logging the write.

AutoRTFM::OnAbort. If the transaction commits, then this handler does not run, so the malloc'd object survives. But if the transaction aborts, we will free the object. This ensures that there are no memory leaks from aborting.

OnAbort, since we don't have to worry about AutoRTFM trying to overwrite the contents of the object back to some previous state after we've already freed it. Also, not logging writes to newly allocated memory is a useful optimization.

UE_AUTORTFM_OPEN just runs the code block.
AutoRTFM::OnAbort is a no-op; the passed-in lambda is just ignored.
AutoRTFM::DidAllocate is a no-op.

void FMemory::Free(void* Original)
{
    UE_AUTORTFM_ONCOMMIT(
    {
        GMalloc->Free(Original);
    });
}
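The Malloc and Free wrappers above can be modeled with a small standalone sketch. It is a toy under stated assumptions, not AutoRTFM itself: ToyTransaction, ToyMalloc, ToyFree, ToyTransact, GCurrentTransaction, and GLiveAllocations are hypothetical names, there is a single thread and no nesting, and handlers run in registration order, which this sketch chooses arbitrarily. It also assumes that a free outside a transaction runs immediately. Allocation happens right away and registers an abort handler that releases the memory; freeing inside a transaction is deferred until commit.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <vector>

// Hypothetical stand-in for a transaction's handler lists.
struct ToyTransaction {
    std::vector<std::function<void()>> CommitHandlers;
    std::vector<std::function<void()>> AbortHandlers;
};

// Null means "not in a transaction" (assumption of this sketch).
static ToyTransaction* GCurrentTransaction = nullptr;
// Counts live allocations so leaks and early frees are observable.
static int GLiveAllocations = 0;

void* ToyMalloc(std::size_t Count) {
    // The allocation itself runs immediately, like code "in the open".
    void* Ptr = std::malloc(Count);
    ++GLiveAllocations;
    if (GCurrentTransaction) {
        // On abort, release the memory so the rollback does not leak it.
        GCurrentTransaction->AbortHandlers.push_back([Ptr] {
            std::free(Ptr);
            --GLiveAllocations;
        });
    }
    return Ptr;
}

void ToyFree(void* Ptr) {
    auto DoFree = [Ptr] {
        std::free(Ptr);
        --GLiveAllocations;
    };
    if (GCurrentTransaction) {
        // Defer the free: an abort must leave the object untouched.
        GCurrentTransaction->CommitHandlers.push_back(DoFree);
    } else {
        DoFree();
    }
}

// Runs Body in a toy transaction; Body returns true to commit, false to abort.
bool ToyTransact(const std::function<bool()>& Body) {
    ToyTransaction Tx;
    GCurrentTransaction = &Tx;
    const bool bCommit = Body();
    GCurrentTransaction = nullptr;
    auto& Handlers = bCommit ? Tx.CommitHandlers : Tx.AbortHandlers;
    for (auto& Handler : Handlers) {
        Handler();
    }
    return bCommit;
}
```

Because only one of the two handler lists ever runs, an object that is both allocated and freed inside the same transaction is released exactly once, whether the transaction commits or aborts.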
UE_AUTORTFM_ONCOMMIT function has this behavior:
Open/OnCommit/OnAbort APIs gives us everything we need to run C++ code in a transaction so long as that code does not use atomics, locks, system calls, or calls into those parts of the Unreal Engine codebase that we chose not to transactionalize at all (like the rendering and physics engines, currently). This has led to 94 total uses of the Open APIs in all of Fortnite. Those uses involve wrapping existing code with Open and friends at the appropriate granularity rather than changing algorithms or data structures.

<no_rollback> effect, so that more code can be called from if and for. That effect is only there (polluting a lot of function signatures) because most C++ code cannot be transactional. AutoRTFM fixes that problem.