

Toby 'qubit' Cubitt - Lost or corrupted undo-tree history
source link: http://www.dr-qubit.org/Lost_undo-tree_history.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Lost or corrupted undo-tree history
6 January 2020 Undo-tree is the most popular of the Emacs packages I wrote and maintain,11
not least because it's used by Evil and Spacemacs. However, for many years, the internet has been awash22
with reports of undo history being lost or corrupted when using undo-tree-mode
. Trouble was, though I do all my text editing in Emacs, and always have global-undo-tree-mode
enabled, I'd never encountered these issues myself. And no one had ever been able to send me a recipe for reliably reproducing them. Making it fiendishly difficult to even begin to investigate what might be going on.
This post is about some recent changes to undo-tree that might finally address these issues.
Lost undo history: the dreaded "no further undo information" message
Both Emacs' vanilla undo system and the undo-tree system print this message when there are no further changes to be undone in the buffer. Undo-tree users had been reporting for years that they were seeing this message, and weren't able to undo any further, even though they hadn't undone all the way back to the very first change they'd made to the buffer.
Finally, last year, user and Spacemacs contributor duianto very helpfully posted on the Spacemacs github issue tracker a simple recipe to reproduce this issue. Finally having a reliable way to reproduce this issue, it was then easy to investigate what was going on. And it turned out that …
… undo-tree-mode
was working completely correctly, and behaving exactly as documented.
When the size of the undo history recorded by Emacs grows larger than the limit set by the undo-limit
configuration variable33
, Emacs discards the oldest changes recorded in the undo history, to bring the history size back below this limit. Undo-tree implements exactly the same behaviour, discarding the oldest changes when the size of the undo tree grows bigger than undo-limit
. duianto's recipe generated enough changes for the undo history stored in the undo tree to grow larger than undo-limit
. So undo-tree duly discarded the oldest changes.
One response to this would be to yell RTFM at everyone.
But I think a much more useful and instructive response is to consider why users thought this behaviour was a bug in undo-tree. Whereas (as far as I'm aware) no one reported this issue with vanilla Emacs undo, despite it discarding undo history in exactly the same way.
I can think of a few explanations:
- Undo-tree gave no indication to the user when undo history was being discarded. It just silently discarded the oldest history when it grew beyond
undo-limit
. So users weren't expecting undo history to have been discarded.
This alone doesn't explain it, as Emacs' vanilla undo system gives no such indication, either. But…
Because of the way
undo-tree-mode
stores undo history as a tree of changes, it typically takes far fewer undo operations to before you run out of changes to undo. If you repeatedly undo, you run out of changes as soon as you reach the top of the tree. Other lines of undo history are stored in other branches of the tree, reached by switching to a different undo branch rather than repeatedly undoing.Whereas with vanilla Emacs' linear undo history, you have to wind all the way back through the entire undo history, undoing changes, then undoing previous undos, then undoing the undoing of the undos, and so on, before you reach the beginning of the undo history and run out of things to undo.44
With vanilla Emacs undo, it's hard to know where you've got to in the undo history, or how far back you expect to reach. The very act of undoing creates more undo history. And as soon as you break a sequence of undos with any other command (even moving the point), you start right back from the beginning, undoing your recent undos (which is how vanilla Emacs implements redo operations).
By making the undo history conceptually simpler to navigate, and even displaying the whole undo-tree visually,
undo-tree-mode
makes it much easier to know where you are in the undo history, where you want to get to, and therefore much more noticable if undo history has been discarded.
Viewed in this light, these non-bug reports are an indication that undo-tree is doing a good job of making Emacs undo more usable, but a bad job of meeting users' expectations. Apparently users expect undo history to be preserved indefinitely – not unreasonable given modern computing resources. So they perceive it to be a bug if any undo history is lost. And undo-tree makes it that much easier to notice that some undo history has been lost, compared to vanilla Emacs undo.
If software isn't matching user's expectations, there may not be a bug in the code. But in my view, there is still a bug: in the user experience. These possible explanations for the non-bug reports suggest two ways to address this in undo-tree: reduce user's expectations, or change undo-tree's behaviour to match user's expectations. In the latest undo-tree release, I've tried to do both.
Emacs famously solved the complaints about garbage-collection freezes due to its stop-the-world garbage collector by simply disabling the message telling users garbage collection was happening. The undo-tree history discarding issue seems to call for the opposite solution. Previously, undo-tree (like vanilla Emacs) didn't warn users when undo history was discarded due to breaching the undo-limit
.55
In the new release, undo-tree displays a message in the echo area when undo history is discarded, and points users to the undo-limit
configuration variable.
At the same time, users clearly expect more undo history to be preserved when using undo-tree-mode
. The Emacs default for undo-limit
is 80kB, which is very conservative. It's a very safe bet that any user running Emacs with undo-tree-mode
these days has a lot more spare RAM than this. So the new version of undo-tree increases the default undo history size limit to 80MB when running undo-tree-mode
. More precisely, there are three new configuration variables, undo-tree-limit
, undo-tree-strong-limit
and undo-tree-outer-limit
, all with larger default values than their vanilla Emacs counterparts. undo-tree-mode
sets the undo history size limit to whichever of undo-limit
or undo-tree-limit
is the larger (and similarly for the other configuration variables).
Finally, if you really don't want undo-tree to ever discard any undo history, no matter how large it grows, setting undo-tree-limit
to nil will now make undo-tree preserve undo history indefinitely.66
I'm not sure this is a wise idea, particularly if coupled with the popular undo-tree-auto-save-history
setting which persistently saves undo-tree history to file. Over time, undo history could easily grow large enough to become very slow to load and save, or even use. So this is not the default setting. But the option's there now for those who want to live on the edge.
Corrupted undo-tree history
The undo-history-discarding issue is only half the story, though. There has also been a steady trickle of reports of corrupted undo history, resulting in either errors or (worse still) garbled buffer text when undoing. Unfortunately, I still have yet to be sent a recipe for reliably reproducing this issue, and I can't reproduce it myself. So I'm still shooting in the dark on this one. However, I have a sneaking suspicion a race condition with the Emacs garbage collector might be responsible…
To understand why, you need to understand a little of how Emacs' undo system works behind the scenes, and how undo-tree manages to replace this at the Elisp level, without changing any Emacs c code. Whenever you make any modification to the text in a buffer, Emacs pushes a description of those changes onto the front of the list stored in the buffer-local buffer-undo-list
variable. This variable gets special treatment during garbage collection. Whenever GC runs, it checks if the size of buffer-undo-list
exceeds undo-limit
(or one of the other undo size limits), and deletes enough older undo history from the tail of buffer-undo-list
to bring it back within the configured size limits.
Much of this mechanism is hard-coded deep down in Emacs c code, and runs outside the context of the Elisp interpreter. Undo-tree stays well away from this mechanism, and doesn't mess with it at all. In undo-tree-mode
, undo history still accumulates in buffer-undo-list
. But whenever you call any undo-tree command (undo-tree-undo
, undo-tree-redo
, undo-tree-visualize
etc.), the first thing it does is to transfer any undo history that has accumulated in buffer-undo-list
since the last time, into the undo-tree package's own buffer-undo-tree
data structure. In this way, undo-tree doesn't have to reimplement any of the code for recording undo history (which would anyway be very difficult to do from pure Elisp code). Instead, it can rely on Emacs' built-in, deeply embedded, thoroughly road-tested implementation of this.
However, this design means undo history continues to accumulate in buffer-undo-list
until the next undo-tree command is called. If the user doesn't run any undo-tree commands for a long period, buffer-undo-list
can easily grow larger than undo-limit
and get garbage collected before any undo-tree command is called, i.e. before undo-tree even has a chance to see it.77
It's critical there are no gaps in the undo history; it must contain a continuous, unbroked chain of changes. Otherwise, undoing across a gap in the history would certainly lead to corrupted buffer text at best, errors at worse.
With older undo history still stored in buffer-undo-tree
, somewhat more recent history deleted before undo-tree saw it, and the newest history still stored in buffer-undo-list
, doesn't this mean undo-tree will inevitably end up corrupting the undo history? Is this the source of the bugs? No! It would be a very poor design if that were the case. The potential GC race condition is more subtle.
Let's think through the scenario where Emacs garbage collects undo history exceeding undo-limit
before undo-tree gets to see it. Emacs only ever garbage collects just enough undo history to bring it back within the limit, in order to preserve as much undo history as possible. So the undo data remaining in buffer-undo-list
after GC is as much as Emacs is allowed to store, according to the current undo-limit
setting. Therefore, the correct thing for undo-tree to do in this scenario is to discard the contents of buffer-undo-tree
entirely, and build a fresh undo tree from scrach using just the contents of buffer-undo-list
. This freshly built tree will already contain as much undo history as undo-tree is allowed to store, according to the configured undo-limit
.
All that's required is some mechanism to detect that Emacs has garbage collected undo history since undo-tree last saw the contents of buffer-undo-list
. We could try to check the size of buffer-undo-list
against undo-limit
(and the other settings). But this would be very unreliable. Emacs only discards undo history in chunks ("undo changesets" in Emacs terminology); it doesn't discard individual changes. A very large changeset that just barely pushes buffer-undo-list
over undo-limit
would, when discarded, leave a buffer-undo-list
that's much smaller than the limit. But the same undo history could equally well have been produced by a sequence of small changes that never pushed buffer-undo-list
over undo-limit
at all.
Instead, undo tree adds a special indicator entry to the very end of buffer-undo-list
. When undo tree transfers undo history from buffer-undo-list
to buffer-undo-tree
, it checks whether this special indicator is still present or not.88
Since garbage collection deletes the oldest undo history from the end of buffer-undo-list
first, the first thing that gets deleted is this special indicator entry. If it's missing, it means Emacs has discarded some undo history since undo-tree last saw it, and a fresh buffer-undo-tree
should be built from the remaining contents of buffer-undo-list
.
The undo-tree package's undo-list-transfer-to-tree
function is the only interface between Emacs' low-level undo history recording mechanisms, and undo-tree's system. This is quite a simple function: all it does is pop entries off buffer-undo-list
, and add them to buffer-undo-tree
. It doesn't manipulate the undo data itself in any way.99
And when it comes to actually performing undo operations, undo-tree doesn't even do these itself! Rather, it passes this same undo data back to Emacs' own primitive-undo
function – the very same function that Emacs' vanilla undo system uses. So there's very little scope for corrupting the undo data, and very little code footprint where this could occur. The simple undo-list-transfer-to-tree
function is the only plausible candidate.
But Emacs' GC runs outside the Elisp interpreter. It can be triggered at any point whilst running Elisp code, between pretty much any pair of Elisp instructions. What if GC happens to run whilst undo-list-transfer-to-tree
is in the middle of popping undo entries off buffer-undo-list
? As long as GC deletes part of buffer-undo-list
that comes after the entry undo-list-transfer-to-tree
is currently processing, when it resumes after GC undo-list-transfer-to-tree
will just notice the missing indicator when it gets to the end of buffer-undo-list
, and build a fresh buffer-undo-tree
from the undo history it's just processed. But if GC happens to delete part of the changeset undo-list-transfer-to-tree
is in the middle of processing, it could end up transferring a partial undo changeset. This could potentially lead to corrupted undo history.
Worse still is if GC deletes part of buffer-undo-list
starting from a point before the entry undo-list-transfer-to-tree
has already reached. Emacs' mark-and-sweep GC shouldn't delete any elements reachable from Elisp, which in this scenario would include the whole of buffer-undo-list
. But that variable is a glaring exception. Its contents are always reachable from Elisp, via buffer-undo-list
itself. Yet Emacs needs to delete part of it during GC, nonetheless. So the Emacs garbage collector takes a blunt approach: cons cells get unlinked from buffer-undo-list
by GC, regardless of whether something else is holding a reference to them.1010
In this scenario, when it resumes, undo-list-transfer-to-tree
will continue processing cons cells that have been unlinked from buffer-undo-list
and the behaviour is undefined, or at least undocumented.1111
Like many race conditions, this one is extremely hard to test. It requires GC to be triggered at just the right point. Moreover, GC runs outside the context of the Elisp interpreter. So from within Elisp or the Elisp debugger, you only get to see the state of buffer-undo-list
and undo-list-transfer-to-tree
before and after GC, but not during. However, when testing duianto's recipe for reproducing the undo-tree history discarding issue discussed above, I sporadically saw some odd behaviour that didn't make sense from Elisp, and which I couldn't reliably reproduce. And other comments on the same Spacemacs git issue tracker circumstantially hint at GC being implicated. In any case, whether it's responsible for the trickle of undo-tree corruption bug reports or not, this potential GC race condition is still something that needs addressing.
In the latest undo-tree release, I've implemented multiple measures to reduce the likelihood of – and eliminate the effects of – this race condition:
undo-list-transfer-to-tree
now manually triggers GC by calling thegarbage-collect
function, just before it starts processingbuffer-undo-list
.1212This should significantly reduce the likelihood of GC occurring again whilst in the middle of processing
buffer-undo-list
.- Previously,
undo-list-transfer-to-tree
popped undo entries one by one frombuffer-undo-list
, and built new sections of thebuffer-undo-tree
data structure as it went along. This gave a relatively large time window when GC could be triggered whilst still in the middle of processingbuffer-undo-list
. In the new version,undo-list-transfer-to-tree
first deep-copies the entire contents ofbuffer-undo-list
to a new variable, then processes the copy. In this way, GC could only affect things if it occurs whilst copying, which should be a singificantly shorter time window. - Finally, undo-tree now installs a function in the
post-gc-hook
which sets anundo-tree-gc-flag
variable tot
whenever GC runs.undo-list-transfer-to-tree
resets this flag tonil
just before deep-copyingbuffer-undo-list
, and checks if it's stillnil
afterwards. If not, it keeps retrying, until it succeeds in copyingbuffer-undo-list
without GC being triggered during copying.
The first two measures should significantly reduce the chance of hitting this GC race condition. The third should elliminate any possibility of bad side-effects, in the unlikely event GC does still run at a bad time.
As it's extremely difficult to reliably reproduce the race condition, let alone be sure if this is what's behind the unreproducible bug reports, it's very hard to test how successful these measures will be before the release. On the other hand, it's also possible that many or all of the bug reports related to the undo history discarding non-bug discussed above, which measures discussed there should alleviate.
The new release is now out on both GNU Elpa and here. Time will tell whether these measures stem the tide (or rather, divert the rivulet).
Recommend
-
62
-
6
Producing an N+1 Qubit CCZ State with an N Qubit Adder 20 Oct 2019 Back in April, I read the pre-print "Lower bounds on the non-Clifford resources for quantum comput...
-
8
Visualizing 2-Qubit Entanglement 06 Aug 2017 One of the annoying things about quantum computing is that it's not very amenable to visualization. We do have a great way to draw the state of one qubit, i.e. the...
-
5
Trouble Adding Constants into Qubit Registers 26 Feb 2017 I had a bit of a surprise this week. Despite never publishing a paper, I've been cited in one! The preprint
-
10
Converting Rotations into "Nice" Qubit Operations 24 Nov 2014 In this post: avoiding some issues when mapping from rotations to unitary matrices, and running into different issues. Co...
-
8
Fiber Optics...
-
8
In search of the perfect shotI was struck by the beauty of Saturn.The Cassini spacecraft had spent years in increasingly daring orbits, capturing thousands of images of that enigmatic world, bef...
-
10
September 10, 2021
-
11
AI-powered personalization provider Qubit gets acquired by Coveo Consumers shop online for their needsImage Credit:
-
11
I'm Toby, I was a writer for the Morning Brew and now lead content at Launch House. AMA 🔥Toby Howell1d ago31 replies
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK