Python's regex engine performs pathologically on regexes like
'<!--.*?-->' when fed a large string of repeating '<!--' clauses.
Thanks to JaimeSlome / security@huntr.dev for the report; closes#1380.
Solved by switching to the Rust implementation, which does not suffer
from this issue.
entsToText(), minimizeHTML(), and the old regex constants have been
removed; they do not appear to be used by any add-ons.
In order to split backend.proto into a more manageable size, the protobuf
handling needed to be updated. This took more time than I would have
liked, as each language handles protobuf differently:
- The Python Protobuf code ignores "package" directives, and relies
solely on how the files are laid out on disk. While it would have been
nice to keep the generated files in a private subpackage, Protobuf gets
confused if the files are located in a location that does not match
their original .proto layout, so the old approach of storing them in
_backend/ will not work. They now clutter up pylib/anki instead. I'm
rather annoyed by that, but alternatives seem to be having to add an extra
level to the Protobuf path, making the other languages suffer, or trying
to hack around the issue by munging sys.modules.
- Protobufjs fails to expose packages if they don't start with a capital
letter, despite the fact that lowercase packages are the norm in most
languages :-( This required a patch to fix.
- Rust was the easiest, as Prost is relatively straightforward compared
to Google's tools.
The Protobuf files are now stored in /proto/anki, with a separate package
for each file. I've split backend.proto into a few files as a test, but
the majority of that work is still to come.
The Python Protobuf building is a bit of a hack at the moment, hard-coding
"proto" as the top level folder, but it seems to get the job done for now.
Also changed the workspace name, as there seems to be a number of Bazel
repos moving away from the more awkward reverse DNS naming style.
Like the previous change, avoid exposing the protobuf as a public API
for now. It requires more thought, and is probably better done with
either extra helper accessors like decks.name(), or via a native class.
Also:
- fix issues where the Undo action in the Browse screen was not
consistent with the main window. The existing hook signature has been
changed; from a snapshot of the add-on code from a few months ago, it
was not a hook that was being used by anyone.
- change the undo shortcut in the Browse window to match the main
window. It was different because undoing a change in the editing area
could accidentally trigger an undo of an operation, but the damage is
limited now that (most) operations can be redone. If it still proves to
be a problem, perhaps we should just always swallow ctrl+z when an
editing field is focused.
Allows add-on authors to define their own label for a group of undoable
operations. For example:
def mark_and_bury(
*,
parent: QWidget,
card_id: CardId,
) -> CollectionOp[OpChanges]:
def op(col: Collection) -> OpChanges:
target = col.add_custom_undo_entry("Mark and Bury")
col.sched.bury_cards([card_id])
card = col.get_card(card_id)
col.tags.bulk_add(note_ids=[card.nid], tags="marked")
return col.merge_undo_entries(target)
return CollectionOp(parent, op)
The .add_custom_undo_entry() is for adding your own custom actions.
When extending a standard Anki action, instead store `target =
col.undo_status().last_step` after executing the standard operation.
This started out as a bigger refactor that required a separate
.commit_undoable() call to be run after each operation, instead of
having each operation return changes directly. But that proved to be
somewhat cumbersome in unit tests, and ran the risk of unexpected
behaviour if the caller invoked an operation without remembering to
finalize it.
- backend now updates current notetype as part of addition
- frontend no longer implicitly adds, so we can assign a new name and
add in a single operation
Instead, fetch the config order on the frontend and pass a builtin
variant into the backend.
That makes the following unnecessary:
* Resolving the config sort in search/mod.rs
* Deserializing the Column enum
* Config accessors for the sort columns
* Remove duplicate backend columns
* Remove duplicate column routines
* Move columns on frontend from state to model
* Generate available columns from Colum enum
* Add second column label for notes mode
remove_note() now returns the count of removed cards, allowing us
to unify the tooltip between browser and review screen
I've left the old translation in - we'll need to write a script at
one point that gathers all references to translations in the code,
and shows ones that are unused.
- pass the handler directly
- reviewer special-cases for flags and notes are now applied at
call site
- drop the kind attribute on OpChanges which is not needed
Updating a deck via protobuf is now exposed on the backend, but not
currently on the frontend - I suspect we'll be better off writing
separate routines for the actions we need instead, and we get a better
undo description for free.
This is currently causing an ugly redraw in the browse screen, which
will need fixing.
Instead of generating a fluent.proto file with a giant enum, create
a .json file representing the translations that downstream consumers
can use for code generation.
This enables the generation of a separate method for each translation,
with a docstring that shows the actual text, and any required arguments
listed in the function signature.
The codebase is still using the old enum for now; updating it will need
to come in future commits, and the old enum will need to be kept
around, as add-ons are referencing it.
Other changes:
- move translation code into a separate crate
- store the translations on a per-file/module basis, which will allow
us to avoid sending 1000+ strings on each JS page load in the future
- drop the undocumented support for external .ftl files, that we weren't
using
- duplicate strings in translation files are now checked for at build
time
- fix i18n test failing when run outside Bazel
- drop slog dependency in i18n module
Now behaves the same way as standard find&replace:
- Will match substrings
- Regexs can be used to match multiple items; we no longer split
input on spaces.
- The find&replace dialog has been updated to add tags to the field
list.
- Introduced a new transact() method that wraps the return value
in a separate struct that describes the changes that were made.
- Changes are now gathered from the undo log, so we don't need to
guess at what was changed - eg if update_note() is called with identical
note contents, no changes are returned. Card changes will only be set
if cards were actually generated by the update_note() call, and tag
will only be set if a new tag was added.
- mw.perform_op() has been updated to expect the op to return the changes,
or a structure with the changes in it, and it will use them to fire the
change hook, instead of fetching the changes from undo_status(), so there
is no risk of race conditions.
- the various calls to mw.perform_op() have been split into separate
files like card_ops.py. Aside from making the code cleaner, this works
around a rather annoying issue with mypy. Because we run it with
no_strict_optional, mypy is happy to accept an operation that returns None,
despite the type signature saying it requires changes to be returned.
Turning no_strict_optional on for the whole codebase is not practical
at the moment, but we can enable it for individual files.
Still todo:
- The cursor keeps moving back to the start of a field when typing -
we need to ignore the refresh hook when we are the initiator.
- The busy cursor icon should probably be delayed a few hundreds ms.
- Still need to think about a nicer way of handling saveNow()
- op_made_changes(), op_affects_study_queue() might be better embedded
as properties in the object instead
'card modified' covers the common case where we need to rebuild the
study queue, but is also set when changing the card flags. We want to
avoid a queue rebuild in that case, as it causes UI flicker, and may
result in a different card being shown. Note marking doesn't trigger
a queue build, but still causes flicker, and may return the user back
to the front side when they were looking at the answer.
I still think entity-based change tracking is the simplest in the
common case, but to solve the above, I've introduced an enum describing
the last operation that was taken. This currently is not trying to list
out all possible operations, and just describes the ones we want to
special-case.
Other changes:
- Fire the old 'state_did_reset' hook after an operation is performed,
so legacy code can refresh itself after an operation is performed.
- Fire the new `operation_did_execute` hook when mw.reset() is called,
so that as the UI is updated to the use the new hook, it will still
be able to refresh after legacy code calls mw.reset()
- Update the deck browser, overview and review screens to listen to
the new hook, instead of relying on the main window to call moveToState()
- Add a 'set flag' backend action, so we can distinguish it from a
normal card update.
- Drop the separate added/modified entries in the change list in
favour of a single entry per entity.
- Add typing to mw.state
- Tweak perform_op()
- Convert a few more actions to use perform_op()
Basic proof of concept, where the 'delete note' operation in the
reviewer has been updated to use mw.perform_op(). Instead of manually
calling .reset() afterwards, a summary of the changes is returned as
part of the undo status query, and various parts of the GUI can listen
to gui_hooks.operation_did_execute and decide whether they want to
redraw based on the scope of the changes. This should allow the sidebar
to selectively redraw just the tags area in the future for example.
Currently we're just listing out all possible areas that might be changed;
in the future we could theoretically inspect the specific changes in the
undo log to provide a more accurate report (avoiding refreshing the tags
list when no tags were added for example).
You can test it out by opening the browse screen while studying, and
then deleting the current card - the browser should update to show (deleted)
on the cards due the earlier change.
If going ahead with this, aside from updating all the screens that currently
listen for resets, some thought will be required on how we can integrate
it with legacy code that expects to called when resets are made, and expects
to call .reset() when it makes changes.
Thoughts?
Fixes the following issue:
- some code directly modifies the database, causing modified_in_python
to be set to true
- an undoable operation is run, which calls autosave() at the end
- autosave() notices there's an undoable operation, and commits immediately
- because modified_in_python was true, col.mtime was bumped in Python
- that invalidated the undo queue, preventing the operation from being
undone
The existing code was really difficult to reason about:
- The default notetype depended on the selected deck, and vice versa,
and this logic was buried in the deck and notetype choosing screens,
and models.py.
- Changes to the notetype were not passed back directly, but were fired
via a hook, which changed any screen in the app that had a notetype
selector.
It also wasn't great for performance, as the most recent deck and tags
were embedded in the notetype, which can be expensive to save and sync
for large notetypes.
To address these points:
- The current deck for a notetype, and notetype for a deck, are now
stored in separate config variables, instead of directly in the deck
or notetype. These are cheap to read and write, and we'll be able to
sync them individually in the future once config syncing is updated in
the future. I seem to recall some users not wanting the tag saving
behaviour, so I've dropped that for now, but if people end up missing
it, it would be simple to add as an extra auxiliary config variable.
- The logic for getting the starting deck and notetype has been moved
into the backend. It should be the same as the older Python code, with
one exception: when "change deck depending on notetype" is enabled in
the preferences, it will start with the current notetype ("curModel"),
instead of first trying to get a deck-specific notetype.
- ModelChooser has been duplicated into notetypechooser.py, and it
has been updated to solely be concerned with keeping track of a selected
notetype - it no longer alters global state.
This splits update_card() into separate undoable/non-undoable ops
like the change to notes in b4396b94abdeba3347d30025c5c0240d991006c9
It means that actions get a blanket 'Update Card' description - in the
future we'll probably want to either add specific actions to the backend,
or allow an enum or string to be passed in to describe the op.
Other changes:
- card.flush() can no longer be used to add new cards. Card creation
is only supposed to be done in response to changes in a note's fields,
and this functionality was only exposed because the card generation
hadn't been migrated to the backend at that point. As far as I'm aware,
only Arthur's "copy notes" add-on used this functionality, and that should
be an easy fix - when the new note is added, the associated cards will
be generated, and they can then be retrieved with note.cards()
- tidy ups/PEP8
- note.flush() behaves like before, as otherwise actions or add-ons
that perform bulk flushing would end up creating an undo entry for
each note
- added col.update_note() to opt in to the new behaviour
- tidy up the names of some related routines
Reviews and operations on the backend that support undoing can now be
committed immediately, so they will not be lost in the event of a crash.
This required tweaks to a few places:
- don't set collection mtime on save() unless changes were made in
Python, as otherwise we end up accidentally clearing the backend undo
queue
- autosave() is now run on every reset()
- garbage collection now runs in a timer, instead of relying on
autosave() to be run periodically
- use dataclasses for the review/checkpoint undo cases, instead of the
nasty ad-hoc list structure
- expose backend review undo to Python, and hook it into GUI
- redo is not currently exposed on the GUI, and the backend can only
cope with reviews done by the new scheduler at the moment
- the initial undo prototype code was bumping mtime/usn on undo, but
that was not ideal, as it was breaking the queue handling which expected
the mtime to match. The original rationale for bumping mtime/usn was
to avoid problems with syncing, but various operations like removing
a revlog can't be synced anyway - so we just need to ensure we clear the
undo queue prior to syncing
- Rework V2 upgrade so that it no longer resets cards in learning,
or empties filtered decks.
- V1 users will receive a message at the top of the deck list
encouraging them to upgrade, and they can upgrade directly from that
screen.
- The setting in the preferences screen has been removed, so users
will need to use an older Anki version if they wish to switch back to
V1.
- Prevent V2 exports with scheduling from being importable into a V1
collection - the code was previously allowing this when it shouldn't
have been.
- New collections still default to v1 at the moment.
Also add helper to get map of decks and deck configs, as there were
a few places in the codebase where that was required.