Previously it was Backend's responsibility to store the last progress,
and when calling routines in Collection, one had to construct and pass
in a Fn, which wasn't the most ergonomic. This PR adds the last progress
state to the collection, so that the routines no longer need a separate
progress arg, and makes some other tweaks to improve ergonomics.
ThrottlingProgressHandler has been tweaked so that it now stores the
current state, so that callers don't need to store it separately. When
a long-running routine starts, it calls col.new_progress_handler(),
which automatically initializes the data to defaults, and updates the
shared UI state, so we no longer need to manually update the state at
the start of an operation.
The backend shares the Arc<Mutex<>> with the collection, so it can get
at the current state, and so we can update the state when importing a
backup.
Other tweaks:
- The current Incrementor was awkward to use in the media check, which
uses a single incrementing value across multiple method calls, so I've
added a simpler alternative for such cases. The old incrementor method
has been kept, but implemented directly on ThrottlingProgressHandler.
- The full sync code was passing the progress handler in a complicated
way that may once have been required, but no longer is.
- On the Qt side, timers are now stopped before deletion, or they keep
running for a few seconds.
- I left the ChangeTracker using a closure, as it's used for both importing
and syncing.
Will be handy to use it in our other scripts in the future too - thanks
Rumo!
Results of benchmarking ./run before and after these crate splits:
- Touching a proto file leads to a slight increase: about +90ms
- Touching an rslib file leads to a bigger decrease, as there's less to
recompile: about -700ms
And ./ninja test is even better: about +200ms and -3800ms.
Due to the orphan rule, this meant removing our usages of impl ProtoStruct,
or converting them to a trait when they were used commonly.
rslib now directly references anki_proto and anki_i18n, instead of
'pub use'-ing them, and we can put the generated files back in OUT_DIR.
* Move open_test_collection into Collection test impl
* Fix invalid ids when checking database
* Report fixed invalid ids
* Improve message when trying to export invalid ids
Also move ImportError due to namespace conflicts with snafu macro.
* Take a human name in DeckAdder::new
* Mention timestamps in the db check message (dae)
Will help to correlate the fix with the message shown when importing/
exporting.
This PR replaces the existing Python-driven sync server with a new one in Rust.
The new server supports both collection and media syncing, and is compatible
with both the new protocol mentioned below, and older clients. A setting has
been added to the preferences screen to point Anki to a local server, and a
similar setting is likely to come to AnkiMobile soon.
Documentation is available here: <https://docs.ankiweb.net/sync-server.html>
In addition to the new server and refactoring, this PR also makes changes to the
sync protocol. The existing sync protocol places payloads and metadata inside a
multipart POST body, which causes a few headaches:
- Legacy clients build the request in a non-deterministic order, meaning the
entire request needs to be scanned to extract the metadata.
- Reqwest's multipart API directly writes the multipart body, without exposing
the resulting stream to us, making it harder to track the progress of the
transfer. We've been relying on a patched version of reqwest for timeouts,
which is a pain to keep up to date.
To address these issues, the metadata is now sent in a HTTP header, with the
data payload sent directly in the body. Instead of the slower gzip, we now
use zstd. The old timeout handling code has been replaced with a new implementation
that wraps the request and response body streams to track progress, allowing us
to drop the git dependencies for reqwest, hyper-timeout and tokio-io-timeout.
The main other change to the protocol is that one-way syncs no longer need to
downgrade the collection to schema 11 prior to sending.
* Relax chrono specification for AnkiDroid
https://github.com/ankidroid/Anki-Android-Backend/pull/251
* Add AnkiDroid service and AnkiDroid customizations
Most of the work here was done by David in the Backend repo; integrating
it into this repo for ease of future maintenance.
Based on 5d9f262f4c
with some tweaks:
- Protobuf imports have been fixed to match the recent refactor
- FatalError has been renamed to AnkidroidPanicError
- Tweaks to the desktop code to deal with the extra arg to open_collection,
and exclude AnkiDroid service methods from our Python code.
* Refactor AnkiDroid's DB code to avoid uses of unsafe
The Rust community appear to have converged on tracing - it's used by
the Rust compiler, and receives close to 10x the number of downloads
that slog does. Its API is more ergonomic, and it does a much nicer
job with async rust.
To make this change, we no longer pass around explicit loggers, and rely
on a globally-registered one. The log file location has been changed
from one in each profile folder to a single one in the base folder. This
will remain empty for most users, since only errors are logged by default,
but may be useful for debugging future changes.
* Run cargo +nightly fmt
* Latest prost-build includes clippy workaround
* Tweak Rust protobuf imports
- Avoid use of stringify!(), as JetBrains editors get confused by it
- Stop merging all protobuf symbols into a single namespace
* Remove some unnecessary qualifications
Found via IntelliJ lint
* Migrate some asserts to assert_eq/ne
* Remove mention of node_modules exclusion
This no longer seems to be necessary after migrating away from Bazel,
and excluding it means TS/Svelte files can't be edited properly.
* Remove deprecated `and_hms()`
* Update chrono
* Update licenses and fix script
* Remove deprecated Date struct
* Remove chrono pin
* Skip format check on .vscode
Was failing for no reason.
* Replace deprecated chrono functions
* Add cargo-deny to update-licenses & pin versions (dae)
* Remove time 0.1 dependency (dae)
We don't need to wait for chrono 0.5; it was provided behind a legacy
feature flag.
* Add crate snafu
* Replace all inline structs in AnkiError
* Derive Snafu on AnkiError
* Use snafu for card type errors
* Use snafu whatever error for InvalidInput
* Use snafu for NotFoundError and improve message
* Use snafu for FileIoError to attach context
Remove IoError.
Add some context-attaching helpers to replace code returning bare
io::Errors.
* Add more context-attaching io helpers
* Add message, context and backtrace to new snafus
* Utilize error context and backtrace on frontend
* Rename LocalizedError -> BackendError.
* Remove DocumentedError.
* Have all backend exceptions inherit BackendError.
* Rename localized(_description) -> message
* Remove accidentally committed experimental trait
* invalid_input_context -> ok_or_invalid
* ensure_valid_input! -> require!
* Always return `Err` from `invalid_input!`
Instead of a Result to unwrap, the macro accepts a source error now.
* new_tempfile_in_parent -> new_tempfile_in_parent_of
* ok_or_not_found -> or_not_found
* ok_or_invalid -> or_invalid
* Add crate convert_case
* Use unqualified lowercase type name
* Remove uses of snafu::ensure
* Allow public construction of InvalidInputErrors (dae)
Needed to port the AnkiDroid changes.
* Make into_protobuf() public (dae)
Also required for AnkiDroid. Not sure why it worked previously - possible
bug in older Rust version?
* Collection needs to be closed prior to backup even when not downgrading
* Backups -> BackupLimits
* Some improvements to backup_task
- backup_inner now returns the error instead of logging it, so that
the frontend can discover the issue when they await a backup (or create
another one)
- start_backup() was acquiring backup_task twice, and if another thread
started a backup between the two locks, the task could have been accidentally
overwritten without awaiting it
* Backups no longer require a collection close
- Instead of closing the collection, we ensure there is no active
transaction, and flush the WAL to disk. This means the undo history
is no longer lost on backup, which will be particularly useful if we
add a periodic backup in the future.
- Because a close is no longer required, backups are now achieved with
a separate command, instead of being included in CloseCollection().
- Full sync no longer requires an extra close+reopen step, and we now
wait for the backup to complete before proceeding.
- Create a backup before 'check db'
* Add File>Create Backup
https://forums.ankiweb.net/t/anki-mac-os-no-backup-on-sync/6157
* Defer checkpoint until we know we need it
When running periodic backups on a timer, we don't want to be fsync()ing
unnecessarily.
* Skip backup if modification time has not changed
We don't want the user leaving Anki open overnight, and coming back
to lots of identical backups.
* Periodic backups
Creates an automatic backup every 30 minutes if the collection has been
modified.
If there's a legacy checkpoint active, tries again 5 minutes later.
* Switch to a user-configurable backup duration
CreateBackup() now uses a simple force argument to determine whether
the user's limits should be respected or not, and only potentially
destructive ops (full download, check DB) override the user's configured
limit.
I considered having a separate limit for collection close and automatic
backups (eg keeping the previous 5 minute limit for collection close),
but that had two downsides:
- When the user closes their collection at the end of the day, they'd
get a recent backup. When they open the collection the next day, it
would get backed up again within 5 minutes, even though not much had
changed.
- Multiple limits are harder to communicate to users in the UI
Some remaining decisions I wasn't 100% sure about:
- If force is true but the collection has not been modified, the backup
will be skipped. If the user manually deleted their backups without
closing Anki, they wouldn't get a new one if the mtime hadn't changed.
- Force takes preference over the configured backup interval - should
we be ignored the user here, or take no backups at all?
Did a sneaky edit of the existing ftl string, as it hasn't been live
long.
* Move maybe_backup() into Collection
* Use a single method for manual and periodic backups
When manually creating a backup via the File menu, we no longer make
the user wait until the backup completes. As we continue waiting for
the backup in the background, if any errors occur, the user will get
notified about it fairly quickly.
* Show message to user if backup was skipped due to no changes
+ Don't incorrectly assert a backup will be created on force
* Add "automatic" to description
* Ensure we backup prior to importing colpkg if collection open
The backup doesn't happen when invoked from 'open backup' in the profile
screen, which matches Anki's previous behaviour. The user could
potentially clobber up to 30 minutes of their work if they exited to
the profile screen and restored a backup, but the alternative is we
create backups every time a backup is restored, which may happen a number
of times if the user is trying various ones. Or we could go back to a
separate throttle amount for this case, at the cost of more complexity.
* Remove the 0 special case on backup interval; minimum of 5 minutes
https://github.com/ankitects/anki/pull/1728#discussion_r830876833
* Write media files in chunks
* Test media file writing
* Add iter `ReadDirFiles`
* Remove ImportMediaError, fail fatally instead
Partially reverts commit f8ed4d89ba.
* Compare hashes of media files to be restored
* Improve `MediaCopier::copy()`
* Restore media files atomically with tempfile
* Make downgrade flag an enum
* Remove SchemaVersion::Latest in favour of Option
* Remove sha1 comparison again
* Remove unnecessary repr(u8) (dae)
* Fix legacy colpkg import; disable v3 import/export; add roundtrip test
The test has revealed we weren't decompressing the media files on v3
import. That's easy to fix, but means all files need decompressing
even when they already exist, which is not ideal - it would be better
to store size/checksum in the metadata instead.
* Switch media and meta to protobuf; re-enable v3 import/export
- Fixed media not being decompressed on import
- The uncompressed size and checksum is now included for each media
entry, so that we can quickly check if a given file needs to be extracted.
We're still just doing a naive size comparison on colpkg import at the
moment, but we may want to use a checksum in the future, and will need
a checksum for apkg imports.
- Checksums can't be efficiently encoded in JSON, so the media list
has been switched to protobuf to reduce the the space requirements.
- The meta file has been switched to protobuf as well, for consistency.
This will mean any colpkg files exported with beta7 will be
unreadable.
* Avoid integer version comparisons
* Re-enable v3 test
* Apply suggestions from code review
Co-authored-by: RumovZ <gp5glkw78@relay.firefox.com>
* Add export_colpkg() method to Collection
More discoverable, and easier to call from unit tests
* Split import/export code out into separate folders
Currently colpkg/*.rs contain some routines that will be useful for
apkg import/export as well; in the future we can refactor them into a
separate file in the parent module.
* Return a proper error when media import fails
This tripped me up when writing the earlier unit test - I had called
the equivalent of import_colpkg()?, and it was returning a string error
that I didn't notice. In practice this should result in the same text
being shown in the UI, but just skips the tooltip.
* Automatically create media folder on import
* Move roundtrip test into separate file; check collection too
* Remove zstd version suffix
Prevents a warning shown each time Rust Analyzer is used to check the
code.
Co-authored-by: RumovZ <gp5glkw78@relay.firefox.com>
* Implement colpkg exporting on backend
* Use exporting logic in backup.rs
* Refactor exporting.rs
* Add backend function to export collection
* Refactor backend/collection.rs
* Use backend for colpkg exporting
* Don't use default zip compression for media
* Add exporting progress
* Refactor media file writing
* Write dummy collections
* Localize dummy collection note
* Minimize dummy db size
* Use `NamedTempFile::new()` instead of `new_in`
* Drop redundant v2 dummy collection
* COLLECTION_VERSION -> PACKAGE_VERSION
* Split `lock_collection()` into two to drop flag
* Expose new colpkg in GUI
* Improve dummy collection message
* Please type checker
* importing-colpkg-too-new -> exporting-...
* Compress the media map in the v3 package (dae)
On collections with lots of media, it can grow into megabytes.
Also return an error in extract_media_file_names(), instead of masking
it as an optional.
* Store media map as a vector in the v3 package (dae)
This compresses better (eg 280kb original, 100kb hashmap, 42kb vec)
In the colpkg import case we don't need random access. When importing
an apkg, we will need to be able to fetch file data for a given media
filename, but the existing map doesn't help us there, as we need
filename->index, not index->filename.
* Ensure folders in the media dir don't break the file mapping (dae)
* Add zstd dep
* Implement backend backup with zstd
* Implement backup thinning
* Write backup meta
* Use new file ending anki21b
* Asynchronously backup on collection close in Rust
* Revert "Add zstd dep"
This reverts commit 3fcb2141d2.
* Add zstd again
* Take backup col path from col struct
* Fix formatting
* Implement backup restoring on backend
* Normalize restored media file names
* Refactor `extract_legacy_data()`
A bit cumbersome due to borrowing rules.
* Refactor
* Make thinning calendar-based and gradual
* Consider last kept backups of previous stages
* Import full apkgs and colpkgs with backend
* Expose new backup settings
* Test `BackupThinner` and make it deterministic
* Mark backup_path when closing optional
* Delete leaky timer
* Add progress updates for restoring media
* Write restored collection to tempfile first
* Do collection compression in the background thread
This has us currently storing an uncompressed and compressed copy of
the collection in memory (not ideal), but means the collection can be
closed without waiting for compression to complete. On a large collection,
this takes a close and reopen from about 0.55s to about 0.07s. The old
backup code for comparison: about 0.35s for compression off, about
8.5s for zip compression.
* Use multithreading in zstd compression
On my system, this reduces the compression time of a large collection
from about 0.55s to 0.08s.
* Stream compressed collection data into zip file
* Tweak backup explanation
+ Fix incorrect tab order for ignore accents option
* Decouple restoring backup and full import
In the first case, no profile is opened, unless the new collection
succeeds to load.
In the second case, either the old collection is reloaded or the new one
is loaded.
* Fix number gap in Progress message
* Don't revert backup when media fails but report it
* Tweak error flow
* Remove native BackupLimits enum
* Fix type annotation
* Add thinning test for whole year
* Satisfy linter
* Await async backup to finish
* Move restart disclaimer out of backup tab
Should be visible regardless of the current tab.
* Write restored collection in chunks
* Refactor
* Write media in chunks and refactor
* Log error if removing file fails
* join_backup_task -> await_backup_completion
* Refactor backup.rs
* Refactor backup meta and collection extraction
* Fix wrong error being returned
* Call sync_all() on new collection
* Add ImportError
* Store logger in Backend, instead of creating one on demand
init_backend() accepts a Logger rather than a log file, to allow other
callers to customize the logger if they wish.
In the future we may want to explore using the tracing crate as an
alternative; it's a bit more ergonomic, as a logger doesn't need to be
passed around, and it plays more nicely with async code.
* Sync file contents prior to rename; sync folder after rename.
* Limit backup creation to once per 30 min
* Use zstd::stream::copy_decode
* Make importing abortable
* Don't revert if backup media is aborted
* Set throttle implicitly
* Change force flag to minimum_backup_interval
* Don't attempt to open folders on Windows
* Join last backup thread before starting new one
Also refactor.
* Disable auto sync and backup when restoring again
* Force backup on full download
* Include the reason why a media file import failed, and the file path
- Introduce a FileIoError that contains a string representation of
the underlying I/O error, and an associated path. There are a few
places in the code where we're currently manually including the filename
in a custom error message, and this is a step towards a more consistent
approach (but we may be better served with a more general approach in
the future similar to Anyhow's .context())
- Move the error message into importing.ftl, as it's a bit neater
when error messages live in the same file as the rest of the messages
associated with some functionality.
* Fix importing of media files
* Minor wording tweaks
* Save an allocation
I18n strings with replacements are already strings, so we can skip the
extra allocation. Not that it matters here at all.
* Terminate import if file missing from archive
If a third-party tool is creating invalid archives, the user should know
about it. This should be rare, so I did not attempt to make it
translatable.
* Skip multithreaded compression on small collections
Co-authored-by: Damien Elmes <gpg@ankiweb.net>
This is not ideal, but I struggled to come up with a better solution.
Background:
- The scheduler records the mtime of cards as it's building the queues,
and will throw an error in get_queued_cards() if the card on the DB
has a different mtime. This is to catch bugs - any operation that modifies
cards should be triggering a queue rebuild, or should adjust the queues
appropriately.
- The review screen skips the usual queue rebuild redraw, and directly
updates the flag icon. This is because a rebuild could cause a different
card to appear, or the answer side to switch back to the question side,
neither of which the user expects when they flag a card.
The current behaviour was broken: the queue rebuilding was still happening
on the backend, and the frontend was just failing to reflect it.
I initially tried to special-case Op::SetFlag, having it skip the queue
rebuild, and having set_card_flag() update the mtimes in the active
queue. But those mutations weren't captured by the undo log, so they
didn't get undone when undoing the set flag operation. We could perhaps
work around it by adding a separate undo entry to capture the mutation,
but it started to feel like it would be a pain to maintain moving forward.
By skipping the undo queue and retaining the same mtime, no queue
rebuild is required. Because we're setting usn, the cards will still
sync, but as mtime is not bumped, in the case of a conflict, an older
unsynced change from another client may revert the flag change.
Fixes https://forums.ankiweb.net/t/anki-2-1-50-beta-1-2/15608/145
Instead of calling a method inside the transaction body, routines
can now pass Op::SkipUndo if they wish the changes to be discarded
at the end of the transaction. The advantage of doing it this way is
that the list of changes can still be returned, allowing the sync
indicator to update immediately.
Closes#1252
Allows add-on authors to define their own label for a group of undoable
operations. For example:
def mark_and_bury(
*,
parent: QWidget,
card_id: CardId,
) -> CollectionOp[OpChanges]:
def op(col: Collection) -> OpChanges:
target = col.add_custom_undo_entry("Mark and Bury")
col.sched.bury_cards([card_id])
card = col.get_card(card_id)
col.tags.bulk_add(note_ids=[card.nid], tags="marked")
return col.merge_undo_entries(target)
return CollectionOp(parent, op)
The .add_custom_undo_entry() is for adding your own custom actions.
When extending a standard Anki action, instead store `target =
col.undo_status().last_step` after executing the standard operation.
This started out as a bigger refactor that required a separate
.commit_undoable() call to be run after each operation, instead of
having each operation return changes directly. But that proved to be
somewhat cumbersome in unit tests, and ran the risk of unexpected
behaviour if the caller invoked an operation without remembering to
finalize it.
- transact() now automatically clears card queues unless an op
opts-out (and currently only AnswerCard does). This means there's no
risk of forgetting to clear the queues in an operation, or when undoing/
redoing
- CollectionOp->UndoableOp
- clear queues when redoing "answer card", instead of clearing redo
when clearing queues