The old Python code was only checking for NFC encoding, but we should
check for other issues like special filenames on windows (eg con.mp3)
- On export, the user is told to use Check Media if their media has
invalid filenames.
- On import, legacy packages will be transparently normalized. Since we're
doing the checks on export as well, any invalid names in a v3 package
are an error.
* Fix legacy colpkg import; disable v3 import/export; add roundtrip test
The test has revealed we weren't decompressing the media files on v3
import. That's easy to fix, but means all files need decompressing
even when they already exist, which is not ideal - it would be better
to store size/checksum in the metadata instead.
* Switch media and meta to protobuf; re-enable v3 import/export
- Fixed media not being decompressed on import
- The uncompressed size and checksum is now included for each media
entry, so that we can quickly check if a given file needs to be extracted.
We're still just doing a naive size comparison on colpkg import at the
moment, but we may want to use a checksum in the future, and will need
a checksum for apkg imports.
- Checksums can't be efficiently encoded in JSON, so the media list
has been switched to protobuf to reduce the the space requirements.
- The meta file has been switched to protobuf as well, for consistency.
This will mean any colpkg files exported with beta7 will be
unreadable.
* Avoid integer version comparisons
* Re-enable v3 test
* Apply suggestions from code review
Co-authored-by: RumovZ <gp5glkw78@relay.firefox.com>
* Add export_colpkg() method to Collection
More discoverable, and easier to call from unit tests
* Split import/export code out into separate folders
Currently colpkg/*.rs contain some routines that will be useful for
apkg import/export as well; in the future we can refactor them into a
separate file in the parent module.
* Return a proper error when media import fails
This tripped me up when writing the earlier unit test - I had called
the equivalent of import_colpkg()?, and it was returning a string error
that I didn't notice. In practice this should result in the same text
being shown in the UI, but just skips the tooltip.
* Automatically create media folder on import
* Move roundtrip test into separate file; check collection too
* Remove zstd version suffix
Prevents a warning shown each time Rust Analyzer is used to check the
code.
Co-authored-by: RumovZ <gp5glkw78@relay.firefox.com>
* Implement colpkg exporting on backend
* Use exporting logic in backup.rs
* Refactor exporting.rs
* Add backend function to export collection
* Refactor backend/collection.rs
* Use backend for colpkg exporting
* Don't use default zip compression for media
* Add exporting progress
* Refactor media file writing
* Write dummy collections
* Localize dummy collection note
* Minimize dummy db size
* Use `NamedTempFile::new()` instead of `new_in`
* Drop redundant v2 dummy collection
* COLLECTION_VERSION -> PACKAGE_VERSION
* Split `lock_collection()` into two to drop flag
* Expose new colpkg in GUI
* Improve dummy collection message
* Please type checker
* importing-colpkg-too-new -> exporting-...
* Compress the media map in the v3 package (dae)
On collections with lots of media, it can grow into megabytes.
Also return an error in extract_media_file_names(), instead of masking
it as an optional.
* Store media map as a vector in the v3 package (dae)
This compresses better (eg 280kb original, 100kb hashmap, 42kb vec)
In the colpkg import case we don't need random access. When importing
an apkg, we will need to be able to fetch file data for a given media
filename, but the existing map doesn't help us there, as we need
filename->index, not index->filename.
* Ensure folders in the media dir don't break the file mapping (dae)
Ideally this would have been in beta 6 :-) No add-ons appear to be
using customstudy.py/taglimit.py though, so it should hopefully not be
disruptive.
In the earlier custom study changes, we didn't get around to addressing
issue #1136. Now instead of trying to determine the maximum increase
to allow (which doesn't work correctly with nested decks), we just
present the total available to the user again, and let them decide. There's
plenty of room for improvement here still, but further work here might
be better done once we look into decoupling deck limits from deck presets.
Tags and available cards are fetched prior to showing the dialog now,
and will show a progress dialog if things take a while.
Tags are stored in an aux var now, so they don't inflate the deck
object size.
* Add forget prompt with options
- Restore original position
- Reset reps and lapses
* Restore position when resetting for export
* Add config context to avoid passing keys
* Add routine to fetch defaults; use method-specific enum (dae)
* Keep original position by default (dae)
* Fix code completion for forget dialog (dae)
Needs to be a symbolic link to the generated file
* Add zstd dep
* Implement backend backup with zstd
* Implement backup thinning
* Write backup meta
* Use new file ending anki21b
* Asynchronously backup on collection close in Rust
* Revert "Add zstd dep"
This reverts commit 3fcb2141d2.
* Add zstd again
* Take backup col path from col struct
* Fix formatting
* Implement backup restoring on backend
* Normalize restored media file names
* Refactor `extract_legacy_data()`
A bit cumbersome due to borrowing rules.
* Refactor
* Make thinning calendar-based and gradual
* Consider last kept backups of previous stages
* Import full apkgs and colpkgs with backend
* Expose new backup settings
* Test `BackupThinner` and make it deterministic
* Mark backup_path when closing optional
* Delete leaky timer
* Add progress updates for restoring media
* Write restored collection to tempfile first
* Do collection compression in the background thread
This has us currently storing an uncompressed and compressed copy of
the collection in memory (not ideal), but means the collection can be
closed without waiting for compression to complete. On a large collection,
this takes a close and reopen from about 0.55s to about 0.07s. The old
backup code for comparison: about 0.35s for compression off, about
8.5s for zip compression.
* Use multithreading in zstd compression
On my system, this reduces the compression time of a large collection
from about 0.55s to 0.08s.
* Stream compressed collection data into zip file
* Tweak backup explanation
+ Fix incorrect tab order for ignore accents option
* Decouple restoring backup and full import
In the first case, no profile is opened, unless the new collection
succeeds to load.
In the second case, either the old collection is reloaded or the new one
is loaded.
* Fix number gap in Progress message
* Don't revert backup when media fails but report it
* Tweak error flow
* Remove native BackupLimits enum
* Fix type annotation
* Add thinning test for whole year
* Satisfy linter
* Await async backup to finish
* Move restart disclaimer out of backup tab
Should be visible regardless of the current tab.
* Write restored collection in chunks
* Refactor
* Write media in chunks and refactor
* Log error if removing file fails
* join_backup_task -> await_backup_completion
* Refactor backup.rs
* Refactor backup meta and collection extraction
* Fix wrong error being returned
* Call sync_all() on new collection
* Add ImportError
* Store logger in Backend, instead of creating one on demand
init_backend() accepts a Logger rather than a log file, to allow other
callers to customize the logger if they wish.
In the future we may want to explore using the tracing crate as an
alternative; it's a bit more ergonomic, as a logger doesn't need to be
passed around, and it plays more nicely with async code.
* Sync file contents prior to rename; sync folder after rename.
* Limit backup creation to once per 30 min
* Use zstd::stream::copy_decode
* Make importing abortable
* Don't revert if backup media is aborted
* Set throttle implicitly
* Change force flag to minimum_backup_interval
* Don't attempt to open folders on Windows
* Join last backup thread before starting new one
Also refactor.
* Disable auto sync and backup when restoring again
* Force backup on full download
* Include the reason why a media file import failed, and the file path
- Introduce a FileIoError that contains a string representation of
the underlying I/O error, and an associated path. There are a few
places in the code where we're currently manually including the filename
in a custom error message, and this is a step towards a more consistent
approach (but we may be better served with a more general approach in
the future similar to Anyhow's .context())
- Move the error message into importing.ftl, as it's a bit neater
when error messages live in the same file as the rest of the messages
associated with some functionality.
* Fix importing of media files
* Minor wording tweaks
* Save an allocation
I18n strings with replacements are already strings, so we can skip the
extra allocation. Not that it matters here at all.
* Terminate import if file missing from archive
If a third-party tool is creating invalid archives, the user should know
about it. This should be rare, so I did not attempt to make it
translatable.
* Skip multithreaded compression on small collections
Co-authored-by: Damien Elmes <gpg@ankiweb.net>
* Replace Card.data with .original_position
* Use and update original position in v3
* Show original position in card info
* Revert restoring original position for now
* Fix pb card to/from pylib card
* Try original_position as the last pb field
* minor wording tweaks (dae)
* Implement custom study on backend
* Switch frontend to backend custom study
* Skip typecheck for new pb classes
* Build tag search string on backend
Also fixes escaping of special characters in tag names.
* `cram.cards` -> `cram.card_limit`
* Assign more meaningful names in `TagLimit`
* Broaden rustfmt glob
* Use `invalid_input()` helper
* Assign `FilteredDeckForUpdate` to temp var
* Implement `SearchBuilder`
* Rewrite `custom_study()` with `SearchBuilder`
* Replace match macros with `SearchBuilder`
* Remove `into_nodes_list` & `concatenate_searches`
* Add new `card_rendering` mod
Parses a text with av/tts tags and strips or extracts tags.
* Replace old `extract_av_tags` and `strip_av_tags`
... with new `card_rendering` mod
* ressource -> resource
* Add AV prettifier for use in browser table
* Accept String in av tag routines
... and avoid redundant writes if no changes need to be made.
* add benchmarking with criterion; make links test optional (dae)
cargo install cargo-criterion, then run ./bench.sh
* performance comparison: creating HashMap up front (dae)
the previous solution:
anki_tag_parse time: [1.8401 us 1.8437 us 1.8476 us]
this solution:
anki_tag_parse time: [2.2420 us 2.2447 us 2.2477 us]
change: [+21.477% +21.770% +22.066%] (p = 0.00 < 0.05)
Performance has regressed.
* Revert "performance comparison: creating HashMap up front" (dae)
This reverts commit f19126a2f1.
* add missing header
* Write error message if tts lang is missing
* `Tag` -> `Directive`
* Enable access to old notetype name
* Set minimum height for ChangeNotetypeDialog
* Add bootstrap icons to change-notetype
* Move alert up and make it collapsible
* Tweak some CSS
- Add variables --sticky-bg and --sticky-border to StickyContainer
- Tweak base.css
* Add translatable string "(Nothing)"
* Rework ChangeNotetype screen
* Initially load option at newIndex and remaining options on focus
Optimization for big notetypes:
Should increase efficiency from O(n²) to O(n). Test on notetype with 500 templates shows significant improvement in load time (~10s down to ~1s).
* Try to satisfy rust test
* Change arrow direction depending on reading direction
+ add 0.5em top padding to main
* Create Alert.svelte
* Introduce CSS variable --pane-bg
* Revert "Initially load option at newIndex and remaining options on focus"
This reverts commit f42beee45c.
* Final cleanup
* Refine padding/gutter
We're getting an enum instead of an int in Qt6
normal/reversed have been renamed to ascending/descending; no add-ons
appear to be using the old versions.
* Only collect card stats on the backend ...
... instead of rendering an HTML string using askama.
* Add ts page Card Info
* Update test for new `col.card_stats()`
* Remove obsolete CardStats code
* Use new ts page in `CardInfoDialog`
* Align start and end instead of left and right
Curiously, `text-align: start` does not work for `th` tags if assigned
via classes.
* Adopt ts refactorings after rebase
#1405 and #1409
* Clean up `ts/card-info/BUILD.bazel`
* Port card info logic from Rust to TS
* Move repeated field to the top
https://github.com/ankitects/anki/pull/1414#discussion_r725402730
* Convert pseudo classes to interfaces
* CardInfoPage -> CardInfo
* Make revlog in card info optional
* Add legacy support for old card stats
* Check for undefined instead of falsy
* Make Revlog separate component
* drop askama dependency (dae)
* Fix nightmode for legacy card stats
Python's regex engine performs pathologically on regexes like
'<!--.*?-->' when fed a large string of repeating '<!--' clauses.
Thanks to JaimeSlome / security@huntr.dev for the report; closes#1380.
Solved by switching to the Rust implementation, which does not suffer
from this issue.
entsToText(), minimizeHTML(), and the old regex constants have been
removed; they do not appear to be used by any add-ons.
Matches should arrive in alphabetical order. Currently results are not
capped (JS should be able to handle ~1k tags without too much hassle),
and no reordering based on match location is done. Matches are substring
based, and multiple can be provided, eg "foo::bar" will match
"foof::baz::abbar".
This is not hooked up properly on the frontend at the moment -
updateSuggestions() seems to be missing the most recently typed character,
and is not updating the list of completions half the time.
- changes can now be undone
- the same field can now be mapped to multiple target fields, allowing
fields to be cloned
- the old Qt dialog has been removed
- the old col.models.change() API calls the new code, to avoid
breaking existing consumers. It requires the field map to always
be passed in, but that appears to have been the common case.
- closes#1175
Like the previous change, avoid exposing the protobuf as a public API
for now. It requires more thought, and is probably better done with
either extra helper accessors like decks.name(), or via a native class.
This could potentially help us avoid having to refetch the notetype
during study in the future, though updates to Note initialization and
the LaTeX handling would be required first.
Instead of using a separate undo queue, the code now defers checking for
newly-due learning cards until the answering stage, and logs the updated
cutoff time as an undoable change, so that any newly-due learning cards
won't appear instead of a new/review card that was just undone.
Queue redo now uses a similar approach to undo, instead of rebuilding the
queues.
- split new card fetch order and subsequent sort order; use latter
when building queues
- default to spacing siblings when burying is off, with options to
show each sibling in turn, and shuffle the fetched cards
Avoids duplicate work, and is a step towards allowing the next
states to be modified by third-party code.
Also:
- fixed incorrect underlined count, due to reviews being labeled as
learning cards
- fixed reviewer not refreshing when undoing a test review, by splitting
up backend queue rebuilding from frontend reviewer refresh
- moved answering into a CollectionOp
Allows add-on authors to define their own label for a group of undoable
operations. For example:
def mark_and_bury(
*,
parent: QWidget,
card_id: CardId,
) -> CollectionOp[OpChanges]:
def op(col: Collection) -> OpChanges:
target = col.add_custom_undo_entry("Mark and Bury")
col.sched.bury_cards([card_id])
card = col.get_card(card_id)
col.tags.bulk_add(note_ids=[card.nid], tags="marked")
return col.merge_undo_entries(target)
return CollectionOp(parent, op)
The .add_custom_undo_entry() is for adding your own custom actions.
When extending a standard Anki action, instead store `target =
col.undo_status().last_step` after executing the standard operation.
This started out as a bigger refactor that required a separate
.commit_undoable() call to be run after each operation, instead of
having each operation return changes directly. But that proved to be
somewhat cumbersome in unit tests, and ran the risk of unexpected
behaviour if the caller invoked an operation without remembering to
finalize it.
- backend now updates current notetype as part of addition
- frontend no longer implicitly adds, so we can assign a new name and
add in a single operation
- tokio 1.0
- updated reqwest, thanks to Rumo
- other minor dep updates
the reqwest build file has been split into two, as it was awkward
to manually update the combined file, and the platform gate is now
on the target in rslib/
The deck name must be constructed by calling associated functions of
NativeDeckName, unless the name is guaranteed to be valid machine
name (like "Default").
NativeDeckName exposes methods to mutate the deck name and return
the human name.
The storage routines take &strs, but those should be slices of
NativeDeckNames to ensure machine form and normalization.
Instead, fetch the config order on the frontend and pass a builtin
variant into the backend.
That makes the following unnecessary:
* Resolving the config sort in search/mod.rs
* Deserializing the Column enum
* Config accessors for the sort columns
* Remove duplicate backend columns
* Remove duplicate column routines
* Move columns on frontend from state to model
* Generate available columns from Colum enum
* Add second column label for notes mode