- tags.tag -> tags.name
- priority reset to 0 for now; will be used differently in the future
- cardTags.id removed; (tagId, cardId) is the primary key now
- cardTags.src -> cardTags.type
Cards had developed quite a lot of cruft from incremental changes, and a
number of important attributes were stored under names that bore no relation
to their actual use.
Added:
- position, which new cards will be sorted on in the future
- flags, which is reserved for future use
Renamed:
- type to queue
- relativeDelay to type
- noCount to lapses
Removed:
- all new/young/matureEase counts; the information is in the revlog
- firstAnswered, lastDue, lastFactor, averageTime and totalTime for the same
reason
- isDue, spaceUntil and combinedDue, because they are no longer used. Spaced
cards will be implemented differently in a coming commit.
- priority
- yesCount, because it can be inferred from reps & lapses
- tags; they've been stored in facts for a long time now
Also compatibility with deck versions less than 65 has been dropped, so decks
will need to be upgraded to 1.2 before they can be upgraded by the dev code.
All shared decks are on 1.2, so this should hopefully not be a problem.
- rename to revlog
- change the pk to time, as we want an index on time, and the old multi-column
index was expensive and not useful
- remove yes/no count; they can be inferred from the ease
- remove lastFactor, as it's in the previous entry
- remove delay, it can be inferred from last entry
- remove 'next' from nextInterval and nextFactor
- rename 'thinkingTime' to 'userTime'
- rename reps to rep
- migrate old data to new table, and fix some problems in the process: ease0
-> ease1, and limit thinking time to 60 seconds as it should have been
previously
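
The data fixes in the last item could be sketched roughly as follows. This is
an illustration with Python's sqlite3 module, not the actual upgrade code: the
old table name (reviewHistory), the reduced column set and the function name
are assumptions made for the example.

```python
import sqlite3

def migrate_history(db):
    """Copy rows from the assumed old table into revlog, fixing data on the way."""
    db.execute("""
        create table if not exists revlog (
            time integer primary key,  -- pk on time, as described above
            cardId integer not null,
            ease integer not null,
            userTime real not null)""")
    db.execute("""
        insert or ignore into revlog (time, cardId, ease, userTime)
        select time, cardId,
               case when ease = 0 then 1 else ease end,  -- ease0 -> ease1
               min(thinkingTime, 60)                     -- cap at 60 seconds
        from reviewHistory""")
```

Note that with time as the primary key, "insert or ignore" silently skips rows
that share a timestamp; a real migration would need to decide how to handle
those.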
The stats table was how the early non-SQL versions of Anki kept track of
statistics, before there was a revision log. It is being removed because:
- it's not possible to show the statistics for a subset of the deck
- it can't meaningfully be copied on import/export
- it makes it harder to implement sync merging
Implications:
- graphs and deck stats take roughly 1.5-3x longer to generate than before,
but we'll have the ability to generate stats for subsections of the deck, and
it's not time critical code
- people who've been using anki since the very early days may notice a drop in
statistics, as early repetitions were recorded in the stats table but the
revlog didn't exist at that point.
- due to bugs in old syncs and imports/exports, the stats and revlog may not
match numbers exactly
To remove it, the following changes have been made:
- the graphs and deck stats now generate their data entirely from the revlog
- there are no stats to keep track of how many cards we've answered, so we
pull that information from the revlog in reset()
- we remove _globalStats and _dailyStats from the deck
- we check if a day rollover has occurred using failedCutoff instead
- we remove the getStats() routine
- the ETA code is currently disabled
- timeboxing routines use repsToday instead of stats
- remove stats delete from export
- remove stats table and index in upgrade
- remove stats syncing and globalStats refresh pre-sync
- remove stats count check in fullSync check, which was redundant anyway
- update unit tests
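
Pulling the answered-card count from the revlog might look something like the
sketch below. The function name and the cutoff parameter are hypothetical, and
the only thing assumed about the revlog table is that it has a time column.

```python
import sqlite3

def reps_since(db, cutoff):
    """Count revlog entries recorded at or after `cutoff` (a timestamp)."""
    row = db.execute(
        "select count(*) from revlog where time >= ?", (cutoff,)).fetchone()
    return row[0]
```

Because time is the primary key, a range query like this one can use the
table's own index.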
Also:
- newCountToday -> newCount, to bring it in line with revCount&failedCount
which also reflect the currently due count
- newCount -> newAvail
- timeboxing routines renamed since the old names were confusingly similar to
refreshSession() which does something different
Todo:
- update newSeenToday & repsToday when answering a card
- reimplement eta
Calculating the average on startup is expensive on mobile devices. It might be
nice to provide it as a deck option or per-model setting in the future so that
people can specify how hard their material is and have it treated accordingly.
Previously we had an index on the value field, which was very expensive for
long fields. Instead we use a separate column and take the first 8 characters
of the field value's md5sum, and index that. In decks with lots of text in
fields, it can cut the deck size by 30% or more, and many decks improve by
10-20%. Decks with only a few characters in fields may increase in size
slightly, but this is offset by the fact that we only generate a checksum for
fields that have uniqueness checking on.
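
As a rough sketch of the checksum described above (the function name and the
UTF-8 encoding step are assumptions, not taken from the code):

```python
import hashlib

def field_checksum(value):
    """Return the first 8 hex characters of the MD5 digest of a field value."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()[:8]
```

A truncated checksum can collide, so a uniqueness check built on it would
presumably still compare the full field values of any rows that match.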
Also, fixed import->update reporting the total # of available facts instead of
the number of facts that were imported.
Anki 1.0 had a similar feature but we do things a bit differently now. The
relative spacing applies only to reviews, and spaces cards according to their
interval, instead of spacing all cards the same. Any delay < 1 full day is
treated as no delay, so with the default 10% setting, reviews with an interval
< 10 days are not spaced at all. This should hopefully cut down on support
queries from people wondering why many of their cards were delayed. It also
allows the two settings to be documented separately, and does away with the
somewhat confusing usage of non-integer new sibling values to disable review
spacing.
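
The arithmetic can be illustrated with a small sketch. The function and
parameter names are hypothetical; only the 10% default and the "less than one
full day means no delay" rule come from the description above.

```python
def review_spacing_days(interval_days, spacing_fraction=0.10):
    """Whole days to space a sibling review; under one full day counts as none."""
    return int(interval_days * spacing_fraction)
```

With the default setting, a 9 day interval gives no spacing, a 10 day interval
gives one day, and a 25 day interval gives two.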
- this fixes a state where cards failed on that future day could end up
with an earlier due date than the rest of the failed mature cards, leading
to the newly failed cards being repeated prematurely
- this leads to non-deterministic scheduling of the mature bonus fails, so
they are effectively randomized which is probably what most users want
This works fine if the user is showing all cards, but if they have limited
reviews to certain categories, it can result in the counts going negative
because we decremented for cards which weren't actually due. Determining if a
card was actually due or not is an expensive operation, so instead we leave
the counts alone and make sure reviews will finish early if the new/rev counts
are non-zero but the queue is empty.
because field formatting is always on now, users with custom font
sizes/families set only on the card will still have to alter their templates
and either configure the fields or replace the references with triple
curly braces
- move latex preamble into a deck var and include amsmath by default
- include the pre/postamble in the hash, so changes to the preamble result in
newly generated images
- latex now slots in to the formatQA hook to render images in the q/a
- moved call() to utils
- cache/uncache latex have been obsoleted. User can delete manually, and
images will be regenerated with a DB check
- media is no longer hashed, and instead stored in the db using its original
name
- when adding media, its checksum is calculated and used to look for
duplicates
- duplicate filenames will result in a number tacked onto the filename
- the size column is used to count card references to media. If media is
referenced in a fact but not the question or answer, the count will be zero.
- there is no guarantee media will be listed in the media db if it is unused
on the question & answer
- if rebuildMediaDir(delete=True), then entries with zero references are
deleted, along with any unused files in the media dir.
- rebuildMediaDir() will update the internal checksums, and set the checksum
to "" if a file can't be found
- rebuildMediaDir() is a lot less destructive now, and will leave alone
directories it finds in the media folder (but not look in them either)
- rebuildMediaDir() returns more information about the state of media now
- the online and mobile clients will need to make sure that when
downloading media, entries with no checksum are non-fatal and should not
abort the download process.
- the ref count is updated every time the q/a is updated - so the db should be
up to date after every add/edit/import
- since we look for media on the q/a now, card templates like '<img
src="{{{field}}}">' will work now
- export original files is gone, as it is not needed anymore
- move from per-model media URL to deckVar. downloadMissingMedia() uses this
now. Deck subscriptions will have to be updated to share media another way.
- pass deck in formatQA, as latex support is going to change
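
The duplicate handling when adding media could be sketched as below, using a
plain dict in place of the media table. The MD5 checksum, the " (n)" suffix
format and all names here are assumptions for illustration only.

```python
import hashlib
import os

def add_media(media, name, data):
    """Register `data` under `name` in `media` (a dict of filename -> checksum).

    Returns the filename that ends up referring to the data.
    """
    csum = hashlib.md5(data).hexdigest()
    # Same content already stored: reuse the existing entry.
    for existing, existing_csum in media.items():
        if existing_csum == csum:
            return existing
    # Same name but different content: tack a number onto the filename.
    base, ext = os.path.splitext(name)
    candidate, n = name, 1
    while candidate in media:
        candidate = "%s (%d)%s" % (base, n, ext)
        n += 1
    media[candidate] = csum
    return candidate
```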
this bypasses rebuilding the queue and other startup initialization and thus
loads the deck considerably faster. This is useful when you want to perform
operations on the deck like syncing, but don't need the ability to review
cards
- obsolete spaceUntil - it serves no useful purpose
- the old per-model spacing variables are obsolete, as the new approach
requires uniform spacing across all models for new cards
- introduce a new per-deck variable: newSpacing
- don't fill new queue if we've done today's cards
- still need to check cramming / review early
newSpacing is a time in seconds to delay introduction of sibling new cards.
It can be applied as many times as necessary as there is no harm in new cards
being delayed repeatedly. Because the default queue length is 200 and it can
take quite some time for the spaced cards to be placed in the queue again, we
use a separate array to track spaced new cards provided the configured delay
is less than 20 minutes. For delays under 20 minutes, newSpacing is not a
guaranteed minimum spacing - if the new card queue is empty the spaced cards
will be flushed before checking the new queue again, as otherwise we end up
trying to fill on every repetition. The due counts no longer decrease by more
than one if the spacing is less than the due cutoff, since that confused some
users.
Review cards are now placed at the end of the current review queue, and will
never be rescheduled to a different day. The old approach had a number of
problems:
- the more card models you had, the more likely a card would be spaced
multiple times, resulting in you forgetting the card before you get a chance
to review it
- spacing was applied even if the due card was already late
- repeatedly failing one card over a period of days or weeks would also starve
the other cards of attention
- the local deck name must now match the online deck
- syncName is a hash of the current deck's path or None
- the hash is checked on deck load, and if it is different (because the deck
was copied or moved), syncing is disabled. This should prevent people from
accidentally clobbering their online decks
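
A minimal sketch of the path check, assuming an MD5 hash of the absolute path
(the actual hash function is not stated here, and both function names are
hypothetical):

```python
import hashlib
import os

def path_hash(path):
    """Hash of the deck's absolute path, used as the stored syncName."""
    return hashlib.md5(os.path.abspath(path).encode("utf-8")).hexdigest()

def sync_allowed(sync_name, path):
    """Syncing stays enabled only if syncName is set and matches the path."""
    return sync_name is not None and sync_name == path_hash(path)
```

If the deck file is copied or moved, the stored hash no longer matches the new
path and the check fails.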
When you call operations like deleteCards(), suspendCards() and so on, it is
now necessary to call deck.reset() afterwards. This allows the calling code to
delay a reset if necessary. If the calling code calls a function that says the
caller must reset, the caller should be sure to call .reset() and fetch the
current card again. Failure to do the latter will result in answerCard()
attempting to remove the card from the queue, when the queue has been cleared.
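
The calling convention can be shown with a toy stand-in for the deck. ToyDeck
below is not the real class; it only mimics the "operation clears the queue,
caller resets" behaviour described above.

```python
class ToyDeck:
    """Stand-in for the real deck, only to illustrate the calling convention."""

    def __init__(self, card_ids):
        self.cards = list(card_ids)
        self.queue = []
        self.reset()

    def reset(self):
        # Rebuild the queue from the cards that still exist.
        self.queue = list(self.cards)

    def getCard(self):
        return self.queue[0] if self.queue else None

    def deleteCards(self, ids):
        # Clears the queue and leaves the reset to the caller.
        self.cards = [c for c in self.cards if c not in ids]
        self.queue = []

deck = ToyDeck([1, 2, 3])
current = deck.getCard()
deck.deleteCards([current])
deck.reset()              # required after deleteCards()
current = deck.getCard()  # fetch the current card again
```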
- make sure cardLimit() matches on sql statements that are broken over lines
- fix logic in getCardId()
- don't increment failed count if delay1>0 and card was mature
The old delay1 behaviour isn't easy to achieve with the queue code, as we only
refresh the queue when it's emptied, and if the user has delay1 set to say 9
hours, failed mature cards sitting in the queue could prevent subsequent young
failures from being displayed. Instead we convert delay1 to a count in days in
which to offset failed mature cards. 0 means the same time as delay0, 1 means
show the card a day later, and so on. This means users will lose the ability
to delay mature cards for x number of minutes more than young cards, but a
scan of AnkiOnline decks indicates that's not often done.
We also need to use a separate cutoff for failed cards, since we need to be
able to display them as they expire if the user has disabled per-day
scheduling.
And instead of marking cards as due in the future, we set their due time to
the current time, and move the delay0 calculation to getCardId(). This means
that if the user changes their failed card settings from say 1 hour to 10
minutes, the changes apply to the currently failed cards and not just cards
failed in the future.
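
The effect of moving the delay0 calculation to fetch time can be sketched as
follows. The function and parameter names are hypothetical, and times are
assumed to be in seconds.

```python
def failed_card_ready(failed_at, delay0, now):
    """A failed card is shown once delay0 seconds have passed since it failed."""
    return failed_at + delay0 <= now
```

Because only the time of failure is stored, a card failed at t=1000 with a one
hour delay is not ready at t=1700, but becomes ready immediately if the delay
is lowered to 10 minutes.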
In various parts of the code we need to get all cards of a given category
(new, failed, etc) regardless of whether they're suspended, buried, etc. So we
store the true type in the obsolete relativeDelay column and add an index for
it, because it's cheaper than putting indices on reps & successive.
- because the cutoff adds a few hours past midnight, it's possible for a card
that's scheduled for 1.0 days ahead to fall within the current cutoff, so we
need to make sure that doesn't happen
- set spaceUntil=0 when answering card again
- fix randomizeNewCards() query. the whole codebase needs auditing for type
references which need updating
* Adjust type to remove cards from the queues, so we don't have to rebuild
priorities to restore them:
  Type -= 3 when suspending
  Type += 3 when burying
  Type += 6 when cramming / reviewing early
We still need to adjust priorities for backwards compatibility, but this can
be removed in the future.
* Factor out scheduler-specific code in answerCard(), so the different
schedulers are now fully modular
* Differentiate between a card's current queue and its type
* Make sure dueCutoff cuts off at the chosen offset instead of midnight
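
The type offsets from the first item can be sketched as below. The assumption
that queued cards use the base types 0, 1 and 2 is inferred from the offsets
of three, and the helper names are hypothetical.

```python
# Assumed base types for cards that are in a queue.
QUEUED_TYPES = (0, 1, 2)

def suspend(card_type):
    return card_type - 3   # moves 0..2 to -3..-1

def unsuspend(card_type):
    return card_type + 3   # undoes suspend()

def bury(card_type):
    return card_type + 3   # moves 0..2 to 3..5

def review_early(card_type):
    return card_type + 6   # moves 0..2 to 6..8 (cramming / reviewing early)

def is_queued(card_type):
    return card_type in QUEUED_TYPES
```

Since each adjustment is a fixed offset, restoring a card is a single
subtraction or addition rather than a rebuild of priorities.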
- new type, combinedDue for failed cards & count checks
- only reset() on deck load if not already done
- remove isDue from dynamic indices but leave old ones around for now
- cramming is now a separate scheduler type
- correctly answering a card while cramming causes its scheduling to be
changed in the standard review too
- options to sort cards by earliest modified, ordered, random
- render priority 0 obsolete, as it's all done at queue generation time now
- reimplement reviewEarly and newEarly by replacing parts of the scheduler,
instead of adding special conditions
- remove references to isDue and priority (1,2,3,4), which are not necessary
anymore
- add option to switch between per-day scheduling and due now scheduling
- newCardsToday() -> newCardsDoneToday()
- don't decrement counts for suspended cards
- make sure to update type when suspending/unsuspending
- fix findCards()
- set hardInterval = 1-1.1 on upgrade, or the default per day scheduling doesn't
make sense
Previously we used getCard() to fetch one card at a time. This required a
number of indices to perform efficiently, and the indices were expensive in
terms of disk space and time required to keep them up to date. Instead we now
gather a bunch of cards at once.
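
Gathering a batch in one query might look something like this sketch. The
cards table layout, the due column name and the function name are assumptions
for the example; the real schema differs.

```python
import sqlite3

def fill_queue(db, cutoff, limit=200):
    """Fetch up to `limit` due card ids in one query, soonest first."""
    rows = db.execute(
        "select id from cards where due < ? order by due limit ?",
        (cutoff, limit)).fetchall()
    return [r[0] for r in rows]
```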
- Drop checkDue()/isDue so writes are not necessary to the DB when checking
for due cards
- Due counts checked on deck load, and only updated once a day or at the end
of a session. This prevents cards from expiring during reviews, leading to
confusing undo behaviour and due counts that go up instead of down as you
review. The default will be to only expire cards once a day, which represents
a change from the way things were done previously.
- Set deck var defaults on deck load/create instead of upgrade, which should
fix upgrade issues
- The scheduling code can now have bits and pieces switched out, which should
make review early / cram etc easier to integrate
- Cards with priority <= 0 now have their type incremented by three, so we can
get access to schedulable cards with a single column.
- rebuildQueue() -> reset()
- refresh() -> refreshSession()
- Views and many of the indices on the cards table are now obsolete and will
be removed in the future. I won't remove them straight away, so as to not
break backward compatibility.
- Use bigger intervals between successive card templates, as the previous
intervals were too small to represent in doubles in some circumstances
Still to do:
- review early
- learn more
- failing mature cards where delay1 > delay0
If the user is not careful to only sync when one side has been modified, they
can end up with cards on one side and not the other. If they then delete a
card, deleting the dangling facts also deletes the fact associated with the
not-yet-synced card. In order to avoid this, we avoid deleting dangling facts
until a DB check.
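
The deferred cleanup could be sketched as a single statement run from the DB
check. The table and column names (facts.id, cards.factId) and the function
name are assumptions for illustration.

```python
import sqlite3

def delete_dangling_facts(db):
    """Remove facts that no card refers to; meant to run only from a DB check."""
    cur = db.execute(
        "delete from facts where id not in (select factId from cards)")
    return cur.rowcount
```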