As per the forum thread, the current due counts are really demotivating when
there's a backlog of cards. In an attempt to solve this, I'm trying out a new
behaviour as the default: instead of reporting all the due cards including the
backlog, the status bar will show an increasing count of cards studied that
day. Theoretically this should allow users to focus on what they've done
rather than what they have to do. The old behaviour is still there as an option.
We did away with the stats table because it's impossible to merge it, so the
revlog is canonical now. But we also want a cheap way to display to the user
how much time or how many cards they've done over the day, even if their study
is split into multiple sessions. We were already storing the new cards of a
day in the top level groups, so we just expand that out to log the other info
too.
In the event of a user studying in two places on the same day without syncing,
the counts will not be accurate as they can't be merged without consulting the
revlog, which we want to avoid for performance reasons. But the graphs and
stats do not use the groups for reporting, so the inaccurate counts are only
temporary. Might need to mention this in an FAQ.
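
A minimal sketch of the per-day counters described above, assuming each
top-level group is a plain dict. The "today" key, its field names and
update_daily_stats() are hypothetical, chosen for illustration only. The
counters reset when the day number changes, so they survive multiple study
sessions on the same day without consulting the revlog.

```python
def update_daily_stats(group, today, seconds, is_new):
    """Bump the per-day counters stored on a top-level group dict.

    Hypothetical layout: group["today"] holds the day number plus the
    running totals for that day.
    """
    stats = group.get("today")
    if stats is None or stats["day"] != today:
        # first answer of a new day: start again from zero
        stats = {"day": today, "new": 0, "reps": 0, "time": 0.0}
        group["today"] = stats
    stats["reps"] += 1
    stats["time"] += seconds
    if is_new:
        stats["new"] += 1
    return stats


group = {"name": "Japanese"}
update_daily_stats(group, today=100, seconds=4.5, is_new=True)
update_daily_stats(group, today=100, seconds=3.0, is_new=False)
```

Two copies of the group updated on different devices would each hold their
own totals, which is why the counts can drift until the next sync.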
Also, since groups are cheap to fetch now, cards now automatically limit
timeTaken() to the group limit, instead of relying on the calling code to do
so.
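
As a rough illustration of the clamping, assuming the card holds a reference
to its group config and the limit lives under a "maxTaken" key (both the key
name and this Card class are assumptions for the sketch, not the real code):

```python
import time


class Card:
    def __init__(self, group):
        self.group = group          # group config dict (assumed shape)
        self.timer_started = None

    def start_timer(self, now=None):
        self.timer_started = time.time() if now is None else now

    def time_taken(self, now=None):
        """Seconds spent on the card, capped at the group's limit."""
        now = time.time() if now is None else now
        elapsed = now - self.timer_started
        return min(elapsed, self.group["maxTaken"])


card = Card({"maxTaken": 60})
card.start_timer(now=0)
```

Callers no longer need to apply the cap themselves, since time_taken() never
returns more than the group limit.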
Instead of collecting the exact number of cards, we just record whether a
group has any reviews or new cards. By not needing to calculate the exact
numbers, it runs a lot faster than before.
Also, changed the group code to ensure parents are automatically created when
a group is added.
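
A sketch of the parent creation, assuming groups are kept in a dict keyed by
full name and that nesting is expressed with a "::" separator. Both are
assumptions, and ensure_group() is a hypothetical helper:

```python
def ensure_group(groups, name, sep="::"):
    """Add a group, creating any missing ancestors along the way."""
    parts = name.split(sep)
    for i in range(1, len(parts) + 1):
        path = sep.join(parts[:i])
        if path not in groups:
            groups[path] = {"name": path}
    return groups[name]


groups = {}
ensure_group(groups, "Languages::Japanese::Kanji")
```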
As discussed on the forums, moving to a single collection requires moving some
deck-level configuration into groups so users can have different settings like
new cards/day for each top level item.
Also:
- store id in groups
- add mod time to gconf updates
- move the limiting code that's not specific to scheduling into groups.py
- store the current model id per top level group
- moved tags into json like previous changes, and dropped the unnecessary id
- added tags.py for a tag manager
- moved the tag utilities from utils into tags.py
Rather than use a combination of id lookups on the groups table and a group
configuration cache in the scheduler, I've moved the groups and group config
into json objects on the deck table. This results in a net saving of code and
saves one or more DB lookups on each card answer, in exchange for a small
increase in deck load/save work.
I did a quick survey of AnkiWeb, and the vast majority of decks use fewer than
100 tags, and it's safe to assume groups will follow a similar pattern.
All groups and group configs except the default one will use integer
timestamps now, to simplify merging when syncing and importing.
defaultGroup() has been removed in favour of keeping the models up to date
(not yet done).
- cards in final review are first reset as rev cards so that type==queue and
they can be restored correctly
- new cards in learning have type set to 1 so they too can be restored
correctly
instead of completely resetting a card like we did in resetCards() in the
past, forgetCards() just puts the card back in the new queue and leaves the
factor and revlog alone. If users want to completely reset a card, they'll need to
export it.
- use negative numbers to denote second intervals
- record the rev ivl when leaving lrn queue
- improve revlog upgrade
- don't truncate precision when recording time taken
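
The first item in the list above can be sketched as follows, assuming that
positive values are whole days and negative values are seconds.
ivl_to_seconds() is a hypothetical helper, not a function from the codebase:

```python
DAY = 86400  # seconds in a day


def ivl_to_seconds(ivl):
    """Decode an interval: negative means seconds, positive means days."""
    if ivl < 0:
        return -ivl
    return ivl * DAY


ten_minutes = ivl_to_seconds(-600)
three_days = ivl_to_seconds(3)
```

Storing both kinds of interval in one integer column avoids a separate unit
flag, at the cost of every reader having to check the sign.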
reps should now be equal to the number of entries in the revlog, and only
exists so that we can order by review count in the browser efficiently
streak is no longer necessary as we have a learn queue now
originally the plan was to get the user to "forget learning cards" or "remove
final drill" when switching between categories, but that's cumbersome and not
intuitive
The undo code was using triggers and a temporary table to write out all changed rows before making a change. This made for powerful undo/redo support, but had some problems:
- creating the tables and triggers wasn't cheap, especially on mobile devices
- likewise, every data modification required writing into two separate databases, almost doubling the amount of writes required
- it was possible to leave the DB in an inconsistent state if an undoable operation was followed by a non-undoable operation that referenced the undoable operation, and the user then rolled back the undoable operation.
To address these issues, we simplify undo by integrating it with the autosave changes:
- .save() can be passed a name to mark a rollback point. If the user undoes the change, any changes since the last save are lost
- autosaves happen every 5 minutes, and are pushed back on a .save(), so the maximum work a user can lose is 5 minutes.
- reviews are handled separately, so we can let the user undo multiple reviews at once
- if necessary, special cases could be added for other operations like marking
This means that if a user does two damaging operations in a row they won't be able to restore the first one, but such an event is unlikely, and is also covered by the backups made each time a deck is opened.
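
A minimal sketch of the rollback-point idea using the standard sqlite3
module. The Deck class, its attribute names and the table layout are
assumptions for illustration; autosave timing and the separate review undo
are left out:

```python
import sqlite3


class Deck:
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "create table cards (id integer primary key, due integer)")
        self.db.commit()
        self.undo_name = None

    def save(self, name=None):
        """Commit pending changes; a name marks a rollback point."""
        self.db.commit()
        self.undo_name = name

    def undo(self):
        """Discard everything written since the last save()."""
        if self.undo_name is None:
            return None
        self.db.rollback()
        name, self.undo_name = self.undo_name, None
        return name


deck = Deck()
deck.db.execute("insert into cards values (1, 10)")
deck.save("Reschedule")     # rollback point before the risky operation
deck.db.execute("update cards set due = 99 where id = 1")
undone = deck.undo()
due = deck.db.execute("select due from cards where id = 1").fetchone()[0]
```

Because only one rollback point is kept, a second named save() replaces the
first, which matches the limitation described above.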
Previously cloze deletions were handled by copying the contents of one field
into another and applying transforms to it. This had a number of problems:
- after you add a card, you can't undo the cloze deletion
- if you spot a mistake, you have to edit it twice (or more if you have more
than one cloze for a sentence)
- making multiple clozes requires copying & pasting the sentence multiple
times
- this also led to much bigger decks if the sentences being cloze-deleted are
large
- related clozes can't be spaced apart as siblings
To address these issues, we introduce the idea of cloze tags in the card
template and fields. If the template has the text:
{{cloze:1:field}}
And a field has the following contents:
{{c1::hello}}
Then the template will automatically replace that part of the text with either
occluded text, or a highlighted answer. All other clozes in the field are
displayed normally.
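
A rough sketch of the replacement step, assuming the field text has already
been looked up and the template asked for cloze number 1. The "[...]"
placeholder and the bold highlight are stand-ins for whatever markup the real
renderer emits, and render_cloze() is a hypothetical name:

```python
import re

CLOZE_RE = re.compile(r"\{\{c(\d+)::(.*?)\}\}", re.DOTALL)


def render_cloze(text, ord, question):
    """Occlude or highlight cloze number `ord`; show the others as plain text."""
    def repl(match):
        number, answer = int(match.group(1)), match.group(2)
        if number != ord:
            return answer               # other clozes display normally
        if question:
            return "[...]"              # assumed occlusion marker
        return "<b>%s</b>" % answer     # assumed highlight markup
    return CLOZE_RE.sub(repl, text)


field = "{{c1::hello}} {{c2::world}}"
front = render_cloze(field, 1, question=True)
back = render_cloze(field, 1, question=False)
```

Since the cloze markers stay in the field itself, fixing a typo means editing
one field, and each cloze number can become its own sibling card.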
At the same time, we add support for text: into the template library, instead
of manually creating text: fields in the dict for every field.
Finally, add a forecast routine to get the due counts for the next week, which
is used in the GUI.
The 'entry due' is the due time of a failed card before it enters the learning
queue. When the card graduates or is removed, it has its old due time
restored. We could pull this from the revlog, but it's cheaper to do it this
way.
A lot of the old checks in fixIntegrity() are no longer relevant, and some of
the others may no longer be required. They can be added back in as the need
arises.
- remove revlog.py and move code into scheduler
- add a routine to log a learn repetition
- rename flags to type and set type=0 for learn mode
- add to unit test
This means that the default learn queue sort order doesn't need another column
in the index, but it also means that generated cards will have a higher id,
and would appear later even if they have a lower ordinal. This is probably an
infrequent issue, and a plugin which rewrites ids would probably be an
adequate solution.
Anki used random 64bit IDs for cards, facts and fields. This had some nice
properties:
- merging data in syncs and imports was simply a matter of copying each way,
as conflicts were astronomically unlikely
- it made it easy to identify identical cards and prevent them from being
reimported
But there were some negatives too:
- they're more expensive to store
- javascript can't handle numbers > 2**53, which means AnkiMobile, iAnki and
so on have to treat the ids as strings, which is slow
- simply copying data in a sync or import can lead to corruption, as while a
duplicate id indicates the data was originally the same, it may have
diverged. A more intelligent approach is necessary.
- sqlite was sorting the fields table based on the id, which meant the fields
were spread across the table, and costly to fetch
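
The 2**53 limit in the list above comes from JavaScript numbers being IEEE
754 doubles. Python's float is the same type, so the loss of precision can
be shown without a JavaScript engine:

```python
LIMIT = 2 ** 53

big_id = LIMIT + 1              # a value a random 64-bit id could easily exceed
as_double = float(big_id)       # what a JavaScript number would store

# The round trip through a double silently changes the id...
lossy = int(as_double) != big_id
# ...while anything below the limit survives intact.
safe = int(float(LIMIT - 1)) == LIMIT - 1
```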
So instead, we'll move to incremental ids. In the case of model changes we'll
declare that a schema change and force a full sync to avoid having to deal
with conflicts, and in the case of cards and facts, we'll need to update the
ids on one end to merge. Identical cards can be detected by checking to see if
their id is the same and their creation time is the same.
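
One way the merge could look, as a sketch only: cards are assumed to be dicts
with "id" and "created" keys, and merge_cards() is a hypothetical helper. A
matching id and creation time means the same card; a matching id with a
different creation time is a conflict, so the incoming card is given a fresh
id.

```python
def merge_cards(local, remote):
    """Merge remote cards into local ones, remapping conflicting ids.

    Both arguments map id -> card dict. Returns the merged mapping and
    a dict of old remote id -> newly assigned id.
    """
    merged = dict(local)
    next_id = max(list(local) + list(remote), default=0) + 1
    remap = {}
    for cid, card in sorted(remote.items()):
        mine = local.get(cid)
        if mine is None:
            merged[cid] = card                  # id unused locally
        elif mine["created"] == card["created"]:
            continue                            # identical card, keep ours
        else:
            merged[next_id] = dict(card, id=next_id)
            remap[cid] = next_id
            next_id += 1
    return merged, remap


local = {1: {"id": 1, "created": 100}, 2: {"id": 2, "created": 200}}
remote = {
    1: {"id": 1, "created": 100},
    2: {"id": 2, "created": 250},
    3: {"id": 3, "created": 300},
}
merged, remap = merge_cards(local, remote)
```

Anything that refers to a remapped card by id would also need to be updated
using the returned mapping.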
Creation time has been added back to cards and facts because it's necessary
for sync conflict merging. That means facts.pos is not required.
The graves table has been removed. It's not necessary for schema related
changes, and dead cards/facts can be represented as a card with queue=-4 and
created=0. Because we will record schema modification time and can ensure a
full sync propagates to all endpoints, it means we can remove the dead
cards/facts on schema change.
Tags have been removed from the facts table and are represented as a field
with ord=-1 and fmid=0. Combined with the locality improvement for fields, it
means that fetching fields is not much more expensive than using the q/a
cache.
Because of the above, removing the q/a cache is a possibility now. The q and a
columns on cards have been dropped. It will still be necessary to render the
q/a on fact add/edit, since we need to record media references. It would be
nice to avoid this in the future. Perhaps one way would be the ability to
assign a type to fields, like "image", "audio", or "latex". LaTeX needs
special consideration anyway, as it was being rendered into the q/a cache.