258 commits

Author SHA1 Message Date
Damien Elmes
024c42fef8 group scheduling refactor
see the following for background discussion:
http://groups.google.com/group/ankisrs-users/browse_thread/thread/4db5e82f7dff74fb

- change sched index to the more efficient gid, queue, due
- drop the dynamic index support. as there's no q/a cache anymore, it's
  cheap enough to hit the cards table directly, and we can't use the index in
  its new form.
- drop order by clauses (see todo)
- ensure there's always an active group. if users want to study all groups at
  once, they need to create a top level group. we do this because otherwise
  the 'top level group' that's active when everything is selected is not
  clear.

to do:

- new cards will appear in gid order, but the gid numbers don't reflect
  alphabetical sorting. we need to change the scheduling code so that it steps
  through each group in turn
- likewise for the learn queue
2011-09-22 11:54:01 +09:00
Damien Elmes
dac46752ed drop the count_answered option 2011-09-18 09:42:04 +09:00
Damien Elmes
f3965f4c09 when suspending leeches, make sure we don't put the card back in the queue 2011-09-17 21:42:41 +09:00
Damien Elmes
64d13c2cbc add a quick unit test to make sure groupCounts() works with changes 2011-09-15 01:42:48 +09:00
Damien Elmes
ee767ff132 refactor to allow group deletions without schema mod
because group deletions are likely to be a semi-common operation (esp. for new users trying out shared material), deleting groups will no longer cause a full sync. in order to avoid syncing issues, we now allow cards/facts/etc to point to an invalid group, and in that case, we just treat them like they're in the default group
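
A minimal sketch of the fallback described above, not the actual libanki code. The `groups` dict, the `group_for()` helper and the default group id of 1 are assumptions made for illustration.

```python
DEFAULT_GID = 1  # assumed id of the default group

# Hypothetical in-memory group registry: id -> name.
groups = {DEFAULT_GID: "Default", 1316018250000: "Japanese"}

def group_for(card_gid):
    """Return the group name for a card, treating unknown ids as the default."""
    return groups.get(card_gid, groups[DEFAULT_GID])

# Deleting a group leaves its cards untouched, so no schema mod is needed.
del groups[1316018250000]
orphan = group_for(1316018250000)  # the card still points at the deleted id
```

Because the lookup falls back rather than failing, cards that arrive via sync with a stale group id remain usable.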
2011-09-15 01:37:30 +09:00
Damien Elmes
fa1b223363 use ms resolution for deck mod 2011-09-14 05:09:42 +09:00
Damien Elmes
bc9f6e6a24 add USNs
Decks now have an "update sequence number". All objects also have a USN, which
is set to the deck USN each time they are modified. When syncing, each side
sends any objects with a USN >= clientUSN. When objects are copied via sync,
they have their USNs bumped to the current serverUSN. After a sync, the USN on
both sides is set to serverUSN + 1.
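
The rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real sync code: the `Side` class, the `sync()` function and the tuple layout are hypothetical, and "newer modification time wins" is assumed as the conflict rule.

```python
from dataclasses import dataclass, field

@dataclass
class Side:
    """One end of a sync (a client or the server); hypothetical structure."""
    usn: int = 0
    objects: dict = field(default_factory=dict)  # id -> (usn, mod, data)

    def modify(self, oid, mod, data):
        # A modified object takes the deck's current USN.
        self.objects[oid] = (self.usn, mod, data)

    def changed_since(self, usn):
        return {oid: o for oid, o in self.objects.items() if o[0] >= usn}

def sync(client, server):
    client_usn, server_usn = client.usn, server.usn
    # Each side sends every object with a USN >= clientUSN.
    outgoing = client.changed_since(client_usn)
    incoming = server.changed_since(client_usn)
    for src, dst in ((outgoing, server), (incoming, client)):
        for oid, (_, mod, data) in src.items():
            cur = dst.objects.get(oid)
            if cur is None or mod > cur[1]:  # assumed rule: newer mod wins
                # Copied objects are bumped to the current serverUSN.
                dst.objects[oid] = (server_usn, mod, data)
    # After a sync, both sides move to serverUSN + 1.
    client.usn = server.usn = server_usn + 1

# Three-way case: B's clock is behind A's, yet A still receives B's change.
server, a, b = Side(), Side(), Side()
a.modify(1, mod=100, data="from A")
sync(a, server)
b.modify(2, mod=50, data="from B")
sync(b, server)
sync(a, server)
```

In this run the decision about what to send depends only on USNs, so the skewed modification time on B does not cause its change to be skipped.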

This solves the failing three way test, ensures we receive all changes
regardless of clock drift, and as the revlog also has a USN now, ensures that
old revlog entries are imported properly too.

Objects retain a separate modification time, which is used for conflict
resolution, deck subscriptions/importing, and info for the user.

Note that if the clock is too far off, it will still cause confusion for
users, as the due counts may be different depending on the time. For this
reason it's probably a good idea to keep a limit on how far the clock can
deviate.

We still keep track of the last sync time, but only so we can determine if the
schema has changed since the last sync.

The media code needs to be updated to use USNs too.
2011-09-13 21:10:21 +09:00
Damien Elmes
b391202e47 add failing three-way test
While thinking about media syncing I realized the current sync algorithm is
flawed in certain cases. It might be time to think about using a USN instead,
as that should also hopefully solve the skewed clock problem properly.
2011-09-13 05:12:30 +09:00
Damien Elmes
87bfb38e2b move db
- if we store it inside the media folder, we inadvertently bump the folder mod
  time every time sqlite creates a journal file

- close/reopen the media db as the deck is closed/opened
2011-09-12 05:03:31 +09:00
Damien Elmes
c59dd854fb add change detection
I removed the media database in an earlier commit, but it's now necessary
again as I decided to add native media syncing to AnkiWeb.

This time, the DB is stored in the media folder rather than with the deck.
This means we avoid sending it in a full sync, and makes deck backups faster.
The DB is a cache of file modtimes and checksums. When findChanges() is
called, the code checks to see which files were added, changed or deleted
since the last time, and updates the log of changes. Because the scanning step
and log retrieval are separate, it's possible to do the scanning in the
background if the need arises.

If the DB is deleted by the user, Anki will forget any deletions, and add all
the files back to the DB the next time it's accessed.

File changes are recorded as a delete + add.
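
A rough sketch of the scanning step, assuming the cache is a plain dict rather than a database. The `find_changes()` signature, the cache layout and the use of SHA-1 are assumptions for illustration, not the actual media.py code.

```python
import hashlib
import os
import tempfile

def checksum(path):
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()

def find_changes(folder, cache, log):
    """Update cache (name -> (mtime, checksum)) and append (op, name) to log."""
    seen = set()
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        seen.add(name)
        mtime = os.stat(path).st_mtime_ns
        old = cache.get(name)
        if old and old[0] == mtime:
            continue  # unchanged modtime: skip the checksum
        csum = checksum(path)
        if old is None:
            log.append(("add", name))
        elif old[1] != csum:
            # A changed file is recorded as a delete followed by an add.
            log.extend([("del", name), ("add", name)])
        cache[name] = (mtime, csum)
    for name in sorted(set(cache) - seen):
        log.append(("del", name))
        del cache[name]

with tempfile.TemporaryDirectory() as folder:
    cache, log = {}, []
    image = os.path.join(folder, "a.jpg")
    sound = os.path.join(folder, "b.mp3")
    for path, data in ((image, b"one"), (sound, b"two")):
        with open(path, "wb") as f:
            f.write(data)
    find_changes(folder, cache, log)           # both files are new
    with open(image, "wb") as f:
        f.write(b"changed")
    os.utime(image, ns=(10**9, 10**9))         # force a different modtime
    os.remove(sound)
    find_changes(folder, cache, log)           # one change, one deletion
```

Starting from an empty cache reports every file as an add and no deletions, which matches the behaviour described for a deleted DB.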

media.addFile() could be optimized in the future to log media added manually
by the user, allowing us to skip the full directory scan in cases where the
only changes were manually added media.
2011-09-12 03:11:06 +09:00
Damien Elmes
7e1df75cc2 simplify media.py
- drop mediaPrefix & the mediaURL-based downloading
- always create the media folder
- remove move() in preparation for a single collection approach
2011-09-11 00:25:22 +09:00
Damien Elmes
9aad5c1166 more unit tests, fix bugs
- make sure gconf has an id
- merge deck conf
2011-09-09 22:34:50 +09:00
Damien Elmes
6cfe112f91 card tests 2011-09-09 21:24:52 +09:00
Damien Elmes
f15cb23c41 skip the 600 second pad during testing 2011-09-09 21:17:42 +09:00
Damien Elmes
85a2bb6193 revlog timestamp is ms based; should fetch facts/cards by mod not id 2011-09-09 21:11:11 +09:00
Damien Elmes
362ae3eee2 initial work on sync refactor
Ported the sync code to the latest libanki structure. Key points:

No summary:

The old style got each side to fetch ids+mod times and required the client to
diff them and then request or bundle up the appropriate objects. Instead, we now
get each side to send all changed objects, and it's the responsibility of the
other side to decide what needs to be merged and what needs to be discarded.
This allows us to skip a separate summary step, which saves scanning tables
twice, and allows us to reduce server requests from 4 to 3.

Schema changes:

Certain operations that are difficult to merge (such as changing the number of
fields in a model, or deleting models or groups) result in a full sync. The
user is warned about it in the GUI before such schema-changing operations
execute.

Sync size:

For now, we don't try to deal with large incremental syncs. Because the cards,
facts and revlog can be large in memory (hundreds of megabytes in some cases),
they would have to be chunked for the benefit of devices with a low amount of
memory.

Currently findChanges() uses the full fact/card objects which we're planning to
send to the server. It could be rewritten to fetch a summary (just the id, mod
& rep columns) which would save some memory, and then compare against blocks
of a few hundred remote objects at a time. However, it's a bit more
complicated than that:

- If the local summary is huge it could exceed memory limits. Without a local
  summary we'd have to query the db for each record, which could be a lot
  slower.

- We currently accumulate a list of remote records we need to add locally.
  This list also has the potential to get too big. We would need to
  periodically commit the changes as we accumulate them.

- Merging a large number of changes is also potentially slow on mobile
  devices.

Given the fact that certain schema-changing operations require a full sync
anyway, I think it's probably best to concentrate on a chunked full sync for
now instead, as provided the user syncs periodically it should not be easy to
hit the full sync limits except after bulk editing operations.

Chunked partial syncing should be possible to add in the future without any
changes to the deck format.

Still to do:
- deck conf merging
- full syncing
- new http proxy
2011-09-08 12:50:42 +09:00
Damien Elmes
7034c1ed29 drop syncName, fix leech unit test 2011-09-07 20:11:26 +09:00
Damien Elmes
751cb7df67 add a new default for counts()
As per the forum thread, the current due counts are really demotivating when
there's a backlog of cards. In an attempt to solve this, I'm trying out a new
behaviour as the default: instead of reporting all the due cards including the
backlog, the status bar will show an increasing count of cards studied that
day. Theoretically this should allow users to focus on what they've done
rather than what they have to do. The old behaviour is still there as an option.
2011-09-07 19:11:37 +09:00
Damien Elmes
28d045feef rewrite groupCounts()
Instead of collecting the exact number of cards, we just record whether a
group has any reviews or new cards. By not needing to calculate the exact
numbers, it runs a lot faster than before.
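
One way to express "has any" instead of "how many" is an existence query that stops at the first matching row. The table layout, the queue numbers and the function names below are assumptions for illustration only.

```python
import sqlite3

NEW, REVIEW = 0, 2  # hypothetical queue numbers

db = sqlite3.connect(":memory:")
db.execute(
    "create table cards (id integer primary key, gid integer, queue integer)"
)
db.executemany(
    "insert into cards (gid, queue) values (?, ?)",
    [(1, NEW), (1, NEW), (2, REVIEW)],
)

def has_cards(gid, queue):
    """True if the group has at least one card in the queue; stops at the first hit."""
    row = db.execute(
        "select 1 from cards where gid = ? and queue = ? limit 1", (gid, queue)
    ).fetchone()
    return row is not None

def group_flags(gids):
    """Map each group id to (has reviews, has new cards)."""
    return {gid: (has_cards(gid, REVIEW), has_cards(gid, NEW)) for gid in gids}

flags = group_flags([1, 2, 3])
```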

Also, changed the group code to ensure parents are automatically created when
a group is added.
2011-09-07 03:02:07 +09:00
Damien Elmes
de8a5b69ed top level groups
As discussed on the forums, moving to a single collection requires moving some
deck-level configuration into groups so users can have different settings like
new cards/day for each top level item.

Also:
- store id in groups
- add mod time to gconf updates
- move the limiting code that's not specific to scheduling into groups.py
- store the current model id per top level group
2011-09-07 01:31:46 +09:00
Damien Elmes
9130e09b3e rename some columns for consistency
- revlog's 'time' is now 'id', like the other tables
- 'taken' is now 'time'
- also dropped the eta code
2011-09-06 21:33:19 +09:00
Damien Elmes
6a00419ebc merge deck.qconf and deck.conf 2011-08-28 14:17:33 +09:00
Damien Elmes
a9b4285959 rename a few methods for consistency 2011-08-28 13:48:17 +09:00
Damien Elmes
be5c5a2018 move tags into deck; code into separate file
- moved tags into json like previous changes, and dropped the unnecessary id
- added tags.py for a tag manager
- moved the tag utilities from utils into tags.py
2011-08-28 13:44:29 +09:00
Damien Elmes
78600e8ed6 move group code into a registry like models 2011-08-27 23:45:55 +09:00
Damien Elmes
d3a3edb707 move models into the deck table
Like the previous change, models have been moved from a separate DB table to
an entry in the deck. We need them for many operations including reviewing,
and it's easier to keep them in memory than half on disk with a cache that
gets cleared every time we .reset(). This means they are easily serialized as
well - previously they were part Python and part JSON, which made access
confusing.

Because the data is all pulled from JSON now, the instance methods have been
moved to the model registry. Eg:
  model.addField(...) -> deck.models.addField(model, ...).
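
A minimal sketch of the registry pattern, assuming models are plain dicts loaded from a JSON string. The `ModelRegistry` class and the `flds`, `name` and `ord` keys are hypothetical and only illustrate the shape of the change.

```python
import json

class ModelRegistry:
    """Holds models as plain dicts; operations live here, not on the model."""

    def __init__(self, blob):
        self.models = json.loads(blob)  # keys come back as strings

    def add_field(self, model, name):
        model["flds"].append({"name": name, "ord": len(model["flds"])})

    def flush(self):
        return json.dumps(self.models)

registry = ModelRegistry('{"1314451629000": {"name": "Basic", "flds": []}}')
model = registry.models["1314451629000"]
registry.add_field(model, "Front")          # was: model.addField("Front")
restored = json.loads(registry.flush())
```

Since the model is only data, serializing it is a single `json.dumps()` call, and the ids appear as string keys after a round trip.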

- IDs are now timestamped as with groups et al.

- The data field for plugins was also removed. Config info can be added to
  deck.conf; larger data should be stored externally.

- Upgrading needs to be updated for the new model structure.

- HexifyID() now accepts strings as well, as our IDs get converted to strings
  in the serialization process.
2011-08-27 22:27:09 +09:00
Damien Elmes
7afe6a9a7d convert groups to json; use timestamp ids for all but default
Rather than use a combination of id lookups on the groups table and a group
configuration cache in the scheduler, I've moved the groups and group config
into json objects on the deck table. This results in a net saving of code and
saves one or more DB lookups on each card answer, in exchange for a small
increase in deck load/save work.

I did a quick survey of AnkiWeb, and the vast majority of decks use fewer than
100 tags, and it's safe to assume groups will follow a similar pattern.

All groups and group configs except the default one will use integer
timestamps now, to simplify merging when syncing and importing.

defaultGroup() has been removed in favour of keeping the models up to date
(not yet done).
2011-08-27 17:13:04 +09:00
Damien Elmes
47be8b0546 use the timestamps instead of forcing id on fact/card creation
- we ditch nextCid/nextFid as we don't need incrementing ids anymore
- we add nextPos so we can maintain a user-friendly position number
2011-08-27 00:36:39 +09:00
Damien Elmes
f7b89c9fa1 ensure unique id on per-object add, too 2011-08-26 22:51:08 +09:00
Damien Elmes
ebac628187 ensure duplicate model creation times are accounted for 2011-08-26 22:33:24 +09:00
Damien Elmes
5868ff52b9 handle duplicate fact creation times 2011-08-26 22:28:35 +09:00
Damien Elmes
644a885a07 update fact ids, graves
- should never skip recording graves, for the sake of merging
- 1.0 upgrade will fail on decks that have the same fact creation date. need
  to work around this in the future
2011-08-26 21:23:16 +09:00
Damien Elmes
6644c04852 start work on id refactor - models first
The approach of using incrementing id numbers works for syncing if we assume
the server is canonical and all other clients rewrite their ids as necessary,
but upon reflection it is not sufficient for merging decks in general, as we
have no way of knowing whether objects with the same id are actually the same
or not. So we need some way of uniquely identifying the object.

One approach would be to go back to Anki 1.0's random 64bit numbers, but as
outlined in a previous commit such large numbers can't be handled easily in some
languages like Javascript, and they tend to be fragmented on disk which
impacts performance. It's much better if we can keep content added at the same
time in the same place on disk, so that operations like syncing which are mainly
interested in newly added content can run faster.

Another approach is to add a separate column containing the unique id, which
is what Mnemosyne 2.0 will be doing. Unfortunately it means adding an index
for that column, leading to slower inserts and larger deck files. And if the
current sequential ids are kept, a bunch of code needs to be kept to ensure ids
don't conflict when merging.

To address the above, the plan is to use a millisecond timestamp as the id.
This ensures disk order reflects creation order, allows us to merge the id and
crt columns, avoids the need for a separate index, and saves us from worrying
about rewriting ids. There is of course a small chance that the objects to be
merged were created at exactly the same time, but this is extremely unlikely.
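
A minimal sketch of generating such an id, assuming a collision within one table is resolved by stepping forward a millisecond at a time. The `timestamp_id()` helper, its `now_ms` parameter and the `facts` table layout are illustrative assumptions.

```python
import sqlite3
import time

def timestamp_id(db, table, now_ms=None):
    """Return a millisecond timestamp not yet used as an id in the table."""
    t = int(time.time() * 1000) if now_ms is None else now_ms
    # Step forward one millisecond at a time until the id is free.
    while db.execute(f"select 1 from {table} where id = ?", (t,)).fetchone():
        t += 1
    return t

db = sqlite3.connect(":memory:")
db.execute("create table facts (id integer primary key, data text)")

ids = []
for data in ("first", "second", "third"):
    # A fixed clock value simulates three objects created in the same ms.
    new_id = timestamp_id(db, "facts", now_ms=1314360510000)
    db.execute("insert into facts values (?, ?)", (new_id, data))
    ids.append(new_id)
```

The id doubles as the creation time and as the integer primary key, so no separate index is required and rows are stored in creation order.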

This commit changes models. Other objects will follow.
2011-08-26 21:08:30 +09:00
Damien Elmes
91efb8f30b some initial sync work 2011-05-29 08:13:54 +09:00
Damien Elmes
0af03a6a8a only use deletion log when necessary 2011-05-07 21:45:55 +09:00
Damien Elmes
3d370f675b restore the deletion log
the initial plan was to zero the creation time and leave the cards/facts there
until we have a chance to garbage collect them on a schema change, but such an
approach won't work with deck subscriptions
2011-05-04 19:00:38 +09:00
Damien Elmes
a65f241258 gracefully handle invalid queries 2011-04-29 11:46:26 +09:00
Damien Elmes
ddcf83bc7f add [] to cloze 2011-04-29 11:43:06 +09:00
Damien Elmes
0bbdd722c2 very basic eta 2011-04-28 09:24:05 +09:00
Damien Elmes
19fd581839 change default lrn timing; leechAction doesn't need an array 2011-04-28 09:24:05 +09:00
Damien Elmes
754dcef4f7 handle the case where learning cards have a grade higher than their config 2011-04-28 09:24:05 +09:00
Damien Elmes
0d6064b933 rename() 2011-04-28 09:24:05 +09:00
Damien Elmes
47c30f172f fix issues suspending learning cards
- cards in final review are first reset as rev cards so that type==queue and
  they can be restored correctly
- new cards in learning have type set to 1 so they too can be restored
  correctly
2011-04-28 09:24:05 +09:00
Damien Elmes
434b5442fd fix unit tests 2011-04-28 09:24:05 +09:00
Damien Elmes
88c4f010d3 update rescheduling
like forgetCards(), we no longer adjust the stats or revlog for the card
2011-04-28 09:24:05 +09:00
Damien Elmes
8a9174fd4d we don't need rep in the revlog 2011-04-28 09:24:05 +09:00
Damien Elmes
7acdbfa9ae support for shifting 2011-04-28 09:24:04 +09:00
Damien Elmes
b69031cd4f generalize order/random 2011-04-28 09:24:04 +09:00
Damien Elmes
974324e3dd add unit test for order 2011-04-28 09:24:04 +09:00
Damien Elmes
82c3119c90 update randomizing/ordering code and forgetCards()
instead of completely resetting a card like we did in resetCards() in the
past, forgetCards() just puts the card back in the new queue and leaves the
factor and revlog alone. If users want to completely reset a card, they'll need to
export it.
2011-04-28 09:24:04 +09:00
Damien Elmes
0df95eafc0 revlog updates
- use negative numbers to denote second intervals
- record the rev ivl when leaving lrn queue
- improve revlog upgrade
- don't truncate precision when recording time taken
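
As an illustration of the sign convention in the first point, a reader of the revlog could decode intervals like this. The helper name is hypothetical, and treating positive values as days is an assumption.

```python
DAY = 86400  # seconds per day

def ivl_to_seconds(ivl):
    """Decode a revlog interval: negative means seconds, positive is assumed to be days."""
    return -ivl if ivl < 0 else ivl * DAY

learning_step = ivl_to_seconds(-600)  # a 10 minute learning step
review_ivl = ivl_to_seconds(3)        # a 3 day review interval
```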
2011-04-28 09:24:04 +09:00
Damien Elmes
7d64036a07 drop streak, make reps log all entries
reps should now be equal to the number of entries in the revlog, and only
exists so that we can order by review count in the browser efficiently

streak is no longer necessary as we have a learn queue now
2011-04-28 09:24:04 +09:00
Damien Elmes
1c2b403348 apply selective groups to learning queue too
originally the plan was to get the user to "forget learning cards" or "remove
final drill" when switching between categories, but that's cumbersome and not
intuitive
2011-04-28 09:24:04 +09:00
Damien Elmes
2243b691cc add full search to ignore formatting 2011-04-28 09:24:03 +09:00
Damien Elmes
0fdb966766 wrap clozes in a classed span instead of bold 2011-04-28 09:24:03 +09:00
Damien Elmes
03d00228aa support stripping context of cloze 2011-04-28 09:24:03 +09:00
Damien Elmes
e8d6714130 make sure the failed card count reflects the cutoff 2011-04-28 09:24:03 +09:00
Damien Elmes
d51cd5a433 find & replace 2011-04-28 09:24:03 +09:00
Damien Elmes
98a63285e1 port changeModel() 2011-04-28 09:24:02 +09:00
Damien Elmes
d7b86da811 canonify tags after bulk update 2011-04-28 09:24:02 +09:00
Damien Elmes
93d80678f9 _fields -> fields 2011-04-28 09:24:02 +09:00
Damien Elmes
d611299715 make sure add/delTags() is limited to provided ids 2011-04-28 09:24:02 +09:00
Damien Elmes
8c1c729544 fix card state negation 2011-04-28 09:24:02 +09:00
Damien Elmes
13a484ea36 allow user to abort schema mod 2011-04-28 09:24:02 +09:00
Damien Elmes
2a225b1fae use the sort property set in the deck 2011-04-28 09:24:02 +09:00
Damien Elmes
d089deae5a add the ability to reverse sort order 2011-04-28 09:24:02 +09:00
Damien Elmes
d6116a5377 model and group searching 2011-04-28 09:24:02 +09:00
Damien Elmes
efe6177c7a refactor ordering 2011-04-28 09:24:02 +09:00
Damien Elmes
291bd399b7 field searching
dropped support for field:foo, as you can type 'foo:' instead to accomplish
the same thing
2011-04-28 09:24:01 +09:00
Damien Elmes
57938927e7 users can pass a number for template ordinal; makes show:one obsolete 2011-04-28 09:24:01 +09:00
Damien Elmes
94d4e319ae fids and template searches 2011-04-28 09:24:01 +09:00
Damien Elmes
de81f0238a template moving 2011-04-28 09:24:01 +09:00
Damien Elmes
84d2f32685 move graph code into stats.py; remove old deck stats 2011-04-28 09:24:00 +09:00
Damien Elmes
4be8b9d38c fix zerodiv and other errors 2011-04-28 09:24:00 +09:00
Damien Elmes
11f3de525f groupConf() takes gcid, not gid 2011-04-28 09:23:59 +09:00
Damien Elmes
692fba2ea3 use the deck's groups instead of holding on to a private copy 2011-04-28 09:23:59 +09:00
Damien Elmes
77ee8f1385 ditch useGroups 2011-04-28 09:23:59 +09:00
Damien Elmes
f75e2af195 use a single group setting 2011-04-28 09:23:59 +09:00
Damien Elmes
73625e5751 include the gid in the tree so we can tell which groups are real 2011-04-28 09:23:59 +09:00
Damien Elmes
fc96e12a0a add some randomness to lrn interval 2011-04-28 09:23:59 +09:00
Damien Elmes
495b058618 include total count in with rev+due 2011-04-28 09:23:59 +09:00
Damien Elmes
2a1355eb16 make the group tree part of the scheduler instead 2011-04-28 09:23:59 +09:00
Damien Elmes
728715ff84 counts by group 2011-04-28 09:23:59 +09:00
Damien Elmes
2d00163323 tree grouping; add column to groups so they can remember tags 2011-04-28 09:23:59 +09:00
Damien Elmes
e547b0586a simplify undo
The undo code was using triggers and a temporary table to write out all changed rows before making a change. This made for powerful undo/redo support, but had some problems:
- creating the tables and triggers wasn't cheap, especially on mobile devices
- likewise, every data modification required writing into two separate databases, almost doubling the amount of writes required
- it was possible to leave the DB in an inconsistent state if an undoable operation was followed by a non-undoable operation that referenced it, and the user then rolled back the undoable operation.

To address these issues, we simplify undo by integrating it with the autosave changes:
- .save() can be passed a name to mark a rollback point. If the user undoes the change, any changes since the last save are lost
- autosaves happen every 5 minutes, and are pushed back on a .save(), so the maximum work a user can lose is 5 minutes.
- reviews are handled separately, so we can let the user undo multiple reviews at once
- if necessary, special cases could be added for other operations like marking
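
The named rollback point can be sketched with plain sqlite transactions. The `Deck` class here is a hypothetical stand-in, and the autosave timer and the separate handling of reviews are left out.

```python
import sqlite3

class Deck:
    """Hypothetical stand-in showing save(name) as a rollback point."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("create table cards (id integer primary key, due integer)")
        self.db.commit()
        self.undo_name = None

    def save(self, name=None):
        # Commit pending work; a name marks this point as undoable.
        self.db.commit()
        self.undo_name = name

    def undo(self):
        if self.undo_name is None:
            return None
        # Everything since the last save is discarded.
        self.db.rollback()
        name, self.undo_name = self.undo_name, None
        return name

deck = Deck()
deck.db.execute("insert into cards values (1, 10)")
deck.save("Reschedule")                     # rollback point before the change
deck.db.execute("update cards set due = 99")
undone = deck.undo()
due = deck.db.execute("select due from cards").fetchone()[0]
```

Only one rollback point exists at a time, which is why a second damaging operation makes the first one unrecoverable through undo.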

This means that if a user does two damaging operations in a row they won't be able to restore the first one, but such an event is unlikely, and is also covered by the backups made each time a deck is opened.
2011-04-28 09:23:59 +09:00
Damien Elmes
63efc4dbaa remove the separate timeGraph() 2011-04-28 09:23:58 +09:00
Damien Elmes
40706f3493 break reps graph into separate graphs; exclude cumulative line from stack 2011-04-28 09:23:58 +09:00
Damien Elmes
9c1e0befc6 bundle js libs; include them in report(); fix some graph ids 2011-04-28 09:23:58 +09:00
Damien Elmes
6ec2500eb8 reps & time graphs 2011-04-28 09:23:58 +09:00
Damien Elmes
60ef1ec49f eases graph 2011-04-28 09:23:58 +09:00
Damien Elmes
7d5d72adf8 add intervals, boxing in weeks for now 2011-04-28 09:23:58 +09:00
Damien Elmes
89fa08c548 due and cumulative due graphs ported 2011-04-28 09:23:58 +09:00
Damien Elmes
2ca9568196 initial graph code reorganization 2011-04-28 09:23:58 +09:00
Damien Elmes
cc9f5b8d86 stripMedia->strip 2011-04-28 09:23:58 +09:00
Damien Elmes
3c40854583 fix collapsing; make sure learning cards are put back on the heap 2011-04-28 09:23:57 +09:00
Damien Elmes
942bf43b52 fix stats
they're running now, but need to be sanity checked to make sure they're doing the right thing
2011-04-28 09:23:57 +09:00
Damien Elmes
63d1448d1e only bump lrn count when cramming if card not immediately graduated 2011-04-28 09:23:57 +09:00
Damien Elmes
464eb2b684 make sure cramming works if there are no selected groups 2011-04-28 09:23:57 +09:00
Damien Elmes
31427f0133 fix lapse card scheduling
- make sure we set a timestamp due time, and put the card back in the queue
- add a unit test for it
2011-04-28 09:23:57 +09:00
Damien Elmes
e407697fb9 fix interval calculation for lapsed cards in learning queue 2011-04-28 09:23:57 +09:00