Commit graph

242 commits

Author SHA1 Message Date
Damien Elmes
1ce5a7552e ensure top level properties updated on rename 2011-10-23 05:45:15 +09:00
Damien Elmes
f37f4aff96 need to sort tree based on split components, not string order 2011-10-22 21:14:46 +09:00
Damien Elmes
b9dc5764a3 add group renaming 2011-10-22 20:41:33 +09:00
Damien Elmes
bbe2973952 add newly created clozes too 2011-10-22 03:58:54 +09:00
Damien Elmes
46dd863f3c convert genCards() to bulk update; drop random 2011-10-22 03:39:34 +09:00
Damien Elmes
852189808f catch attempts to save a fact that deletes a cloze card 2011-10-22 03:10:34 +09:00
Damien Elmes
119217290e implement anki1 importer 2011-10-21 23:45:42 +09:00
Damien Elmes
b242b06052 import media too 2011-10-21 07:53:22 +09:00
Damien Elmes
83f8ef45ff anki2 importing and reorganize import code 2011-10-21 07:36:44 +09:00
Damien Elmes
1ba75e8dc9 don't need a separate migration package; the rest is std import 2011-10-20 22:18:55 +09:00
Damien Elmes
76960abd75 fix upgrading; drop old mnemosyne 1 importer 2011-10-20 22:05:34 +09:00
Damien Elmes
cf4abcb403 split upgrade code into separate file; use .anki2 now 2011-10-20 05:26:50 +09:00
Damien Elmes
050afa57ad make sure consecutive syncs don't do anything; check server removal 2011-10-06 15:54:55 +09:00
Damien Elmes
0c85acf3f7 refactor to support early exit when no media changes
- regular sync now receives a media USN as well
- server bumps the media usn not only on sync but on any other media change
- added local media.hasChanged()
2011-10-06 15:34:22 +09:00
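
A minimal sketch of the early-exit check described above, assuming a hypothetical hasChanged() helper and a one-row meta table that stores the folder mod time recorded after the last scan; none of these names are taken from the actual code.

    import os
    import sqlite3

    def mediaSyncNeeded(localMedia, lastMediaUsn, serverMediaUsn):
        # the regular sync's meta response now carries the server's media usn;
        # if it hasn't moved and nothing changed locally, skip media sync entirely
        if serverMediaUsn != lastMediaUsn:
            return True
        return localMedia.hasChanged()

    class LocalMedia(object):
        def __init__(self, mediaDir, dbPath):
            self.dir = mediaDir
            self.db = sqlite3.connect(dbPath)

        def hasChanged(self):
            # compare the folder's mod time with the one recorded after the
            # last scan (assumed to live in a one-row meta table)
            (lastDirMod,) = self.db.execute("select dirMod from meta").fetchone()
            return int(os.stat(self.dir).st_mtime) != lastDirMod
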
Damien Elmes
dd37ee5915 syncing same file twice should work 2011-10-06 14:47:22 +09:00
Damien Elmes
afe1ad2b0b add resync test, fix zip meta 2011-10-06 14:37:07 +09:00
Damien Elmes
b4fdf1c690 send 'continue' for multiple zips; test splitting 2011-10-06 13:51:44 +09:00
Damien Elmes
8c1f397459 ensure successive calls work 2011-10-06 13:25:01 +09:00
Damien Elmes
0a52f55e50 test removing too 2011-10-06 13:06:32 +09:00
Damien Elmes
4b710b5a87 use sha1 everywhere; the speed differences are negligible 2011-10-03 14:58:01 +09:00
Damien Elmes
eca6ef204f check file is added on remote test 2011-10-03 13:42:53 +09:00
Damien Elmes
49181ee738 fix media zipping and addFiles call 2011-10-03 12:59:35 +09:00
Damien Elmes
5da3bba1df initial work on media syncing 2011-10-03 12:45:08 +09:00
Damien Elmes
bf3bb9dd32 if sync server offline, abort sync tests 2011-10-01 20:44:35 +09:00
Damien Elmes
f131021c7e remote partial syncing working; fixed mod time on finish() 2011-10-01 14:24:39 +09:00
Damien Elmes
20d753591d add timestamp & common error checks to meta(); kill old code 2011-09-29 22:18:36 +09:00
Damien Elmes
aabc884341 start work on remote syncing; full up/down implemented 2011-09-29 20:58:42 +09:00
Damien Elmes
22df2790f9 refactor media change logging 2011-09-25 06:33:57 +09:00
Damien Elmes
9fdfac722d fixed bug in bundling 2011-09-24 14:46:42 +09:00
Damien Elmes
667b89ecc5 support partial syncs of arbitrary size
The full sync threshold was a hack to ensure we synced the deck in a
memory-efficient way if there was a lot of data to send. The problem is that
it's not easy for the user to predict how many changes there are, and so it
might come as a surprise to them when a sync suddenly switches to a full sync.

In order to be able to send changes in chunks rather than all at once, some
changes had to be made:

- Clients now set usn=-1 when they modify an object, which allows us to
  distinguish between objects that have been modified on the server, and ones
  that have been modified on the client. If we don't do this, we would have to
  buffer the local changes in a temporary location before adding the server
  changes.
- Before a client sends the objects to the server, it changes the usn to
  maxUsn both in the payload and the local storage.
- We do deletions at the start of the sync.
- To determine which card or fact is newer, we have to fetch the modification
  time of the local version. We do this in batches rather than try to load the
  entire list in memory.
2011-09-24 12:42:02 +09:00
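
A sketch of the usn=-1 bookkeeping described above (table and column names assumed, server.applyChunk() hypothetical): locally modified rows are fetched in chunks, stamped with maxUsn in both the payload and local storage, and sent one chunk at a time.

    def sendChangedCards(db, server, maxUsn, chunkSize=250):
        # rows the client modified locally carry usn = -1, so they can be told
        # apart from rows that already came from the server
        while True:
            rows = db.execute(
                "select id, mod, data from cards where usn = -1 limit ?",
                (chunkSize,)).fetchall()
            if not rows:
                break
            chunk = [dict(id=i, mod=m, data=d, usn=maxUsn) for (i, m, d) in rows]
            # stamp the local copies too, so the next loop iteration moves on
            db.executemany("update cards set usn = ? where id = ?",
                           [(maxUsn, i) for (i, _, _) in rows])
            server.applyChunk(chunk)
        db.commit()

Deletions would be sent before this step, and the receiving side would compare incoming mod times against its local copies in batches rather than loading the whole list into memory.
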
Damien Elmes
699839188b ensure we give correct intervals for new cards 2011-09-23 14:55:20 +09:00
Damien Elmes
001a69db43 make sure we update the rep count on pass/fail, and add unit test 2011-09-23 12:52:38 +09:00
Damien Elmes
e7f416406d refactor learning
Rather than showing the user how many cards are in the learning queue, we want
to be able to show them the number of reps they have to do to clear the queue,
so they can better estimate the required time. Previously we counted up with
the grade column, but that meant we couldn't quickly sum up the number of reps
left. So we invert it and count down instead.

I also dropped the 'first time bonus' for now. If there's enough demand for
it, it can be added back by using the flags column, instead of a dedicated
cycles column.
2011-09-23 10:29:49 +09:00
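
A sketch of the count-down idea, with an assumed `left` column on cards in the learning queue (queue 1); the step count of 2 is just an illustrative config value.

    LEARN_STEPS = 2  # assumed: two learning steps, e.g. 1 and 10 minutes

    def lrnRepsLeft(db):
        # remaining reps is now a cheap sum instead of a per-card calculation
        return db.execute(
            "select sum(left) from cards where queue = 1").fetchone()[0] or 0

    def answerLrnCard(db, cardId, passed, left):
        # passing moves one step closer to graduation; failing resets the count
        newLeft = (left - 1) if passed else LEARN_STEPS
        db.execute("update cards set left = ? where id = ?", (newLeft, cardId))
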
Damien Elmes
2b34d8a948 more group/sched refactoring
- keep track of rep/time counts per group, instead of just at the top level
- sort by due after retrieving learn cards
- ensure activeGroups is sorted alphabetically
- ensure new cards come in alphabetical group order
- ensure queues are refilled when empty
2011-09-23 08:19:22 +09:00
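
A sketch of the per-group bookkeeping and alphabetical refill described above (data structures assumed): counts are keyed by group id rather than held in a single top-level counter, and the new queue is refilled group by group in name order.

    def activeGroupsSorted(groups):
        # groups: {gid: name}; alphabetical name order drives new-card order
        return [gid for gid, _ in sorted(groups.items(), key=lambda kv: kv[1])]

    def fillNewQueue(db, groups, limit):
        queue = []
        for gid in activeGroupsSorted(groups):
            if len(queue) >= limit:
                break
            queue += [r[0] for r in db.execute(
                "select id from cards where gid = ? and queue = 0 "
                "order by due limit ?", (gid, limit - len(queue)))]
        return queue

    def bumpGroupCounts(counts, gid, reps=1, ms=0):
        # counts: {gid: [reps, time_ms]} instead of a single top-level pair
        c = counts.setdefault(gid, [0, 0])
        c[0] += reps
        c[1] += ms
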
Damien Elmes
024c42fef8 group scheduling refactor
see the following for background discussion:
http://groups.google.com/group/ankisrs-users/browse_thread/thread/4db5e82f7dff74fb

- change sched index to the more efficient gid, queue, due
- drop the dynamic index support. as there's no q/a cache anymore, it's
  cheap enough to hit the cards table directly, and we can't use the index in
  its new form.
- drop order by clauses (see todo)
- ensure there's always an active group. if users want to study all groups at
  once, they need to create a top level group. we do this because otherwise
  the 'top level group' that's active when everything is selected is not
  clear.

to do:

- new cards will appear in gid order, but the gid numbers don't reflect
  alphabetical sorting. we need to change the scheduling code so that it steps
  through each group in turn
- likewise for the learn queue
2011-09-22 11:54:01 +09:00
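
A sketch of the index change described above (column names assumed): with the q/a cache gone, the scheduler hits the cards table directly, so the index is built to match the gid, queue, due access pattern.

    def addSchedIndex(db):
        db.execute("drop index if exists ix_cards_sched")
        db.execute("create index ix_cards_sched on cards (gid, queue, due)")

    def dueCardsForGroup(db, gid, queue, limit):
        # the typical per-group lookup the new index is meant to serve
        return [r[0] for r in db.execute(
            "select id from cards where gid = ? and queue = ? "
            "order by due limit ?", (gid, queue, limit))]
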
Damien Elmes
dac46752ed drop the count_answered option 2011-09-18 09:42:04 +09:00
Damien Elmes
f3965f4c09 when suspending leeches, make sure we don't put the card back in the queue 2011-09-17 21:42:41 +09:00
Damien Elmes
64d13c2cbc add a quick unit test to make sure groupCounts() works with changes 2011-09-15 01:42:48 +09:00
Damien Elmes
ee767ff132 refactor to allow group deletions without schema mod
Because group deletions are likely to be a semi-common operation (esp. for new users trying out shared material), deleting groups will no longer cause a full sync. In order to avoid syncing issues, we now allow cards/facts/etc. to point to an invalid group; in that case, we just treat them as if they were in the default group.
2011-09-15 01:37:30 +09:00
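
A sketch of the fallback described above (gid 1 assumed to be the default group): deleting a group touches only the groups table, and anything still pointing at the old gid is resolved to the default group at read time.

    DEFAULT_GID = 1  # assumed id of the default group

    def effectiveGroup(groups, gid):
        # groups: {gid: group dict}; dangling gids fall back to the default
        return groups.get(gid, groups[DEFAULT_GID])

    def deleteGroup(db, groups, gid):
        # no schema change and no rewrite of cards/facts, so no full sync
        if gid == DEFAULT_GID:
            return
        groups.pop(gid, None)
        db.execute("delete from groups where id = ?", (gid,))
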
Damien Elmes
fa1b223363 use ms resolution for deck mod 2011-09-14 05:09:42 +09:00
Damien Elmes
bc9f6e6a24 add USNs
Decks now have an "update sequence number". All objects also have a USN, which
is set to the deck USN each time they are modified. When syncing, each side
sends any objects with a USN >= clientUSN. When objects are copied via sync,
they have their USNs bumped to the current serverUSN. After a sync, the USN on
both sides is set to serverUSN + 1.

This solves the failing three way test, ensures we receive all changes
regardless of clock drift, and as the revlog also has a USN now, ensures that
old revlog entries are imported properly too.

Objects retain a separate modification time, which is used for conflict
resolution, deck subscriptions/importing, and info for the user.

Note that if the clock is too far off, it will still cause confusion for
users, as the due counts may be different depending on the time. For this
reason it's probably a good idea to keep a limit on how far the clock can
deviate.

We still keep track of the last sync time, but only so we can determine if the
schema has changed since the last sync.

The media code needs to be updated to use USNs too.
2011-09-13 21:10:21 +09:00
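
A sketch of the USN exchange described above (table layout and helper names assumed): each side sends rows whose usn is at or above the usn recorded at the last sync, incoming rows are stamped with the current server usn, and both sides finish by storing serverUsn + 1.

    def changedSince(db, table, lastSyncUsn):
        # rows touched since the last sync carry a usn >= the recorded value
        return db.execute(
            "select id, mod, usn, data from %s where usn >= ?" % table,
            (lastSyncUsn,)).fetchall()

    def applyRemote(db, table, rows, serverUsn):
        for (rid, mod, _usn, data) in rows:
            # copied objects get their usn bumped to the current server usn
            db.execute(
                "insert or replace into %s (id, mod, usn, data) "
                "values (?, ?, ?, ?)" % table, (rid, mod, serverUsn, data))

    def finishSync(db, serverUsn):
        # after a successful sync both sides store serverUsn + 1
        db.execute("update deck set usn = ?", (serverUsn + 1,))
        db.commit()

The separate modification time is still what decides conflicts; the usn only decides what gets sent.
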
Damien Elmes
b391202e47 add failing three-way test
While thinking about media syncing I realized the current sync algorithm is
flawed in certain cases. It might be time to think about using a USN instead,
as that should also hopefully solve the skewed clock problem properly.
2011-09-13 05:12:30 +09:00
Damien Elmes
87bfb38e2b move db
- if we store it inside the media folder, we inadvertently bump the folder mod
  time every time sqlite creates a journal file

- close/reopen the media db as the deck is closed/opened
2011-09-12 05:03:31 +09:00
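
A sketch of the close/reopen behaviour from the second point (class and method names hypothetical): the media DB connection is opened and closed together with the deck.

    import sqlite3

    class MediaManager(object):
        def __init__(self, dbPath):
            self.dbPath = dbPath
            self.db = None

        def open(self):
            # called when the deck is opened
            if not self.db:
                self.db = sqlite3.connect(self.dbPath)

        def close(self):
            # called when the deck is closed
            if self.db:
                self.db.close()
                self.db = None
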
Damien Elmes
c59dd854fb add change detection
I removed the media database in an earlier commit, but it's now necessary
again as I decided to add native media syncing to AnkiWeb.

This time, the DB is stored in the media folder rather than with the deck.
This means we avoid sending it in a full sync, and makes deck backups faster.
The DB is a cache of file modtimes and checksums. When findChanges() is
called, the code checks to see which files were added, changed or deleted
since the last time, and updates the log of changes. Because the scanning step
and log retrieval is separate, it's possible to do the scanning in the
background if the need arises.

If the DB is deleted by the user, Anki will forget any deletions, and add all
the files back to the DB the next time it's accessed.

File changes are recorded as a delete + add.

media.addFile() could be optimized in the future to log media added manually
by the user, allowing us to skip the full directory scan in cases where the
only changes were manually added media.
2011-09-12 03:11:06 +09:00
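
A sketch of the scan described above (cache layout assumed): the media DB caches each file's mod time and checksum, and findChanges()-style code diffs the folder against that cache, recording a changed file as a delete plus an add.

    import hashlib
    import os

    def fileChecksum(path):
        with open(path, "rb") as f:
            return hashlib.sha1(f.read()).hexdigest()

    def scanChanges(mediaDir, cache):
        # cache: {fname: (mtime, checksum)} as loaded from the media DB
        added, removed = [], []
        seen = set()
        for fname in os.listdir(mediaDir):
            seen.add(fname)
            path = os.path.join(mediaDir, fname)
            mtime = int(os.stat(path).st_mtime)
            old = cache.get(fname)
            if old is None:
                added.append(fname)
            elif old[0] != mtime and old[1] != fileChecksum(path):
                # a changed file is logged as a delete + add
                removed.append(fname)
                added.append(fname)
        # anything in the cache but no longer on disk was deleted
        removed += [f for f in cache if f not in seen]
        return added, removed
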
Damien Elmes
7e1df75cc2 simplify media.py
- drop mediaPrefix & the mediaURL-based downloading
- always create the media folder
- remove move() in preparation for a single collection approach
2011-09-11 00:25:22 +09:00
Damien Elmes
9aad5c1166 more unit tests, fix bugs
- make sure gconf has an id
- merge deck conf
2011-09-09 22:34:50 +09:00
Damien Elmes
6cfe112f91 card tests 2011-09-09 21:24:52 +09:00
Damien Elmes
f15cb23c41 skip the 600 second pad during testing 2011-09-09 21:17:42 +09:00
Damien Elmes
85a2bb6193 revlog timestamp is ms based; should fetch facts/cards by mod not id 2011-09-09 21:11:11 +09:00
Damien Elmes
362ae3eee2 initial work on sync refactor
Ported the sync code to the latest libanki structure. Key points:

No summary:

The old style got each side to fetch ids+mod times and required the client to
diff them and then request or bundle up the appropriate objects. Instead, we now
get each side to send all changed objects, and it's the responsibility of the
other side to decide what needs to be merged and what needs to be discarded.
This allows us to skip a separate summary step, which saves scanning tables
twice, and allows us to reduce server requests from 4 to 3.

Schema changes:

Certain operations that are difficult to merge (such as changing the number of
fields in a model, or deleting models or groups) result in a full sync. The
user is warned about it in the GUI before such schema-changing operations
execute.

Sync size:

For now, we don't try to deal with large incremental syncs. Because the cards,
facts and revlog can be large in memory (hundreds of megabytes in some cases),
they would have to be chunked for the benefit of devices with a low amount of
memory.

Currently findChanges() uses the full fact/card objects which we're planning to
send to the server. It could be rewritten to fetch a summary (just the id, mod
& rep columns) which would save some memory, and then compare against blocks
of a few hundred remote objects at a time. However, it's a bit more
complicated than that:

- If the local summary is huge it could exceed memory limits. Without a local
  summary we'd have to query the db for each record, which could be a lot
  slower.

- We currently accumulate a list of remote records we need to add locally.
  This list also has the potential to get too big. We would need to
  periodically commit the changes as we accumulate them.

- Merging a large amount of changes is also potentially slow on mobile
  devices.

Given the fact that certain schema-changing operations require a full sync
anyway, I think it's probably best to concentrate on a chunked full sync for
now instead; provided the user syncs periodically, they shouldn't hit the full
sync limits except after bulk editing operations.

Chunked partial syncing should be possible to add in the future without any
changes to the deck format.

Still to do:
- deck conf merging
- full syncing
- new http proxy
2011-09-08 12:50:42 +09:00
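
A sketch of the "receiver decides" rule described above (row layout assumed): each side sends everything it changed, and the receiver merges an incoming object only if it is newer than the local copy, discarding it otherwise.

    def mergeCards(db, remoteCards):
        # remoteCards: list of dicts with at least id, mod and data
        for card in remoteCards:
            row = db.execute("select mod from cards where id = ?",
                             (card["id"],)).fetchone()
            if row and row[0] >= card["mod"]:
                # local copy is at least as new: discard the remote one
                continue
            db.execute(
                "insert or replace into cards (id, mod, data) values (?, ?, ?)",
                (card["id"], card["mod"], card["data"]))
        db.commit()
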