GitHub/Anki - OBJNULLs Forgejo

mirror of https://github.com/ankitects/anki.git synced 2025-09-20 15:02:21 -04:00

Author	SHA1	Message	Date
Damien Elmes	279a942642	deck -> collection	2011-11-23 17:47:44 +09:00
Damien Elmes	83f8ef45ff	anki2 importing and reorganize import code	2011-10-21 07:36:44 +09:00
Damien Elmes	1ba75e8dc9	don't need a separate migration package; the rest is std import	2011-10-20 22:18:55 +09:00
Damien Elmes	bc9f6e6a24	add USNs Decks now have an "update sequence number". All objects also have a USN, which is set to the deck USN each time they are modified. When syncing, each side sends any objects with a USN >= clientUSN. When objects are copied via sync, they have their USNs bumped to the current serverUSN. After a sync, the USN on both sides is set to serverUSN + 1. This solves the failing three way test, ensures we receive all changes regardless of clock drift, and as the revlog also has a USN now, ensures that old revlog entries are imported properly too. Objects retain a separate modification time, which is used for conflict resolution, deck subscriptions/importing, and info for the user. Note that if the clock is too far off, it will still cause confusion for users, as the due counts may be different depending on the time. For this reason it's probably a good idea to keep a limit on how far the clock can deviate. We still keep track of the last sync time, but only so we can determine if the schema has changed since the last sync. The media code needs to be updated to use USNs too.	2011-09-13 21:10:21 +09:00
Damien Elmes	362ae3eee2	initial work on sync refactor Ported the sync code to the latest libanki structure. Key points: No summary: The old style got each side to fetch ids+mod times and required the client to diff them and then request or bundle up the appropriate objects. Instead, we now get each side to send all changed objects, and it's the responsibility of the other side to decide what needs to be merged and what needs to be discarded. This allows us to skip a separate summary step, which saves scanning tables twice, and allows us to reduce server requests from 4 to 3. Schema changes: Certain operations that are difficult to merge (such as changing the number of fields in a model, or deleting models or groups) result in a full sync. The user is warned about it in the GUI before such schema-changing operations execute. Sync size: For now, we don't try to deal with large incremental syncs. Because the cards, facts and revlog can be large in memory (hundreds of megabytes in some cases), they would have to be chunked for the benefit of devices with a low amount of memory. Currently findChanges() uses the full fact/card objects which we're planning to send to the server. It could be rewritten to fetch a summary (just the id, mod & rep columns) which would save some memory, and then compare against blocks of a few hundred remote objects at a time. However, it's a bit more complicated than that: - If the local summary is huge it could exceed memory limits. Without a local summary we'd have to query the db for each record, which could be a lot slower. - We currently accumulate a list of remote records we need to add locally. This list also has the potential to get too big. We would need to periodically commit the changes as we accumulate them. - Merging a large amount of changes is also potentially slow on mobile devices. 
Given the fact that certain schema-changing operations require a full sync anyway, I think it's probably best to concentrate on a chunked full sync for now instead, as provided the user syncs periodically it should not be easy to hit the full sync limits except after bulk editing operations. Chunked partial syncing should be possible to add in the future without any changes to the deck format. Still to do: - deck conf merging - full syncing - new http proxy	2011-09-08 12:50:42 +09:00
Damien Elmes	be5c5a2018	move tags into deck; code into separate file - moved tags into json like previous changes, and dropped the unnecessary id - added tags.py for a tag manager - moved the tag utilities from utils into tags.py	2011-08-28 13:44:29 +09:00
Damien Elmes	2dfdfad6f2	update license link	2011-04-28 09:24:01 +09:00
Damien Elmes	8fcc6b3085	gpl3->agpl	2011-04-28 09:24:01 +09:00
Damien Elmes	511d6e89a1	remove progress handling code; we'll do it in the GUI or provide cb	2011-04-28 09:23:55 +09:00
Damien Elmes	4becd8399c	implement field cache, fix unit tests, remove some importers the field cache (fsums table) also needs to store the model id to preserve the old behaviour of limiting duplicate checks to a given model, and to ensure we're actually comparing against the same fields removed the dingsbums and wcu importers; will accept them back if the authors port them to the new codebase.	2011-04-28 09:23:54 +09:00
Damien Elmes	9c247f45bd	remove q/a cache, tags in fields, rewrite remaining ids, more Anki used random 64bit IDs for cards, facts and fields. This had some nice properties: - merging data in syncs and imports was simply a matter of copying each way, as conflicts were astronomically unlikely - it made it easy to identify identical cards and prevent them from being reimported But there were some negatives too: - they're more expensive to store - javascript can't handle numbers > 2**53, which means AnkiMobile, iAnki and so on have to treat the ids as strings, which is slow - simply copying data in a sync or import can lead to corruption, as while a duplicate id indicates the data was originally the same, it may have diverged. A more intelligent approach is necessary. - sqlite was sorting the fields table based on the id, which meant the fields were spread across the table, and costly to fetch So instead, we'll move to incremental ids. In the case of model changes we'll declare that a schema change and force a full sync to avoid having to deal with conflicts, and in the case of cards and facts, we'll need to update the ids on one end to merge. Identical cards can be detected by checking to see if their id is the same and their creation time is the same. Creation time has been added back to cards and facts because it's necessary for sync conflict merging. That means facts.pos is not required. The graves table has been removed. It's not necessary for schema related changes, and dead cards/facts can be represented as a card with queue=-4 and created=0. Because we will record schema modification time and can ensure a full sync propagates to all endpoints, it means we can remove the dead cards/facts on schema change. Tags have been removed from the facts table and are represented as a field with ord=-1 and fmid=0. Combined with the locality improvement for fields, it means that fetching fields is not much more expensive than using the q/a cache. 
Because of the above, removing the q/a cache is a possibility now. The q and a columns on cards have been dropped. It will still be necessary to render the q/a on fact add/edit, since we need to record media references. It would be nice to avoid this in the future. Perhaps one way would be the ability to assign a type to fields, like "image", "audio", or "latex". LaTeX needs special consideration anyway, as it was being rendered into the q/a cache.	2011-04-28 09:23:53 +09:00
Damien Elmes	2f27133705	drop sqlalchemy; massive refactor SQLAlchemy is a great tool, but it wasn't a great fit for Anki: - We often had to drop down to raw SQL for performance reasons. - The DB cursors and results were wrapped, which incurred a sizable performance hit due to introspection. Operations like fetching 50k records from a hot cache were taking more than twice as long to complete. - We take advantage of sqlite-specific features, so SQL language abstraction is useless to us. - The anki schema is quite small, so manually saving and loading objects is not a big burden. In the process of porting to DBAPI, I've refactored the database schema: - App configuration data that we don't need in joins or bulk updates has been moved into JSON objects. This simplifies serializing, and means we won't need DB schema changes to store extra options in the future. This change obsoletes the deckVars table. - Renamed tables: -- fieldModels -> fields -- cardModels -> templates -- fields -> fdata - a number of attribute names have been shortened Classes like Card, Fact & Model remain. They maintain a reference to the deck. To write their state to the DB, call .flush(). Objects no longer have their modification time manually updated. Instead, the modification time is updated when they are flushed. This also applies to the deck. Decks will now save on close, because various operations that were done at deck load will be moved into deck close instead. Operations like undoing buried card are cheap on a hot cache, but expensive on startup. Programmatically you can call .close(save=False) to avoid a save and a modification bump. This will be useful for generating due counts. Because of the new saving behaviour, the save and save as options will be removed from the GUI in the future. The q/a cache and field cache generating has been centralized. Facts will automatically rebuild the cache on flush; models can do so with model.updateCache(). Media handling has also been reworked. 
It has moved into a MediaRegistry object, which the deck holds. Refcounting has been dropped - it meant we had to compare old and new values every time facts or models were changed, and existed for the sole purpose of not showing errors on a missing media download. Instead we just media.registerText(q+a) when it's updated. The download function will be expanded to ask the user if they want to continue after a certain number of files have failed to download, which should be an adequate alternative. And we now add the file into the media DB when it's copied to the media directory, not when the card is committed. This fixes duplicates a user would get if they added the same media to a card twice without adding the card. The old DeckStorage object had its upgrade code split in a previous commit; the opening and upgrading code has been merged back together, and put in a separate storage.py file. The correct way to open a deck now is import anki; d = anki.Deck(path). deck.getCard() -> deck.sched.getCard() same with answerCard deck.getCard(id) returns a Card object now. And the DB wrapper has had a few changes: - sql statements are a more standard DBAPI: - statement() -> execute() - statements() -> executemany() - called like execute(sql, 1, 2, 3) or execute(sql, a=1, b=2, c=3) - column0 -> list	2011-04-28 09:23:53 +09:00
Damien Elmes	2613143fe9	improve dynamic indices, implement new queue	2011-04-28 09:23:28 +09:00
Damien Elmes	4e7e8b03bc	moving scheduling code into separate file, some preliminary refactoring	2011-04-28 09:23:28 +09:00
Damien Elmes	9aa2f8dc40	refactor cards Cards had developed quite a lot of cruft from incremental changes, and a number of important attributes were stored in names that had no bearing to their actual use. Added: - position, which new cards will be sorted on in the future - flags, which is reserved for future use Renamed: - type to queue - relativeDelay to type - noCount to lapses Removed: - all new/young/matureEase counts; the information is in the revlog - firstAnswered, lastDue, lastFactor, averageTime and totalTime for the same reason - isDue, spaceUntil and combinedDue, because they are no longer used. Spaced cards will be implemented differently in a coming commit. - priority - yesCount, because it can be inferred from reps & lapses - tags; they've been stored in facts for a long time now Also compatibility with deck versions less than 65 has been dropped, so decks will need to be upgraded to 1.2 before they can be upgraded by the dev code. All shared decks are on 1.2, so this should hopefully not be a problem.	2011-04-28 09:23:27 +09:00
Damien Elmes	f828393de3	rename deck.s to a more understandable deck.db; keep s for compat	2011-04-28 09:21:07 +09:00
Damien Elmes	9421a037f6	remove self explanatory module docstrings; strip trailing whitespace	2011-04-28 09:21:07 +09:00
Damien Elmes	4302306fe9	use a checksum for field values; fixed import->update number Previously we had an index on the value field, which was very expensive for long fields. Instead we use a separate column and take the first 8 characters of the field value's md5sum, and index that. In decks with lots of text in fields, it can cut the deck size by 30% or more, and many decks improve by 10-20%. Decks with only a few characters in fields may increase in size slightly, but this is offset by the fact that we only generate a checksum for fields that have uniqueness checking on. Also, fixed import->update reporting the total # of available facts instead of the number of facts that were imported.	2011-04-28 09:21:06 +09:00
Damien Elmes	28604b9d29	remove priorities	2011-04-28 09:21:06 +09:00
Damien Elmes	7fc593a2ce	fix tag update	2010-12-08 17:05:19 +09:00
Damien Elmes	e3dd736460	add ability to update fields when importing	2010-11-26 01:36:24 +09:00
Damien Elmes	6ec898ca4b	Require explicit reset for most queue-modifying functions When you call operations like deleteCards(), suspendCards() and so on, it is now necessary to call deck.reset() afterwards. This allows the calling code to delay a reset if necessary. If the calling code calls a function that says the caller must reset, the caller should be sure to call .reset() and fetch the current card again. Failure to do the latter will result in answerCard() attempting to remove the card from the queue, when the queue has been cleared.	2010-11-23 17:41:36 +09:00
Damien Elmes	b69fd48768	more type handling updates; don't munge counts on sync In various parts of the code we need to get all cards of a given category (new, failed, etc) regardless of whether they're suspended, buried, etc. So we store the true type in the obsolete relativeDelay column and add an index for it, because it's cheaper than putting indices on reps & successive.	2010-11-13 18:39:24 +09:00
Damien Elmes	1f20442921	require dingsbums decks to use a different name so we don't conflict	2010-10-24 13:26:51 +09:00
Damien Elmes	be4dea39b1	more scheduler updates - reimplement reviewEarly and newEarly by replacing parts of the scheduler, instead of adding special conditions - remove references to isDue and priority (1,2,3,4) which is not necessary anymore - add option to switch between per-day scheduling and due now scheduling - newCardsToday() -> newCardsDoneToday() - don't decrement counts for suspended cards - make sure to update type when suspending/unsuspending - fix findCards() - set hardInterval = 1-1.1 on upgrade, or the default per day scheduling doesn't make sense	2010-10-18 18:01:19 +09:00
Damien Elmes	ad743d850d	start work on scheduling refactor Previously we used getCard() to fetch one card at a time. This required a number of indices to perform efficiently, and the indices were expensive in terms of disk space and time required to keep them up to date. Instead we now gather a bunch of cards at once. - Drop checkDue()/isDue so writes are not necessary to the DB when checking for due cards - Due counts checked on deck load, and only updated once a day or at the end of a session. This prevents cards from expiring during reviews, leading to confusing undo behaviour and due counts that go up instead of down as you review. The default will be to only expire cards once a day, which represents a change from the way things were done previously. - Set deck var defaults on deck load/create instead of upgrade, which should fix upgrade issues - The scheduling code can now have bits and pieces switched out, which should make review early / cram etc easier to integrate - Cards with priority <= 0 now have their type incremented by three, so we can get access to schedulable cards with a single column. - rebuildQueue() -> reset() - refresh() -> refreshSession() - Views and many of the indices on the cards table are now obsolete and will be removed in the future. I won't remove them straight away, so as to not break backward compatibility. - Use bigger intervals between successive card templates, as the previous intervals were too small to represent in doubles in some circumstances Still to do: - review early - learn more - failing mature cards where delay1 > delay0	2010-10-18 14:35:11 +09:00
Damien Elmes	a68366b5c4	fix card ordering when generating cards by basing card creation off fact	2010-07-21 19:46:27 +09:00
Damien Elmes	4769bfa7a5	another hack for w32's low timer resolution	2010-02-12 16:03:48 +09:00
Damien Elmes	c90828349c	remove obsolete reference to card tags, don't store card tags on import	2010-01-23 10:59:40 +09:00
Damien Elmes	84b88507a2	tweak importing message	2009-11-27 19:53:24 +09:00
Rick Gruber-Riemer	4971069856	Added importing for DingsBums?! decks	2009-11-08 14:39:09 +09:00
Jean-Baptiste Mazon	fe19dd806d	rewrite field names as tags when importing with tagDuplicates	2009-10-31 00:35:46 +01:00
Damien Elmes	8bc7e0c945	enforce ordinal ordering when importing	2009-08-17 05:05:50 +09:00
Damien Elmes	fe99ff7518	add supermemo importer from Petr Michalec	2009-07-09 23:03:23 +09:00
Damien Elmes	0d0b9fc81e	make sure card count is properly updated in importing	2009-07-04 15:40:36 +09:00
Damien Elmes	2b86cd6b33	add ability to customize separator in csv import	2009-06-26 07:13:14 +09:00
Damien Elmes	e62967ecb1	switch to python csv	2009-06-18 05:21:47 +09:00
Damien Elmes	90f726e634	remove version numbers from import, as osx gets confused	2009-04-25 03:57:56 +09:00
Damien Elmes	94df742a59	fix bug with zero imports, improve speed with zero imports	2009-04-23 02:00:52 +09:00
Damien Elmes	155de15101	greatly improve import speed on large decks, randomize too	2009-04-23 01:58:40 +09:00
Damien Elmes	e9e5994248	make sure cards are tagged correctly when importing tags	2009-03-28 14:32:38 +09:00
Damien Elmes	36421cf166	use pure field model order when importing	2009-02-27 15:30:11 +09:00
Damien Elmes	ebaa37fe55	update tags when importing	2009-02-20 00:11:44 +09:00
Damien Elmes	a4e3badf80	update importing for new tag handling	2009-02-09 21:54:19 +09:00
Damien Elmes	e50ccf22e5	canonify tags when importing	2009-01-20 02:16:15 +09:00
Damien Elmes	f5feaaa782	change wording	2009-01-17 23:00:51 +09:00
Damien Elmes	ff4cc7b0af	add importing tag support, fix audio	2009-01-17 22:36:14 +09:00
Damien Elmes	334d126237	recording & noise profile support on linux	2009-01-17 01:05:39 +09:00
Damien Elmes	1fa7466dd9	progress for importing	2009-01-16 20:22:46 +09:00
Damien Elmes	91e90d8092	card model > card template	2009-01-05 06:10:10 +09:00
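
Commit 4302306fe9 above describes indexing a short checksum (the first 8 characters of the field value's md5sum) instead of the full field value. The following is only a rough sketch of that idea, not the code from the commit: the function name `field_checksum`, the `ChecksumIndex` class, the UTF-8 encoding, and the use of the hex digest are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def field_checksum(value: str) -> str:
    """Return the first 8 hex characters of the MD5 digest of a field value.

    Assumption: the value is encoded as UTF-8 before hashing.
    """
    return hashlib.md5(value.encode("utf-8")).hexdigest()[:8]


class ChecksumIndex:
    """Hypothetical in-memory stand-in for an indexed checksum column."""

    def __init__(self) -> None:
        # checksum -> list of full field values sharing that checksum
        self._by_csum = defaultdict(list)

    def add(self, value: str) -> None:
        self._by_csum[field_checksum(value)].append(value)

    def contains(self, value: str) -> bool:
        # Look up the short checksum first, then compare full values,
        # because two different values can share the same 8-character prefix.
        return value in self._by_csum.get(field_checksum(value), [])


index = ChecksumIndex()
index.add("a very long field value " * 100)
print(index.contains("a very long field value " * 100))
print(index.contains("some other value"))
```

The point of the design is that the index key stays a fixed 8 characters regardless of field length, at the cost of a final full-value comparison to rule out checksum collisions.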


55 commits
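
Commit bc9f6e6a24 in the log above describes the update sequence number (USN) scheme in prose. The toy model below is one possible reading of that description, not the actual sync code: the `Obj` and `Side` classes, the `sync` function, and the rule "newer modification time wins" are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    id: int
    mod: int  # modification time, used here only for conflict resolution
    usn: int  # update sequence number stamped at modification time


@dataclass
class Side:
    usn: int
    objs: dict = field(default_factory=dict)

    def modify(self, obj_id: int, mod: int) -> None:
        # A modified object takes the current USN of its side.
        self.objs[obj_id] = Obj(obj_id, mod, self.usn)

    def changed_since(self, min_usn: int) -> list:
        return [o for o in self.objs.values() if o.usn >= min_usn]

    def merge(self, incoming: list, server_usn: int) -> None:
        for o in incoming:
            local = self.objs.get(o.id)
            # Assumed conflict rule: keep whichever copy was modified later.
            if local is None or o.mod > local.mod:
                # Objects copied via sync are bumped to the server USN.
                self.objs[o.id] = Obj(o.id, o.mod, server_usn)


def sync(client: Side, server: Side) -> None:
    client_usn, server_usn = client.usn, server.usn
    # Each side sends every object with a USN >= clientUSN.
    from_client = client.changed_since(client_usn)
    from_server = server.changed_since(client_usn)
    server.merge(from_client, server_usn)
    client.merge(from_server, server_usn)
    # After a sync, both sides move to serverUSN + 1.
    client.usn = server.usn = server_usn + 1


client, server = Side(usn=5), Side(usn=5)
client.modify(1, mod=100)   # changed only on the client
server.modify(2, mod=90)    # changed only on the server
client.modify(3, mod=200)   # changed on both sides; client copy is newer
server.modify(3, mod=150)
sync(client, server)
print(sorted(client.objs), sorted(server.objs), client.usn, server.usn)
```

Because change detection relies on sequence numbers rather than timestamps, clock drift between the two sides does not cause changes to be missed; modification times are consulted only to pick a winner when both sides changed the same object.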