Commit graph

4 commits

Author SHA1 Message Date
Damien Elmes
2f27133705 drop sqlalchemy; massive refactor
SQLAlchemy is a great tool, but it wasn't a great fit for Anki:
- We often had to drop down to raw SQL for performance reasons.
- The DB cursors and results were wrapped, which incurred a
  sizable performance hit due to introspection. Operations like fetching 50k
  records from a hot cache were taking more than twice as long to complete.
- We take advantage of sqlite-specific features, so SQL language abstraction
  is useless to us.
- The anki schema is quite small, so manually saving and loading objects is
  not a big burden.

In the process of porting to DBAPI, I've refactored the database schema:
- App configuration data that we don't need in joins or bulk updates has been
  moved into JSON objects. This simplifies serializing, and means we won't
  need DB schema changes to store extra options in the future. This change
  obsoletes the deckVars table.
- Renamed tables:
-- fieldModels -> fields
-- cardModels -> templates
-- fields -> fdata
- a number of attribute names have been shortened

Classes like Card, Fact & Model remain. They maintain a reference to the deck.
To write their state to the DB, call .flush().

Objects no longer have their modification time manually updated. Instead, the
modification time is updated when they are flushed. This also applies to the
deck.

Decks will now save on close, because various operations that were done at
deck load will be moved into deck close instead. Operations like undoing
buried card are cheap on a hot cache, but expensive on startup.
Programmatically you can call .close(save=False) to avoid a save and a
modification bump. This will be useful for generating due counts.

Because of the new saving behaviour, the save and save as options will be
removed from the GUI in the future.

The q/a cache and field cache generating has been centralized. Facts will
automatically rebuild the cache on flush; models can do so with
model.updateCache().

Media handling has also been reworked. It has moved into a MediaRegistry
object, which the deck holds. Refcounting has been dropped - it meant we had
to compare old and new value every time facts or models were changed, and
existed for the sole purpose of not showing errors on a missing media
download. Instead we just media.registerText(q+a) when it's updated. The
download function will be expanded to ask the user if they want to continue
after a certain number of files have failed to download, which should be an
adequate alternative. And we now add the file into the media DB when it's
copied to th emedia directory, not when the card is commited. This fixes
duplicates a user would get if they added the same media to a card twice
without adding the card.

The old DeckStorage object had its upgrade code split in a previous commit;
the opening and upgrading code has been merged back together, and put in a
separate storage.py file. The correct way to open a deck now is import anki; d
= anki.Deck(path).

deck.getCard() -> deck.sched.getCard()
same with answerCard
deck.getCard(id) returns a Card object now.

And the DB wrapper has had a few changes:
- sql statements are a more standard DBAPI:
 - statement() -> execute()
 - statements() -> executemany()
- called like execute(sql, 1, 2, 3) or execute(sql, a=1, b=2, c=3)
- column0 -> list
2011-04-28 09:23:53 +09:00
Damien Elmes
8e40fdcb18 set the initial factor when card graduates, not when it's created 2011-04-28 09:23:29 +09:00
Damien Elmes
55f4b9b7d0 favour integers, change due representation, fact&card ordering, more
- removed 'created' column from various tables. We don't care when things like
  models are created, and card creation time didn't reflect the actual time a
  card was created
- facts were previously ordered by their creation date. The code would
  manually set the creation time for subsequent facts on import by 0.0001
  seconds, and then card due times were set by adding the fact time to the
  ordinal number*0.000001. This was prone to error, and the number of zeros used
  was actually different in different parts of the code. Instead of this, we
  replace it with a 'pos' column on facts, which increments for each new fact.
- importing should add new facts with a higher pos, but concurrent updates in
  a synced deck can have multiple facts with the same pos

- due times are completely different now, and depend on the card type
- new cards have due=fact.pos or random(0, 10000)
- reviews have due set to an integer representing days since deck
  creation/download
- cards in the learn queue use an integer timestamp in seconds

- many columns like modified, lastSync, factor, interval, etc have been converted to
  integer columns. They are cheaper to store (large decks can save 10s of
  megabytes) and faster to search for.

- cards have their group assigned on fact creation. In the future we'll add a
  per-template option for a default group.

- switch to due/random order for the review queue on upgrade. Users can still
  switch to the old behaviour if they want, but many people don't care what
  it's set to, and due is considerably faster, which may result in a better
  user experience
2011-04-28 09:23:28 +09:00
Damien Elmes
bb79b0e17c add new 'groups' concept, refactor deletions
Users who want to study small subsections at one time (eg, "lesson 14") are
currently best served by creating lots of little decks. This is because:
- selective study is a bit cumbersome to switch between
- the graphs and statitics are for the entire deck
- selective study can be slow on mobile devices - when the list of cards to
  hide/show is big, or when there are many due cards, performance can suffer
- scheduling can only be configured per deck

Groups are intended to address the above problems. All cards start off in the
same group, but they can have their group changed. Unlike tags, cards can only
be a member of a single group at once time. This allows us to divide the deck
up into a non-overlapping set of cards, which will make things like showing
due counts for a single category considerably cheaper. The user interface
might want to show something like a deck browser for decks that have more than
one group, showing due counts and allowing people to study each group
individually, or to study all at once.

Instead of storing the scheduling config in the deck or the model, we move the
scheduling into a separate config table, and link that to the groups table.
That way a user can have multiple groups that all share the same scheduling
information if they want.

And deletion tracking is now in a single table.
2011-04-28 09:23:28 +09:00