Remove contact harvesting from DB version 005 (version 0.1.1), allowing
the implementation of both ContactStoreImpl and ImapDB.Database to be
cleaned up. Remove contact collection from ImapDB.Folder, reducing the
number of Gee objects created when creating/merging messages. Finally,
remove the MessageAddresses object now that it is unused.
Some remarks:
* Note that Meson adds a hard dependency on Python 3.
* All dependencies and defines are now listed together.
* Some build files were put in their respective subdirectories, e.g. the Geary
engine library will be built from the Meson file in `src/engine`.
* `--fatal-warnings` is no longer an explicit flag, as Meson provides
`-Dwerror=true` for this.
* An explicit resource file needs to be used. The issue to support this from
Meson itself can be found at https://github.com/mesonbuild/meson/issues/706 .
* The `gnome.yelp()` function parses a LINGUAS file so we no longer need to keep
track of all languages in our build system.
* There are no Debian scripts defined in the meson.build files to keep them
clean, but they can be kept as separate scripts in `build-aux`.
* Left out the `dist` target, as there is now `ninja dist`.
* `geary-docs` is disabled by default, as valadoc-0.38.3 returns errors.
https://bugzilla.gnome.org/show_bug.cgi?id=777044
See the ticket (comment #2) for more information on the thinking and
strategy here, but in a nutshell this will remove from the Geary
database all emails no longer accessible via any folder and not seen
on the server in over 30 days. It also deletes those messages'
attachments and removes any empty directories in the attachment/
directory to prevent clutter. If enough messages are garbage
collected, Geary will vacuum the database at startup, which will
lower its disk footprint and reduce fragmentation, potentially
increasing performance.
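
The collection rule above can be sketched with Python's sqlite3 module.
This is a minimal illustration, not Geary's implementation: the schema,
the `last_seen` column, and the `vacuum_threshold` parameter are all
hypothetical names chosen for the example.

```python
import sqlite3

THIRTY_DAYS = 30 * 24 * 60 * 60  # seconds

# Hypothetical, simplified schema; the real Geary tables have more columns.
GC_SCHEMA = """
CREATE TABLE MessageTable (
    id INTEGER PRIMARY KEY,
    last_seen INTEGER NOT NULL          -- assumed: Unix time last seen on server
);
CREATE TABLE MessageLocationTable (
    id INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL,
    folder_id INTEGER NOT NULL
);
"""

def collect_garbage(db, now, vacuum_threshold=100):
    """Delete messages that no folder references and that were last seen
    more than 30 days before `now`. Vacuum if enough rows were removed."""
    cutoff = now - THIRTY_DAYS
    cur = db.execute(
        """
        DELETE FROM MessageTable
        WHERE last_seen < ?
          AND id NOT IN (SELECT message_id FROM MessageLocationTable)
        """,
        (cutoff,),
    )
    removed = cur.rowcount
    db.commit()
    if removed >= vacuum_threshold:
        # VACUUM cannot run inside a transaction, hence the commit above.
        db.execute("VACUUM")
    return removed
```

Attachment files and empty directories would be removed in a separate
step on disk; that part is omitted here.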
This introduces a new full-text search algorithm that attempts to
curb the effects of overstemming in the Porter Snowball stemmer.
The FTS table will be regenerated with this update.
The crux of this new algorithm is a configurable heuristic that
reduces stemmed matching. The configuration is not available via the
UI (I suspect it will only confuse users) but can be changed by power
users via GSettings. More information is available at:
https://wiki.gnome.org/Apps/Geary/FullTextSearchStrategy
Attachments without Content-Disposition are now generated and shown
in the client. This requires a database upgrade as well as rescanning
all messages to generate the previously missing attachments.
In addition, this upgrade now stores the attachments' Content-ID in
the database. This makes it much easier for the client to associate
a particular MIME section in the RFC822 message with an attachment in
the database and on disk.
1) Use docid instead of id in search table.
We had previously included an 'id INTEGER PRIMARY KEY' column in the
MessageSearchTable, assuming it would get the same rowid alias treatment
as it does in non-FTS tables. That assumption was wrong: it was being
created as an FTS column. This fixes it so we use docid everywhere.
To fix the old incorrect docid values, we simply blow away the search
table and let the natural search table population process, which now has
the correct docid insertion code, fix the problem.
This also removes the id column from the search table creation SQL, but
this will only affect new users. Upgraders will see an empty, vestigial
id column in their search table. Since SQLite doesn't easily let you
remove columns, it's just easier to ignore the column than go through
all the work to fix it.
2) Do as many rowid lookups as possible in batches, instead of doing
them individually in loops. This speeds up working with large sets of
email.
3) Rejigger indices on the MessageLocationTable to make certain queries
faster. This creates a new covering index in particular for the email
prefetcher, which previously had to sort using a temp table. The new
index should work in the general case too, as we should never be looking
at ordering without folder_id (and since folder_id comes first, it works
as an index on just folder_id, too).
4) For good measure, log all slow queries (> 1s execution time) to
debug output.
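
To illustrate point 3, here is a small sketch of a covering index with
folder_id as the leading column, checked with EXPLAIN QUERY PLAN. The
column list, index name, and query are assumptions made for the example;
the actual index definition lives in the SQL upgrade script.

```python
import sqlite3

# Hypothetical, simplified table and index for illustration only.
INDEX_SCHEMA = """
CREATE TABLE MessageLocationTable (
    id INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL,
    folder_id INTEGER NOT NULL,
    ordering INTEGER NOT NULL
);
CREATE INDEX MessageLocationTableFolderOrderingIndex
    ON MessageLocationTable (folder_id, ordering, message_id);
"""

# Assumed shape of a prefetcher-style query: newest first within one folder.
PREFETCH_QUERY = """
SELECT message_id, ordering
FROM MessageLocationTable
WHERE folder_id = ?
ORDER BY ordering DESC
"""

def query_plan(db, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = db.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(str(row[-1]) for row in rows)

if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript(INDEX_SCHEMA)
    print(query_plan(db, PREFETCH_QUERY, (1,)))
```

Because every column the query touches is in the index, SQLite can answer
it from the index alone and walk it in order, so no temporary sort table
is needed. A query that filters on folder_id alone can use the same index
through its leading column.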
Closes: bgo #725929
Turns out for long-running upgrades we were running them all in
parallel, which thrashes the disk pretty hard. This adds a simple mutex
lock around each upgrade, so at least the computer is usable while it's
going on. A more robust solution would be to have a single-thread
thread pool where we enqueue upgrades, but that's too much change this
late in the release cycle.
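
A rough sketch of the idea, in Python rather than Vala: each upgrade
still runs on its own thread, but a shared lock lets only one of them
touch the disk at a time. The function names and the log list are
hypothetical and exist only for this example.

```python
import threading

# One process-wide lock shared by every account's upgrade.
_upgrade_lock = threading.Lock()

def run_upgrade(name, upgrade, log):
    """Run a single upgrade while holding the shared lock."""
    with _upgrade_lock:
        log.append(("start", name))
        upgrade()
        log.append(("end", name))

def run_all(upgrades):
    """Start one thread per (name, callable) pair and wait for all of them.
    Returns the event log, in which upgrades never interleave."""
    log = []
    threads = [
        threading.Thread(target=run_upgrade, args=(name, fn, log))
        for name, fn in upgrades
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

The single-thread pool mentioned above would give the same serialization
while also making the order of upgrades explicit.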
Also it turns out that the nullifying of the internaldate_time_t column
before we repopulate it was very costly, and unnecessary. So, this also
should speed things up for upgrading users.
Closes: bgo #724475
We had a bug in our DateTime to time_t conversion logic where all
time_ts would end up in the year 3800. This fixes that, and repopulates
the internaldate_time_t column with the new, correct time_t values.
Closes: bgo #724335
This adds the ability for Geary to push sent mail up to the account's
Sent Mail folder (if available). There's an accompanying account option
that defaults to on (meaning: push sent mail).
The current implementation will leave messages in the Outbox (though
they won't be sent again) if they fail to be pushed to Sent Mail. This
isn't the best solution, but it at least means you have a way of seeing
the problem and hopefully copying the data elsewhere manually if you
need to save it.
Note that Geary might not always recognize an account's Sent Mail
folder. This is the case for any "Other" accounts that don't support
the "special use" or "xlist" IMAP extensions. In this case, Geary will
either throw an error and leave messages in the Outbox, or erase the
message from the Outbox when it's sent, depending on the value of the
account's save sent mail option. Better support for detecting the Sent
Mail folder in every case is coming soon.
Closes: bgo #713263
This speeds up startup time immensely, probably due to it matching
the filesystem's or Linux memory management's page size. It's also
expected that this will improve database performance in other ways,
as the default was 1K, meaning potentially more I/O than necessary
for standard operations.
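
As a hedged illustration of changing SQLite's page size: the message
above does not state the new value, so the 4096 default below is only an
assumption based on the common filesystem and memory page size. The
function names and the table are hypothetical.

```python
import sqlite3

def create_database(path, page_size=4096):
    """Create a database file with the given page size.
    The pragma must be issued before the first table is created."""
    db = sqlite3.connect(path)
    db.execute(f"PRAGMA page_size = {int(page_size)}")
    db.execute("CREATE TABLE IF NOT EXISTS MessageTable (id INTEGER PRIMARY KEY)")
    db.commit()
    return db

def current_page_size(db):
    """Return the page size the database is actually using."""
    return db.execute("PRAGMA page_size").fetchone()[0]
```

For a database that already has content, setting the pragma only takes
effect once the file is rebuilt with VACUUM.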
Conflicts:
sql/version-010.sql
src/client/folder-list/folder-list-folder-entry.vala
src/engine/rfc822/rfc822-message.vala
Also, I had to manually fix some compile errors introduced due to
interfaces changing on master.
This caps the search results at 1000 emails, due to our unfortunate
requirement of constructing an object for each search result. A better
way to proceed here would be to do the search only as items were loaded
in the SearchFolder, but that gets complicated when the search phrase
gets updated.
This is a limited implementation, so please back up your database before
running this search feature branch from now on as we may change things.
It's using a Unicode Snowball stemming tokenizer available from
https://github.com/littlesavage/sqlite3-unicodesn, also handily
available in src/sqlite3-unicodesn in Geary. If you want to look at the
search tables on the command line, cd into the unicodesn source folder,
run make and make install, then load sqlite3 like:
    sqlite3 -cmd '.load unicodesn.sqlext' /path/to/geary.db
This introduces a background account synchronizer into Geary that
prefetches email folder-by-folder to a user-configurable epoch. The
current default is 15 days.
Additional work to make this user-visible is coming, in particular with
The primary purpose for this feature is to allow "full" conversations
(#4293), which needs more of the mailbox stored locally to do searching.
Previously, we were taking folder names as they came off the wire.
Turns out IMAP specifies that folder names with 8 bit code points are
encoded in a crazy scheme unique to IMAP. Now, we properly decode that
scheme to the correct UTF-8 folder names to be displayed to the user.
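
The scheme in question is the modified UTF-7 encoding of RFC 3501,
section 5.1.3. A minimal decoder sketch in Python follows; it is not the
Vala code in Geary, and the function name is made up for the example.

```python
import base64

def decode_imap_utf7(name: str) -> str:
    """Decode an IMAP modified UTF-7 mailbox name to a Python str.

    '&-' stands for a literal '&'. Any other '&...-' run is base64
    (with ',' in place of '/' and no padding) of UTF-16BE text.
    Raises ValueError if a shift sequence is not terminated by '-'.
    """
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch != "&":
            out.append(ch)
            i += 1
            continue
        end = name.index("-", i)          # end of the shifted run
        chunk = name[i + 1:end]
        if chunk == "":
            out.append("&")               # "&-" is an escaped ampersand
        else:
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)  # restore stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)

if __name__ == "__main__":
    print(decode_imap_utf7("Entw&APw-rfe"))
```
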
There's also now a database upgrade path that converts all existing
mailboxes to the decoded version, so your existing database should just
keep working.
It is done.
Initial implementation of the new database subsystem
These pieces represent the foundation for ticket #5034
Expanded transactions, added VersionedDatabase
Further expansions of the async code.
Moved async pool logic into Database, where it realistically
belongs.
Further improvements. Introduced geary-db-test.
Added SQL create and update files for Geary.Db
version-001 to version-003 are exact copies of the SQLHeavy scripts
to ensure no slight changes when migrating. version-004 upgrades
the database to remove the ImapFolderPropertiesTable and
ImapMessagePropertiesTable, now that the database code is pure
IMAP.
When we support other messaging systems (such as POP3), those
subsystems will need to code their own database layers OR rely on
the IMAP schema and simply ignore the IMAP-specific fields.
ImapDB.Account fleshed out
ImapDB.Folder is commented out, however. Need to port next.
ImapDB.Folder fleshed out
MessageTable, MessageLocationTable, and AttachementTable are now
handled inside ImapDB.Folder.
chmod -x imap-db-database.vala
OutboxEmailIdentifier/Properties -> SmtpOutboxEmailIdentifier/Properties
Moved SmtpOutboxFolderRoot into its own source file
SmtpOutboxFolder ported to new database code
Move Engine implementations to ImapDB.
Integration and cleanup of new database code with main source
This commit performs the final integration steps to move Geary
completely over to the new database model. This also cleans out
the old SQLHeavy-based code and fixes a handful of small bugs that
were detected during basic test runs.
Moved Outbox to ImapDB
As the Outbox is tied to the database that ImapDB runs, move the
Outbox code into that folder.
Outbox fixes and better parameter checking
Bumped Database thread pool count and made them exclusive
My reasoning is that there may be a need for a lot of threads at
once (when a big batch of commands comes in, especially at
startup). If performance looks ok, we might consider relaxing
this later.
Squashed commit of many patches that merged Eric's outbox patch
as well as additional changes to upgrade the database rather than
require it be wiped and some refactoring suggested by the Outbox
implementation. Also updated Outbox to be fully atomic via
Transactions.
Before we were fetching the entire message body (including attachments) to get the
preview text. This patch now offers the ability to fetch a small (128 byte) preview
of the email.
Also, since this ticket is about speeding up performance, I've introduced NonblockingBatch,
which allows for multiple async operations to be executed in parallel easily. I've added
its use in a few places to speed up operations, including one that was causing the lag
in #3799, which is why this commit closes that ticket.
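
The batching idea can be sketched with Python's asyncio. This is an analogy only and does not mirror the NonblockingBatch API; `fetch_preview` and `run_batch` are hypothetical names, and the sleep stands in for a network or database round trip.

```python
import asyncio

async def fetch_preview(email_id, delay=0.01):
    """Stand-in for one async operation, e.g. fetching a message preview."""
    await asyncio.sleep(delay)
    return f"preview-{email_id}"

async def run_batch(coroutines):
    """Run all operations concurrently and wait for every one to finish.
    Results come back in the order the operations were added."""
    return await asyncio.gather(*coroutines)

if __name__ == "__main__":
    results = asyncio.run(run_batch([fetch_preview(i) for i in (1, 2, 3)]))
    print(results)
```

The total wait is roughly that of the slowest operation rather than the sum of all of them, which is where the speedup comes from.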
This adds support for retrieving partial header and body blocks straight from the email, and
therefore support to pull the References header from a message (which, for some reason, IMAP
doesn't support or include in the FETCH ENVELOPE command). This is necessary for email conversations (#3808).
This required a change to the database schema, meaning old databases will need to be blown
away before starting.
Needed to rethink storage strategies as I researched this and realized that a truly sparse database -- where the database is sparsely populated both in columns and rows -- is not feasible due to IMAP's UID rules. The strategy now means that the database rows are contiguous from the highest (newest) message to the oldest *requested by the user*. This is a better situation than having to download the UID for the entire folder.
This commit adds support for IMAP-specific properties, of which UIDValidity is crucial toward completing #3805. The additional code is to integrate these tables into the SQLite Geary backend and to make sure this information is requested from the IMAP server.
NOTE: This commit changes the database schema. Old databases will need to be blown away before running.
This completes the heavy lifting of persisting messages locally. The strategy is that the local database may be sparsely populated, both in the availability of messages in a folder and the fields of a message that is partially stored. As data is pulled from the remote server it's always stored in the database. Future requests will always go to the database first, preventing unnecessary network traffic.
Also, this patch will detect when a message is stored in multiple folders on the server. The database uses soft links from the folder to the message, so the message is stored only once in the database. This technique relies heavily on the availability and validity of the Message-ID header, but we expect this to be reliable the vast majority of the time.
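
A minimal sketch of the soft-link layout, assuming a simplified two-table schema keyed on the Message-ID header. The table layout, column names, and `store_message` helper are illustrative assumptions, not the schema shipped in this commit.

```python
import sqlite3

# Hypothetical schema: one row per message, one link row per folder it appears in.
LINK_SCHEMA = """
CREATE TABLE MessageTable (
    id INTEGER PRIMARY KEY,
    message_id_header TEXT NOT NULL UNIQUE
);
CREATE TABLE MessageLocationTable (
    id INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL REFERENCES MessageTable (id),
    folder_id INTEGER NOT NULL,
    UNIQUE (message_id, folder_id)
);
"""

def store_message(db, folder_id, message_id_header):
    """Store a message once and link it to the folder.
    Returns the MessageTable row id, reused if the Message-ID is known."""
    row = db.execute(
        "SELECT id FROM MessageTable WHERE message_id_header = ?",
        (message_id_header,),
    ).fetchone()
    if row is None:
        row_id = db.execute(
            "INSERT INTO MessageTable (message_id_header) VALUES (?)",
            (message_id_header,),
        ).lastrowid
    else:
        row_id = row[0]
    # The link is the only per-folder data; a duplicate link is ignored.
    db.execute(
        "INSERT OR IGNORE INTO MessageLocationTable (message_id, folder_id) "
        "VALUES (?, ?)",
        (row_id, folder_id),
    )
    db.commit()
    return row_id
```

A message with a missing or duplicated Message-ID header would defeat this lookup, which is the reliability caveat noted above.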
This iteration now stores headers locally and fetches them first before going to the network. Work done in the database to deal with IMAPisms. More work on the GMime bindings (couple of mistakes in prior commit).
Much of the API between the local and net stores had to be reworked for consistency as well as planning ahead for how messages will be retrieved and stored efficiently. This work also attempts to keep in mind that other mail sources (POP, etc.) may be required in the future, and hopefully can be added without major rework.
This large diff represents a growth of the architecture to persist IMAP data as it's downloaded. When listing folders, a local database is consulted first to immediately feed to the caller. In the background, network calls fetch the "real" list. The two are collated for differences, which are reported via signals to the caller, who is then responsible for updating the user interface appropriately. No other synchronization work is represented in this diff.
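
The collation step can be sketched as a set difference between the two listings. This is an illustration in Python with a hypothetical function name; in the real code the two result lists would be delivered to the caller through signals.

```python
def collate_folders(local, remote):
    """Compare the locally stored folder list with the one from the server.

    Returns (added, removed): folders present only on the server, and
    folders present only in the local database, each sorted by name.
    """
    local_set, remote_set = set(local), set(remote)
    added = sorted(remote_set - local_set)
    removed = sorted(local_set - remote_set)
    return added, removed

if __name__ == "__main__":
    added, removed = collate_folders(
        ["INBOX", "Drafts", "Old"],
        ["INBOX", "Drafts", "Sent"],
    )
    print("added:", added)
    print("removed:", removed)
```
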
Note that this breaks functionality: when a folder is selected, no messages appear in the message list. Fixing this requires more work, and this patch was already large enough. It's ticketed here: #3741