geary/sql/version-030.sql at c8f50f1cb6273a4028475c0e2331a91e74d4a34d - nettika/geary

Michael Gratton 7e38198287 ImapDb.Database: Register new ICU-based tokeniser for FTS

The built-in SQLite tokeniser does not handle scripts that do not use
spaces for word breaking (CJK, Thai, etc.), so searching in those
languages does not work well.

This adds a custom SQLite tokeniser based on ICU that breaks words for
all languages supported by that library, and uses NFKC_Casefold to
handle normalisation, case folding, and the dropping of ignorable
characters.

Fixes #121

2021-01-19 20:48:59 +11:00



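--
-- Convert full-text search from FTS3/4 to FTS5
--
-- The table is tokenised with "geary_tokeniser", the custom ICU-based
-- tokeniser registered by ImapDb.Database, so that tokeniser must be
-- registered on the connection before this statement runs.
-- Prefix indexes are kept for 2, 4, 6, 8 and 10 character prefixes.
--
DROP TABLE IF EXISTS MessageSearchTable;
CREATE VIRTUAL TABLE MessageSearchTable USING fts5(
    body,
    attachments,
    subject,
    "from",
    receivers,
    cc,
    bcc,
    flags,
    tokenize="geary_tokeniser",
    prefix="2,4,6,8,10"
);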
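
To see how a table with this shape behaves, the following is a minimal sketch using Python's standard `sqlite3` module. It assumes the SQLite library behind Python was built with FTS5. Because `geary_tokeniser` is only available once Geary has registered it, the sketch swaps in SQLite's built-in `unicode61` tokeniser as a stand-in, so it does not reproduce the ICU word breaking described in the commit message. The helper names `build_search_table` and `search`, and the sample rows, are hypothetical and not part of Geary.

```python
import sqlite3


def build_search_table(conn):
    """Create an FTS5 table with the same columns and prefix option.

    Assumption: "unicode61" stands in for "geary_tokeniser", which only
    exists after Geary registers its custom tokeniser on the connection.
    """
    conn.execute("DROP TABLE IF EXISTS MessageSearchTable")
    conn.execute(
        """
        CREATE VIRTUAL TABLE MessageSearchTable USING fts5(
            body,
            attachments,
            subject,
            "from",
            receivers,
            cc,
            bcc,
            flags,
            tokenize="unicode61",
            prefix="2,4,6,8,10"
        )
        """
    )


def search(conn, query):
    """Return the rowids of messages matching an FTS5 query string."""
    rows = conn.execute(
        "SELECT rowid FROM MessageSearchTable "
        "WHERE MessageSearchTable MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build_search_table(conn)

# Hypothetical sample messages; "from" is an SQL keyword and must be quoted.
insert_sql = (
    'INSERT INTO MessageSearchTable (rowid, subject, body, "from") '
    "VALUES (?, ?, ?, ?)"
)
conn.execute(insert_sql, (1, "Quarterly report", "Figures attached", "alice@example.com"))
conn.execute(insert_sql, (2, "Lunch", "Reporting back on the venue", "bob@example.com"))

# A four-character prefix query, which the prefix="...4..." index can serve.
prefix_hits = search(conn, "repo*")

# The same prefix query restricted to the subject column.
subject_hits = search(conn, "subject : repo*")

print(prefix_hits, subject_hits)
```

With the stand-in tokeniser, the unrestricted prefix query matches both sample rows ("report" in one subject, "Reporting" in the other body), while the column-restricted query matches only the first. Results for scripts without space-delimited words would differ under the real ICU-based tokeniser.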