The SQLite tokeniser does not handle scripts that do not separate words with spaces (CJK, Thai, etc.), so searching in those languages works poorly. This adds a custom SQLite tokeniser based on ICU that breaks words for all languages supported by that library, and applies NFKC_Casefold to handle normalisation, case folding, and dropping of ignorable characters. Fixes #121

| File |
|---|
| meson.build |
| version-001.sql |
| version-002.sql |
| version-003.sql |
| version-004.sql |
| version-005.sql |
| version-006.sql |
| version-007.sql |
| version-008.sql |
| version-009.sql |
| version-010.sql |
| version-011.sql |
| version-012.sql |
| version-013.sql |
| version-014.sql |
| version-015.sql |
| version-016.sql |
| version-017.sql |
| version-018.sql |
| version-019.sql |
| version-020.sql |
| version-021.sql |
| version-022.sql |
| version-023.sql |
| version-024.sql |
| version-025.sql |
| version-026.sql |
| version-027.sql |
| version-028.sql |
| version-029.sql |
| version-030.sql |
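
As a rough illustration of what the NFKC_Casefold step mentioned in the commit message does to text before it is indexed, here is a minimal Python sketch. It is not the tokeniser added by this commit (which is based on ICU); the function name `approx_nfkc_casefold` is hypothetical, and the standard-library approximation below (strip format characters, NFKC, case fold, NFKC again) only stands in for the real NFKC_Casefold mapping, which it does not match for every code point.

```python
import unicodedata


def approx_nfkc_casefold(text: str) -> str:
    """Approximate Unicode NFKC_Casefold using only the standard library.

    Assumption: characters in general category Cf (format characters such
    as the soft hyphen or zero-width joiner) are used as a rough stand-in
    for the Default_Ignorable_Code_Point set that NFKC_Casefold removes.
    """
    # Drop format characters, which should not affect search matching.
    stripped = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # Compatibility-normalise, then apply full case folding.
    folded = unicodedata.normalize("NFKC", stripped).casefold()
    # Case folding can produce text that is no longer normalised, so normalise again.
    return unicodedata.normalize("NFKC", folded)


if __name__ == "__main__":
    for sample in ["ＡＢＣ", "Straße", "co\u00adoperate", "\ufb01le"]:
        print(repr(sample), "->", repr(approx_nfkc_casefold(sample)))
```

With this kind of folding, a query and the indexed text compare equal even when they differ in case, width variants, ligatures, or invisible formatting characters. Word breaking for scripts without spaces is a separate step and is not covered by this sketch.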