Fixed word frequency issues causing wrong suggestions order (#164)

* all suggestions are now ordered by length, then by frequency

* word frequency is normalized to 255, instead of to 5; normalization now makes sense

* only maxed out languages are normalized, not all

* all words are normalized at once, instead of only the one that has reached the limit

* normalization now happens on start up, instead of using a trigger

* fixed word frequency not updating when a punctuation mark is appended at the end, for example: 'try,'

* switched the positions of ; and :

* updated documentation
Dimo Karaivanov 2023-01-31 18:14:01 +02:00 committed by GitHub
parent cfe81462e0
commit f6c51d9304
8 changed files with 78 additions and 27 deletions
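
The new suggestion order from the commit message (shorter key sequences first, then higher frequency) can be sketched in plain Java. This is only an illustration: `SuggestionOrderSketch` and `Candidate` are hypothetical names, not classes from the project, and the frequencies are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not part of the TT9 code base.
class SuggestionOrderSketch {
    // Stand-in for one row of the `words` table.
    static final class Candidate {
        final String word;
        final String seq;
        final int freq;

        Candidate(String word, String seq, int freq) {
            this.word = word;
            this.seq = seq;
            this.freq = freq;
        }
    }

    // Mirrors "ORDER BY LENGTH(seq) ASC, freq DESC, seq ASC".
    static final Comparator<Candidate> NEW_ORDER = (a, b) -> {
        int byLength = Integer.compare(a.seq.length(), b.seq.length());
        if (byLength != 0) return byLength;
        int byFreq = Integer.compare(b.freq, a.freq); // descending
        if (byFreq != 0) return byFreq;
        return a.seq.compareTo(b.seq);
    };

    // Returns the words in the order they would be suggested.
    static List<String> order(List<Candidate> candidates) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(NEW_ORDER);
        List<String> words = new ArrayList<>();
        for (Candidate c : sorted) {
            words.add(c.word);
        }
        return words;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
            new Candidate("hoods", "46637", 900), // longer sequence, very frequent
            new Candidate("good", "4663", 40),
            new Candidate("home", "4663", 75)
        );
        System.out.println(order(candidates));
    }
}
```

Under the previous ordering (frequency first), the longer but very frequent "hoods" would be suggested before the exact-length matches. With the new ordering it comes last.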

View file

@ -57,7 +57,6 @@ To support a new language one needs to:
- `ID` must be the next available number. - `ID` must be the next available number.
- Set `isPunctuationPartOfWords` to `true`, if you need to use the 1-key for typing words, such as: `it's`, `a'tje` or `п'ят`. Otherwise, it would not be possible to type them, nor will they appear as suggestions. `false` will allow faster typing when apostrophes or other punctuation are not part of the words. - Set `isPunctuationPartOfWords` to `true`, if you need to use the 1-key for typing words, such as: `it's`, `a'tje` or `п'ят`. Otherwise, it would not be possible to type them, nor will they appear as suggestions. `false` will allow faster typing when apostrophes or other punctuation are not part of the words.
- Add the new language to the list in `LanguageCollection.java`. You only need to add it in one place, in the constructor. Please, be nice and maintain the alphabetical order. - Add the new language to the list in `LanguageCollection.java`. You only need to add it in one place, in the constructor. Please, be nice and maintain the alphabetical order.
- Optionally, translate Traditional T9 in your language, by adding `res/values-your-lang/strings.xml`. The Android Studio translation editor is very handy.
### Dictionary Formats ### Dictionary Formats
@ -101,6 +100,10 @@ fifth 3
... ...
``` ```
## Translating the UI
To translate the Traditional T9 menus and messages into your language, add `res/values-your-lang/strings.xml`, then use the Android Studio translation editor, which is very handy.
Alternatively, if you don't have Android Studio, you could just use `res/values/strings.xml` as a reference and translate all strings in your file, skipping the ones that have the `translatable="false"` attribute.
## Contribution Process ## Contribution Process

View file

@ -1,5 +1,5 @@
# Traditional T9 # Traditional T9
TT9 is an IME (Input Method Editor) for Android devices with a hardware keypad. It supports multiple languages and predictive text typing. TT9 is an IME (Input Method Editor) for Android devices with a hardware keypad. It supports [multiple languages](src/io/github/sspanak/tt9/languages/definitions) and predictive text typing.
This is an updated version of the [original project](https://github.com/Clam-/TraditionalT9) by Clam-. This is an updated version of the [original project](https://github.com/Clam-/TraditionalT9) by Clam-.
@ -26,7 +26,7 @@ So make sure to read the initial setup and the hotkey tips in the [user manual](
## Contributing to the Project ## Contributing to the Project
As with many other open-source projects, this one is also maintained by its author in his free time. Any help in making Traditional T9 better will be highly appreciated. Here is what you could do: As with many other open-source projects, this one is also maintained by its author in his free time. Any help in making Traditional T9 better will be highly appreciated. Here is what you could do:
- [Report bugs](https://github.com/sspanak/tt9/issues) or other unusual behavior on different phones. Currently, the only testing and development device is: Qin F21 Pro+ / Android 11. - [Report bugs](https://github.com/sspanak/tt9/issues) or other unusual behavior on different phones. Currently, the only testing and development device is: Qin F21 Pro+ / Android 11.
- Add [a new language](CONTRIBUTING.md#adding-a-new-language), [new UI translations](res/values/strings.xml) or simply fix a spelling mistake. If you have minimum technical knowledge, your skills as a native speaker will be of great use. Or, if you are not tech-savvy, just [open a new issue](https://github.com/sspanak/tt9/issues) and write the correct translations there. - Add [a new language](CONTRIBUTING.md#adding-a-new-language), [new UI translations](CONTRIBUTING.md#translating-the-ui) or simply fix a spelling mistake. The process is very simple and even with minimum technical knowledge, your skills as a native speaker will be of great use. Or, if you are not tech-savvy, just [open a new issue](https://github.com/sspanak/tt9/issues) and put the correct translations there.
- Experienced developers who are willing to fix a bug, or maybe create a brand new feature, see the [Contribution Guide](CONTRIBUTING.md). - Experienced developers who are willing to fix a bug, or maybe create a brand new feature, see the [Contribution Guide](CONTRIBUTING.md).
Your PRs are welcome! Your PRs are welcome!

View file

@ -18,28 +18,16 @@ import io.github.sspanak.tt9.Logger;
import io.github.sspanak.tt9.ime.TraditionalT9; import io.github.sspanak.tt9.ime.TraditionalT9;
import io.github.sspanak.tt9.languages.InvalidLanguageException; import io.github.sspanak.tt9.languages.InvalidLanguageException;
import io.github.sspanak.tt9.languages.Language; import io.github.sspanak.tt9.languages.Language;
import io.github.sspanak.tt9.preferences.SettingsStore;
public class DictionaryDb { public class DictionaryDb {
private static T9RoomDb dbInstance; private static T9RoomDb dbInstance;
private static final RoomDatabase.Callback TRIGGER_CALLBACK = new RoomDatabase.Callback() { private static final RoomDatabase.Callback DROP_NORMALIZATION_TRIGGER = new RoomDatabase.Callback() {
@Override
public void onCreate(@NonNull SupportSQLiteDatabase db) {
super.onCreate(db);
db.execSQL(
"CREATE TRIGGER IF NOT EXISTS normalize_freq " +
" AFTER UPDATE ON words " +
" WHEN NEW.freq > 50000 " +
" BEGIN" +
" UPDATE words SET freq = freq / 10000 " +
" WHERE seq = NEW.seq; " +
"END;"
);
}
@Override @Override
public void onOpen(@NonNull SupportSQLiteDatabase db) { public void onOpen(@NonNull SupportSQLiteDatabase db) {
super.onOpen(db); super.onOpen(db);
db.execSQL("DROP TRIGGER IF EXISTS normalize_freq");
} }
}; };
@ -48,7 +36,7 @@ public class DictionaryDb {
if (dbInstance == null) { if (dbInstance == null) {
context = context == null ? TraditionalT9.getMainContext() : context; context = context == null ? TraditionalT9.getMainContext() : context;
dbInstance = Room.databaseBuilder(context, T9RoomDb.class, "t9dict.db") dbInstance = Room.databaseBuilder(context, T9RoomDb.class, "t9dict.db")
.addCallback(TRIGGER_CALLBACK) .addCallback(DROP_NORMALIZATION_TRIGGER) // @todo: Remove trigger dropping after December 2023. Assuming everyone would have upgraded by then.
.build(); .build();
} }
} }
@ -65,6 +53,34 @@ public class DictionaryDb {
} }
/**
* normalizeWordFrequencies
* Normalizes the word frequencies for all languages that have reached the maximum, as defined in
* the settings.
*
* This query will finish immediately, if there is nothing to do. It's safe to run it often.
*
*/
public static void normalizeWordFrequencies(SettingsStore settings) {
new Thread() {
@Override
public void run() {
long time = System.currentTimeMillis();
int affectedRows = dbInstance.wordsDao().normalizeFrequencies(
settings.getWordFrequencyNormalizationDivider(),
settings.getWordFrequencyMax()
);
Logger.d(
"db.normalizeWordFrequencies",
"Normalized " + affectedRows + " words in: " + (System.currentTimeMillis() - time) + " ms"
);
}
}.start();
}
public static void runInTransaction(Runnable r) { public static void runInTransaction(Runnable r) {
getInstance().runInTransaction(r); getInstance().runInTransaction(r);
} }
@ -165,10 +181,25 @@ public class DictionaryDb {
public void run() { public void run() {
try { try {
int affectedRows = getInstance().wordsDao().incrementFrequency(language.getId(), word, sequence); int affectedRows = getInstance().wordsDao().incrementFrequency(language.getId(), word, sequence);
// In case the user has changed the text case, there would be no match.
// Try again with the lowercase equivalent.
String lowercaseWord = "";
if (affectedRows == 0) { if (affectedRows == 0) {
// If the user has changed the case manually, so there would be no matching word. lowercaseWord = word.toLowerCase(language.getLocale());
// In this case, try again with the lowercase equivalent. affectedRows = getInstance().wordsDao().incrementFrequency(language.getId(), lowercaseWord, sequence);
affectedRows = getInstance().wordsDao().incrementFrequency(language.getId(), word.toLowerCase(language.getLocale()), sequence);
Logger.d("incrementWordFrequency", "Attempting to increment frequency for lowercase variant: " + lowercaseWord);
}
// Some languages permit appending the punctuation to the end of the words, like so: "try,".
// But there are no such words in the dictionary, so try without the punctuation mark.
if (affectedRows == 0 && language.isPunctuationPartOfWords() && sequence.endsWith("1")) {
String truncatedWord = lowercaseWord.substring(0, lowercaseWord.length() - 1); // use the lowercase length; toLowerCase() may change the length in some locales
String truncatedSequence = sequence.substring(0, sequence.length() - 1);
affectedRows = getInstance().wordsDao().incrementFrequency(language.getId(), truncatedWord, truncatedSequence);
Logger.d("incrementWordFrequency", "Attempting to increment frequency with stripped punctuation: " + truncatedWord);
} }
Logger.d("incrementWordFrequency", "Affected rows: " + affectedRows); Logger.d("incrementWordFrequency", "Affected rows: " + affectedRows);
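
The lookup fallbacks in this hunk (exact word, then lowercase, then lowercase without the trailing punctuation mark) can be tried in isolation with the rough sketch below. It assumes an in-memory map in place of the Room DAO; `FrequencyFallbackSketch` and all of its members are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; the real code calls WordsDao.incrementFrequency() instead.
class FrequencyFallbackSketch {
    // In-memory stand-in for the `words` table, keyed by "word|sequence".
    private final Map<String, Integer> freqByWordAndSeq = new HashMap<>();

    void add(String word, String sequence, int freq) {
        freqByWordAndSeq.put(word + "|" + sequence, freq);
    }

    int frequencyOf(String word, String sequence) {
        Integer freq = freqByWordAndSeq.get(word + "|" + sequence);
        return freq == null ? -1 : freq;
    }

    // Returns the number of matched rows (0 or 1), like an UPDATE row count.
    private int increment(String word, String sequence) {
        String key = word + "|" + sequence;
        Integer current = freqByWordAndSeq.get(key);
        if (current == null) {
            return 0;
        }
        freqByWordAndSeq.put(key, current + 1);
        return 1;
    }

    int incrementWithFallbacks(String word, String sequence, Locale locale, boolean punctuationPartOfWords) {
        // 1. Exact match.
        int affected = increment(word, sequence);

        // 2. The user may have changed the text case; retry in lowercase.
        String lowercaseWord = word.toLowerCase(locale);
        if (affected == 0) {
            affected = increment(lowercaseWord, sequence);
        }

        // 3. A punctuation mark typed with the 1-key may be appended, e.g. "try,".
        if (affected == 0 && punctuationPartOfWords
                && sequence.endsWith("1") && !lowercaseWord.isEmpty()) {
            String truncatedWord = lowercaseWord.substring(0, lowercaseWord.length() - 1);
            String truncatedSequence = sequence.substring(0, sequence.length() - 1);
            affected = increment(truncatedWord, truncatedSequence);
        }

        return affected;
    }

    public static void main(String[] args) {
        FrequencyFallbackSketch db = new FrequencyFallbackSketch();
        db.add("try", "879", 5);
        int affected = db.incrementWithFallbacks("Try,", "8791", Locale.ENGLISH, true);
        System.out.println(affected + " row(s), freq = " + db.frequencyOf("try", "879"));
    }
}
```

Here "Try," with sequence `8791` matches neither as typed nor in lowercase, so the third step strips the last character and the last digit and updates the entry for "try".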

View file

@ -34,7 +34,7 @@ interface WordsDao {
"lang = :langId " + "lang = :langId " +
"AND seq > :sequence AND seq <= :sequence || '99' " + "AND seq > :sequence AND seq <= :sequence || '99' " +
"AND (:word IS NULL OR word LIKE :word || '%') " + "AND (:word IS NULL OR word LIKE :word || '%') " +
"ORDER BY freq DESC, LENGTH(seq) ASC, seq ASC " + "ORDER BY LENGTH(seq) ASC, freq DESC, seq ASC " +
"LIMIT :limit" "LIMIT :limit"
) )
List<Word> getFuzzy(int langId, int limit, String sequence, String word); List<Word> getFuzzy(int langId, int limit, String sequence, String word);
@ -51,4 +51,16 @@ interface WordsDao {
"WHERE lang = :langId AND word = :word AND seq = :sequence" "WHERE lang = :langId AND word = :word AND seq = :sequence"
) )
int incrementFrequency(int langId, String word, String sequence); int incrementFrequency(int langId, String word, String sequence);
@Query(
"UPDATE words " +
"SET freq = freq / :normalizationDivider " +
"WHERE lang IN ( " +
"SELECT lang " +
"FROM words " +
"WHERE freq >= :maxFrequency " +
"GROUP BY lang" +
")"
)
int normalizeFrequencies(int normalizationDivider, int maxFrequency);
} }
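
To make the effect of the `normalizeFrequencies` query easier to follow, here is a small in-memory simulation. It is a sketch under assumptions, not the project's code: `NormalizationSketch` and `Row` are hypothetical, and the values 25500 and 100 are the ones returned by `getWordFrequencyMax()` and `getWordFrequencyNormalizationDivider()` in this commit.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of what the UPDATE ... WHERE lang IN (SELECT ...) query does.
class NormalizationSketch {
    // Stand-in for one row of the `words` table.
    static final class Row {
        final int lang;
        final String word;
        int freq;

        Row(int lang, String word, int freq) {
            this.lang = lang;
            this.word = word;
            this.freq = freq;
        }
    }

    // Returns the number of affected rows, like the DAO method.
    static int normalize(List<Row> rows, int divider, int maxFrequency) {
        // Subquery: languages with at least one word at or above the maximum.
        Set<Integer> maxedOutLanguages = new HashSet<>();
        for (Row row : rows) {
            if (row.freq >= maxFrequency) {
                maxedOutLanguages.add(row.lang);
            }
        }

        // Update: divide every word of those languages (integer division).
        int affectedRows = 0;
        for (Row row : rows) {
            if (maxedOutLanguages.contains(row.lang)) {
                row.freq = row.freq / divider;
                affectedRows++;
            }
        }
        return affectedRows;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row(1, "the", 25500), // language 1 has reached the maximum
            new Row(1, "of", 1200),
            new Row(2, "de", 300)     // language 2 has not, so it is left alone
        );
        int affected = normalize(rows, 100, 25500);
        System.out.println("Normalized " + affected + " words");
        for (Row row : rows) {
            System.out.println(row.lang + " " + row.word + " " + row.freq);
        }
    }
}
```

Because all words of a maxed-out language are divided at once, their relative order is preserved, and the top word lands on 25500 / 100 = 255. A second run finds nothing at or above the maximum and affects no rows, which is why it is safe to call on every start-up.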

View file

@ -68,6 +68,7 @@ public class TraditionalT9 extends KeyPadHandler {
self = this; self = this;
DictionaryDb.init(this); DictionaryDb.init(this);
DictionaryDb.normalizeWordFrequencies(settings);
if (softKeyHandler == null) { if (softKeyHandler == null) {
softKeyHandler = new SoftKeyHandler(this); softKeyHandler = new SoftKeyHandler(this);

View file

@ -8,7 +8,7 @@ import java.util.Arrays;
public class Punctuation { public class Punctuation {
final public static ArrayList<String> Main = new ArrayList<>(Arrays.asList( final public static ArrayList<String> Main = new ArrayList<>(Arrays.asList(
",", ".", "-", "(", ")", "[", "]", "&", "~", "`", "'", ":", ";", "\"", "!", "?" ",", ".", "-", "(", ")", "[", "]", "&", "~", "`", "'", ";", ":", "\"", "!", "?"
)); ));
final public static ArrayList<String> Secondary = new ArrayList<>(Arrays.asList( final public static ArrayList<String> Secondary = new ArrayList<>(Arrays.asList(

View file

@ -23,11 +23,12 @@ public class PreferencesActivity extends AppCompatActivity implements Preference
@Override @Override
protected void onCreate(Bundle savedInstanceState) { protected void onCreate(Bundle savedInstanceState) {
DictionaryDb.init(this);
settings = new SettingsStore(this); settings = new SettingsStore(this);
applyTheme(); applyTheme();
DictionaryDb.init(this);
DictionaryDb.normalizeWordFrequencies(settings);
super.onCreate(savedInstanceState); super.onCreate(savedInstanceState);
validateFunctionKeys(); validateFunctionKeys();
buildScreen(); buildScreen();

View file

@ -227,6 +227,9 @@ public class SettingsStore {
public int getSoftKeyInitialDelay() { return 250; /* ms */ } public int getSoftKeyInitialDelay() { return 250; /* ms */ }
public int getSoftKeyRepeatDelay() { return 40; /* ms */ } public int getSoftKeyRepeatDelay() { return 40; /* ms */ }
public int getWordFrequencyMax() { return 25500; }
public int getWordFrequencyNormalizationDivider() { return 100; } // normalized frequency = getWordFrequencyMax() / getWordFrequencyNormalizationDivider()
/************* add word, last word *************/ /************* add word, last word *************/