Preface
The words “原因” (reason) and “元音” (vowel) share the same pinyin, yuan2 yin1, so they compete for the same position in my candidate list. Since I work in linguistics, I type “元音” (vowel) all the time, but “原因” (reason) is the more common word in everyday use and should naturally be the first candidate.
There are other ways to deal with this, such as pinning the order with an additional vocabulary, as shown in this issue. But I don’t want to change my schema just for this.
So I thought of modifying the word frequency in the user dictionary instead. The binary format is hard to edit, so I modify the plain-text frequency data generated during synchronization. Because resynchronization always keeps the maximum of the frequency values, writing a very large value safely overrides the original one.
It should be emphasized that this is not a recommended approach.
Practice
Because I have already modified mine, I will take synchronized data from an earlier device as an example.
yuan2 yin1 元音 c=253 d=3.46751e-11 t=317773
yuan2 yin1 原因 c=104 d=0.00588705 t=317773
c is the total input count. I don’t know exactly what d abbreviates (deviation, maybe?), but in any case changing d changes the candidate order: the larger the value, the higher the priority.
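To make the comparison concrete, here is a minimal Python sketch that parses the two entries above and sorts them by d. The helper name parse_entry is made up for illustration, and the parser only assumes that each entry ends with whitespace-separated key=value fields preceded by the phrase; it is not a description of how Rime itself reads the file.

```python
# Hypothetical helper for illustration; not part of Rime.
def parse_entry(line):
    """Split one exported entry into (code, phrase, fields)."""
    tokens = line.split()
    fields = {}
    # Assumption: the trailing tokens are key=value pairs (c, d, t).
    while tokens and "=" in tokens[-1]:
        key, value = tokens.pop().split("=", 1)
        fields[key] = float(value) if key == "d" else int(value)
    phrase = tokens.pop()      # the token right before the fields
    code = " ".join(tokens)    # whatever remains is the pinyin code
    return code, phrase, fields

sample_lines = [
    "yuan2 yin1 元音 c=253 d=3.46751e-11 t=317773",
    "yuan2 yin1 原因 c=104 d=0.00588705 t=317773",
]

entries = [parse_entry(line) for line in sample_lines]
# Larger d first, following the behaviour described in this article.
ranked = sorted(entries, key=lambda entry: entry[2]["d"], reverse=True)
print(ranked[0][1])  # 原因
```

Note that “元音” has the higher c here but the lower d, which fits the observation that d, not c, drives the order.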
To minimize the potential impact of the modification, I created a new folder dummy/ in the synchronization directory and, inside it, a file terra_pinyin.userdb.txt with the following content:
# Rime user dictionary
#@/db_name terra_pinyin
#@/db_type userdb
#@/rime_version 1.7.3
#@/tick 393271
#@/user_id dummy
yuan2 yin1 原因 c=500 d=114514 t=393271
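The same file could be generated with a short script, sketched below in Python. The function name write_dummy_userdb and its parameters are hypothetical, and the tab separators between fields are an assumption about the export format, since the listing above does not show the exact whitespace. The example writes into a temporary directory rather than a real synchronization directory.

```python
import tempfile
from pathlib import Path

# Hypothetical helper for illustration; values default to the ones in this article.
def write_dummy_userdb(sync_dir, schema="terra_pinyin", user_id="dummy",
                       code="yuan2 yin1", phrase="原因",
                       commits=500, dee=114514, tick=393271):
    """Write <sync_dir>/<user_id>/<schema>.userdb.txt and return its path."""
    folder = Path(sync_dir) / user_id
    folder.mkdir(parents=True, exist_ok=True)
    lines = [
        "# Rime user dictionary",
        # Assumption: header keys and values are separated by a tab.
        f"#@/db_name\t{schema}",
        "#@/db_type\tuserdb",
        "#@/rime_version\t1.7.3",
        f"#@/tick\t{tick}",
        f"#@/user_id\t{user_id}",
        # Assumption: code, phrase and the c/d/t fields are tab-separated.
        f"{code}\t{phrase}\tc={commits} d={dee} t={tick}",
    ]
    path = folder / f"{schema}.userdb.txt"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path

# Demo: use a throwaway directory instead of the real sync directory.
demo_dir = tempfile.mkdtemp()
path = write_dummy_userdb(demo_dir)
print(path.read_text(encoding="utf-8"))
```

Keeping the fake entry under its own user_id folder means it can be deleted later without touching the data exported by real devices.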
Then I synchronized. In the terra_pinyin.userdb.txt generated by the synchronization, the d for “原因” had become 10000, so there seems to be an upper limit.
Either way, “原因” is now guaranteed to come first.
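The behaviour observed here can be summarized as a toy model in Python: take the larger of the two d values, then clamp it to the limit. This is only a sketch of what this article observes, not Rime's actual merge code; the name merge_d and the constant D_CAP are made up, and the model ignores anything else synchronization may do to the values.

```python
# Upper limit as observed in this article; not taken from Rime's source.
D_CAP = 10000.0

def merge_d(ours, theirs, cap=D_CAP):
    """Toy model: keep the larger d, then clamp it to the cap."""
    return min(max(ours, theirs), cap)

# “原因”: the real device has d=0.00588705, the dummy device claims d=114514.
merged_reason = merge_d(0.00588705, 114514.0)
# “元音”: nothing in the dummy folder, so its d stays as it was.
vowel_d = 3.46751e-11

print(merged_reason)            # 10000.0
print(merged_reason > vowel_d)  # True
```

Under this model any value at or above the cap has the same effect, so 114514 works no better than 10000.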
Additionally, this method is not recommended for single characters. I tried it, and the character also gets ranked first for its short code, which disrupts normal input. For example, any character whose pinyin starts with l, if processed this way, will be ranked ahead of “了”, one of the most commonly used characters. The effect is less of a problem for a phrase of two or more characters that really is common.
© kymot 2024 Partial rights reserved.
Unless otherwise stated, the content is published under CC-BY 4.0.