Japanese (#770)
* added Japanese (Hiragana, Katakana, Kanji)
* improved dictionary validation: it is now possible to have the same ideogram with two different transcriptions
* fixed frequency updating not working sometimes (in Chinese too)
parent efa1fb4d79
commit 0ec912f9c9
33 changed files with 1603029 additions and 89 deletions
25
docs/dictionaries/jaWordlistReadme.txt
Normal file
@@ -0,0 +1,25 @@
Japanese wordlists by: EDICT Project
Source: https://www.edrdg.org
Dictionaries used: JMDICT, ENAMDICT
Version: 2025-04-01
License: https://www.edrdg.org/edrdg/licence.html (Creative Commons Attribution-ShareAlike Licence V4.0)

Verb conjugations generated using: Japanese Verb Conjugator V2
Source: https://pypi.org/project/japanese-verb-conjugator-v2/
Version: 2025-01-13

Verb conjugations converted to Hiragana using: WanaKana-py
Source: https://github.com/Starwort/wanakana-py
Version: fa43884 (2019-07-13)

Japanese frequency list by: Wortschatz Leipzig @ Uni Leipzig
Source: https://wortschatz.uni-leipzig.de/en/download/
Version: 2025-04-04
License: CC-BY
Reference:
> D. Goldhahn, T. Eckart & U. Quasthoff: Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages.
> In: Proceedings of the 8th International Language Resources and Evaluation (LREC'12), 2012
> http://www.lrec-conf.org/proceedings/lrec2012/pdf/327_Paper.pdf

Additional remarks:
Hiragana and Katakana for the respective modes were added manually. All words converted to Romaji manually.