CSV dictionary support (#145)
* the dictionary loader now supports word frequencies
* word frequency validation upon building
* added default word frequencies to all dictionaries
* updated documentation
parent b5cd92f1f7
commit 2510aba58a
30 changed files with 1175323 additions and 1175101 deletions

@@ -7,5 +7,7 @@ Additionally cleaned up repeating words and added some missing ones.

Also, used the wooorm's hunspell-compatible dictionary to determine which words need to start with a capital letter
Link: https://github.com/wooorm/dictionaries/tree/main/dictionaries/bg
License: MIT (available in the link)
Git commit: 13 Apr 2022 [0c78cc810c8aafb2e6f5140bb6dcd4026b247eb8]

Word frequencies obtained from the "General" word frequency dictionary by the Department of Computational Linguistics of the Bulgarian Academy of Sciences.
Link: https://dcl.bas.bg/frequency.html

@@ -16,4 +16,9 @@
% Parts of this wordlist are taken from the german ISPELL dictionary
% Thanks to Robert from WinEdt.org for hunting the spellerrors
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


Word frequencies obtained from Openboard dictionaries.
Link: https://github.com/openboard-team/openboard/tree/master/dictionaries
Git commit: 17 Dec 2022 [c3772cd56e770975ea5570db903f93b199de8b32]

@@ -21,11 +21,16 @@ Using Git Commit From: Mon Dec 7 20:14:35 2020 -0500 [5ef55f9]
Also, used the wooorm's hunspell-compatible dictionary to determine
which words need to start with a capital letter
Link: https://github.com/wooorm/dictionaries/tree/main/dictionaries/en
License: MIT (available in the link)
Git commit: 13 Apr 2022 [0c78cc810c8aafb2e6f5140bb6dcd4026b247eb8]

=====

Word frequencies obtained from Openboard dictionaries.
Link: https://github.com/openboard-team/openboard/tree/master/dictionaries
Git commit: 17 Dec 2022 [c3772cd56e770975ea5570db903f93b199de8b32]

=====

Spell Checking Oriented Word Lists (SCOWL)

Mon Dec 7 20:14:35 2020 -0500 [5ef55f9]

@@ -1,3 +1,7 @@
French wordlist by: Christophe Pallier
Source: http://www.pallier.org/ressources/dicofr/dicofr.html
Words Count: 336531

Word frequencies obtained from Openboard dictionaries.
Link: https://github.com/openboard-team/openboard/tree/master/dictionaries
Git commit: 17 Dec 2022 [c3772cd56e770975ea5570db903f93b199de8b32]

@@ -93,3 +93,8 @@ ________________________________________________________________
te nemen.
10.Contact: Stichting OpenTaal, http://www.opentaal.org, bestuur@opentaal.org

=====

Word frequencies obtained from Openboard dictionaries.
Link: https://github.com/openboard-team/openboard/tree/master/dictionaries
Git commit: 17 Dec 2022 [c3772cd56e770975ea5570db903f93b199de8b32]

@@ -3,4 +3,10 @@ Version: 5481cb8 (2018-09-13)
Source: https://github.com/hingston/russian/blob/master/100000-russian-words.txt
License: https://github.com/hingston/russian/blob/master/LICENSE.md

Additionally cleaned up repeating and nonsense words.
Additionally cleaned up repeating and nonsense words.

=====

Word frequencies obtained from Openboard dictionaries.
Link: https://github.com/openboard-team/openboard/tree/master/dictionaries
Git commit: 17 Dec 2022 [c3772cd56e770975ea5570db903f93b199de8b32]

@@ -16,4 +16,10 @@


Kostyantyn Moroz
mailto: morozko@i.com.ua
mailto: morozko@i.com.ua

====

Word frequencies obtained from Openboard dictionaries.
Link: https://github.com/openboard-team/openboard/tree/master/dictionaries
Git commit: 17 Dec 2022 [c3772cd56e770975ea5570db903f93b199de8b32]