Measuring word proximity in semantic spaces

Results

In 2014–2015 a semantic space was constructed from the National Corpus of Polish (NKJP) using the COALS method. Among other things, the space makes it possible to measure distances between words, which represent their semantic proximity. The space, together with the user's manual (in Polish), can be downloaded from http://www2.polon.uw.edu.pl/pliki/approval. The program designed to work with the space is available, along with its source code, under the GNU General Public Licence from http://www2.polon.uw.edu.pl/pliki/approval/interfejs_obslugi_przestrzeni/. It runs on Linux, Windows, and Mac OS.

Using the semantic space built from the NKJP corpus, 20-element neighbourhood lists were generated for most of the word pairs analysed in the present project. The proximity between the two members of each pair was also measured. Details are given in the following downloadable files.

Report • Appendix 1 • Appendix 2
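
For readers who want a rough picture of what measuring proximity and building a 20-element neighbourhood list involve, here is a minimal Python sketch. It is not the project's program. The toy vectors, the function names cosine_similarity and nearest_neighbours, and the choice of cosine similarity as the proximity measure are all assumptions made for illustration; the measure and data format actually used by the downloadable program are described in its manual and source code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def nearest_neighbours(space, word, k=20):
    """Return up to k (word, similarity) pairs closest to `word`, best first."""
    target = space[word]
    scored = [
        (other, cosine_similarity(target, vector))
        for other, vector in space.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical three-dimensional toy space; a real space has far more
# dimensions and words.
toy_space = {
    "kot": [1.0, 0.0, 0.0],
    "pies": [0.9, 0.1, 0.0],
    "dom": [0.0, 1.0, 0.0],
    "rzeka": [0.0, 0.0, 1.0],
}

pair_proximity = cosine_similarity(toy_space["kot"], toy_space["pies"])
neighbours = nearest_neighbours(toy_space, "kot", k=20)
```

In this sketch the proximity of a word pair is a single number, and a neighbourhood list is simply the k words with the highest proximity to a given word. With only four words in the toy space, the list for "kot" has three entries rather than 20.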