A probabilistic matrix factorization algorithm for approximation of sparse matrices in natural language processing / Tarantino, G.; Monica, S.; Bergenti, F.. - In: ICT EXPRESS. - ISSN 2405-9595. - 4:2(2018), pp. 87-90. [10.1016/j.icte.2018.04.005]
A probabilistic matrix factorization algorithm for approximation of sparse matrices in natural language processing
Monica S.; Bergenti F.
2018
Abstract
This paper proposes a variation of a well-known probabilistic matrix factorization algorithm that is commonly used in data analysis and scientific computing and that has recently been considered for natural language processing. The variation exploits the fact that the matrices processed in natural language processing tasks are normally sparse rectangular matrices with one dimension much larger than the other, a property that can be used to ensure adequate accuracy with acceptable computation time. Preliminary experiments on real-world textual corpora show that the proposed algorithm achieves relevant improvements over the original one.

File | Size | Format
---|---|---
ICT2018.pdf (open access; type: VOR, version published by the publisher) | 342.12 kB | Adobe PDF
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.