A probabilistic matrix factorization algorithm for approximation of sparse matrices in natural language processing