Logistic regression for criteria weight elicitation in PROMETHEE-based ranking methods

For a PROMETHEE II method used to rank concurrent alternatives, both preference functions and weights are required; if the weights are unknown, they can be elicited by leveraging present or past partial rankings. If the known partial ranking is incorrect, the elicitation methods are ineffective. In this paper, a logistic regression method for weight elicitation is proposed to tackle this scenario. An experiment is carried out to compare the performance of the logistic regression method against a state-of-the-art linear weight elicitation method, proving the validity of the proposed methodology.


Introduction
Multicriteria decision making (MCDM) methods compare multi-dimensional alternatives presented to a decision maker (DM) and rank them according to his/her preferences. MCDM methods use preference functions to measure the DM's specific preferences and to translate the alternatives into comparable units. The resulting preferences are then weighted to obtain a scalar value for each alternative. This scalar value can be directly used to rank the alternatives. Readers interested in the theoretical foundation of MCDM methods, alongside an established MCDM method, can refer to [1], while those interested in a recent application in the Life Cycle Assessment (LCA) context can refer to [2]. The availability of credible weights can severely impact the performance of most MCDM methods [3], thus a series of weight elicitation procedures have been designed [4]. This paper expands the methods proposed in [5] for weight elicitation in the PROMETHEE II context. Important components required for the application of MCDM methods are often missing: both the preference functions and the weights might be unknown. If the weights are unknown, information on the ranking of the alternatives can be used to elicit them. In the literature, specific weight elicitation procedures have been designed for different MCDM methods: [6] propose an elicitation procedure for TOPSIS, [7] focus on a surrogate weighting procedure in PROMETHEE, and [8] revise Simos' procedure for the ELECTRE method. From a broader perspective, [9] propose a posterior analysis using the popular Simple Additive Weighting (SAW) method, while [10] focus on multicriteria additive models. For the PROMETHEE method, if a partial ranking of present or past decisions is available, [5] propose various elicitation methods based on linear and convex constrained optimization.
If the preference functions are also unknown, Robust Ordinal Regression (ROR) methods bypass the elicitation problem by providing all the results obtainable using preference functions that are in line with a known partial ranking. Interested readers can refer to the first ROR publication [11], which is a re-design of the UTA method [1] in a robust framework. The PROMETHEE method re-designed in a ROR framework can be found in [12], while the ROR version of ELECTRE has been designed by [13].
Machine learning (ML) algorithms, like logistic regression, are often used in conjunction with MCDM methods. [14] outline the similarities and differences between ML algorithms and MCDM methods, while [15] bridge the gap between ML and ROR methods. Recent applications of K-means for AHP can be found in [16], and of decision tree algorithms for Data Envelopment Analysis (DEA) in [17].
In this paper, a logistic regression algorithm is used in the PROMETHEE weight elicitation context of [5] in place of a linear model. The linear and logistic regression algorithms are experimentally compared in cases where the known partial ranking is incorrect.
This paper is organized as follows: in Section 2 the proposed weight elicitation model is described. In Section 3 the experimental setting is outlined, and its results are analysed. Section 4 concludes the paper with a summary of the key findings and suggestions for future research.

Logistic regression model
Using the formalism presented in [5], if all the preferences between alternatives are known, they can be used to elicit the unknown weights by solving the constrained optimization problem (1).
Problem (1) maximizes the log-likelihood of identifying preferred alternatives by linearly separating the preference space. Each alternative preferred over another alternative is a success event drawn from a Bernoulli distribution. The distribution is characterized by a success probability parametrized, in the preference space, by net flow differences through the inverse canonical link function. If the linear predictor for the inverse canonical link function is defined without an intercept, its parameters are maximum likelihood estimators for the PROMETHEE II weights. Since no failure event is considered, the Bernoulli distribution log-likelihood simplifies to the objective of (1).
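A plausible written-out form of such a problem, consistent with the description above, is sketched below. The notation is an assumption made here for illustration: $w_k$ denotes the weight of criterion $k$, $\phi_k(a)$ the unicriterion net flow of alternative $a$, and $a \succ b$ a known preference of $a$ over $b$. The objective follows from the Bernoulli log-likelihood with only success events and the logistic function as inverse canonical link; the normalization and non-negativity constraints are assumed, since they are customary for PROMETHEE weights, and may differ from those of [5].

```latex
\max_{w} \; \sum_{(a,b)\,:\,a \succ b}
  \log \frac{1}{1 + \exp\!\left(-\sum_{k} w_k \bigl(\phi_k(a) - \phi_k(b)\bigr)\right)}
\quad \text{s.t.} \quad \sum_{k} w_k = 1, \qquad w_k \ge 0 \;\; \forall k
```

The absence of an intercept in the linear predictor means that a zero net flow difference corresponds to a success probability of 0.5, that is, to indifference between the two alternatives.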
This model is effective even if some of the known preferences are incorrect. In addition, the probabilistic interpretation of the weights allows the DM to identify the known preferences that are most likely to be faulty.
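The idea can be illustrated with a minimal Python sketch; it is not the implementation used in this paper. It assumes that each row of a hypothetical matrix `D` holds the per-criterion net flow differences for one known preference (preferred alternative minus the other), and that the weights are non-negative and sum to one. That constraint is enforced here through a softmax reparametrization, which is a choice of this sketch, and the data are synthetic.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w, D):
    # Sum of log sigmoid(D @ w): every row is a "success" (a known preference).
    return -np.logaddexp(0.0, -(D @ w)).sum()

def elicit_weights(D, n_iter=2000, lr=0.1):
    """Gradient ascent on the success-only log-likelihood, no intercept.

    Weights are kept on the simplex via w = softmax(theta) (assumed constraint).
    """
    theta = np.zeros(D.shape[1])
    for _ in range(n_iter):
        w = np.exp(theta - theta.max())
        w /= w.sum()
        # Gradient of the mean log-likelihood with respect to w.
        g = ((1.0 - sigmoid(D @ w))[:, None] * D).mean(axis=0)
        # Chain rule through the softmax.
        theta += lr * w * (g - w @ g)
    w = np.exp(theta - theta.max())
    return w / w.sum()

def preference_probabilities(w, D):
    # Modelled probability that each known preference is correct.
    return sigmoid(D @ w)

# Synthetic illustration (all values hypothetical).
rng = np.random.default_rng(0)
true_w = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
D = rng.uniform(-1.0, 1.0, size=(200, 5))
D[D @ true_w < 0] *= -1.0   # orient every row as "a preferred over b"
D[:10] *= -1.0              # make the first 10 known preferences incorrect

w_hat = elicit_weights(D)
p = preference_probabilities(w_hat, D)
suspects = np.argsort(p)[:10]   # preferences with the lowest modelled probability
```

Known preferences with a low modelled probability under the elicited weights are the natural candidates for being faulty, which is the use of the probabilistic interpretation mentioned above.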

Experimental setting and results

Experimental setting
The objective of the experiment was to compare the two models in scenarios where some of the known preferences are incorrect. The Linear Model is expected to outperform the Logistic Regression Model if all the known preferences are correct, while erroneous information is expected to favour the proposed model over the original one. The Linear Model is also expected to be unable to find a feasible solution if wildly incorrect inputs are provided.
The dataset of alternatives and weights is the one used in [5], which contains 5-dimensional weights for 1000 DMs and their rankings of 100 different alternatives each. It is artificially expanded by permuting the rankings 1000 times; each permutation is independent of the previous one and is applied to all the rankings, leading to 1,000,000 permuted rankings.
The preference function associated with each weight is linear over the entire interval between the minimum and the maximum of the corresponding criterion.
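As an illustration, a linear preference function over the whole observed range can be sketched as follows. The sketch assumes criteria to be maximized, a hypothetical evaluation matrix `X` (one row per alternative, one column per criterion), and the standard PROMETHEE II definition of unicriterion net flows; it is not taken from [5].

```python
import numpy as np

def unicriterion_net_flows(X):
    """Unicriterion net flows with a linear preference function on [min, max].

    Assumes every criterion is maximized and is not constant across alternatives.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    span = X.max(axis=0) - X.min(axis=0)
    # diff[a, b, k]: difference between alternatives a and b on criterion k,
    # scaled by the range of criterion k.
    diff = (X[:, None, :] - X[None, :, :]) / span
    # Linear preference degree P_k(a, b) in [0, 1]; zero when a is not better.
    P = np.clip(diff, 0.0, None)
    # Average of P_k(a, b) - P_k(b, a) over the other n - 1 alternatives.
    return (P - P.transpose(1, 0, 2)).sum(axis=1) / (n - 1)

# Hypothetical example: 3 alternatives, 2 criteria.
X = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]]
phi_k = unicriterion_net_flows(X)
net_flow = phi_k @ np.array([0.5, 0.5])   # PROMETHEE II net flow for given weights
```

Because the net flow is linear in the weights, the difference between the net flows of two alternatives is a weighted sum of the differences of their unicriterion net flows, which is what allows the weights to act as regression coefficients.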
For each DM and each permutation h, both the Linear Model and the Logistic Regression Model are applied, thereby obtaining two sets of elicited weights. Each set of elicited weights is compared with the DM's weights, yielding a performance measure in the interval [0,1] that is computed, criterion by criterion, from the DM's weight and the elicited weight for the same criterion.
For each permutation h and each set of elicited weights, the DMs' performance measures are aggregated by estimating their expected value. Each permutation h is rated according to its distance from the unpermuted ranking; the rating is computed from the permuted value in each position, and the binomial coefficient "100 choose 2" is a normalization constant that constrains the rating to the range [0,1]. The rating of the unpermuted ranking is 0, while the rating of the reversed ranking is 1. The permutations are generated to achieve ratings that are uniformly distributed between 0 and 1. The two sets of expected performance values can be compared across different permutation ratings to gauge how the rating affects each model's performance.
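One rating with exactly these properties is the normalized number of inversions (the normalized Kendall tau distance from the identity). The sketch below assumes this form for illustration; the function name is hypothetical, and the rating actually used in the experiment may be defined differently.

```python
from itertools import combinations
from math import comb

def permutation_rating(perm):
    """Fraction of position pairs (i, j), i < j, that the permutation inverts.

    Returns 0 for the unpermuted ranking and 1 for the reversed ranking.
    """
    n = len(perm)
    inversions = sum(1 for i, j in combinations(range(n), 2) if perm[i] > perm[j])
    return inversions / comb(n, 2)   # comb(100, 2) = 4950 for 100 alternatives

identity = list(range(100))
rating_identity = permutation_rating(identity)        # 0.0
rating_reversed = permutation_rating(identity[::-1])  # 1.0
```

Under this assumed definition, a rating above 0.5 means that more than half of the pairwise orderings are inverted, so the permuted ranking is closer to the reversed ranking than to the original one.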

Results
Figure 1 depicts the expected values of the performance measure obtained for the two methods against the permutation ratings. The expected values for the performance measure of the Linear Model are plotted with circles, while the Logistic Regression Model values are plotted with crosses.
According to Figure 1, the Linear Model outperforms the Logistic Regression Model in the non-permuted scenario, while the Logistic Regression Model is superior for sizable values of the permutation rating. When the permutation becomes severe (permutation rating ≥ 0.5345), the Linear Model is again the preferred option for weight elicitation, up to complete rank reversal.
Both the Linear Model and the Logistic Regression Model were always able to find feasible solutions.

Conclusions
Unexpectedly, the Logistic Regression Model does not always outperform the Linear Model when some of the known preferences are incorrect. The advantage of the Logistic Regression Model is limited to those instances where the permuted ranking is closer to the non-permuted ranking than to the reverse ranking, with a cut-point at a permutation rating of 0.5345. In nearly all the analysed cases, the Linear Model achieves, without incurring infeasibility issues, a constant value of the performance measure of 0.8716, where chance alone would yield 0.7370. Leveraging the high dimensionality of the preference space, the Linear Model finds a single feasible solution and retains it for nearly all the permuted rankings, except for the reverse ranking, which carries its own solution.
These results provide guidelines on when one method is preferable over the other, and show that, when the correct method is selected, the elicited weights are closer to the real ones than chance alone would yield.
Further research will use the Logistic Regression Model to identify faulty known preferences, leveraging the probabilistic interpretation of the weights described in Section 2. Other machine-learning algorithms (e.g. Support Vector Machines, Neural Networks) will be specialized into weight elicitation models. These models are expected to account not only for incorrect known preferences but also for incorrect preference functions, discarding the assumption of linear separability in the preference space.

Fig. 1. Expected value of the performance measure for the two models and permutation ratings.