Optimization Methods for the Same-Day Delivery Problem

In the same-day delivery problem, requests with restricted time windows arrive during a given time horizon, and it is necessary to decide which requests to serve and how to plan routes accordingly. We solve the problem with a dynamic stochastic method that invokes a generalized route generation function combined with an adaptive large neighborhood search heuristic. The heuristic is composed of destroy and repair operators, and the generalized route generation function, taking advantage of sampled scenarios that are solved with the heuristic, determines which decisions should be taken at any instant. Results on different instances show the effectiveness of the proposed method in comparison with a consensus function from the literature, with an average decrease of 10.7% in solution cost and 24.5% in runtime.


Introduction
Same-day delivery is increasingly common in online retail. Related applications of this problem emerge in the delivery of groceries and in the transportation of patients between their homes and a hospital. The problem deserves attention, especially because of the complicated and expensive logistic decisions that arise, since it is related to the classical NP-hard vehicle routing problem. Given a time horizon over which a fleet of identical vehicles operates, it is necessary to determine routes for these vehicles, aiming at maximizing the number of requests delivered on time and minimizing the traveled distance. As requests arrive dynamically during the time horizon, each one associated with a time window that must be respected, vehicles may return to the depot after performing a delivery in order to pick up products and continue serving the following requests [10].
The Same-Day Delivery Problem (SDDP) is a dynamic problem that was introduced in [10]. The authors assumed that vehicles could return to the depot for a pickup only after finishing their current routes. Based on sampled scenarios, they used a consensus function to take dynamic decisions and consequently generate the vehicle routes. They also considered that a vehicle can wait at the depot while new requests arrive (i.e., they studied a waiting strategy). The idea behind the latter was to anticipate decisions based on the already known requests and on potential future ones sampled from known probability distributions. From the experimental results, the authors concluded that considering the uncertainty of future requests had an important impact on the solution quality.
The SDDP can be viewed as the dynamic version of a one-to-one Pickup and Delivery Problem (PDP) with Time Windows (TW) and a single pickup location, the depot, to which vehicles need to return for new pickups during the time horizon. Comprehensive surveys on the PDPTW can be found in [2], for the transportation of goods, and in [6], for the transportation of people. Related works include: [3], on the delivery of groceries, where time windows must be strictly respected and requests are generally known one day in advance; [1], which aims at maximizing the total expected profit and in which vehicles can depart from the depot as soon as requests are available (i.e., there is no waiting strategy); [4], in which release dates are associated with requests and a genetic algorithm with local searches is used to solve the problem; [8], which solves the PDPTW when the pickup and delivery nodes are known in advance but not the time at which requests become available (i.e., requests arrive dynamically during the time horizon); and [9], where a multi-period problem is solved in which requests are dynamically integrated into existing decisions and some requests can be served on the next day.
In this work, we tackle the SDDP but, differently from [10], the objective is to minimize the total cost incurred by performing all routes and rejecting requests. Moreover, once a vehicle has started its route, we allow it to return to the depot after serving a customer, before completing the route, in order to pick up more requests. The latter assumption generalizes the proposal in [10], which allows vehicles to return to the depot only after finishing their routes, and substantially enlarges the decision space. This is expected to come at the expense of a larger computational effort. Under these assumptions, we propose a generalized route generation function to improve the way in which routes are built in [10], and we develop an Adaptive Large Neighborhood Search (ALNS) [7] to iteratively solve sub-instances on the sampled scenarios.
We present results for three versions of the SDDP: static, in which the problem is solved with the ALNS for the entire time horizon, assuming that all requests are known in advance; dynamic, in which the ALNS is applied several times during the time horizon in order to update the current solution; and dynamic-stochastic, in which the generalized route generation function with the ALNS, considering sampled scenarios of future information, is used to update the current solution. Results are compared with those of the consensus function in [10], especially for the dynamic-stochastic version, on 120 instances proposed by those authors.
This work is organized as follows: Section 2 gives a formal description of the problem, with its objectives and constraints; Section 3 describes the ALNS and the generalized route generation function, highlighting the differences with respect to the function proposed in [10]; Section 4 reports the experimental results for the three versions of the problem, as well as for the consensus function in [10]; finally, Section 5 presents concluding remarks and directions for future work.

Problem Definition
The SDDP under study considers a fleet M of identical vehicles and a set L of customer locations over a geographical area. A central depot, denoted as node 0, is associated with start and end times between which vehicles can depart and arrive (i.e., the depot working hours, or the time horizon over which the depot and vehicles are in operation). Each pair i, j ∈ L is associated with a deterministic travel time t_ij and a cost c_ij (e.g., distance) that are known in advance. During the depot working hours, requests arrive at a rate λ_i ≥ 0 from each location i ∈ L. Let R be the set of requests that will occur during the time horizon. It is composed of requests that are known in advance and others that are unknown at the beginning but become known as time unfolds. Each request k ∈ R has a service time µ_k, a demand d_k, and a delivery time window [s_k, e_k]. Request k becomes known only at its release time r_k and can only be handled from then on. Requests that are found impossible to deliver on time can be assigned to a third-party logistic operator at a cost. It is assumed that the delivery costs incurred by the fleet are always lower than the cost of the third-party logistic operator.
Vehicles start and end at the depot according to its working hours and may serve one or more of the requests that are currently available, respecting the vehicle capacity Q. The design of the route associated with each vehicle may involve waiting at the depot for new requests or picking up some requests to perform the deliveries. No diversion is allowed while a vehicle is on its way to a customer. However, as soon as a delivery is done, the vehicle can return to the depot to pick up new requests. This means that the vehicle does not need to finish serving all its on-board requests before going back to the depot. The objective of the SDDP is to plan routes for the vehicles, aiming first at maximizing the number of requests served by the fleet and second at minimizing the total cost of performing the routes.
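As a minimal sketch of the request data defined in this section, assuming Python and hypothetical names (`Request`, `is_known`, `direct_delivery_feasible`) that do not come from the original work, a request and a simple on-time check could be written as follows. The check assumes a vehicle that leaves the depot at time `now` and drives straight to the customer; it is an illustration, not the authors' feasibility test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One delivery request k (field names are illustrative assumptions)."""
    location: int     # customer location i in L
    release: float    # r_k: time at which the request becomes known
    service: float    # mu_k: service time at the customer
    demand: float     # d_k: load consumed from the vehicle capacity Q
    tw_start: float   # s_k: start of the delivery time window
    tw_end: float     # e_k: end of the delivery time window


def is_known(req: Request, now: float) -> bool:
    """A request can only be handled once its release time has passed."""
    return req.release <= now


def direct_delivery_feasible(req: Request, now: float, travel_time: float) -> bool:
    """True if a vehicle leaving the depot at `now` reaches the customer by e_k.

    Assumption: arriving before s_k is allowed (the vehicle simply waits).
    """
    if not is_known(req, now):
        return False
    return now + travel_time <= req.tw_end
```

A request that fails this kind of check for every vehicle is a candidate for the third-party logistic operator.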
The description above corresponds to the dynamic version of the problem, for which we use an ALNS to solve partial instances of the problem at any given time of the time horizon. The ALNS is also used to solve the static version, in which all requests are known at the start of the day, and the resulting solution serves as an estimate for the other versions. Aiming at improving solutions of the dynamic version, we consider the dynamic-stochastic version, in which sampled scenarios are used to help with decisions regarding possible future requests.

Solution Methods
This section first describes how the SDDP is modeled and then presents the different events that can occur in real time. It then describes the ALNS and the two approaches for tackling the problem: the dynamic version and the dynamic-stochastic version.

Modeling
The problem is modeled as a classical pickup and delivery problem with time windows, with the inclusion of release dates for the arrival of new requests. At any instant, the set of known requests is built, where each request is composed of a pickup node at the depot and a drop node at the customer location, together with a restricted time window. Elements that have already been performed cannot be modified, so only choices concerning new requests or nodes not yet visited can be changed. Scenarios containing future requests are generated to help minimize costs. Future requests are handled like regular requests, with the exception that a vehicle cannot take any action on them before the release date (i.e., the vehicle has to stay idle until the release of the request).
It is important to note that our method allows all types of node sequences to occur. This is not the case in [10], where the problem is modeled as a team orienteering problem with time windows and multi-trips [5]. In that model, every request is composed of a single delivery node. Future requests are also generated, and vehicles must return to the depot for a pickup every time one of their nodes is encountered in the routes. The drawback is that only a subset of the possible routes can be produced. For example, a single-node-per-request model cannot represent a route in which a vehicle waits at the depot for future requests, then delivers real requests, and finally delivers the future requests.

Event Management
Two types of events are defined in [10]: (1) the arrival of a new request when at least one vehicle is waiting at the depot; and (2) a vehicle has just arrived at the depot or completed its waiting period. Every time a new event happens, instances of the PDPTW are generated and solved using the ALNS. When vehicles are allowed to interrupt their routes, we also need to consider a delivery completion as a new event. Namely, when a vehicle has completed a delivery, it can be diverted to the depot to pick up requests and perform the remaining deliveries later. Finally, it is worth noting that this additional event is likely to increase the computational time.
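The three event types discussed here can be sketched as an enumeration processed in chronological order. This is a minimal illustration in Python; the names `EventType`, `process_events`, and the `reoptimize` callback (standing in for building and solving a PDPTW instance) are hypothetical and not taken from the original work.

```python
import heapq
from enum import Enum, auto


class EventType(Enum):
    NEW_REQUEST = auto()         # (1) request arrives while a vehicle waits at the depot
    VEHICLE_AT_DEPOT = auto()    # (2) vehicle reaches the depot or ends its waiting period
    DELIVERY_COMPLETED = auto()  # additional event: vehicle may be diverted to the depot


def process_events(events, reoptimize):
    """Handle (time, EventType) pairs in chronological order.

    `reoptimize(time, event_type)` is called once per event. The running
    index breaks ties, since Enum members are not orderable.
    """
    queue = [(time, index, kind) for index, (time, kind) in enumerate(events)]
    heapq.heapify(queue)
    handled = []
    while queue:
        time, _, kind = heapq.heappop(queue)
        reoptimize(time, kind)
        handled.append((time, kind))
    return handled
```

Because every delivery completion now triggers a re-optimization, the number of calls to the solver grows with the number of served requests, which is consistent with the expected increase in computational time.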

Adaptive large neighborhood search
The proposed ALNS is based on [7], which uses the acceptance probability function of simulated annealing to accept worse solutions. Given an input instance of the problem, it works as follows: (i) it obtains a feasible solution x with a constructive heuristic; (ii) it applies a destroy operator to x to obtain x'; (iii) it applies a repair operator to x' to obtain x''; (iv) it replaces x with x'' if x'' has a lower cost, or otherwise according to the acceptance probability function; (v) it goes back to step (ii) if the maximum number of iterations has not been reached, or otherwise it returns x.
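The loop in steps (i) to (v) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `alns`, the geometric cooling schedule, the default parameter values, the uniform choice of operators, and the tracking of a best-so-far solution are all assumptions made here for brevity.

```python
import math
import random


def alns(initial, cost, destroy_ops, repair_ops, iterations=250,
         temperature=100.0, cooling=0.99, rng=None):
    """Destroy/repair loop with simulated-annealing acceptance.

    destroy_ops: callables (solution, rng) -> partial solution
    repair_ops:  callables (partial solution, rng) -> complete solution
    """
    rng = rng or random.Random(0)
    current = initial          # step (i): the initial solution is supplied by the caller
    best = initial
    for _ in range(iterations):
        destroy = rng.choice(destroy_ops)                  # step (ii)
        repair = rng.choice(repair_ops)                    # step (iii)
        candidate = repair(destroy(current, rng), rng)
        delta = cost(candidate) - cost(current)
        # Step (iv): always accept improvements; accept worse solutions
        # with probability exp(-delta / temperature).
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
        if cost(current) < cost(best):
            best = current
        temperature *= cooling                             # assumed geometric cooling
    return best                                            # step (v)
```

Returning the best solution seen, rather than the final current one, is a common variation; it guarantees that the result is never worse than the initial solution.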
In step (i), the initial solution is constructed greedily, taking the release dates of the requests into account. As destroy operators, we consider a relatedness-based removal and a random removal, both of which remove requests from the solution. In the first one, requests that are closely related (i.e., in terms of cost, time, and capacity) are removed. In the second one, requests are randomly selected and removed. The removed requests are then reinserted in step (iii) by one of two repair operators. The first one is a greedy operator that reinserts each of the removed requests into the best route overall. The other one is a regret operator, which generalizes the greedy one in the sense that not only the best route but the k best routes are analyzed, since a given request may not be reinsertable into its best route later on.
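The regret idea can be sketched as below. This is a simplified illustration, not the authors' operator: the name `regret_insertion` and the `best_insertion` callback (which returns the cheapest feasible cost and position for a request in one route, or `None`) are hypothetical, and ties are broken in favor of the cheaper insertion so that k = 1 reduces to the greedy operator.

```python
def regret_insertion(pending, routes, best_insertion, k=2):
    """Insert requests in decreasing order of regret-k value.

    best_insertion(request, route) -> (cost, position) or None if infeasible.
    Returns (new_routes, rejected_requests).
    """
    routes = [list(route) for route in routes]
    pending = list(pending)
    rejected = []
    while pending:
        chosen = None  # (key, request, route index, position)
        for req in pending:
            options = []
            for idx, route in enumerate(routes):
                found = best_insertion(req, route)
                if found is not None:
                    options.append((found[0], idx, found[1]))
            if not options:
                continue
            options.sort()
            best_cost = options[0][0]
            # Regret: how much is lost if the request cannot use its best route.
            # A missing alternative counts as infinitely costly.
            regret = sum(
                (options[j][0] if j < len(options) else float("inf")) - best_cost
                for j in range(1, k)
            )
            key = (regret, -best_cost)
            if chosen is None or key > chosen[0]:
                chosen = (key, req, options[0][1], options[0][2])
        if chosen is None:          # nothing left can be inserted feasibly
            rejected.extend(pending)
            break
        _, req, idx, pos = chosen
        routes[idx].insert(pos, req)
        pending.remove(req)
    return routes, rejected
```

Requests with few feasible routes obtain a large regret and are therefore placed early, before their only good route fills up.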
In steps (ii) and (iii), an operator is chosen according to the roulette wheel selection principle, in which a weight is associated with each operator. These weights are dynamically updated using statistics from previous iterations, and a reaction factor controls how strongly those statistics influence the weights. Moreover, at the end of step (iii), a local search is applied to x'', which consists of determining the best moment to serve each request that has not been served yet. Regarding the acceptance probability function, a given initial temperature is decreased over the ALNS iterations, so the probability of accepting solutions that are worse than the current one decreases as well.
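Roulette wheel selection and an adaptive weight update can be sketched as follows. The update rule shown, w ← w(1 − r) + r · score / uses with reaction factor r, is a commonly used ALNS form; whether the authors use exactly this rule, and the names `roulette_select` and `update_weight`, are assumptions.

```python
import random


def roulette_select(weights, rng):
    """Return an index with probability proportional to its weight."""
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return index
    return len(weights) - 1  # guard against floating-point round-off


def update_weight(weight, score, uses, reaction=0.1):
    """Blend the old weight with the average score observed recently.

    reaction = 0 keeps the weight fixed; reaction = 1 ignores its history.
    An operator that was not used keeps its weight.
    """
    if uses == 0:
        return weight
    return weight * (1.0 - reaction) + reaction * score / uses
```

For example, an operator with weight 1.0 that collected a score of 6.0 over 3 uses, with a reaction factor of 0.5, moves to a weight of 1.5.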

Dynamic problem
In this version, a PDPTW instance and its solution are maintained over time. On each new event, the instance is updated with new information (e.g., delivery completion, new requests, etc.), and elements that were performed in the past are fixed inside their routes. The ALNS is run to obtain a new solution, which replaces the maintained solution. New pickup and departure commands for the vehicles are then generated. Requests that remain outside the solution are given to the third-party logistic operator when they become impossible to serve.

Dynamic-stochastic problem
In order to improve the routes that are planned in the dynamic version at each event, sampled scenarios of future requests are used. These scenarios are generated from a probability function, taking into consideration the requests known up to the current time. Each sampled scenario is then solved with the ALNS, similarly to what is done in the dynamic problem, but now also considering future requests within a given time horizon.
After solving all scenarios, a generalized route generation function is used to identify the best solution among them. This best solution is then used to update the current solution. The function works in the following way: (i) for each scenario solution, remove from all routes the sampled requests and every real request that lies after at least one sampled request, since these indicate that a vehicle must wait or return to the depot to pick up some future requests; (ii) assign a score to each solution based on the number of times each of its routes appears in other solutions, and then choose and implement the solution with the highest score. As mentioned before, requests outside the solution are assigned to the third-party logistic operator when they become impossible to serve.
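Steps (i) and (ii) can be sketched as follows. This is one possible reading, under explicit assumptions: routes are represented as sequences of request identifiers (depot visits omitted), two routes match only if they belong to the same vehicle and are identical after truncation, and ties are resolved in favor of the first scenario. The names `truncate_route` and `select_solution` are hypothetical.

```python
from collections import Counter


def truncate_route(route, sampled):
    """Step (i): keep only the real requests that precede the first sampled one."""
    kept = []
    for req in route:
        if req in sampled:
            break
        kept.append(req)
    return tuple(kept)


def select_solution(solutions, sampled):
    """Step (ii): pick the scenario solution whose routes recur most often.

    solutions: list of scenario solutions, each a list of routes (one per vehicle).
    sampled:   set of identifiers of sampled (future) requests.
    """
    reduced = [
        tuple(truncate_route(route, sampled) for route in solution)
        for solution in solutions
    ]
    counts = Counter()
    for solution in reduced:
        for vehicle, route in enumerate(solution):
            counts[(vehicle, route)] += 1

    def score(solution):
        # Subtract 1 so that a route does not count its own occurrence.
        return sum(counts[(v, route)] - 1 for v, route in enumerate(solution))

    best_index = max(range(len(reduced)), key=lambda i: score(reduced[i]))
    return reduced[best_index]
```

An empty truncated route is meaningful in this sketch: it corresponds to a vehicle that should wait at the depot for future requests.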

Experimental Results
All the methods were coded in the C++ programming language and run on an Intel 2.667 GHz Westmere EP X5650 processor. The experiments were carried out over a subset of the instances from [10]. The instances under consideration are of two types with respect to the geography of the customer locations, namely, clustered (C) and randomly dispersed (R). For each geography, we consider data sets that contain 100 (C_1 and R_1) and 200 (C_2, C_6, R_2, and R_6) customers, as well as five types of time windows: TW.d1, TW.f, TW.h, and TW.r, with one-hour deadlines, and TW.d2, with two-hour deadlines. Moreover, the request arrival rate is homogeneous, and there are four different rates, namely 1, 2, 3, and 4 (i.e., an overall arrival rate of 0.1 requests per minute, and so on). Therefore, we have a total of 120 instances, the first of which is named TW.d1_C_1_hom_1 (and so on). The number of vehicles is fixed to 10 for every instance.
Regarding the parameters of the methods, we carried out preliminary experiments in which the sampling horizon was defined over the entire horizon and the ALNS was run with either 50 or 250 iterations, assuming 30 scenario samples. These experiments indicated that, in terms of solution quality and runtime, performing 250 iterations of the ALNS is preferable. Thus, these values were adopted when solving all 120 instances. The results that we obtained are presented in Tables 1 and 2. Each line of these tables has the name of the instance, the solutions of the static, dynamic, and dynamic-stochastic versions as explained in Section 3, as well as the solution of the dynamic-stochastic version obtained by using the consensus function in [10] with the ALNS. For each problem, the total solution cost, the number of unserved requests, and the total computational time in seconds are reported.
From Table 1, the average solution cost and runtime (in seconds) are, respectively: 2203.5 and 25.4 for the static problem; 3319.6 and 26.0 for the dynamic problem; and 2592.7 and 14806.1 for the dynamic-stochastic problem solved with the generalized route generation function. The dynamic-stochastic problem solved with the consensus function in [10], for which these values are 2913.2 and 19430.6, respectively, is outperformed by the proposed method, with a decrease of 11.0% and 23.8%, respectively. In terms of the number of unserved requests, the proposed method also performed best, serving 0.3 more requests on average than the method of [10].
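The reported decreases follow from the Table 1 averages as a relative difference with respect to the consensus-function values. A short check in Python, using a hypothetical helper `percent_decrease`:

```python
def percent_decrease(reference, value):
    """Relative decrease of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


# Averages reported for Table 1 (consensus function vs. proposed method).
cost_drop = percent_decrease(2913.2, 2592.7)
time_drop = percent_decrease(19430.6, 14806.1)
print(f"cost: {cost_drop:.1f}%, runtime: {time_drop:.1f}%")
```

Rounded to one decimal place, these evaluate to the 11.0% and 23.8% quoted above.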
The results of Table 2 are very similar to those of Table 1. In summary, from Table 2, the average solution cost, number of unserved requests, and runtime (in seconds) are: 2321.4, 4.8, and 27.1 for the static problem; 3253.0, 14.4, and 27.3 for the dynamic problem; 2633.1, 8.7, and 14622.6 for the dynamic-stochastic problem solved with the generalized route generation function; and 2937.2, 8.9, and 19552.5 for the dynamic-stochastic problem solved with the consensus function in [10]. Once again, the proposed method outperforms the dynamic version (i.e., in terms of solution cost and number of unserved requests, there is a decrease of 19.1% and 35.7%, respectively) and the dynamic-stochastic method of the literature (i.e., in terms of solution cost and runtime, there is a decrease of 10.3% and 25.2%, respectively), and it better approximates the results of the static problem (i.e., in terms of solution cost and number of unserved requests, it has the smallest percentage deviation).
Finally, regarding the instance characteristics, when comparing the dynamic-stochastic problem with the corresponding version solved with the consensus function in [10], Tables 1 and 2 show that the latter performed worse for all geographies (R and C), time windows (TW.d1, TW.d2, TW.f, TW.h, and TW.r), and request arrival rates (1, 2, 3, and 4) in terms of average solution cost and runtime. Thus, we can conclude that the generalized route generation function, which allows vehicles to interrupt their current routes and return to the depot to pick up requests, performs well in practice.

Concluding Remarks
We tackled the same-day delivery problem with a generalized route generation function combined with an adaptive large neighborhood search, in which sampled scenarios are used to anticipate future requests and improve decisions. The ALNS has destroy and repair operators whose respective weights are dynamically updated during the search process. Aiming at improving the results of a recent consensus function [10], our function allows vehicles to return to the depot in order to pick up requests even if they have not completed their routes, and requests can be rejected (i.e., reassigned to a third-party logistic operator at a cost). The computational results of the static, dynamic, and dynamic-stochastic versions over different geographies, arrival rates, and time windows indicate that the proposed method is effective at solving the problem when sampled scenarios are taken into consideration. Relative to the static problem, the overall average increase in solution cost is 42.3% for the dynamic version, 15.5% for the dynamic-stochastic version that uses the generalized route generation function, and 29.3% for the dynamic-stochastic version that uses the consensus function in [10]. In terms of runtime, this increase is 1.5%, 56061.8%, and 74295.4%, respectively.
Future work will focus on reducing the total runtime of the proposed method, including a study on the number of scenario samples, sampling horizon, vehicles, and iterations of the ALNS. One direction might also consider a parallel version of the proposed method.

Table 1
Results of the C instances.