.. _sec-DROCapitalBudgeting:
Distributionally Robust Optimization
====================================
We consider another risk-averse decision making approach using
distributionally robust optimization (DRO). To this end, we begin with a
nominal stochastic optimization problem:
.. math::
\max_{s \in S} \sum_{\sigma \in \Sigma} q^\sigma f(s,\xi^\sigma).
In our context, the nominal model is a stochastic capital budgeting problem in
which we maximize expected NPV, so we may take
:math:`f(s,\xi^\sigma) = NPV(s,\xi^\sigma)`. Here, :math:`s \in S` indicates
the constraints that a prioritized solution must satisfy when we prioritize
project selection subject to uncertainty in costs and NPV for each project, as
well as uncertainty in resource availability. The goal is to prioritize in
order to maximize expected NPV under the nominal distribution specified by the
probability mass function :math:`q^\sigma`, :math:`\sigma \in \Sigma`.
We suppose that :math:`\xi` is a discrete random vector with finite sample
space :math:`\Omega`, so that :math:`\xi^\omega`, :math:`\omega \in \Omega`,
enumerates all possible realizations. We further suppose that we only have
observations :math:`\xi^\sigma`, :math:`\sigma \in \Sigma \subset \Omega`,
possibly a strict subset, arising in a data-driven setting. In such a
data-driven setting we might have, for example,
:math:`q^\sigma = 1 / |\Sigma|` for all :math:`\sigma \in \Sigma`.
A DRO variant of this nominal stochastic optimization model is:
.. math::
\max_{s \in S} \min_{p \in P}
\sum_{\omega \in \Omega} p^\omega f(s,\xi^\omega).
Here, we may view this DRO model as playing a “game” against nature. First we
select :math:`s \in S`, and then, knowing :math:`s`, nature selects a
worst-case probability distribution :math:`p \in P` to minimize the expected
NPV, which we seek to maximize. The set :math:`P` is the *distributional
uncertainty set* (DUS), which we will define more precisely below. For now, it
is enough to think of :math:`P` as a neighborhood of probability distributions
centered on the given probability mass function :math:`q`, with the radius of
the neighborhood specified by a parameter :math:`\varepsilon`.
- If :math:`\varepsilon = 0`, the DRO model reduces to the nominal model.
- If :math:`\varepsilon` is very large, nature will select the single
worst-case scenario (e.g., lowest budgets, highest costs, lowest NPVs).
This is typically too conservative to be useful.
With moderate values of :math:`\varepsilon`, we obtain solutions that hedge
against deviations from :math:`q` without being excessively conservative.
We do *not* interpret nature as malevolent. Rather, the inner minimization
:math:`\min_{p\in P}` acts as a regularizer to avoid over-fitting our solution
to one specific estimated distribution, in much the same way that regularizers
are used in high-dimensional statistics and machine learning.
.. _subsec-WassersteinDistance:
Defining a Distributional Uncertainty Set via the Wasserstein Distance
----------------------------------------------------------------------
We constrain nature to choose a probability mass function :math:`p` that is
“close” to the nominal (or empirical) distribution :math:`q`. We define the
DUS as:
.. math::
P = \Bigl\{
p :
D(p,q) \le \varepsilon,\;
\sum_{\omega \in \Omega} p^\omega = 1,\;
p^\omega \ge 0,\; \omega \in \Omega
\Bigr\},
where :math:`D(p,q)` is a distance between nature’s choice :math:`p` and the
nominal distribution :math:`q`. The radius :math:`\varepsilon` governs how
much we allow :math:`p` to deviate from :math:`q`, and thus the degree of
conservatism.
There are many ways to measure :math:`D(p,q)`, including the
Kolmogorov–Smirnov distance, Kullback–Leibler divergence, chi-squared
distances, total variation, and more general :math:`\psi`-divergences. The
Wasserstein distance, based on optimal transport, is particularly useful in
DRO. For a distribution with known finite support, the Wasserstein distance
:math:`D(p,q)` between a nominal distribution :math:`q^\sigma`,
:math:`\sigma \in \Sigma`, and a candidate robust distribution :math:`p^\omega`,
:math:`\omega \in \Omega`, is given by the optimal value of the following
transportation problem:
.. math::
:label: WassersteinEq
\begin{aligned}
D(q,p) = \min_z \quad
& \sum_{\sigma \in \Sigma} \sum_{\omega \in \Omega}
d_{\sigma,\omega} z_{\sigma,\omega} \\
\text{s.t.} \quad
& \sum_{\omega \in \Omega} z_{\sigma,\omega} = q^\sigma,
&& \sigma \in \Sigma, \\
& \sum_{\sigma \in \Sigma} z_{\sigma,\omega} = p^\omega,
&& \omega \in \Omega, \\
& z_{\sigma,\omega} \ge 0,
&& \sigma \in \Sigma,\; \omega \in \Omega.
\end{aligned}
Intuitively, :math:`z_{\sigma,\omega}` denotes how much probability mass
:math:`q^\sigma` must be transported a distance :math:`d_{\sigma,\omega}` from
:math:`\xi^\sigma` to :math:`\xi^\omega`. If :math:`\Omega = \Sigma`,
:math:`p^\omega = q^\omega` for all :math:`\omega`, and
:math:`d_{\omega,\omega} = 0`, then :math:`D(p,q) = 0`.
To fully specify :math:`D(p,q)`, we must define
:math:`d_{\sigma,\omega} = \text{dist}(\xi^\sigma,\xi^\omega)`. We may choose
:math:`\text{dist}(\cdot,\cdot)` to be, for example, an :math:`\ell_\eta`-norm:
.. math::
d_{\sigma,\omega}
= \|\xi^\sigma - \xi^\omega\|_\eta.
With the Wasserstein distance, and a given distribution :math:`q`, we define:
.. math::
P = \Bigl\{
p :
D(p,q) \le \varepsilon,\;
\sum_{\omega \in \Omega} p^\omega = 1,\;
p^\omega \ge 0,\; \omega \in \Omega
\Bigr\}
for a chosen radius :math:`\varepsilon`. We think of :math:`P` as a ball (or
neighborhood) of probability distributions centered at :math:`q`. With
:math:`\Sigma \subset \Omega`:
- If :math:`\varepsilon = 0`, then :math:`P` collapses to the singleton
:math:`\{q\}`.
- Larger :math:`\varepsilon` creates larger neighborhoods and more
conservative robust models.
We can represent :math:`P` using the following *extended-variable*
constraints:
.. math::
:label: WassersteinConstraints
\begin{aligned}
& \sum_{\sigma \in \Sigma} \sum_{\omega \in \Omega}
d_{\sigma,\omega} z_{\sigma,\omega} \le \varepsilon, \\
& \sum_{\omega \in \Omega} z_{\sigma,\omega} = q^\sigma,
&& \sigma \in \Sigma, \\
& \sum_{\sigma \in \Sigma} z_{\sigma,\omega} = p^\omega,
&& \omega \in \Omega, \\
& z_{\sigma,\omega} \ge 0,
&& \sigma \in \Sigma,\; \omega \in \Omega.
\end{aligned}
.. _subsec-TractableReformulation:
Towards a Computationally Tractable Reformulation
-------------------------------------------------
Due to the nested :math:`\max \min` in the DRO model, it is not directly
amenable to off-the-shelf solvers. We therefore reformulate it.
For the moment, fix :math:`s \in S`, so that :math:`f(s,\xi^\omega)` is a
known numerical value for each :math:`\omega \in \Omega`. Nature’s problem can
then be written as:
.. math::
\begin{aligned}
\min_{p,z} \quad
& \sum_{\omega \in \Omega} p^\omega f(s,\xi^\omega) \\
\text{s.t.} \quad
& \sum_{\sigma \in \Sigma} \sum_{\omega \in \Omega}
d_{\sigma,\omega} z_{\sigma,\omega}
\le \varepsilon
&& : [-\gamma], \\
& \sum_{\omega \in \Omega} z_{\sigma,\omega}
= q^\sigma,
&& \sigma \in \Sigma : [\nu^\sigma], \\
& \sum_{\sigma \in \Sigma} z_{\sigma,\omega}
= p^\omega,
&& \omega \in \Omega : [\beta^\omega], \\
& z_{\sigma,\omega} \ge 0,
&& \sigma \in \Sigma,\; \omega \in \Omega.
\end{aligned}
Here :math:`\gamma`, :math:`\nu^\sigma`, and :math:`\beta^\omega` are dual
variables. Nature optimizes over :math:`z` and :math:`p` to select a worst-case
distribution within radius :math:`\varepsilon` of :math:`q`.
Taking the dual of this linear DRO program, and substituting
:math:`\beta^\omega = -f(s,\xi^\omega)`, we obtain:
.. math::
:label: dualProlem
\begin{aligned}
\max_{\gamma,\nu} \quad
& -\gamma \varepsilon + \sum_{\sigma \in \Sigma} \nu^\sigma q^\sigma \\
\text{s.t.} \quad
& -\gamma d_{\sigma,\omega} + \nu^\sigma
\le f(s,\xi^\omega),
&& \sigma \in \Sigma,\; \omega \in \Omega, \\
& \gamma \ge 0.
\end{aligned}
To build intuition, consider two extremes:
- Case :math:`\varepsilon = 0`. There is no penalty for large
:math:`\gamma` in the objective. As :math:`\gamma` grows large, the
constraints for :math:`\sigma \ne \omega` become vacuous. For
:math:`\sigma = \omega`, :math:`d_{\sigma,\sigma} = 0`, so the constraint
becomes :math:`\nu^\sigma \le f(s,\xi^\sigma)`. Optimizing yields
:math:`\sum_\sigma q^\sigma f(s,\xi^\sigma)`, which is exactly the nominal
expected value.
- Case :math:`\varepsilon \to \infty`. To avoid a large penalty
:math:`-\gamma \varepsilon` in the objective, we must have :math:`\gamma = 0`.
The constraint reduces to :math:`\nu^\sigma \le f(s,\xi^\omega)` for all
:math:`\omega`, so :math:`\nu^\sigma = \min_\omega f(s,\xi^\omega)`. The
objective becomes:
.. math::
\sum_\sigma q^\sigma \min_\omega f(s,\xi^\omega)
= \min_\omega f(s,\xi^\omega),
which corresponds to nature placing probability 1 on the worst-case scenario.
Specializing :math:`s \in S` to the capital budgeting prioritization decisions,
and :math:`f(s,\xi^\omega)` to the NPV under scenario :math:`\omega`, the DRO
variant of the stochastic capital budgeting problem is:
.. math::
:label: fullDRO
\begin{aligned}
\max_{x,y,\gamma,\nu} \quad
& -\gamma \varepsilon + \sum_{\sigma \in \Sigma} \nu^\sigma q^\sigma \\
\text{s.t.} \quad
& -\gamma d_{\sigma,\omega} + \nu^\sigma
\le \sum_{i \in I} \sum_{j \in J_i} a_{ij}^\omega x_{ij}^\omega,
&& \sigma \in \Sigma,\; \omega \in \Omega, \\
& y_{ii'} + y_{i'i} \ge 1,
&& i < i',\; i,i' \in I, \\
& \sum_{j=1}^{J_i} x_{ij}^\omega
\ge \sum_{j=1}^{J_i} x_{i'j}^\omega + y_{ii'} - 1,
&& i \ne i',\; i,i' \in I,\; \omega \in \Omega, \\
& \sum_{i \in I} \sum_{j \in J_i}
c_{ijkt}^\omega x_{ij}^\omega
\le b_{kt}^\omega,
&& k \in K,\; t \in T,\; \omega \in \Omega, \\
& \sum_{j \in J_i} x_{ij}^\omega \le 1,
&& i \in I,\; \omega \in \Omega, \\
& \gamma \ge 0.
\end{aligned}
.. _subsec-DROSettings:
LOGOS Settings for DRO Problems
-------------------------------
The DRO approach extends the stochastic optimization approach discussed in
Section :ref:`sec-StochasticCapitalBudgeting`. Both share the same input
structure except for the :xmlNode:`Settings` block.
In both cases, the user specifies a collection of scenarios via an
:xmlNode:`Uncertainties` block. The :xmlNode:`problem_type` within
:xmlNode:`Settings` selects the type of DRO problem. Currently available DRO
problem types are:
- :xmlString:`droskp`
- :xmlString:`dromkp`
- :xmlString:`dromckp`
The user can use :xmlNode:`radius_ambiguity` to control the Wasserstein radius
:math:`\varepsilon` for DRO problems.
Example LOGOS input XML for DRO:
.. code-block:: xml
cbc
EF
0.1
maximize
droskp
.. _subsec-DRO_SKP:
DRO for Single Knapsack Problem
-------------------------------
*Model formulation.*
.. math::
:label: DROSimpleKP
\begin{aligned}
\max_{x,y} \max_{\gamma,\nu} \quad
& -\gamma \varepsilon + \sum_{\sigma \in \Sigma} \nu^\sigma q^\sigma \\
\text{s.t.} \quad
& -\gamma d_{\sigma,\omega} + \nu^\sigma
\le \sum_{i \in I} a_i^\omega x_i^\omega,
&& \sigma \in \Sigma,\; \omega \in \Omega, \\
& \sum_{i \in I} c_i^\omega x_i^\omega
\le b^\omega,
&& \omega \in \Omega, \\
& y_{ii'} + y_{i'i} \ge 1,
&& i < i', \\
& x_i^\omega \ge x_{i'}^\omega + y_{ii'} - 1,
&& i \ne i', \\
& x_i^\omega, y_{ii'}^\omega \in \{0,1\}, \\
& \gamma \ge 0.
\end{aligned}
See Section :ref:`subsec-DRO_DKP` for an example LOGOS input file. The
multi-dimensional knapsack problem is a straightforward extension of the
single-dimensional case, and both share :xmlNode:`problem_type`
:xmlString:`droskp`.
.. _subsec-DRO_DKP:
DRO for Multi-Dimensional Knapsack Problem
------------------------------------------
*Model formulation.*
.. math::
:label: DROMultiDKP
\begin{aligned}
\max_{x,y} \max_{\gamma,\nu} \quad
& -\gamma \varepsilon + \sum_{\sigma \in \Sigma} \nu^\sigma q^\sigma \\
\text{s.t.} \quad
& -\gamma d_{\sigma,\omega} + \nu^\sigma
\le \sum_{i \in I} a_i^\omega x_i^\omega,
&& \sigma \in \Sigma,\; \omega \in \Omega, \\
& \sum_{i \in I} c_{it}^\omega x_i^\omega
\le b_t^\omega,
&& t \in T,\; \omega \in \Omega, \\
& y_{ii'} + y_{i'i} \ge 1,
&& i < i', \\
& x_i^\omega \ge x_{i'}^\omega + y_{ii'} - 1,
&& i \ne i', \\
& x_i^\omega, y_{ii'}^\omega \in \{0,1\}, \\
& \gamma \ge 0.
\end{aligned}
Example LOGOS input XML:
.. code-block:: xml
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
1,2,3,4,5
2.315,0.824,22.459,60.589,0.667,5.173,4.003,0.582,0.122,
-2.870,-0.102,-0.278,-0.322,-3.996,-0.246,-20.155
0.219,0.257,0.085,0.0,0.0,
0.0,0.0,0.122,0.103,0.013,
5.044,1.839,0.0,0.0,0.0,
6.74,6.134,10.442,0.0,0.0,
0.425,0.0,0.0,0.0,0.0,
2.125,2.122,0.0,0.0,0.0,
2.387,0.19,0.012,2.383,0.192,
0.0,0.95,0.0,0.0,0.0,
0.03,0.03,0.688,0.0,0.0,
0,0.2,0.763,0.739,2.539,
0.081,0.032,0,0,0,
0.3,0,0,0,0,
0.347,0,0,0,0,
4.025,0.297,0,0,0,
0.095,0.095,0.095,0,0,
5.487,5.664,0.5,6.803,6.778
18,18,18,18,18
10
0.012, 0.019, 0.032, 0.052, 0.086,
0.142, 0.235, 0.188, 0.141, 0.093
11, 11, 11, 11, 11,
12, 12, 12, 12, 12,
13, 13, 13, 13, 13,
14, 14, 14, 14, 14,
15, 15, 15, 15, 15,
16, 16, 16, 16, 16,
17, 17, 17, 17, 17,
18, 18, 18, 18, 18,
19, 19, 19, 19, 19,
20, 20, 20, 20, 20
10,11,12,13,14,15,16
cbc
EF
0.1
maximize
droskp
.. _subsec-DRO_MKP:
DRO for Multiple Knapsack Problem
---------------------------------
*Model formulation.*
.. math::
:label: DROMKP
\begin{aligned}
\max_{x,y} \max_{\gamma,\nu} \quad
& -\gamma \varepsilon + \sum_{\sigma \in \Sigma} \nu^\sigma q^\sigma \\
\text{s.t.} \quad
& -\gamma d_{\sigma,\omega} + \nu^\sigma
\le \sum_{i \in I} \sum_{m \in M} a_i^\omega x_{im}^\omega,
&& \sigma \in \Sigma,\; \omega \in \Omega, \\
& \sum_{i \in I} c_i^\omega x_{im}^\omega
\le b_m^\omega,
&& m \in M,\; \omega \in \Omega, \\
& y_{ii'} + y_{i'i} \ge 1,
&& i < i', \\
& \sum_{m=1}^M x_{im}^\omega
\ge \sum_{m=1}^M x_{i'm}^\omega + y_{ii'} - 1,
&& i \ne i',\; i,i' \in I,\; \omega \in \Omega, \\
& x_{im}^\omega, y_{ii'}^\omega \in \{0,1\}, \\
& \gamma \ge 0.
\end{aligned}
Example LOGOS input XML:
.. code-block:: xml
1,2,3,4,5,6,7,8,9,10
unit_1, unit_2
78, 35, 89, 36, 94, 75, 74, 79, 80, 16
18, 9, 23, 20, 59, 61, 70, 75, 76, 30
103, 156
10
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
101, 154,
102, 155,
103, 156,
104, 157,
105, 158,
106, 159,
107, 160,
108, 161,
109, 162,
110, 163
cbc
EF
0.1
maximize
dromkp
.. _subsec-DRO_MCKP:
DRO for Multiple-Choice Knapsack Problem
----------------------------------------
*Model formulation.*
.. math::
:label: DROMCKP
\begin{aligned}
\max_{x,y} \max_{\gamma,\nu} \quad
& -\gamma \varepsilon + \sum_{\sigma \in \Sigma} \nu^\sigma q^\sigma \\
\text{s.t.} \quad
& -\gamma d_{\sigma,\omega} + \nu^\sigma
\le \sum_{i \in I} \sum_{j \in J_i} a_{ij}^\omega x_{ij}^\omega,
&& \sigma \in \Sigma,\; \omega \in \Omega, \\
& \sum_{i \in I} \sum_{j \in J_i}
c_{ijkt}^\omega x_{ij}^\omega
\le b_{kt}^\omega,
&& k \in K,\; t \in T,\; \omega \in \Omega, \\
& y_{ii'} + y_{i'i} \ge 1,
&& i < i', \\
& \sum_{j=1}^{J_i} x_{ij}^\omega
\ge \sum_{j=1}^{J_i} x_{i'j}^\omega + y_{ii'} - 1,
&& i \ne i',\; i,i' \in I,\; \omega \in \Omega, \\
& x_{ij}^\omega, y_{ii'}^\omega \in \{0,1\}, \\
& \gamma \ge 0.
\end{aligned}
Example LOGOS input XML:
.. code-block:: xml
1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17
1;
1;
1;
1,2,3;
1,2,3,4;
1,2,3,4,5,6,7;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1
2.046
2.679
2.489
2.61
2.313
1.02
3.013
2.55
3.351
3.423
3.781
2.525
2.169
2.267
2.747
4.309
6.452
2.849
7.945
2.538
1.761
3.002
3.449
2.865
3.999
2.283
0.9
8.608
36538462
83849038
4615385
2788461538
2692307692
5480769231
1634615385
2981730768
7211538462
9038461538
649038462
650000000
216346154
212500000
3076923077
3942307692
1144230769
675721154
1442307692
99711538
4807692
123076923
138461538
86538462
108653846
75092404
6413462
147932692
15E9
3
0.2,0.6,0.2
5E9,10E9,15E9
glpk
EF
0.1
maximize
dromckp