5.1 Milk pooling and blending#

Pooling and blending operations involve the “pooling” of various streams to create intermediate mixtures that are subsequently blended with other streams to meet final product specifications. These operations are common to the chemical processing and petroleum sectors where limited tankage may be available, or when it is necessary to transport materials by train, truck, or pipeline to remote blending terminals. Similar applications arise in agriculture, food, mining, wastewater treatment, and other industries.

This notebook considers a simple example of a wholesale milk distributor to show how non-convexity arises in the optimization of pooling and blending operations. Non-convexity is due to presence of bilinear terms that are the product of two decision variables where one is a scale-dependent extensive quantity measuring the amount or flow of a product, and the other is scale-independent intensive quantity such as product composition. The notebook then shows how to develop and solve a convex approximation of the problem, and finally demonstrates solution the use of ipopt, a solver specifically designed to find global solutions to nonlinear optimization (NLO) problems.

Preamble: Install Pyomo and a solver#

This cell selects and verifies a global SOLVER for the notebook. If run on Google Colab, the cell installs Pyomo and ipopt, then sets SOLVER to use the ipopt solver. If run elsewhere, it assumes Pyomo and the Mosek solver have been previously installed and sets SOLVER to use the Mosek solver via the Pyomo SolverFactory. It then verifies that SOLVER is available.

import sys, os

solver_LO = "appsi_highs"

if 'google.colab' in sys.modules:
    %pip install idaes-pse --pre >/dev/null 2>/dev/null
    %pip install highspy >/dev/null 2>/dev/null
    !idaes get-extensions --to ./bin 
    os.environ['PATH'] += ':bin'
    solver_NLO = "ipopt"
else:
    solver_NLO = "mosek"

import pyomo.environ as pyo 

SOLVER_LO = pyo.SolverFactory(solver_LO)
SOLVER_NLO = pyo.SolverFactory(solver_NLO)

assert SOLVER_LO.available(), f"Solver {solver_LO} is not available."
assert SOLVER_NLO.available(), f"Solver {solver_NLO} is not available."
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Problem description: Pooling milk for wholesale blending and distribution#

A bulk distributor supplies custom milk blends to several customers. Each customer has specified a minimum fat content, a maximum price, and a maximum amount of milk they wish to buy. The distributor sources raw milk from local farms. Each farm produces a milk with a known fat content and cost.

The distributor has recently identified more affordable sources raw milk from some remote farms. These remote farms produce milk grades that can be blended with milk from local farms. However, the distributor only has one truck with a single tank available to transport milk from the remote farms. As a result, milk from the remote farms must be mixed in the tank before being transported to the blending station. This creates a “pool” of uniform composition which to be blended with local milk to meet the customer’s requirements.

The process is shown in the following diagram. The fat content and cost of raw milk is given for each farm. For each customer, data is provided for the required milk fat content, price, and the maximum demand. Arrows indicate pooling and blending of raw milk supplies. Each arrow is labeled with the amount of raw milk.

What should the distributor do?

  • Option 1. Do nothing and continue operating the business as usual with local suppliers.

  • Option 2. Buy a second truck to transport raw milk from the remote farms to the blending facility without pooling.

  • Option 3. Pool raw milk from the remote farms into a single truck for transport to the blending facility.

customers = pd.DataFrame(
    {
        "Customer 1": {"min_fat": 0.045, "price": 52.0, "demand": 6000.0},
        "Customer 2": {"min_fat": 0.030, "price": 48.0, "demand": 2500.0},
        "Customer 3": {"min_fat": 0.040, "price": 50.0, "demand": 4000.0},
    }
).T

suppliers = pd.DataFrame(
    {
        "Farm A": {"fat": 0.045, "cost": 45.0, "location": "local"},
        "Farm B": {"fat": 0.030, "cost": 42.0, "location": "local"},
        "Farm C": {"fat": 0.033, "cost": 37.0, "location": "remote"},
        "Farm D": {"fat": 0.050, "cost": 45.0, "location": "remote"},
    },
).T

local_suppliers = suppliers[suppliers["location"] == "local"]
remote_suppliers = suppliers[suppliers["location"] == "remote"]

print("\nCustomers")
display(customers)

print("\nSuppliers")
display(suppliers)
Customers
min_fat price demand
Customer 1 0.045 52.0 6000.0
Customer 2 0.030 48.0 2500.0
Customer 3 0.040 50.0 4000.0
Suppliers
fat cost location
Farm A 0.045 45.0 local
Farm B 0.03 42.0 local
Farm C 0.033 37.0 remote
Farm D 0.05 45.0 remote

Option 1. Business as usual#

The normal business of the milk distributor is to blend supplies from local farms to meet customer requirements. Let \(L = \{ A, B \}\) designate the set of local suppliers, and let \(C = \{1,2,3\}\) designate the set of customers. Decision variable \(z_{l, c}\) is the amount of milk from local supplier \(l\in L\) that is mixed into the blend sold to customer \(c\in C\).

The distributor’s objectives is to maximize profit

\[ \begin{align*} \text{profit} & = \sum_{(l, c)\ \in\ L \times C} (\text{price}_c - \text{cost}_l) z_{l,c} \end{align*} \]

where \((l, c)\ \in\ L\times C\) indicates a summation over the cross-product of two sets. Each term, \((\text{price}_c - \text{cost}_l)\), is the net profit of including one unit of raw milk from supplier \(l\in L\) in the blend delivered to customer \(c\in C\).

The amount of milk delivered to each customer \(c\in C\) can not exceed the customer demand.

\[ \begin{align*} \sum_{l\in L} z_{l, c} & \leq \text{demand}_{c} & \forall c\in C \end{align*} \]

Let \(\text{fat}_l\) denote the fat content of the raw milk produced by farm \(l\), and let \(\text{fat}^{\min}_c\) denote the minimum fat content required by customer \(c\), respectively. Assuming linear blending, the model becomes

\[ \begin{align*} \sum_{(l,c)\ \in\ L \times C} \text{fat}_{l} \cdot z_{l,c} & \geq \text{fat}^{\min}_{c} \sum_{l\in L} z_{l, c} & \forall c \in C \end{align*} \]

This is a standard linear blending problem that can be solved by linear optimization (LO).

m = pyo.ConcreteModel("Milk pooling v1")

# define sources and customers
m.L = pyo.Set(initialize=suppliers.index[suppliers["location"] == "local"].tolist())
m.C = pyo.Set(initialize=customers.index.tolist())

# define local -> customer flowrates
m.z = pyo.Var(m.L * m.C, domain=pyo.NonNegativeReals)


@m.Objective(sense=pyo.maximize)
def profit(m):
    return sum(
        m.z[l, c] * (customers.loc[c, "price"] - suppliers.loc[l, "cost"])
        for l, c in m.L * m.C
    )


@m.Constraint(m.C)
def demand(m, c):
    return sum(m.z[l, c] for l in m.L) <= customers.loc[c, "demand"]


@m.Constraint(m.C)
def fat_content(m, c):
    return (
        sum(m.z[l, c] * suppliers.loc[l, "fat"] for l in m.L)
        >= sum(m.z[l, c] for l in m.L) * customers.loc[c, "min_fat"]
    )


SOLVER_LO.solve(m)
print(f"\nProfit = {m.profit():0.2f}\n")

Z = pd.DataFrame(
    [[l, c, round(m.z[l, c](), 1)] for l, c in m.L * m.C],
    columns=["supplier", "customer", ""],
)
Z = Z.pivot_table(index="customer", columns="supplier")
Z["Total"] = Z.sum(axis=1)
Z["fat content"] = [
    sum(m.z[l, c]() * suppliers.loc[l, "fat"] for l in m.L)
    / sum(m.z[l, c]() for l in m.L)
    for c in m.C
]
Z["min_fat"] = customers["min_fat"]
display(Z)
Profit = 81000.00
Total fat content min_fat
supplier Farm A Farm B
customer
Customer 1 6000.0 0.0 6000.0 0.045 0.045
Customer 2 0.0 2500.0 2500.0 0.030 0.030
Customer 3 2666.7 1333.3 4000.0 0.040 0.040

Option 2. Buy an additional truck#

As shown before, the distributor can earn a profit of 81,000 using only local suppliers. It is possible to earn a higher profit by also sourcing raw milk from the remote suppliers?

Before considering pooling, the distributor may wish to know the maximum profit possible if raw milk from the remote suppliers could be blended just like local suppliers. This would require acquiring and operating a separate transport truck for each remote supplier, and is worth knowing if the additional profit would justify the additional expense.

The linear optimization model presented Option 1 extends the to include both local and remote suppliers.

m = pyo.ConcreteModel("Milk pooling v2 (additional truck)")

# define sources and customers
m.S = pyo.Set(initialize=suppliers.index.tolist())
m.C = pyo.Set(initialize=customers.index)

# define local -> customer flowrates
m.z = pyo.Var(m.S * m.C, domain=pyo.NonNegativeReals)


@m.Objective(sense=pyo.maximize)
def profit(m):
    return sum(
        m.z[s, c] * (customers.loc[c, "price"] - suppliers.loc[s, "cost"])
        for s, c in m.S * m.C
    )


@m.Constraint(m.C)
def demand(m, c):
    return sum(m.z[s, c] for s in m.S) <= customers.loc[c, "demand"]


@m.Constraint(m.C)
def quality(m, c):
    return (
        sum(m.z[s, c] * suppliers.loc[s, "fat"] for s in m.S)
        >= sum(m.z[s, c] for s in m.S) * customers.loc[c, "min_fat"]
    )


SOLVER_LO.solve(m)
print(f"\nProfit = {m.profit():0.2f}\n")

Z = pd.DataFrame(
    [[s, c, round(m.z[s, c](), 1)] for s, c in m.S * m.C],
    columns=["supplier", "customer", ""],
)
Z = Z.pivot_table(index="customer", columns="supplier")
Z["Total"] = Z.sum(axis=1)
Z["fat content"] = [
    sum(m.z[s, c]() * suppliers.loc[s, "fat"] for s in m.S)
    / sum(m.z[s, c]() for s in m.S)
    for c in m.C
]
Z["min_fat"] = customers["min_fat"]
display(Z)
Profit = 122441.18
Total fat content min_fat
supplier Farm A Farm B Farm C Farm D
customer
Customer 1 0.0 0.0 1764.7 4235.3 6000.0 0.045 0.045
Customer 2 0.0 0.0 2500.0 0.0 2500.0 0.033 0.030
Customer 3 0.0 0.0 2352.9 1647.1 4000.0 0.040 0.040

Sourcing raw milk from the remote farms significantly increases profits. This blending, however, requires at least two trucks to keep the sources of milk from the remote suppliers separated until they reach the blending facility. Note that the local suppliers are completely replaced by the lower cost remote suppliers, even to the extent of providing “product giveaway” by surpassing the minimum requirements of Customer 2.

Option 3. Pool delivery from remote suppliers#

Comparing Option 1 with Option 2 shows there is significantly more profit to be earned by purchasing raw milk from the remote suppliers. But that option requires an additional truck to keep the supplies separated during transport.

Because only one truck with a single tank is available for transport from the remote farms, the pool and blending problem is to combine purchases from the remote suppliers into a single pool of uniform composition, transport that pool to the distribution facility, then blend with raw milk from local suppliers to meet individual customer requirements. Compared to option 2, the profit potential may be reduced due to pooling, but without the need to acquire an additional truck.

Pooling problem#

There are several mathematical formulations of pooling problem in the academic literature. The formulation used here is called the \(p\)-parameterization where the pool composition represents a new decision variable \(p\). The other additional decision variables are \(x_r\) referring to the amount of raw milk purchased from remote supplier \(r\in R\), and \(y_c\) which is the amount of the pooled milk included in the blend delivered to customer \(c\in C\).

The profit objective is the difference between the income received for selling blended products and the cost of purchasing raw milk from local and remote suppliers.

\[ \begin{align*} \text{Profit} & = \sum_{(l,c)\ \in\ L \times C} (\text{price}_c - \text{cost}_l)\ z_{l,c} + \sum_{c\in C} \text{price}_c y_{c} - \sum_{r\in R} \text{cost}_r x_{r} \end{align*} \]

The product delivered to each customer from local farms and the pool can not exceed demand.

\[ \begin{align*} \sum_{l\in L} z_{l, c} + y_{c} & \leq \text{demand}_{c} & \forall c\in C \end{align*} \]

Purchases from the remote farms and the amounts delivered to customers from the pool must balance.

\[\begin{split} \begin{align*} \sum_{r\in R}x_{r} & = \sum_{c\in C} y_{c} \\ \end{align*} \end{split}\]

The average milk fat composition of the pool, \(p\), must satisfy an overall balance on milk fat entering the pool from the remote farms and the milk fat delivered to customers.

\[ \begin{align*} \sum_{r\in R}\text{fat}_{r}\cdot x_{r} & = \underbrace{p \sum_{c\in C} y_{c}}_{\text{bilinear}} \end{align*} \]

Finally, the milk fat required by each customer \(c\in C\) satisfies a blending constraint.

\[ \begin{align*} \underbrace{p y_{c}}_{\text{bilinear}} + \sum_{(l,c)\ \in\ L \times C} \text{fat}_{l}\cdot z_{l,c} & \geq \text{fat}^{\min}_c \ (\sum_{l\in L} z_{l, c} + y_{c}) & \forall c \in C \end{align*} \]

The last two constraints include bilinear terms which are the product of the decision variable \(p\) with decision variables \(y_c\) for all \(c\in C\).

Summarizing, the blending and pooling problem is to find a solution to maximize profit, where

\[\begin{split} \begin{align*} \max \quad & \sum_{(l,c)\ \in\ L \times C} (\text{price}_c - \text{cost}_l)\ z_{l,c} + \sum_{c\in C} \text{price}_c y_{c} - \sum_{r\in R} \text{cost}_r x_{r} \\ \text{s.t.}\quad & \sum_{l\in L} z_{l, c} + y_{c} \leq \text{demand}_{c} & \forall c\in C \\ & \sum_{r\in R}x_{r} = \sum_{c\in C} y_{c} \\ & \underbrace{p y_{c}}_{\text{bilinear}} + \sum_{(l,c)\ \in\ L \times C} \text{fat}_{l}\cdot z_{l,c} \geq \text{fat}^{\min}_c\ (\sum_{l\in L} z_{l, c} + y_{c}) & \forall c \in C \\ & \sum_{r\in R}\text{fat}_{r}\cdot x_{r} = \underbrace{p \sum_{c\in C} y_{c}}_{\text{bilinear}} \\ & p, x_r, y_c, z_{l, c} \geq 0 & \forall r\in R, c\in C, l\in L. \end{align*} \end{split}\]

Before attempting a solution to this problem, let us first consider the implications of the bilinear terms.

Why are bilinear problems hard?#

Bilinearity has a profound consequence on the nature of the optimization problem. To explore this, we treat \(p\) as a parameter and repeatedly solve the resulting linear problem for different values of \(p\).

def milk_pooling_bilinear(p):
    m = pyo.ConcreteModel("Milk pooling v3 (Bilinear formulation)")

    # define sources
    m.L = pyo.Set(initialize=suppliers.index[suppliers["location"] == "local"].tolist())
    m.R = pyo.Set(
        initialize=suppliers.index[suppliers["location"] == "remote"].tolist()
    )

    # define customers
    m.C = pyo.Set(initialize=customers.index.tolist())

    # define flowrates
    m.x = pyo.Var(m.R, domain=pyo.NonNegativeReals)
    m.y = pyo.Var(m.C, domain=pyo.NonNegativeReals)
    m.z = pyo.Var(m.L * m.C, domain=pyo.NonNegativeReals)

    m.p = pyo.Param(default=p)

    @m.Objective(sense=pyo.maximize)
    def profit(m):
        return (
            +sum(
                m.z[l, c] * (customers.loc[c, "price"] - suppliers.loc[l, "cost"])
                for l, c in m.L * m.C
            )
            + sum(m.y[c] * customers.loc[c, "price"] for c in m.C)
            - sum(m.x[r] * suppliers.loc[r, "cost"] for r in m.R)
        )

    @m.Constraint(m.C)
    def customer_demand(m, c):
        return sum(m.z[l, c] for l in m.L) + m.y[c] <= customers.loc[c, "demand"]

    @m.Constraint()
    def pool_balance(
        m,
    ):
        return sum(m.x[r] for r in m.R) == sum(m.y[c] for c in m.C)

    @m.Constraint()
    def pool_quality(m):
        return sum(suppliers.loc[r, "fat"] * m.x[r] for r in m.R) == m.p * sum(
            m.x[r] for r in m.R
        )

    @m.Constraint(m.C)
    def customer_quality(m, c):
        return m.p * m.y[c] + sum(
            suppliers.loc[l, "fat"] * m.z[l, c] for l in m.L
        ) >= customers.loc[c, "min_fat"] * (sum(m.z[l, c] for l in m.L) + m.y[c])

    return m
p_plot = np.linspace(0.025, 0.06, 200)
f_plot = []
for p in p_plot:
    m = milk_pooling_bilinear(p)
    SOLVER_LO.solve(m)
    f_plot.append(m.profit())

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(p_plot, f_plot, lw=2)
ax.set_xlabel("Pool composition p")
ax.set_ylabel("Profit")
ax.grid(True)
plt.rcParams.update({"font.size": 14})
plt.tight_layout()
plt.show()
../../_images/b4927f01728ba7877031ad4cc0df8ec9d70c992dfb92d39b25381fd9bd6dd00d.png

In contrast to linear or other convex optimization problems, the objective function for this bilinear optimization problem is a non-convex function of a decision variable \(p\) denoting the composition of the blending pool. In fact, when profit is plotted as a function of \(p\), there are three local maxima separated by two local minima.

Convex Approximation#

The cause of the non-convexity in milk pooling problem are thee bilinear terms \(p y_c\) for all \(c\in C\) that appear in the constraints. A linear approximation can be obtained by introducing decision variables \(w_c\) to take the place of the bilinear terms \(p y_c\) in the model expressions. The result is a new, but incomplete, linear optimization problem

\[\begin{split} \begin{align*} \max\ \text{Profit} = & \sum_{(l,c)\ \in\ L \times C} (\text{price}_c - \text{cost}_l)\ z_{l,c} + \sum_{c\in C} \text{price}_c y_{c} - \sum_{r\in R} \text{cost}_r x_{r} \\ \text{s.t.}\qquad & \sum_{l\in L} z_{l, c} + y_{c} \leq \text{demand}_{c} & \forall c\in C \\ & \sum_{r\in R}x_{r} = \sum_{c\in C} y_{c} \\ & w_c + \sum_{(l,c)\ \in\ L \times C} \text{fat}_{l}\ z_{l,c} \geq \text{fat}^{\min}_c \ (\sum_{l\in L} z_{l, c} + y_{c}) & \forall c \in C \\ & \sum_{r\in R}\text{fat}_{r}\ x_{r} = \underbrace{p \sum_{c\in C} y_{c}}_{\text{bilinear}} \\ & w_c, x_r, y_c, z_{l, c} \geq 0 & \forall r\in R, c\in C, l\in L \end{align*} \end{split}\]

Just adding additional variables isn’t enough. Also needed are constraints that cause \(w_c\) to have close to or equal to \(p y_c\), that is \(w_c \approx p y_c\) for all \(c\in C\). If so, then this process produces a convex approximation to the original problem. Because the approximation relaxes the original constraints, a solution to the convex approximation will produced an over-estimate the potential profit.

From the problem formulation, the values of \(y_c\) are bounded between 0 and demand of customer \(c\), and the value of \(p\) is bounded between the minimum and maximum milk fat concentrations of the remote farms.

\[\begin{split} \begin{align*} 0 \leq\ & y_c \leq \text{demand}_c\ & \forall c\in C \\ \min_{r\in R} \text{conc}_r \leq\ & p \leq \max_{r\in R} \text{conc}_r \\ \end{align*} \end{split}\]

Representing the bounds on \(p\) and \(y_c\) as

\[\begin{split} \begin{align*} \underline{p} & \leq p \leq \bar{p} \\ \underline{y}_c & \leq y_c \leq \bar{y}_c & \forall c\in C \end{align*} \end{split}\]

the McCormick envelope on \(w_c\) is given by a system of four inequalities. For each \(c\in C\),

\[\begin{split} \begin{align*} w_c & \geq \underline{y}_c p + \underline{p} y_c - \underline{p}\underline{y}_c \\ w_c & \geq \bar{y}_c p + \bar{p} y_c - \bar{p}\bar{y}_c \\ w_c & \leq \bar{y}_c p + \underline{p} y_c - \bar{p}\underline{y}_c \\ w_c & \leq \underline{y}_c p + \bar{p} y_c - \underline{p}\bar{y}_c \\ \end{align*} \end{split}\]

The following cell implements these constraint in Pyomo. The features to note are:

  • Use of a rule to specify bounds on the decision variables m.y[c].

  • Creating a new decision variable m.p with bounds.

  • Creating a new set of decision variables m.w[c] to replace the bilinear terms.

  • An indexed Pyomo Block to bound m.w[c] using McCormick envelopes.

The result of these operations is a linear model that will provide an upper bound on the profit. Hopefully the resulting solution and bound will be a close enough approximation to be useful.

def milk_pooling_convex():
    m = pyo.ConcreteModel("Milk pooling v3 (Convex approximation)")

    # define sources
    m.L = pyo.Set(initialize=suppliers.index[suppliers["location"] == "local"].tolist())
    m.R = pyo.Set(
        initialize=suppliers.index[suppliers["location"] == "remote"].tolist()
    )

    # define customers
    m.C = pyo.Set(initialize=customers.index.tolist())

    # define flowrates
    m.x = pyo.Var(m.R, domain=pyo.NonNegativeReals)
    m.y = pyo.Var(m.C, bounds=lambda m, c: (0, customers.loc[c, "demand"]))
    m.z = pyo.Var(m.L * m.C, domain=pyo.NonNegativeReals)

    # composition of the pool
    m.p = pyo.Var(
        bounds=(suppliers.loc[m.R, "fat"].min(), suppliers.loc[m.R, "fat"].max())
    )

    # w[c] to replace bilinear terms p * y[c]
    m.w = pyo.Var(m.C, domain=pyo.NonNegativeReals)

    @m.Objective(sense=pyo.maximize)
    def profit(m):
        return (
            +sum(
                m.z[l, c] * (customers.loc[c, "price"] - suppliers.loc[l, "cost"])
                for l, c in m.L * m.C
            )
            + sum(m.y[c] * customers.loc[c, "price"] for c in m.C)
            - sum(m.x[r] * suppliers.loc[r, "cost"] for r in m.R)
        )

    # create a block of constraints for every customer
    @m.Block(m.C)
    def mccormick(b, c):
        m = b.model()
        b.ll = pyo.Constraint(
            expr=m.w[c]
            >= m.y[c].lower * m.p + m.p.lower * m.y[c] - m.p.lower * m.y[c].lower
        )
        b.hh = pyo.Constraint(
            expr=m.w[c]
            >= m.y[c].upper * m.p + m.p.upper * m.y[c] - m.p.upper * m.y[c].upper
        )
        b.lh = pyo.Constraint(
            expr=m.w[c]
            <= m.y[c].upper * m.p + m.p.lower * m.y[c] - m.p.lower * m.y[c].upper
        )
        b.hl = pyo.Constraint(
            expr=m.w[c]
            <= m.y[c].lower * m.p + m.p.upper * m.y[c] - m.p.upper * m.y[c].lower
        )

    @m.Constraint(m.C)
    def customer_demand(m, c):
        return sum(m.z[l, c] for l in m.L) + m.y[c] <= customers.loc[c, "demand"]

    @m.Constraint()
    def pool_balance(
        m,
    ):
        return sum(m.x[r] for r in m.R) == sum(m.y[c] for c in m.C)

    @m.Constraint()
    def pool_quality(m):
        return sum(suppliers.loc[r, "fat"] * m.x[r] for r in m.R) == sum(
            m.w[c] for c in m.C
        )

    @m.Constraint(m.C)
    def customer_quality(m, c):
        return m.w[c] + sum(
            suppliers.loc[l, "fat"] * m.z[l, c] for l in m.L
        ) >= customers.loc[c, "min_fat"] * (sum(m.z[l, c] for l in m.L) + m.y[c])

    SOLVER_LO.solve(m)

    return m
def report_solution(m):
    # Supplier report
    S = suppliers.copy()
    for l in m.L:
        for c in m.C:
            S.loc[l, c] = m.z[l, c]()
    for r in m.R:
        S.loc[r, "Pool"] = m.x[r]()
    S = S.fillna(0)
    S["Amount"] = S[m.C].sum(axis=1) + S["Pool"]
    S["Expense"] = S["Amount"] * S["cost"]

    # Customer report
    C = customers.copy()
    for c in m.C:
        for l in m.L:
            C.loc[c, l] = m.z[l, c]()
    for c in m.C:
        C.loc[c, "Pool"] = m.y[c]()
    C = C.fillna(0)
    C["Amount"] = C[m.L].sum(axis=1) + C["Pool"]
    C["fat delivered"] = (
        sum(C[l] * S.loc[l, "fat"] for l in m.L) + C["Pool"] * m.p()
    ) / C["Amount"]
    C["Income"] = C["Amount"] * C["price"]

    print(m)
    print(f"\nPool composition = {m.p():5.4f}")
    print(f"Profit = {m.profit():0.2f}")
    print(f"\nSupplier Report\n")
    display(S.round(4))
    print(f"\nCustomer Report\n")
    display(C.round(4))


m_convex = milk_pooling_convex()
report_solution(m_convex)
'Milk pooling v3 (Convex approximation)'

Pool composition = 0.0400
Profit = 111411.76

Supplier Report
fat cost location Customer 1 Customer 2 Customer 3 Pool Amount Expense
Farm A 0.045 45.0 local 2500.0 -0.0000 -0.0 0.0000 2500.0000 112500.0000
Farm B 0.030 42.0 local 0.0 1029.4118 0.0 0.0000 1029.4118 43235.2956
Farm C 0.033 37.0 remote 0.0 0.0000 0.0 4852.9412 4852.9412 179558.8244
Farm D 0.050 45.0 remote 0.0 0.0000 0.0 4117.6471 4117.6471 185294.1195
Customer Report
min_fat price demand Farm A Farm B Pool Amount fat delivered Income
Customer 1 0.045 52.0 6000.0 2500.0 0.0000 3500.0000 6000.0 0.0421 312000.0
Customer 2 0.030 48.0 2500.0 -0.0 1029.4118 1470.5882 2500.0 0.0359 120000.0
Customer 3 0.040 50.0 4000.0 -0.0 0.0000 4000.0000 4000.0 0.0400 200000.0

The convex approximation of the milk pooling model estimates an upper bound on profit of 111,412 for a pool composition \(p = 0.040\). The plot below compares this solution to what was found by in an exhaustive search over values of \(p\).

fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(p_plot, f_plot, lw=2)
ax.set_xlabel("Pool composition p")
ax.set_ylabel("Profit")
ax.grid(True)

ax.plot(m_convex.p(), m_convex.profit(), "ro", ms=10)
ax.axhline(m_convex.profit(), color="r", linestyle="--")
ax.axvline(m_convex.p(), color="r", linestyle="--")
ax.annotate(
    "convex approximation",
    xy=(m_convex.p(), m_convex.profit()),
    xytext=(0.036, 106000),
    ha="right",
    fontsize=14,
    arrowprops=dict(shrink=0.1, width=1, headwidth=5, facecolor="black"),
)
plt.rcParams.update({"font.size": 14})
plt.tight_layout()
plt.show()
../../_images/fe6a4740aa3c604082e16a4bf76b5cb9d4ec976ea9eea334b74f7774797f45d9.png

As this stage the calculations find the maximum profit for a given value of \(p\). The challenge, of course, is that the optimal value of \(p\) is unknown. The following cell computes profits over a range of \(p\).

The convex approximation is clearly misses the market in the estimate of profit and pool composition \(p\). Without the benefit of the full scan of profit as a function of \(p\), the only check on the profit estimate would be to compute the solution to model for the reported value of \(p\). This is done below.

m_est = milk_pooling_bilinear(m_convex.p())
report_solution(m_est)

fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(p_plot, f_plot, lw=2)
ax.set_xlabel("Pool composition p")
ax.set_ylabel("Profit")
ax.grid(True)

ax.plot(m_convex.p(), m_convex.profit(), "ro", ms=10)
ax.axhline(m_convex.profit(), color="r", linestyle="--")
ax.axvline(m_convex.p(), color="r", linestyle="--")
ax.annotate(
    "convex approximation",
    xy=(m_convex.p(), m_convex.profit()),
    xytext=(0.036, 106000),
    ha="right",
    fontsize=14,
    arrowprops=dict(shrink=0.1, width=1, headwidth=5, facecolor="black"),
)

ax.plot(m_est.p(), m_est.profit(), "go", ms=10)
ax.axhline(m_est.profit(), color="g", linestyle="--")
ax.annotate(
    "local max found",
    xy=(m_convex.p(), m_est.profit()),
    xytext=(0.045, 105000),
    ha="left",
    fontsize=14,
    arrowprops=dict(shrink=0.1, width=1, headwidth=5, facecolor="black"),
)
plt.rcParams.update({"font.size": 14})
plt.tight_layout()
plt.show()
'Milk pooling v3 (Bilinear formulation)'

Pool composition = 0.0400
Profit = 100088.24

Supplier Report
fat cost location Customer 1 Customer 2 Customer 3 Pool Amount Expense
Farm A 0.045 45.0 local 6000.0 0.0 0.0 0.0000 6000.0000 270000.0000
Farm B 0.030 42.0 local 0.0 0.0 0.0 0.0000 0.0000 0.0000
Farm C 0.033 37.0 remote 0.0 0.0 0.0 3823.5294 3823.5294 141470.5878
Farm D 0.050 45.0 remote 0.0 0.0 0.0 2676.4706 2676.4706 120441.1770
Customer Report
min_fat price demand Farm A Farm B Pool Amount fat delivered Income
Customer 1 0.045 52.0 6000.0 6000.0 0.0 0.0 6000.0 0.045 312000.0
Customer 2 0.030 48.0 2500.0 0.0 0.0 2500.0 2500.0 0.040 120000.0
Customer 3 0.040 50.0 4000.0 0.0 0.0 4000.0 4000.0 0.040 200000.0
../../_images/67fa8136bb2857e049591d92d926a341c23feb07ab4e72ae2367082b6324c3af.png

The result shows the profit if the pooled milk transported from the remote farms has a fat content \(p = 0.04\) them a profit of 100,088 is realized which is better than 81,000 earned for business as usual with just local suppliers, but falls short of the 122,441 earned if the remote milk supply could be transported without pooling.

With regard to the practical impact, the results of using this particular convex approximation are mixed. The approximation successfully produced a value for the pool composition \(p\) which would produce a profit of over \(100088\). However, the reported value for \(p\) was actually the smallest of the three local maxima for this problem. This discrepancy may have large consequences regarding the choice of suppliers.

Nonlinear Optimization (NLO) solution with ipopt#

The final version of this milk pooling model returns to the bilinear formulation with pool composition \(p\) as a decision variable. The following Pyomo implementation needs to specify a solver capable of solving the resulting problem, in this case we use the nonlinear solver ipopt.

def milk_pooling_bilinear_NLO():
    m = pyo.ConcreteModel("Milk pooling bilinear model")

    # define sources
    m.L = pyo.Set(initialize=suppliers.index[suppliers["location"] == "local"].tolist())
    m.R = pyo.Set(
        initialize=suppliers.index[suppliers["location"] == "remote"].tolist()
    )

    # define customers
    m.C = pyo.Set(initialize=customers.index.tolist())

    # define flowrates
    m.x = pyo.Var(m.R, domain=pyo.NonNegativeReals)
    m.y = pyo.Var(m.C, bounds=lambda m, c: (0, customers.loc[c, "demand"]))
    m.z = pyo.Var(m.L * m.C, domain=pyo.NonNegativeReals)
    m.p = pyo.Var(bounds=(remote_suppliers["fat"].min(), remote_suppliers["fat"].max()))

    @m.Objective(sense=pyo.maximize)
    def profit(m):
        return (
            +sum(
                m.z[l, c] * (customers.loc[c, "price"] - suppliers.loc[l, "cost"])
                for l, c in m.L * m.C
            )
            + sum(m.y[c] * customers.loc[c, "price"] for c in m.C)
            - sum(m.x[r] * suppliers.loc[r, "cost"] for r in m.R)
        )

    @m.Constraint(m.C)
    def customer_demand(m, c):
        return sum(m.z[l, c] for l in m.L) + m.y[c] <= customers.loc[c, "demand"]

    @m.Constraint()
    def pool_balance(
        m,
    ):
        return sum(m.x[r] for r in m.R) == sum(m.y[c] for c in m.C)

    @m.Constraint()
    def pool_quality(m):
        return sum(suppliers.loc[r, "fat"] * m.x[r] for r in m.R) == m.p * sum(
            m.x[r] for r in m.R
        )

    @m.Constraint(m.C)
    def customer_quality(m, c):
        return m.p * m.y[c] + sum(
            suppliers.loc[l, "fat"] * m.z[l, c] for l in m.L
        ) >= customers.loc[c, "min_fat"] * (sum(m.z[l, c] for l in m.L) + m.y[c])

    return m
m_global = milk_pooling_bilinear_NLO()
SOLVER_NLO.options["bound_frac"] = 0.01
SOLVER_NLO.solve(m_global)
report_solution(m_global)

fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(p_plot, f_plot, lw=2)
ax.set_xlabel("Pool composition p")
ax.set_ylabel("Profit")
ax.grid(True)
ax.plot(m_global.p(), m_global.profit(), "go", ms=10)
ax.axhline(m_global.profit(), color="g", linestyle="--")
ax.annotate(
    "global max found",
    xy=(m_global.p(), m_global.profit()),
    xytext=(0.035, 87000),
    ha="left",
    fontsize=14,
    arrowprops=dict(shrink=0.1, width=1, headwidth=5, facecolor="black"),
)
plt.rcParams.update({"font.size": 14})
plt.tight_layout()
plt.show()
'Milk Pooling Model - Bilinear (NLO)'

Pool composition = 0.0330
Profit = 102833.33

Supplier Report
fat cost location Customer 1 Customer 2 Customer 3 Pool Amount Expense
Farm A 0.045 45.0 local 6000.0001 -0.0 2333.3334 0.0000 8333.3334 375000.0037
Farm B 0.030 42.0 local -0.0000 0.0 -0.0000 0.0000 -0.0000 -0.0000
Farm C 0.033 37.0 remote 0.0000 0.0 0.0000 4166.6667 4166.6667 154166.6683
Farm D 0.050 45.0 remote 0.0000 0.0 0.0000 -0.0000 -0.0000 -0.0000
Customer Report
min_fat price demand Farm A Farm B Pool Amount fat delivered Income
Customer 1 0.045 52.0 6000.0 6000.0001 -0.0 0.0000 6000.0001 0.045 312000.0031
Customer 2 0.030 48.0 2500.0 -0.0000 0.0 2500.0000 2500.0000 0.033 120000.0012
Customer 3 0.040 50.0 4000.0 2333.3334 -0.0 1666.6667 4000.0000 0.040 200000.0020
../../_images/18c4ef7055cd03606fdb28a61dcf9e561a8854e890c47a2aab39c8a799392ef2.png

In this case, the convergence to the global optimum was pure luck. The way general non-linear optimization solvers work is that they begin from an initial solution and improve from then on, stopping when a set of conditions, known as the KKT conditions, is met. If we change the ipopt settings for how to pick the initial point, we can end up in one of the suboptimal local minima.

Setting the localmin optional argument to True, we can trigger this behavior and obtain a solution that is not the global optimum.

m_global2 = milk_pooling_bilinear_NLO(localmin=True)
# change ipopt options to find local minima instead of the global one
SOLVER_NLO.options["bound_frac"] = 0.5
SOLVER_NLO.solve(m_global2)
report_solution(m_global2)

fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(p_plot, f_plot, lw=2)
ax.set_xlabel("Pool composition p")
ax.set_ylabel("Profit")
ax.grid(True)
ax.plot(m_global.p(), m_global.profit(), "bo", ms=10)
ax.axhline(m_global.profit(), color="b", linestyle="--")
ax.annotate(
    "global max",
    xy=(m_global.p(), m_global.profit()),
    xytext=(0.0245, 96000),
    ha="left",
    fontsize=14,
    arrowprops=dict(shrink=0.1, width=1, headwidth=5, facecolor="black"),
)
ax.plot(m_global2.p(), m_global2.profit(), "go", ms=10)
ax.axhline(m_global2.profit(), color="g", linestyle="--")
ax.annotate(
    "local max found",
    xy=(m_global2.p(), m_global2.profit()),
    xytext=(0.0495, 93000),
    ha="left",
    fontsize=14,
    arrowprops=dict(shrink=0.1, width=1, headwidth=5, facecolor="black"),
)
plt.rcParams.update({"font.size": 14})
plt.tight_layout()
plt.show()
'Milk Pooling Model - Bilinear (NLO)'

Pool composition = 0.0450
Profit = 101392.16

Supplier Report
fat cost location Customer 1 Customer 2 Customer 3 Pool Amount Expense
Farm A 0.045 45.0 local 0.0 -0.0 -0.0000 0.0000 -0.0000 -0.0000
Farm B 0.030 42.0 local -0.0 2500.0 1333.3333 0.0000 3833.3334 161000.0016
Farm C 0.033 37.0 remote 0.0 0.0 0.0000 2549.0196 2549.0196 94313.7265
Farm D 0.050 45.0 remote 0.0 0.0 0.0000 6117.6471 6117.6471 275294.1203
Customer Report
min_fat price demand Farm A Farm B Pool Amount fat delivered Income
Customer 1 0.045 52.0 6000.0 0.0 -0.0000 6000.0001 6000.0001 0.045 312000.0031
Customer 2 0.030 48.0 2500.0 -0.0 2500.0000 -0.0000 2500.0000 0.030 120000.0012
Customer 3 0.040 50.0 4000.0 -0.0 1333.3333 2666.6667 4000.0000 0.040 200000.0020
../../_images/de0d1bc36bc029e0df68f6768aff1d1f468644f463345d86ad3d3a26131fc3cc.png

Concluding Remarks#

The solution for the bilinear pooling model reveals several features of the problem.

  • For the given parameters, pooling raw materials for shipment from remote suppliers yields the most profitable solution, but that solution is only possible because there are local suppliers to augment the pool blend to meet individual customer requirements.

  • Customer 2 receives a blend that is 3.3% exceeding the requirement of 3%. This results in some “give away” of product quality in return for the economic benefits of pooling.

Bibliographic Notes#

The pooling and blending is a large scale, high value, fundamental problem of logistics for the process and refining industries. The prototypical examples are the pooling and blending crude oils to meet the feed stock constraints of refineries, and for the pooling of refinery products for pipeline delivery to distribution terminals. Un

Haverly (1978) is a commonly cited small benchmark problem for the pooling and blending of sulfurous fuels.

Haverly, C. A. (1978). Studies of the behavior of recursion for the pooling problem. Acm sigmap bulletin, (25), 19-28. https://dl.acm.org/doi/pdf/10.1145/1111237.1111238

There is an extensive literature on pooling and blending. The following encyclopedia entry explains the history of the pooling problem, how it leads to multiple local minima and other pathological behaviors, and approaches to finding practical solutions.

Visweswaran, V. (2009). MINLP: Applications in Blending and Pooling Problems. https://link.springer.com/referenceworkentry/10.1007/978-0-387-74759-0_375

Recent research overviews include

Gupte, A., Ahmed, S., Dey, S. S., & Cheon, M. S. (2013). Pooling problems: relaxations and discretizations. School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA. and ExxonMobil Research and Engineering Company, Annandale, NJ. http://www.optimization-online.org/DB_FILE/2012/10/3658.pdf

The current state-of-the-art appears to be a formulation of the pooling problem is a mixed-integer quadratically-constrained quadratic optimization on a given network.

Ceccon, F., & Misener, R. (2022). Solving the pooling problem at scale with extensible solver GALINI. Computers & Chemical Engineering, 107660. https://arxiv.org/pdf/2105.01687.pdf

Applications for pooling and blending are probably underappreciated. In particular, what role might pooling and blending problems have in projects like the World Food Programme (WFP)?