Tuesday, 19 September 2017

Spillover from the coupled effect of public policy and technology into new areas of services and manufacturing

Technological innovations, coupled with complementary public policies, create spillover effects that extend into new areas of services and manufacturing. Dormant and stagnant industries benefit from the resulting surge in demand for products and services. Several segments have set benchmarks with their technological innovation and, in turn, have opened doors for the growth of complementary industries.

1.       Electric Vehicles
Currently, electric vehicles (EVs) make up about 3% of the vehicles in the world. Governments are encouraging automakers to develop their internal capabilities and put EVs on the road in the next 10-15 years. While this seems like a feasible timeline, certain factors work against such growth plans. First, many OEMs do not have the research and development (R&D) capabilities needed to transition from internal combustion engines to battery-powered technology. The industry is further constrained by a dearth of battery suppliers and inadequate availability of charging stations. However, the road is not entirely gloomy, since technology companies such as Tesla are pioneers of innovation in EV technology. Traditional auto OEMs, in contrast, will require a joint collaborative effort to synchronize their capabilities and achieve the desired output.
The spillover effect will benefit the following industries:
·         Battery manufacturers
·         Aluminium sheet manufacturers to take advantage of light weight auto body
·         Infrastructure for supporting charging stations
·         Electronics manufacturers for supplying accessories to charging stations and vehicles
The spillover effect will hurt the following industries:
·         Steel manufacturers
·         IC engine manufacturers
·         Gas and oil producers and suppliers
·         OEMs without dedicated R&D for electric vehicle technology

2.      Artificial Intelligence
Next on our list is artificial intelligence (AI), which has generated a lot of buzz in the last five years. The exponential rise of digital data and online consumers has driven the use of AI for several applications, primarily in the areas of NLP (natural language processing) and text and speech recognition. It is not limited to digital media, however; AI has moved into other areas of automation and is fuelling the growth of drones and driverless cars. AI is expected to create jobs for high-skilled workers and to transform businesses from an operating mode to a controlling mode. This has been one of the most debated talking points of the last decade.
The spillover effect will benefit the following industries:
·         Automation and AI driven technologies
·         Supporting technology such as IoT, Cloud computing etc.
·         Change from BPO (Business Process Outsourcing) to BCO (Business Control Outsourcing)
The spillover effect will hurt the following industries:
·         Service sector
·         Manual labor
·         Traditional manufacturing

3.      Blockchain
Blockchain technology has recently become an automatic choice for various industries, largely owing to the positive sentiment resulting from its short-span success with cryptocurrencies. It builds on the old field of distributed systems, which has come into the limelight in the last few years. Decentralized control and a built-in trust mechanism ensure secure transactions, thanks to large computing power and a consensus mechanism (a minimal proof-of-work sketch follows the lists below). Securing a huge volume of digital transactions requires a large amount of computing power, which would aid the dormant GPU industry, which views it as a long-term market. A bigger customer base and a higher frequency of digital transactions impose additional requirements for computing hardware. It is still early to predict its success, but with the emergence of enterprise and public blockchain platforms, it may well gain general acceptance in the near future.
The spillover effect will benefit the following industries:
·         GPU manufacturers
·         Cloud mining
·         Digital transaction
The spillover effect will hurt the following industries:
·         Third party security
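To give a feel for the computation involved, below is a minimal proof-of-work sketch in Python; the block contents and difficulty value are hypothetical and chosen only for illustration, not taken from any particular blockchain.

import hashlib

def proof_of_work(block_data, difficulty=4):
    # Find a nonce whose SHA-256 hash of block_data + nonce starts with
    # `difficulty` zero hex digits (a toy stand-in for mining)
    nonce = 0
    target = "0" * difficulty
    while True:
        digest = hashlib.sha256(("%s%d" % (block_data, nonce)).encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

# Each extra zero of difficulty multiplies the expected work by 16, which is
# why securing a large volume of transactions demands so much computing power
print(proof_of_work("tx1;tx2;tx3"))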

Tuesday, 12 September 2017

Inventory: Result of mismatch between supply and demand

Inventory results from a mismatch between supply and demand in a manufacturing organization. Demand and supply are external to the manufacturing system and carry uncertainty. Low supplier yields caused by breakdowns, maintenance or unreliable lead times create uncertainty in obtaining adequate supply. Demand has always been uncertain due to several factors beyond the organization's control.
Manufacturing organizations do control internal capabilities such as production capacity, storage capacity, resources and flexibility. Flexibility can take the form of volume flexibility or delivery flexibility, and both forms affect the amount of inventory in the system. Further, owing to the assembly and production activities within the organization, three types of inventory exist at any given point of time:
1.      Raw material inventory
2.      WIP (Work In Process) inventory
3.      Finished goods inventory
The names of the three categories are self-explanatory: they are the forms of inventory that exist during procurement, during production and after final assembly. Raw material inventory is the stock of raw material held to absorb uncertainty from less reliable suppliers; the quantity of each part required by production is obtained from the BOM (Bill of Material) structure. While it depends mostly on supplier uncertainty, the decision of how much stock to hold is influenced by production capacity and forecasted demand. The inventory requirement for a particular period "t" can be stated as:
Inventory[t] = Forecasted demand[t] - Production[t] - Inventory[t-1]
The above expression is a necessary condition, but it will not necessarily produce an optimal result, since no objective is being optimized. Starting from the forecasted demand for finished goods, and using BOM utilization and the current material on hand, we can calculate the requirement (the forecasted demand for raw material). WIP inventory is an outcome of system design and of how the work stations are aligned to smooth the production flow.
While these quantities are deterministic in nature, delivery lead time introduces another source of uncertainty, and the demand during the lead time must be accounted for in the way inventory is replenished. This complicates the calculations, since demand uncertainty cascades further down into them. Safety stock is the inventory held to absorb the uncertainty originating from unreliable lead times, and it enters the reorder point as:
Reorder Point = Lead time consumption + Safety Stock
These calculations are rather straightforward, but the system remains vulnerable to the uncertainties of supply, demand and delivery lead time. Production capacity is internal and can be controlled by the manufacturing system. So, with an accurate prediction of the systematic demand component, the inventory stock can be determined; safety stock is determined by taking into account both the systematic and random components of demand along with lead time uncertainty.
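As a minimal numerical illustration of these relations, the sketch below computes lead time consumption, safety stock and the reorder point under the common textbook assumption of normally distributed daily demand; all figures and the service level are hypothetical.

from statistics import NormalDist

# Hypothetical inputs
avg_daily_demand = 100     # units/day (systematic component)
std_daily_demand = 20      # units/day (random component)
lead_time_days = 5
service_level = 0.95       # target probability of not stocking out

# Demand during lead time and its standard deviation
lead_time_consumption = avg_daily_demand * lead_time_days
sigma_lt = std_daily_demand * lead_time_days ** 0.5

# Safety stock covers the random component at the chosen service level
z = NormalDist().inv_cdf(service_level)
safety_stock = z * sigma_lt

# Reorder Point = Lead time consumption + Safety Stock
reorder_point = lead_time_consumption + safety_stock
print(round(safety_stock), round(reorder_point))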

Monday, 11 September 2017

Applying TOC (Theory of Constraints) in manufacturing operations

Efficiency and productivity have traditionally received the greatest attention in performance measurement. However, E.M. Goldratt emerged as one of the strongest critics of using efficiency for performance measurement. We use plenty of charts, graphs and analyses for depicting operational scenarios, yet they often fail during implementation in real-world situations. Bottlenecks exist everywhere, although the concept is used primarily in manufacturing operations. This article focuses primarily on manufacturing operations, but the ideas apply to services as well.
According to Goldratt, three variables can be used to control and measure manufacturing operations: throughput, operational expense and inventory. He argued that any activity in the organisation can be grouped under these three categories. Throughput is the rate at which the system produces output. Operational expense is the total expense incurred in running the manufacturing operation. And inventory represents all the money tied up in assets that the organisation intends to sell.
The next important concept is the identification of the bottleneck that hinders the progress of the entire chain. A chain is only as strong as its weakest link; thus, once the weakest link or bottleneck is identified, efforts should be directed at improving its efficiency. This can be accomplished by investing resources and focusing improvements on the bottleneck operation. At any given point of time there is one bottleneck, and once its speed is resolved, the bottleneck moves elsewhere. Hence, bottlenecks move within the organisation; the bottleneck may even move out of manufacturing and into sales and marketing operations. Above all, it is a process of continuous improvement, and the need is to keep identifying the bottleneck. On the shop floor, a bottleneck operation usually shows up as a work station with a large queue of material in front of it, and it starves the succeeding operations.
A typical manufacturing environment is characterised by precedence relationships and statistical fluctuations: operations are carried out in a predetermined sequence, and processing times have associated variance. Because of these two factors, uncertainty prevails, and it becomes necessary to identify the most influential operation, which determines throughput. The more a system tries to move towards 100% balancing, the greater the chance of system failure. Thus, it is sometimes fine to keep resources with slack and a substantial amount of inventory, since faster response in serving customers improves customer satisfaction and brings in new customers.
Summarising, the steps for applying TOC in manufacturing operations are (a minimal sketch for Step 2 follows the list):
Step 1: Determine precedence relationship between operations
Step 2: Identify bottleneck or weakest link in the operation
Step 3: Continuous improvement, redistribution / addition of resources and outsourcing  
Step 4: Go to step 1
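As a minimal sketch of Step 2, the snippet below identifies the bottleneck of a serial line as the station with the lowest throughput; the stations and processing times are hypothetical.

# Hypothetical serial line: station -> processing time per unit (minutes)
proc_time = {'Cutting': 2.0, 'Welding': 3.5, 'Painting': 2.5, 'Assembly': 3.0}

# Station throughput in units per minute; the line runs at the minimum
throughput = {s: 1.0 / t for s, t in proc_time.items()}
bottleneck = min(throughput, key=throughput.get)

print("Bottleneck:", bottleneck)
print("Line throughput (units/hour):", round(60 * throughput[bottleneck], 1))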

Friday, 8 September 2017

Analytics in supply chain and logistics

Supply chain and logistics activities increasingly rely on analytics to gain competitive advantage. Previously, supply chain configuration focused on dedicated markets serving loyal customers; the transition in recent years has been from a sellers' market to a buyers' market. Henry Ford famously said of the Model T that a customer could have "any color so long as it is black". The remark indicates that people then had a dearth of options for cars, and color was not something they were concerned about. That thought would not stand in today's world, characterized by a competitive environment encompassing many players. Supply chain has traversed a long journey over the last century, and growth has multiplied in the last decade, driven by factors such as technology, customer awareness, the internet, diversification and globalization.
A Google search for analytics yields several areas of the domain: Google Analytics, business analytics, web analytics, data analytics, supply chain analytics and so on. Supply chains use analytics in forms suited to decision making at each phase. Strategic decision making for greenfield projects focuses on identifying site locations and predicting future growth. It uses qualitative inputs that govern potential locations by evaluating future tariffs, logistics service providers (LSPs)/3PLs, geopolitical stability, supplier proximity and the like, and it also depends heavily on product attributes, the distribution network, future demand (with its uncertainty) and the target market. Data analytics is commonly used for analyzing past data and identifying growth opportunities, covering shipment patterns, load optimization, heavy lanes, service levels, fill rates and more, for both the base scenario and the to-be scenario. Product attributes classify products under volume and weight criteria into the following categories:
·         High volume – Low weight (e.g. cigarettes)
·         High volume – High weight (e.g. heavy machinery)
·         Low volume – Low weight (e.g. plastic pins)
·         Low volume – High weight (e.g. metal screws)
The product categorization is primarily based on density, but it can be elaborated using the attributes above. Further, the life cycle and usage of a product classify it as fast moving, medium moving or slow moving. FMCG items are fast-moving products with shelf-life limitations, while high annual-usage-value products are not stored in inventory, in order to smooth cash flow.
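A minimal sketch of such a classification in Python, with hypothetical products and cut-off values:

# Hypothetical product master: unit volume (m^3) and unit weight (kg)
products = {
    'cigarette carton': (0.05, 0.5),
    'machine housing':  (0.20, 150.0),
    'plastic pins box': (0.01, 0.2),
    'metal screws box': (0.01, 8.0),
}

VOL_CUTOFF = 0.03   # m^3, assumed threshold between low and high volume
WT_CUTOFF = 5.0     # kg, assumed threshold between low and high weight

for name, (vol, wt) in products.items():
    v = 'High volume' if vol > VOL_CUTOFF else 'Low volume'
    w = 'high weight' if wt > WT_CUTOFF else 'low weight'
    print('%s: %s - %s (density %.0f kg/m^3)' % (name, v, w, wt / vol))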
Finally, analytics is also used in areas such as route planning, inventory planning, supply planning, demand planning and network design. This is not an exhaustive list; it is impossible to cover every area of the supply chain domain that utilizes analytics. Analytics keeps evolving with the impetus of technological innovations such as IoT, Big Data and cloud storage, and it offers complementary benefits when utilized for continuous improvement of the supply chain.

Monday, 4 September 2017

Is data analysis the same as data analytics?

I had been hearing the term "data" used in a rather vague form in the context of MIS (management information systems), ERP (enterprise resource planning) and so on. But during the last four years it has gathered increasing attention from industries across geographies. Its advent is in sync with the growth of data users across the globe, primarily from emerging economies. The general public has grown so attached to the internet and smartphones that the exponential rise in the user base is worth noting. While at first this looked like a forthcoming trend, it has lately transcended into a domain of its own: IoT (Internet of Things). The vast digital footprints left by individuals operating on the internet, even from remote devices, are up for grabs to analyse. An article dating back to 2014 cited IDC's Digital Universe study, which predicted that the world's data would grow to 44 zettabytes by 2020 and that IoT would contribute 10% of it. That is a huge number in an absolute sense. Data storage and security issues come along with it and require analysing data on the fly.
So there is huge data, but how does one make sense of it? There is a lot of noise in the data, which makes it an arduous task to explore it and visualise the findings. Statistics is the basic technique that has long been used to analyse data, but data science has added a new dimension to the existing knowledge framework by proposing scientific means to explore data and present findings. It is confusing to define the boundaries of buzzwords such as data science, machine learning or artificial intelligence. At first it occurred to me that they are almost the same, but there is a fine line that separates them into unique specialisations. The same is the case with data analysis and analytics. Given data, there are two basic modes of analysis: descriptive and prescriptive. Descriptive analysis reports the data in its crude form and provides the information evident at first glance. But there is nothing remarkable in that; we want to know things that are not evident at first look. Prescriptive analysis is a diagnostic mechanism that provides recommendations and inferences by delving much deeper into the data. While both mechanisms have proven their worth at different phases of decision making, the important point is that both are data analysis. The usual MIS reporting to upper management is a process that more or less doesn't change with time; it is a form of data analysis. Visualisations have become so important for managers that sometimes an inferential chart is the best way to send the message.
Performing plain data analysis on humongous data is not a feasible alternative; rather, one must move towards more detailed, evidence-based analysis. Data analysis is a subset of data analytics, which starts with basic data cleaning and analysis and moves on to applying specific statistical tools and machine learning to provide better insights. Using data analytics for both quantitative and qualitative analysis is becoming an integral part of industry decision making. It would be right to say that data analysis has moved ahead and become part of the larger data analytics domain, which is a suitable answer to the coming data explosion.
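As a toy contrast between the two modes, the snippet below pairs a descriptive summary with a simple trend-based projection; the sales figures are hypothetical.

import pandas as pd
import numpy as np

# Hypothetical monthly sales
df = pd.DataFrame({'month': range(1, 13),
                   'sales': [100, 104, 103, 110, 112, 118, 117, 124, 126, 130, 133, 138]})

# Descriptive analysis: report the data as it stands
print(df['sales'].describe())

# Delving deeper: fit a linear trend and project the next month,
# the kind of inference a plain descriptive report does not give
slope, intercept = np.polyfit(df['month'], df['sales'], 1)
print('Projected sales for month 13:', round(slope * 13 + intercept, 1))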

Friday, 1 September 2017

A primer for operations research formulation: Analysis of product mix problem with bill of material (BOM)


The product mix problem considered here incorporates the engineering bill of material structure in the model. In an actual situation, the flow of such a system encompasses input, production and output. The input is typically raw material capacity. Production is a complex network of processes and workstations with equipment, machines and a human workforce. Since we want to analyze the product mix problem without delving deep into scheduling and machines, we consider processes as part of the production system.

The production system has an engineering bill of material that ties the raw material requirement to each end product. The concept of equivalent units is used for the products produced in sequential processes yielding different output amounts. We use the minimum capacity of each product among all the processes as the daily capacity, computing capacity in terms of both number of units and time. On the output end we have demand and the constraints imposed by the market.

The production data represents the crux of the problem. Each shift has four hundred and eighty minutes. There are three processes, with planned downtime of eight percent, fourteen percent and twelve percent per shift respectively. The processing time of each product on each process is known with high accuracy. Along with these data, we have the bill of material matrix representing the engineering requirement of each raw material for each end product. The next step is to compute the equivalent units and minimum capacities, which constitute the pre-processed data. The available shift time for each process is calculated by deducting downtime from the total available time. These time durations are then divided by the respective processing times of the products on the processes, and the quotients are rounded down to accommodate practical considerations in production. The capacity of each product is the minimum capacity across all the processes. The product with the least capacity is taken as the standard product, and equivalent units of the other products are computed relative to it.

The output data encompasses the daily demand for each product and the market constraints on the amounts to be sold. Such market constraints might require producing a minimum amount of a product, or they may reflect negatively correlated demand between substitute products and positively correlated demand between complementary products. Products P3 and P2 are positively correlated, which requires keeping P3 at least fifty units above P2.

The algebraic notation for the problem: x_i is the amount of raw material i consumed, y_j the amount of product j produced, and s_k the idle time of process k. The parameters are the profit profit_j, raw material capacity rmcap_i, BOM requirement bom_ij (units of raw material i per unit of product j), equivalent units equnit_j, minimum capacity mincap, processing time pt_jk, available shift time shifttime_k and demand demand_j.

The formulation of the product mix problem, matching the code below, is:

Maximize    sum_j profit_j * y_j
subject to:
    x_i <= rmcap_i                            for each raw material i
    sum_j bom_ij * y_j = x_i                  for each raw material i
    sum_j equnit_j * y_j <= mincap
    sum_j pt_jk * y_j + s_k = shifttime_k     for each process k
    y_j <= demand_j                           for each product j
    y_P3 - y_P2 >= 50
    x_i, y_j >= 0 and integer, s_k >= 0
The formulation involves some redundancy in the process-time calculations; the objective here is simply to illustrate the approach for such problems.

Python Code:



# import required packages

from pulp import *
import pandas as pd
import math

# Input data lists and parameter 

rawmtrl = ['RM1','RM2']
product = ['P1','P2','P3']
process = [1,2,3]

rmcap = [['RM1',30000],['RM2',40000]]
bom = [['RM1','P1',1],['RM1','P2',3],['RM1','P3',1],['RM2','P1',2],['RM2','P2',2],['RM2','P3',1]]
shifttime = 480
downtime = [[1,0.08],[2,0.14],[3,0.12]]
prodproctime = [['P1',1,0.3],['P1',2,0.4],['P1',3,0.4],['P2',1,0.3],['P2',2,0.2],['P2',3,0.4],['P3',1,0.4],['P3',2,0.5],['P3',3,0.4]]
demand = [['P1',800],['P2',300],['P3',400]]
profit = [['P1',5500],['P2',3000],['P3',2300]]

# Preprocessing of data for OR model

shiftprod = []
for i in downtime:
    prod = shifttime * (1 - i[1])
    shiftprod.append([i[0],prod])

prodproccap = []
for i in prodproctime:
    for j in shiftprod:
        if i[1] == j[0]:
            c = math.floor(j[1]/i[2])
            prodproccap.append([i[0],i[1],c])
            
prodproccapdf = pd.DataFrame(prodproccap,columns=['Product','Process','MinCapacity'])
prodproccapdf = prodproccapdf.groupby(['Product'], as_index=False)['MinCapacity'].min()
prodproccapdf = prodproccapdf.set_index(['Product'])

mincap = int(prodproccapdf.min())

equnitdf = mincap/prodproccapdf
equnitdf = equnitdf.rename(columns={'MinCapacity':'EqUnits'})

# Converting other input data into dataframe for OR model

rmcapdf = pd.DataFrame(rmcap,columns=['Raw Material','Capacity'])
rmcapdf = rmcapdf.set_index(['Raw Material'])

bomdf = pd.DataFrame(bom,columns=['Raw Material','Product','ReqUnits'])
bomdf = bomdf.set_index(['Raw Material','Product'])

prodproctimedf = pd.DataFrame(prodproctime,columns=['Product','Process','PT'])
prodproctimedf = prodproctimedf.set_index(['Product','Process'])

shiftproddf = pd.DataFrame(shiftprod,columns=['Process','ShiftTime'])
shiftproddf = shiftproddf.set_index(['Process'])

demanddf = pd.DataFrame(demand,columns=['Product','Demand'])
demanddf = demanddf.set_index(['Product'])

profitdf = pd.DataFrame(profit,columns=['Product','Profit'])
profitdf = profitdf.set_index(['Product'])

# Lists for OR model
rmprod = []
for i in rawmtrl:
    for j in product:
        rmprod.append([i,j])
        
prodproc = []
for j in product:
    for k in process:
        prodproc.append([j,k])
        
class ProductMixApplication:
    
    def __init__(self,rawmtrl,product,process,rmprod,prodproc,rmcapdf,bomdf,equnitdf,mincap,prodproctimedf,shiftproddf,
                 demanddf,profitdf):
        
        self.rawmtrl = rawmtrl
        self.product = product
        self.process = process
        self.rmprod = rmprod
        self.prodproc = prodproc
        self.rmcapdf = rmcapdf
        self.bomdf = bomdf
        self.equnitdf = equnitdf
        self.mincap = mincap
        self.prodproctimedf = prodproctimedf
        self.shiftproddf = shiftproddf
        self.demanddf = demanddf
        self.profitdf = profitdf
             
        self.prob = LpProblem("Product Mix Application",LpMaximize)
        
        self.x = LpVariable.dicts("Raw_Material_amount_variable", ((i) for i in self.rawmtrl),lowBound=0,cat='Integer')
        
        self.y = LpVariable.dicts("Product_amount_variable", ((j) for j in self.product),lowBound=0,cat='Integer')
        
        self.s = LpVariable.dicts("Idle_Time_variable", ((k) for k in self.process),lowBound=0,cat='Continuous')
        
        self.prob += lpSum([self.profitdf.loc[(j),'Profit'] * self.y[j] for j in self.product])
        
        # Raw material capacity constraint
        
        for i in self.rawmtrl:
            self.prob += self.x[i] <= self.rmcapdf.loc[(i),'Capacity']
            
        # BOM requirement constraint
        
        for i in self.rawmtrl:
            self.prob += lpSum([self.y[j] * self.bomdf.loc[(a,j),'ReqUnits'] for a,j in self.rmprod if a == i]) == self.x[i]
            
        # Production capacity constraint in equivalent units
        
        self.prob += lpSum([self.y[j] * self.equnitdf.loc[(j),'EqUnits'] for j in self.product]) <= self.mincap
        
        # Processing time constraint
        
        for k in self.process:
            self.prob += lpSum([self.prodproctimedf.loc[(j,a),'PT'] * self.y[j] for j,a in self.prodproc if a == k]) + self.s[k] == self.shiftproddf.loc[(k),'ShiftTime']
        
        # Demand constraint
        
        for j in self.product:
            self.prob += self.y[j] <= self.demanddf.loc[(j),'Demand']
            
        # Market constraint for correlated product demands
        
        self.prob += self.y['P3'] - self.y['P2'] >= 50
        
    def solve(self):
        self.prob.solve()
        self.prob.writeLP("ProductMixApplication.lp")
        
    def status(self):
        return LpStatus[self.prob.status]
        
    def objective(self):
        return value(self.prob.objective)
        
    def returnVar(self):
        var = []
        for i,v in enumerate(self.prob.variables()):
            var.append([v.name,v.varValue])
        return var

mod = ProductMixApplication(rawmtrl,product,process,rmprod,prodproc,rmcapdf,bomdf,equnitdf,mincap,prodproctimedf,shiftproddf,
              demanddf,profitdf)
mod.solve()
status = mod.status()
variables = mod.returnVar()
objective = mod.objective()
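Assuming a solver is available to PuLP, the solution can then be inspected as follows:

print('Status:', status)
print('Objective (total profit):', objective)
for name, val in variables:
    print(name, '=', val)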

Column generation: Vehicle routing problem with time windows

For understanding basics of column generation please refer to my previous post http://practicalappliedanalytics.blogspot.in/2017/09/column-generation-column.html

The informal formulation for VRPTW (Desaulniers, G., Desrosiers, J. and Solomon, M.M., Column Generation, Chapter 3, Springer, 2005, DOI: 10.1007/b135457) is as follows:

The restricted master problem for path generation is the set-partitioning relaxation over the generated paths, reconstructed here to match the code below:

Minimize    sum_p c_p * lambda_p
subject to:
    sum_p a_ip * lambda_p = 1    for each customer i
    lambda_p >= 0

where c_p is the cost of path p and a_ip counts how many times path p visits customer i.

The sub-problem is a shortest path problem with capacity and time-window constraints, with the reduced cost factor in the objective. Writing pi_i for the dual of customer i's covering constraint (and taking pi as zero for the depot), the objective of the sub-problem is

Minimize    sum_(i,j) (c_ij - pi_i) * x_ij

A path is added to the master problem whenever this objective is negative.


Assumptions:
1.     Homogeneous fleet of vehicles
2.     Starting and ending depot are the same
3.     Cost is assumed to be proportional to distance
4.     Travel time is calculated assuming a fixed vehicle speed on all lanes
5.     The solution of the model might require further branching on vehicle or flow variables to obtain an integer solution; here only the column-generated model is built
6.     Upload the 'VRPTWData.csv' file for extracting input parameters. This data is based on the Solomon instances.


Python code

(Link:https://gist.github.com/adarsh70/f4ecb197ce0bc2252bd8c01f6d5978f1)

#Vehicle Routing Problem with time windows (Adapted from Solomon)




from pulp import *

import pandas as pd

#Data for the model both in list and dataframe


aggrdat = pd.read_csv('VRPTWData.csv',index_col = ['CustNo'])

aggrdatrec = pd.DataFrame.to_records(aggrdat)
aggrdatser = pd.Series(aggrdatrec)
aggrdatlis = aggrdatser.tolist()

nodesdf = aggrdat.iloc[:,[]]             # all nodes (depot + customers), index only
custdf = aggrdat.iloc[1:,[]]             # customers only (row 0 is the depot)
demanddf = aggrdat.iloc[1:,[2]]          # customer demand column
timewindowdf = aggrdat.iloc[0:,[3,4,5]]  # ready time, due date, service time

nodes = []
for i,x in enumerate(aggrdatlis):
    for j,y in enumerate(x):
        if j == 0:
            nodes.append(y)
            
demand = []
for i,x in enumerate(aggrdatlis):
    if i != 0:
        demand.extend([([x[0],x[3]])])
        
timewindow = []
for i,x in enumerate(aggrdatlis):
    timewindow.extend([[x[0],x[4],x[5],x[6]]])

cust = [] 
for i,x in enumerate(nodes):
    if i > 0:
        cust.append(x)
        
depot = [] 
for i,x in enumerate(nodes):
    if i == 0:
        depot.append(i)
        
loccoord = []
for i,x in enumerate(aggrdatlis):
    loccoord.extend([[x[0],x[1],x[2]]])

#Calculate distance and time based on assumed vehicle speed of 5 units per time period


def dist(a,b):
    # Euclidean distance between locations a and b -> [fromnode, tonode, distance]
    dm = (((a[1] - b[1])**2) + ((a[2] - b[2])**2))**(1/2)
    return [a[0], b[0], dm]

def time(a,b):
    # Travel time between a and b assuming a speed of 5 distance units per period
    dm = ((((a[1] - b[1])**2) + ((a[2] - b[2])**2))**(1/2))/5
    return [a[0], b[0], dm]

arcs = []
for i,x in enumerate(loccoord):
    for j,y in enumerate(loccoord):
        arcs.extend([[x[0],y[0]]])
 
arccost = []
for i,x in enumerate(loccoord):
    for j,y in enumerate(loccoord):
        arccost.append(dist(x,y))

arctime = []
for i,x in enumerate(loccoord):
    for j,y in enumerate(loccoord):
        arctime.append(time(x,y))
     
# Big-M for the time-window linking constraints: M_ij = b_i - a_j + t_ij
M = []
for i,x in enumerate(arctime):
    s = timewindow[x[0]-1][2] - timewindow[x[1]-1][1] + x[2]
    M.extend([[x[0],x[1],s]])

arccostdf = pd.DataFrame(arccost,columns=['fromnode','tonode','cost'])
arccostdf = arccostdf.set_index(['fromnode','tonode'])

arcsdf = pd.DataFrame(arcs,columns=['fromnode','tonode'])

arctimedf = pd.DataFrame(arctime,columns=['fromnode','tonode','time'])
arctimedf = arctimedf.set_index(['fromnode','tonode'])

Mdf = pd.DataFrame(M,columns=['fromnode','tonode','Mvalue'])
Mdf = Mdf.set_index(['fromnode','tonode'])

#Master Problem relaxed for calculating duals


class MasterProblem:
    
    def __init__(self,nodes,arcs,cust,depot,demand,timewindow,arccost,arctime,vehcap,genpaths,arccostdf,demanddf,arcsdf,nodesdf,custdf,arctimedf,timewindowdf,Mdf,problemname):
        
        self.nodes = nodes
        self.arcs = arcs
        self.cust = cust
        self.depot = depot
        self.demand = demand
        self.timewindow = timewindow
        self.arccost = arccost
        self.arctime = arctime
        self.vehcap = vehcap
        self.genpaths = genpaths
        self.arccostdf = arccostdf
        self.demanddf = demanddf
        self.arcsdf = arcsdf
        self.nodesdf = nodesdf
        self.custdf = custdf
        self.arctimedf = arctimedf
        self.timewindowdf = timewindowdf
        self.Mdf = Mdf
        self.problemname = problemname
        
        
        self.prob = LpProblem(problemname,LpMinimize)
        
        self.obj = LpConstraintVar("obj")
        self.prob.setObjective(self.obj)
        
        self.PathVars = []
        
        # One set-partitioning constraint per customer: visited exactly once
        self.constraintList = []
        for i,x in enumerate(cust):
            var = LpConstraintVar("Cust"+str(i),LpConstraintEQ,1)
            self.constraintList.append(var)
            self.prob += var
            
        
        # Decode each generated path into the list of node labels it visits
        ct = []
        for i,x in enumerate(self.genpaths):
            temp = []
            for j,y in enumerate(x):
                if y>0:
                    for a,b in enumerate(self.arcs):
                        if a == j:
                            temp.append(b[0])
                            temp.append(b[1])
            ct.extend([temp])
            
        # Count how many times each customer appears on each path
        custcount = []
        temp = []
        for i,x in enumerate(ct):
            for a,b in enumerate(self.cust):           
                temp.append(x.count(b))
            custcount.extend([temp])
            temp = []
        
        
        # One LP variable per initial path: objective coefficient is the path cost,
        # constraint coefficients are the customer visit counts
        for i,x in enumerate(self.genpaths):
            pathcost = 0
            for j,y in enumerate(x): 
                if y>0:
                    pathcost += (self.arccost[j][2])
                    
            var = LpVariable("Path"+str(i),0,None,LpContinuous,lpSum(self.obj * pathcost + [self.constraintList[a] * b for a,b in enumerate(custcount[i])]))
            self.PathVars.append(var)
            
    def solve(self):
        self.prob.writeLP("Masterprob.lp")
        self.prob.solve()
        
        return [self.prob.constraints[i].pi for i in self.prob.constraints]
    
    def addPath(self,path):
        # Add a newly priced path (arc-indicator vector) as a column in the RMP
        self.genpaths.append(path)
        temp = []
        
        
        for j,y in enumerate(path):
            if y>0:
                for a,b in enumerate(self.arcs):
                    if a == j:
                        temp.append(b[0])
                        temp.append(b[1])
                        
        custcount = []
        for i,x in enumerate(self.cust):
            custcount.append(temp.count(x))
            
        pathcost = 0
        for j,y in enumerate(path):
            if y>0:
                pathcost += (self.arccost[j][2])
                    
        var = LpVariable("Path"+str(len(self.genpaths)),0,None,LpContinuous,lpSum(self.obj * pathcost + [self.constraintList[a] * b for a,b in enumerate(custcount)]))
        self.PathVars.append(var)
            
            
    def startSubProblem(self,duals):
        newSubProb = SubProblem(duals,self.arcs,self.nodes,self.vehcap,self.nodesdf,self.custdf,self.arccostdf,self.demanddf,self.arcsdf,self.timewindowdf,self.Mdf,self.arctimedf)
        path=newSubProb.returnPath()
        return path
    
    def setRelaxed(self,relaxed):  
        if relaxed==False:
            for var in self.prob.variables():
                var.cat = LpBinary
                
    def getObjective(self):
        return value(self.prob.objective)
    
    def getUsedPaths(self):
        usedPathList=[]
        for i,x in enumerate(self.PathVars):
            if value(x)>0:
                usedPathList.append((value(x),self.genpaths[i]))
        return usedPathList        

#Sub Problem for finding new path variables based on reduced cost


class SubProblem:
 
    def __init__(self,duals,arcs,nodes,vehcap,nodesdf,custdf,arccostdf,demanddf,arcsdf,timewindowdf,Mdf,arctimedf):
     

        self.subprob = LpProblem("Sub Problem solver",LpMinimize)
     
        self.varList = LpVariable.dicts("Arc_variable", ((i,j) for i,j in arcs),cat='Binary')
     
        # Reduced-cost objective: arcs leaving customer i (nodes 2..N) carry the dual
        # duals[i-2] of that customer's covering constraint; depot arcs keep the plain cost
        self.subprob += lpSum([self.varList[i,j] * (arccostdf.loc[(i,j),'cost'] - duals[i - 2]) for i,j in arccostdf.index if i > 1]
                             + [self.varList[i,j] * arccostdf.loc[(i,j),'cost'] for i,j in arccostdf.index if i == 1])
                         
     
        self.subprob += lpSum([self.varList[i,j] * demanddf.loc[(i),'Demand'] for i,j in arcs if i != 1]) <= vehcap
     
        self.subprob += lpSum([self.varList[i,j] for i,j in arcs if i == 1]) == 1
     
        self.subprob += lpSum([self.varList[i,j] for i,j in arcs if j == 1]) == 1
     
        for i in nodes:
            for j in nodes:
                if i == j:
                    self.subprob += self.varList[i,j] == 0
     
        for c in [n for n in nodes if n != 1]:   # flow conservation at each customer (node 1 is the depot)
            self.subprob += (lpSum([self.varList[i,j] for i,j in arcs if j == c]) == lpSum([self.varList[i,j] for i,j in arcs if i == c]))
     
        self.twvarList = LpVariable.dicts("Time_variable", ((i) for i in nodes),cat='Continuous')
     
        for i in nodes:
            self.subprob += self.twvarList[i] >= timewindowdf.loc[(i),'ReadyTime']
            self.subprob += self.twvarList[i] <= timewindowdf.loc[(i),'DueDate']
         
     
        for i in nodes:
            for j in nodes:
                self.subprob += (lpSum([self.twvarList[i] - self.twvarList[j] + Mdf.loc[(i,j),'Mvalue'] * self.varList[i,j]]) <= lpSum([Mdf.loc[(i,j),'Mvalue'] - arctimedf.loc[(i,j),'time']]))
     
     
     
        self.subprob.writeLP('subprob.lp')
        self.subprob.solve()
     
    def returnPath(self):
        # A negative objective means the new column prices out favourably
        path=False
        if value(self.subprob.objective) < 0:
            path=[]
            for i,v in enumerate(self.subprob.variables()):
                if i < len(arcs):   # arc variables sort before the time variables
                    path.append(v.varValue)
        return path

#Generate initial paths for each customer individually (i.e. Depot-Cust-Depot) and run iteration on Master and Sub-Problem to get the solution


def conjugatepos(arcpos,arcs):
    for i,x in enumerate(arcs):
        if i == arcpos:
            first = x[1]
            second = x[0]
        
    for i,x in enumerate(arcs):
        if x[0] == first and x[1] == second:
            pos = i
    return pos

genpaths=[]
print ("Generating start paths:")
## generate simple start paths
for x in range(len(cust)):
    temp=[0.0 for y in range(len(arcs))]
    genpaths.append(temp)


for i,x in enumerate(genpaths):
    genpaths[i][i+1] = 1.0
    conjpos = conjugatepos(i+1,arcs)
    genpaths[i][conjpos] = 1.0
    print (genpaths[i])



print ("\n\nTrying to solve problem")
CGprob=MasterProblem(nodes,arcs,cust,depot,demand,timewindow,arccost,arctime,20,genpaths,arccostdf,demanddf,arcsdf,nodesdf,custdf,arctimedf,timewindowdf,Mdf,'Vehicle Routing Problem with time windows')

relaxed=True
while relaxed==True:   
    duals=CGprob.solve()

    newPath=CGprob.startSubProblem(duals)

    print ('New path: %s' % newPath)

    if newPath:
        CGprob.addPath(newPath)
    else:
        CGprob.setRelaxed(False)
        CGprob.solve()
        relaxed=False

print ("\n\nSolution: %s Optimal path cost is" % CGprob.getObjective())

t=CGprob.getUsedPaths()
for i,x in enumerate(t):
    print ("Path %s: selected %s times %s" % (i,x[0],x[1]))

#The LP solution may be non-integer, so the next step would be B&B on the flow variables


Column Generation: Shortest path problem with time constraint

The column generation technique is used for integer (pure or mixed) programming problems in which the number of variables is very large. Such a problem can be viewed as a matrix with a small number of rows and a very large number of columns. Since it is difficult to enumerate all the variables and incorporate them in the problem, column generation iteratively adds variables that are attractive to enter. It is similar to the simplex method, in which new variables are brought into the basis.

Here we look at column generation on a shortest path problem with a time constraint. A set of nodes and arcs is given, with an arc time and an arc cost associated with each arc. The objective is to travel from the start node to the end node at minimum cost while keeping the total path time within a threshold value. The problem selected (Desaulniers, G., Desrosiers, J. and Solomon, M.M., Column Generation, Chapter 1, Springer, 2005, DOI: 10.1007/b135457) is shown in the figure below:


It is required to travel from node 1 to node 6 at minimum cost while satisfying the total time constraint. The problem can be stated in MILP form, reconstructed here to match the data and code below, as:

Minimize    sum_(i,j) c_ij * x_ij
subject to:
    sum_(i,j) t_ij * x_ij <= 14
    sum_j x_1j = 1                  (leave the start node once)
    sum_i x_i6 = 1                  (enter the end node once)
    sum_i x_in = sum_j x_nj         for every intermediate node n
    x_ij in {0,1}
The difficulty lies in enumerating the paths such that the objective is met. Here there are only nine paths, but enumeration becomes intractable for problems involving thousands of nodes. In such cases, we decompose the original problem into a master problem and a sub-problem. The master problem is started with a randomly selected path plus an artificial variable with a very large cost coefficient, and it is solved without the integrality constraints while new path variables are added iteratively. The path variables must sum to one, and the time restriction of the original problem is retained. The master problem, the so-called restricted master problem, is formulated (matching the code below) as:

Minimize    sum_p c_p * lambda_p + 1000 * x0
subject to:
    sum_p t_p * lambda_p + x0 <= 14      (1.11)
    sum_p lambda_p + x0 = 1              (1.12)
    lambda_p, x0 >= 0

where c_p and t_p are the cost and time of path p and x0 is the artificial variable.
The sub-problem is the same as the original problem, except that the objective is modified with the reduced cost using the duals pi_1 and pi_2 of constraints (1.11) and (1.12):

Minimize    sum_(i,j) (c_ij - pi_1 * t_ij) * x_ij - pi_2
Upon construction of the restricted master problem and the sub-problem, we iterate until the objective of the sub-problem becomes greater than or equal to zero. If the resulting solution is not integer, column generation would then be performed at each node of a branch-and-bound tree; here we consider only the first column generation solution, iterating until the sub-problem objective is non-negative. The data consists of a .csv file with four headers: fromnode, tonode, cost and time. The total path time threshold is 14.


from pulp import *
import pandas as pd

arcdf = pd.read_csv('.....SPTRESData.csv', index_col=['fromnode', 'tonode'])



class MasterProblem:
    
    def __init__(self,maxresrtime,nodes,arcdf,arcs,arccost,arctime,initialpaths,problemname):
        self.maxresrtime = maxresrtime
        self.nodes = nodes
        self.arcdf = arcdf
        self.arcs = arcs
        self.arccost = arccost
        self.arctime = arctime
        self.initialpaths = initialpaths
        
        self.prob = LpProblem(problemname,LpMinimize)
        
        self.obj = LpConstraintVar("obj")
        self.prob.setObjective(self.obj)
        
        self.PathVars = []
        # pathList[i] records the arc-indicator vector behind PathVars[i]
        # (None for the artificial variable), so used paths can be reported later
        self.pathList = [None]
        
        self.FirstConstr = LpConstraintVar("TimeResrConstr",LpConstraintLE,self.maxresrtime)
        self.prob += self.FirstConstr
        
        self.SecondConstr = LpConstraintVar("LambdaConstr",LpConstraintEQ,1.0)
        self.prob += self.SecondConstr
        
        # Artificial variable with a large cost (1000) keeps the restricted
        # master problem feasible before any good paths exist
        artfvar = LpVariable("ArtVar",0,None,LpContinuous,
                            lpSum(self.obj * 1000 + self.FirstConstr + self.SecondConstr))
        self.PathVars.append(artfvar)
        
        # Cost and time of the single initial path
        pathtime = 0
        pathcost = 0
        for i,x in enumerate(self.initialpaths):
            if x>0:
                pathtime += (self.arctime[i])
                pathcost += (self.arccost[i])
        
        self.pathList.append(list(self.initialpaths))
        var = LpVariable("Path1",0,None,LpContinuous,
                            lpSum(self.obj * pathcost + pathtime * self.FirstConstr + self.SecondConstr))
        self.PathVars.append(var)
            
    def solve(self):
        self.prob.writeLP("MasterProb.lp")
        self.prob.solve()
        
        return [self.prob.constraints[i].pi for i in self.prob.constraints]
    
    def addPath(self,path):
        # Register the newly priced path and add its column to the RMP
        self.pathList.append(path)
        pathtime = 0
        pathcost = 0
        
        for j,y in enumerate(path):
            if y>0:
                pathtime += (self.arctime[j])
                pathcost += (self.arccost[j])
                
        var = LpVariable("Path"+str(len(self.pathList)-1),0,None,LpContinuous,
                         lpSum(self.obj * pathcost + pathtime * self.FirstConstr + self.SecondConstr))
        self.PathVars.append(var)
        
    
    def startSubProblem(self,duals):
        newSubProb = SubProblem(duals,self.nodes,self.arcdf,self.arcs,self.arccost,self.arctime)
        path = newSubProb.returnPath()
        return path
    
    def setRelaxed(self,relaxed):  
        if relaxed==False:
            for var in self.prob.variables():
                var.cat = LpBinary
                
    def getObjective(self):
        return value(self.prob.objective)
    
    def getUsedPaths(self):
        usedPathList=[]
        for i,x in enumerate(self.PathVars):
            if value(x)>0:
                usedPathList.append((value(x),self.pathList[i]))
        return usedPathList




class SubProblem:
    
    def __init__(self,duals,nodes,arcdf,arcs,arccost,arctime):
        
        
        f = -duals[0]
            
        s = -duals[1]
            
        
        self.subprob = LpProblem("Sub Solver",LpMinimize)
            
        self.varList = LpVariable.dicts("Arc_variable", ((i,j) for i,j in arcs),cat='Binary')
        
        self.subprob += lpSum([self.varList[i,j] * arcdf.loc[(i,j),'cost'] for i,j in arcdf.index] + 
                             [self.varList[i,j] * f * arcdf.loc[(i,j),'time'] for i,j in arcdf.index]) + s 
        
        self.subprob += lpSum([self.varList[i,j] for i,j in arcs if i == 1]) == 1
        
        self.subprob += lpSum([self.varList[i,j] for i,j in arcs if j == 6]) == 1
        
        for n in nodes:
            if n != 1:
                if n != 6:
                    self.subprob += (lpSum([self.varList[i,j] for i,j in arcs if i == n]) == lpSum([self.varList[i,j] for i,j in arcs if j == n]))     
        
        
        self.subprob.writeLP('subprob.lp')
        self.subprob.solve() 
        
        
    
    def returnPath(self):
        path=False
        if value(self.subprob.objective) < 0:
            path=[]
            for v in self.subprob.variables():
                path.append(v.varValue)
        return path





nodes = [1,2,3,4,5,6]
arcs = [[1,2],[1,3],[2,4],[2,5],[3,2],[3,4],[3,5],[4,5],[4,6],[5,6]]
arccost = [1,10,1,2,1,5,12,10,1,2]
arctime = [10,3,1,3,2,7,3,1,7,2]
# Initial path 1-2-4-6, expressed as arc indicators over the arcs list above
paths = [1,0,1,0,0,0,0,0,1,0]



print ("\n\nTrying to solve problem")
CGprob=MasterProblem(14,nodes,arcdf,arcs,arccost,arctime,paths,'Shortest Path Problem with time constraint')

relaxed=True
while relaxed==True:   # once no more new columns can be generated set relaxed to False
    duals=CGprob.solve()

    newPath=CGprob.startSubProblem(duals)

    print ('New path: %s' % newPath)

    if newPath:
        CGprob.addPath(newPath)
    else:
        CGprob.setRelaxed(False)
        CGprob.solve()
        relaxed=False

print ("\n\nSolution: %s Optimal path cost is" % CGprob.getObjective())

t=CGprob.getUsedPaths()
for i,x in enumerate(t):
    print ("Path %s: selected %s times %s" % (i,x[0],x[1]))