Tuesday, October 27, 2009

Travel in Liechtenstein: Variation of Shortest Path Problem

depiction file: liech sample.dpn
excel file: liech sample2.xlsx


This time, for our example, we would set an imaginary trip to one city in Liechtenstein.  Suppose we are planning to leave the hotel in the morning, go to an architecture object, and then lunch at a restaurant.  There are several other places that we want to visit, but it was so hot that my goal in life is to make sure I get enough fun while traveling on the shortest walking distance ever. Therefore, we can formulate this to be somewhat similar to the shortest path problem.  



Part I: Preparing the Raw Data

First, an imaginary route and attraction is set on the city using Depiction, properly labelled.





Then, as usual, we draw arcs using the road network.





Then the road network is exported to GML format, which then is used as a raw data on Excel.





GML is a variation of XML, which means that Excel can open it.  Then, using =LEFT and RIGHT function, we have prepared data that is ready to use.



Part II: Modeling in Excel



The idea behind the shortest path problem is to let Excel solver decide which routes to take while meeting the constrains, that is, the Net Flow equal to 0.  Flow is calculated by making sure that every ending point leads to another starting point.  The easiest way to do so is to use =SUMIF.  


In my example, I put all the available nodes on J4 to J12.   Next to it, I calculate the flow: using SUMIF, I calculate how many routes leads to (J4) nodes, and whether there are equal number of  routes leaving from (J4) nodes. 


Last, I add the said constrains, that is the minimum fun point in order to be qualified.  Then the goal is set to minimize the walking distance.  




Set the solver's constrains, check on "Assume Linear Model" and "No Negative" in the Options, and the problem is solved.  If you want to try it out or play around with it, I attach the excel and depiction files at the beginning of the page.


Hope this helps.




Sunday, October 25, 2009

First Tutorial: Shortest Distance Network on Excel

Suppose I am a warehouse manager that are tasked to collect casks of wine from the local vineyards.  I have one truck, and this truck will go around to the vineyards and back.  I want to find the shortest route and the shortest distance for my truck.


The very first thing you want to do is to get the raw data.  As mentioned from my previous post, I use Depiction for that.  Now that we have the raw data, it is time to set up what we need for the linear programming.  For this problem, we will use a shortest distance problem.


A linear programming consists of:


  • Goal:  In this case, it will be the distance traveled, and the goal is to minimize it
  • Variables: The one that will be changed in order to achieve the most optimum goal
  • Constrains: The limitation / constrains that defines the network




So let’s begin.  The data that we need are:


  • Starting Location (Starting node)
  • Ending Location (Ending node)
  • Length of the Arc (distance between the two points)



example 1: Start, Stop, and Arc length


Then you may add a few paths that are not written before if you like.  For example, I draw only one path from point 2 to 3 whereas it is possible to go from point 3 to 2 too.  I add them in this stage.


Next, we define the variable.  Later on we will ask Excel to change these in order to find the minimum route.



example 2: route


Third, we define the goal.  In this case, it is the maximum distance.  There are several ways to do this, but probably the simplest way is to use the Excel function SUMPRODUCT. 


And here’s the trick.  If we set the route to be TRUE or FALSE by using the number ‘1’ for TRUE and ‘0’ for FALSE, then when we multiply the “distance” (column B) and the “route” (column F), the result is the total distance traveled. 


Then we define some constrains
The first constrains that we should set is that our route is TRUE or FALSE only (1 or 0).  That is, we do not want a fraction or a number greater than 1.  Therefore, in the column H, I type this formula, then dragged it down to cover every route. 


=IF(F2=0, 1,  IF(F2=1, 1, 0))


This formula will return “1”if the route number is exactly 0 or 1, and will return 0 for anything else.  We will make it into our first constrain: All number in column H has to be 1.


The next constrain we will define is that the flow summed to 0.  That is, every ending point is chosen only once.  In order to do so, we will have to construct a list of points.  In our case, 1 to 8.  Then we will build the flow using SUMIF.


Suppose Column D is the ending point of the route, column J is the new table we set, and column F is the route.


=SUMIF(D$2:D$37, J2,$F$2:$F$37)


This code will sum the value in F column if the D column is equal to J2.  Therefore, this can be used to determine how many ending point is there. 
Then we add it so that every ending point will lead to another point (aka new route).  Therefore, we add the starting point.  The formula become this:


=SUMIF(D$2:D$37,J2,F$2:F$37)-SUMIF(C$2:C$37,J2,F$2:F$37)


If there is an unconnected ending (i.e. ending point which does not lead to a new point), the value on the cell will be positive, and negative if there is no corresponding ending point for a starting point.  This will be our cross check to make sure every points is connected to each other.  This will have to be 0 all the time.


Last, we don’t want any backtrack, or we want to visit every node.  We sue SUMIF again, but this time only the first part.


Then we use the solver to define all of the said constrains.  



example 3: excel solver




Here is the depiction file and excel file that I use
Depiction file
Excel file


Hope this help!

On GIS / Mapping Software

One latitude degree is about 60 nautical miles, where 1 nautical miles is 1852 meter .  Then of course, the exact distance between two places usually do not matter.  It is the distance, in term of road networks, that matters.  This is where premium mapping software such as Depiction Software and Google Earth Pro shines.  These softwares can automatically geocode addresses, allows the user to connect points by the open street road network, edit some properties, and then export it in the GML (Depiction) or KML (Google Earth).  Not to mention, its graphic can be used for presentation purpose. 


GIS Software in Use:

Depiction on work: Mapping the road network distance and prepare it for linear programming purposes


Each software has its ups and downs.  Depiction is quite inexpensive compared to others, and may allow enough customization to make it useful.  However, as it is mostly used as emergency communication (thus can be used offline) and as simple mapping software, it does not have the business/statistic/subscribed data sources that other expensive GIS software boasts of.  But more often than not, the software may serves as 70% solution to everything.  Google Earth Pro comes next, and has very powerful capabilities and even some extremely valued Google market intelligence data, but is also four times more expensive. 



---


Example 1: GML on Excel -- using the RIGHT and LEFT function as well as some coding, to get starting node, ending node, and the length of the arc



Example 2: Ready to use Data for Linear Programming



On Network Optimization

One way of describing a network is a set of points (supply chain term: nodes) connected by routes (supply chain term: arcs).  Network optimization is a very common subject to study in transportation, logistic, manufacturing, and operation management -- and for a good reason: an optimal network usually leads to a lean operation.  Example of the commonly used network optimization are the minimum cost network and the minimum time network.


The most 'ancient' but still surprisingly well algorithm for the network optimization is the greedy algorithm: connecting the nodes by the shortest arc.  In a simple case, greedy is usually right.  The more 'advanced,' that is, more time and computing power saving method, is the dynamic algorithm.  But since greedy is much simpler and usually right, it is the most common method used by non-programmer.  Nowadays, however, due to the increase in the computing power of our chips, it is now possible to find the most optimal solution using the more brutal force -- aka brute force method.  That is, to collect data for every possible route and then sort for the most optimum.  It is not the best way, but since today's average computer can probably calculate more than 2^30 computation per second, it is possible.


And then came the power of the spreadsheet.  In case you have not realized it yet, Microsoft Excel is extremely powerful.  Excel also came in with the "Solver" add-in, which let you use the possibly simplest method of network optimization without the needs of knowing programming algorithm.  This is called Linear Programming method.


Microsoft's Guide to Optimization: Excel Solver Tutorial
The built-in Excel Solver has some limitation.  If your variables are large and complicated enough that  you encountered error message such as "the problem is too large for solver to handle," then you might consider the upgraded, premium Solver, or even Oracle Crystal Ball.  The later is very expensive, but can easily do monte carlo simulation and more.


In this blog you will find some examples of linear programmings that are tailored for transportation and logistic purposes.


Hope this helps.