USING GENETIC ALGORITHM TO MODEL ROAD NETWORKS
Wifi Active In SP Posts: 158 Joined: Oct 2010 
31102010, 10:27 PM
USING GENETIC ALGORITHM TO MODEL ROAD NETWORKS
SEMINAR REPORT Submitted by ANOOP THOMAS DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING COLLEGE OF ENGINEERING TRIVANDRUM 2009 USING GENETIC ALGORITHM TO MODEL ROAD NETWORKS.rar (Size: 428.68 KB / Downloads: 46) 1. ABSTRACT Current urban transportation faces significant problems, due in no small part to the disconnect in traffic and infrastructure growth rates. As traffic has steadily increased, the infrastructure to deal with that traffic has steadily decreased because of rising costs and general lack of space. To solve some of these problems, whichgreatly affect critical functions such as road maintenance and management, transportation planning must attempt to optimize the selection of roads to be built or improved. To solve this problem genetic algorithm can be effectively used. Studies shows that solutions generated by using genetic algorithm is much more effective than manually generated solutions and efficiency goes on increasing as population increases. 2. Introduction To Genetic Algorithm The genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover. In a genetic algorithm, a population of strings (called chromosomes or the genotype of the genome), which encode candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem, evolves toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached. Once we have the genetic representation and the fitness function defined, GA proceeds to initialize a population of solutions randomly, then improve it through repetitive application of mutation, crossover, inversion and selection operators. 2.1 INITIALIZATION Initially many individual solutions are randomly generated to form an initial population. The population size depends on the nature of the problem, but typically contains several hundreds or thousands of possible solutions. Traditionally, the population is generated randomly, covering the entire range of possible solutions (the search space). Occasionally, the solutions may be "seeded" in areas where optimal solutions are likely to be found. 2.2 SELECTION During each successive generation, a proportion of the existing population is selected to breed a new generation. Individual solutions are selected through a fitnessbased process, where fitter solutions (as measured by a fitness function) are typically more likely to be selected. Certain selection methods rate the fitness of each solution and preferentially select the best solutions. Other methods rate only a random sample of the population, as this process may be very timeconsuming. Most functions are stochastic and designed so that a small proportion of less fit solutions are selected. This helps keep the diversity of the population large, preventing premature convergence on poor solutions. Popular and wellstudied selection methods include roulette wheel selection and tournament selection. 2.3 REPRODUCTION The next step is to generate a second generation population of solutions from those selected through genetic operators: crossover (also called recombination), and/or mutation. For each new solution to be produced, a pair of "parent" solutions is selected for breeding from the pool selected previously. By producing a "child" solution using the above methods of crossover and mutation, a new solution is created which typically shares many of the characteristics of its "parents". New parents are selected for each new child, and the process continues until a new population of solutions of appropriate size is generated. Although reproduction methods that are based on the use of two parents are more "biology inspired", some research suggests more than two "parents" are better to be used to reproduce a good quality chromosome. These processes ultimately result in the next generation population of chromosomes that is different from the initial generation. Generally the average fitness will have increased by this procedure for the population, since only the best organisms from the first generation are selected for breeding, along with a small proportion of less fit solutions, for reasons already mentioned above. Although Crossover and Mutation are known as the main genetic operators, it is possible to use other operators such as regrouping, colonizationextinction, or migration in genetic algorithms. 2.4 TERMINATION This generational process is repeated until a termination condition has been reached. Common terminating conditions are: • A solution is found that satisfies minimum criteria • Fixed number of generations reached • Allocated budget (computation time/money) reached • The highest ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better results • Manual inspection • Combinations of the above 3. Terms Used Population. Total number of expected “individuals,” in this case a collection of valid road networks. Individual. Single object in the population being evaluated—in this case, a road network; synonymous with genetic chromosome. Mutation. Genetic operator that alters one or more gene values in a chromosome from its initial state. Mutation can add entirely new gene values to the gene pool. Crossover. Process of combining (mating) two chromosomes (parents) to produce a new chromosome (offspring). Arc. Road section that joins two geographic points. Track. Road’s capacity to join geographic points. Hypothesis. Network that is valid but not necessarily optimum. The model starts with a randomly created population of valid road networks—the result of generating multiple road networks (hypotheses) to obtain many diverse but reasonable solutions. To generate connected networks only, the model first creates a minimum connected network with nodes that correspond to geographic points. It then adds plausible arcs so that the produced network has an urban road network’s main features. Both arcs and nodes have real geographical representation, so the model has no simulated components. The model then applies modifications to chromosomes for the current individuals and generates the next population according to an update strategy in which the population size is fixed. Our approach uses an elitist strategy4 to produce the next population, so called because only the fittest individuals—about 20 percent of the current population—go unchanged to the next generation. The model generates the remaining 80 percent of the population from the best individuals after applying the crossover operator. The designed mutation operator, in contrast, is constrained,4 and mutation is random. Our experiments used mutation probabilities of 1 to 5 percent. The fitness evaluation assesses the road network quality according to flow criteria. To perform this function, we used Saturn (Simulation and Assignment of Traffic to Urban Networks), offtheshelf flowassignment software developed at the University of Leeds.5 Saturn performs the evaluation and provides the model with fitness feedback. On the basis of an origindestination (OD) matrix that provides the flow, Saturn generates balanced flows. The fittest individuals—those that survived in each generation—become the next population. Trip Assignment Applying genetic algorithms to trip assignment requires new representation and coding and the design of domainspecific genetic operators. It also requires formalizing evaluation functions that assess individual fitness in terms of domain knowledge. 4. Two Representations This model has a twopart representation for each individual. Figure 2a shows a network of four nodes and six arcs, although a network can have up to 12 arcs. Expressing the individual as part of a connected graph ensures that all network solutions are valid. The chromosome representation in Figure 2b represents all the individuals as part of a valid network, essentially translating the arcs of the connected graph into ordered positions. Each chromosome has two sections. The first section represents a minimum number of networks, each of which is guaranteed to be a connected graph (arcs 6, 12, 7, 1). The second section represents the arcs not considered in the first section (arcs 4 and 8). Coding the representations in this manner overcomes their inherent difficulties. Typical graph representation and node permutation do not necessarily consider all the arcs, and a chromosome structure, although considering all arcs, imposes an ordering on their positions. The twopart coding representation removes these obstacles and makes storing and handling valid solutions straightforward. 4.1 Directed Graph In the directed graph representation, a network, N, is a graph N = (P, A), where P represents a set of geographic points (road intersections, bus stops, or any other relevant point), and A (for arcs) represents a set of homogeneous road sections that allow the joining of two points belonging to P. Each a ∈ A is an arc of the graph that has specific properties, such as length, cost, and capacity (number of tracks). 4.2 Chromosomes For each connected graph representation, as in Figure 2a, a chromosome represents the arcs in the graph as well as the unconnected tracks. Because of the chromosome’s heterogeneous nature, the model performs a twostep transformation to enable additional genetic operations. It transforms the first part of the original chromosome (Figure 2b) into a permutation of OD nodes associated with the participating arcs. This step ensures that, in applying operations such as crossover,the model will always generate connected subnets irrespective of the initially considered arcs. For the second part, the model applies a single crossover operator. As the algorithm progresses, the selection operator picks up a full fixedlength chromosome, and the model performs the next crossover operation differently, depending on which part of the chromosome it is modifying. Evaluation Function To evaluate the quality of each generated network, we defined an evaluation function that assesses the degree to which H (a candidate network) minimizes both the users’ trip time within a network and the costs of building a new arc or modifying an existing one by adding or improving tracks. Formally, the two objective evaluation function for H is (1) where TA is the overall time vehicles spend going through each arc in the network (total trip time) and CA is the cost of building the arcs. To provide homogeneous processing for T and C, the function converts them to the same metric unit (such as money) using θ, a constant metric unit converter. Weight parameters, α and β, indicate the relative importance of each objective function; α and β represent weighted impacts of trip time and the cost of generating routes that always sum to one. TA is conditional on balanced user flows: The sum of the arc flows must equal the overall flow of that route. Formally, this relationship is (2) where xa is the flow of trips in arc a, xkrs is the flow in route k going from origin r to destination s, qrs is the overall flow between r and s, and ta is the trip time in a. Trip time is the sole consideration in computing the generalized cost of a trip. The building cost CA is based on the arc length and the number of associated tracks for each arc in the network (capacity). Thus, the proposed cost function is (3) where la is the length of arc a, and pa is the number of tracks to be built or improved in a. Equation 2 is based on Wardrop’s balance principle, which assumes that users within a network will try to minimize their operation costs in traveling through the network. In practical terms, this means that if users could, they would always choose the shortest route to get to a destination. However, because of vehicle congestion—too many users traveling on the same roads— the shortest route might not exist. Consequently, users change their minds and explore other routes until they find minimumcost routes, accounting for existing operation constraints. Once all users can find the most appropriate route, the transport network is considered balanced. Our model implements this principle using the Frank Wolfe algorithm, which involves a convex combination method selected according to a heuristic. Saturn also generates balanced vehicle flows using this algorithm. This simple objective function assesses the quality of the hypotheses generated from the previous operators. The function can penalize bad solutions and reward the fittest ones in such a way that the fittest solutions propagate easily through the population. 5. Genetic Operations Genetic operators aim to reproduce chromosomes using a roulette selection strategy. Roulette selection simulates a roulette wheel by assigning a wheel’s section to each individual according to its fitness. To deal with the sexual recombination (crossover) of hypotheses, the model applies a crossover operator to the selected chromosomes. Different operators modify different chromosomal sections. The first section requires a coding scheme similar to that used for simple crossover except that this scheme preserves the hypothesis viability and allows the exchange of ordered information. A version of this scheme is the partially matched crossover (PMX), which picks up a crossing region by selecting two crossing sites. To illustrate how this scheme works, consider two chromosomes, A and B, which have two crossing sites each (x): A: 614 x 829 x 0735 B: 729 x 361 x 5480 Initially, PMX performs the crossover in the center region to yield a temporary offspring: A’: HHH x 361 x HHHH B’: HHH x 829 x HHHH where H positions are holes in the temporary offspring, where PMX can define an association between the exchanged positions, in this case, 3−8, 6−2, 1−9. The scheme fills these holes in two stages. The first stage is to fill the holes with values from the original parents (A or B) that are not duplicated in the exchanged values A”: HH4 x 361 x 07H5 B”: 7HH x 829 x 54H0 To fill the remaining holes, PMX uses the previously generated association: if A1 = 6, then A*i = 2 and so on with the remaining H. By applying this operator to the remaining holes, PMX obtains the final offspring: A*: 2943610785 B*: 7618295430 For the chromosome’s second section, the model uses a simple crossover operator. To illustrate, consider the chromosomes A and B, with a single crossing point each (x): A: 20 x 130 B: 12 x 001 The operator generates the offspring by exchanging the parents’ sections: A*: 20001 B*: 12130 The model also applies the mutation operator to both chromosomal parts, but it modifies only one part. For the first section, the operator merely selects two random values and exchanges them. In the following example, x indicates a value that the operator has selected for mutation: Before mutation: A : 2943 x 6107 x 85 After mutation: A* : 2943810765 For the second section, the operator randomly selects one value and modifies it. For example, Before mutation: A : 200 x 01 After mutation: A* : 20021 The resulting chromosome represents a configuration that calls for adding an arc with two tracks to the network. Once the model performs the genetic operations, it evaluates the resulting chromosome using the fitness function described earlier 6. Experiments And Results Figure 3 shows the road network for Los Angeles City, the city area we chose for our experiments. Figure 3b is the best network generated from a manual search, which we used as a basis for comparing the searches with our model. As mentioned earlier, our objective in conducting the experiments was to investigate the extent to which our model could generate more road networks and higherquality networks than would be possible with a manual process. We hypothesized that because the graph must always be connected with no additional constraints, a manual solution search would take too long to derive many of the reasonable candidate configurations that an automated process could produce. Our analyses took into account several scenarios approved by the City Committee for Land Usage and Projects. Fig(3) 6.1 Experimental Parameters To measure the quality of the generated solutions, our evaluation objectives were trip time in peak hours and the cost of modifying the network. We derived peak times from two trip matrices for 2000 and 2010. The peak times were for morning only because these are the only difficult times for a road network in a city this size. Peak morning times also fit with the movements from residential to highly concentrated urban zones, both industrial and educational. Other time periods are not relevant, since people tend to scatter in return trips, as opposed to converging on a particular few city areas. As Table 1 shows, for each matrix, we used three population sizes for both years, considering 500 iterations for 2000 and 1,000 iterations for 2010. We also ran several preliminary tests to determine the mutation probabilities that produce the best fitness values. We divided the target city area into 48 subareas and considered OD pairs for the trips for each subarea. The goal was to provide relevant details of the possible road network variations for the years considered. Our results show a scenario in which trip time and the cost of generating routes have equal weight (α and β are set to 0.5).For all the scenarios, we selected a random network configuration, keeping in mind the manually calculated best fitness value for the proposed network in 2000.That value, 2,443.78, became the cost to be minimized. Projections from our model for the same year resulted in better final fitness values for all three population sizes. 6.2 Results After only 500 iterations, the model produced satisfactory results that we believe might be achievable for more populated cities, although we cannot estimates calability with certainty. Figure 4 shows the best road networks and their corresponding fitness evolution for two populations. For a population of 75, we noted more improvement in the number of solutions and solution quality, but this might be due primarily to the larger search space and hence the higher genetic variability. As Figures 4b and 4d show, the model reached convergence on the best solutions before 1,000 iterations. For a population of 25, the model reached the best solution near iteration 500, which is similar to the results with a population of 50. The model found the best solution within a population of 75 around iteration 700. To make infrastructure project and implimentationions for 2010, we ran more iterations, but kept the same road network. In this scenario, the manual fitness was 37,318.71, which means that costs will increase significantly if the current road infrastructure remains unchanged in the face of growing vehicle flow. Once again, the model demonstrated significant improvement over the manual process (for a population of 25, the final fitness valuewas 8,790.88; for 50, it was 8,673.29; and for 75, it was 7,973.06). Figure 5 shows a typical generated network for this project and implimentationion and its corresponding fitness evolution. Comparing the fitness evolution diagrams in Figures 4d and 5b reveals no quality correlation of the initial and final populations. From a specific network configuration within a fixed period (less than 1,000 iterations), the genetic algorithm can find better solutions than manual procedures. As Figure 6 shows, the analyses also suggest that a bigger population involves better fitness values. Table 2 shows the best fitness values of the solutions that our model generated and compares them to the fitness values of manually produced solutions for the same years and population sizes. The last column of the table shows the percentage cost reduction between the manual procedure and the model. The table is proof that our model can indeed generate better solutions than the manual procedure for any of the population sizes given (for 25, the best value was 1,887.20; for 50, 1,886.43; and for 75, 1,859.48). Also, results seem to indicate that the larger the population, the greater the reduction percentage. For 2010 and a population of 75, for example, the reduction is nearly 80 percent. However, sudden increases in running times, evident in Table 2, necessitate tradeoffs in solution quality and the time required to obtain the solutions. The experiments also show that keeping the current road network as it is will result in a significant lack of network infrastructure in 2010. The city must extend or create new roads from the west and north to the central area. Although all the experiments used the same 200 nodes, how the nodes affect performance depends on the length of the modified chromosome. The population size also affects running time, sometimes significantly. 7. Conclusion The proposed model could play a key role in city planning as a heuristic strategy for detecting infrastructure problems. Experimental results point to solutions that involve improving existing roads and creating new ones, suggestions that correlate well with existing plans for future improvements. The model tends to build diagonals between zones and nodes, where there is a lack of road infrastructure. Additional work might involve integrating the concept of OD nodes to a particular network design strategy. A human expert could use genetic algorithms to decide among a set of reasonable road network choices. It might also be useful to provide the model with massive amounts of new geographical data so that it can be finetuned to generate project and implimentationions for road networks several years ahead. Other efforts might explore different techniques for optimizing several objectives so that the model can consider multiple roadplanning criteria without the need to define complex formulas for putting all the objective functions in a unique fitness value. Following are the important conclusions obtained from the experiments conducted Keeping the current road network as it is will result in a significant lack of network infrastructure in 2010 Build diagonals between zones and nodes, where there is a lack of road infrastructure Useful to provide the model with massive amounts of new geographical data Genetic algorithm is much more effective in generating road networks With increase in population there is an improvement in fitness value Run time increases with population As population increases efficiency of algorithm increases As traffic increases efficiency of algorithm increases 8. BIBLIOGRAPHY 1. P. Bhawnani et al., “Intelligent Decision Support for Road Mapping a Technology Transfer Case Study with Siemens Corporate Technology,” Proc. Int’l Workshop Software Technology Transfer in Software Engineering (TT 6), ACM Press, 2006, pp. 3540. 2. D. van Vliet, “SATURN User Manual—Simulation and Assignment of Traffic to Urban Road Networks,” tech. report, The Institute for Transport Studies, Univ. of Leeds, 2002. 3. M. Frank and P. Wolfe, “An Algorithm for Quadratic Programming,” Naval Research Logistics, vol. 1, no. 3, 1956, pp. 95110. Use Search at http://topicideas.net/search.php wisely To Get Information About Project Topic and Seminar ideas with report/source code along pdf and ppt presenaion



