How does a genetic algorithm work in bioinformatics?

A. Initialization

Initially, many individual solutions are generated at random to form the initial population. The population size depends on the nature of the problem, but it typically contains several hundred to several thousand candidate solutions. Usually the population is generated randomly so that it covers the entire range of possible solutions (the search space). Sometimes the solutions are “seeded” in regions where optimal solutions are likely to be found.
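The initialization step can be sketched in a few lines of Python. This is only an illustration: it assumes that candidate solutions are fixed-length DNA strings, and the names `random_individual`, `init_population`, and the `seeds` argument are hypothetical, not taken from any library.

```python
import random

ALPHABET = "ACGT"  # assumption: each solution is a DNA string


def random_individual(length, rng):
    """Draw one candidate uniformly at random over the alphabet."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def init_population(size, length, seeds=(), rng=None):
    """Build the initial population, optionally starting from seeded solutions."""
    rng = rng or random.Random()
    population = list(seeds)[:size]  # seeded solutions go in first
    while len(population) < size:    # the rest is filled at random
        population.append(random_individual(length, rng))
    return population


population = init_population(
    size=200, length=12, seeds=["ACGTACGTACGT"], rng=random.Random(0)
)
```

Passing an explicit `random.Random` instance keeps a run reproducible, which helps when comparing parameter settings.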

B. Selection

During each successive generation, a fraction of the current population is chosen to breed a new generation. Individual solutions are chosen through a fitness-based process, in which fitter solutions (as measured by a fitness function) are usually more likely to be chosen. Some selection procedures rate the fitness of every solution and preferentially select the best ones. Other procedures rate only a random sample of the population, because rating every solution can be very time-consuming. Most selection functions are designed so that a small proportion of less fit solutions is also selected. This keeps the diversity of the population large and avoids premature convergence on poor solutions. Popular and well-studied selection procedures include tournament selection and roulette wheel selection.
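The two selection procedures named above can be sketched as follows. The GC-count fitness function and the example pool are assumptions made only for illustration, and roulette wheel selection as written requires non-negative fitness values that are not all zero.

```python
import random


def tournament_select(population, fitness, k, rng):
    """Sample k individuals at random and return the fittest of them."""
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness.

    Assumes every fitness value is non-negative and at least one is positive.
    """
    weights = [fitness(individual) for individual in population]
    return rng.choices(population, weights=weights, k=1)[0]


def gc_count(seq):
    """Toy fitness (an assumption): number of G or C bases in the sequence."""
    return sum(base in "GC" for base in seq)


rng = random.Random(1)
pool = ["AAAA", "ACGT", "GGCC", "ATGC", "GCGC"]
parent_a = tournament_select(pool, gc_count, k=3, rng=rng)
parent_b = roulette_select(pool, gc_count, rng=rng)
```

Both procedures leave less fit solutions some chance of being picked: a tournament only compares a random sample, and the roulette wheel gives every solution with non-zero fitness a proportional share.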

C. Reproduction

The next step is to generate a second population of solutions from those selected, using the genetic operators recombination (crossover) and/or mutation. For each new solution to be produced, a pair of “parent” solutions is chosen for breeding from the previously selected pool. Producing a “child” solution through crossover and mutation creates a new solution that usually shares many properties of its “parents”. New parents are chosen for each new child, and the procedure continues until a new population of suitable size has been generated. Although reproduction methods based on two parents are more closely inspired by biology, some research suggests that using more than two “parents” can produce higher-quality chromosomes. These procedures ultimately yield a next-generation population of chromosomes that differs from the original generation. Usually the average fitness of the population increases through this process, since mainly the fittest solutions of the current generation are chosen for breeding, along with a small proportion of less fit solutions, for the reasons given above.
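A minimal sketch of reproduction, assuming equal-length DNA strings of at least two bases, one-point crossover, and per-position mutation. These are one common choice of operators among many, and the function names are hypothetical.

```python
import random

ALPHABET = "ACGT"  # assumption: each solution is a DNA string


def one_point_crossover(parent_a, parent_b, rng):
    """Join a prefix of one parent to the matching suffix of the other."""
    point = rng.randrange(1, len(parent_a))  # cut strictly inside the string
    return parent_a[:point] + parent_b[point:]


def mutate(individual, rate, rng):
    """Redraw each position from the alphabet with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else base
        for base in individual
    )


def breed(pool, size, rate, rng):
    """Create `size` children, choosing a fresh parent pair for each child."""
    children = []
    while len(children) < size:
        parent_a, parent_b = rng.sample(pool, 2)
        child = one_point_crossover(parent_a, parent_b, rng)
        children.append(mutate(child, rate, rng))
    return children


rng = random.Random(2)
children = breed(["AAAAAAAA", "CCCCCCCC", "GGGGGGGG"], size=10, rate=0.05, rng=rng)
```

Because a redrawn position can come out as the same base, the effective rate of visible change is somewhat lower than `rate`.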

D. Termination

This generational process is repeated until a termination condition has been reached. Common termination conditions are:

• A solution has been found that satisfies the minimum criteria;

• A fixed number of generations has been reached;

• The allocated budget (computation time and/or money) has been reached;

• Successive iterations no longer produce better results, i.e. the fitness of the best solution has reached a plateau;

• Manual inspection;

• Any combination of the conditions above.
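The automatic conditions in the list above could be combined in a single check, roughly as follows. The function name `should_stop` and its parameters are hypothetical, higher fitness is assumed to be better, and manual inspection is left out because it cannot be automated.

```python
import time


def should_stop(generation, best_history, started_at, *,
                target_fitness=None, max_generations=None,
                max_seconds=None, patience=None):
    """Return True as soon as any configured termination condition holds.

    best_history holds the best fitness of each generation so far, oldest first.
    started_at is a time.monotonic() timestamp taken when the run began.
    """
    best = best_history[-1] if best_history else None
    if target_fitness is not None and best is not None and best >= target_fitness:
        return True  # a solution satisfies the minimum criteria
    if max_generations is not None and generation >= max_generations:
        return True  # fixed number of generations reached
    if max_seconds is not None and time.monotonic() - started_at >= max_seconds:
        return True  # computation-time budget used up
    if patience is not None and len(best_history) > patience:
        # Plateau: the last `patience` generations did not beat the one before them.
        if max(best_history[-patience:]) <= best_history[-patience - 1]:
            return True
    return False
```

Leaving a parameter as `None` disables that condition, so any combination of the criteria can be switched on for a given run.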

E. Simple genetic algorithm pseudocode:

Step no.1: Generate the initial population of individuals.

Step no.2: Evaluate the fitness of every individual in that population.

Step no.3: Repeat for each generation until a termination condition is met (adequate fitness attained, time limit reached, etc.):

· Select the best-fit individuals for reproduction.

· Breed new individuals through crossover and mutation operations.

· Evaluate the fitness of the new individuals.

· Replace the least-fit individuals of the population with the new individuals.
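The steps above can be put together in a short runnable sketch. It is a toy example, not a reference implementation: the target motif `TARGET`, the match-count fitness, and every parameter value are assumptions chosen only so that the loop has something to optimize.

```python
import random

ALPHABET = "ACGT"
TARGET = "GATTACAGATTACA"  # hypothetical motif used as a toy objective


def fitness(seq):
    """Number of positions at which seq matches TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET))


def run_ga(pop_size=100, mutation_rate=0.02, max_generations=500, seed=0):
    rng = random.Random(seed)
    n = len(TARGET)
    # Step 1: generate the initial population at random.
    population = [
        "".join(rng.choice(ALPHABET) for _ in range(n)) for _ in range(pop_size)
    ]
    for generation in range(max_generations):
        # Step 2: evaluate fitness (here by ranking the population).
        ranked = sorted(population, key=fitness, reverse=True)
        if fitness(ranked[0]) == n:  # termination: adequate fitness attained
            return ranked[0], generation
        # Step 3a: keep the best-fit half as parents.
        parents = ranked[: pop_size // 2]
        # Step 3b: breed children by one-point crossover and mutation.
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = "".join(
                rng.choice(ALPHABET) if rng.random() < mutation_rate else base
                for base in child
            )
            children.append(child)
        # Steps 3c and 3d: the children replace the least-fit half and are
        # evaluated when the population is ranked in the next iteration.
        population = parents + children
    return max(population, key=fitness), max_generations


best, generations = run_ga()
```

This sketch selects parents by simple truncation (the best-fit half survives). Tournament or roulette wheel selection could be used instead to give less fit individuals a chance to breed.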
