
Genetic Algorithm and Bioinformatics



The genetic algorithm (GA) is a heuristic search method based on population genetics. John Holland introduced it in the 1970s. A GA borrows the mechanics of natural genetics and natural selection, and it starts with a population, that is, a set of candidate solutions. Each solution is represented by a chromosome, and the population size is kept constant from one generation to the next. The fitness of every chromosome is assessed at each generation, and chromosomes are then selected for the next generation probabilistically, according to their fitness values. Some of the selected chromosomes are mated at random to yield offspring. Because chromosomes with high fitness have a higher probability of selection, the chromosomes of the new generation tend to have a higher average fitness than those of the previous one. This evolutionary process is repeated until a termination condition is satisfied. In most cases chromosomes are represented as strings or lists, which is why many GA operations are designed to work on strings and lists. Genetic algorithms are usually implemented in high-level languages, e.g. Perl, Python, C, C++, or Java. These languages are productive and widely used in bioinformatics.
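The generational loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the toy fitness function (counting the 1-bits in a bit-string chromosome), the parameter values, and the function names `fitness`, `select`, `crossover`, `mutate`, and `run_ga` are all assumptions made for the example.

```python
import random

def fitness(chromosome):
    # Toy fitness (an assumption for this sketch): the number of 1-bits.
    return sum(chromosome)

def select(population, rng):
    # Fitness-proportionate selection of two parents.
    # The +1 keeps every weight positive even if all fitness values are 0.
    weights = [fitness(c) + 1 for c in population]
    return rng.choices(population, weights=weights, k=2)

def crossover(a, b, rng):
    # Single-point crossover: head of parent a, tail of parent b.
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def mutate(chromosome, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [1 - g if rng.random() < rate else g for g in chromosome]

def run_ga(length=20, pop_size=30, generations=60, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Termination condition: stop early once a perfect chromosome exists.
        if max(fitness(c) for c in population) == length:
            break
        next_generation = []
        for _ in range(pop_size):  # population size stays constant
            parent_a, parent_b = select(population, rng)
            child = crossover(parent_a, parent_b, rng)
            next_generation.append(mutate(child, mutation_rate, rng))
        population = next_generation
    return max(population, key=fitness)

if __name__ == "__main__":
    best = run_ga()
    print(best, fitness(best))
```

In a real bioinformatics application the fitness function would be replaced by a problem-specific score, for example an alignment or motif score, while the loop itself stays much the same.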

A GA is a search method used to find exact or approximate solutions to search and optimization problems, and genetic algorithms are therefore characterized as search heuristics. They form a special group of evolutionary algorithms that use techniques inspired by evolutionary biology: inheritance, mutation, crossover, and selection. Genetic algorithms are used to find good solutions to problems ranging from simple to multifaceted in many domains, e.g. engineering, biology, social science, and computer science. In these domains a GA serves as an alternative to hill climbing, simulated annealing (SA), or tabu search. As opposed to local search methods, a genetic algorithm relies on a set of independent computations controlled by a probabilistic strategy. It models the natural selection of the fittest individuals across successive generations. In the classical definition, an individual is a candidate solution to the problem under consideration, and the population is the set of individuals under consideration. Every individual has a single chromosomal string that encodes its properties. Each unit of information is represented by a sequence of chromosomal alleles, e.g. bits, digits, or letters. Any other representation of the data requires encoding and decoding to translate solutions to and from the original problem space. GAs are applied to problems for which no efficient exact method is known, and to optimization problems such as system modeling and scheduling.
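To make the encoding and decoding step concrete, the sketch below maps a bit-string chromosome to a real number in an interval and back. The linear mapping and the function names `decode` and `encode` are assumptions chosen for illustration. Other encodings, such as Gray codes or letter alphabets for sequences, are equally possible.

```python
def decode(bits, low, high):
    # Read the alleles as an unsigned binary integer, then scale to [low, high].
    as_int = int("".join(str(b) for b in bits), 2)
    max_int = 2 ** len(bits) - 1
    return low + (high - low) * as_int / max_int

def encode(value, low, high, n_bits):
    # Inverse mapping: quantize the value to the nearest representable level.
    max_int = 2 ** n_bits - 1
    as_int = round((value - low) / (high - low) * max_int)
    return [int(ch) for ch in format(as_int, f"0{n_bits}b")]

if __name__ == "__main__":
    chromosome = encode(0.5, -1.0, 1.0, n_bits=4)
    print(chromosome, decode(chromosome, -1.0, 1.0))
```

With only four bits the decoded value is a coarse approximation of the original. Using more bits per variable gives a finer resolution at the cost of a longer chromosome.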

Genetic programming, a GA-based method of evolutionary computation, helps map input data to a given output, especially when the formula connecting them is unknown. Programmers and mathematicians can devise procedures for problems with a limited number of variables, but as the number of variables grows from about 10 to 50 or more, the problem becomes very difficult to solve by hand. If the numerical data and the outputs are available but the expression that connects the data to the answers is missing, a GA can ‘evolve’ an expression tree that closely fits the data. Crossover, mutation, and the other components of genetic algorithms are used to breed the ‘highest-fitness’ tree for the given problem. The resulting tree will not necessarily match the variables to the answers perfectly, but it can produce output close to the required answer.
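A rough sketch of evolving an expression tree to fit data is given below. It makes several assumptions that are not from the text: trees are nested tuples over the operators `add`, `sub`, and `mul` with a single variable `x`, fitness is the sum of squared errors, selection keeps the best quarter of each generation, and tree depth is capped at 6 to limit growth. All function names are hypothetical.

```python
import operator
import random

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
MAX_DEPTH = 6  # assumed cap that keeps trees from growing without bound

def random_tree(rng, depth=3):
    # A leaf is the variable "x" or a small constant; otherwise an operator node.
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else float(rng.randint(-2, 2))
    return (rng.choice(list(OPS)),
            random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def depth(tree):
    if isinstance(tree, tuple):
        return 1 + max(depth(tree[1]), depth(tree[2]))
    return 0

def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def error(tree, data):
    # Sum of squared errors over the (x, y) pairs; lower means fitter.
    total = 0.0
    for x, y in data:
        diff = evaluate(tree, x) - y
        total += diff * diff
    return total

def random_subtree(tree, rng):
    # Walk down from the root, stopping at a random node.
    while isinstance(tree, tuple) and rng.random() < 0.7:
        tree = rng.choice(tree[1:])
    return tree

def crossover(a, b, rng):
    # Replace a random position in a with a random subtree of b.
    if not isinstance(a, tuple) or rng.random() < 0.2:
        return random_subtree(b, rng)
    op, left, right = a
    if rng.random() < 0.5:
        return (op, crossover(left, b, rng), right)
    return (op, left, crossover(right, b, rng))

def mutate(tree, rng):
    # Replace a random position in the tree with a fresh random subtree.
    if not isinstance(tree, tuple) or rng.random() < 0.2:
        return random_tree(rng, depth=2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))

def evolve(data, pop_size=200, generations=40, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: error(t, data))
        parents = population[: pop_size // 4]  # the fittest quarter survives
        children = []
        while len(parents) + len(children) < pop_size:
            child = mutate(crossover(rng.choice(parents),
                                     rng.choice(parents), rng), rng)
            if depth(child) > MAX_DEPTH:
                child = random_tree(rng)  # discard overgrown trees
            children.append(child)
        population = parents + children
    return min(population, key=lambda t: error(t, data))

if __name__ == "__main__":
    # Data sampled from y = x*x + x; the formula is hidden from the algorithm.
    data = [(float(x), float(x * x + x)) for x in range(-3, 4)]
    best = evolve(data)
    print(best, error(best, data))
```

The evolved tree is not guaranteed to reproduce the hidden formula exactly. The run only searches for a tree whose error on the sample data is low.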
