Decoding Mechanism In Solution Of Genetic Problems
Advances in technology for genetic study have considerably reduced the cost of collecting genetic details from an individual, which in turn has led to an increase in the variety of services offered by genomic companies of late. Identification of one’s relatives is one such service: the genotypes of the persons are collected and preserved in a database, then compared in pairs, and if a match is found, it is concluded that the individuals are genetically related. However, there are several data privacy issues related to sharing genetic details with a third party. Though a study was proposed in 2013 to address this issue using fuzzy encryption technology, it was capable of comparing only the common variants in the individuals’ genetic details.
In this paper, Prof. Ostrovsky and his colleagues have tried to address both these issues by proposing a novel encoding method, which uses a decoding mechanism in which all the variants in a person’s genetic details are converted into integer values and compared with other data of similar variants. This novel method was also validated using data from two different scenarios, namely real-life and simulated data. Data from the 1000 Genomes Project used for simulation showed that the method could successfully detect relatives up to fifth-degree cousins.
The paper uses a fuzzy encryption and decryption methodology. Two keys are involved: a ‘Public key’, which is shared with everyone, and a ‘Private key’, which is accessible only to the individual. The two haplotypes that make up each diploid genome are encoded, and the distance between the encodings is computed to find the level of similarity. To determine whether two individuals are related, the ‘Private sketch’ of the first person is compared with the ‘Secure or Public sketch’ of the second person. To validate the security of the scheme, it needs to be shown that the entropy (that is, the amount of information that can be extracted) of the ‘Public sketch’ is much lower than the entropy of the genome.
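As a rough illustration of how a public sketch can be checked against a second, slightly different reading without publishing the original data, the following Python sketch uses a code-offset construction with a repetition code. This is a textbook-style toy and an assumption on our part, not the construction used in the paper: the names `make_sketch` and `matches`, the repetition factor `REP`, and the binary input vectors are all hypothetical, and a repetition code is far too weak for real genomic data.

```python
import hashlib
import secrets

REP = 5  # hypothetical repetition factor: tolerates up to 2 flipped bits per block


def encode(bits):
    """Repetition-code encoder: repeat every bit REP times."""
    return [b for b in bits for _ in range(REP)]


def decode(bits):
    """Majority-vote decoder for the repetition code."""
    return [int(sum(bits[i:i + REP]) > REP // 2)
            for i in range(0, len(bits), REP)]


def xor(a, b):
    """Bitwise XOR of two equal-length bit lists."""
    return [x ^ y for x, y in zip(a, b)]


def hamming(a, b):
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def make_sketch(w):
    """Build a public sketch for the private bit vector w.

    Returns (sketch, key_hash). The sketch is w XOR a random codeword,
    and key_hash lets another party confirm a successful decode.
    len(w) is assumed to be a multiple of REP.
    """
    key = [secrets.randbelow(2) for _ in range(len(w) // REP)]
    sketch = xor(w, encode(key))
    return sketch, hashlib.sha256(bytes(key)).hexdigest()


def matches(w_other, sketch, key_hash):
    """True if w_other is close enough to the sketched vector to recover the key."""
    candidate = decode(xor(w_other, sketch))
    return hashlib.sha256(bytes(candidate)).hexdigest() == key_hash


if __name__ == "__main__":
    w = [0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1]
    sketch, key_hash = make_sketch(w)
    near = list(w)
    near[0] ^= 1  # one error in the first block
    near[7] ^= 1  # one error in the second block
    print(hamming(w, near), matches(near, sketch, key_hash))
```

In this toy, a second vector that differs from the first in at most two bits per block recovers the same key, while a very different vector does not; the tolerance comes entirely from the error-correcting code chosen.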
To calculate the number of matching segments, the number of haplotypes shared in each segment is computed; a result of ‘0’, ‘1’ or ‘2’ indicates ‘no match’, ‘semi-match’ or ‘complete match’ respectively. For the simulations, a random method is used to compare two genomes, and an error factor is included to accommodate ‘phase error’ and ‘sequence error’. The encoding is then used to compute the similarity score between the genomes of two individuals. For every successful match, the fuzzy encryption program terminates successfully. Besides LWK, another population set, MXL, was used to validate the result in the simulated scenario. In the case of real data, after separating the cryptic, the genome was divided into segments of a certain length, identical segments were counted, and a threshold value was used to determine whether the genomes are related to each other.
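The per-segment scoring and the threshold decision described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: haplotypes are represented as equal-length strings, the function names, the `max_mismatches` tolerance (standing in for the error factor), and the rule "a segment counts if at least one haplotype is shared" are all hypothetical choices.

```python
def haplotypes_match(h1, h2, max_mismatches=0):
    """True if two equal-length haplotype strings differ in at most max_mismatches sites."""
    return sum(x != y for x, y in zip(h1, h2)) <= max_mismatches


def shared_haplotypes(seg_a, seg_b, max_mismatches=0):
    """Score one segment: 0 = no match, 1 = semi-match, 2 = complete match.

    seg_a and seg_b are (haplotype, haplotype) pairs. Both pairings are
    tried because the order of the two haplotypes is arbitrary.
    """
    a1, a2 = seg_a
    b1, b2 = seg_b
    straight = (haplotypes_match(a1, b1, max_mismatches)
                + haplotypes_match(a2, b2, max_mismatches))
    crossed = (haplotypes_match(a1, b2, max_mismatches)
               + haplotypes_match(a2, b1, max_mismatches))
    return max(straight, crossed)


def related(genome_a, genome_b, threshold, max_mismatches=0):
    """Return (decision, scores) for two genomes given as lists of segments.

    The decision is True when the fraction of segments sharing at least
    one haplotype reaches the threshold.
    """
    scores = [shared_haplotypes(sa, sb, max_mismatches)
              for sa, sb in zip(genome_a, genome_b)]
    matched = sum(1 for s in scores if s > 0)
    return matched / len(scores) >= threshold, scores


if __name__ == "__main__":
    genome_a = [("0101", "1100"), ("1111", "0000"), ("1010", "0110")]
    genome_b = [("1100", "0101"), ("1111", "1001"), ("0001", "1000")]
    print(related(genome_a, genome_b, threshold=0.5))
```

Raising `max_mismatches` makes the comparison more forgiving of sequencing errors at the cost of more chance matches, which is the trade-off the threshold value is meant to control.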
The paper takes a straightforward, plain-spoken approach and is written in crisp language, which makes it very comprehensible. The presentation of ideas and the surveys given were easy to follow and understand. The theoretical specifics mentioned by the authors are insightful and complement the elegance of the ideas formulated in the paper. The approach addresses the privacy issue by not involving any third-party company and by proposing that both individuals keep access to their own genomes or genetic details, which makes the method a credible and secure way to identify genetic relatives.
The simulation, in which data from the 1000 Genomes Project were taken from different population sets, chiefly LWK, and compared using the ‘Public sketch’ and ‘Private sketch’, has indeed helped the study to identify the genetic relatives of an individual. However, being just an 8-page research paper has its drawbacks too. Because the paper is so concise in its formulations, it lacks the in-depth details, hypotheses, and postulations that could have given it more substance.
The approach and methodology for identifying genetic relatedness between individuals using data samples from different population sets can undergo further research and empirical and statistical scrutiny to broaden the sphere of analysis using ‘Fuzzy encryption and decryption’ and a ‘Decoding’ methodology. The algorithms proposed, the theorems, and the other metrics used in the research paper can be studied further to provide a base for future hypotheses and proposals that expand the scope of study in genetic research. With advances in sequencing technology and a decline in error rates, this model can be explored further to increase the accuracy and number of matches between genetically related individuals.