Date of Award
2004
Level of Access Assigned by Author
Open-Access Thesis
Degree Name
Master of Electrical Engineering (MEE)
Department
Electrical Engineering
Advisor
Mohamad T. Musavi
Second Committee Member
Habtom Ressom
Third Committee Member
Bruce Segee
Abstract
Base calling is the central part of any large-scale genomic sequencing effort. Current sequencing technology produces error rates of less than 3.5%, which corresponds to as many as 35 errors in a 1000-base read. As the error rates of base calling algorithms drop, the remaining base call errors become harder to locate. Hence, assembly algorithms and human operators use a confidence value to determine how well the base calling algorithm has performed for each base call. This makes it easier to uncover and correct potential errors, thus increasing the throughput of genetic sequencing. The model developed here employs fuzzy logic, which provides flexibility, adaptability, and intuition through the use of linguistic variables and fuzzy membership functions. The proposed approach uses a fuzzy logic system to provide a confidence value for each base called. The fuzzy system takes three variables that are calculated during the base calling procedure and can be computed at any spatial location: peakness, height, and base spacing. In addition to the most likely candidate (the base called), the peakness and height are also found for the second most likely candidate. The technique has been tested on more than 3000 ABI 3700 DNA files, and the results show improved performance over the existing Phred and ABI quality values.
Recommended Citation
Varghese, Rency Susan, "Confidence Measure for DNA Base Calling Using a Fuzzy System" (2004). Electronic Theses and Dissertations. 248.
https://digitalcommons.library.umaine.edu/etd/248