
Status: Ongoing
People: ManChon U, Eric Talevich, Natarajan Kannan, Khaled Rasheed
Introduction:
Cancer can be conceptualized as a set of somatic mutations in a cell's DNA that confer a growth advantage on that cell, either by inducing new growth or by causing the cell to ignore signals that limit its growth and replication. Protein Kinases (PK) are a diverse super-family of proteins responsible for a cell's regulation and signal transduction; consequently, mutations of PK genes are frequently implicated in human cancer. In order to understand how these mutations lead to cancer, and hopefully to guide the design of new drug therapies that can detect and treat such cancers, we wish to identify precisely which features of non-synonymous point mutations in PK genes are responsible for promoting cancer in human cells.
Our ultimate goal in this research project is to identify candidate mutations that we can assert, with a high degree of certainty, to be driver mutations, and to build a sophisticated model of this process. Since the beginning of this century, various types of cancer have risen in prominence as a cause of human mortality, yet no single treatment has been found that effectively cures all cancers. Complicating matters is the observation that once a cell becomes cancerous, its DNA-repair mechanisms generally shut down, causing a proliferation of mutations throughout the cell's DNA. The mutations that are initially responsible for oncogenicity, called "driver" mutations, are therefore difficult to distinguish from the comparatively harmless mutations that occur incidentally from this process, called "passenger" mutations.
Publication: Being prepared

Status: Ongoing
People: ManChon U, Roberto Perdisci
Introduction:
When people make critical decisions, they often consider the opinions of multiple experts rather than committing themselves to a single expert or relying on their own judgment. The same approach should apply to virus/malware classification and clustering. However, previous studies mainly focused on applying a simple majority-voting approach across antivirus (AV) labels, which may result in an overoptimistic estimate of malware clustering accuracy. In this project, we propose VAMO, a system that provides a fully automated quantitative analysis of the validity of malware clustering results. Unlike previous work, VAMO does not seek a majority voting-based consensus across different AV labels, and does not discard the malware samples for which such a consensus cannot be reached. In essence, VAMO aims to avoid reducing the reference set to only "easy to cluster" samples. Rather, VAMO explicitly deals with the inconsistencies typical of multiple AV labels to build a more representative reference set. Furthermore, VAMO avoids the need for the (semi-)manual mapping between AV labels from different scanners that was required in previous work. Through an extensive evaluation in a controlled setting and a real-world application, we show that VAMO outperforms majority voting-based approaches and provides a better way for malware analysts to automatically assess the quality of their malware clustering results.
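To make the contrast concrete, the following is a minimal sketch of the majority-voting baseline described above, not of VAMO itself. The function names, the family labels, and the "more than half of the scanners" threshold are illustrative assumptions, and the sketch presumes that labels from different scanners have already been mapped to shared family names.

```python
from collections import Counter

def majority_label(labels, min_fraction=0.5):
    """Return the label given by more than min_fraction of scanners, else None.

    `labels` holds one (already normalized) family name per AV scanner;
    None means the scanner did not flag the sample. Hypothetical helper.
    """
    votes = Counter(label for label in labels if label is not None)
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count > min_fraction * len(labels) else None

def build_reference_set(av_labels):
    """Split samples into those with a majority label and those discarded."""
    kept, discarded = {}, []
    for sample, labels in av_labels.items():
        consensus = majority_label(labels)
        if consensus is None:
            discarded.append(sample)  # no consensus: the sample is dropped
        else:
            kept[sample] = consensus
    return kept, discarded

# Illustrative data only: three scanners, three samples.
example = {
    "s1": ["zbot", "zbot", "zbot"],
    "s2": ["zbot", "conficker", "virut"],
    "s3": ["conficker", "conficker", None],
}
kept, discarded = build_reference_set(example)
```

In this toy data the sample on which the scanners disagree is dropped, so the reference set ends up containing only the samples that were easy to label.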
Publication:

Status: Ongoing
People: ManChon U, Chiahsun Ho, Shelby Funk, Khaled Rasheed
Introduction:
Hard real-time systems require that all jobs are assigned a deadline, and the system is deemed to be correct only if all jobs complete execution at or before their deadlines. Such strict timing requirements add to the complexity of the scheduling problem. This complexity is exacerbated when the system is executed on a multiprocessor platform. Even so, scheduling overheads must be kept to a minimum in order for the runtime behavior to be predictable. Thus, real-time scheduling algorithms have the dual requirement of satisfying complex requirements while using fairly simple and straightforward logic. One way an algorithm may achieve this goal is to reduce the overhead due to preemption and migration by rearranging the schedule so as to increase the duration between preemptions. Unfortunately, determining how best to rearrange the jobs is an NP-complete problem. Hence, we need to use heuristics when scheduling such systems. This leads us to ask two questions. First, what is the best heuristic? Second, is the same heuristic best for all real-time systems?
We use a genetic algorithm to help us answer these questions. Our genetic algorithm-based real-time system scheduler (GART) is based on the DP-Wrap scheduling algorithm. The genetic algorithm searches through a variety of candidate heuristics to determine the best heuristic for a given task set. Experimental results demonstrate that this approach is able to efficiently identify the best heuristic for all the systems we consider. Moreover, we find that the "best" heuristic does, in fact, depend on various system parameters.
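As a rough illustration of the search idea, the sketch below runs a small genetic algorithm over task orderings in a toy model. It is not GART: here the algorithm evolves orderings directly rather than choosing among candidate heuristics, and the cost model (tasks laid end to end across processors within one time slice, with a task counted as migrating when it straddles a processor boundary) is a simplifying assumption. All names and parameter values are hypothetical.

```python
import random

CAPACITY = 100  # assumed capacity of one processor per time slice, in percent

def count_migrations(order, utilization):
    """Count tasks that straddle a processor boundary in a wrap-around layout."""
    start, migrations = 0, 0
    for task in order:
        end = start + utilization[task]
        if start // CAPACITY != (end - 1) // CAPACITY:
            migrations += 1
        start = end
    return migrations

def order_crossover(a, b, rng):
    """Keep a slice of parent a in place; fill the rest in parent b's order."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    middle = a[i:j]
    rest = [task for task in b if task not in middle]
    return rest[:i] + middle + rest[i:]

def swap_mutation(order, rng, rate=0.2):
    """With probability `rate`, swap two randomly chosen positions."""
    order = list(order)
    if rng.random() < rate:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order

def evolve(utilization, generations=50, pop_size=30, seed=0):
    """Return the best task ordering found; lower migration count is better."""
    rng = random.Random(seed)
    tasks = list(range(len(utilization)))
    population = [rng.sample(tasks, len(tasks)) for _ in range(pop_size)]

    def fitness(order):
        return count_migrations(order, utilization)

    for _ in range(generations):
        population.sort(key=fitness)
        elite = population[: pop_size // 2]  # elitism: the best half survives
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(swap_mutation(order_crossover(a, b, rng), rng))
        population = elite + children
    return min(population, key=fitness)

# Illustrative task set: utilizations in percent, summing to three processors.
best = evolve([60, 40, 70, 30, 50, 50])
```

Because the better half of each generation is carried over unchanged, the best migration count never gets worse from one generation to the next.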
Publication:

Status: Past
People: ManChon U, Khaled Rasheed
Introduction:
Researchers have known for some time that non-linearity exists in the financial markets and that neural networks can be used to forecast market returns. In this project, we implemented a novel stock market prediction system which focuses on forecasting the relative tendency growth between different stocks and indices rather than purely predicting their values. This research utilizes artificial neural network models for estimation. The results are examined for their ability to provide an effective forecast of future values. Certain techniques, such as sliding windows and chaos theory, are employed for data preparation and pre-processing. Our system successfully predicted the relative tendency growth of different stocks with up to 99.01% accuracy.
Our system can predict the future relative tendency growth among different stocks, or between stocks and indices, on a daily basis. Therefore, daily or short-term investors and mutual fund managers, especially hedge fund managers, are the most likely groups to benefit from our system, as they can decide whether to drop or add a specific stock on a specific day.
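The sliding-window data preparation mentioned above can be sketched as follows. The exact definition of relative tendency growth is not given here, so this sketch assumes it means whether one series' daily return exceeds the other's. The function names, window size, and prices are illustrative, and the neural network itself is omitted.

```python
def daily_returns(prices):
    """Fractional change between consecutive closing prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def sliding_windows(stock_a, stock_b, window=3):
    """Build (features, label) pairs for a relative-growth classifier.

    Features: the last `window` daily return differences (A minus B).
    Label: 1 if A out-grows B on the following day, else 0 (assumed definition).
    """
    diff = [x - y for x, y in zip(daily_returns(stock_a), daily_returns(stock_b))]
    samples = []
    for t in range(window, len(diff)):
        features = diff[t - window:t]
        label = 1 if diff[t] > 0 else 0
        samples.append((features, label))
    return samples

# Illustrative prices only.
samples = sliding_windows([100, 110, 121, 121, 133],
                          [100, 100, 110, 121, 121], window=2)
```

Each sample pairs a fixed-length history with the next day's outcome, which is the form a feed-forward network would take as input and target.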
Publication:
-
ManChon U, Khaled Rasheed; "A Relative Tendency Based Stock Market Prediction System", Proceedings of The Ninth International Conference on Machine Learning and Applications (ICMLA 2010), IEEE, pp. 949-953, Washington DC, USA, 2010.
-
ManChon U, Zhen Li; "Public Goods Game Simulator with Reinforcement Learning Agents", Proceedings of The Ninth International Conference on Machine Learning and Applications (ICMLA 2010), IEEE, pp. 43-49, Washington DC, USA, 2010.

Status: Past
People: ManChon U, Vasim Mahamuda, Khaled Rasheed
Introduction:
Machine learning algorithms are used extensively in bioinformatics. With the advent of new genome sequencing techniques, there has been a considerable increase in the amount of data available in biological databases. The role of machine learning is evident in research projects such as determining the structure and function of biological molecules. Metagenomics is the study of micro-organisms, such as prokaryotes, that are found in samples from natural environments. These samples may contain DNA from many different species of micro-organisms, including bacteria and archaea.
In this project, we compared the performance of six well-known supervised learners: decision trees, naïve Bayes, support vector machines, artificial neural networks, Bayesian networks and decision tables. We also analyzed the performance of three meta-learners, namely bagging, boosting and stacking. Afterwards, we applied decision trees, Bayesian networks and decision tables to see how performance degrades as the number of species present in the metagenomic sample increases. We also examined how classification performance changes as the percentage of unknown sequences in the metagenomic sample is varied.
Publication:
-
ManChon U, Vasim Mahamuda, Khaled Rasheed; "On the Scalability of Supervised Learners in Metagenomics", Proceedings of The Ninth International Conference on Machine Learning and Applications (ICMLA 2010), IEEE, pp. 803-807, Washington DC, USA, 2010.
-
Vasim Mahamuda, ManChon U, Khaled Rasheed; "Application of Machine Learning Algorithms for Binning Metagenomic Data", Proceedings of The 2010 International Conference on Bioinformatics & Computational Biology (BIOCOMP'10/ISBN:1-60132-132-5/CSREA), pp. 68-74, Las Vegas, USA, 2010.