Mehmet's Research Page


I am a staff scientist at the Lister Hill National Center for Biomedical Communications of the U.S. National Library of Medicine at the National Institutes of Health.


I lead the Computational Model Learning Group, whose mission is to advance computational methods for effective biomedical communications. We develop machine learning methods to algorithmically construct models from biomedical text and data. The models of interest range from text-based semantic information models to biological process and clinical outcome models.


The main project of the group is to develop an information architecture that maps information from various data sources into concepts and their interrelations. The architecture is called the multifaceted Ontological Network (muON). It is currently based on the sets of concepts and relations found in the Unified Medical Language System (UMLS®), and it maps words and phrases of Medline® citations to UMLS concepts.

From Data to Concepts: The system is planned to enable agents (human or software) to interpret data, extract information, understand the context of the data, and draw intelligent inferences. For example, the four-character word 'H5N1', which might be found in a text or in a data field, would be mapped on muON to the virus (Avian Influenza), to the disease (Influenza), and to the properties of the disease (including transmission, prevalence, morbidity, mortality, and epidemiology).
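As a purely illustrative sketch of the data-to-concept direction (not the muON implementation), the mapping of a surface term to concepts and their properties could look like the following. The names `Concept`, `TERM_INDEX`, `PROPERTIES`, and `map_term` are hypothetical, and the toy tables contain only the H5N1 example given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A node in a toy concept network (hypothetical structure)."""
    name: str
    kind: str  # e.g. "virus" or "disease"


# Toy index from normalized surface terms to concepts (assumption: exact match only).
TERM_INDEX = {
    "h5n1": [Concept("Avian Influenza", "virus"), Concept("Influenza", "disease")],
}

# Toy table of property names attached to a concept.
PROPERTIES = {
    "Influenza": ["transmission", "prevalence", "morbidity", "mortality", "epidemiology"],
}


def map_term(term):
    """Return (concept, property names) pairs for a word found in text or a data field."""
    concepts = TERM_INDEX.get(term.strip().lower(), [])
    return [(concept, PROPERTIES.get(concept.name, [])) for concept in concepts]


if __name__ == "__main__":
    for concept, props in map_term("H5N1"):
        print(concept.kind, concept.name, props)
```

A real system would map to UMLS concept identifiers and handle ambiguity; this sketch only shows the shape of the lookup.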

From Concepts to Data: It will also enable agents to locate and retrieve data by querying the system not only with keywords but also with concepts. Conceptual search can provide a semantically richer set of information than can be retrieved by conventional means. For example, the query "[What are] the types of transmission of H5N1?" would yield "bird-to-bird", "bird-to-human", "bird-to-pig", and so on, where each of the transmission types would be associated with a probability, a time interval, and locations, if such information is readily available or can be inferred.
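A minimal sketch of such a conceptual query, under the assumption that relations are stored as simple records, might look like this. The `Relation` record, the `RELATIONS` list, and the `query` function are hypothetical. The probability, time interval, and location fields are left empty because no such values are given here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Relation:
    """One concept-to-value relation with optional annotations (hypothetical)."""
    subject: str
    predicate: str
    value: str
    probability: Optional[float] = None   # unknown unless available or inferred
    time_interval: Optional[str] = None
    location: Optional[str] = None


# Toy relation store holding only the transmission types named in the example.
RELATIONS = [
    Relation("H5N1", "transmission_type", "bird-to-bird"),
    Relation("H5N1", "transmission_type", "bird-to-human"),
    Relation("H5N1", "transmission_type", "bird-to-pig"),
]


def query(subject: str, predicate: str,
          relations: Optional[List[Relation]] = None) -> List[Relation]:
    """Return all relations matching a concept and a relation type."""
    if relations is None:
        relations = RELATIONS
    return [r for r in relations
            if r.subject == subject and r.predicate == predicate]


if __name__ == "__main__":
    for r in query("H5N1", "transmission_type"):
        print(r.value, r.probability, r.time_interval, r.location)
```

The point of the sketch is that the query names a concept and a relation rather than a keyword, and each answer carries slots for the associated probability, time interval, and location.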

The project was chartered by the Board of Scientific Counselors in 2004. For the project details, see the report presented to the board.

Selected Publications

  1. Kayaalp, M. (2005) Why Do We Need Probabilistic Approaches to Ontologies and the Associated Data? Proceedings of the American Medical Informatics Association Symposium: 1005 [Suppl.]
  2. Gay, C. W.; Kayaalp, M., and Aronson, A. R. (2005) Semi-Automatic Indexing of Full Text Biomedical Articles. Proceedings of the American Medical Informatics Association Symposium: 271–275.
  3. Kayaalp, M. (2004) Modeling and Learning Methods. A Report to the Board of Scientific Counselors. Report No. LHNCBC-TR-2004-002. Bethesda, MD: Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine.
  4. Kayaalp, M. (2004) Bayesian Methods for Diagnosing Physiological Conditions of Human Subjects from Multivariate Time Series Biosensor Data. Physiological Data Modeling Contest, the Twenty-First International Conference on Machine Learning (ICML-2004).
  5. Kayaalp, M.; Aronson, A. R.; Humphrey, S. M.; Ide, N. C.; Tanabe, L. K.; Smith, L. H. et al. (2003) Methods for accurate retrieval of MEDLINE citations in functional genomics. Proceedings of the 12th Annual Text Retrieval Conference (TREC-12): 175–184.
  6. Kayaalp, M. (2003) Learning Dynamic Bayesian Network Structures from Data. Doctoral Dissertation, Intelligent Systems, University of Pittsburgh, PA.
  7. Kayaalp, M. and Cooper, G. F. (2002) A Bayesian Network Scoring Metric That Is Based on Globally Uniform Parameter Priors. Proceedings of the Eighteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-2002): 251–258.
  8. Kayaalp, M.; Cooper, G. F., and Clermont, G. (2001) Predicting with Variables Constructed from Temporal Sequences. Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics (AISTATS-2001): 220–225.
  9. Kayaalp, M.; Cooper, G. F., and Clermont, G. (2000) Predicting ICU Mortality: A Comparison of Stationary and Nonstationary Temporal Models. Proceedings of the American Medical Informatics Association Symposium: 418–422.
  10. Aronis, J. M.; Cooper, G. F.; Kayaalp, M., and Buchanan, B. G. (1999) Identifying patient subgroups with simple Bayes'. Proceedings of the American Medical Informatics Association Symposium: 658–662.
  11. Cooper, G. F.; Buchanan, B. G.; Kayaalp, M.; Saul, M., and Vries, J. K. (1998) Using computer modeling to help identify patient subgroups in clinical data repositories. Proceedings of the American Medical Informatics Association Symposium: 180–184.
  12. Kayaalp, M.; Pedersen, T., and Bruce, R. (1997) A Statistical Decision Making Method: A Case Study on Prepositional Phrase Attachment. Proceedings of the 1997 Meeting of the ACL SIG in Computational Natural Language Learning (CoNLL97): 33–42.
  13. Pedersen, T.; Kayaalp, M., and Bruce, R. (1996) Significant Lexical Relationships. Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96): 455–460.
  14. Kayaalp, M. (1993) Multifaceted Ontological Networks: Methodological Studies toward Formal Knowledge Representation. Master's Thesis. Department of Computer Science and Engineering, Southern Methodist University, Dallas, TX.

Contact Information

private: mehmet(at)kayaalp(dot)us
official: mehmet(dot)kayaalp(at)nih(dot)gov
voice: (301) 451-4633
fax: (301) 402-0118
Lister Hill Center
NIH Bldg. 38A, Mailstop 52
8600 Rockville Pike
Bethesda, MD 20894