Introductory Computational Molecular Biology

by Xuegong Zhang, PhD

With the availability of high-throughput genomic and proteomic data, the application of mathematics and computer science has changed the face of modern biology. A substantial core of bioinformatics, or computational molecular biology, methods has been developed during the past two decades to meet the needs of biological scientists for data storage, data retrieval, and, most importantly, data analysis and scientific discovery. This course introduces the core ideas of bioinformatics, reviews some of the major scientific questions that bioinformatics may help to study, and describes selected methods in both sequence analysis and functional genomics. Research in this field is widely recognized as interdisciplinary in nature and requires knowledge of computational algorithms, statistics, pattern recognition, and molecular biology. The course is designed for both informatics students and biology students. Students in this class are expected to spend a substantial amount of time and effort reading research articles and monographs ranging from information science to biology. Most (if not all) of the materials used in this course are in English, so students are required to have reasonably good reading ability in English. The lectures will be given in a mix of English and Chinese.

Design and Analysis of Bioinformatics Algorithms

by Rui Jiang, PhD

This course systematically introduces the basic concepts and techniques required for the design and analysis of bioinformatics algorithms. In particular, it presents several algorithm design methodologies, including exhaustive search, branch-and-bound, the greedy method, dynamic programming, divide-and-conquer, graph algorithms, and sampling methods. In addition, mathematical tools useful in algorithm analysis, such as asymptotic complexity and big-O notation, will be discussed. Efficient algorithms for various fundamental problems concerning discrete structures, including trees, strings, and graphs, and especially those arising in bioinformatics, will be designed using these methods. After completing this course, students should be able to design and analyze algorithms independently in their research.
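As an illustration of one of the methodologies listed above, the following is a minimal sketch of dynamic programming applied to a string problem: computing the edit distance between two sequences. It is not taken from the course materials; the function name `edit_distance` and the unit costs for insertion, deletion, and substitution are assumptions chosen for the example. The table is filled row by row, so the running time is O(mn) for sequences of lengths m and n.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn `a` into `b` (unit costs assumed)."""
    # prev[j] holds the distance between the first i-1 characters of `a`
    # and the first j characters of `b`; row 0 is the all-insertions case.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            substitution_cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,                      # delete ca
                curr[j - 1] + 1,                  # insert cb
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("ACGT", "AGT"))
```

Only two rows of the table are kept at a time, which reduces memory use to O(n) at the cost of not being able to trace back the alignment itself.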

Statistical Methods with Applications

by Rui Jiang, PhD

Statistics is a mathematical science pertaining to the collection, presentation, analysis, and interpretation of data. It is applicable to a wide variety of academic disciplines. In this course, we will focus on the principles of basic statistical inference methods and their real-life applications. We will introduce how to use statistical methods to summarize or describe a collection of data, and how to infer patterns implied in the data in a way that accounts for randomness and uncertainty in the observations.

The purpose of this course is to build applied statistics from the first principles of probability theory. Starting from the basics of probability, the properties of random variables, and the common families of distributions, we shall develop the methods of statistical inference, including point estimation, hypothesis testing, interval estimation, analysis of variance (ANOVA), and various regression models. Each definition, example, and technique is introduced as a natural extension or consequence of the concepts that precede it.

Throughout the course, we shall use R as the statistical computing platform to illustrate statistical methods and associated examples, and to demonstrate contemporary applications in bioinformatics.
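To make the inference methods named above more concrete, here is a minimal sketch of point estimation, interval estimation, and a test statistic for a population mean. It is written in Python using only the standard library, purely for illustration; the course itself uses R. The function names and the sample values are hypothetical, and the interval uses a large-sample normal approximation rather than the exact t distribution, which would be more appropriate for a sample this small.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return the sample mean (point estimate) and an approximate
    confidence interval for the population mean.

    Assumption: a normal (z) approximation is used for the quantile."""
    n = len(sample)
    point_estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n-1 denominator
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return point_estimate, (point_estimate - margin, point_estimate + margin)


def z_statistic(sample, mu0):
    """Standardized distance between the sample mean and a hypothesized
    mean `mu0`; large absolute values are evidence against mu = mu0."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - mu0) / standard_error


if __name__ == "__main__":
    # Hypothetical measurements, invented for this example.
    data = [4.8, 5.1, 5.0, 4.9, 5.2]
    estimate, (low, high) = mean_confidence_interval(data)
    print(estimate, low, high, z_statistic(data, mu0=5.0))
```

The interval and the test statistic share the same standard error, which reflects the close link between interval estimation and hypothesis testing.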