Genomic Signal Processing

Ilya Shmulevich
Edward R. Dougherty
Copyright Date: 2007
Pages: 312
    Book Description:

    Genomic signal processing (GSP) can be defined as the analysis, processing, and use of genomic signals to gain biological knowledge, and the translation of that knowledge into systems-based applications that can be used to diagnose and treat genetic diseases. Situated at the crossroads of engineering, biology, mathematics, statistics, and computer science, GSP requires the development of both nonlinear dynamical models that adequately represent genomic regulation, and diagnostic and therapeutic tools based on these models. This book facilitates these developments by providing rigorous mathematical definitions and propositions for the main elements of GSP and by paying attention to the validity of models relative to the data. Ilya Shmulevich and Edward Dougherty cover real-world situations and explain their mathematical modeling in relation to systems biology and systems medicine.

    Genomic Signal Processing makes a major contribution to computational biology, systems biology, and translational genomics by providing a self-contained explanation of the fundamental mathematical issues facing researchers in four areas: classification, clustering, network modeling, and network intervention.

    eISBN: 978-1-4008-6526-0
    Subjects: Ecology & Evolutionary Biology, Mathematics, Health Sciences

Table of Contents

  1. Front Matter
    (pp. i-iv)
  2. Table of Contents
    (pp. v-viii)
  3. Preface
    (pp. ix-xiv)
  4. Chapter One Biological Foundations
    (pp. 1-22)

    No single agreed-upon definition seems to exist for the term bioinformatics, which has been used to mean a variety of things ranging in scope and focus. To cite but a few examples from textbooks, Lodish et al. (2000) state that “bioinformatics is the rapidly developing area of computer science devoted to collecting, organizing, and analyzing DNA and protein sequences.” A more general and encompassing definition, given by Brown (2002), is that bioinformatics is “the use of computer methods in studies of genomes.” More general still: “bioinformatics is the science of refining biological information into biological knowledge using computers” (Draghici, 2003)....

  5. Chapter Two Deterministic Models of Gene Networks
    (pp. 23-76)

    A deterministic model of a genetic regulatory network can involve a number of different mechanisms that capture the collective behavior of the elements constituting the network. The models can differ in numerous ways:

    What physical elements are represented in the model (e.g., genes, proteins, other factors)

    Whether the model is capable of representing a dynamical process (i.e., Is there a notion of time?)

    At what resolution or scale is the behavior of the network elements captured (e.g., Are genes discretized, such as being either on or off, or do they take on continuous values?)

    How the network elements interact (e.g.,...
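
    As a rough illustration of a discretized, dynamical model in which each gene is either on or off, the following minimal sketch runs a synchronous Boolean network. The three update rules and the starting state are hypothetical and chosen only for the example; they are not taken from the book.

```python
def step(state, rules):
    """Apply every gene's update rule to the current state at once (synchronous update)."""
    return tuple(rule(state) for rule in rules)


def trajectory(state, rules, max_steps=50):
    """Follow the network from `state` until a state repeats or max_steps is reached."""
    path = [state]
    for _ in range(max_steps):
        state = step(state, rules)
        if state in path:  # reached a fixed point or re-entered a cycle
            break
        path.append(state)
    return path


# Hypothetical regulatory rules for three genes (True = on, False = off).
rules = (
    lambda s: s[1] and not s[2],  # gene 0: activated by gene 1, repressed by gene 2
    lambda s: s[0],               # gene 1: copies gene 0
    lambda s: s[0] or s[1],       # gene 2: on if gene 0 or gene 1 is on
)

path = trajectory((True, False, False), rules)
for t, s in enumerate(path):
    print(t, s)
```

    Time here is a discrete step counter, and the model resolution is binary; a continuous-valued model would replace the Boolean rules with real-valued update functions.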

  6. Chapter Three Stochastic Models of Gene Networks
    (pp. 77-159)

    Stochastic models of genetic regulatory networks differ from their deterministic counterparts by incorporating randomness or uncertainty. Most deterministic models can be generalized such that we associate probabilities with particular components or aspects of the model. For example, in an undirected graph model, instead of specifying whether an edge between two vertices is present or absent, we can associate a probability with that event. Thus, an edge can be present with probability 0.8, in essence making it a probabilistic edge. For example, because of the uncertainty inherent in making inferences from data such as from yeast two-hybrid experiments (Ito et al.,...
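
    The idea of a probabilistic edge can be sketched as follows. This is a minimal illustration, assuming each edge is present independently of the others; the gene names and all probabilities other than the 0.8 mentioned above are hypothetical.

```python
import random


def sample_graph(edge_probs, rng):
    """Draw one realization of an undirected graph: keep each edge with its own probability."""
    return {edge for edge, p in edge_probs.items() if rng.random() < p}


# Undirected edges are stored as frozensets so that {a, b} == {b, a}.
edge_probs = {
    frozenset({"g1", "g2"}): 0.8,  # the probabilistic edge from the text
    frozenset({"g2", "g3"}): 0.3,  # hypothetical
    frozenset({"g1", "g3"}): 1.0,  # a certain (deterministic) edge
}

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_graph(edge_probs, rng) for _ in range(10_000)]

# The empirical frequency of the g1-g2 edge should be close to 0.8.
freq = sum(frozenset({"g1", "g2"}) in s for s in samples) / len(samples)
print(f"g1-g2 present in {freq:.1%} of sampled graphs")
```

    Setting every probability to 0 or 1 recovers the deterministic graph model as a special case.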

  7. Chapter Four Classification
    (pp. 160-224)

    Pattern classification plays an important role in genomic signal analysis. For instance, cDNA microarrays can provide expression measurements for thousands of genes at once, and a key goal is to perform classification via different expression patterns. This requires designing a classifier (decision function) that takes a vector of gene expression levels as input and outputs a class label that predicts the class containing the input vector. Classification can be between different kinds of cancer, different stages of tumor development, or a host of such differences. Classifiers are designed from a sample of expression vectors. This requires assessing expression levels from...
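
    To make the notion of a classifier as a decision function concrete, here is a minimal sketch of one simple classification rule, nearest centroid, designed from a sample of expression vectors. The rule is chosen only for illustration and is not claimed to be the one the chapter develops; the expression values and class names are made up.

```python
import math


def fit_centroids(samples, labels):
    """Design step: compute the mean expression vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def classify(centroids, x):
    """Decision function: return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))


# Hypothetical training sample: three gene expression levels per tissue sample.
samples = [
    [2.0, 0.1, 1.0],
    [1.8, 0.3, 1.2],
    [0.2, 1.9, 1.1],
    [0.4, 2.1, 0.9],
]
labels = ["type_A", "type_A", "type_B", "type_B"]

centroids = fit_centroids(samples, labels)
print(classify(centroids, [1.7, 0.5, 1.0]))
print(classify(centroids, [0.1, 1.8, 1.0]))
```

    With thousands of genes and only a handful of samples, a classifier designed this way can fit the sample far better than it fits the population, which is why error estimation and sample size matter.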

  8. Chapter Five Regularization
    (pp. 225-261)

    Thus far we have taken the perspective that a collection of features is given, sample data are obtained, a classifier based on the features is designed from the data via a classification rule, and the error of the classifier is estimated from the data. The key design issue has been the relation between the classification rule and the sample size. The feature set and the sample data are taken as given, and the designer selects a classification rule. In this chapter, we consider alterations to this paradigm. First, we do not assume that the feature set is given; instead, it...

  9. Chapter Six Clustering
    (pp. 262-294)

    A classification operator takes a single data point and outputs a class label; a cluster operator takes a set of data points and partitions the points into clusters (subsets). Clustering has become a popular data analysis technique in genomic studies using gene expression microarrays (Ben-Dor et al., 1999). Time series clustering groups together genes whose expression levels exhibit similar behavior through time. Similarity indicates possible coregulation. Another way to use expression data is to take expression profiles over various tissue samples and then cluster these samples based on the expression levels for each sample. This approach is used to indicate...
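
    The contrast with classification can be sketched in code: a cluster operator receives the whole set of points and returns a partition. The example below uses k-means (Lloyd's algorithm) as one possible cluster operator, applied to hypothetical three-time-point expression profiles; the data, the choice of algorithm, and the initial centers are all assumptions made for illustration.

```python
import math


def kmeans(points, init_centers, max_iter=100):
    """Partition `points` into len(init_centers) clusters with Lloyd's algorithm."""
    centers = [list(c) for c in init_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:  # partition unchanged, so the algorithm has converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical time series: two genes rise over time, two fall.
profiles = [
    [0.1, 0.5, 0.9],  # gene 1
    [0.2, 0.6, 1.0],  # gene 2
    [0.9, 0.5, 0.1],  # gene 3
    [1.0, 0.6, 0.2],  # gene 4
]

cluster_labels, centers = kmeans(profiles, init_centers=[profiles[0], profiles[2]])
print(cluster_labels)
```

    Genes with the same label show similar behavior through time in this toy data; as the text notes, such similarity only indicates possible coregulation.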

  10. Index
    (pp. 295-299)