Constructing a Knowledge Base for Gene Regulatory Dynamics by Formal Concept Johannes Wollbold12, Reinhard Guthke2, and Bernhard Ganter1 1 University of Technology, Institute of Algebra, Dresden, Germany 2 Leibniz Institute for Natural Product Research and Infection Biology - oll-Institute (HKI) Jena, Germany Abstract. Our aim is to build a set of rules, such that reasoning overtemporal dependencies within gene regulatory networks is possible. Theunderlying transitions may be obtained by discretizing observed time se-ries, or they are generated based on existing knowledge, e.g. by Booleannetworks or their nondeterministic generalization. We use the mathemat-ical discipline of formal concept analysis (FCA), which has been appliedsuccessfully in domains as knowledge representation, data mining or soft-ware engineering. By the attribute exploration algorithm, an expert or asupporting computer program is enabled to decide about the validity ofa minimal set of implications and thus to construct a sound and com-plete knowledge base. From this all valid implications are derivable thatrelate to the selected properties of a set of genes. We present results ofour method for the initiation of sporulation in Bacillus subtilis. Howeverthe formal structures are exhibited in a most general manner. Thereforethe approach may be adapted to signal transduction or metabolic net-works, as well as to discrete temporal transitions in many biological andnonbiological areas.
Keywords: complete lattices, reasoning, temporal logic, gene expression As the mathematical methodology of formal concept analysis (FCA) is littleknown within systems biology, we give a short overview of its history and pur-poses. During the early years 1980, FCA emerged within the community of setand order theorists, algebraists and discrete mathematicians. Its first aim was tofind a new, concrete and meaningful approach to the understanding of completelattices (ordered sets such that for every subset the supremum and the infimumexist). The following discovery showed to be very fruitful: Every complete lat-tice is representable as a hierarchy of concepts, which were conceived as sets ofobjects sharing a maximal set of attributes. This paved the way for using thedeveloped field of lattice theory for a transparent and complete representation ofvery different types of knowledge. FCA was inspired by the pedagogue Hartmut von Hentig [7] and his program of restructuring sciences, with a view to interdis-ciplinary collaboration and democratic control. The philosophical backgroundgoes back to Charles S. Peirce (1839 - 1914), who condensed some of his mainideas to the pragmatic maxim: Consider what effects, that might conceivably have practical bear-ings, we conceive the objects of our conception to have. Then, ourconception of these effects is the whole of our conception of theobject. [14, 5.402] In that tradition, FCA aims at unfolding the observable, elementary proper- ties defining the objects subsumed by scientific concepts. If applied to temporaltransitions, effects of homogeneous classes of states can be modeled and pre-dicted in a clear and concise manner. Thus FCA seems to be appropriate todescribe causality - and the limits of its understanding.
At present, FCA is a richly developed mathematical theory, and there are practical applications in various fields as data and text mining, knowledge man-agement, semantic web, software engineering or economics [3]. FCA has beenused for the analysis of gene expression data in [2] and [13], but this is thefirst approach of applying it to model (gene) regulatory networks. The math-ematical framework of FCA is very general and open, such that multifariousrefinements are possible, according to current approaches of modeling dynamicswithin systems biology. On the other hand, we developed a formal structure forgeneral discrete temporal transitions. They occur in a variety of domains: controlof engineering processes, development of the values of variables or objects in acomputer program, change of interactions in social networks, a piece of music,etc.
In this paper, however, the examples are uniquely biological. The purpose is to construct a knowledge base for reasoning about temporal dependencies withingene regulatory or signal transduction networks, by the attribute explorationalgorithm: For a given set of interesting properties, it builds a sound, completeand nonredundant knowledge base. This minimal set of rules has to be checkedby an expert or a computer program, e.g. by comparison of knowledge basedpredictions with data.
Since there exist relatively fixed thresholds of activation for many genes, it is a common abstraction to consider only two expression levels off and on. Theclassical approach of Boolean networks [8] is able to capture essential dynamicaspects of regulatory networks. Our present work is based on it, which also makesit possible to use mathematical and logical derivations for deciding many rulesautomatically and for a scaling up to larger networks. Nevertheless, the intro-duction of more fine-grained expression levels remains possible, e.g. in the senseof qualitative reasoning [11]. Further, this work is influenced by computationtree logic [1], automata theory and a FCA modeling of temporal transitions in[15]. Temporal concept analysis as developed by K.E. Wolff [3, p. 127-148] ismore directed toward a structured visualization of experimental time series thentoward temporal logic. We applied it to the analysis of gene expression data in[19].
In Section 2, the general mathematical framework will be developed. Section 3 gives results for a B. subtilis Boolean network. In Section 4, we will discuss thepotential of the method and make some proposals for improving it by solvingmathematical problems which have emerged from the applications.
Fundamental Structures of Formal Concept Analysis One of the classical aims of FCA is the structured, compact but complete visual-ization of a data set by a conceptual hierarchy. We briefly introduce its basic def-initions; for an easy example see http://www.upriss.org.uk/fca/fca.html.
Definition 1. A formal context (G, M, I) defines a relation I ⊆ G × M be-tween objects from a set G and attributes from a set M . The set of the attributescommon to all objects in A ⊆ G is denoted by the 0-operator: A0 := {m ∈ M gIm for all g ∈ A}.
The set of the objects sharing all attributes in B ⊆ M is B0 := {g ∈ G gIm for all m ∈ B}.
Definition 2. A formal concept of the context (G, M, I) is a pair (A, B) withA ⊆ G, B ⊆ M, A0 = B and B0 = A. A is the extent, B the intent of theconcept (A, B).
Thus a formal context (G, M, I) is a special, but universally applicable type of a data table, provided with two operators CM : P(M ) → P(M ), B ⊆ M 7→B00 and CG : P(G) → P(G), A ⊂ G 7→ A00. It is easy to see that they areclosure operators, with the properties monotony, extension and idempotency [4,Definition 14]. It follows that the set of all extents resp. intents of a formalcontext is a closure system, i.e. it is closed under intersections [4, Theorem 1].
Formal concepts can be ordered by set inclusion of the extents or - dually, with the inverse order relation - of the intents. With this order, the set of allconcepts of a given formal context is a complete lattice [4, Theorem 3] (Figure1).
During the interactive attribute exploration algorithm [4, p. 85ff.], an expert is asked about the general validity of basic implications A → B between theattributes of a given formal context (G, M, I). An implication has the meaning:"If an object g ∈ G has all attributes a ∈ A ⊆ M , then it has also all attributesb ∈ B ⊆ M ." If the expert denies, s/he must provide a counterexample, i.e. a newobject of the context. If s/he accepts, the implication is added to the stem base ofthe - possibly enlarged - context. A theorem by Duquenne-Guiges [4, Theorem8] ensures that every implication semantically valid in the underlying formalcontext can be derived syntactically from this minimal set by the Armstrongrules [4, Proposition 21]. In many applications, one is merely interested in theimplicational logic of a given formal context, and there is no need for an expertto confirm the implications.
Constructing the Knowledge Base - Summary of the Method We start with two sets: – The universe E. The elements of E represent the entities of the world which we are interested in.
– The set F (fluents) denotes changing properties of the entities.
A state ϕ ∈ G is an assignment of values in F to the variables e ∈ E, hence it is defined by a specific choice of attributes m ∈ M ⊆ E ×F .3 By means of a statecontext (Definition 3, Table 3 left part), temporal data can be translated intothe language of FCA. The dynamics is modeled by a binary relation R ⊆ G × Gon the set of states, which gives rise to a transition context K (Definition 4,Table 1): the objects are transitions (elements of the relation) and the attributesthe values of the entities defining the input and output state of a transition.
This data table may reflect observations repeated at different time points, orthe transitions may be generated by a dynamic model. As to the latter, we arefocusing here on Boolean networks, i.e sets of Boolean functions for each entity(Definition 5).
It is promising to consider the transitive closure of R. Objects of the tran- sitive context Kt (Definition 6, Table 2) then are pairs of states such that theoutput state emerges from the input state by some transition sequence of arbi-trary length. Finally we extend the state context Ks by the temporal attributesalways(m), eventually(m) and never(m), which are determined by the giventransitions (Definition 3, Table 3).
The defined mathematical structures may be used in various ways. For in- stance, one could evaluate - i.e. generalize implications or reject them supposingoutliers or by reason of special conditions - experimental time series by compar-ison with existing knowledge. Our general procedure is the following: 1. Discretize a set of time series of gene expression measurements and transform it to an observed transition context 2. For a set of interesting genes, translate interactions from biological literature and databases into a Boolean network.
3. Construct the transition context K by a simulation starting from a set of states, e.g. the initial states of or all states (for small networks).
4. Derive the respective transitive contexts 5. Perform attribute exploration of Kt. Decide about an implication A → B, A, B ⊆ M , by checking its validity in and/or by searching for supple- mentary knowledge. Possibly provide a counterexample from 6. Answer queries from the modified context Kt and from its stem base.
In step 5., automatic decision criteria could be thresholds of support q = (A ∪ B)0 and confidence p = (A∪B)0 for an implication in obs. A weak criterion 3 Thus - as usual - states with the same variable values are identified. It would also be possible to distinguish them as situations by introducing a new attribute, e.g. "timeinterval".
is to reject only implications with support 0 (but if no object in attributes from A, the implication is not violated). In [18], a strong criterion hasbeen applied: implications of Kt had to be valid also in the observed context(p = 1). This is equivalent to an exploration of the union of the two contexts.
In Section 3 we will analyse pure knowledge based simulations; the validation by data and experimental literature had been done before in [5]. For that reason,in step 5. the stem base is computed automatically, without further confirmationby an expert.
Step 2. could be supported by text mining software. Then attribute explo- ration provides strong criteria of validation. We implemented the steps 1., 3.
and 4. in R [www.r-project.org]. For step 5., we used the Java tool Concept Ex-plorer [http://sourceforge.net/projects/conexp]. The output was translated withR into a PROLOG knowledge base. The R scripts are available on request.
Definition of the Relevant Formal Contexts With given sets E and F , we define a state as a map ϕ : E → F . To explorestatic features of states, the following formal context4 is defined.
Definition 3. Given two sets E (entities) and F (fluents), a state context isa formal context (G, M, I) with G ⊆ F E := {ϕ : E → F } and M ⊆ E × F ; itsrelation I is given as ϕ I (e, f ) ⇔ ϕ(e) = f , for all ϕ ∈ G, e ∈ E and f ∈ F .
Definition 4. Given a state context (G, M, I) and a relation R ⊆ G × G, atransition context K is the context (R, M × {in, out}, ˜ I) with the property ∀i ∈ {in, out} : (ϕin, ϕout) ˜ I(e, f, i) ⇔ ϕi(e) = f.
Transitions may be generated by a Boolean network: Definition 5. Let E be an arbritray set of entities, F := {0, 1} (fluents), andstates G ⊆ F E. Then a transition function F E → F E is called a Booleannetwork.
We will identify the elements of F with −, + or off, on respectively. This defini-tion is subsumed by the definition of a dynamic network in [12, Definition p. 34],with a set of variables E and state sets X1 = . = Xn = F . We use a parallelupdate schedule, i.e. the order relation on E is empty. Boolean networks maybe generalized in order to include nondeterminism; then different output statesϕout are generated from a single input state ϕin (see Section 3, compare [18]).
Definition 6. A transition context K with a transitively closed relation t(R) ⊆G × G is called a transitive context Kt.
4 It is equivalent to a "many-valued context" with "nominal scale" [4, Section 1.3], [18, p. 123]. For better readability, we draw the contexts in the latter form (Table 3).
Deriving a one-valued context according to Definition 3 is obvious: each many-valuedattribute e is replaced by {(e, f ) f ∈ F }), e.g. SigA by SigA.off and SigA.on. If anattribute e takes exactly one of these values, negation of on and off is expressed.
Other kinds of scaling like the interordinal scale could be interesting, if there aremore than two levels ( F > 2).
Table 1. A transition context for the states of Table 3, with all attributes that arechanging during the small simulation, as well as Spo0A and Spo0AP.
Table 2. The transitive context derived from the transition context of Table 1.
Definition 7. A state context K = (G, M, I) is extended to a formal contextKs = (G, M ∪ T , I ∪ IT ) by a set of temporal attributes T := {always(m) m ∈M } ∪ {never(m) m ∈ M } ∪ {eventually(m) m ∈ M }. Let ϕin ∈ G, m =(e, f ), e ∈ E, f ∈ F , and t(R) ⊆ G × G a transitively closed relation. Therelation IT of Ks then is defined as follows: ϕin IT always(m) ⇔ ∀(ϕin, ϕout) ∈ t(R) : ϕout(e) = f ϕin IT never(m) ⇔ ∀(ϕin, ϕout) ∈ t(R) : ϕout(e) 6= f ϕin IT eventually(m) ⇔ ∃(ϕin, ϕout) ∈ t(R) : ϕout(e) = f For B ⊆ T , set always(B) := {{always(b1), ., always(bi)} b1, ., bi ∈ B}, andanalogously never(B) and eventually(B).
The attributes will be abbreviated to alw(m), nev(m) and ev(m). In a nondeter-ministic setting, alw(m) and nev(m) refer to all possible transition paths startingfrom ϕin, ev(m) to the existence of a path.
Dependency of Contexts and Background Knowledge In the following we will present first mathematical results that can improvecomputability; they are not necessary for the understanding of the applicationin Section 3. By entering background knowledge (not necessarily implications) Table 3. Left part : A state context corresponding to a simulation starting from a B.
subtilis state without nutritional stress (see Section 3.1, [16, Table 4]). +: on, -: off.
Right part : extension by temporal attributes. Here they are the same for all states,since these reach the attractor (limit state cycle) {ϕ1, ϕ2} after at most one time step.
prior to an attribute exploration, the algorithm may be shortened considerably[3, p. 101-113]. We searched for first order logic background formula in order touse the results of an attribute exploration for the exploration of the next contextin the hierarchy. Then the implications of the latter context are derivable fromthis background knowledge and a reduced set of new implications. Also duringthe exploration of one context, implications can be decided automatically basedon already accepted implications. In this way the expert is enabled to concentrateon really interesting hypotheses. Thus, the implications of a state context holdin the input and output part of the corresponding transition context (for anexample see [15, p. 149f.]). Related to transitive and extended state contexts,the subsequent result holds: Proposition 1. Let Ks = (G, M ∪ T, I ∪ IT ) an extended state context. Sup-pose the relation t(R) ⊆ G × G is the object set of the transitive context Kt =(t(R), M × {in, out}, ˜ I). Then the following entailments between implications of both contexts are valid: Bin → mout in Kt ≡ B → always(m) in Ks Bin → mout in Kt = B → eventually(m) in Ks Bout → mout in Kt = always(B) → always(m) in Ks Bout → mout in Kt = eventually(B) → eventually(m) in Ks Bin ∪ mout → ⊥ in Kt = B → never(m) in Ks If the latter implication does not follow from the stem base of Kt, this is equivalentto B → eventually(m) in Ks.
Proof. The proofs are straightforward from the definitions.
In order to get a complete overview on valid entailments, as a first step we performed rule exploration [20] of the following test context, i.e. explorationof Horn rules instead of implications, thus variables are admitted: The objects are all possible Kt respectively the corresponding Ks, and the attributes arethe following classes of implications with "homogeneous" premises. Then theexplored rules for implications correspond to entailments valid for the semanticsgiven by the objects, the transitive contexts. The sets are nonempty subsets ofM , m = (e, f ), f ∈ F := {0, 1}, and m ∈ B0, C0 (B1, C1) ⇒ ϕ(e) = 0 (1). Wesuppose that all states and transitions are completely defined.
12. Bout → Cout 2. Bin → Cout ≡ Bin → nev(C 13. Bout → Cout 3. Bin → Cout ≡ Bin → alw(C 4. Bin → ev(C1) 15. ev(B1) → alw(C1) 7. ev(B1) → Cin 18. alw(B1) → alw(C1) 8. alw(B1) → Cin 19. alw(B1) → nev(C1) 9. nev(B1) → Cin 20. nev(B1) → ev(C1) 10. Bout → Cout 11. Bout → Cout The equivalences in 2. and 3. follow from Proposition 1(2). Since the impli- cations comprising input attributes are independent from those related only tooutput attributes, rule exploration was performed (almost) independently forthe first 9 and the remaining 13 implications. Results for the second part areshown here.
The exploration started from a hypothetical context as single object of the test context, where no implications were valid. Before, we had added 25 knownentailments as background rules (BR), like those of Proposition 1 or followingfrom the definitions, like alw(B1) → ev(B1). A counterexample represents a sig-nificant class of contexts. They had to be chosen carefully, since an object nothaving its maximal attribute set might preclude a valid entailment.5 The explo-ration resulted in the following stem base of only 14 entailments. Most of themare background rules (they are accepted automatically during the exploration),but not all of these are needed in order to derive all valid entailments betweenthe chosen implications. This demonstrates the effectivity and minimality of thealgorithm. Entailments 5., 6., 7. and 10. were newly found.
1. nev(B1) → alw(C1) = nev(B1) → ev(C1) (BR 1)2. nev(B1) → ev(C1), nev(B1) → nev(C1) = ⊥ (BR 11)3. alw(B1) → alw(C1) = alw(B1) → ev(C1) (BR 2)4. alw(B1) → ev(C1), alw(B1) → nev(C1) = ⊥ (BR 14)5. ev(B1) → nev(C1), nev(B1) → nev(C1) = Bin → Cout, Bout → Cout, Bout 6. ev(B1) → nev(C1), alw(B1) → nev(C1) = Bout → Cout 7. ev(B1) → nev(C1), alw(B1) → ev(C1) = ⊥ 5 Thus the attribute set of a counterexample must be a concept intent in the final test 8. ev(B1) → alw(C1) = ev(B1) → ev(C1), alw(B1)) → ev(C1), alw(B1)) → 9. ev(B1) → ev(C1) = alw(B1) → ev(C1) (BR 4) 10. ev(B1) → ev(C1), nev(B1) → ev(C1) = Bin → Cout, Bout → Cout, Bout → 11. Bout → Cout = ev(B 1) → ev(C1), alw(B1) → ev(C1), alw(B1) → alw(C1) (BR 4, BR 5 ⇐ Proposition 1(4)(5)) 12. Bout → Cout = alw(B 1) → nev(C1) (BR 9 ⇐ Proposition 1(4)) 13. Bout → Cout = nev(B 1) → ev(C1), nev(B1) → alw(C1) (BR 1, 10 ⇐ Propo- 14. Bout → Cout = nev(B 1) → nev(C1) (BR 6 ⇐ Proposition 1(4)) It remains to prove the rules of this stem base, which is easy; we are giving some hints: BR 1, 2, 3, and 4 are based on alw(A) → ev(A), A ⊆ M , and BR 11and 14 on nev(C1), ev(C1) → ⊥ (⊥ = set of all attributes, and the correspondingobject set is empty).
7.: Since nev(C1) and ev(C1) do not occur together by definition, the combi- nation of the two implications has support 0 in the test context. In the underlyingcontexts, the premise alw(B1) (a subcase of ev(B1)) is no attribute of any state.
The implication alw(B1) → ⊥ holds, which has not been considered explicitly.
10.: Inversely, in all possible cases the states / transitions have the attribute ev(C1), and therefore also alw(C1) and Cout. Explicitly: > → ev(C alw(C1), > → Cout. 5. is a parallel rule concerning nev(C Rules 7. and 10. suggest that implications with empty premise > or con- clusion ⊥ should be considered explicitly. If the counterexamples have maximalattribute sets, as a conclusion it can be stated that we have derived a set of rulesrepresenting a minimal, sound and complete entailment calculus for the selectedclasses of implications for transition and state contexts.
Results: Sporulation in Bacillus subtilis In order to demonstrate the characteristics of the proposed method, we will applyit to a gene regulatory network assembled in [5] and transformed to a Petri netas well as a Boolean network in [16].
B. subtilis is a gram positive soil bacterium. Under extreme environmental stress, it produces a single endospore, which can survive ultraviolet or gammaradiation, acid, hours of boiling or long periods of starvation. The bacteriumleaves the vegetative growth phase in favour of a dramatically changed andhighly energy consuming behaviour, and it dies at the end of the sporulationprocess. This corresponds to a switch between two clearly distinguished geneticprograms, which are complex but comparatively well understood.
By literature and database search, de Jong et al. [5] identified 12 main regula- tors, constructed a model of piecewise linear differential equations and obtainedrealistic simulation results. An exogenous signal (starvation) triggers the phos-phorylation of the transcription factor Spo0A to Spo0AP by the kinase KinA;this process is reversible by the phosphatase Spo0E. Spo0AP is necessary to transcribe SigF, which regulates genes initiating sporulation and therefore is anoutput of the model. The interplay with other transcription factors AbrB, Hpr,SigA, SigF, SigH and SinR is graphically represented in [5, Figure 3]; SinI in-activates SinR by binding to it. SigA and Signal are considered as an input tothe model and are always on. Table 4 lists the Boolean equations building themodel in [16] (communicated by the author). They exhibit a certain degree ofnondeterminism, since the functions for the off fluents sometimes are not thenegation of the on functions. This accounts for incomplete or inconsistent knowl-edge. In the case of state transitions determined by k conflicting function pairs,we generated 2k output states.
Table 4. Boolean rules for the nutritional stress response regulatory network, derivedin [16] from [5]. x ˆ = SigA AbrB Spo0AP = SigA + AbrB + Spo0AP = (SigH Spo0AP SinR) + (SigH Spo0AP SinI) = (SinR SinI) + SigH + Spo0AP = (SigH Spo0AP) + (SigA Spo0AP) = (SigA SinR SinI) + (SigH SigA ) + Spo0AP Spo0AP = Signal Spo0A Spo0E KinASpo0AP = Signal + Spo0A + Spo0E + KinASpo0E = SigA AbrB Spo0AP = SigA + AbrB + Spo0AP = (SigA AbrB Hpr SinR SinI Spo0AP) + (SigA AbrB Hpr SinR SinI Spo0AP) = SigA + AbrB + Hpr + (SinR SinI) + (SinR SinI) + Spo0AP) = TRUE (input to the model) = TRUE or FALSE (constant, depending on the input state) Simulation Starting from a State Typical for the VegetativeGrowth Phase We performed supplementary analyses of the transitions starting from a typicalstate without the starvation signal [16, Table 4]. The concept lattice for theresulting transitive context (Table 2, with a part of the attribute set only) isshown in Figure 1. The larger circles at the bottom represent object concepts; their extents are the four single transitions with the input state at t = 0 ort = 2, and the intents are all attributes above a concept. Thus for instance, thetwo latter transitions have the attribute KinA.in.on in common, designating therespective concept. Moreover, they are distinguished unambigously from othersets of transitions by this attribute - the concept is generated by "KinA.in.on".
Fig. 1. Concept lattice (computed and drawn with Concept Explorer) representinga simulation without nutritional stress. Signal : starvation; AbrB, Hpr, SigA, SigF,SigH, SinR, Spo0A (phosporylated form Spo0AP ): transcription factors; KinA: ki-nase; Spo0E : phosphatase; SinI inactivates SinR by binding to it. i-j indicates atransition (ϕin ). Bold / blue lines: Filter (superconcepts) and ideal (subcon- cepts) of the concept ({(ϕin )}, {AbrB.in.off, SigH.in.off, SpoOE.in.off, Hpr.in.on}) Implications of the stem base can be read from the lattice. For instance there are implications between the generators of a concept: < 4 > AbrB.in.off → SigH.in.off, SpoOE.in.off, Hpr.in.on Analogous implications hold for the attributes of the conclusion, and there areimplications between attributes of sub- and superconcepts. < 4 > indicates thatthe rule is supported by four transitions.
The bottom concept has an empty extent. Its intent is the set of attributes never occuring during this small simulation. The top concept does not have an empty intent - as it is often the case -, but it consists of 10 attributes commonto all 6 transitions. The corresponding rule has an empty body (>): < 6 > > → Signal.in.off, SigA.in.on, SigF.in.off, Spo0A.in.on, Spo0AP.in.off, SinR.in.off, SinI.in.off, Signal.out.off, SigA.out.on, SigF.out.off, Spo0A.out.on, Spo0AP.out.off SinR.out.off SinI.out.off Related to the simulation in the presence of nutritional stress, the transitive context has about 20 transitions, 500 concepts and 50 implications. In a suchcase it is more convenient to query the implicational knowledge base. But alsofor the visualization of large concept hierarchies, there exist more sophisticatedtools like the ToscanaJ suite [http://sourceforge.net/projects/toscanaj/].
Analysis of All Possible Transitions In order to analyse the dynamics of the B. subtilis network exhaustively, wegenerated 4224 transitions from all possible 212 = 4096 initial states (thus therules are nearly deterministic). There were 11.700 transitions in the transitivecontext, from which we computed the stem base containing 524 implicationswith support > 0, but 11.023.494 ≈ 224 concepts.
It was not feasible to provide biological evidence for a larger part of the im- plications, within the scope of this methodological study. This could be done byliterature search, especially automatic text mining, by new specialized experi-ments, or - in a faster, but less reliable way - by comparison with high-throughputobserved time series [18, 3.2]. Instead we will give examples for classes of impli-cations that can be validated or falsified during attribute exploration in specificways. We start with the examples of [16, 4.3].
– "For example, we know that in the absence of nutritional stress, sporulation should never be initiated [5]. We can use model checking to show this holdsin our model by proving that no reachable state exists with SigF = 1 startingfrom any initial state in which Signal = 0, SigF = 0 and Spo0AP = 0." [16,341] This is equivalent to the rule following from the stem base: Signal.in.off, SigF.in.off, Spo0AP.in.off → SigF.out.off, – SigF.out.on → KinA.out.off, Spo0A.out.off, Hpr.out.off, AbrB.out.off: Spo0AP is reported to activate the production of SigF but also repress itsown production (mutual exclusion). [5] – SigH.out.off → AbrB.out.off, SpoOE.out.off, SinR.out.off, SinI.out.off All these genes are regulated gene.out = SigA.in + AbrB.in (+ .).
In our approach, such dependencies and mutual exclusions can be checked systematically. We searched the stem base for further interesting and simple < 4500 > Spo0AP.in.on, KinA.out.off → Hpr.out.off < 4212 > SigH.in.on. KinA.out.off → Hpr.out.off < 3972 > AbrB.in.off, KinA.out.off → Hpr.out.off Hpr and KinA are determined by different Boolean functions, but they are coreg-ulated in all states emerging from any input state characterized by the singleattributes Spo0AP.on, SigH.on or AbrB.on.
< 3904 > AbrB.out.on → SigA.in.on, SigA.out.on, SigF.out.off, Spo0A.out.on, Spo0E.out.on, SigH.out.on, Hpr.out.off, SinR.out.off, SinI.out.off AbrB is an important "marker" for the regulation of many genes, which is un-derstandable from the Boolean rules with hindsight. By a PubMed query, aconfirmation was found for downregulation of SigF together with upregulationof AbrB [17].
Finally we entered sets of interesting attributes as facts into the PROLOG knowledge base, such that a derived implication was computed6. Complemen-tary to (9), we searched after conditions for the switch towards sporulation(SigF.out.on) and found the implication SigF.in.off, Spo0AP.in.off, SigF.out.on → Signal.in.on. Signal.out.on, SigA.in.on, SigA.out.on, Spo0AP.out.off, (14) Spo0A.out.off, AbrB.out.off, KinA.out.off, Hpr.out.off.
The latter four attributes follow immediately from the Boolean rules, but Spo0AP.out.off depends in a more complex manner on the premises. It is alsonoteworthy that the class of input states developing to a state with attributeSigF.out.on is only characterized by the common attributes Signal.in.on andSigA.in.on, i.e. the initial presence or absence of no other gene is necessary forthe initiation of sporulation7 The present work translates observations and simulations of discrete temporaltransitions into the language of formal concept analysis. The application to awell studied gene regulatory network showed how a model can be validated ina systematic way, by drawing clear and complete consequences from the theory(the knowledge based network), and we found interesting new transition rules.
6 and accordingly the closure of the attribute set7 For this complete simulation, the conditions Signal = SigA = TRUE had been dropped, but they were supposed to be constant.
The approach could be expanded by accounting for the change of the networkstructure itself in strongly different biological situations, e.g. with or withoutstress. Thus in ongoing work we adapt a literature based network to observedtranscriptome time series, resulting in two sets of Boolean functions related tothe stimulation of human fibroblast cells by the cytokines Tnfα or Tgfβ.
Until now we have established the foundation in order to exploit manyfold mathematical results of FCA for the analysis of gene expression dynamics andof discrete temporal transitions in general. An important question is: How canattribute exploration be split into partial problems, in this special case? Forinstance, one could focus on a specific set of genes first, which is understand-able as a scaling [4, Definition 28]. Then the decomposition theory of conceptlattices will be useful, which permits an elegant description by means of thecorresponding formal contexts. [4, Chapter 4] The price of the logical completeness is its computational complexity. In this regard the status of attribute exploration has not yet fully been clarified.
Computation time strongly depends on the logical structure of the context, andthere exist cases where the size of the stem base is exponential in the size ofthe input [10]. However, deriving an implication from the stem base is possiblein linear time, related to the size of the base, and the PROLOG queries in Sec-tion 3.2 were very fast. As demonstrated in Section 2.4, attribute explorationcan be shortened by background knowledge. Further it will be crucial to decideimplications without the necessity to generate all possible transitions. For thatpurpose, model checking [6] could be a promising approach, or the structural andfunctional analysis of Boolean networks by an adaptation of metabolic networkmethods in [9]. There, determining activators or inhibitors corresponds to thekind of rules found by our method, and logical steady state analysis indicateswhich species can be produced from the input set and which not. An excitingdirection of research would be to conclude dynamical properties of Boolean net-works from their structure and the transition functions, e.g. by regarding them aspolynomial dynamical systems over finite fields [12, Section 4] and by exploitingtheoretical work in the context of cellular automata [12, Section 6].
The present work is a first step to use the potential of formal concept analysis for solving questions within systems biology. As indicated, many directions ofresearch are possible. We encourage their investigation and are open to anycollaboration with mathematicians, computer scientists or (systems) biologists.
Acknowledgement. The work was supported by the German Federal Ministryof Education and Research BMBF (FKZ 0313652A).
1. Chabrier-Rivier, N. et al.: Modeling and Querying Biomolecular Interaction Net- works. Theor. Comp. Sc. 325(1), 25-44 (2004) 2. Choi, V., et al.: Using Formal Concept Analysis for Microarray Data Comparison.
Advances in Bioinformatics and Computational Biology 5, 57-66 (2006) 3. Ganter, B., Stumme, S., Wille, R.: Formal Concept Analysis - Foundations and Applications. LNAI, vol. 3626. Springer, Heidelberg (2005) 4. Ganter, B., Wille, R.: Formal Concept Analysis - Mathematical Foundations.
Springer, Heidelberg (1999) 5. de Jong, H., et al.: Qualitative Simulation of the Initiation of Sporulation in Bacillus subtilis. Bulletin of Mathematical Biology 66, 261-299 (2004) 6. Esparza, J.: Model checking using net unfoldings. Sci. Comput. Programm. 23, 151- 7. von Hentig, H.: Magier oder Magister? ¨ Uber die Einheit der Wissenschaft im andigungsprozess. Suhrkamp, Frankfurt (1974) 8. Kauffman, S.A.: The Origins of Order: Self-Organization and Selection in Evolution.
Oxford University Press, New York (1993) 9. Klamt, S., et al.: A methodology for the structural and functional analysis of sig- naling and regulatory networks. BMC Bioinformatics 7(56) (2006) 10. Kuznetsov, S.O., Obiedkov, S.A.: Counting Pseudo-intents and #P-completeness.
In: Missaoui, R., Schmid, J. (eds.) ICFCA 2006. LNCS, vol. 3874, pp. 306-308.
Springer, Heidelberg (2006) 11. King, R.D., Garrett, S.W, Coghill, G.M.: On the Use of Qualitative Reasoning to Simulate and Identify Metabolic Pathways. Bioinformatics 21(9), 2017-2026 (2005) 12. Laubenbacher, R.: Algebraic Models in Systems Biology. In: Anai, H., Horimoto, K. (eds.) Algebraic Biology 2005, pp. 33-35. Universal Academy Press, Tokyo (2005) 13. Motameny, S., Versmold, B., Schmutzler, R.: Formal Concept Analysis for the Iden- tification of Combinatorial Biomarkers in Breast Cancer. In: Medina, R., Obiedkov,S.A. (eds.) ICFCA 2008. LNCS, vol. 4933, pp. 229-240. Springer, Heidelberg (2008) 14. Peirce, C.S.: How to Make Our Ideas Clear. In: Hartshorne, C., Weiss, P. (eds.) Collected papers. Harvard University Press, Cambridge / Mass. (1931-35) 15. Ganter, B., Rudolph, S.: Formal Concept Analysis Methods for Dynamic Concep- tual Graphs. In: ICCS 2001, LNAI 2120, pp. 143-156. Springer, Heidelberg (2001) 16. Steggles, L. J. et al.: Qualitatively modelling and analysing genetic regulatory networks: a Petri net approach. Bioinformatics 23(3), 336-343 (2007) 17. Tomas, C.A. et al.: DNA array-based transcriptional analysis of asporogenous, nonsolventogenic Clostridium acetobutylicum strains SKO1 and M5. J Bacteriol.
15, 4539-47 (2003) 18. Wollbold, J.: Attribute Exploration of Discrete Temporal Transitions. In: G´ et al. (eds.) Contributions to ICFCA 2007, pp. 121-130. Clermont-Ferrand (2007) 19. Wollbold, J., Huber, R., Wolff, K.E.: Conceptual Representation of Gene Ex- pression Processes. In: Knowledge Processing in Practice 2007. To appear in LNAI.
Springer, Heidelberg (2008) 20. Zickwolff, M.: Rule Exploration: First Order Logic in Formal Concept Analysis.
PhD thesis. University of Technology, Darmstadt (1991)

Source: http://www.risc.jku.at/conferences/ab2008/prelim/51470232.pdf


Antibiotics and antibiotic resistance Spotlight on Antibiotic Research UKColin Garner Medicine as currently practised throughout the world is threatened by the rise of antibiotic resistant (Chief Executive, Antibiotic infections. In 2014 it was reported that 50,000 people a year die from antibiotic resistant infections in the USA and Europe. Worldwide there may be as many as 700,000 deaths a year. Jim O'Neill, the eminent economist

Vaz newsletter - march 2016 - final

Publication of the Veterinary Association of Zambia—Vol. 4 (4) THE VETERINARIAN Position Statement from the Veterinary Association of Zambia on the use of "Veterinary Prescriptions" to purchase "Prescription-Only Medications" from veterinary drug outlets The Veterinary Association of Zambia Executive Committee dosages, and for the wrong conditions). Greater veterinary