How should the weights of Concept Network links be computed (when building a Concept Network to represent bibliographic references)? Work on exploiting the terms of bibliographic databases points to the use of co-occurrence.
In the field of knowledge acquisition, it is generally accepted that there are two ways of proceeding: upward and downward. The downward way is called "onomasiologique" in French (the English counterpart is "onomasiological") and starts from the conceptual level (a model) in order to understand texts. It is efficient when the documents are tightly structured, but it copes poorly with unexpected knowledge. The upward way is called "sémasiologique" in French ("semasiological" in English) and starts from the data in order to build conceptual entities. The building of the Concept Network can therefore be called semasiological, since it starts from the data of the reference base in order to build concepts.
In (Frath et al. 1995), the authors state that, for them, "meaning is built ... thanks to a combination: the constituents of a syntagm act to semantically constrain each other, and by doing so specify the syntagm's meaning" (a rough translation of "le sens se construit essentiellement grâce à une combinatoire: les constituants d'un syntagme exercent les uns sur les autres des contraintes sémantiques qui en restreignent et donc en précisent le sens.") Their system, which helps extract conceptual entities and relationships from a text, extracts repeated segments, simplifies them, generalizes them morphologically (a rough lemmatization), and then searches for co-occurrences of pairs of words. These relationships are then labelled manually. Our understanding of meaning is similar: a word's meaning becomes more and more precise only thanks to the other words (or concepts) associated with it.
Researchers analyzing human understanding during reading have pointed out structures similar to those of the Concept Network. For Fayol (Fayol 1992), schemas designate "blocks" of knowledge concerning a domain; they are built from semantic networks whose elements have privileged relationships because of their frequent co-occurrence. Thus, for Fayol, elements that co-occur frequently can be brought closer to one another. Moreover, authors in the literature refer to an activation mechanism, in which activation spreads through the networks that make up the schemas. Likewise, Segui (Segui 1992) reports that presenting a stimulus word activates not only its own lexical representation, but also those of a set of words that are its orthographic neighbors, in order to quickly delimit the candidates for recognition during reading. When moving from the framework of strict orthographic recognition to that of conceptual recognition, orthographic neighborhood should be replaced by conceptual neighborhood. He also says that it is experimentally possible to influence the recognition of a word by modifying beforehand the activation value of its most frequent neighbors.
In his Ph.D. thesis on the analysis of associations (Michelet 1988), Michelet says: "giving the most relevant associations of a term allows one to reconstitute a definition for it: the essence of a definition is association." He presents association indices based on the co-occurrence of terms. According to his definition, "an association index must yield nondecreasing values when co-occurrence increases." That is fairly intuitive: the more often two terms appear together, the stronger their association (in our case, the stronger their influence on each other). Moreover, "an association index between two terms must not increase if a record containing only one of the two terms is added to the base." It would be harmful if such an addition changed the influence of one term on another in that way: the association of the two terms would increase while their co-occurrence stayed the same.
Let C_{i} be the number of occurrences of object i in a base of size N.
Let C_{i j} be the number of records in this base in which objects i and j co-occur.
The equivalency index:
E_{i j} = C^{2}_{i j} / (C_{i} x C_{j})
"shows all the `good' properties ...: it's a local association index"
Here, an association index is said to be homogeneous if it remains constant when all its variables are multiplied by the same constant factor, and local if it does not depend on the size of the base.
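As an illustration, the counts C_{i} and C_{i j} and the equivalency index can be computed along the following lines. This is only a minimal sketch: it assumes that each record is available as a set of terms, and the function names and the toy base are made up for the example, not taken from the original system.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(records):
    """Return (occurrences, joint): C_i per term and C_ij per unordered pair."""
    occurrences = Counter()
    joint = Counter()
    for record in records:
        terms = sorted(set(record))  # a term counts once per record
        occurrences.update(terms)
        # Pairs are stored in alphabetical order, e.g. ("concept", "network").
        joint.update(combinations(terms, 2))
    return occurrences, joint


def equivalency_index(c_i, c_j, c_ij):
    """E_ij = C_ij^2 / (C_i * C_j)."""
    return c_ij ** 2 / (c_i * c_j)


# Hypothetical toy base: each record is the set of terms of one reference.
records = [
    {"network", "concept", "agent"},
    {"network", "concept"},
    {"network", "blackboard"},
    {"concept", "agent"},
]
occurrences, joint = count_cooccurrences(records)
e = equivalency_index(
    occurrences["concept"],
    occurrences["network"],
    joint[("concept", "network")],
)
print(f"E(concept, network) = {e:.3f}")
```

Multiplying the three counts by the same factor leaves the result unchanged (homogeneity), and N never appears in the formula (locality).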
This equivalency index gives a notion of conceptual proximity: two terms that often appear in the same record should be conceptually linked. As Michelet says: "Statistical association coefficients can be used to give an idea of the structural links existing in the vocabulary. ... statistical aggregations do not point to a `logical' linking but, on the contrary, to a convergence of interest."
As we want a way of computing the influence of one node on another, we could turn the equivalency index into a bidirectional influence (i.e. the same influence from node 1 towards node 2 as from node 2 towards node 1). That would be acceptable behavior for some applications (for example, the Traveling Salesman Problem, which has also been coded, is a different application whose links were doubled to make them bidirectional). But in the case of bibliographic references, one wants term 1 to be able to influence term 2 differently from the way term 2 influences term 1. Indeed, take the example of an author and one of his coauthors. Let A_{1} be the first author and A_{2} his coauthor in a reference. Let C_{1} be the number of appearances of A_{1} in the base, and C_{2} that of A_{2} in the same base.
Let C_{1 2} be the number of joint articles of the two authors. Let's give values to these variables:
C_{1} = 50, C_{2} = 5, C_{1 2} = 4.
For the equivalency index, E_{1 2} = 6.4%. However, one can easily see that A_{2} is much more strongly related to A_{1} than A_{1} is to A_{2}, since almost all of his references have A_{1} as a coauthor.
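Spelled out with the values above, the arithmetic behind the 6.4% figure is:

```latex
E_{1 2} = \frac{C_{1 2}^{2}}{C_{1} \times C_{2}}
        = \frac{4^{2}}{50 \times 5}
        = \frac{16}{250}
        = 0.064 = 6.4\%
```

Since the formula is symmetric in i and j, E_{2 1} takes exactly the same value, which is why this index cannot distinguish the two directions.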
The inclusion index (Michelet 1988) captures this notion of the "influence" of one term on another much better:
I_{i→j} = C_{i j} / C_{i}
Here, I_{1→2} = 4/50 = 8% whereas I_{2→1} = 4/5 = 80%.
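The same two values can be reproduced with a few lines of code. This is a small sketch in which the function name inclusion_index is an assumption made for the example; only the formula and the counts come from the text above.

```python
def inclusion_index(c_source, c_joint):
    """I_{i->j} = C_ij / C_i: share of the records of i that also contain j."""
    if c_source <= 0:
        raise ValueError("the source term must occur at least once")
    return c_joint / c_source


# Counts from the coauthor example: C_1 = 50, C_2 = 5, C_12 = 4.
c_1, c_2, c_12 = 50, 5, 4

i_1_to_2 = inclusion_index(c_1, c_12)  # influence of A_1 on A_2
i_2_to_1 = inclusion_index(c_2, c_12)  # influence of A_2 on A_1
print(f"I(1->2) = {i_1_to_2:.0%}, I(2->1) = {i_2_to_1:.0%}")
```

Unlike the equivalency index, the result depends on which term is taken as the source, so each link gets one weight per direction.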
As the activation value of a node propagates towards the other nodes according to its influences, and as a node is activated when an agent finds one of its instances in the Blackboard, it is better to use the inclusion index to represent the influence of A_{2} on A_{1}.
Indeed, if the system updates A_{2}, there is an 80% probability (according to the statistics of the learning base) that A_{1} was also in the reference being processed, whereas if the system finds A_{1}, there is only an 8% likelihood of finding A_{2} in the same reference.
Still according to (Michelet 1988): "if one observes a property a, when there is a probability P1 that one also observes the property b, this probability is estimated by the relative frequency of appearance of b knowing that a exists, i.e. by the inclusion coefficient I_{a b} = C_{a b} / C_{a}." The influence I_{i→j} is therefore an estimate of the probability of observing term j given that term i has been observed; it is thus an estimate of the conditional probability P(j|i).
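To make the link with the weights of the network concrete, here is one possible way of deriving a directed weight for every pair of terms from a learning base. It is a sketch under assumptions: records are taken to be sets of terms, and the name link_weights, the dictionary layout and the toy base are hypothetical, not the data structures of the actual Concept Network.

```python
from collections import Counter
from itertools import permutations


def link_weights(records):
    """Map each ordered pair (i, j) to C_ij / C_i, an estimate of P(j | i)."""
    occurrences = Counter()
    joint = Counter()
    for record in records:
        terms = set(record)  # a term counts once per record
        occurrences.update(terms)
        joint.update(permutations(terms, 2))  # both (i, j) and (j, i)
    return {(i, j): c_ij / occurrences[i] for (i, j), c_ij in joint.items()}


# Hypothetical learning base: A2 publishes almost only with A1.
records = [{"A1", "A2"}] * 4 + [{"A2", "B"}] + [{"A1"}] * 6
weights = link_weights(records)
print(weights[("A2", "A1")], weights[("A1", "A2")])

# Adding a record that contains only A1 must not increase either weight.
extended = link_weights(records + [{"A1"}])
print(extended[("A2", "A1")], extended[("A1", "A2")])
```

In this toy base the weight from A2 to A1 is 4/5 and the weight from A1 to A2 is 4/10. After adding a record that contains A1 alone, the first stays the same and the second drops to 4/11, in line with the property quoted from Michelet.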
This writeup is closely related to the "Building a Concept Network to represent bibliographic references" writeup.
Bibliography
 Fayol 1992
 M. Fayol.
La lecture, processus, apprentissage, troubles, chapter La compréhension lors de la lecture: un bilan provisoire et quelques questions, pages 79-101.
Presses Universitaires de Lille, 1992.
 Frath et al. 1995
 P. Frath, R. Oueslati and F. Rousselot.
Identification de relations sémantiques par repérage et analyse de cooccurrences de signes linguistiques.
In Actes des journées d'Acquisition des Connaissances, pages 173-185, Grenoble, 5-7 avril 1995.
 Michelet 1988
 B. Michelet.
L'analyse des associations.
Thèse de doctorat, Université de Paris VII, UFR de Chimie, Paris, 26 Octobre 1988.
Spécialité: Information Scientifique et Technique.
 Segui 1992
 J. Segui.
La lecture, processus, apprentissage, troubles, chapter Les composantes cognitives de la lecture, pages 43-53.
Presses Universitaires de Lille, 1992.
Disclaimer: as I am not fluent in English, I welcome any suggestions for improving my writeups.
Disclaimer bis: I translated the quotations in this writeup myself. If you ask, I can add the original French versions.