Many terms have no particular semantic meaning. They are often called empty words (at least, they are called mots vides in French). These terms share two characteristics: they occur very frequently, and their influence (in the sense of Concept Network influence) on other terms is weak.

For example, the table below shows the 10 most frequent title words in the database (the one used for BAsCET, see Building the logical part of a Concept Network representing bibliographic references). Except for Recognition and Document, which carry the domain of the base, these are joining words, mainly English ones, despite the presence of the word de (the English/French proportion is nearer to 1 than to 0 in this base). The words with the fewest potential links (before threshold-filtering) relative to their occurrence are the "full" words, that is, the non-empty ones, whereas the other words have more links per occurrence.

Table: The 10 most frequent words

     +-------------+-----------+-----------------+-----------------+
     |Words        |Occurrence |#potential links |#links/occurrence|
     +-------------+-----------+-----------------+-----------------+
     |of           |  253      |  732            |  2.89           |
     |Recognition  |  218      |  489            |  2.24           |
     |for          |  170      |  572            |  3.36           |
     |de           |  164      |  540            |  3.29           |
     |and          |  155      |  590            |  3.80           |
     |A            |  130      |  399            |  3.06           |
     |in           |   91      |  397            |  4.36           |
     |Document     |   82      |  207            |  2.52           |
     |the          |   72      |  363            |  5.04           |
     |to           |   67      |  325            |  4.85           |
     +-------------+-----------+-----------------+-----------------+
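
The last column can be recomputed from the two before it, which makes the comparison easy to check. The following is only a minimal sketch: the figures are copied from the table, while the names ROWS and links_per_occurrence are hypothetical and not part of BAsCET. It sorts the words by ascending links/occurrence ratio, so the candidates for "full" words come first.

```python
# (word, occurrence, potential links), copied from the table above.
ROWS = [
    ("of", 253, 732),
    ("Recognition", 218, 489),
    ("for", 170, 572),
    ("de", 164, 540),
    ("and", 155, 590),
    ("A", 130, 399),
    ("in", 91, 397),
    ("Document", 82, 207),
    ("the", 72, 363),
    ("to", 67, 325),
]


def links_per_occurrence(rows):
    """Return (word, ratio) pairs sorted by ascending links/occurrence.

    Hypothetical helper for illustration: a low ratio suggests a "full"
    word, a high ratio suggests an empty word.
    """
    pairs = [(word, links / occ) for word, occ, links in rows]
    return sorted(pairs, key=lambda pair: pair[1])


ranked = links_per_occurrence(ROWS)
for word, ratio in ranked:
    print(f"{word:<12} {ratio:.2f}")
```

With these figures, Recognition and Document come out first, followed by of, whose ratio is fairly close to theirs. The ratio therefore orders the words usefully, but on this sample it does not give a sharp boundary between full and empty words.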
