Semantic Coupling Between Classes: Corpora or Identifiers?

Nemitari Ajienka, Andrea Capiluppi

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding (ISBN) › peer-review



Context: Conceptual coupling measures how loosely or closely related two software artifacts are, based on the semantic information embedded in their comments and identifiers. This type of coupling is typically evaluated by extracting the semantic information from the source code into a corpus of words. Extracting these corpora can be time-consuming, especially when systems are large and many classes are involved. Goal: This study investigates whether the class identifiers alone (e.g., the class names) can be used to evaluate the conceptual coupling between classes, instead of the word corpora of the entire classes. Method: We analyze two Java systems and extract the conceptual coupling between pairs of classes using (i) a corpus-based approach and (ii) two identifier-based tools. Results: Measuring the semantic similarity between classes using only their identifiers gives results similar to those obtained from the class corpora. Additionally, using the identifiers is more efficient in terms of precision, recall, and computation time. Conclusions: Using only class identifiers to measure semantic similarity can save time on program comprehension tasks for large software projects. The findings of this paper support this hypothesis for the systems used in the evaluation, and can also guide researchers developing future generations of tools supporting program comprehension.
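To make the contrast between the two kinds of input concrete, the following is a minimal Python sketch and not the tooling used in the study. It assumes a plain term-frequency cosine similarity (a simple vector space model) in place of LSI, and the function names `split_identifier`, `identifier_similarity`, and `corpus_similarity` are hypothetical. The identifier-based variant sees only the two class names, whereas the corpus-based variant tokenizes the full source text of each class.

```python
import math
import re
from collections import Counter


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def cosine(a, b):
    """Cosine similarity between two bags of words (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identifier_similarity(class_a, class_b):
    """Identifier-based variant: compare only the words in the class names."""
    return cosine(Counter(split_identifier(class_a)),
                  Counter(split_identifier(class_b)))


def corpus_similarity(source_a, source_b):
    """Corpus-based variant: compare all words found in the class sources.

    Assumption: every identifier-like token in the source text (including
    words inside comments) is kept; no stop-word or keyword filtering is done.
    """
    def words(source):
        tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
        return [w for token in tokens for w in split_identifier(token)]

    return cosine(Counter(words(source_a)), Counter(words(source_b)))


if __name__ == "__main__":
    print(identifier_similarity("HttpRequestParser", "HttpResponseParser"))
    print(identifier_similarity("OrderService", "XMLParser"))
```

The identifier-based function processes a handful of words per class, while the corpus-based one scales with the size of the source files, which is the cost difference the abstract refers to when it mentions computation time.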
Original language: English
Title of host publication: 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2016
Number of pages: 6
ISBN (Electronic): 9781450344272
Publication status: E-pub ahead of print - 30 Nov 2016
Event: ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) - Ciudad Real, Spain
Duration: 8 Sept 2016 to 9 Sept 2016

Publication series

Name: International Symposium on Empirical Software Engineering and Measurement
ISSN (Print): 1949-3770
ISSN (Electronic): 1949-3789


Conference: ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
City: Ciudad Real


  • Corpora
  • Corpus
  • Latent Semantic Indexing (LSI)
  • Object-oriented software (OO)
  • Open-source software (OSS)
  • Semantic coupling
  • Semantic similarity
  • Vector Space Model (VSM)

