Dissecting deep learning networks - visualizing mutual information

Hui Fang, Victoria Wang, Motonori Yamaguchi

Research output: Contribution to journal › Article

Abstract

Deep Learning (DL) networks are among the most revolutionary recent developments in artificial intelligence research. Typical networks are built from stacked groups of layers, each composed of many convolutional kernels or neurons. In network design, many hyper-parameters must be chosen heuristically before training in order to achieve high cross-validation accuracy. However, accuracy evaluated at the output layer alone is not sufficient to explain the roles of the hidden units in a network. This results in a significant gap between DL's wide application and its limited theoretical understanding. To narrow this gap, our study explores visualization techniques that illustrate mutual information (MI) in DL networks. MI is an information-theoretic measure of the dependence between two sets of random variables; it captures their relationship even when it is highly non-linear and hidden in high-dimensional data. Our study aims to understand the roles of DL units in the classification performance of a network. Through a series of experiments on several popular DL networks, we show that visualizing MI, and how it changes between the input/output and the hidden layers and basic units, facilitates a better understanding of these units' roles. Our investigation of network convergence suggests a more objective way to evaluate DL networks. Furthermore, the visualization provides a useful tool for gaining insight into network performance, and thus for designing better network architectures by identifying redundant and less effective network units.
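
For context, the MI between two random variables X and Y is the standard information-theoretic quantity

\[
I(X;Y) \;=\; \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y)\,\log \frac{p(x,y)}{p(x)\,p(y)} \;=\; H(Y) - H(Y \mid X),
\]

which is zero exactly when X and Y are independent, however non-linear their relationship may be. The abstract does not say which MI estimator the authors use; as a minimal sketch of the general idea, a common plug-in (binning) estimate of I(T;Y) between a hidden layer's activations T and the class labels Y could look like the following Python, where all names and the bin count are illustrative assumptions:

```python
import numpy as np

def discretize(activations, n_bins=30):
    """Quantize continuous activations into equal-width bins spanning
    the observed range (the bin count is an arbitrary choice)."""
    edges = np.linspace(activations.min(), activations.max(), n_bins + 1)
    return np.digitize(activations, edges[1:-1])

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint, px, py = {}, {}, {}
    for a, b in zip(xs, ys):
        joint[(a, b)] = joint.get((a, b), 0) + 1
        px[a] = px.get(a, 0) + 1
        py[b] = py.get(b, 0) + 1
    # I(X;Y) = sum over (a,b) of p(a,b) * log2( p(a,b) / (p(a) p(b)) )
    return sum((c / n) * np.log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

# Usage sketch: acts is an (n_samples, n_units) array of one layer's
# activations; labels is an (n_samples,) integer array of class labels.
# Each sample's binned activation vector is hashed to one discrete symbol.
# z = [tuple(row) for row in discretize(acts)]
# mi_bits = mutual_information(z, labels.tolist())
```

With such an estimate computed per layer (or per unit) over training, MI values with respect to the input and the output can be plotted to produce the kind of visualization the abstract describes.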
Original language: English
Journal: Entropy
Publication status: Accepted/In press - 23 Oct 2018

Keywords

  • Deep Learning
  • Convolutional Neural Networks
  • Information theory
  • Mutual Information
  • Visualization

Cite this

Fang, H., Wang, V., & Yamaguchi, M. (Accepted/In press). Dissecting deep learning networks - visualizing mutual information. Entropy.
@article{1f8274138f4a4037aff0929a3a6c2565,
title = "Dissecting deep learning networks - visualizing mutual information",
keywords = "Deep Learning, Convolutional Neural Networks, Information theory, Mutual Information, Visualization",
author = "Hui Fang and Victoria Wang and Motonori Yamaguchi",
year = "2018",
month = "10",
day = "23",
language = "English",
journal = "Entropy",
issn = "1099-4300",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

}
