We next review research topics closely aligned with explainable deep learning. A survey, visualized in Figure 3, identifies four broad classes of related research. Work on learning mechanisms (Section 3.1) investigates the backpropagation process to build a theory of weight training; in some respects, these studies try to explain how and why DNNs converge to a particular decision-making process. Research on model debugging (Section 3.2) develops tools to recognize and understand the failure modes of a DNN, emphasizing the discovery of problems that limit its training and inference (e.g., dead ReLUs, mode collapse). Techniques for adversarial attack and defense (Section 3.3) search for differences between regular and unexpected activation patterns. This line of work promotes deep learning systems that are robust and trustworthy, traits that also apply to explainability. Research on fairness and bias in DNNs (Section 3.4) is related to the ethics trait discussed above, but concentrates more narrowly on ensuring that DNN decisions do not over-emphasize undesirable input data features. We elaborate on the connection between these research areas and explainable DNNs next.
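As a concrete illustration of the model-debugging problems mentioned above, the sketch below (a minimal, hypothetical example using NumPy only, not any tool from the surveyed work) flags "dead" ReLU units in a single fully connected layer: units whose pre-activations are negative for every input in a batch, so they output zero everywhere and receive no gradient.

```python
import numpy as np

# Minimal sketch (illustrative, not from the survey): detect "dead" ReLU
# units in one fully connected layer by checking whether each unit ever
# produces a positive activation over a batch of inputs.
rng = np.random.default_rng(0)

W = rng.normal(size=(8, 4))    # 8 units, 4 input features
b = rng.normal(size=8)
b[2] = -100.0                  # force unit 2 to be dead on this data

X = rng.normal(size=(256, 4))  # batch of 256 inputs
pre = X @ W.T + b              # pre-activations, shape (256, 8)
act = np.maximum(pre, 0.0)     # ReLU

# A unit is "dead" on this batch if it never activates (all outputs zero),
# meaning it also receives zero gradient during backpropagation.
dead = (act > 0).sum(axis=0) == 0
print("dead units:", np.flatnonzero(dead))
```

In practice such a check would be run on real training batches via activation hooks; a unit that stays dead across many batches is a candidate for re-initialization or a sign that the learning rate is too high.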