Interpretable deep learning for guided
structure-property explorations in photovoltaics
Balaji Sesha Sarath Pokuri
Department of Mechanical Engineering
Iowa State University
Ames, IA 50011
Sambuddha Ghosal
Department of Mechanical Engineering,
Department of Computer Science
Iowa State University
Ames, IA 50011
Apurva Kokate
Department of Computer Science,
Iowa State University
Ames, IA 50011
Baskar Ganapathysubramanian
Department of Mechanical Engineering
Iowa State University
Ames, IA 50011
Soumik Sarkar
Department of Mechanical Engineering
Iowa State University
Ames, IA 50011
Abstract
The performance of an organic photovoltaic device is intricately connected to its
active layer morphology. This connection between the active layer and device
performance is very expensive to evaluate, either experimentally or computationally.
Hence, designing morphologies to achieve higher performance is non-trivial and
often intractable. To address this, we first introduce a deep convolutional neural
network (CNN) architecture that can serve as a fast and robust surrogate for
the complex structure-property map. Several tests were performed to gain trust
in this trained model. Then, we utilize this fast framework to perform robust
microstructural design to enhance device performance.
1 Introduction
Mapping microstructure to macro-scale property of materials and controlling such relationships has
been an important theme of modern materials research [11]. Despite the investment of millions of research hours and dollars [1], this largely remains elusive to the research community. A major reason
is that the map from microstructure (domain) to property (co-domain) is inherently non-surjective
and highly non-linear, making incremental judgments based on incremental modifications infeasible.
Added to this is the infinite dimensionality of microstructure space, which makes an exhaustive description of the domain impossible. Hence, estimating the property of one morphology from another visually
similar morphology is not trivial – each morphology will require a full scale experiment/simulation
for reliable quantification. In this paper, we discuss one such problem and present a solution using
modern machine learning paradigms. More specifically, we look at solving the relationship between
active layer microstructure and performance of an Organic Solar Cell (OSC). OSCs utilize a bulk
heterojunction morphology in the active layer to efficiently convert incident solar radiation into
electrical energy [19]. The total power conversion efficiency is intricately related to this distribution of acceptor and donor polymers in the active layer (see Sec. 2). This relationship has
several contradicting and competing factors. For example, larger interfacial area between acceptor
and donor regions enables higher production of charges, but simultaneously also increases the loss of
charges through recombination. While larger domains can help transport charges faster, they also
impede excitons from dissociating into charges. These contradictory features make this an exciting
area of research, finally culminating in the search for the most efficient microstructures for solar
energy conversion.
Traditionally, this structure-property map was computationally resolved either through a detailed
and tedious PDE simulation [14], or Kinetic Monte Carlo simulation [4, 16, 15, 21], or through identification of surrogate descriptors of performance [22]. While the first method is robust and accurate, it is limited by the size of the microstructure. It often requires massive, well-established computational resources along with a perfectly scaling computational model to solve for even simple representations of morphologies. Hence, such analysis was often limited to 2D microstructure quantification, although 3D quantification is not uncommon [14]. The second method
overcomes most of these computational challenges. Typically, intuitive features of the microstructure, such as domain sizes and orientations, connectivities and islands, are quantified (using graphs) instead of performing a full-scale PDE simulation. This technique is intuitive and requires substantially fewer computational resources (although memory requirements are higher), and it enables simultaneous analysis of multiple morphologies in parallel. Moreover, a close-to-linear dependence of several stage efficiencies on these microstructural descriptors was also shown [23]. However, since
the dimensionality of the domain is infinite, the intuitive metrics tend to be a limited description of
the bigger reality.
In this work, we take the approach called DLSP (Deep Learning based Structure-Property exploration)
involving machine learning to solve this complex map of microstructure to performance (depicted
in Fig. 1). Modern machine learning has been shown to be a powerful tool to learn complex
phenomena [6], especially in image recognition [24, 2]. This technique has the potential to identify features in the microstructure that are not apparent to the naked eye. By posing the task of estimating microstructure-informed performance as an image recognition problem, we can easily and confidently extend current implementations of neural networks to this problem. Furthermore, interpreting the
trained network can give both trust in training as well as insights into the most impactful features of a
microstructure, thereby enabling researchers to tailor processes to create such features.
Figure 1: DLSP framework: We construct a forward map from morphology to performance. Upon
building trust in this trained model, it was further used for performing manual and automated design.
The rest of the paper is organised as follows: In Sec. 2, we discuss the physics behind the problem of
morphology design, followed by a short discussion of the methods for data generation and labelling. The details of the network architecture and other machine learning parameters are discussed in Sec. 3. In Sec. 4, the results of training, validation and testing on in-sample and out-of-sample morphologies, along with performance interpretations, are presented. The benefits of such a fast, interpretable forward surrogate model are demonstrated through the design of property-maximized microstructures in Sec. 5. Finally, in Sec. 6, key takeaways from this work and future directions are discussed.
2 Physics of structure-property explorations in photovoltaics
2.1 Organic Photovoltaics
Organic photovoltaic devices are energy harvesting devices which employ organic materials for
solar energy conversion. These provide multiple advantages over traditional silicon-based cells, such as flexibility, transparency and ease of manufacture. They are, however, limited by their efficiency of operation. Although major breakthroughs in processing and materials have improved the efficiency drastically, these devices still lag behind traditional photovoltaics.
The efficiency of these devices is intricately dependent on the material distribution/morphology in the
active layer. The active layer generally is a bulk hetero-junction, enabling multiple sites for charge
generation. Several features of the morphology have different roles in the process of converting solar
energy. The ability to change these morphological features by changing the processing protocol is a
major source of control in these devices.
There are several stages during the solar power conversion in an OPV. Firstly, the incident solar
energy generates excitons in the donor phase. These excitons are highly unstable and need to diffuse to the nearest interface with the acceptor material to separate into positive and negative charges. This diffusion to the interface is critical to how efficiently the absorbed incident light is utilized. The dissociation of excitons into charges depends on the nature of the interface and the materials at the interface. For example, interfaces with non-aligned crystal boundaries show lower dissociation than those with aligned crystals. In the next stage, these charges (the positive charge in the donor and the negative charge in the acceptor) need to drift to their respective electrodes to produce electricity. Usually,
this drift is provided by the potential difference between the two electrodes. However, these charges
also encounter other interfaces which have pairs of positive and negative charges, leading to potential
recombination. In summary, the total efficiency of the active layer involves exciton production, charge
separation and charge transportation efficiencies.
In this context, quantifying the dependence of the device and stage efficiencies on morphology becomes a critical part of developing strategies to design processing conditions. It can already be seen that the role of morphology in the power conversion efficiency cannot be overstated. Hence, strategies were developed to quantify the efficiencies of these morphologies. While these techniques are robust and rigorous, they are expensive and time intensive. This makes them infeasible for further designing morphologies, which often requires many quantifications. So, we turn to modern fast methods of quantifying data, especially images. We represent the morphologies as images and take advantage of deep convolutional neural networks to perform performance-based classification.
2.2 Data generation and quantification
In order to train the network (to be described in Sec. 3), we generate a dataset of microstructural
images using the Cahn-Hilliard equation. This process generates time series of images that can be
treated as independent images for the sake of training. This method can produce several thousand images within a very short amount of time. Previous analysis using these images can be found in [23]. A characteristic of morphologies generated through this simulation procedure is their similarity to morphologies in real active layers produced during thermal annealing: the domains are similar in size and have smooth interface contours. These characteristics also help us build trust in the training process, by manually creating morphologies that break these characteristics and testing the performance of the trained network; the implications of this are discussed in Sec. 4. Using data augmentation techniques over the originally simulated data, we finally produce a dataset of nearly 65,000 (2D) morphologies. Each morphology is a gray-scale image of size 101 px × 101 px.
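To make the data-generation step concrete, the following minimal sketch (not the authors' implementation; the grid size, time step, blend ratio and interface parameter are illustrative assumptions) evolves a 2D Cahn-Hilliard system with a semi-implicit spectral scheme and stores periodic snapshots, mimicking the time series of morphology images described above.

```python
# Minimal sketch: generating candidate morphologies by evolving the Cahn-Hilliard
# equation c_t = lap(c^3 - c - gamma*lap(c)) with a semi-implicit spectral scheme.
import numpy as np

def cahn_hilliard_series(n=101, steps=2000, snapshot_every=200,
                         dt=0.01, gamma=1.0, seed=0):
    """Return a list of (n, n) gray-scale snapshots of the evolving morphology."""
    rng = np.random.default_rng(seed)
    c = 0.1 * rng.standard_normal((n, n))             # near 50:50 donor/acceptor blend
    k = 2.0 * np.pi * np.fft.fftfreq(n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    snapshots = []
    for step in range(1, steps + 1):
        mu_hat = np.fft.fft2(c**3 - c)                # local part of the chemical potential
        c_hat = np.fft.fft2(c)
        # explicit in the nonlinear term, implicit in the biharmonic term
        c_hat = (c_hat - dt * k2 * mu_hat) / (1.0 + dt * gamma * k2**2)
        c = np.real(np.fft.ifft2(c_hat))
        if step % snapshot_every == 0:
            snapshots.append((c - c.min()) / (np.ptp(c) + 1e-12))  # rescale to [0, 1]
    return snapshots

morphologies = cahn_hilliard_series()
print(len(morphologies), morphologies[0].shape)
```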
Figure 2: Proposed CNN architecture. Note that this is much shallower and has fewer trainable parameters compared to VGG-16 and ResNet-50.
These morphologies were then characterized using an in-house physics-based simulator [14]. This simulator solves the steady-state excitonic drift-diffusion equations to model the processes of exciton dissociation and charge transport. We use the short-circuit current J_sc as a means of labelling the data. The whole dataset was divided into 10 classes, equally spaced between the best (J_sc = 7 mA/cm²) and worst performing (J_sc = 0.2 mA/cm²) morphologies in the data.
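As a concrete illustration of the labelling step, a short binning sketch is given below (assumed, not the authors' code); the class edges are linearly spaced between the extremes quoted above.

```python
# Minimal sketch: discretizing simulated J_sc values (mA/cm^2) into 10 equally
# spaced performance classes, with class 9 the best performing.
import numpy as np

jsc_min, jsc_max, n_classes = 0.2, 7.0, 10
edges = np.linspace(jsc_min, jsc_max, n_classes + 1)        # 11 edges -> 10 bins

def jsc_to_class(jsc):
    """Map a J_sc value to an integer class label in [0, 9]."""
    return int(np.clip(np.digitize(jsc, edges[1:-1]), 0, n_classes - 1))

print(jsc_to_class(0.25), jsc_to_class(3.5), jsc_to_class(6.9))  # -> 0, 4, 9
```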
3 Proposed Deep Learning framework
Convolutional Neural Networks (CNN) are well-established architectures for classification of images.
They have proven to perform outstandingly well at efficiently extracting complex features from images and functioning as classifiers when provided with sufficient data [6, 24, 2, 17, 20, 8, 7]. Hence, to address the task above, we develop a CNN-based architecture to classify the
morphologies. The data label, originally a continuous value, is discretized into 10 bins. The network architecture is depicted in Fig. 2, with 1.2 million learnable parameters. This architecture was adapted from [3], where the original aim was to determine steering angles for a car from the dash-camera image. The task here is similar – both use images from time-series data with predefined labels for training. Our network is slightly shallower, to enable better training and better feature visualization. This also helps prevent problems that can arise with deeper networks – vanishing (or exploding) gradients [9], which hinder convergence, and accuracy saturation with increasing depth. Shallower networks also provide better feature visualizations – in this case, we use saliency maps [12] to visualize learnt features (Sec. 4). The network was trained for approximately 70 epochs (at 18 s per epoch) with a learning rate of 0.0001 on the 45,000-image training set to reach the desired accuracy (95.41%). The categorical cross-entropy loss (or cost) function along with the Adam optimizer [13] was used to minimize the error.
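For concreteness, a minimal training sketch in Keras is shown below. The layer widths and kernel sizes are assumptions chosen only to land near the quoted ~1.2 million parameters (the exact architecture of Fig. 2 is not reproduced here); the input size (101 × 101 gray-scale), 10 classes, Adam optimizer, learning rate and loss follow the text, while the commented batch size is an assumption.

```python
# Minimal sketch of a shallow CNN in the spirit of [3], with roughly 1.2M parameters,
# trained with Adam (lr = 1e-4) and categorical cross-entropy. Layer sizes are assumed.
import tensorflow as tf
from tensorflow.keras import layers

def build_model(input_shape=(101, 101, 1), n_classes=10):
    return tf.keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(24, 5, strides=2, activation="relu"),
        layers.Conv2D(36, 5, strides=2, activation="relu"),
        layers.Conv2D(48, 5, strides=2, activation="relu"),
        layers.Conv2D(64, 3, activation="relu"),
        layers.Conv2D(64, 3, activation="relu"),
        layers.Flatten(),
        layers.Dense(400, activation="relu"),
        layers.Dense(100, activation="relu"),
        layers.Dense(50, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_model()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
# x_train: (N, 101, 101, 1) gray-scale morphologies; y_train: one-hot class labels
# model.fit(x_train, y_train, batch_size=128, epochs=70, validation_split=0.1)
```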
Apart from this, we also tested two standard architectures with our dataset:
- VGG-16 (50 million learnable parameters), with a learning rate of 0.0001 and batch size of 128, initialized with random weights, was also trained on the training dataset, achieving a test accuracy of 96.61% at epoch 70 (at 180 s per epoch).
- ResNet-50 (23 million learnable parameters), with a learning rate of 0.0001 and batch size of 128, initialized with random weights, was also trained on the training dataset, achieving a test accuracy of 96.45% at epoch 70 (at 580 s per epoch).
We can see that our custom (shallower) CNN, despite having fewer trainable parameters, performs very similarly to the established deep CNNs. Therefore, we base the choice of a "good" network on the learnt features and out-of-sample performance, and not just on the accuracy/F1-score on the testing dataset.
4 Results
Figure 3 shows the confusion matrix for in-sample test data classification, with an accuracy of 95.41% and an F1-score of 94.45%. From the confusion matrix, it can clearly be seen that most of the classifications are correct, and those which are wrongly predicted are only off by one class. Such wrong predictions are not unexpected, as we are binning a continuous variable into non-overlapping classes; the edge cases have the potential to be misclassified. It is also to be noted that the other two standard architectures show similar confusion matrices, with similar prediction accuracies.
Figure 3: Confusion matrix for in-sample test predictions. Notice the heavily diagonally dominant
matrix, indicating a very good classification accuracy.
4.1 Out-of-sample testing
While it is a known fact in the machine learning community that neural networks in their training
phase can possibly overfit (depending on the model capacity, amount of training data and training
hyperparameters), we resort to two methods of checking the robustness of our trained network(s).
As noted earlier, the morphology data used for training are generated by solving a PDE. This imparts certain properties to the data, such as smooth contours and uniform domain sizes. Hence, we systematically break these assumptions about the dataset and observe the performance of the network. Firstly, we test the network on columnar structures (Fig. 4b). This is a widely studied structure, often considered a high-performing morphology, and is completely out-of-sample data. The columnar morphologies also have several sharp interface contours, which are completely absent in the training dataset. The results are in Fig. 4b. The actual J_sc values from a full-scale drift-diffusion simulation (along with the corresponding true labels) are also presented. We can see how the network is able to predict the correct label for each of the columnar microstructures.
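The following sketch shows one assumed way to construct such a columnar test input (this is not the authors' test set; pillar count, width and base thickness are illustrative).

```python
# Minimal sketch: building a columnar (pillared) donor/acceptor morphology with
# sharp interfaces, unlike the smooth training data, as an out-of-sample test input.
import numpy as np

def columnar_morphology(size=101, n_columns=5, base_thickness=10):
    """Donor (0) base layer at the bottom with equally spaced donor pillars rising
    into the acceptor phase (1)."""
    img = np.ones((size, size), dtype=np.float32)         # acceptor everywhere
    img[-base_thickness:, :] = 0.0                        # donor base layer at the bottom
    pillar_width = size // (2 * n_columns)
    for i in range(n_columns):
        x0 = i * (size // n_columns) + pillar_width // 2
        img[base_thickness:, x0:x0 + pillar_width] = 0.0  # donor pillar connected to base
    return img

x = columnar_morphology()[None, :, :, None]               # shape (1, 101, 101, 1)
# predicted_class = model.predict(x).argmax(axis=-1)      # using the CNN sketched above
```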
(a) Saliency maps for in-sample testing morphologies. (b) Saliency maps for out-of-sample morphologies. (c) Saliency maps for 'optimized' structures from [5].
Figure 4: Saliency maps and performance of our custom trained CNN. Note how the saliency maps closely follow the interface regions in the microstructure. It should also be noted that the network shows good performance even on samples outside the training dataset.
Next, we also test our framework with the "high"-performing morphologies described in [5]. These morphologies were obtained through optimization strategies using graph-based surrogate performance descriptors. We can see that our network correctly identified (Fig. 4c) all of these as belonging to the highest-performing class (label 9).
4.2 Interpretability
We next query the network to understand the learnt features. For this, we use saliency maps [12, 18] to identify important features of the image input. Saliency map visualization generates heat maps over test images that highlight the regions (microstructure regions, in our case) the trained CNN model focuses on to generate its classification output; regions highlighted more strongly carry greater significance for the model's decision. Fig. 4 shows the saliency maps for morphologies in the data, columnar structures and the "high"-performing morphologies identified in [5].
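A minimal sketch of the vanilla gradient saliency map of [12] is given below, assuming the Keras model sketched earlier (the visualization details used to produce Fig. 4 may differ).

```python
# Minimal sketch: |d(class score)/d(input pixel)| saliency for a trained Keras model.
import numpy as np
import tensorflow as tf

def saliency_map(model, image, class_idx=None):
    """image: (101, 101, 1) array in [0, 1]; returns a normalized (101, 101) heat map."""
    x = tf.convert_to_tensor(image[None, ...], dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        preds = model(x, training=False)
        if class_idx is None:
            class_idx = int(tf.argmax(preds[0]))      # explain the predicted class
        score = preds[0, class_idx]
    grads = tape.gradient(score, x)                   # gradient of score w.r.t. pixels
    sal = tf.abs(grads)[0, :, :, 0].numpy()
    return sal / (sal.max() + 1e-12)

# sal = saliency_map(model, morphologies[0][..., None])
# import matplotlib.pyplot as plt; plt.imshow(sal, cmap="hot"); plt.show()
```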
We can see, in all of Figs. 4a, 4b and 4c, how the network highlights the interface between the acceptor and donor regions. This is a vital test of the learning capabilities of the network. In terms of the underlying physics, the interface is the most critical feature for performance. A properly configured interface enables charge separation and hence better conversion, whereas a poorly configured interface leads to poor dissociation and higher recombination, which reduces conversion efficiency. Finally, interfaces farther away from the top and bottom electrodes are more critical, as the charges produced at these locations have
a higher chance of recombination. We can see from Fig. 4 how the network is able to find the right
type of interface as critical to device performance.
Furthermore, we also observe from Fig. 5 that the saliency maps from the standard deep networks
(VGG-16 and ResNet-50) fail to locate the features described above. Although their test accuracy is slightly higher than that of our custom network, the saliency outputs do not provide us with any understandable information. This observation is in line with [10], where it was shown that deeper models are harder to explain than their shallower counterparts even though they may achieve a higher classification accuracy. These results signify the importance of tailoring architectures to the application. Thus, for performing morphology design, we use the custom architecture as a surrogate map from the microstructure space to the performance space.
Figure 5: Comparison of saliency map outputs for our custom model (second column), VGG-net
(third column) and ResNet-50 (fourth column), with input images shown in the first column: top row
shows an example image for class 0, bottom row shows an example image from omega morphologies
(correctly predicted as class 9 by our custom model)
5 Morphology design
Having developed a fast and trustworthy surrogate map from microstructure to performance, we can use it to enable microstructural design. In this section, we show two separate techniques, manual and automated, for designing microstructures. The goal of both techniques is to explore (possibly unconventional) morphologies that demonstrate superior performance. Traditionally, this was achieved through a conventional optimization strategy, like simulated annealing, where an initial morphology is tweaked repeatedly to achieve superior performance. At every stage, the current morphology is evaluated for its performance, so the whole process requires many computationally expensive evaluations and hence becomes time consuming. With the above framework, evaluating a morphology becomes significantly faster and easier. Hence, it provides a very powerful way to quickly 'evolve' morphologies to optimize performance.
5.1 Manual design
In order to enable manual exploration, we created a browser interface (Fig. 6) that enables the user to
interactively modify morphologies to both visualize and improve morphology performance. Using
this interface, the user can incrementally add changes to the initial morphology that can improve the
predicted performance. Since the performance assessment is done by the trained CNN, the whole process happens in real time. Fig. 7 shows how one can modify images to include several features of
varying sizes, with the aim of improving performance. This tool can in turn help identify features of
morphology that affect the performance.
Figure 6: Browser interface for performing manual exploration and design.
(a) Class 1
(b) Class 2
(c) Class 3
(d) Class 9
(e) Class 3
(f) Class 9
(g) Class 2/3
Figure 7: Exploration by manual design: Notice how the trained network captures the underlying physics: the addition of non-interfering features improves performance. When a 'blocking' layer is added in (e), the performance drops. Also note that increasing the domain size (in (g)) leads to a performance drop, consistent with reduced exciton dissociation efficiency and increased recombination.
(a) Iteration 10 (b) Iteration 30 (c) Iteration 50
Figure 8: Exploration by automated design: The optimization started with a bilayer structure. Notice
how the framework directs the formation of finer features.
5.2 Automated design
While the above interface may serve very well for understanding the influence of morphological features on performance, it is not feasible for performing a full-scale morphology design: manual exploration limits the exploration of the best-performing morphology manifold. Thus, to fully explore this space, we link this fast surrogate with a probabilistic optimization algorithm to find optimal structures. More specifically, we use a population based incremental learning (PBIL) approach to model morphologies and evolve them to achieve optimum performance. PBIL estimates an explicit probability distribution of the optimal morphology. This multivariate probability distribution is stored as a probability matrix P over the 2D morphology, i.e., each pixel is associated with a probability. The matrix P is updated as follows: the optimization starts with a given probability matrix, generally based on the intuition of the researcher. Subsequently, n morphology instances are sampled from this matrix P. For each realization, the DLSP surrogate is deployed to evaluate the performance, f_j, j ∈ [1, n]. Then the n_b best samples (n_b < n) are used to calculate P_u, the probabilistic update matrix. Next, the probability matrix is updated according to P ← (1 − l_r) · P + l_r · P_u, where l_r is the learning rate. Intuitively, the update step reinforces features present in the best-performing morphologies and dampens those that are missing. The algorithm terminates by standard criteria (iteration limits and improvement bounds). Only the probability matrix needs to be stored, and the evaluations of the multiple realizations are embarrassingly parallel.
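A minimal PBIL sketch under these definitions is given below (assumed, not the authors' implementation); the population size, learning rate and the initial probability matrix (the `bilayer_probability_matrix` name is hypothetical) are illustrative choices.

```python
# Minimal sketch of PBIL driven by the CNN surrogate: `score_fn` should return the
# predicted performance (e.g. class index) for a batch of binary morphologies.
import numpy as np

def pbil(score_fn, shape=(101, 101), n=64, n_best=8, lr=0.1,
         iterations=50, p_init=None, seed=0):
    rng = np.random.default_rng(seed)
    P = np.full(shape, 0.5) if p_init is None else p_init.copy()     # pixel-wise probabilities
    for _ in range(iterations):
        samples = (rng.random((n,) + shape) < P).astype(np.float32)  # n candidate morphologies
        scores = score_fn(samples)                                   # surrogate evaluations f_j
        best = samples[np.argsort(scores)[-n_best:]]                 # keep the n_b best samples
        P_u = best.mean(axis=0)                                      # probabilistic update matrix
        P = (1.0 - lr) * P + lr * P_u                                # P <- (1 - l_r) P + l_r P_u
    return P

# Example with the Keras model sketched earlier (class index used as the score):
# score_fn = lambda x: model.predict(x[..., None], verbose=0).argmax(axis=-1)
# P_opt = pbil(score_fn, p_init=bilayer_probability_matrix)  # hypothetical initial P
```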
The results of the PBIL optimization are very promising. Fig. 8 shows structure emerging at different scales, mimicking finger-like fractal structures. These are very similar to the results presented in [5].
6 Conclusion
In this work, we address the issue of designing active layer morphologies to enhance device perfor-
mance, especially in OPV applications. The usual methods to quantify morphology are either too
costly or too simplistic. Hence we take a data-driven approach (DLSP) to create a morphology quan-
tifier that can perform fast evaluations. We train a custom-designed CNN that reads the morphology and classifies it into 10 bins of increasing performance metric J_sc. Using out-of-sample datasets, we confirm that there are no severe over-fitting issues during the training process. Two other standard networks (VGG-16 and ResNet-50) were trained end-to-end independently. It was observed that the custom network, although shallower, gave very similar accuracy. However, our custom network performed much better when visualized using saliency maps as well as when tested on out-of-sample datasets. It identified critical features of the interface in the morphology, which both VGG-16 and ResNet-50 failed to identify consistently. The custom-designed network is then used to perform morphology design for achieving enhanced performance. Two approaches were taken to do this – the first aims to inform the user about the effect of morphology on performance; the second uses the trustworthy network as a fast cost function and performs morphology optimization using the PBIL algorithm. It should be noted that this work serves as a proof of concept of using deep neural networks for material morphology quantification. It raises several other interesting questions about how to integrate physical phenomena into the training process. Can these physics-based intuitions be exploited to reduce the amount of data needed for training? Can a maximally effective dataset be created to reduce the training data? Can we make the training process more robust to adversarial attacks? All of these questions form the scope of future study.
References
[1] Designing materials to revolutionize and engineer our future. https://www.nsf.gov/funding/pgm_summ.jsp?pims_id=505073. Accessed: 2018-11-01.
[2] B. Alipanahi, A. Delong, M. T. Weirauch, and B. J. Frey. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology, 33(8):831–838, 2015.
[3] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, et al. End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316, 2016.
[4] M. Casalegno, G. Raos, and R. Po. Methodological assessment of kinetic Monte Carlo simulations of organic photovoltaic devices: The treatment of electrostatic interactions. The Journal of Chemical Physics, 132(9):094705, 2010.
[5] P. Du, A. Zebrowski, J. Zola, B. Ganapathysubramanian, and O. Wodo. Microstructure design using graphs. npj Computational Materials, 4(1):50, 2018.
[6] A. Esteva et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639):115–118, 2017.
[7] S. Ghosal, A. Akintayo, P. Boor, and S. Sarkar. High speed video-based health monitoring using 3D deep learning. Dynamic Data-Driven Application Systems (DDDAS), 2017.
[8] S. Ghosal, D. Blystone, A. K. Singh, B. Ganapathysubramanian, A. Singh, and S. Sarkar. An explainable deep machine vision framework for plant stress phenotyping. Proceedings of the National Academy of Sciences, 115(18):4613–4618, 2018.
[9] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. AISTATS, 2010.
[10] J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, and H. Lipson. Understanding neural networks through deep visualization. arXiv preprint arXiv:1506.06579, 2015.
[11] S. Ju, T. Shiga, L. Feng, Z. Hou, K. Tsuda, and J. Shiomi. Designing nanostructures for phonon transport via Bayesian optimization. Physical Review X, 7(2):021024, 2017.
[12] K. Simonyan, A. Vedaldi, and A. Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
[13] D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[14] H. K. Kodali and B. Ganapathysubramanian. Computer simulation of heterogeneous polymer photovoltaic devices. Modelling and Simulation in Materials Science and Engineering, 20(3):035015, 2012.
[15] R. A. Marsh, C. Groves, and N. C. Greenham. A microscopic model for the behavior of nanostructured organic photovoltaic devices. Journal of Applied Physics, 101(8):083509, 2007.
[16] L. Meng, Y. Shang, Q. Li, Y. Li, X. Zhan, Z. Shuai, R. G. E. Kimber, and A. B. Walker. Dynamic Monte Carlo simulation for highly efficient polymer blend photovoltaics. The Journal of Physical Chemistry B, 114(1):36–41, 2009.
[17] V. Mnih et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
[18] G. Montavon, W. Samek, and K. Müller. Methods for interpreting and understanding deep neural networks. Digital Signal Processing, 2017.
[19] B. Ray, M. S. Lundstrom, and M. A. Alam. Can morphology tailoring improve the open circuit voltage of organic solar cells? Applied Physics Letters, 100(1):7, 2012.
[20] D. Silver et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489, 2016.
[21] P. K. Watkins, A. B. Walker, and G. L. B. Verschoor. Dynamical Monte Carlo modelling of organic solar cells: The dependence of internal quantum efficiency on morphology. Nano Letters, 5(9):1814–1818, 2005.
[22] O. Wodo, S. Tirthapura, S. Chaudhary, and B. Ganapathysubramanian. A graph-based formulation for computational characterization of bulk heterojunction morphology. Organic Electronics, 13(6):1105–1113, 2012.
[23] O. Wodo, J. Zola, B. S. S. Pokuri, P. Du, and B. Ganapathysubramanian. Automated, high throughput exploration of process–structure–property relationships using the mapreduce paradigm. Materials Discovery, 1:21–28, 2015.
[24] D. L. Yamins and J. J. DiCarlo. Using goal-driven deep learning models to understand sensory cortex. Nature Neuroscience, 19(3):356, 2016.