====== X-ray Diffraction Pattern Classification Using Convolutional Neural Networks and Some Brute Force ======
**This article discusses a simple experiment on classifying experimental X-ray diffraction patterns.** Note that the methods described here are not polished, so anyone following the steps below should double-check the results.
  
//Working Environment: CentOS 6.7; Python 3.5.2 in Anaconda3; Theano 0.8.2; Keras (version unknown) \\
Author: Zhiyuan Yang  zyang96@ucla.edu //
==== Template Creation and Data Preparation ====
Suppose the initial experiment data are stored in an hdf5 file, say ''data.h5''. Within it lies a single 3-dimensional dataset, ''patterns''.
<code>
# open the hdf5 file and access the dataset
import h5py

f = h5py.File('data.h5', 'r')
patterns = f['patterns']  # [number_of_patterns, height, width]
</code>
The first step is to hand-pick a set of clean single-hit patterns to serve as templates. (An unsupervised clustering approach can be adopted here instead, as described in this article: https://www.osapublishing.org/oe/abstract.cfm?uri=oe-19-17-16542 . However, selecting by hand is much more accurate, and in return should yield higher accuracy from the deep learning approach described later.)
  
Suppose about 200 single-hit patterns are selected and stored in an hdf5 dataset ''clean_single_hits''. The next step is to generate more templates of single-hit patterns to compare with the rest of the whole dataset. This can be achieved with some basic Python routines (note again that the code here is not optimized, and one can improve it for better performance):
<code>
import h5py
import numpy as np
import scipy.ndimage as sci
import numpy.linalg as linalg
from sklearn.preprocessing import scale
  
def preprocessing(x):
    # zero out negative pixels and rescale intensities to [0, 255]
    x[x < 0] = 0
    x *= 255.0 / x.max()

def generate_templates(pattern):
    '''for each pattern generate 36 templates, each from
    rotating the initial image about a certain degree
    '''
    l = np.zeros([36, 260, 257])
    for i in range(36):
        rotated = sci.rotate(pattern, angle=(5*i), reshape=False)
        l[i] = rotated
    # ...
    return total
     ​     ​
def smallest_value(image, templates):
    '''returns the smallest L2 norm of the "difference matrix",
    which serves as a loose measurement of how different (or similar)
    the new data is to all the generated templates.
    '''
    diffs = []
    num = len(templates)
    for i in range(num):
        diffs.append(linalg.norm(image - templates[i]))
    diffs.sort()
    return float(diffs[0])
</code>
Use the functions above to preprocess the data, generate templates, and produce a sorted list of L2 norms for the new, unprocessed data. The patterns at the front of the list are the ones most similar to our templates (and so most likely to also be clean single hits).
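A minimal sketch of that scoring loop follows; it only shows how the functions above fit together, and the names ''new_patterns'' and ''scores'' are illustrative, not from the original code:
<code>
# Sketch: score every unclassified pattern against the templates
# (assumes `templates` was built from the clean single hits).
scores = []
for idx in range(len(new_patterns)):
    img = new_patterns[idx].astype(float)
    preprocessing(img)  # clip and rescale in place
    scores.append((smallest_value(img, templates), idx))

scores.sort()  # smallest L2 norms first, i.e. the most template-like patterns
</code>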
Then, you can partition the new data into several groups. You can either leave them as hdf5 datasets, or you can save all the images into different folders. The folder layout should look something like \\
  ./training_set
      ./single
      ./multiple
  ./validation_set
      ./single
      ./multiple
  ./data_to_predict
**I suggest the latter approach, since it's easier to do real-time data augmentation in Keras this way.**
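For the folder approach, a small helper along these lines can dump an hdf5 dataset into a class folder as image files. This is only a sketch: the dataset and folder names are hypothetical, and it assumes the old ''scipy.misc.imsave'' (current as of 2016):
<code>
# Hypothetical helper: write each pattern out as a PNG so Keras'
# flow_from_directory can read it. Dataset/folder names are illustrative.
import os
import h5py
import scipy.misc as misc

def dump_to_folder(dataset, folder):
    os.makedirs(folder, exist_ok=True)
    for i in range(len(dataset)):
        misc.imsave(os.path.join(folder, '%05d.png' % i), dataset[i])

f = h5py.File('partitioned.h5', 'r')
dump_to_folder(f['train_single'], './training_set/single')
dump_to_folder(f['train_multiple'], './training_set/multiple')
f.close()
</code>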
==== Build a Convolutional Neural Network Model to Train and Predict the Data ====
Once you are done preparing the data, you can start building the convolutional neural network. The one I am using here is not exactly "deep", but it should be enough for a relatively small dataset like this. To build the network I also used Keras, which is a higher-level API built on top of both Theano and TensorFlow for a cleaner code structure. **TensorFlow is the preferred backend since it has better multi-GPU support, but on an operating system as old as CentOS 6, it is almost impossible to get it installed.**
<code>
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.layers import Activation, Dropout, Flatten, Dense, BatchNormalization

# data augmentation methods
train_datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.4,
    horizontal_flip=True,
    fill_mode='nearest')

# no augmentation for the validation set, but keep the same rescaling
test_datagen = ImageDataGenerator(rescale=1./255)

# network architecture
model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(3, 256, 256)))
#model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(32, 3, 3))
#model.add(BatchNormalization())
model.add(Activation('relu'))
#model.add(Dropout(0.5))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(64, 3, 3))
#model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(ZeroPadding2D((1, 1)))

model.add(Convolution2D(64, 3, 3))
#model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
#model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
</code>
My approach uses images stored in different directories:
<code>
# import data
train_generator = train_datagen.flow_from_directory(
    './training_set',
    target_size=(256, 256),
    batch_size=32,
    class_mode='binary')

test_generator = test_datagen.flow_from_directory(
    './validation_set',
    target_size=(256, 256),
    batch_size=32,
    class_mode='binary')
</code>
Then, we can train our model and save the trained weights:
<code>
model.fit_generator(
    train_generator,
    samples_per_epoch=4137,
    nb_epoch=10,
    verbose=1,
    validation_data=test_generator,
    nb_val_samples=1098)
    #callbacks=callbacks)

model.save_weights('weights.h5')
</code>
My particular run resulted in a validation accuracy of about 93%, which is not bad for a dataset as small as mine.
You can now use the model and saved weights to make predictions on new experiment data, and save the prediction labels:
<code>
# suppose images is a numpy array with shape [number_of_samples, color_channels, height, width]
# (with the TensorFlow backend it would be [number_of_samples, height, width, color_channels])
model.load_weights('weights.h5')  # reload the trained weights if running in a fresh session
y = model.predict(images, verbose=1)

f = h5py.File('predictions.h5', 'w')
f.create_dataset('predictions', data=y)
f.close()
</code>
An hdf5 file named ''predictions.h5'' is generated with prediction labels for the new experiment data. You can extract images accordingly to manually check the results.
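A minimal sketch of such a check is below. With the alphabetical class ordering of ''flow_from_directory'', ''multiple'' maps to 0 and ''single'' to 1; the 0.5 threshold on the sigmoid output is my choice, not from the original write-up.
<code>
# Sketch: list the patterns the network calls single hits.
import h5py
import numpy as np

f = h5py.File('predictions.h5', 'r')
y = f['predictions'][:]
f.close()

single_hit_idx = np.where(y[:, 0] > 0.5)[0]
print('%d patterns classified as single hits' % len(single_hit_idx))
</code>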
==== Some Thoughts ====
One should notice that the method here is very unsophisticated, with several limitations such as its heavy demand on computational power and its poor time efficiency. However, this is a practical method that did work for me, and it should work for many single-particle diffraction pattern classification problems.
