- Kosuke Nakago

2017-12-18 11:40:20

* A Japanese version of this post is also available here.

We released **Chainer Chemistry**, a Chainer [1] extension to train and run neural networks for tasks in biology and chemistry.

- GitHub page: https://github.com/pfnet-research/chainer-chemistry
- Documentation: https://chainer-chemistry.readthedocs.io

The library makes it easy to apply deep learning to molecular structures.

For example, you can apply machine learning to toxicity classification tasks or to HOMO (highest occupied molecular orbital) level regression tasks, with chemical compounds as input.

The library was developed during the PFN 2017 summer internship, and part of it was implemented by an intern, Hirotaka Akita of Kyoto University.

The recently proposed Graph Convolutional Network (see below for details) opened the door to applying deep learning to “graph structure” input, and Graph Convolutional Networks are currently an active area of research. We implemented several Graph Convolutional Network architectures, including a network introduced in a paper published this year.

The following models are implemented:

- NFP: Neural Fingerprint [2, 3]
- GGNN: Gated-Graph Neural Network [4, 3]
- WeaveNet: Molecular Graph Convolutions [5, 3]
- SchNet: A continuous-filter convolutional Neural Network [6]

Various datasets can be used through a common interface in this library. Some research datasets can also be downloaded and preprocessed automatically.

The following datasets are supported:

- QM9 [7, 8]: dataset of organic molecular structures with up to nine C/O/N/F atoms and their computed physical property values. The values include HOMO/LUMO levels and internal energy, computed at the B3LYP/6-31G level of quantum chemistry.
- Tox21 [9]: dataset of toxicity measurements on 12 biological targets

We provide example code for training and inference, so you can quickly try out the models implemented in this library.

In the fields of new material discovery and drug discovery, simulating the behavior of molecules is important. When quantum effects must be taken into account with high precision, DFT (density functional theory) is widely used. However, it requires a lot of computational resources, especially for large molecules, which makes it difficult to simulate many molecular structures.

Machine learning offers a different approach: learn from data measured or calculated in previous experiments, and predict the chemical properties of molecules that have not yet been tested. A neural network may compute the prediction faster than a quantum simulation.

An important question is how to handle compounds as input and output in order to apply deep learning. The main problem is that molecules have varying numbers of atoms and are represented as different graph structures, whereas conventional deep learning methods expect fixed-size, regularly structured input.

The “Graph Convolutional Neural Network” was proposed to handle graph-structured input.

Convolutional Neural Networks use “convolutional” layers, which apply a kernel to local regions of an image. They have shown promising results on many image tasks, including classification, detection, segmentation, and even image generation.

Graph Convolutional Neural Networks introduce a “graph convolution” operation, which applies a kernel to the neighboring nodes of each node in the graph, so that graph-structured data can be handled.
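The idea can be illustrated with a minimal sketch in plain Python. This is a generic illustration only: it is not the NFP, GGNN, WeaveNet, or SchNet architecture, and it does not use the Chainer Chemistry API. The function names `graph_convolution` and `readout`, the one-hot atom features, and the identity weight matrix are all assumptions made for the example. Each node sums its own feature vector with those of its neighbors, and the same kernel (weight matrix) is applied at every node. Summing over nodes afterwards gives a fixed-size vector regardless of the number of atoms.

```python
def graph_convolution(node_features, adjacency, weight):
    """One illustrative graph-convolution step (hypothetical helper).

    Each node sums its own features with those of its neighbours,
    then the same weight matrix (the shared "kernel") and a ReLU
    are applied at every node.
    """
    n = len(node_features)
    in_dim = len(node_features[0])
    out_dim = len(weight[0])
    updated = []
    for v in range(n):
        # Aggregate the node's own features and its neighbours' features.
        agg = list(node_features[v])
        for u in range(n):
            if adjacency[v][u]:
                for k in range(in_dim):
                    agg[k] += node_features[u][k]
        # Apply the shared kernel followed by a ReLU non-linearity.
        out = [
            max(0.0, sum(agg[k] * weight[k][j] for k in range(in_dim)))
            for j in range(out_dim)
        ]
        updated.append(out)
    return updated


def readout(node_features):
    """Sum node vectors into one fixed-size, graph-level vector."""
    dim = len(node_features[0])
    return [sum(h[k] for h in node_features) for k in range(dim)]


# Toy input: a three-atom graph (one "O" bonded to two "H"),
# with one-hot features [is_O, is_H]. Purely illustrative values.
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
adjacency = [[0, 1, 1],
             [1, 0, 0],
             [1, 0, 0]]
identity_kernel = [[1.0, 0.0], [0.0, 1.0]]

hidden = graph_convolution(features, adjacency, identity_kernel)
graph_vector = readout(hidden)  # length 2, independent of the atom count
```

Because the kernel is shared across nodes and the readout is a sum, a molecule with two atoms and one with twenty produce graph-level vectors of the same length, which is what lets a fixed-size predictor sit on top.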

Its applications are not limited to molecular structures. Graph structures appear in many other fields, including social networks and transportation, and applying graph convolutional neural networks to them is an interesting research topic. For example, [10] applied graph convolution to images, [11] to knowledge bases, and [12] to traffic forecasting.

- Deep learning researchers

This library provides implementations of the latest Graph Convolutional Neural Networks.

Applications of graph convolution are not limited to biology and chemistry; they extend to many other fields. We would like many people to use this library.

- Material/drug discovery researchers

The library enables users to build their own models to predict various chemical properties of a molecule.

This library is still in beta and under active development. We would like to support the following features:

- Provide pre-trained models for inference
- Add more datasets
- Implement more networks

We have prepared a tutorial to help you get started with this library. Please try it and let us know if you have any feedback.

[1] Tokui, S., Oono, K., Hido, S., & Clayton, J. (2015). Chainer: a next-generation open source framework for deep learning. In Proceedings of workshop on machine learning systems (LearningSys) in the twenty-ninth annual conference on neural information processing systems (NIPS) (Vol. 5).

[2] Duvenaud, D. K., Maclaurin, D., Iparraguirre, J., Bombarell, R., Hirzel, T., Aspuru-Guzik, A., & Adams, R. P. (2015). Convolutional networks on graphs for learning molecular fingerprints. In Advances in neural information processing systems (pp. 2224-2232).

[3] Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., & Dahl, G. E. (2017). Neural message passing for quantum chemistry. arXiv preprint arXiv:1704.01212.

[4] Li, Y., Tarlow, D., Brockschmidt, M., & Zemel, R. (2015). Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493.

[5] Kearnes, S., McCloskey, K., Berndl, M., Pande, V., & Riley, P. (2016). Molecular graph convolutions: moving beyond fingerprints. Journal of computer-aided molecular design, 30(8), 595-608.

[6] Kristof T. Schütt, Pieter-Jan Kindermans, Huziel E. Sauceda, Stefan Chmiela, Alexandre Tkatchenko, Klaus-Robert Müller (2017). SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. arXiv preprint arXiv:1706.08566.

[7] L. Ruddigkeit, R. van Deursen, L. C. Blum, J.-L. Reymond, Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17, J. Chem. Inf. Model. 52, 2864–2875, 2012.

[8] R. Ramakrishnan, P. O. Dral, M. Rupp, O. A. von Lilienfeld, Quantum chemistry structures and properties of 134 kilo molecules, Scientific Data 1, 140022, 2014.

[9] Huang R, Xia M, Nguyen D-T, Zhao T, Sakamuru S, Zhao J, Shahane SA, Rossoshek A and Simeonov A (2016) Tox21 Challenge to Build Predictive Models of Nuclear Receptor and Stress Response Pathways as Mediated by Exposure to Environmental Chemicals and Drugs. Front. Environ. Sci. 3:85. doi: 10.3389/fenvs.2015.00085

[10] Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst (2016), Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering, NIPS 2016.

[11] Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling (2017). Modeling Relational Data with Graph Convolutional Networks. arXiv preprint arXiv:1703.06103.

[12] Yaguang Li, Rose Yu, Cyrus Shahabi, Yan Liu (2017). Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. arXiv preprint arXiv:1707.01926.

- doipfn

2017-11-30 11:00:05

Preferred Networks, Inc. has completed ImageNet training in 15 minutes [1,2]. This is the fastest time to perform a 90-epoch ImageNet training ever achieved. Let me describe the MN-1 cluster used for this accomplishment.

Preferred Networks’ MN-1 cluster started operation this September [3]. It consists of 128 nodes with 8 NVIDIA P100 GPUs each, for 1024 GPUs in total. Each GPU has a theoretical peak of 4.7 TFLOPS in double-precision floating point, so the total theoretical peak is more than 4.7 PFLOPS (including the CPUs). The nodes are connected with two FDR InfiniBand links (56Gbps x 2). PFN has exclusive use of the cluster, which is located in an NTT datacenter.

On the TOP500 list published this November, the MN-1 cluster is listed as the 91st most powerful supercomputer, with approx. 1.39 PFLOPS maximum performance on the LINPACK benchmark [4]. Compared to traditional supercomputers, MN-1’s computation efficiency (28%) is not high. One of the performance bottlenecks is the interconnect. Unlike typical supercomputers, MN-1 is connected as a thin tree (as opposed to a fat tree). Each group of sixteen nodes is connected to a pair of redundant InfiniBand switches. The cluster has eight such groups, and the links between groups are aggregated in another redundant pair of InfiniBand switches. Thus, if a process needs to communicate with a different group, the link between groups becomes a bottleneck, which lowers the LINPACK benchmark score.
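The figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text. It counts GPUs only, so the peak it computes is a lower bound on the full theoretical peak (which also includes CPUs), and the resulting ratio is an upper bound on the efficiency. The variable names are ours.

```python
# Figures taken from the text above.
GROUPS = 8
NODES_PER_GROUP = 16
GPUS_PER_NODE = 8
GPU_PEAK_TFLOPS = 4.7        # double precision, per GPU
LINPACK_PFLOPS = 1.39        # measured maximum performance

nodes = GROUPS * NODES_PER_GROUP                     # 128 nodes
total_gpus = nodes * GPUS_PER_NODE                   # 1024 GPUs
gpu_peak_pflops = total_gpus * GPU_PEAK_TFLOPS / 1000

# GPU-only peak excludes the CPUs, so this ratio slightly
# overestimates the efficiency against the full theoretical peak.
efficiency_upper_bound = LINPACK_PFLOPS / gpu_peak_pflops

print(f"{nodes} nodes, {total_gpus} GPUs")
print(f"GPU-only peak: {gpu_peak_pflops:.2f} PFLOPS")
print(f"LINPACK / GPU-only peak: {efficiency_upper_bound:.1%}")
```

The GPU-only peak comes out at about 4.81 PFLOPS and the ratio at just under 29%, which is consistent with the quoted 28% once the CPUs are added to the denominator.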

However, as stated at the beginning of this article, MN-1 can perform ultra-fast Deep Learning (DL). This is because ChainerMN does not require bottleneck-free communication for DL training. During training, ChainerMN collects and redistributes parameter updates among all nodes. In the 15-minute trial, we used the ring allreduce algorithm, in which each node communicates only with its adjacent nodes in the ring topology. The accumulation is performed in the first round, and the accumulated parameter update is distributed in the second round. Since we can form a ring without hitting the bottleneck on a full-duplex network, the MN-1 cluster can efficiently finish the ImageNet training in 15 minutes with 1024 GPUs.
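To make the two rounds concrete, here is a minimal single-process simulation of ring allreduce in plain Python. It is a sketch under assumptions, not ChainerMN's implementation: it assumes the common chunked variant, in which the first round is a reduce-scatter and the second an allgather, and the function name `ring_allreduce` is hypothetical. Every transfer goes from a node to its successor in the ring, which is why only links between adjacent nodes are exercised.

```python
def ring_allreduce(node_grads):
    """Simulate a chunked ring allreduce (sum) over equal-length vectors.

    node_grads[r] is the local update held by node r. Returns one
    vector per node; all of them equal the element-wise sum.
    """
    n = len(node_grads)
    length = len(node_grads[0])
    # Split each vector into n contiguous chunks (some may be empty).
    bounds = [(i * length) // n for i in range(n + 1)]
    data = [list(g) for g in node_grads]  # work on copies

    def chunk(c):
        return range(bounds[c], bounds[c + 1])

    # Round 1 (accumulation): at each step, node r sends one chunk to
    # node r + 1, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        # Snapshot the payloads first to mimic simultaneous sends.
        sends = []
        for r in range(n):
            c = (r - step) % n
            sends.append((c, [data[r][i] for i in chunk(c)]))
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n
            for i, value in zip(chunk(c), payload):
                data[dst][i] += value
    # Node r now holds the fully summed chunk (r + 1) % n.

    # Round 2 (distribution): the summed chunks travel around the
    # ring once more, overwriting the stale partial sums.
    for step in range(n - 1):
        sends = []
        for r in range(n):
            c = (r + 1 - step) % n
            sends.append((c, [data[r][i] for i in chunk(c)]))
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n
            for i, value in zip(chunk(c), payload):
                data[dst][i] = value
    return data


# Four simulated nodes, each holding a six-element update.
grads = [
    [1, 2, 3, 4, 5, 6],
    [10, 20, 30, 40, 50, 60],
    [100, 200, 300, 400, 500, 600],
    [1000, 2000, 3000, 4000, 5000, 6000],
]
result = ring_allreduce(grads)
```

With n nodes, each round takes n - 1 steps and each step moves only about 1/n of the vector per node, so the traffic on any single link stays roughly constant as nodes are added.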

[1] https://arxiv.org/abs/1711.04325

[2] https://www.preferred-networks.jp/en/news/pr20171110

- jethrotan

2017-11-06 10:30:04

**Writers:** Ryoma Kawajiri, Jethro Tan

Preferred Networks (PFN) attended the 30th IEEE/RSJ IROS conference held in Vancouver, Canada. IROS is known as the second-biggest robotics conference in the world after ICRA (see here for our report on this year’s ICRA), with 2797 total registrants and 2164 submitted papers, of which 970 were accepted (an acceptance rate of 44.82%). With no fewer than 18 sessions held in parallel, our members had a hard time deciding which ones to attend.

- hido

2017-10-18 07:44:24

This summer, Preferred Networks accepted a record number of interns in Tokyo from all over the world. They tackled challenging tasks around artificial intelligence together with PFN mentors. We appreciate their passion, focus, and dedication to the internship.

In this post, we would like to share some of their great work (more to come).

- hido

2017-09-11 16:29:13

This is a guest post, in interview style, with Weihua Hu from the University of Tokyo, who interned at Preferred Networks last year and whose research was extended after the internship and accepted at ICML 2017.

“Learning Discrete Representations via Information Maximizing Self-Augmented Training,” Weihua Hu, Takeru Miyato, Seiya Tokui, Eiichi Matsumoto, and Masashi Sugiyama; Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1558-1567, 2017. (Link)

- Yuta Kikuchi

2017-09-08 13:54:22

**Writers:** Yuta Kikuchi, Sosuke Kobayashi

Preferred Networks (PFN) attended the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017) in Vancouver, Canada. ACL is one of the largest conferences in the Natural Language Processing (NLP) field.

As in other machine learning research fields, the use of deep learning in NLP is increasing. The most popular topic in deep learning for NLP is sequence-to-sequence learning. A sequence-to-sequence model receives a sequence of discrete symbols (words) and learns to output the correct sequence conditioned on the input.

- hido

2017-09-04 14:49:39

Preferred Networks proudly sponsored an exciting two-day event, Deep Reinforcement Learning Bootcamp, which was held August 26-27th at UC Berkeley.

The instructors of this event included famous researchers in this field, such as Vlad Mnih (DeepMind, creator of DQN), Pieter Abbeel (OpenAI/UC Berkeley), Sergey Levine (Google Brain/UC Berkeley), Andrej Karpathy (Tesla, head of AI), John Schulman (OpenAI) and up-and-coming researchers such as Chelsea Finn, Rocky Duan, and Peter Chen (UC Berkeley).

- Brian Vogel

2017-08-25 10:09:00

Preferred Networks (PFN) attended the International Conference on Machine Learning (ICML) in Sydney, Australia. The first ICML was held in 1980 in Pittsburgh, last year’s conference was in New York, and the 2018 ICML will be held in Stockholm. ICML is one of the largest machine learning conferences, with approximately 2400 people attending this year. There were 434 accepted submissions spanning nearly all areas of machine learning.

- Tommi Kerola

2017-08-17 10:00:50

**Writers:** Richard Calland, Tommi Kerola

Preferred Networks (PFN) attended the CVPR 2017 conference in Honolulu, U.S., one of the flagship conferences for discussing research and applications in computer vision and pattern recognition. Computer vision is of major importance for our activities at PFN, including applications for autonomous driving, robotics, and of course products such as PaintsChainer. Modern computer vision is largely based on deep learning, which is relevant for our continued research and product development. In this blog post, we will briefly summarize trends from this conference, focusing on a few papers relevant to each topic.

- Yusuke Niitani

2017-08-14 11:07:10

We released ChainerCV: a utility library for computer vision in deep learning. This library aims at making the process of training and applying deep learning models for computer vision easier using Chainer. It contains high quality implementations of computer vision models, and tools that are necessary to conduct research in this field.

GitHub page: https://github.com/chainer/chainercv

Documentation: http://chainercv.readthedocs.io/en/stable/